Issue #08
© 2023
  • Theme AI in Africa

CHRIS EMEZUE & IYANUOLUWA SHODE

AI and African Languages: Empowering Cultures and Communities

I-AI kanye neziLimu zaseAfrika : Ukunikeza amandla amasiko kanye nemiphakathi

African intellectuals must do for their languages and cultures what all other intellectuals in history have done for theirs.

Ngugi wa Thiong’o

Izinjulabuchopho zase Afrika kumele zenzele izilimi namasiko azo lokho okwenziwa ngezinye izinjulabuchopho emlandweni zenzela amasiko nezilimi zawo.

Ngugi wa Thiong’o
A futuristic painting of a world where African languages are spoken freely and African cultures are expressed. (This image is a product of NLP, generated using Stable Diffusion.) | Umdwebo wekusasa womhlaba lapho izilimu zase Afrika zikhulunywa ngenkululeko kanye namasiko ase Afrika enziwa.

During the summer of 2022, Tola traveled to Vienna for a conference. The next day, she went out to get food, but she couldn’t understand the local variant of German. She only speaks English and Yoruba. Thankfully, she had Google Translate on her phone, which allowed her to communicate with the waiter and get the food she needed.

Ngehlobo lika 2022, u-Tola wahambela e-Vienna eya kwi-nkomfa. Ngosuku olulandelayo, waphuma wayozitholela ukudla, kodwa akakwazanga ukuqonda ulwimi lwakhona lwesiJalimane. Ukhuluma kuphela isingisi nesi-Yoruba. Siyabonga, wayeno Google Translate efonini yakhe.

Google Translate is one of the many applications of a revolutionary technology that is changing the world today: artificial intelligence. One might ask: what does this mean? How can intelligence be artificial? Let me give you a short overview.

I-Google Translate ingenye yama-applications eshintsha ubuchwepheshe neshintsha umhlaba namhlanje,  Artificial intelligence. Ungazibuza ukuthi lokhu kusho ukuthini? Kwenzeka kanjani ukuthi ubuhlakani bungabi obeqiniso? Ithi ngikubonise kancane.

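The term “artificial intelligence” was coined by John McCarthy, then a mathematics professor at Dartmouth College, in the proposal for the 1956 Dartmouth workshop. He later taught at the Massachusetts Institute of Technology and at Stanford University.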

Itemu “artificial intelligence” ngokwalo lasungulwa ngo 1956  usolwazi wase Massachusetts Institute of Technology, u- John McCarthy.

Professor John McCarthy, Stanford University, 1967. Incazelo: Usolwazi John McCarthy, Stanford University, 1967

In simple terms, artificial intelligence (otherwise known as AI) is defined as “getting a computer to do ‘intelligent’ things that people do”. I am sure you have seen sci-fi movies that feature robots acting like humans (haha). By automating many tasks with high precision, AI can greatly increase human productivity.

Ngamagama alula, i-artificial inteligence(eyaziwa futhi ngokuthi i-AI) ichazwa ngokuthi “ukwenza ikhompyutha yenze izinto ezinobuhlakani ezenziwa ngabantu”. Ngiqinisekile  unamafilimu e-sci-fi anama-robhothi aziphathisa okwabantu(ha ha). Umkhiqizo wabantu ukhuphuka ngezinga eliphezulu ngokusebenzisa imishini eshaya emhlolweni.

This field has grown enormously from what it was six decades ago to become one of the biggest domains of technology, one that now permeates almost all facets of human life. Artificial intelligence is actively present in our lives and is playing a huge role in the Fourth Industrial Revolution. It encompasses a broad spectrum of technologies and applications. Natural language processing, the focus of this article, is one such application: it deals with languages and machines.

Lendima seyidlondlobale kulokhu eyayiyikho emashumini ayisithupha eminyaka edlule yaba enye yezindima ezinkulu zobuchwepheshe ethinta yonke impilo yabantu. I-Artificial intelligence ikhona ezimpilweni zethu manje futhi idlala indima enkulu kwi-Fourth Industrial Revolution. Yengamele izindawo eziningi ezahlukene zobuchwepheshe kanye nama-applications. Ukuhlunzwa kolimi lwemvelo, okuyingqikithi yalombalo, ingezinye zama-applications abhekene nezilimi kanye nemishini.

NLP is the use of language with technology

I-NLP ukusetshenziswa kolwimi kanye nobuchwepheshe.

We humans communicate through language. As members of human society, we are connected in a plethora of ways. One of these ways is through communication. With the aid of language, we are able to express ourselves. To enable computers/machines to interact with humans (which is a necessity for artificial intelligence), computers need to understand the natural languages used by humans. Natural Language Processing (NLP, for short) is a form of AI that teaches machines to read or recognize text and voice, extract value from it, and potentially convert the information into a desired output format, such as text, voice, images, and even videos. NLP is the use of language with technology. The ultimate goal of NLP is to help computers understand language as well as humans do. This is one major step in helping computers attain artificial intelligence.

Thina bantu sixhumana ngezilimi. Njengamalunga abantu, sixhumene ngezindlela eziningi. Enye yalezizindlela ukukhulumisana. Ngokusizwa ulwimi, siyakwazi ukuzwakalisa ubuthina. Ukuvumela amakhompyutha/imishini ukuthi ixhumane nabantu (okuyinto ebalulekile kwi artificial intelligence), amakhompyutha kumele aqonde izilimu zemvelo ezisetshenziswa ngabantu. I-Natural Language Processing(kafuphi i-NLP) iwuhlobo lwe AI efundisa imishini ukufunda nokubona imibhalo kanye nezwi, ihluze okubalulekile kuyona, bese iba namandla okuphendula lolo lwazi lube ilento efunekayo, njenge mibhalo, izwi, izithombe, kanye nezithombe ezinyakazayo(video). I-NLP ukusetshenziswa kolwimi ngobuchwepheshe. Inhloso enkulu ye NLP ukusiza amakhompyutha aqonde ulwimi ngendlela abantu abaliqonda ngayo. Ilona gxathu elikhulu elizosiza amakhompyutha athole I artifical intelligence.

NLP in our lives

I-NLP ezimpilweni zethu

The importance of communication has led to the widespread use of NLP. Millions of users can now turn to NLP to automate mundane tasks.

Ukubaluleka kokuxhumana sekuholele ekwandeni kokusetshenziswa kwe-NLP. Izinkulungwana zabasebenzisi sebengasebenzisa i-NLP ukwenza imisebenzi emincane,

Remember our story about Tola and Google Translate? Google Translate is one of the many applications of NLP. It is a deployment of machine translation, an important NLP task that involves translating text from one language to another. Thanks to it, Tola can confidently walk into a store in Vienna (or in many other countries) and buy groceries with the help of her mobile phone.

Uyayikhumbula indaba yethu  ngo Tola ne Google Translate? I-Google Translate iyenye yezindlela zokusebenza kwe-NLP. Iwukusebenziswa kokuhumusha komshini, enye yezindlela ezibalulekile ze-NLP ezifaka ukuhumusha imibhalo isuka kolunye ulwimi iya kolunye.

Have you watched Hustle? If you haven’t already, I strongly advise you to do so (*wink*). In the movie, the protagonist Stanley (played by Adam Sandler) works for a big-time NBA club, the Philadelphia 76ers, and travels across continents scouting professional players for the team. He gets to Spain and is fascinated by a player he sees in a street basketball match. To communicate with this player, he uses his mobile phone as his mouthpiece, since he speaks only English and the player speaks only Spanish. The phone has an app that provides machine translation and speech synthesis. Speech synthesis converts written text to speech.

Usuke wayibukela i-Hustle? Uma ungakayibuki,  ngiyakweluleka ukuthi wenze njalo (*ecifa ihlo*). Kuleyo filimu, u-Stanley usebenzela iqembu elikhulu le NBA, i.-Philadelphia 76ers, uhamba amazwe ngamazwe efuna abadlali abasezingeni eliphezulu. Wafika e-Spain wahlabeka umxhwele ngumdlali odlala kulokhu okwaziwa nge-street basketball. Ekuqaleni, ukuxhumana nalomdlali, wasebenzisa ifoni yakhe, njengoba wayekhuluma isingisi kuphela umdlali ekhuluma iSipanishi. I-app yayinamandla omshini okuhumusha kanye nokuhlunza inkulumo. Ukuhlunza inkulumo kuphendula imibhalo ibe inkulumo.

“What is zero divided by zero?” This is the most frequently asked question for Siri, Apple’s voice assistant. You ask Siri a question and Siri answers intelligently. To do so, it relies on speech recognition, which converts speech to text.

“Kuphumani masihlukanisa u-zero ngo-zero?” Lona ngumbuzo obuzwa kakhulu ku-Siri – okuwumsizi ohluza inkulumo waka Apple. Ubuza u-Siri umbuzo bese u-Siri ephendula ngobuhlakani. Ukuhluza inkulumo kuphendula okukhulunyiwe kube umbhalo.

The image above shows how Google helps predict the user’s question. Autocomplete and autocorrect are applications of text prediction, yet another application of NLP.

Isithombe esingenhla sikhombisa ukuthi u-Google usiza kanjani ukuthola umbuzo walowo owusebenzisayo. I-autocomplete kanye ne autocorrect angama-applications ahlunza imibhalo, okuyenye yezindlela zokusebenza kwe –NLP.
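To make the idea of text prediction concrete, here is a minimal sketch in Python. It is not how Google's autocomplete works; it is only a toy "bigram" model that counts which word tends to follow which in a tiny, made-up set of example sentences, and then suggests the most frequent continuations. The function names `train_bigrams` and `suggest` and the example sentences are illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training sentences."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def suggest(following, word, k=3):
    """Return up to k of the most frequent next words seen after `word`."""
    counts = following.get(word.lower(), Counter())
    return [w for w, _ in counts.most_common(k)]


# A tiny, invented training set (assumption: real systems learn from vast amounts of text).
corpus = [
    "what is zero divided by zero",
    "what is artificial intelligence",
    "what is natural language processing",
    "what is artificial light",
]

model = train_bigrams(corpus)
print(suggest(model, "is"))       # suggestions after the word "is"
print(suggest(model, "yoruba"))   # a word the model has never seen
```

The second call returns an empty list: the model has nothing to say about a word that never appeared in its training text, which is a small-scale picture of what happens to languages with little data.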

These are just a few examples of the many ways NLP has changed our lives and is still helping to improve our standard of living. 

Lezi ezinye zezindlela ezibonisa izindlela eziningi i-NLP esiguqule ngayo izimpilo zethu neqhubeka ngayo ukuphucula izimpilo zethu.

What about NLP for African languages?

Sikuphi ne-NLP nezilimu zase Afrika?

As Africans, it is important that we communicate in our African languages. Our African languages are our identity and our culture. While Tola can conveniently communicate with the food seller in the seller’s native language, German, thanks to Google Translate, it turns out that the reverse is not the case: the food seller is not able to communicate with Tola in her native language, Yoruba, because the translation performance is bad and unreliable (*sad*).

Njengomuntu wase Afrika, kubalulekile ukuxhumana ngezilimu zethu. Izilimu zethu zase Afrika ziwubuzwe kanye nosikompilo lwethu. Lapho u-Tola engaxhumanana kahle nomdayisi wokudla ngolwimi lomdayisi, isiJalimane, sibonga u-Google Translate, kuyacaca ukuthi akanakukwenza lokhu ngolwimi lakhe: umdayisi wokudla akakwazi ukuxhumana no Tola ngolwimi luka Tola, isi-Yoruba, ngenxa yokuthi ukuhumushwa kwalo akukho ezingeni futhi akwethembekile (*kuyajabhisa*)

Despite there being more than 2000 African languages, they are barely represented in our current language technologies: for example, existing speech recognition services (like Amazon’s Alexa, Apple’s Siri, and Google’s Home) do not currently support a single African language. This disparity excludes the speakers of these languages from the benefits of these artificial intelligence applications, thereby widening the existing digital divide.

Ngaphandle kokuthi sinezilwimi ezingaphezu kuka 2000 zase-Afrika, azimelelekile ngokwanele ebuchwephesheni besimanje: isibonelo, abahlunzi benkulumo abakhona (njengo Alexa we-Amazon, u-Siri we-Apple, kanye no-Google Home) awaziseki izilimu zase Afrika. Lokhu kwahlukana kuzibeka ngaphandle lezi zilimu ekuzuzeni kwi-artificial intelligence applications, lokhu kukhulisa uqhekeko lwe-digital divide.

When we dig deeper into these wonderful achievements, we discover that only a small percentage of the world’s more than 7000 languages are represented in the rapidly evolving language technologies and applications. The remaining languages, called low-resource languages, are largely excluded from these technologies. African languages fall into this category.

Uma siqhubeka siphanda ngalemiphumela emihle, sithola ukuthi iphesenti elincane lezilimu zomhlaba ezingaphezu kuka 7000 ezimelelekile ezinguqukweni zezilimu kubuchwepheshe nama applications. Izilimu ezisele, ezibizwa nge low-resource languages, zivalelwa ngaphandle kulobuchwepheshe bezilimu. Izilimu zase Afrika ziwela kulomkhakha.

The language technologies and architectures built for NLP were designed around Western languages, with little to no consideration for the linguistic features of African languages and the needs of African communities. That is why, for example, even a state-of-the-art transcription model such as OpenAI’s Whisper does a terrible job with African languages (see below).

Ubuchwepheshe bezilimu kanye nezakhiwo ezakhiwele i- NLP zazakhelwe izilimu zasentshonalanga, kunganakekelwanga nakancane izimo zezilimu zase Afrika kanye nezidingo zemiphakathi yase Afrika. Yingakho nje, isibonelo, ngisho izihumushi ezingcono, ze-OpenAI Whisper, yenza umsebenzi omubi ngezilimu zase Afrika (bheka ngezansi).

One negative result is that Africans are often misunderstood by these models. Alexa and I are always at loggerheads. Several times I have told Alexa to play some of my favorite Afrobeat songs, but it either plays the wrong song or tells me it does not understand me. The powerful Google Translate also falls short, giving incorrect translations for African languages. These are just some of the many shortcomings of current NLP tools.

Okunye okubi ngalemiphumela ukuthi ama-Afrika awaqondwa yilama-models. Mina no Alexa sihlezi siphambana njalo. Izikhathi eziningi ngitshela u-Alexa ukuthi angidlalele umculo engiwuthandayo we Afrobeat, kodwa adlale iculo okungeyilo noma angitshele ukuthi akangiqondi. I-Google Translate enamandla nayo ikha phansi ekunikezeni ukuhumusha izilimu zase Afrika. Lezi ezinye zezindlela ehluleka ngayo i-NLP .

Another negative result is that these NLP models can portray Africa and the African context incorrectly. For example, in this video, a researcher shows how a text-to-image model, an NLP model trained to generate an image from a text description of it (amazing, right?), produces the wrong image of a wedding in Sudan.

Ezinye zezindlela ezimbi ezokuthi le-NLP models ingabeka kabi i-Afrika kanye nokuqonda kwe-Afrika ngendlela okungeyiyo. Isobonelo, kule-video, umcwaningi ukhombisa ukuthi uhlobo lwemibhalo eya ezithombeni (text-to-image model), loluhlobo lwe NLP luqeqeshelwe ukwenza izithombe ngemibhalo enikeziwe eyincazelo yayo (kuyamangaza, right?), lwenza izithombe ezingeyizo zomshado eSudan.

Representing African Languages and Cultures in the Digital World

Ukumelwa kwezilimu zase Afrika namasiko kuMhlaba we Digital

Below, we discuss some of the root causes of the numerous NLP challenges faced by African languages.

Ngezansi, sidingida ezinye zezimbangela zezinqinamba ze-NLP ezibhekene nezilimu zase Afrika.

The Challenges of Digitizing African Languages and Cultures

Izingqinamba zoku-Digitizing Izilimu zaseAfrika namasiko

African languages are often underrepresented in technological environments, leading to a lack of support for these languages in many digital tools. For instance, how easy is it for you to communicate online in your native African language? Many keyboards do not have the necessary diacritical marks to properly represent African languages like Yorùbá, which uses tonal and orthographic diacritics to differentiate words with similar spellings but different meanings. Without these marks, it is very difficult to properly digitize these languages.

Izilimu zase Afrika isikhathi esiningi azivezwa ngendlela eyiyo ezindaweni zobuchwepheshe, okuholela ekungesekweni kwalezizilimu kuma-digital tools amaningi. Ukubonisa, kulula kangakanani kuwena ukuxhumana online usebenzisa ulwimi lakho lwase Afrika? Ama-keyboards amaningi awanazo izinsiza-zimpawu ezenza kubhalwe izilimu zase Afrika ngendlela njenge Yorùbá, esebenzisa i-tonal kanye ne orthographic diacritics ukwehlukanisa amagama anezipelingi ezifanayo kodwa esho izinto ezehlukene. Ngaphandle kwalezizimpawu, kunzima uku-digitize lezi zilimu ngendlela eyiyo.
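The effect of missing diacritics can be shown with a short Python sketch using only the standard `unicodedata` module. It takes four Yorùbá words that differ only in their marks, strips the marks the way a keyboard or text pipeline without diacritic support effectively does, and checks what is left. The English glosses are the commonly cited ones and should be checked against a dictionary; the helper name `strip_diacritics` is an assumption made for illustration.

```python
import unicodedata


def strip_diacritics(text):
    """Drop every combining mark (tone marks and under-dots alike)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# Four Yorùbá words, written with escapes so the code points are explicit.
# U+1ECD is "o with dot below"; U+0300 and U+0301 are the combining grave and acute accents.
words = {
    "\u1ecdk\u1ecd": "husband",
    "\u1ecdk\u1ecd\u0301": "hoe",
    "\u1ecdk\u1ecd\u0300": "vehicle",
    "\u1ecd\u0300k\u1ecd\u0300": "spear",
}

# Without their marks, the four distinct words become indistinguishable.
collapsed = {strip_diacritics(w) for w in words}
print(collapsed)

# The same visible letter can also be typed as different code point sequences.
typed_one_way = "\u1ecd\u0301"   # dotted o followed by a combining acute
typed_another = "\u00f3\u0323"   # acute o followed by a combining dot below
print(typed_one_way == typed_another)
print(unicodedata.normalize("NFC", typed_one_way)
      == unicodedata.normalize("NFC", typed_another))
```

All four words collapse to the single string `oko`, so any dataset collected without diacritics loses the distinction between them. The last two lines show a related practical point: two differently typed sequences for the same letter only compare equal after Unicode normalization, so text collected from different keyboards usually needs to be normalized before it is used for training.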

Another major factor is the heavy reliance on text for digitization. In contrast, many African cultures rely heavily on oral communication, which is incompatible with text-based computing systems. This means that much of the information and knowledge transmitted through these cultures cannot be easily represented in digital form. As a result, there is a lack of African-focused content on the web, as it is difficult to represent these languages and cultures in a digital format.

Enye ingqinamba enkulu ukwencika kwimibhalo ukwenza i-digitization. Ukuqhathanisa, amasiko amaningi ase-Afrika ancike ekuxhumaneni okungabhaliwe (oral communication), okungahambisani nemibhalo yamakhompyutha. Lokhu kusho ukuthi iningi lolwazi nokwazi oludluliswa ilamasiko angeke lubekeke kalula i-digital form. Ngokomphumela, kunokungabibikho kokuqukethwe yi-internet okuqondene ne-Afrika, ngenxa yokuthi kunzima ukubeka lezizilimu namasiko kwi-digital format.

Lack of African-centric NLP training datasets

Ukungabibikho kweAfrika ku-NLP nokuqeqeshwa kolwazi

In order to build NLP models of any kind, you need lots of data to train the model. The model is only as good as the data it is trained on. Our current NLP applications are amazing for western languages like English, Spanish, French, and German because there is a lot of training data for these languages. On the other hand, the scarcity of training datasets for African languages limits researchers’ ability to conduct NLP research and develop language technologies.

Ukuze kwakheke iNLP yalolonke uhlobo, udinga inqwaba yolwazi ukuze uqeqeshe isifanekiso. Isifanekiso siyoba sihle njengolwazi esiqeqeshwe kulo. I-NLP applications ekhona kuyimanje yinhle kwizilimu zasentshonalanga njengesiNgisi, isiSpanishi, isiFrentshi kanye nesiJalimane ngoba kuningi ukuqeqeshwa kolwazi kulezizilimu. Kwelinye icala, ukungabibikho  kokuqeqeshwa kolwazi ezilimini zase Afrika kuvimba abacwaningi ukuth benze iNLP bacwaninge bakhulise ubuchwepheshe bezilimu.

Africa is becoming increasingly English-centric, partly due to her colonization by Western powers. This has led to the widespread adoption of English as the dominant language in many areas, including business, education, and technology. This dominance of English has had many negative effects, including the marginalization of other languages and cultures. It has also contributed to the globalization of Western culture and values, which can be seen in the way that English is often treated as a requirement for success in many fields. This English-centric world is a legacy of colonialism, and it continues to shape the way we think and communicate today. It has also led to more content being produced in English and less in African languages. This pervasive scarcity of training datasets in African languages is well summarized by Joshi et al. in the image below.

I-Afrika iya ngokuya incika esiNgisini, ngenxa yokucindezelwa abaseNtshonalanga. Lokhu sekwenze ukuthi kwamukelwe isiNgisi njengolimu olusebenziswa ezindaweni eziningi, okubalwa kuzo amabhizinisi, imfundo, kanye nobuchwepheshe. Lokhu kuqhwakela kwesiNgisi sekube nemiphumela engemihle, okubalwa kuyo ukucwaswa kwezinye izilimu kanye namasiko. Kuphinde kwenza ukufakwa umhlaba wonke kwamasiko aseNtshonalanga kanye namagugu, okubonakala lapho isiNgisi sibonakala njengesidingo ukuze uphumelele ezindimeni eziningi. Lokhu kuncika esiNgisini komhlaba kuyifa lencindezelo, futhi kuyaqhubeka nokubumba indlela esicabanga ngayo nesixhumana ngayo namhlanje. Sekuholele ekutheni kube nemikhiqizo yolimi lwesiNgisi kuthi eyezilimu zaseAfrika ibe mncane. Loku kusabalala kokungabibikho kokuqeqeshwa kolwazi ngezilimu zase Afrika kucutshungulwa kahle u- Joshi et al. esithombeni esingezansi.

The authors divide languages into six classes according to their digital status and ‘richness’ in terms of data availability. The majority of African languages fall into The Left Behinds category, which comprises languages with little to no unlabeled training data.

Ababhali bahlukanise i-digital status kanye nokujula kwezilimu ngomongo wokubakhona kolwazi yaba ngamakilasi angu 6. Sibona iningi lezilimu zase Afrika ziwela kulengxenye ebizwa ngokuthi The Left Behinds, ezinolwazi oluncane olungachaziwe lokuqeqeshwa.

Low discoverability of existing efforts on African languages

Amazinga aphansi okutholakala kwemizamo ngezilimu zase Afrika

The few resources (research papers and datasets) available for African languages are difficult to come by. Access to language data for a specific country often requires affiliation with a specific academic institution in that country. This reduces the ability of countries and institutions to combine their knowledge and datasets to achieve better performance and innovation. Existing research is also hard to find because it is frequently published in smaller African conferences or journals that are not electronically accessible or indexed by research tools such as Google Scholar.

Izinsiza ezincane (Amaphepha ocwaningo kanye nolwazi) ezikhona ngezilimu zase Afrika akulula ukuzithola. Izikhathi eziningi, ukuthola ulwazi ngezilimi kulelozwe kudinga ukwencika esikhungweni semfundo esithile kulelozwe. Lokhu kunciphisa amathuba amazwe kanye nezikhungo ekuhlanganiseni ulwazi lwazo kanye nemiqingo ukuze zithole ukusebenza kancono nemikhuba eqanjwe kabusha. Ucwaningo olukhona kunzima ukuluthola ngenxa youkuthi kaningana lushicilelwe kwizinkomfa ezincane zase Afrika noma kushicilelo olungatholakali ngekhompyutha noma oluhlelwe amathuluzi ocwaningo afana no Google Scholar.

Chris Emezue of Lanfrica elaborates on this topic at CarpentryCon 2022, hosted by The African Carpentries Community on Thursday, August 11, 2022.

U-Chris Emezue we Lanfrica uchaza kabanzi ngalesisihloko ku-CarpentryCon 2022, eyayihlelwe yi-African Carpentries Community ngoLwesine, August 11 2022.

Efforts to build language technologies for African languages

Imizamo yokhwakha ubuchwepheshe kwezilimu zase Afrika

No one knows African languages better than Africans; therefore, there is a need for more Africans, especially young people, to learn their languages. In order to bring African languages back to life and increase their use, governments should implement language policies that support and promote these languages. This can include teaching African languages in schools, documenting texts in these languages, and recording the richness and beauty of African cultures and traditions. By embracing and supporting African languages, we can improve the state of natural language processing in Africa and create effective language technologies that suit the needs of the African people.

Akekho owazi izilimu zase Afrika njengamaAfrika; ngakhoke, sikhona isidingo samaAfrika amaningi, ikakhulukazi intsha, ukufunda izilimu zazo. Ukuze kulethwe izilimu zase Afrika empilweni kanye nokukhuliswa kokusetshenziswa kwazo, uhulumeni makenze izinqubo mgomo zolwimi ezeseka futhi zikhulise lezi zilimu. Lokhu kungafaka phakathi ukufundiswa kwezilimu zase Afrika ezikoleni, ukubeka imibhalo ngalezi zilimu, kanye nokuqopha ubunzulu kanye nobuhle bamasiko ase-Afrika. Ngokwemukela nokweseka izilimu zase-Afrika, singathuthukisa isimo sokusebenza kwezilimu e-Afrika sakhe  nobuchwepheshe bezilimu obufanele abantu base Afrika.

Science and technology should be taught extensively in schools. More educational opportunities, such as bootcamps, mentorship programmes and academies, should also be available and accessible in Africa. In the Western world, we hear of children who started coding at an early age. African children need access to more computing resources and to network and internet connectivity so that they get the same exposure as their Western counterparts.

I-Sayensi kanye nobuchwepheshe akufundiswe ikakhulu ezikoleni. Okunye, amathuba okufundisa afana no-bootcamp, izinhlelo ze-mentorship kanye nezikole eziphakemeyo zemfundo azibe khona zitholakale e-Afrika. Entshonalanga, sizwa ngezingane eziqale i-coding zisencane. Sidinga ukuthola amakhompyutha nezinsiza kanye nenethiwekhi nokuxhumana kwe-internet kwezingane zase Afrika ukuze nazo zivuleleke emathubeni njengezase ntshonalanga.

A boy and a woman use a laptop with connectivity in Mawingu, Kenya. | Umfana nentombazane basebenzisa i-laptop enokuxhumana eMawingu, Kenya

Many communities and researchers around the world are working to make it easier for African languages to be used in language technologies. One such community is Lanfrica. Lanfrica aims to accelerate the development of AI applications in Africa and other under-represented regions by creating large, high-quality African machine learning datasets and building a data-centric foundational platform to assist enterprises in accelerating their AI applications.

Imiphakathi eminingi nabacwaningi emhlabeni wonke basebenzela ukwenza lula ukusetshenziswa kwezilimu zaseAfrika kubuchwepheshe. Enye yaleyo mphakathi i-Lanfrica. I-Lanfrica ihlose ukukhulisa ukudlondlobala kwe-AI applications e-Afrika nasezifundeni ezingamelelekile, ngokusungula enkulu, nesezingeni eliphezulu imishini yase Afrika efundisa nokwakha ulwazi oluyisisekelo nkundla oluzosisa amabhizinisi ekukhuliseni i-AI applications.

At its official launch in February 2022, Lanfrica introduced an aggregator platform that curates and links African resources, making them discoverable. For instance, if you’re looking for resources (linguistic datasets or research papers) in a particular African language, Lanfrica will point you to the different sources on the web that have such resources in the desired language.

Ngokuvulwa kwayo ngo February 2022, i-Lanfrica yakha i- aggregator platform ehlanganisa futhi ixhumanise izinsiza zase-Afrika, ziwenza atholakale. Ukubonisa nje, uma ufuna izinsiza (linguistic datasets noma imibhalo yocwaningo) ngolwimi oluthize lwaseAfrika, i-Lanfrica ingakukhombisa imithombo eyahlukene kwi-intanethi ezinalolulwazi ngolwimi olufunayo.

Lanfrica adopts a participatory approach by allowing the general community to contribute resources. With its Slack community, Lanfrica offers a space where researchers, government officials, students and others working on or looking for African language resources can connect. The team also uses Twitter and a blog to spread awareness about African language resources. One such effort is the Lanfrica Talks series, which highlights and showcases language technology efforts (research, projects, software, applications, datasets, models, initiatives, etc.) geared towards under-represented languages around the world.

I-Lanfrica yamukela indlela yokubambisana ngokuvumela umphakathi ukuthi ufake izinsiza. Ne-Slack community, i-Lanfrica inikeza indawo ekahle kubacwaningi, izikhulu zikahulumeni, abafundi, etc. besebenza noma bebheka izinsiza zezilimu zase-Afrika abangazihlanganisa. Iqembu  libuye lisebenzise u-Twitter kanye ne-blog ukudlulisa ukuqwashisa ngezinsiza zezilimu zase-Afrika. Enye yalezizindlela zokuqwashisa i-Lanfrica Talks series neqhakambisa futhi ibonise imizamo yobuchwepheshe bolimu (ucwaningo, izinhlelo, i-software, applications, datasets, models, initiatives, etc,) ezibhekene nezilimu ezingamelelekile emhlabeni wonke.

It is important to note that Lanfrica isn’t the only organization attempting to develop NLP tools for African languages. In fact, there are many organizations, communities, and individuals working on this important task.

Kubalulekile ukubeka ukuthi i-Lanfrica akuyona yodwa inhlangano kuphela ezama ukukhulisa i-NLP tools ukwenzela izilimu zase-Afrika. Eqinisweni, ziningi izinhlangano, imiphakathi, kanye nabantu ngabanye abasebenza kulomsebenzi obalulekile.

Masakhane is an online, community-led, open-source research effort aimed at building and facilitating a community of NLP researchers for African languages. Masakhane roughly translates to “we build together” in isiZulu. Their goal is for Africans to shape and own these technological advances towards human dignity, well-being and equity, through inclusive community building, open participatory research and multidisciplinarity.

I-Masakhane iholwa wumphakathi kwi-intanethi, ucwaningo lwe open-source oluqondene nokwakha nokuhlumelelisa umphakathi we-NLP nabacwaningi bezilimu zase-Afrika. I-Masakhane itolikwa ngokuthi “siyakha ndawonye” ngesiZulu. Inhloso ukuthi ama-Afrika akhe futhi aphathe ukukhula kwalobuchwepheshe ekutholeni ukuhlonipheka komuntu, ukuba-kahle kanye nokulingana, ngokwakhiwa kwemiphakathi engacwasi, ucwaningo oluvulelekile kanye ne-multidisciplinarity.

Deep Learning Indaba is an organization whose mission is to strengthen machine learning and artificial intelligence in Africa. They work towards the goal of Africans being not only observers and receivers of the ongoing advances in AI, but also active shapers and owners of these technological advances.

I-Deep Learning Indaba inhlangano enhloso yayo ukuqinisa ukufunda kwemishini kanye ne artificial intelligence e-Afrika. Basebenzela ukubona ama-Afrika bengabi kuphela izibukeli kanye nabamukeli bokuqhubeka kwe AI, kodwa bengabakhi nabaphathi balokukukhula bobuchwepheshe.

Zindi is the first data science competition platform in Africa. It gives organizations and governments access to world-class machine learning and AI solutions, and gives African data scientists a place to learn new skills, grow, and access work opportunities.

I-Zindi ingumncintiswano wokuqala wolwazi e-Afrika. I-Zindi inikeza izinhlangano kanye nohulumeni intuba ye-world-class yokufunda kwemishini kanye nezixazululo ze-AI, kanye nokunikeza ama-data scientists ase Afrika indawo yokufunda amakhono amasha, akhule, athole namathuba omsebenzi.

Black in AI is a collaboration between multiple institutions that is focused on increasing the presence of Black individuals in the field of AI. The initiative provides a platform for sharing ideas, fostering collaborations, and discussing initiatives that can help to achieve this goal. Black in AI is a transcontinental effort, meaning that it involves participants from various regions around the world. By providing a space for Black individuals in AI to come together and work towards a common goal, Black in AI aims to help make the field of AI more diverse and inclusive.

I-Black in AI ukuhlanganyela phakathi kwezikhungo eziningi okubhekene nokwandisa ukubakhona kwabantu abaMnyama emkhakheni we AI. Loluhlelo lunikeza inkundla yokwabelana ngemiqondo, ukwakha ukubambisana, nokudingida izindlela ezingasiza kufinyelelwe kulomgomo. I-Black in AI iwumzamo ofaka amakhontinenti amaningi, okusho ukuthi ifaka abahlanganyeli abaqhamuka kumarijini omhlaba wonke. Ngokunikeza indawo abantu abaMnyama kwi AI ukuhlangana basebenzele inhloso eyodwa, i-Black in AI ihlose ukusiza yakhe inkundla ye-AI eyahlukahlukene nehlanganisayo.

DSN (Data Scientists Network) is Africa’s leading Artificial Intelligence (AI) technology enterprise, committed to building Africa’s AI talent ecosystem and developing solutions for governance, education, health, retail, and finance.

I-DSN (Data Scientists Network) ihamba phambili e-Afrika ku-Artificial Intelligence (AI) kwezobuchwepheshe nezimisele ukwakha amakhono e-AI e-Afrika, nokukhulisa izixazululo zobuhulumeni, imfundo, ezempilo, uhwebo, kanye nezezimali.

In addition, there are numerous language-specific organizations, such as Lesan AI, GhanaNLP, EthioNLP, Igwebuike community, IgboNLP, and YorubaNames, to mention just a few.

Ekwandiseni, kunenqwaba yezinhlangano ebhekene nezilimu njenge Lesan AI, GhanaNLP, EthioNLP, Igwebuike community, IgboNLP, YorubaNames, ukubala nje ezimbalwa.

Conclusion

Ukugoqa

In conclusion, the development of artificial intelligence applications for African languages is crucial for the preservation and promotion of these languages. Through natural language processing, we can enable machines to understand and interact with African languages, just as they do with English and other widely spoken languages. This will open up new opportunities for communication, collaboration, and education, and will help to level the playing field for African languages in the global marketplace of ideas. We must continue to support and invest in the development of AI for African languages, and work together to build a brighter future for these rich and diverse cultures.

Ekugoqeni, ukuthuthukiswa kwe-artificial intelligence applications ezilimini zase Afrika kubalulekile ukulondoloza nokuthuthukisa lezi zilimu. Ekuhlumelelisweni kolwimi lwemvelo, singavumela imishini ukuba iqonde futhi enzelana nezilimu zase-Afrika, njengoba benza ngesiNgisi kanye nezinye izilimu ezikhulunywayo. Lokhu kuzovula amathuba ezokuxhumana, ukubambisana, kanye nemfundo, kuzophinde kusize ekulinganiseni inkundla ezilimini zase Afrika kumakethe yomhlaba yemiqondo. Kumele siqhubeke ukweseka nokutshala ekukhulisweni kwe-AI ezilimini zase Afrika, sisebenzisane ekwakheni ikusasa elikhanyayo lalemasiko ahlukahlukene.

© 2023
Archive About Contact Africa Open Institute