CHRIS EMEZUE & IYANUOLUWA SHODE
AI and African Languages: Empowering Cultures and Communities
I-AI kanye neziLimu zaseAfrika : Ukunikeza amandla amasiko kanye nemiphakathi
African intellectuals must do for their languages and cultures what all other intellectuals in history have done for theirs.
Ngugi wa Thiong’o
Izinjulabuchopho zase Afrika kumele zenzele izilimi namasiko azo lokho okwenziwa ngezinye izinjulabuchopho emlandweni zenzela amasiko nezilimi zawo.
Ngugi wa Thiong’o
During the summer of 2022, Tola traveled to Vienna for a conference. The next day, she went out to get food, but she couldn’t understand the local variant of German. She only speaks English and Yoruba. Thankfully, she had Google Translate on her phone, which allowed her to communicate with the waiter and get the food she needed.
Ngehlobo lika 2022, u-Tola wahambela e-Vienna eya kwi-nkomfa. Ngosuku olulandelayo, waphuma wayozitholela ukudla, kodwa akakwazanga ukuqonda ulwimi lwakhona lwesiJalimane. Ukhuluma kuphela isingisi nesi-Yoruba. Siyabonga, wayeno Google Translate efonini yakhe.
Google Translate is one of the many applications of a revolutionary technology that is changing the world today: artificial intelligence. You might ask: what does this mean? How can intelligence be artificial? Let me give you a short overview.
I-Google Translate ingenye yama-applications eshintsha ubuchwepheshe neshintsha umhlaba namhlanje, Artificial intelligence. Ungazibuza ukuthi lokhu kusho ukuthini? Kwenzeka kanjani ukuthi ubuhlakani bungabi obeqiniso? Ithi ngikubonise kancane.
The term “artificial intelligence” itself was coined by John McCarthy for the 1956 Dartmouth workshop on the subject; he later taught at the Massachusetts Institute of Technology.
Itemu “artificial intelligence” ngokwalo lasungulwa ngo 1956 usolwazi wase Massachusetts Institute of Technology, u- John McCarthy.
In simple terms, artificial intelligence (otherwise known as AI) is defined as “getting a computer to do ‘intelligent’ things that people do”. I am sure you have seen sci-fi movies that feature robots acting like humans (haha). By automating many tasks with high precision, AI can greatly increase human productivity.
Ngamagama alula, i-artificial inteligence(eyaziwa futhi ngokuthi i-AI) ichazwa ngokuthi “ukwenza ikhompyutha yenze izinto ezinobuhlakani ezenziwa ngabantu”. Ngiqinisekile unamafilimu e-sci-fi anama-robhothi aziphathisa okwabantu(ha ha). Umkhiqizo wabantu ukhuphuka ngezinga eliphezulu ngokusebenzisa imishini eshaya emhlolweni.
This field has grown enormously from what it was six decades ago to become one of the biggest domains of technology, one that now permeates almost all facets of human life. Artificial intelligence is actively present in our lives and is playing a huge role in the Fourth Industrial Revolution. It encompasses a broad spectrum of different technologies and applications. Natural language processing, the focus of this article, is one such application: it deals with languages and machines.
Lendima seyidlondlobale kulokhu eyayiyikho emashumini ayisithupha eminyaka edlule yaba enye yezindima ezinkulu zobuchwepheshe ethinta yonke impilo yabantu. I-Artificial intelligence ikhona ezimpilweni zethu manje futhi idlala indima enkulu kwi-Fourth Industrial Revolution. Yengamele izindawo eziningi ezahlukene zobuchwepheshe kanye nama-applications. Ukuhlunzwa kolimi lwemvelo, okuyingqikithi yalombalo, ingezinye zama-applications abhekene nezilimi kanye nemishini.
NLP is the use of language with technology
I-NLP ukusetshenziswa kolwimi kanye nobuchwepheshe.
We humans communicate through language. As members of human society, we are connected in a plethora of ways. One of these ways is through communication. With the aid of language, we are able to express ourselves. To enable computers/machines to interact with humans (which is a necessity for artificial intelligence), computers need to understand the natural languages used by humans. Natural Language Processing (NLP, for short) is a form of AI that teaches machines to read or recognize text and voice, extract value from it, and potentially convert the information into a desired output format, such as text, voice, images, and even videos. NLP is the use of language with technology. The ultimate goal of NLP is to help computers understand language as well as humans do. This is one major step in helping computers attain artificial intelligence.
Thina bantu sixhumana ngezilimi. Njengamalunga abantu, sixhumene ngezindlela eziningi. Enye yalezizindlela ukukhulumisana. Ngokusizwa ulwimi, siyakwazi ukuzwakalisa ubuthina. Ukuvumela amakhompyutha/imishini ukuthi ixhumane nabantu (okuyinto ebalulekile kwi artificial intelligence), amakhompyutha kumele aqonde izilimu zemvelo ezisetshenziswa ngabantu. I-Natural Language Processing(kafuphi i-NLP) iwuhlobo lwe AI efundisa imishini ukufunda nokubona imibhalo kanye nezwi, ihluze okubalulekile kuyona, bese iba namandla okuphendula lolo lwazi lube ilento efunekayo, njenge mibhalo, izwi, izithombe, kanye nezithombe ezinyakazayo(video). I-NLP ukusetshenziswa kolwimi ngobuchwepheshe. Inhloso enkulu ye NLP ukusiza amakhompyutha aqonde ulwimi ngendlela abantu abaliqonda ngayo. Ilona gxathu elikhulu elizosiza amakhompyutha athole I artifical intelligence.
NLP in our lives
I-NLP ezimpilweni zethu
The importance of communication has led to the widespread use of NLP. Millions of users can now turn to NLP to automate mundane tasks.
Ukubaluleka kokuxhumana sekuholele ekwandeni kokusetshenziswa kwe-NLP. Izinkulungwana zabasebenzisi sebengasebenzisa i-NLP ukwenza imisebenzi emincane,
Remember our story about Tola and Google Translate? Google Translate is one of the many applications of NLP. It is a deployment of machine translation, an important NLP task that involves translating text from one language to another. Thanks to this, Tola can confidently walk into a store in Vienna (or in many other countries) and get groceries with the help of her mobile phone.
Uyayikhumbula indaba yethu ngo Tola ne Google Translate? I-Google Translate iyenye yezindlela zokusebenza kwe-NLP. Iwukusebenziswa kokuhumusha komshini, enye yezindlela ezibalulekile ze-NLP ezifaka ukuhumusha imibhalo isuka kolunye ulwimi iya kolunye.
Have you watched Hustle? If you haven’t already, I strongly advise you to do so (*wink*). In the movie, protagonist Stanley (played by Adam Sandler) works for a big-time NBA club, the Philadelphia 76ers, and he travels across continents scouting professional players for the team. He gets to Spain and is fascinated by a player whom he sees playing in a street basketball match. To communicate with this player, he uses his mobile phone as his mouthpiece, since he speaks only English and the player speaks only Spanish. The phone has an app that provides machine translation and speech synthesis. Speech synthesis converts written text to speech.
Usuke wayibukela i-Hustle? Uma ungakayibuki, ngiyakweluleka ukuthi wenze njalo (*ecifa ihlo*). Kuleyo filimu, u-Stanley usebenzela iqembu elikhulu le NBA, i.-Philadelphia 76ers, uhamba amazwe ngamazwe efuna abadlali abasezingeni eliphezulu. Wafika e-Spain wahlabeka umxhwele ngumdlali odlala kulokhu okwaziwa nge-street basketball. Ekuqaleni, ukuxhumana nalomdlali, wasebenzisa ifoni yakhe, njengoba wayekhuluma isingisi kuphela umdlali ekhuluma iSipanishi. I-app yayinamandla omshini okuhumusha kanye nokuhlunza inkulumo. Ukuhlunza inkulumo kuphendula imibhalo ibe inkulumo.
“What is zero divided by zero?” This is reportedly one of the questions most often put to Siri, Apple’s voice assistant. You ask Siri a question and Siri answers intelligently. Behind this is speech recognition, which converts speech to text.
“Kuphumani masihlukanisa u-zero ngo-zero?” Lona ngumbuzo obuzwa kakhulu ku-Siri – okuwumsizi ohluza inkulumo waka Apple. Ubuza u-Siri umbuzo bese u-Siri ephendula ngobuhlakani. Ukuhluza inkulumo kuphendula okukhulunyiwe kube umbhalo.
The image above shows how Google helps predict the user’s question. Autocomplete and autocorrect are applications of text prediction, yet another application of NLP.
Isithombe esingenhla sikhombisa ukuthi u-Google usiza kanjani ukuthola umbuzo walowo owusebenzisayo. I-autocomplete kanye ne autocorrect angama-applications ahlunza imibhalo, okuyenye yezindlela zokusebenza kwe –NLP.
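To make text prediction a little more concrete, here is a minimal sketch in Python. It is not how Google's autocomplete works; it only illustrates the basic idea of suggesting the next word from counts of which words followed which in some training text. The tiny corpus and the function names (train_bigrams, suggest) are made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def suggest(following, word, k=3):
    """Return up to k of the words most often seen right after `word`."""
    counts = following.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common(k)]


# Hypothetical toy corpus, standing in for the huge query logs a real system uses.
corpus = [
    "what is zero divided by zero",
    "what is machine translation",
    "what is natural language processing",
    "what is zero",
]

model = train_bigrams(corpus)
print(suggest(model, "is"))       # most frequent continuation comes first
print(suggest(model, "yoruba"))   # unseen word: no suggestions
```

Notice that the sketch can only suggest words it has already seen. A language with very little text to learn from gets poor suggestions, or none at all.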
These are just a few examples of the many ways NLP has changed our lives and is still helping to improve our standard of living.
Lezi ezinye zezindlela ezibonisa izindlela eziningi i-NLP esiguqule ngayo izimpilo zethu neqhubeka ngayo ukuphucula izimpilo zethu.
What about NLP for African languages?
Sikuphi ne-NLP nezilimu zase Afrika?
As Africans, it is important for us to communicate in our own languages. Our African languages are our identity and our culture. While Tola can conveniently communicate with the food seller in the seller’s native language, German, thanks to Google Translate, it turns out that the reverse is not the case: the food seller is not able to communicate with Tola in her native language, Yoruba, because the translation performance is bad and unreliable (*sad*).
Njengomuntu wase Afrika, kubalulekile ukuxhumana ngezilimu zethu. Izilimu zethu zase Afrika ziwubuzwe kanye nosikompilo lwethu. Lapho u-Tola engaxhumanana kahle nomdayisi wokudla ngolwimi lomdayisi, isiJalimane, sibonga u-Google Translate, kuyacaca ukuthi akanakukwenza lokhu ngolwimi lakhe: umdayisi wokudla akakwazi ukuxhumana no Tola ngolwimi luka Tola, isi-Yoruba, ngenxa yokuthi ukuhumushwa kwalo akukho ezingeni futhi akwethembekile (*kuyajabhisa*)
Although there are more than 2000 African languages, they are barely represented in our current language technologies: for example, at the time of writing, popular voice assistants (like Amazon’s Alexa, Apple’s Siri, and Google Home) do not support a single African language. This disparity excludes the speakers of these languages from the benefits of these artificial intelligence applications, thereby widening the existing digital divide.
Ngaphandle kokuthi sinezilwimi ezingaphezu kuka 2000 zase-Afrika, azimelelekile ngokwanele ebuchwephesheni besimanje: isibonelo, abahlunzi benkulumo abakhona (njengo Alexa we-Amazon, u-Siri we-Apple, kanye no-Google Home) awaziseki izilimu zase Afrika. Lokhu kwahlukana kuzibeka ngaphandle lezi zilimu ekuzuzeni kwi-artificial intelligence applications, lokhu kukhulisa uqhekeko lwe-digital divide.
When we dig deeper into these wonderful achievements, we discover that only a small percentage of the world’s over 7000 languages are represented in the rapidly evolving language technologies and applications. The remaining languages, called low-resource languages, are largely excluded from these language technologies. African languages fall into this category.
Uma siqhubeka siphanda ngalemiphumela emihle, sithola ukuthi iphesenti elincane lezilimu zomhlaba ezingaphezu kuka 7000 ezimelelekile ezinguqukweni zezilimu kubuchwepheshe nama applications. Izilimu ezisele, ezibizwa nge low-resource languages, zivalelwa ngaphandle kulobuchwepheshe bezilimu. Izilimu zase Afrika ziwela kulomkhakha.
The language technologies and architectures built for NLP were designed for Western languages, with little to no consideration for the linguistic features of African languages and the needs of African communities. That is why, for example, even a leading transcription model such as OpenAI’s Whisper does a terrible job with African languages (see below).
Ubuchwepheshe bezilimu kanye nezakhiwo ezakhiwele i- NLP zazakhelwe izilimu zasentshonalanga, kunganakekelwanga nakancane izimo zezilimu zase Afrika kanye nezidingo zemiphakathi yase Afrika. Yingakho nje, isibonelo, ngisho izihumushi ezingcono, ze-OpenAI Whisper, yenza umsebenzi omubi ngezilimu zase Afrika (bheka ngezansi).
One negative result is that Africans are often misunderstood by these models. Alexa and I are always at loggerheads. Several times I have told Alexa to play some of my favorite Afrobeat songs, but it either plays the wrong song or tells me it does not understand me. The powerful Google Translate also falls short, giving incorrect translations for African languages. These are just some of the many shortcomings of today’s NLP.
Okunye okubi ngalemiphumela ukuthi ama-Afrika awaqondwa yilama-models. Mina no Alexa sihlezi siphambana njalo. Izikhathi eziningi ngitshela u-Alexa ukuthi angidlalele umculo engiwuthandayo we Afrobeat, kodwa adlale iculo okungeyilo noma angitshele ukuthi akangiqondi. I-Google Translate enamandla nayo ikha phansi ekunikezeni ukuhumusha izilimu zase Afrika. Lezi ezinye zezindlela ehluleka ngayo i-NLP .
Another negative result is that these NLP models could portray Africa and the African context incorrectly. For example, in this video, a researcher shows how a text-to-image model, an NLP model trained to generate an image from a text description of it (amazing, right?), produces the wrong image of a wedding in Sudan.
Ezinye zezindlela ezimbi ezokuthi le-NLP models ingabeka kabi i-Afrika kanye nokuqonda kwe-Afrika ngendlela okungeyiyo. Isobonelo, kule-video, umcwaningi ukhombisa ukuthi uhlobo lwemibhalo eya ezithombeni (text-to-image model), loluhlobo lwe NLP luqeqeshelwe ukwenza izithombe ngemibhalo enikeziwe eyincazelo yayo (kuyamangaza, right?), lwenza izithombe ezingeyizo zomshado eSudan.
Representing African Languages and Cultures in the Digital World
Ukumelwa kwezilimu zase Afrika namasiko kuMhlaba we Digital
Below, we discuss some of the root causes of the numerous NLP challenges faced by African languages.
Ngezansi, sidingida ezinye zezimbangela zezinqinamba ze-NLP ezibhekene nezilimu zase Afrika.
The Challenges of Digitizing African Languages and Cultures
Izingqinamba zoku-Digitizing Izilimu zaseAfrika namasiko
African languages are often underrepresented in technological environments, leading to a lack of support for these languages in many digital tools. For instance, how easy is it for you to communicate online in your native African language? Many keyboards do not have the necessary diacritical marks to properly represent African languages like Yorùbá, which uses tonal and orthographic diacritics to differentiate words with similar spellings but different meanings. Without these marks, it is very difficult to properly digitize these languages.
Izilimu zase Afrika isikhathi esiningi azivezwa ngendlela eyiyo ezindaweni zobuchwepheshe, okuholela ekungesekweni kwalezizilimu kuma-digital tools amaningi. Ukubonisa, kulula kangakanani kuwena ukuxhumana online usebenzisa ulwimi lakho lwase Afrika? Ama-keyboards amaningi awanazo izinsiza-zimpawu ezenza kubhalwe izilimu zase Afrika ngendlela njenge Yorùbá, esebenzisa i-tonal kanye ne orthographic diacritics ukwehlukanisa amagama anezipelingi ezifanayo kodwa esho izinto ezehlukene. Ngaphandle kwalezizimpawu, kunzima uku-digitize lezi zilimu ngendlela eyiyo.
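The effect of missing diacritics can be shown with a short Python sketch. It uses the standard unicodedata module to strip tone marks and under-dots from four Yorùbá words, mimicking what happens when someone types on a keyboard without those marks. The word list and the approximate English glosses are chosen for illustration only, and strip_diacritics is a hypothetical helper name.

```python
import unicodedata

# Illustrative Yorùbá words; the English glosses are approximate.
# Escapes are used so the example does not depend on the file's encoding.
WORDS = {
    "oko": "farm",
    "\u1ecdk\u1ecd": "husband",        # o with under-dots, no tone mark
    "\u1ecdk\u1ecd\u0301": "hoe",      # same letters, final high tone (acute)
    "\u1ecdk\u1ecd\u0300": "vehicle",  # same letters, final low tone (grave)
}


def strip_diacritics(text):
    """Remove every combining mark (tone marks and under-dots) from text."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


for word, meaning in WORDS.items():
    print(f"{word} ({meaning}) -> {strip_diacritics(word)}")

collapsed = {strip_diacritics(word) for word in WORDS}
print(len(WORDS), "distinct words collapse into", len(collapsed), "spelling(s)")
```

Once the marks are gone, all four words are written the same way, so neither a reader nor a model trained on such text can tell them apart without extra context.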
Another major factor is the heavy reliance on text for digitization. In contrast, many African cultures rely heavily on oral communication, which is incompatible with text-based computing systems. This means that much of the information and knowledge transmitted through these cultures cannot be easily represented in digital form. As a result, there is a lack of African-focused content on the web, as it is difficult to represent these languages and cultures in a digital format.
Enye ingqinamba enkulu ukwencika kwimibhalo ukwenza i-digitization. Ukuqhathanisa, amasiko amaningi ase-Afrika ancike ekuxhumaneni okungabhaliwe (oral communication), okungahambisani nemibhalo yamakhompyutha. Lokhu kusho ukuthi iningi lolwazi nokwazi oludluliswa ilamasiko angeke lubekeke kalula i-digital form. Ngokomphumela, kunokungabibikho kokuqukethwe yi-internet okuqondene ne-Afrika, ngenxa yokuthi kunzima ukubeka lezizilimu namasiko kwi-digital format.
Lack of African-centric NLP training datasets
Ukungabibikho kweAfrika ku-NLP nokuqeqeshwa kolwazi
In order to build NLP models of any kind, you need lots of data to train the model. The model is only as good as the data it is trained on. Our current NLP applications are amazing for western languages like English, Spanish, French, and German because there is a lot of training data for these languages. On the other hand, the scarcity of training datasets for African languages limits researchers’ ability to conduct NLP research and develop language technologies.
Ukuze kwakheke iNLP yalolonke uhlobo, udinga inqwaba yolwazi ukuze uqeqeshe isifanekiso. Isifanekiso siyoba sihle njengolwazi esiqeqeshwe kulo. I-NLP applications ekhona kuyimanje yinhle kwizilimu zasentshonalanga njengesiNgisi, isiSpanishi, isiFrentshi kanye nesiJalimane ngoba kuningi ukuqeqeshwa kolwazi kulezizilimu. Kwelinye icala, ukungabibikho kokuqeqeshwa kolwazi ezilimini zase Afrika kuvimba abacwaningi ukuth benze iNLP bacwaninge bakhulise ubuchwepheshe bezilimu.
Africa is becoming increasingly English-centric, partly due to her colonization by Western powers. This has led to the widespread adoption of English as the dominant language in many areas, including business, education, and technology. This dominance of English has had many negative effects, including the marginalization of other languages and cultures. It has also contributed to the globalization of Western culture and values, which can be seen in the way that English is often treated as a requirement for success in many fields. This English-centric world is a legacy of colonialism, and it continues to shape the way we think and communicate today. It has also led to more content being produced in English and less in African languages. The resulting scarcity of training datasets in African languages is well summarized by Joshi et al. in the image below.
I-Afrika iya ngokuya incika esiNgisini, ngenxa yokucindezelwa abaseNtshonalanga. Lokhu sekwenze ukuthi kwamukelwe isiNgisi njengolimu olusebenziswa ezindaweni eziningi, okubalwa kuzo amabhizinisi, imfundo, kanye nobuchwepheshe. Lokhu kuqhwakela kwesiNgisi sekube nemiphumela engemihle, okubalwa kuyo ukucwaswa kwezinye izilimu kanye namasiko. Kuphinde kwenza ukufakwa umhlaba wonke kwamasiko aseNtshonalanga kanye namagugu, okubonakala lapho isiNgisi sibonakala njengesidingo ukuze uphumelele ezindimeni eziningi. Lokhu kuncika esiNgisini komhlaba kuyifa lencindezelo, futhi kuyaqhubeka nokubumba indlela esicabanga ngayo nesixhumana ngayo namhlanje. Sekuholele ekutheni kube nemikhiqizo yolimi lwesiNgisi kuthi eyezilimu zaseAfrika ibe mncane. Loku kusabalala kokungabibikho kokuqeqeshwa kolwazi ngezilimu zase Afrika kucutshungulwa kahle u- Joshi et al. esithombeni esingezansi.
The authors divide languages into 6 classes according to their digital status and ‘richness’ in terms of data availability. We see that the majority of African languages belong to the “Left-Behinds” category, whose languages have little to no unlabeled training data.
Ababhali bahlukanise i-digital status kanye nokujula kwezilimu ngomongo wokubakhona kolwazi yaba ngamakilasi angu 6. Sibona iningi lezilimu zase Afrika ziwela kulengxenye ebizwa ngokuthi The Left Behinds, ezinolwazi oluncane olungachaziwe lokuqeqeshwa.
Low discoverability of existing efforts on African languages
Amazinga aphansi okutholakala kwemizamo ngezilimu zase Afrika
The few resources (research papers and datasets) available for African languages are difficult to come by. Many times, access to language data for a specific country requires affiliation with a specific academic institution in that country. This reduces the ability of countries and institutions to combine their knowledge and datasets to achieve better performance and innovation. Existing research is also hard to find because it is frequently published in smaller African conferences or journals that are not electronically accessible or indexed by research tools such as Google Scholar.
Izinsiza ezincane (Amaphepha ocwaningo kanye nolwazi) ezikhona ngezilimu zase Afrika akulula ukuzithola. Izikhathi eziningi, ukuthola ulwazi ngezilimi kulelozwe kudinga ukwencika esikhungweni semfundo esithile kulelozwe. Lokhu kunciphisa amathuba amazwe kanye nezikhungo ekuhlanganiseni ulwazi lwazo kanye nemiqingo ukuze zithole ukusebenza kancono nemikhuba eqanjwe kabusha. Ucwaningo olukhona kunzima ukuluthola ngenxa youkuthi kaningana lushicilelwe kwizinkomfa ezincane zase Afrika noma kushicilelo olungatholakali ngekhompyutha noma oluhlelwe amathuluzi ocwaningo afana no Google Scholar.
Chris Emezue of Lanfrica elaborates on this topic at CarpentryCon 2022, hosted by The African Carpentries Community on Thursday, August 11, 2022.
U-Chris Emezue we Lanfrica uchaza kabanzi ngalesisihloko ku-CarpentryCon 2022, eyayihlelwe yi-African Carpentries Community ngoLwesine, August 11 2022.
Efforts to build language technologies for African languages
Imizamo yokhwakha ubuchwepheshe kwezilimu zase Afrika
No one knows African languages better than Africans; therefore, more Africans, especially young people, need to learn their languages. In order to bring African languages back to life and increase their use, governments should implement language policies that support and promote these languages. This can include teaching African languages in schools, documenting texts in these languages, and recording the richness and beauty of African cultures and traditions. By embracing and supporting African languages, we can improve the state of natural language processing in Africa and create effective language technologies that suit the needs of the African people.
Akekho owazi izilimu zase Afrika njengamaAfrika; ngakhoke, sikhona isidingo samaAfrika amaningi, ikakhulukazi intsha, ukufunda izilimu zazo. Ukuze kulethwe izilimu zase Afrika empilweni kanye nokukhuliswa kokusetshenziswa kwazo, uhulumeni makenze izinqubo mgomo zolwimi ezeseka futhi zikhulise lezi zilimu. Lokhu kungafaka phakathi ukufundiswa kwezilimu zase Afrika ezikoleni, ukubeka imibhalo ngalezi zilimu, kanye nokuqopha ubunzulu kanye nobuhle bamasiko ase-Afrika. Ngokwemukela nokweseka izilimu zase-Afrika, singathuthukisa isimo sokusebenza kwezilimu e-Afrika sakhe nobuchwepheshe bezilimu obufanele abantu base Afrika.
Science and technology should be taught extensively in schools. More educational opportunities, such as bootcamps, mentorship programmes and academies, should also be available and accessible in Africa. In the Western world, we hear of children who started coding at an early age. African children need access to more computing resources and to network and internet connectivity so that they can gain the same exposure as their Western counterparts.
I-Sayensi kanye nobuchwepheshe akufundiswe ikakhulu ezikoleni. Okunye, amathuba okufundisa afana no-bootcamp, izinhlelo ze-mentorship kanye nezikole eziphakemeyo zemfundo azibe khona zitholakale e-Afrika. Entshonalanga, sizwa ngezingane eziqale i-coding zisencane. Sidinga ukuthola amakhompyutha nezinsiza kanye nenethiwekhi nokuxhumana kwe-internet kwezingane zase Afrika ukuze nazo zivuleleke emathubeni njengezase ntshonalanga.
Many communities and researchers around the world are working to make it easier for African languages to be used in language technologies. One such community is Lanfrica. Lanfrica aims to accelerate the development of AI applications in Africa and other under-represented regions by creating large, high-quality African machine learning datasets and building a data-centric foundational platform to assist enterprises in accelerating their AI applications.
Imiphakathi eminingi nabacwaningi emhlabeni wonke basebenzela ukwenza lula ukusetshenziswa kwezilimu zaseAfrika kubuchwepheshe. Enye yaleyo mphakathi i-Lanfrica. I-Lanfrica ihlose ukukhulisa ukudlondlobala kwe-AI applications e-Afrika nasezifundeni ezingamelelekile, ngokusungula enkulu, nesezingeni eliphezulu imishini yase Afrika efundisa nokwakha ulwazi oluyisisekelo nkundla oluzosisa amabhizinisi ekukhuliseni i-AI applications.
Lanfrica, officially launched in February 2022, built an aggregator platform that curates and links African resources, making them discoverable. For instance, if you’re looking for resources (linguistic datasets or research papers) in a particular African language, Lanfrica will point you to the different sources on the web that have such resources in the desired language.
Ngokuvulwa kwayo ngo February 2022, i-Lanfrica yakha i- aggregator platform ehlanganisa futhi ixhumanise izinsiza zase-Afrika, ziwenza atholakale. Ukubonisa nje, uma ufuna izinsiza (linguistic datasets noma imibhalo yocwaningo) ngolwimi oluthize lwaseAfrika, i-Lanfrica ingakukhombisa imithombo eyahlukene kwi-intanethi ezinalolulwazi ngolwimi olufunayo.
Lanfrica adopts a participatory approach by allowing the general community to contribute resources. With its Slack community, Lanfrica offers an ideal space to connect for researchers, government officials, students, and others working on or looking for African language resources. The team also uses Twitter and a blog to spread awareness about African language resources. One such effort is the Lanfrica Talks series, which highlights and showcases language technology efforts (research, projects, software, applications, datasets, models, initiatives, etc.) geared towards under-represented languages around the world.
I-Lanfrica yamukela indlela yokubambisana ngokuvumela umphakathi ukuthi ufake izinsiza. Ne-Slack community, i-Lanfrica inikeza indawo ekahle kubacwaningi, izikhulu zikahulumeni, abafundi, etc. besebenza noma bebheka izinsiza zezilimu zase-Afrika abangazihlanganisa. Iqembu libuye lisebenzise u-Twitter kanye ne-blog ukudlulisa ukuqwashisa ngezinsiza zezilimu zase-Afrika. Enye yalezizindlela zokuqwashisa i-Lanfrica Talks series neqhakambisa futhi ibonise imizamo yobuchwepheshe bolimu (ucwaningo, izinhlelo, i-software, applications, datasets, models, initiatives, etc,) ezibhekene nezilimu ezingamelelekile emhlabeni wonke.
It is important to note that Lanfrica isn’t the only organization attempting to develop NLP tools for African languages. In fact, there are many organizations, communities, and individuals working on this important task.
Kubalulekile ukubeka ukuthi i-Lanfrica akuyona yodwa inhlangano kuphela ezama ukukhulisa i-NLP tools ukwenzela izilimu zase-Afrika. Eqinisweni, ziningi izinhlangano, imiphakathi, kanye nabantu ngabanye abasebenza kulomsebenzin obalulekile.
Masakhane is an online, community-led, open-source research effort aimed at building and facilitating a community of NLP researchers for African languages. Masakhane is an isiZulu word that roughly translates to “we build together”. Their goal is for Africans to shape and own these technological advances towards human dignity, well-being and equity, through inclusive community building, open participatory research and multidisciplinarity.
I-Masakhane iholwa wumphakathi kwi-intanethi, ucwaningo lwe open-source oluqondene nokwakha nokuhlumelelisa umphakathi we-NLP nabacwaningi bezilimu zase-Afrika. I-Masakhane itolikwa ngokuthi “siyakha ndawonye” ngesiZulu. Inhloso ukuthi ama-Afrika akhe futhi aphathe ukukhula kwalobuchwepheshe ekutholeni ukuhlonipheka komuntu, ukuba-kahle kanye nokulingana, ngokwakhiwsa kwemiphakathi engacwasi, ucwaningo oluvulelekile kanye ne-multidisciplinarity.
Deep Learning Indaba is an organization whose mission is to strengthen machine learning and artificial intelligence in Africa. They work towards the goal of Africans being not only observers and receivers of the ongoing advances in AI, but also active shapers and owners of these technological advances.
I-Deep Learning Indaba inhlangano enhloso yayo ukuqinisa ukufunda kwemishini kanye ne artificial intelligence e-Afrika. Basebenzela ukubona ama-Afrika bengabi kuphela izibukeli kanye nabamukeli bokuqhubeka kwe AI, kodwa bengabakhi nabaphathi balokukukhula bobuchwepheshe.
Zindi is the first data science competition platform in Africa. Zindi gives organizations and governments access to world-class machine learning and AI solutions, as well as giving African data scientists a place to learn new skills, grow, and access work opportunities.
I-Zindi ingumncintiswano wokuqala wolwazi e-Afrika. I-Zindi inikeza izinhlangano kanye nohulumeni intuba ye-world-class yokufunda kwemishini kanye nezixazululo ze-AI, kanye nokunikeza ama-data scientists ase Afrika indawo yokufunda amakhono amasha, akhule, athole namathuba omsebenzi.
Black in AI is a collaboration between multiple institutions that is focused on increasing the presence of Black individuals in the field of AI. The initiative provides a platform for sharing ideas, fostering collaborations, and discussing initiatives that can help to achieve this goal. Black in AI is a transcontinental effort, meaning that it involves participants from various regions around the world. By providing a space for Black individuals in AI to come together and work towards a common goal, Black in AI aims to help make the field of AI more diverse and inclusive.
I-Black in AI ukuhlanganyela phakathi kwezikhungo eziningi okubhekene nokwandisa ukubakhona kwabantu abaMnyama emkhakheni we AI. Loluhlelo lunikeza inkundla yokwabelana ngemiqondo, ukwakha ukubambisana, nokudingida izindlela ezingasiza kufinyelelwe kulomgomo. I-Black in AI iwumzamo ofaka amakhontinenti amaningi, okusho ukuthi ifaka abahlanganyeli abaqhamuka kumarijini omhlaba wonke. Ngokunikeza indawo abantu abaMnyama kwi AI ukuhlangana basebenzele inhloso eyodwa, i-Black in AI ihlose ukusiza yakhe inkundla ye-AI eyahlukahlukene nehlanganisayo.
DSN (Data Scientists Network) describes itself as Africa’s leading Artificial Intelligence (AI) technology enterprise, committed to building Africa’s AI talent ecosystem and developing solutions for governance, education, health, retail, and finance.
I-DSN (Data Scientists Network) ihamba phambili e-Afrika ku-Artificial Intelligence (AI) kwezobuchwepheshe nezimisele ukwakha amakhono e-AI e-Afrika, nokukhulisa izixazululo zobuhulumeni, imfundo, ezempilo, uhwebo, kanye nezezimali.
In addition, there are numerous language-specific organizations like Lesan AI, GhanaNLP, EthioNLP, IgboNLP, YorubaNames, just to mention a few.
Ekwandiseni, kunenqwaba yezinhlangano ebhekene nezilimu njenge Lesan AI, GhanaNLP, EthioNLP, IgboNLP, YorubaNames, ukubala nje ezimbalwa.
Conclusion
Ukugoqa
In conclusion, the development of artificial intelligence applications for African languages is crucial for the preservation and promotion of these languages. Through natural language processing, we can enable machines to understand and interact with African languages, just as they do with English and other widely-spoken languages. This will open up new opportunities for communication, collaboration, and education, and will help to level the playing field for African languages in the global marketplace of ideas. We must continue to support and invest in the development of AI for African languages, and work together to build a brighter future for these rich and diverse cultures.
Ekugoqeni , ukuthuthukiswa kwe-artificial intelligence applications ezilimini zase Afrika kubalulekile ukulondoloza nokuthuthukisa lezi zilimu. Ekuhlumelelisweni kolwimi lwemvelo, singavumela imishini ukuba iqonde futhi enzelana nezilimu zase-Afrika, njengoba benza ngesiNgisi kanye nezinye izilimu ezikhulunywayo. Lokhu kuzovula amathuba ezokuxhumana, ukubambisana, kanye nemfundo, kuzophinde kusize ekulinganiseni inkundla ezilimini zase Afrika kumakethe yomhlaba yemiqondo. Kumele siqhubeke ukweseka nokutshala ekukhulisweni kwe-AI ezilimini zase Afrika, sisebenzisane ekwakheni ikusasa elikhanyayo lalemasiko ahlukahlukene.