December 16, 2020

Advancing education and speech recognition in Nigeria with AI and machine learning

By James Agajo, Abdullahi Sani Shuaibu, and Blessed Guda, Federal University of Technology, Nigeria

In March 2019, as part of our participation in the ITU Focus Group on Machine Learning for Future Networks including 5G (FG‑ML5G), our WINEST (Wireless Networks and Embedded Systems Technologies) Research Group launched a study on “use cases and solutions for migrating to IMT‑2020/5G networks in emerging markets”.

Our focus was to determine how machine learning could help emerging markets leapfrog technology generations and take advantage of emerging and future networks while optimizing energy consumption, network coverage, and communication overheads.

Improving education in Africa

This study led us to propose the “AI‑Based Classroom” project, designed to improve education for young pupils in Africa. Students are driving the project, under the guidance of James Agajo, Associate Professor and Head of WINEST Research Group in the Department of Computer Engineering at the Federal University of Technology in Minna, Nigeria.

With AI‑based natural language processing (NLP), classroom conversations between pupils and teachers are processed at the network edge to extract keywords while maintaining speaker anonymity. 

These keywords are transmitted to a trained classifier on the central server, which recommends engaging media content that gives students intuitive examples to support the teacher’s explanations. The media content is then shown on a digital display in the classroom.
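The split between edge and server described above can be illustrated with a minimal sketch. It assumes a simple frequency-based keyword extractor and a lookup table standing in for the trained classifier; the stop-word list, the catalogue entries, and the function names are hypothetical and are not taken from the project.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real deployment would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "it",
              "that", "this", "we", "you", "how", "what", "do", "does"}


def extract_keywords(transcript: str, top_n: int = 3) -> list[str]:
    """Edge side: keep only the most frequent content words of a transcript."""
    words = re.findall(r"[a-z]+", transcript.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(top_n)]


# Hypothetical stand-in for the trained classifier on the central server.
MEDIA_CATALOGUE = {
    "photosynthesis": "video: how plants make food",
    "volcano": "animation: inside a volcano",
}


def recommend_media(keywords: list[str]) -> list[str]:
    """Server side: receives keywords only, never audio or speaker identity."""
    return [MEDIA_CATALOGUE[k] for k in keywords if k in MEDIA_CATALOGUE]


transcript = "Plants use photosynthesis to make food. Photosynthesis needs sunlight."
keywords = extract_keywords(transcript)
print(keywords, recommend_media(keywords))
```

Only the keyword list crosses the network in this sketch, which is one simple way to keep raw classroom speech at the edge.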

The system is designed to augment the efforts of elementary school teachers rather than attempting to replace them.

An efficient speech recognition library is a critical prerequisite for the development of an AI‑based classroom. Such a library proved very difficult to find.

Automatic speech recognition for Africa

We were looking for a speech recognition library that could run locally, address users’ privacy concerns, and be freely available. Given the extraordinary number of languages spoken across Nigeria, and Africa at large, we also needed a library that performs well on English spoken with many different accents.

We evaluated many software libraries, but none of them succeeded in meeting all of these requirements. This led to the launch of a new WINEST Research Group project in February 2020 to develop a new speech recognition framework able to meet the unique requirements of the AI‑Based Classroom project.

The project evolved from the discussions sparked by our presentation of the AI‑Based Classroom project at the 7th Regional Workshop on “Standardization of future networks towards building a better-connected Africa”, held in Abuja, Nigeria, on 3–4 February 2020. The workshop was convened by ITU Telecommunication Standardization Sector (ITU-T) Study Group 13, ITU’s standardization expert group for future networks and cloud computing.

The expert feedback provided by the Abuja workshop motivated our launch of a pilot project in Nigeria to develop an African automatic speech recognition (ASR) system.

We are collecting speech data and developing the ASR engine to deliver a prototype that can guide the development of a system ready for market deployment. To support the necessary data collection, we have developed the “Wazobia” mobile application, in which Nigerian “voice donors” read displayed text aloud and donate the recording anonymously.

“Wazobia” is an amalgam of three words meaning “come” in Yoruba (wa), Hausa (zo) and Igbo (bia), the languages of Nigeria’s three largest linguistic groups.

The speech data is stored on our server as “unvalidated” by default, pending the crowd-sourced validation of this data by volunteers via the Wazobia mobile app. This validation results in Boolean evaluations of the accuracy of the ASR engine’s transcriptions of recorded speech.
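As an illustration only, the validation flow could be modelled as below. The record fields, the three-vote minimum, and the majority rule are assumptions made for this sketch; the article does not say how many volunteer verdicts are required or how they are combined.

```python
from dataclasses import dataclass, field


@dataclass
class Recording:
    """One donated clip; it stores the prompt text but no donor identity."""
    clip_id: str
    prompt_text: str
    votes: list[bool] = field(default_factory=list)  # Boolean volunteer verdicts

    def add_vote(self, is_accurate: bool) -> None:
        self.votes.append(is_accurate)

    def status(self, min_votes: int = 3) -> str:
        """Assumed rule: 'unvalidated' until min_votes, then the majority decides."""
        if len(self.votes) < min_votes:
            return "unvalidated"
        return "validated" if sum(self.votes) * 2 > len(self.votes) else "rejected"


clip = Recording(clip_id="clip-001", prompt_text="Good morning, class")
print(clip.status())  # no votes yet, so the default state applies
for verdict in (True, True, False):
    clip.add_vote(verdict)
print(clip.status())
```

Keeping the default state inside `status()` means a clip can never be treated as validated simply because nobody has reviewed it yet.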

Architecture of the African automatic speech recognition project

To date, the project has collected a speech corpus of over three hours from more than 170 voice donors.

The African ASR system development phase includes data pre-processing, training and software design. The project uses the wav2letter++ ASR toolkit and takes a Facebook AI Research article as its reference implementation.

We are progressing with the segmentation and pre-processing of the collected data for supervised and semi-supervised machine learning settings, but thus far the African ASR project only accepts English as an input language.

We aim to introduce African languages as inputs as the ASR project develops, and we plan to stimulate this key avenue of innovation by submitting our speech corpus of African languages to future ITU challenges on AI and Machine Learning in 5G and beyond.

AI and machine learning to help Africa manage pandemics

In future ITU AI/machine learning in 5G challenges, we also plan to propose a new Bluetooth®-enabled contact-tracing application supported by machine learning. This pandemic tracing application (PTA) project aims to build exposure-risk prediction models trained on anonymized user data. The data to be collected is listed in the table below:

Data collection for pandemic tracing application contact detection
 Straight-line distance between user equipment
 Bluetooth signal strength
 User equipment model
 Operating system version
 Indoor/outdoor (based on ambient light)
 Radio-frequency interference (wireless local area network)
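To show why signal strength is collected alongside distance, the sketch below uses the standard log-distance path-loss model as a simple baseline for estimating distance from a Bluetooth signal-strength reading. This is not the project's method, which plans to learn the relationship with machine learning. The function name and the default values (reference signal strength at 1 m, path-loss exponent) are placeholder assumptions that in practice vary with device model and environment, which is why those factors appear in the table.

```python
def estimate_distance_m(rssi_dbm: float,
                        rssi_at_1m_dbm: float = -59.0,
                        path_loss_exponent: float = 2.0) -> float:
    """Log-distance path-loss baseline: d = 10 ** ((P_1m - RSSI) / (10 * n)).

    rssi_at_1m_dbm and path_loss_exponent are placeholder assumptions; real
    values depend on the handset and on indoor/outdoor conditions.
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))


# Same reading, different assumed environments (a larger exponent models
# stronger attenuation, e.g. indoors).
reading_dbm = -79.0
print(estimate_distance_m(reading_dbm, path_loss_exponent=2.0))
print(estimate_distance_m(reading_dbm, path_loss_exponent=3.0))
```

The same reading maps to quite different distances under the two exponents, which illustrates why a model trained on device and environment features could outperform a fixed formula.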

The proposed PTA deployment scenario would incorporate data collected from users whose devices have Bluetooth® set to discoverable, so not every user would need to install the application. However, a more detailed picture of the environment can be obtained by incorporating data from the devices’ gyroscopes and accelerometers when two Bluetooth®-connected devices both have the PTA installed.

Architecture of the Pandemic Tracing Application

The development of the proposed PTA will follow these guiding principles:

1. The contact tracing will be generic with configurable parameters to accommodate future pandemics.
2. It will reuse relevant features of existing frameworks but customize these features for application in Africa.
3. It will include privacy-preserving mechanisms by design.
4. The application will determine the appropriate scope of data sharing beyond a user’s device, based on user-indicated preferences with respect to privacy.
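Principles 1 and 4 can be pictured with a small sketch. Everything in it is hypothetical: the class and field names, the two sharing levels, and the numeric thresholds are placeholders chosen for illustration, not values from the project or from any health authority.

```python
from dataclasses import dataclass
from enum import Enum


class SharingScope(Enum):
    """User-indicated privacy preference (hypothetical levels)."""
    DEVICE_ONLY = "device_only"        # nothing leaves the device
    ANONYMIZED_UPLOAD = "anonymized"   # anonymized records may be uploaded


@dataclass(frozen=True)
class PandemicProfile:
    """Per-pandemic parameters, so the tracing logic itself stays generic."""
    name: str
    max_distance_m: float  # contacts farther apart than this are ignored
    min_duration_s: int    # contacts shorter than this are ignored


def is_contact(profile: PandemicProfile, distance_m: float, duration_s: int) -> bool:
    """Apply the configured thresholds to one observed encounter."""
    return distance_m <= profile.max_distance_m and duration_s >= profile.min_duration_s


def may_share(scope: SharingScope) -> bool:
    """Data leaves the device only if the user has opted in."""
    return scope is SharingScope.ANONYMIZED_UPLOAD


# Placeholder values, not health guidance.
example = PandemicProfile(name="example-pathogen", max_distance_m=2.0, min_duration_s=900)
print(is_contact(example, distance_m=1.5, duration_s=1200))
print(may_share(SharingScope.DEVICE_ONLY))
```

Supporting a future pandemic would then mean supplying a new `PandemicProfile` rather than changing the tracing code.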

The training data will be specific to pandemics, as recommended by the World Health Organization (WHO) and other health authorities.

The resulting exposure-risk prediction models will also be specific to pandemics (trained and deployed from a central server).

We are now focused on collecting the required data and we plan to submit this data to future ITU challenges on AI and Machine Learning in 5G and beyond.

Image credit: Federal University of Technology, Minna, Nigeria

© International Telecommunication Union 1865-2018 All Rights Reserved.
ITU is the United Nations' specialized agency for information and communication technology. Any opinions expressed and statistics presented by third parties do not necessarily reflect the views of ITU.
