The Development of Digital Transcription Device Through ESP32 Integration: An Assistive Communication Tool for the Hearing Impaired

RUBIX Bayla; ELISHA Gimpayan; PRECIOUS Bravante; GABRIEL Ballesteros; KARLYNNE Yapana; GJUAN Salvador; CYRUS Urbano; JULIE Real

Authors

RUBIX Bayla Philippine School Doha https://orcid.org/0009-0002-4286-0853
ELISHA Gimpayan Senior High School Department, Philippine School Doha, Qatar https://orcid.org/0009-0001-3195-7548
PRECIOUS Bravante Senior High School Department, Philippine School Doha, Qatar https://orcid.org/0009-0005-1606-9103
GABRIEL Ballesteros Senior High School Department, Philippine School Doha, Qatar https://orcid.org/0009-0007-8585-8565
KARLYNNE Yapana Senior High School Department, Philippine School Doha, Qatar https://orcid.org/0009-0002-8914-5689
GJUAN Salvador Senior High School Department, Philippine School Doha, Qatar https://orcid.org/0009-0007-0783-6731
CYRUS Urbano Senior High School Department, Philippine School Doha, Qatar https://orcid.org/0009-0007-8889-9653
JULIE Real Senior High School Department, Philippine School Doha, Qatar https://orcid.org/0000-0002-3096-2628

Keywords:

Communication, ESP32, Hearing-Impaired, Microcontroller, Transcription

Abstract

Hearing impairment is a significant public health concern that continues to affect populations worldwide. This study aims to give hearing-impaired individuals a real-time communication system that serves as an additional communication support option, integrates artificial intelligence to support accurate communication, and addresses Sustainable Development Goal 10: reducing inequalities. The study also aligns with Qatar National Vision 2030 and responds to current hearing impairment statistics in Qatar. A quantitative experimental research design was used to develop a Digital Transcription Device, integrating the ESP32 microcontroller, as an assistive communication tool for the hearing-impaired. The results support the device's effectiveness and accuracy: response times ranged from 3.56 to 9.17 seconds for segmented conversations, the device achieved 100% accuracy with word counts ranging from 5 to 15, and it maintained 100% accuracy at distances from 1 meter to 5 meters. Based on these findings, which cover the average time for words to display, the accuracy of the displayed words, and the maximum effective distance of the device, the Digital Transcription Device enables accurate communication for hearing-impaired individuals with minimal discrepancies.
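
The abstract does not state how transcription accuracy was scored. As an illustration only, the sketch below assumes a word-level score equal to one minus the word error rate, computed with a standard edit distance between the spoken sentence and the displayed text. The function name `word_accuracy` and the sample sentences are hypothetical and are not taken from the study.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - WER (floored at 0) using word-level edit distance.

    Assumption: this scoring rule is illustrative; the study's own
    scoring procedure is not described in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from display
            insertion = curr[j - 1] + 1  # extra word shown on display
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    wer = prev[-1] / len(ref)
    return max(0.0, 1.0 - wer)


if __name__ == "__main__":
    spoken = "please meet me at the front gate"
    shown = "please meet me at the front gate"
    print(f"accuracy: {word_accuracy(spoken, shown):.0%}")
```

Under this assumed rule, a transcript that matches the spoken sentence word for word scores 100%, and each substituted, dropped, or extra word lowers the score in proportion to the length of the spoken sentence.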
