Real-time speech-to-text translation in Spanish secondary classrooms: a mixed-methods study on refugee student inclusion

  • Ricardo Scott, Department of Developmental Psychology and Didactics, University of Alicante, Alicante, Spain
  • Clara Vila, Secondary Education Teachers of Public Schools of the Valencian Community, Alicante, Spain
  • Daniel Pérez-Alcaraz, Secondary Education Teachers of Public Schools of the Valencian Community, Alicante, Spain
  • Olga Vaello, Secondary Education Teachers of Public Schools of the Valencian Community, Alicante, Spain
  • José Manuel Pérez-Torres, Secondary Education Teachers of Public Schools of the Valencian Community, Alicante, Spain
  • Ricardo Ibanco-Cañete, Department of Developmental Psychology and Didactics, University of Alicante, Alicante, Spain
  • Jorge Brotons-Mas, Institute of Neurosciences UMH-CSIC, Alicante, Spain / Cardenal Herrera Oria University, Elche, Spain
  • Cristina de-la-Peña, International University of La Rioja, Madrid, Spain
  • María José Álvarez-Alonso, Alfonso X El Sabio University, Madrid, Spain
  • Teresa Pozo-Rico, Department of Developmental Psychology and Didactics, University of Alicante, Alicante, Spain, https://orcid.org/0000-0002-5849-4600
Keywords: AI transcription processing, refugee education, language barriers, educational inclusion

Abstract

Following the 2022 invasion of Ukraine, thousands of Ukrainian children enrolled in schools across Europe. In Spain, most lacked prior knowledge of Spanish. This study examines whether real-time speech-to-text translation technology (STTT) can reduce classroom language barriers. Two activities – a fable reading and a neuroscience lecture – were conducted with 12–15-year-old Spanish-speaking students (n = 23) and Ukrainian students unfamiliar with Spanish but bilingual in Ukrainian and Russian. Using PowerPoint 365, the teacher’s speech was transcribed and translated into Russian – which at the time was far more reliably supported by automatic translation tools than Ukrainian – and projected onto a shared classroom display. Although this choice was based on technical and pedagogical criteria, it later drew some resistance, reflecting the sociopolitical sensitivities surrounding language use in wartime contexts. Comprehension was assessed using content-specific questionnaires. Ukrainian students scored lower than their Spanish peers but significantly higher than a control group (n = 22; p < 0.001; Cliff’s delta indicated large effect sizes). Qualitative analysis of teacher interviews highlighted improvements in comprehension and inclusion, along with implementation challenges. Taken together, these findings indicate that STTT has the potential to support newly arrived refugee students and help address multilingual education challenges.
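
For readers unfamiliar with the effect size named in the abstract, the following is a minimal sketch of how Cliff's delta can be computed for two independent groups. It is not the authors' analysis code: the function name `cliffs_delta` and the score lists are hypothetical and are not data from the study. Cliff's delta is the proportion of cross-group pairs in which the first group's score is higher, minus the proportion in which it is lower, so it ranges from -1 to 1. A commonly cited rule of thumb treats absolute values above roughly 0.47 as large; the abstract does not state which thresholds were applied.

```python
from itertools import product


def cliffs_delta(xs, ys):
    """Return Cliff's delta for two samples of numeric scores.

    delta = (#pairs with x > y - #pairs with x < y) / (len(xs) * len(ys))
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical comprehension scores, invented for illustration only.
sttt_scores = [6, 7, 5, 8, 7]      # students who saw the translated transcript
control_scores = [2, 3, 1, 4, 3]   # students without the transcript

delta = cliffs_delta(sttt_scores, control_scores)
print(f"Cliff's delta = {delta:.2f}")
```

Because the measure relies only on the ordering of scores, it suits ordinal questionnaire data and small samples where assumptions of normality may not hold.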

Published
2025-11-05
How to Cite
Scott, R., Vila, C., Pérez-Alcaraz, D., Vaello, O., Pérez-Torres, J. M., Ibanco-Cañete, R., Brotons-Mas, J., de-la-Peña, C., Álvarez-Alonso, M. J., & Pozo-Rico, T. (2025). Real-time speech-to-text translation in Spanish secondary classrooms: a mixed-methods study on refugee student inclusion. Research in Learning Technology, 33. https://doi.org/10.25304/rlt.v33.3418
Section
Original Research Articles