FURI | Spring 2025

Developing Robust Transcription Models for Ambulance Call Data Using Domain Randomization Techniques

The researcher evaluated the performance of the Whisper speech-to-text model on hospital transcriptions. Initially, the model achieved a word error rate (WER) close to 53%. The researcher then implemented domain randomization by adding noise to the dataset and fine-tuned Whisper on the augmented data, reducing the WER to close to 47%. This result suggests that techniques such as domain randomization and data augmentation can improve transcription accuracy. In future work, the researcher will explore other speech-to-text models and refine the noise injection techniques, for example by adding real-world noise (such as ambulance and traffic sounds) to the dataset.
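To make the two ideas in the abstract concrete, the following is a minimal sketch, not the project's actual code. It shows how WER is commonly computed (word-level edit distance divided by the number of reference words) and one simple way to inject noise at a randomly drawn signal-to-noise ratio. The function names, the use of white Gaussian noise, and the 5 to 30 dB SNR range are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev is the previous row of the edit-distance table.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise scaled to the requested SNR (assumed noise model)."""
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def randomize(samples, rng, snr_range_db=(5.0, 30.0)):
    """Domain randomization step: draw a fresh SNR for every clip.

    The SNR range is a hypothetical choice, not taken from the project.
    """
    snr_db = rng.uniform(*snr_range_db)
    return add_noise(samples, snr_db, rng), snr_db


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("the patient is stable", "the patient was stable"))  # 0.25

    rng = random.Random(0)
    # A one-second 440 Hz tone at 16 kHz stands in for a speech clip.
    clip = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
    noisy, snr_db = randomize(clip, rng)
    print(f"added noise at {snr_db:.1f} dB SNR")
```

In a fine-tuning pipeline, a step like `randomize` would typically be applied to each training clip before feature extraction, so the model sees a different noise level every time. Replacing the Gaussian source with recorded ambulance or traffic audio, as the abstract proposes for future work, would only require changing `add_noise`.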

Student researcher

Sai Shiva Satwik Mallajosyula

Computer science

Hometown: Visakhapatnam, Andhra Pradesh, India

Graduation date: Spring 2025

More projects from the current symposium

Anushka Tiwari

Computer engineering

Vision-Based End-to-End Trajectory Prediction for Autonomous Vehicles using the Waymo Open Dataset

Training AI on Waymo data to predict autonomous vehicle trajectories, using attention mechanisms that reveal why the model makes its decisions, should lead to safer roads.

Mentor: Bharatesh Chakravarthi