Senior Research Scientist at Google DeepMind
Google DeepMind, July 2025 – Current
Working on something new.
Meta, November 2023 – June 2025
Developed new audio-visual speech models. My main project was Automated Dubbing for Instagram Reels. I also wrote papers on Real-time Audio-visual Speech Enhancement (Interspeech 2024), Contextual Speech Extraction (ICASSP 2025), Unified Speech Recognition (NeurIPS 2024), Expressive Facial Animation (CVPR 2025), and others that are currently under review.
Sony, September 2023 – November 2023
Developed a novel video-to-audio synthesis model, establishing a collaboration between two teams at Sony R&D.
Meta, June 2022 – September 2022
Continued my work on audio-visual speech enhancement from the previous internship, resulting in a new research paper, LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders.
Meta, March 2022 – June 2022
Continued my work on audio-visual speech enhancement from the previous internship, and developed new collaborations with other researchers.
Meta, August 2021 – December 2021
Established a collaboration between my team at Meta (led by Maja Pantic) and a team at Meta Reality Labs Audio Research (led by Vamsi Krishna Ithapu), focused on audio-visual speech enhancement.
Imperial College London, January 2020 – April 2020
Worked as a teaching assistant for the Introduction to Machine Learning (70050) course, led by Dr. Josiah Wang at Imperial College London.
Imperial College London, February 2019 – October 2019
Worked as a research assistant in the Intelligent Behaviour Understanding Group (IBUG) at Imperial College London, supervised by Maja Pantic and Björn Schuller. Developed research projects on video-to-speech synthesis and audio-visual self-supervised learning.