Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
About me
Published:
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
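As a minimal sketch, the relevant setting in Jekyll's _config.yml looks like this (future is Jekyll's standard flag controlling whether future-dated posts are built):

```yaml
# _config.yml
# Jekyll builds posts with a future date when this is true.
# Set it to false to hide scheduled posts until their date arrives.
future: false
```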
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Instituto Superior Técnico, September 2014 – July 2017
Final Grade Average: 18.00 / 20
Awarded three academic excellence diplomas (one for each academic year)
Imperial College London, October 2017 – October 2018
Modules: 68.2/100 (Merit)
Individual Research Project (Master’s thesis): 75/100 (Distinction)
Imperial College London, October 2019 – July 2023
Worked on generative modeling and self-supervised learning on audio-visual speech.
Imperial College London, February 2019 – October 2019
Worked as a research assistant in the Intelligent Behaviour Understanding Group (IBUG) at Imperial College London, supervised by Maja Pantic and Björn Schuller. Developed research projects on video-to-speech synthesis and audio-visual self-supervised learning.
Imperial College London, January 2020 – April 2020
Worked as a teaching assistant for the Introduction to Machine Learning (70050) course, led by Dr. Josiah Wang at Imperial College London.
Meta, August 2021 – December 2021
Developed a collaboration focused on audio-visual speech enhancement between my team at Meta (led by Maja Pantic) and a team at Meta Reality Labs Audio Research (led by Vamsi Krishna Ithapu).
Meta, March 2022 – June 2022
Continued my work in audio-visual speech enhancement from the previous internship, and developed new collaborations with other researchers.
Meta, June 2022 – September 2022
Continued my work in audio-visual speech enhancement from the previous internship, resulting in a new research paper: LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders.
Sony, September 2023 – November 2023
Developed a novel video-to-audio synthesis model, establishing a collaboration between two teams at Sony R&D.
Meta, November 2023 – Current
Developing new audio-visual speech models. My main project has been Automated Dubbing for Instagram Reels. I have also written papers on Real-time Audio-visual Speech Enhancement (Interspeech 2024), Contextual Speech Extraction (ICASSP 2025), Unified Speech Recognition (NeurIPS 2024), Expressive Facial Animation (CVPR 2025), and others which are currently under review.
Interartes, August 2010 – September 2015
Took classes at Interartes and played in a jazz band for four years, frequently in front of live audiences. During this time, I was also taught the fundamentals of music theory and harmony.
Instituto Superior Técnico, February 2016 – July 2017
Was a member of the Computer Software Security group (STT) representing Instituto Superior Técnico, led by Professor Pedro Adão, where I frequently participated in CTF (Capture the Flag) competitions to score points for our team.
Meta, March 2022 – September 2022
Organized a jazz band named Landmarks with my team at Meta, where I played bass every week.
Imperial College London, March 2022 – June 2023
I attended and presented in a bi-weekly research group with some of my colleagues from Imperial College London. We usually focused on deep learning applied to speech and audio. Some of my presentations (including slides) are available in the Talks section.
Imperial College London, January 2022 – Current
I organize a squash club with weekly matches.
Published:
I started an academic Twitter account where I regularly post updates on my research. It is a good way to share my findings, discover emerging research trends, and engage in discussions with other researchers. Overall, Twitter helps me maintain a connection with the broader machine learning community, and I try to contribute as much as possible.
Published:
Posted on r/machinelearning about my experience with ResNets and my lack of success with more recent vision models such as EfficientNet. The post generated a lot of attention, and many researchers agreed with the points I made. It has 204 upvotes (97% upvoted) and 34 comments in total. [post]
Published:
Gave an interview for VICE, conducted by Todd Feathers, where I discussed the field of lip reading and our role as AI researchers at Imperial College London. [article]
Pingchuan Ma and Rodrigo Mira (equal contribution), Stavros Petridis, Björn W. Schuller, Maja Pantic
Published in Interspeech, 2021
Rodrigo Mira, Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Björn W. Schuller, Maja Pantic
Published in IEEE Transactions on Cybernetics, 2022
[project page] [paper] [bib]
Alexandros Haliassos, Rodrigo Mira, Stavros Petridis, Maja Pantic
Published in CVPR, 2022
Rodrigo Mira, Alexandros Haliassos, Stavros Petridis, Björn W. Schuller, Maja Pantic
Published in Interspeech, 2022
[project page] [paper] [bib]
Alexandros Haliassos, Pingchuan Ma, Rodrigo Mira, Stavros Petridis, Maja Pantic
Published in ICLR, 2023
Rodrigo Mira, Eduardo Coutinho, Emilia Parada-Cabaleiro, Björn W. Schuller
Published in PeerJ Computer Science, 2023
Rodrigo Mira, Buye Xu, Jacob Donley, Anurag Kumar, Stavros Petridis, Vamsi Krishna Ithapu, Maja Pantic
Published in ICASSP, 2023
[project page] [paper] [bib]
Antoni Bigata Casademunt, Rodrigo Mira, Nikita Drobyshev, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
Published in BMVC, 2023
[project page] [paper] [code] [bib]
Alexandros Haliassos and Andreas Zinonos (equal contribution), Rodrigo Mira, Stavros Petridis, Maja Pantic
Published in ICASSP, 2024
Honglie Chen and Rodrigo Mira (equal contribution), Stavros Petridis, Maja Pantic
Published in Interspeech, 2024
Alexandros Haliassos, Rodrigo Mira, Honglie Chen, Zoe Landgraf, Stavros Petridis, Maja Pantic
Published in NeurIPS, 2024
[paper] [code] [presentation] [bib]
Minsu Kim and Rodrigo Mira (equal contribution), Honglie Chen, Stavros Petridis, Maja Pantic
Published in ICASSP, 2025
[paper] [project page] [code] [presentation] [bib]
Antoni Bigata, Michał Stypułkowski, Rodrigo Mira, Stella Bounareli, Konstantinos Vougioukas, Zoe Landgraf, Nikita Drobyshev, Maciej Zieba, Stavros Petridis, Maja Pantic
Published in CVPR, 2025
[paper] [project page] [code] [bib]
Published:
Short talk where I introduced our new video-to-speech model and showed its performance using an interactive demo. [recording] [slides]
Published:
Short talk where I presented our new end-to-end video-to-speech model (later published in IEEE Trans. on Cybernetics). [recording (presentation at 42:23, questions at 57:38)] [slides] [short paper]
Published:
Short presentation where I talked about past, current and future trends in self-supervised learning, including BYOL. [slides]
Published:
I presented a summary of my PhD research contributions to the President of Portugal, Marcelo Rebelo de Sousa, as well as many other attendees at Imperial College London. [poster] [article]
Published:
Short talk where I presented our new scalable video-to-speech model (later published in Interspeech 2022). [recording (presentation at 1:34:10)] [slides] [short paper]
Published:
Short talk where I presented our new scalable video-to-speech model (later published in Interspeech 2022). [recording] [slides]
Published:
Oral presentation about our new scalable video-to-speech model (SVTS). [recording] [slides] [paper]
Published:
Short presentation where I talked about neural vocoders, and how they are moving away from GANs and towards diffusion models. [slides]
Published:
Short presentation where I talked about EnCodec, a new neural codec that achieves state-of-the-art results on clean/noisy speech and music at 24/48 kHz (mono and stereo). It also has open-source code and outperforms Google's competing approach, LyraV2. [slides]