Research Staff, Voice AI Foundations
Remote
Full Time
#AI
#Speech Recognition
#Research
#Learning
#Foundation
#Audio
#Generative Models
#Data Pipelines
#Experimental Design
#Optimization
At Deepgram, we are building the world’s most advanced voice AI platform. We empower over 200,000 developers to create speech-to-text, text-to-speech, and full speech-to-speech applications with unmatched accuracy and speed. Having processed over 50,000 years of audio and transcribed more than 1 trillion words, we are the industry leader in understanding voice. We are currently cash-flow positive and growing rapidly, and we are looking for passionate individuals to help us solve the fundamental challenges of audio AI to make human-machine interaction truly universal.
The role
We are seeking a Senior Research Staff member to join our team on a full-time basis. This is a remote position open to candidates anywhere. In this role, you will pioneer new paradigms in audio AI, specifically focusing on Latent Space Models to overcome the data, scale, and cost limitations that currently hinder voice technology. You will work at the intersection of theory and practice, driving innovation that has the potential to transform how the world interacts with machines.
Core responsibilities
- Develop next-generation neural audio codecs capable of extremely low-bitrate compression while maintaining high-fidelity reconstruction across massive, diverse audio datasets.
- Design steerable generative models that synthesize the full range of human expression, from casual conversation to complex multi-speaker scenarios, using codec latent representations.
- Architect hardware-aware training schemes and inference algorithms that enable cost-efficient processing of billion-hour datasets and support real-time interaction for millions of concurrent users.
Skills and experience
To be successful in this role, you should possess a strong mathematical foundation and a drive to solve complex, unsolved problems. We look for the following qualifications:
- Deep expertise in AI, speech recognition, and foundation model architectures, with a proven ability to scale training across multiple modalities.
- A strong background in statistical learning theory and the ability to bridge novel mathematical formulations with efficient, practical implementation.
- Experience designing and managing data pipelines for massive datasets, along with a track record of rigorous experimental design to validate architectural innovations.
- Proficiency in optimization techniques, including a solid understanding of hardware constraints and how to tune models for real-world deployment.
- A history of research publications or open-source contributions that have significantly advanced the state of the art in audio or generative models.
Compensation and benefits
We believe in supporting our team members wherever they are located. As part of our commitment to our employees, we offer the following:
- Remote work flexibility, allowing you to contribute from anywhere in the world.
How to apply
If you are obsessed with solving difficult technical problems and are energized by the prospect of pioneering new approaches in voice AI, we want to hear from you. Please submit your application to join our team, and let us know how your background in research and systems engineering can help us push the boundaries of what is possible with voice technology.
Deepgram