{"access":{"advertiser_pricing_url":"https://aidevboard.com/pricing","catalog_url":"https://aidevboard.com/api/v1/catalog","description":"Public read endpoints are open and free. API keys are optional for stable agent identity and keyed hourly throttling.","docs_url":"https://aidevboard.com/docs","mode":"open","register_url":"https://aidevboard.com/api/v1/register"},"degraded":false,"estimated":false,"has_next":true,"jobs":[{"id":"f47b2b52-9138-4056-a197-783873a96c39","company_id":"f5ee7284-a657-4da2-b351-cb806a3681cd","title":"Member of Technical Staff - Voice Model","slug":"member-of-technical-staff-voice-model-5b5f6cb9","description":"SpaceXAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge.  Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates. \n ABOUT THE ROLE:\n You will join the Grok Voice Model team to help build the world’s best voice AI. We deliver smooth, natural, low-latency spoken interactions — expressive, multilingual, and reliable across devices and real-time scenarios. We own the full training pipeline: massive data curation, premium audio processing, frontier speech-language pre-training, and intensive post-training to push quality, speed, and stability to the limit.\n Our goal: make talking to AI feel like conversing with the most charming, kind, and knowledgeable person imaginable. 
We’re seeking exceptionally smart, execution-oriented engineers to help us get there.\n RESPONSIBILITIES:\n \n Design and execute large-scale speech data curation and processing pipelines, including collection of diverse real-world audio, synthetic data generation, and automated annotation workflows to enable high-quality model training and evaluation.\n Work on pre-training and post-training of speech-language models, with targeted enhancements through supervised fine-tuning, reinforcement learning, and other techniques to ensure Grok Voice responses are accurate, factually grounded, natural and idiomatic in spoken style, conversational in tone, and fluent across multiple languages.\n Build and iterate a comprehensive evaluation framework covering objective metrics (accuracy, quality, latency, expressiveness), human preference studies, content factuality assessments, real-time interaction quality, and experimentation infrastructure to measure and improve performance.\n Work closely with product teams to integrate voice models into applications and real-time environments, define spoken interaction specifications, and handle the full lifecycle from prototype to global-scale deployment for stable, low-latency, delightful voice experiences.\n \n BASIC QUALIFICATIONS:\n \n Python expert with deep proficiency in writing clean, efficient code for AI/ML systems.\n Hands-on experience processing large-scale datasets using tools like Spark and Ray for cleaning, augmentation, and feature extraction.\n Proficiency in pre-training and post-training speech-language models using JAX/PyTorch, including supervised fine-tuning, reinforcement learning, and optimizations for accuracy, factuality, natural spoken style, detail, and multilingual fluency.\n Ability to set up and run rigorous evaluation pipelines: objective metrics, human preference studies, content factuality checks, and iterative A/B testing to drive model improvements.\n Experience building or working with large-scale 
distributed training and inference systems on Kubernetes.\n Proactive, self-driven attitude — ready to grind in a fast-paced, high-caliber team to deliver outstanding voice AI experiences.\n \n COMPENSATION AND BENEFITS:\n $150,000 - $450,000 USD\n Base salary is just one part of our total rewards package at SpaceXAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short \u0026 long-term disability insurance, life insurance, and various other discounts and perks.\n SpaceXAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice .","salary_min":150000,"salary_max":450000,"location":"Palo Alto, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"lead","tags":["speech","fine-tuning","reinforcement-learning","distributed-systems","pytorch","pre-training"],"apply_url":"https://job-boards.greenhouse.io/xai/jobs/5051966007","is_featured":true,"is_sticky":false,"status":"active","published_at":"2026-03-16T20:39:18Z","expires_at":"2026-08-14T14:04:44.897369Z","created_at":"2026-04-13T09:38:43.3144Z","updated_at":"2026-07-15T14:04:45.027875Z","company_name":"xAI","company_slug":"xai","company_logo_url":"https://www.google.com/s2/favicons?domain=x.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/f47b2b52-9138-4056-a197-783873a96c39"},{"id":"0003f63a-b2b2-44e0-b588-7a3de39a2516","company_id":"28040a6c-6f94-41a4-b15a-f2e4520188ff","title":"Agent Experience Designer, Agentic Voice","slug":"agent-experience-designer-agentic-voice-00a4cb3f","description":"About Dialpad Dialpad is the AI-native business communications platform. We unify calling, messaging, meetings, and contact center on a single platform - powered by AI that understands every conversation in real time. 
\n More than 70,000 companies around the globe, including WeWork, Asana, NASDAQ, AAA Insurance, COMPASS Realty, Uber, Randstad, and Tractor Supply, rely on Dialpad to build stronger customer connections using real-time, AI-driven insights. \n We’re now leading the shift to Agentic AI: intelligent agents that don’t just analyze conversations but take action by automating workflows, resolving customer issues, and accelerating revenue in real time. Our DAART initiative (Dialpad Agentic AI in Real Time) is redefining what a communications platform can do. \n Visit dialpad.com to learn more. \n Being a Dialer At Dialpad, AI isn’t just a feature; it’s how our teams do their best work every day. We put powerful AI tools in every employee’s hands so they can move faster, think bigger, and achieve more. \n We believe every conversation matters. And we’ve built the platform that turns those conversations into insight and action, for our customers and ourselves. \n We look for people who are intensely curious and hold themselves to a high bar. Our ambition is significant, and achieving it requires a team that operates at the highest level. We seek individuals who embody our core traits: Scrappy, Curious, Optimistic, Persistent, and Empathetic . \n Your role As an Agent Experience Designer — Agentic Voice, you’ll own the voices, personalities, and interactions that make an AI agent feel intuitive, empathetic, and human. We are going all-in on agentic AI under one core idea: stop answering, start resolving. A voice agent that resolves is only as good as the experience it delivers, and designing that entire voice experience is your job. \n Reporting directly to the VP of AI Products, you’ll collaborate hand-in-hand with our AI engineers to shape model judgment through prompts and flow orchestration, rather than hard-coded branches. 
You’ll also help create a centralized persona system, voice standards, and the universal quality bar that forward-deployed VX designers will apply account-by-account in the field. \n In addition, you’ll help bring a deep sense of behavioral and emotional design to our platform, ensuring our agents have the taste, pacing, and vocabulary to sound truly competent and empathetic across both happy paths and high-stakes moments. \n This position has the opportunity to be based in our San Ramon, US office.\n What you’ll do \n \n Own the agent's global voice, character, and personality, maintaining personal consistency across every vertical we ship. \n Own the standard handoff patterns and design systems, ensuring seamless transitions where context is fully preserved when an agent passes a caller to a human. \n Own the universal platform quality bar, defining and measuring Consistency, Fluency, and Latency (CFL) and tying personal decisions directly to core metrics like resolution, containment, and sentiment. \n Make the voice palette and establish house standards for pacing, prosody, and emphasis that forward-deployed teams will use to build brand-specific experiences. \n Partner with AI engineers to orchestrate behavior, escalation instincts, confirmation patterns, and graceful recovery workflows using advanced prompting rather than rigid dialogue trees. \n Research and design for distinct behavioral and emotional user states, ensuring the agent adapts seamlessly whether interacting with a patient disputing a bill or a dispatcher tracing a late delivery. \n \n Skills you’ll bring \n \n Experience: 5+ years of dedicated experience shaping voice user interfaces (VUI), character writing, conversation design, or complex conversational/agentic systems. \n Bachelor's degree in Linguistics, Communication, Psychology, Design, or equivalent practical experience. 
\n Demonstrated experience shaping voice user interfaces (VUI), character writing, or complex conversational/agentic systems. \n Fluency with LLM-based agent behaviors, prompt engineering, and prompt orchestration (knowing how design choices alter model outputs without relying on code). \n Fluency with Text-to-Speech (TTS) controls, including voice selection, SSML tuning, pacing, and emphasis to set broad platform standards. \n An exceptional portfolio that highlights voice systems, written persona standards, and interactive logic rather than just static flow diagrams. \n Experience in regulated, high-stakes verticals (e.g., healthcare, financial services, legal) is a strong plus. \n Strong taste and an ear for dialogue—the ability to articulate a character on a page and translate it into consistent AI behavior under pressure. \n \n For exceptional talent based in California, the target base salary range for this position is posted below. Our salary ranges are determined by role, level, and location. 
The range displayed on each job posting","salary_min":147000,"salary_max":186000,"location":"San Ramon, US","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"senior","tags":["agents","speech","llm","healthcare"],"apply_url":"https://job-boards.greenhouse.io/dialpad/jobs/8633475002","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-07-14T20:04:06Z","expires_at":"2026-08-14T14:23:00.471126Z","created_at":"2026-07-15T14:23:00.567276Z","updated_at":"2026-07-15T14:23:00.567276Z","company_name":"Dialpad","company_slug":"dialpad","company_logo_url":"https://www.google.com/s2/favicons?domain=dialpad.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/0003f63a-b2b2-44e0-b588-7a3de39a2516"},{"id":"dc54d254-d290-4242-871b-1d784c72c570","company_id":"0fc88a91-688e-421d-917d-4880569dd976","title":"Principal Research \u0026 Engineering, Realtime Voice AI","slug":"principal-research-engineering-realtime-voice-ai-9c626749","description":"About Inflection AI \n Inflection AI is a Public Benefit Corporation empowering people with human-centered, emotionally intelligent AI. We’re shaping the future of AI by combining emotional intelligence (EQ) and raw intelligence (IQ) to elevate people’s potential. Inflection AI created Pi, the world’s first emotionally intelligent AI, to help people work through decisions, emotions, and challenges. Pi is a personal AI agent powered by Inflection AI’s foundation model, proving that AI can be personal, empathetic, and contextually aware.\n About the Role \n Voice is becoming the highest-stakes interface for AI, where quality depends on speed, naturalness, interruption handling, emotional nuance, and reliability in real-world conditions. We are looking for a hands-on technical leader to define and build Inflection’s realtime Voice AI stack across speech models, streaming systems, voice-agent runtime, and evaluation. 
This person will help shape how emotionally intelligent AI shows up in spoken interactions, partnering across research, engineering, product, and design to deliver voice agents that feel responsive, trustworthy, and useful in enterprise settings. \n What You’ll Do \n \n Establish the technical roadmap for Inflection's realtime Voice AI stack, encompassing streaming ASR, TTS, speech-to-speech, speech LLMs, turn-taking, barge-in, latency, and reliability.\n Utilize a 1,000 GPU cluster to support performance benchmarking and extensive experimentation.\n Determine build-vs-buy-vs-train strategies for core audio, speech, and realtime interaction components.\n Direct research and engineering efforts focused on speech quality, naturalness, expressiveness, emotional fit, controllability, and production readiness.\n Collaborate with infrastructure, product, design, and agentic AI teams to deploy voice agents for enterprise workflows.\n Develop evaluation systems measuring voice quality through metrics such as clarity, emotional appropriateness, interruption handling, task success, user preference, latency, and reliability, moving beyond standard WER.\n Refine production voice behavior by debugging across runtime, model, evaluation, data, and product layers.\n Mentor and coach a team specializing in speech research, audio infrastructure, realtime systems, and evaluation.\n \n What We’re Looking For \n \n Experience leading or serving as a principal Research and Engineering contributor to realtime voice, speech, audio AI, or conversational AI systems in production.\n Experience with one or more of: streaming ASR, TTS, speech-to-speech systems, speech LLMs, audio tokenization, multimodal models, barge-in, low-latency inference, or realtime agents.\n Strong technical judgment across both speech modeling and production systems.\n Ability to define voice quality in terms of user and customer outcomes, not only offline model metrics.\n Experience designing or using evaluation 
systems that capture real user experience.\n Strong product intuition for natural, trustworthy, emotionally appropriate voice interactions.\n Ability to lead senior technical talent while staying close to the code, architecture, and debugging work.\n Have a bachelor’s degree or equivalent in a related field to the offered position requirements\n \n Employee Pay Disclosures \n At Inflection AI, we aim to attract and retain the best employees and compensate them in a way that appropriately and fairly values their individual contributions to the company. For this role, Inflection AI estimates a starting annual base salary to fall within the range of $400,000 to $550,000 , depending on a candidate’s qualifications and level of experience. This role also includes a meaningful equity component, allowing employees to share in the long-term success of the company.\n  \n Benefits \n Inflection AI values and supports our team’s mental, emotional, financial and physical health. We are focused on building a positive, safe, inclusive and inspiring place to work. 
Our benefits include: \n \n Robust medical, dental and vision options with employer contributions for HSA, FSA and DFSA\n 401k matching program \n Flexible Time Off, 10 paid holidays, 5 days sick leave\n Parental, Medical and Family care leave \n Generous cell-phone, wellness and office set up stipends \n Support of country-specific visa needs for international employees living in the Bay Area","salary_min":400000,"salary_max":550000,"location":"Palo Alto, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"principal","tags":["llm","generative-ai","gpu","agents","speech","research"],"apply_url":"https://boards.greenhouse.io/inflectionai/jobs/4693024006?gh_jid=4693024006","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-06-29T18:24:22Z","expires_at":"2026-08-14T14:06:38.003809Z","created_at":"2026-06-30T14:04:31.512807Z","updated_at":"2026-07-15T14:06:38.13078Z","company_name":"Inflection AI","company_slug":"inflection-ai","company_logo_url":"https://www.google.com/s2/favicons?domain=inflection.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/dc54d254-d290-4242-871b-1d784c72c570"},{"id":"37531681-5540-4d72-bae6-adb400a217ff","company_id":"f5ee7284-a657-4da2-b351-cb806a3681cd","title":"Member of Technical Staff","slug":"member-of-technical-staff-58cdf6b3","description":"SpaceXAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge.  Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. 
All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates. \n Member of Technical Staff (X.AI LLC; Palo Alto, CA): Introduce innovative techniques and analyses to the AI field to facilitate breakthroughs in quantitative reasoning and language understanding. Stabilize large language model training, pipeline parallelism training of large language models, and fine-tuning large language models with truthful data. Ensure organization’s work is aligned with broader company objectives. Spend time working on hands-on technical problems including design and implementation. Perform cutting-edge research on advanced techniques from AI and deep learning, including neural network architectures, language modeling, and speech recognition. Work closely with leaders across the company to deliver impactful projects which may involve work in areas such as machine learning, applied data science, recommendation systems, and information retrieval systems. Telecommuting permitted. Reference: 00100860  \n The position requires a Bachelor’s or foreign equivalent degree in Computer Science, Computer Engineering, Mechanical Engineering, Machine Learning or in a related field and 2 years of experience in the job offered or in a computer-related occupation.  \n Special Requirements: Position requires experience, knowledge or coursework in each of the following skills:\n \n BigData systems such as Spark, Hadoop, BigQuery, and related technology to build highly scalable data processing systems \n Building large-scale Kubernetes clusters for data storage, processing, and analysis on on-prem systems and cloud computing \n Applied machine learning techniques and deploying large-scale deep learning systems \n Working with distributed system engineers and AI researchers in developing technologies in the area of natural language processing, computer vision, and speech recognition applications \n 
Rust, C++, or Python programming language to build tooling and features within company software development code and standards. \n Building applications with hardware accelerators, such as GPUs, TPUs from GCP, Azure, and AWS. Employment and background checks may be required. \n \n Salary: $324,000 - $396,000 per year \n To Apply: Any interested applicant may click on the APPLY NOW button above to apply for this position. \n SpaceXAI is an equal opportunity employer. For details on data processing, view our Recruitment Privacy Notice .","salary_min":324000,"salary_max":396000,"location":"Palo Alto, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"lead","tags":["llm","nlp","speech","computer-vision","search","deep-learning","fine-tuning","distributed-systems"],"apply_url":"https://job-boards.greenhouse.io/xai/jobs/5173208007","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-06-24T19:28:53Z","expires_at":"2026-08-14T14:04:42.867245Z","created_at":"2026-06-28T14:03:13.307512Z","updated_at":"2026-07-15T14:04:43.003373Z","company_name":"xAI","company_slug":"xai","company_logo_url":"https://www.google.com/s2/favicons?domain=x.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/37531681-5540-4d72-bae6-adb400a217ff"},{"id":"dd43e8fb-8ef2-476d-b3d2-6ace126b3474","company_id":"f5ee7284-a657-4da2-b351-cb806a3681cd","title":"Member of Technical Staff","slug":"member-of-technical-staff-ea022ae5","description":"SpaceXAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge.  Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. 
Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates. \n Member of Technical Staff (X.AI LLC; Palo Alto, CA): Build collaborative relationships with the Principal and Staff engineering community as well as with engineering and product management leaders, and partner to deliver impact. Define the vision and strategy for the organization and have a substantial impact on the vision and strategy of customer and partner organizations. Plan and deliver projects that impact multiple organizations. Projects include models and large language model training, pipeline parallelism training of large language models, and fine tuning large language models with truthful data. Identify opportunities for technological differentiation, investment, or divestment. Ensure organization’s work is aligned with broader company objectives. Introduce innovative techniques and analyses to the AI field to facilitate breakthroughs in quantitative reasoning and language understanding. Provide mentorship and guidance to senior technical leaders and managers. Working on hands-on technical problems including design and implementation. Perform cutting-edge research on advanced techniques from AI and deep learning, including neural network architectures, language modeling, and speech recognition. Work closely with leaders across the company to deliver impactful projects which may involve work in areas such as machine learning, applied data science, recommendation systems, and information retrieval systems. Reference: 00101156. 
\n Minimum requirements: \n \n Must have a Bachelor’s degree in Computer Science, Artificial Intelligence, Machine Learning, Information Technology, or a related field, plus 5 years of progressive post-baccalaureate experience in AI/ML research or development. \n Alternatively, employer will accept a Master’s degree in Computer Science, Artificial Intelligence, Machine Learning, Information Technology or a related field, plus 3 years of experience in AI/ML research or development. \n Must have 3 years of experience in each of the following: \n Developing and applying AI models, including large language models, neural networks, or reinforcement learning. \n Experience in Python and other relevant languages (such as C++, Java), using ML frameworks (such as PyTorch, TensorFlow, or JAX). \n Assessing system performance and designing scalable, efficient solutions for inference and model deployment. \n Optimizing computational efficiency and memory usage through advanced algorithmic techniques or quantization. \n Building and maintaining robust, high-availability infrastructure for AI/ML services. \n Collaborating with cross-functional teams to define technical vision, strategy, and deliver impactful projects. \n Mentoring and guiding senior technical leaders and managers in hands-on technical problem solving. \n Driving innovation by introducing new techniques and analyses to advance AI capabilities in quantitative reasoning and language understanding. \n Leading efforts in resource management, automation, and orchestration for large-scale ML infrastructure. \n \n Salary: $324,000 - $396,000 per year \n To Apply: Any interested applicant may click on the APPLY NOW button above to apply for this position. \n SpaceXAI is an equal opportunity employer. 
For details on data processing, view our Recruitment Privacy Notice .","salary_min":324000,"salary_max":396000,"location":"Palo Alto, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"lead","tags":["pytorch","tensorflow","nlp","deep-learning","speech","llm","reinforcement-learning","fine-tuning"],"apply_url":"https://job-boards.greenhouse.io/xai/jobs/5173142007","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-06-24T19:05:16Z","expires_at":"2026-08-14T14:04:42.68898Z","created_at":"2026-06-28T14:03:13.39564Z","updated_at":"2026-07-15T14:04:42.829436Z","company_name":"xAI","company_slug":"xai","company_logo_url":"https://www.google.com/s2/favicons?domain=x.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/dd43e8fb-8ef2-476d-b3d2-6ace126b3474"},{"id":"2d3dc650-dcfc-4532-93b8-8b3c42ec0fc2","company_id":"74257563-5513-4a8d-a0f7-01f00c59aed6","title":"Machine Learning Engineer, Community Support Engineering","slug":"machine-learning-engineer-community-support-engineering-12766547","description":"Airbnb was born in 2007 when two hosts welcomed three guests to their San Francisco home, and has since grown to over 5 million hosts who have welcomed over 2 billion guest arrivals in almost every country across the globe. Every day, hosts offer unique stays and experiences that make it possible for guests to connect with communities in a more authentic way. \n The Community You Will Join: \n Machine Learning and Artificial Intelligence are at the heart of the Airbnb product. From Trust to Payments, and from Customer Service to Marketing we rely on ML to ensure that guests and hosts have the best possible experience with Airbnb. \n The Core ML team in Community Support is the team responsible for adopting the Agentic AI technologies to enable an intelligent, scalable and exceptional customer service experience. 
We are responsible for developing the Chat AI assistant, Voice AI Assistant and more! The team constantly explores SOTA agentic architectures, develops and enhances various AI models and ML services, and leverages tools including SFT, Reinforcement learning, Distillation, RAG/Search, LLM evaluation and testing automation, feedback-based learning and guardrails for a wide range of applications in Airbnb. \n The Difference You Will Make: \n We believe our current customer experiences in these domains are only scratching the surface of the innovations that are possible, and that science is at the heart of delivering a step-function change for our Guest and Host on Airbnb. You will build and leverage cutting edge AI technologies to transform Airbnb’s customer service by delivering a personalized, easy-to-use and proactive customer service experience. Many of the initiatives you’ll tackle are in their early conceptual stages. You will have the opportunity to shape these ideas from inception to production, turning visionary concepts into impactful realities.\n A Typical Day:  \n \n Champion the development of novel ML systems, product integrations, and performance optimizations to solve real-world problems\n Work cross-functionally with product, design, and other engineering counterparts to design and build efficient AI solutions for Airbnb CS products\n Learn and share the latest AI/ML technologies with the team.\n \n Your Expertise: \n \n (Required) PhD or 3+ YOE in Computer Science, Machine Learning, Statistics, Artificial Intelligence, or a related technical field, or equivalent industry experience\n Hands-on expertise in LLM, including pretraining, fine-tuning (SFT, RLHF, GRPO), prompt engineering, RAG architectures, and LLM evaluation frameworks\n Experience building Agentic AI systems, including multi-agent orchestration, tool-use, planning, memory, and autonomous reasoning pipelines (e.g., ReAct, LangGraph, AutoGen, or similar)\n Experience of shipping 
production-grade ML/AI systems at scale, with deep understanding of ML infrastructure, model serving, and MLOps best practices\n Excellent communication skills with the ability to collaborate effectively across Engineering, Product, and Design organizations\n \n Your Location: \n Due to the nature of this position, the successful applicant will need to be based in San Francisco-Bay Area, CA or Seattle, Washington to be able to conduct their work. Currently, employees can not be located in: Alaska, Indiana, Nebraska, North Dakota, Ohio, South Dakota, Wisconsin, Alabama, Mississippi, Oklahoma, Delaware And Rhode Island. This list is continuously being updated, please check back with us if the state you live in is on the exclusion list.  If your position is employed by another Airbnb entity, your recruiter will inform you what states you are eligible to work from. \n Our Commitment To Inclusion \u0026 Belonging: \n Airbnb is committed to working with the broadest talent pool possible. We believe diverse ideas foster innovation and engagement, and allow us to attract creatively-led people, and to develop the best products, services and solutions. All qualified individuals are encouraged to apply.\n We strive to also provide a disability inclusive application and interview process. If you are a candidate with a disability and require reasonable accommodation in order to submit an application, please contact us at: reasonableaccommodations@airbnb.com. Please include your full name, the role you’re applying for and the accommodation necessary to assist you with the recruiting process. \n We ask that you only reach out to us if you are a candidate whose disability prevents you from being able to complete our online application.\n How We'll Take Care of You: \n Our job titles may span more than one career level. The actual base pay is dependent upon many factors, such as: training, transferable skills, work experience, business needs and market demands. 
The base pay range is subject to change and may be modified in the future. This role may also be eligible for bonus, equity, benefits, and Employee Travel Credits.   \n Pay Range\n $170,000 — $180,000 USD","salary_min":170000,"salary_max":180000,"location":"San Francisco, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"mid","tags":["rag","llm","reinforcement-learning","pre-training","mlops","speech","agents","fine-tuning"],"apply_url":"https://careers.airbnb.com/positions/8024267?gh_jid=8024267","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-06-23T21:05:19Z","expires_at":"2026-08-14T14:11:22.225345Z","created_at":"2026-06-28T14:09:03.008829Z","updated_at":"2026-07-15T14:11:22.376533Z","company_name":"Airbnb","company_slug":"airbnb","company_logo_url":"https://www.google.com/s2/favicons?domain=airbnb.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/2d3dc650-dcfc-4532-93b8-8b3c42ec0fc2"},{"id":"b17f9ab1-1081-41a6-88dd-ae290d7d1c94","company_id":"28040a6c-6f94-41a4-b15a-f2e4520188ff","title":"AI Engineer, Voice Designer","slug":"ai-engineer-voice-designer-9541704c","description":"About Dialpad Dialpad is the AI-native business communications platform. We unify calling, messaging, meetings, and contact center on a single platform - powered by AI that understands every conversation in real time. \n More than 70,000 companies around the globe, including WeWork, Asana, NASDAQ, AAA Insurance, COMPASS Realty, Uber, Randstad, and Tractor Supply, rely on Dialpad to build stronger customer connections using real-time, AI-driven insights. \n We’re now leading the shift to Agentic AI: intelligent agents that don’t just analyze conversations but take action by automating workflows, resolving customer issues, and accelerating revenue in real time. Our DAART initiative (Dialpad Agentic AI in Real Time) is redefining what a communications platform can do. 
\n Visit dialpad.com to learn more. \n Being a Dialer At Dialpad, AI isn’t just a feature; it’s how our teams do their best work every day. We put powerful AI tools in every employee’s hands so they can move faster, think bigger, and achieve more. \n We believe every conversation matters. And we’ve built the platform that turns those conversations into insight and action, for our customers and ourselves. \n We look for people who are intensely curious and hold themselves to a high bar. Our ambition is significant, and achieving it requires a team that operates at the highest level. We seek individuals who embody our core traits: Scrappy, Curious, Optimistic, Persistent, and Empathetic . \n Your Role As an AI Engineer: Voice Designer, you’ll own the back-end implementation and linguistic optimization of the Text-to-Speech (TTS) layer for our next-generation AI voice agents. You’ll work squarely within our Speech Team—a high-impact R\u0026D and engineering group focused on speech recognition, enhancement, and synthesis. You will bridge the gap between core speech science and product engineering, ensuring our voice agents sound human, context-aware, and trustworthy. You’ll also help create the systems that manage voice personas, tone, and conversational fillers, eventually exposing these as tweakable parameters to our customer-facing UI. \n This position reports to our Senior Manager, AI Speech, is based at our Kitchener hub, and operates on a hybrid schedule . \n What You’ll Do \n \n TTS Backend Implementation: Own the integration and optimization of multiple TTS vendor APIs while leading research and prototyping for open-source or in-house TTS architectures. \n Linguistic Optimization: Apply expertise in phonetics and sociolinguistics to ensure TTS input is formatted for maximum naturalness, including SSML orchestration and pronunciation handling. 
\n Conversational Turn Design: Craft context-specific utterances to optimize turn handling and build caller trust during agentic \"thought\" processes. \n Prompt \u0026 Persona Management: Design and manage LLM and TTS prompts and parameters to define and refine agent personalities across different industry verticals. \n UI Parameter Exposure: Architect the logic to expose voice attributes (speed, pitch, tone, style) to the product UI, allowing customers to customize their agent’s voice profile. \n Cross-Functional R\u0026D: Partner with ASR and Audio AI engineers to ensure end-to-end voice quality and minimize latency in the ASR → LLM → TTS pipeline. \n \n Skills You’ll Bring \n \n Technical Foundation: Strong Python programming skills and experience with deep learning frameworks (e.g. PyTorch). \n Speech Expertise: 3+ years of experience in Speech Synthesis (TTS) or Voice Design, including hands-on work with frameworks like NVIDIA NeMo, ESPnet, or Coqui, and hands-on experience with major TTS APIs such as ElevenLabs, Rime, and Cartesia. \n Linguistic Background: Degree in Computational Linguistics, Computer Science, or AI/ML with a deep understanding of phonetics, prosody, and syntax. \n Prompt Engineering: Proven experience crafting and evaluating LLM prompts (system, few-shot) and managing structured prompt templates. \n Backend Engineering: Experience building production-grade APIs and integrating multi-vendor services in a cloud environment (GCP preferred). \n Evaluation Mindset: Knowledge of speech quality metrics (MOS, intelligibility, latency) and the ability to design rigorous A/B tests for voice personas. \n For exceptional talent based in Ontario, Canada  the target base salary range for this position is posted below. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the target range for new hire salaries for the position. 
Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in Ontario role postings reflect the base salary only, and do not include bonus, equity, or bene","salary_min":145000,"salary_max":172500,"location":"Kitchener, Canada","workplace":"hybrid","remote_scope":"not_remote","job_type":"full-time","experience_level":"mid","tags":["pytorch","speech","llm","agents","cloud","deep-learning"],"apply_url":"https://job-boards.greenhouse.io/dialpad/jobs/8601273002","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-06-19T17:07:59Z","expires_at":"2026-08-14T14:23:00.636995Z","created_at":"2026-06-28T14:19:26.870692Z","updated_at":"2026-07-15T14:23:00.745666Z","company_name":"Dialpad","company_slug":"dialpad","company_logo_url":"https://www.google.com/s2/favicons?domain=dialpad.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/b17f9ab1-1081-41a6-88dd-ae290d7d1c94"},{"id":"7eb3a4f0-b9e4-4d32-8d4b-045ffe5c6ed8","company_id":"28040a6c-6f94-41a4-b15a-f2e4520188ff","title":"AI Engineer, Voice Designer","slug":"ai-engineer-voice-designer-c02cea46","description":"About Dialpad Dialpad is the AI-native business communications platform. We unify calling, messaging, meetings, and contact center on a single platform - powered by AI that understands every conversation in real time. \n More than 70,000 companies around the globe, including WeWork, Asana, NASDAQ, AAA Insurance, COMPASS Realty, Uber, Randstad, and Tractor Supply, rely on Dialpad to build stronger customer connections using real-time, AI-driven insights. 
\n We’re now leading the shift to Agentic AI: intelligent agents that don’t just analyze conversations but take action by automating workflows, resolving customer issues, and accelerating revenue in real time. Our DAART initiative (Dialpad Agentic AI in Real Time) is redefining what a communications platform can do. \n Visit dialpad.com to learn more. \n Being a Dialer At Dialpad, AI isn’t just a feature; it’s how our teams do their best work every day. We put powerful AI tools in every employee’s hands so they can move faster, think bigger, and achieve more. \n We believe every conversation matters. And we’ve built the platform that turns those conversations into insight and action, for our customers and ourselves. \n We look for people who are intensely curious and hold themselves to a high bar. Our ambition is significant, and achieving it requires a team that operates at the highest level. We seek individuals who embody our core traits: Scrappy, Curious, Optimistic, Persistent, and Empathetic . \n Your Role As an AI Engineer: Voice Designer, you’ll own the back-end implementation and linguistic optimization of the Text-to-Speech (TTS) layer for our next-generation AI voice agents. You’ll work squarely within our Speech Team—a high-impact R\u0026D and engineering group focused on speech recognition, enhancement, and synthesis. You will bridge the gap between core speech science and product engineering, ensuring our voice agents sound human, context-aware, and trustworthy. You’ll also help create the systems that manage voice personas, tone, and conversational fillers, eventually exposing these as tweakable parameters to our customer-facing UI. \n This position reports to our Senior Manager, AI Speech, is based at our Vancouver hub, and operates on a hybrid schedule . 
\n What You’ll Do \n \n TTS Backend Implementation: Own the integration and optimization of multiple TTS vendor APIs while leading research and prototyping for open-source or in-house TTS architectures. \n Linguistic Optimization: Apply expertise in phonetics and sociolinguistics to ensure TTS input is formatted for maximum naturalness, including SSML orchestration and pronunciation handling. \n Conversational Turn Design: Craft context-specific utterances to optimize turn handling and build caller trust during agentic \"thought\" processes. \n Prompt \u0026 Persona Management: Design and manage LLM and TTS prompts and parameters to define and refine agent personalities across different industry verticals. \n UI Parameter Exposure: Architect the logic to expose voice attributes (speed, pitch, tone, style) to the product UI, allowing customers to customize their agent’s voice profile. \n Cross-Functional R\u0026D: Partner with ASR and Audio AI engineers to ensure end-to-end voice quality and minimize latency in the ASR → LLM → TTS pipeline. \n \n Skills You’ll Bring \n \n Technical Foundation: Strong Python programming skills and experience with deep learning frameworks (e.g. PyTorch). \n Speech Expertise: 3+ years of experience in Speech Synthesis (TTS) or Voice Design, including hands-on work with frameworks like NVIDIA NeMo, ESPnet, or Coqui, and hands-on experience with major TTS APIs such as ElevenLabs, Rime, and Cartesia. \n Linguistic Background: Degree in Computational Linguistics, Computer Science, or AI/ML with a deep understanding of phonetics, prosody, and syntax. \n Prompt Engineering: Proven experience crafting and evaluating LLM prompts (system, few-shot) and managing structured prompt templates. \n Backend Engineering: Experience building production-grade APIs and integrating multi-vendor services in a cloud environment (GCP preferred). 
\n Evaluation Mindset: Knowledge of speech quality metrics (MOS, intelligibility, latency) and the ability to design rigorous A/B tests for voice personas. \n For exceptional talent based in British Columbia, Canada  the target base salary range for this position is posted below. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the target range for new hire salaries for the position. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in British Columbia role postings reflect the base salary only, and do not include bonu","salary_min":161500,"salary_max":191500,"location":"Vancouver, Canada","workplace":"hybrid","remote_scope":"not_remote","job_type":"full-time","experience_level":"mid","tags":["llm","pytorch","deep-learning","speech","cloud","agents"],"apply_url":"https://job-boards.greenhouse.io/dialpad/jobs/8597852002","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-06-19T17:04:22Z","expires_at":"2026-08-14T14:23:00.54492Z","created_at":"2026-06-28T14:19:26.794215Z","updated_at":"2026-07-15T14:23:00.655664Z","company_name":"Dialpad","company_slug":"dialpad","company_logo_url":"https://www.google.com/s2/favicons?domain=dialpad.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/7eb3a4f0-b9e4-4d32-8d4b-045ffe5c6ed8"},{"id":"7aeba6ea-09a9-41e9-a849-b67f75b6e4bd","company_id":"308b7777-69e1-49db-ad79-3912d6c6e648","title":"Engineering Manager, AI","slug":"engineering-manager-ai-499822bb","description":"Our Mission \n Healthcare should work for patients, but it doesn’t. In their time of need, they call down outdated insurance directories. Then wait on hold. 
Then wait weeks for the privilege of a visit. Then wait in a room solely designed for waiting. Then wait for a surprise bill. In any other consumer industry, the companies delivering such a poor customer experience would not survive. But in healthcare, patients lack market power. Which means they are expected to accept the unacceptable. \n  \n Zocdoc’s mission is to give power to the patient. To do that, we’ve built the leading healthcare marketplace that makes it easy to find and book in-person or virtual care in all 50 states, across +200 specialties and +12k insurance plans. By giving patients the ability to see and choose, we give them power. In doing so, we can make healthcare work like every other consumer sector, where businesses compete for customers, not the other way around. In time, this will drive quality up and prices down.  \n  \n We’re 18 years old and the leader in our space, but we are still just getting started. If you like solving important, complex problems alongside deeply thoughtful, driven, and collaborative teammates, read on. \n  \n Your Impact to the Mission \n Zocdoc’s most important asset is our people. As an Engineering Manager for Zo, Zocdoc’s AI phone assistant, you will play a critical role in achieving Zocdoc’s strategy of being the front door for healthcare. Zo acts like a human and manages appointments like an expert, handling unlimited inbound scheduling calls. Zo vanquishes hold times and instantly schedules appointments 24/7 using natural, conversational language. As the leader for this team, you’ll be on the cutting edge of building an LLM-based product, using voice technologies, and integrating with EHR systems. You will work with a highly motivated and experienced engineering team while collaborating with cross-functional teams, including Product, Sales, Data, Ops, Design, and Product Marketing. 
Your leadership, technical skill, and strategic vision will directly contribute to the value we deliver to our clients and the overall improvement of the healthcare experience.\n  \n You’ll enjoy this role if you are… \n \n Inspired by the opportunity to positively impact the healthcare experience for millions of customers and providers.\n Entrepreneurial, having a strong technical background, an ownership mindset, and a penchant for product and business strategy\n Passionate about software engineering and product development and building scalable, technical solutions\n Willing and eager to learn, including hands on development, code and system design review to guide and set an example for your team\n A leader who enjoys inspiring, supporting, and influencing their teams to do their best work with a focus on accountability, velocity, and continuous improvement\n A strong collaborator with excellent communication skills who enjoys being the bridge between engineering, product, and internal stakeholders\n A believer that diverse and inclusive teams and cultures are non-negotiable\n \n Your day to day is… \n \n Leading, managing, and growing a team of software engineers to deliver highly reliable, performant software solutions\n Partnering with product, design, and commercial teams to build out a clear technical product strategy and roadmap\n Working across the organization on Engineering culture, helping to build career ladders, steer cross-team events, and help build an inclusive technology workforce\n Mentoring and coaching a team of engineers to maintain a high quality of work, while delivering new features with bleeding edge technology at high velocity\n Ensuring we are building out the platform’s capabilities with attention to security, scale, performance, and uptime\n Working with cutting edge voice AI tools and technology\n \n You’ll be successful in this role if you have… \n \n The mentality of an entrepreneur/owner and a strong bias to action\n Experience 
working on consumer facing AI products, bonus if voice AI products\n Shipped multiple product features to external customers, either as an IC or Manager\n A deep knowledge of software development methodologies\n A passion for understanding the use cases of your customers and a focus on collaboration\n Experience working with microservice architectures and AWS / cloud environment\n AWS, React / JS for frontend development, C# / Python for backend development\n The ability to integrate generative AI tools into daily workflows to automate tasks, foster innovation, and maximize productivity\n Solid communication skills with the ability to influence key stakeholders and build consensus with engineers at all levels\n Humility. You believe in treating all people with dignity and respect, regardless of title or tenure, and you approach tough conversations with empathy\n \n Benefits: \n \n Flexible, hybrid work environment at our convenient Soho location\n Unlimited Vacation\n 100% pai","salary_min":210000,"salary_max":270000,"location":"New York, NY","workplace":"hybrid","remote_scope":"not_remote","job_type":"full-time","experience_level":"principal","tags":["speech","healthcare","generative-ai","cloud","llm","microservices"],"apply_url":"https://job-boards.greenhouse.io/zocdoc/jobs/7994224","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-06-15T17:13:50Z","expires_at":"2026-08-14T14:21:22.375979Z","created_at":"2026-06-28T14:17:59.453483Z","updated_at":"2026-07-15T14:21:22.471801Z","company_name":"ZocDoc","company_slug":"zocdoc","company_logo_url":"https://www.google.com/s2/favicons?domain=zocdoc.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/7aeba6ea-09a9-41e9-a849-b67f75b6e4bd"},{"id":"a3d16455-f42f-4915-8723-2d023a5b665b","company_id":"2ca4efa5-edc2-4352-a597-ea27086e1e5b","title":"Senior Software Engineer II, AI Labs \u0026 
Foundations","slug":"senior-software-engineer-ii-ai-labs-foundations-e74eb4cd","description":"We're transforming the grocery industry \n At Instacart, we invite the world to share love through food because we believe everyone should have access to the food they love and more time to enjoy it together. Where others see a simple need for grocery delivery, we see exciting complexity and endless opportunity to serve the varied needs of our community. We work to deliver an essential service that customers rely on to get their groceries and household goods, while also offering safe and flexible earnings opportunities to Instacart Personal Shoppers. \n Instacart has become a lifeline for millions of people, and we’re building the team to help push our shopping cart forward. If you’re ready to do the best work of your life, come join our table.\n Instacart is a Flex First team \n There’s no one-size fits all approach to how we do our best work. Our employees have the flexibility to choose where they do their best work—whether it’s from home, an office, or your favorite coffee shop—while staying connected and building community through regular in-person events. Learn more about our flexible approach to where we work. \n Overview\n Join Instacart's mission to transform grocery shopping through frontier AI. As a Senior Software Engineer on AI Labs \u0026 Foundations, you will design, build, and operate the high-scale production systems that power our most ambitious AI experiences—from Cart Assistant, our conversational shopping agent, to voice AI interactions and beyond. 
This is a high-impact opportunity to work at the intersection of robust software engineering and cutting-edge production AI/ML, directly shaping products used by millions of customers every day.\n We are hiring a Senior Software Engineer who will participate in the design and delivery of production AI systems, identify high-leverage technical opportunities, and contribute hands-on to AI-native products across Instacart's platform. We value bottom-up ideas, high engineering quality, and close partnership with Product, Data Science, ML, and Infrastructure teams. If you enjoy inventing, navigating ambiguity, prototyping fast, and turning wild ideas into real, scalable products, this is the team for you.\n AI Labs \u0026 Foundations sits at the intersection of frontier AI research and production engineering. Our portfolio spans the full stack of AI innovation at Instacart, including building and launching Cart Assistant, pioneering voice AI interactions, and constructing the foundational systems that power these cutting-edge experiences. 
We are a fast-moving, collaborative team that thrives on 0-to-1 thinking, shares learnings openly, and ships with urgency by prototyping fast and testing rigorously.\n About the Job\n \n Design, build, and operate production AI-powered systems and agentic experiences (including Cart Assistant and voice AI) that directly impact how millions of customers shop.\n Build foundational systems for cutting-edge AI experiences, ranging from embedding infrastructure and voice AI pipelines, to client facing components and integrations, by prototyping bold ideas and productizing what works.\n Integrate foundation models via APIs and open-source frameworks; apply techniques like retrieval-augmented generation and vector search where appropriate.\n Own projects end-to-end: requirements, technical design, implementation, testing, deployment, observability, and iterative improvement focused on reliability, latency, and cost efficiency.\n Collaborate with cross-functional partners in product, design, data science, and infrastructure to ship AI features end-to-end.\n Drive engineering excellence, including thoughtful system design, rigorous code review, and technical leadership that includes defining and promoting best practices for AI/ML production engineering across the team.\n \n About You\n Minimum Qualifications: \n \n Proven senior software engineer who has built, shipped, and operated production systems at scale. You make architectural calls, own what you build, and deliver through ambiguity.\n Hands-on experience with AI or ML in production. 
You've shipped LLM-powered features or integrated foundation model APIs into a live product, demonstrating the necessary expertise at the intersection of robust software engineering and deep production ML.\n Experience owning services end-to-end, including CI/CD, automated testing, observability (logging, metrics, tracing), and on-call participation.\n Strong communicator who partners well across disciplines - you want to get to the right answer, not just defend the first one.\n Excitement and ability to leverage cutting-edge development tools, including AI assistance (e.g., Copilot, Cursor, Claude), to maximize velocity.\n \n Preferred Qualifications: \n \n 5 to 8+ years of industry experience.\n A track record of 0-to-1 work taking unconventional ideas from prototype through rapid iteration to production.\n Experience building conversational agents, multi-turn dialogue systems, or agentic LLM applications.\n Exp","salary_min":192000,"salary_max":202000,"location":"Remote (US)","workplace":"remote","remote_scope":"restricted","job_type":"full-time","experience_level":"senior","tags":["code-generation","agents","generative-ai","cloud","distributed-systems","speech","embeddings","search"],"apply_url":"https://instacart.careers/job/?gh_jid=7951041","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-05-29T22:43:14Z","expires_at":"2026-08-14T14:10:49.595125Z","created_at":"2026-05-30T14:08:42.180879Z","updated_at":"2026-07-15T14:10:49.726969Z","company_name":"Instacart","company_slug":"instacart","company_logo_url":"https://www.google.com/s2/favicons?domain=www.instacart.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/a3d16455-f42f-4915-8723-2d023a5b665b"},{"id":"e2f424cd-b174-4b49-99cd-822493faf193","company_id":"861968d1-d9f8-4217-9873-ce4b24851abc","title":"Software Engineer III, Voice AI","slug":"software-engineer-iii-voice-ai-8ddbaf75","description":"Position Summary \n  \n Are you ready to make a real impact on 
people's lives and be part of a rapidly-growing team? Natera is seeking a Software Engineer III to help design, develop, and maintain our Voice AI platform. This platform handles thousands of patient calls daily, providing automated test status, identity verification, billing support, and intelligent routing — directly improving patient access to their genetic testing results. Join us in our mission to change the way disease is managed, and be a part of a dedicated group of professionals who are passionate about making a difference. \n  \n The Software Engineer III – Voice AI is responsible for building and maintaining the real-time conversational AI systems that power Natera's automated patient call center. This role requires strong expertise in TypeScript and Node.js, hands-on experience with voice AI pipelines (STT, LLM orchestration, TTS), and familiarity with telephony systems, agentic architectures, and event-driven design. You should be comfortable working with real-time audio streaming, WebSocket protocols, and the unique latency and UX challenges of voice-based AI. You'll collaborate with cross-functional teams in a fast-paced environment to ship features that measurably improve call efficacy and patient satisfaction. \n  \n Primary Responsibilities \n Lead \n \n \n Take ownership of assigned voice AI features and components, guiding them through the full software development lifecycle. \n \n Contribute to design discussions, code reviews, and best practice adoption within the Voice AI team. \n \n Drive technical decisions on voice pipeline optimization — VAD tuning, turn-taking, interruption handling, and latency management. \n \n Manage \n \n \n Plan and prioritize tasks in an Agile environment, ensuring timely and high-quality delivery. \n \n Work with Product Managers and stakeholders to refine requirements and scope technical efforts for conversational AI features. 
\n \n Monitor voice platform health metrics (call efficacy, ASR accuracy, per-segment latency) and prioritize improvements based on data. \n \n Nurture \n \n \n Mentor junior team members, sharing knowledge and best practices in voice AI architecture, TypeScript, and real-time systems. \n \n Encourage a culture of continuous learning and technical excellence through pair programming and design reviews. \n \n Document voice AI patterns, integration contracts, and operational runbooks for the team. \n \n Collaborate \n \n \n Partner with Product Managers, QA, and clinical operations to gather requirements, validate conversational designs, and guide projects from inception to deployment. \n \n Coordinate with other engineering teams to integrate voice agents with internal services via authenticated APIs. \n \n Work with the analytics team to ensure voice metrics flow correctly through the data pipeline for reporting and optimization. \n \n Effect Change \n \n \n Drive improvements in our multi-agent orchestration approach — tool calling patterns, agent handoff logic, and state management across conversation turns. \n \n Advocate for high-quality standards and automated testing strategies for conversational AI systems, including voice-specific test patterns (simulated calls, transcript validation, latency benchmarks). \n \n Identify and resolve voice-specific UX issues: ASR errors on medical terminology, silence detection tuning, barge-in recovery, and end-to-end response latency. \n \n  \n Qualifications \n \n \n 5+ years of overall software development experience, focusing on scalable backend services using Node.js and TypeScript. \n \n 1+ years of experience with voice AI, conversational AI, or real-time audio systems in production. \n \n Hands-on experience with agentic LLM architectures — tool calling, multi-agent orchestration, prompt engineering, and conversation state management. 
\n \n Familiarity with voice AI pipeline components: STT (Deepgram, Azure Speech, OpenAI Whisper), TTS (ElevenLabs, OpenAI, Cartesia), and LLM APIs (OpenAI Realtime API, Anthropic Claude). \n \n Experience with telephony systems — Twilio (media streams, SIP, IVR) or equivalent WebSocket-based audio streaming platforms. \n \n Understanding of voice-specific challenges: VAD configuration, turn-taking, interruption handling, latency budgets, and audio codec management (mulaw/PCM). \n \n Solid understanding of the software development lifecycle (SDLC), including build, configuration, release, and deployment. \n \n Knowledge of microservice architecture and distributed systems best practices. \n \n Proficiency with AWS services (ECS Fargate, Lambda, DynamoDB, S3, Kafka/MSK, API Gateway). \n \n Experience with event-driven architecture and message processing (e.g., Apache Kafka, SQS). \n \n Strong relational database skills (MySQL) and exposure to NoSQL databases (DynamoDB, Redis). \n \n Demonstrated teamwork skills and a collaborative mindset. \n \n Excellent communication and organizational skills. 
\n \n Experience with RAG architectures (AWS Bedrock, vector s","salary_min":105700,"salary_max":132100,"location":"Remote (US)","workplace":"remote","remote_scope":"restricted","job_type":"full-time","experience_level":"senior","tags":["data-pipeline","rag","distributed-systems","api-design","healthcare","embeddings","speech","llm"],"apply_url":"https://job-boards.greenhouse.io/natera/jobs/6001268004","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-05-26T17:11:57Z","expires_at":"2026-08-14T14:12:45.398322Z","created_at":"2026-05-27T14:10:38.770244Z","updated_at":"2026-07-15T14:12:45.517011Z","company_name":"Natera","company_slug":"natera","company_logo_url":"https://www.google.com/s2/favicons?domain=natera.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/e2f424cd-b174-4b49-99cd-822493faf193"},{"id":"5316f644-75b1-462f-8dac-e2ff8f5c62eb","company_id":"861968d1-d9f8-4217-9873-ce4b24851abc","title":"Senior Software Engineer, Voice AI","slug":"senior-software-engineer-voice-ai-eb824760","description":"Role Description \n This is a high-autonomy, high-agency position for a voice AI engineer who thrives at the intersection of real-time systems, conversational AI, and healthcare. You'll own the architecture and delivery of Natera's Voice AI platform — a production system handling thousands of patient calls daily that provides automated test status, identity verification, billing support, and intelligent routing to human agents. \n  \n You'll work across the full voice AI stack: telephony, speech-to-text, LLM orchestration, text-to-speech, and analytics — building agentic conversational systems that directly improve patient access to their genetic testing results. This role requires deep understanding of the intricacies unique to voice AI: real-time audio streaming, turn-taking, interruption handling, latency optimization, and the orchestration challenges that distinguish voice from text-based AI systems. 
\n  \n Your work will span two critical domains: \n  \n 1. Voice AI Platform Engineering \n  \n Design, build, and operate Natera's production voice AI system. This includes multi-agent orchestration, real-time WebSocket audio pipelines, telephony integration, and the voice-specific challenges of latency management, VAD tuning, barge-in handling, and ASR accuracy for medical terminology. \n  \n 2. Agentic Conversational Architecture \n  \n Architect and implement autonomous agent workflows that handle complex patient interactions end-to-end — identity verification, OTP validation, personalized test status delivery, billing inquiries, and intelligent escalation. You'll design tool-calling patterns, agent handoff logic, state management across conversation turns, and the analytics infrastructure needed to measure and improve call efficacy. \n  \n What You'll Do \n \n \n Own the end-to-end voice AI architecture — from Twilio media streams through LLM orchestration to TTS output and call disposition \n \n Design and implement multi-agent systems using tool calling, agent handoffs, and shared conversation state for complex patient workflows \n \n Build and optimize real-time audio pipelines — WebSocket streaming, codec handling (mulaw/PCM), VAD configuration, and interruption management \n \n Architect analytics and observability infrastructure for voice-specific metrics: per-segment latency (STT/LLM/TTS), call efficacy, disposition accuracy, and ASR error rates \n \n Solve voice-specific challenges: turn-taking timing, silence detection thresholds, barge-in recovery, medical term recognition, and end-to-end latency optimization \n \n Integrate voice agents with internal services via secure authenticated APIs \n \n Drive platform reliability — eliminate single points of failure, implement multi-provider LLM failover, and design graceful degradation paths \n \n Collaborate with product and clinical operations to improve self-serve efficacy rates and reduce call 
escalations \n \n Mentor team members on voice AI best practices and contribute to architectural decisions \n \n  \n What We're Looking For \n \n \n 5+ years of software engineering experience, with at least 2 years building production voice AI or conversational AI systems \n \n Deep experience with voice AI pipelines — you understand the end-to-end flow from telephony through STT, LLM processing, TTS, and back to the caller, and you've solved real problems at each stage \n \n Production experience with agentic architectures — multi-agent orchestration, tool calling, agent handoffs, memory/state management, and LLM-driven decision making in real-time conversation contexts \n \n Strong understanding of voice-specific challenges: VAD tuning, turn-taking, interruption/barge-in handling, latency budgets, audio codec management, and the differences between voice and text-based AI UX \n \n Hands-on experience with telephony systems — Twilio (media streams, SIP, IVR), or equivalent platforms with WebSocket-based audio streaming \n \n Proficiency in TypeScript/Node.js with strong async programming patterns; experience with NestJS or similar frameworks \n \n Experience with STT/TTS providers (Deepgram, OpenAI, ElevenLabs, Azure Speech) and understanding of ASR accuracy challenges (domain-specific vocabulary, noise handling) \n \n Production experience with LLM APIs — OpenAI (especially Realtime API), Anthropic Claude, or equivalent; prompt engineering for conversational agents \n \n High agency and autonomy — you don't wait for permission, detailed specs, or hand-holding. 
You unblock yourself, seek out the highest-impact work, and drive it to completion \n \n Excellent communication — you can translate complex voice AI architecture decisions for product and clinical stakeholders \n \n  \n Preferred \n \n \n Experience in healthcare, biotech, or regulated environments (HIPAA, PHI handling, zero-retention architectures, BAA compliance) \n \n AWS infrastructure experience — ECS Fargate, Lambda, DynamoDB, Bedrock, Kafka/MSK, API Gateway, CDK \n \n Background in real-time systems: WebSocket lifecycle m","salary_min":125000,"salary_max":156000,"location":"Remote (US)","workplace":"remote","remote_scope":"restricted","job_type":"full-time","experience_level":"senior","tags":["agents","api-design","payments","speech","healthcare","embeddings","llm","rag"],"apply_url":"https://job-boards.greenhouse.io/natera/jobs/6001270004","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-05-26T17:11:25Z","expires_at":"2026-08-14T14:12:45.326285Z","created_at":"2026-05-27T14:10:38.690313Z","updated_at":"2026-07-15T14:12:45.446009Z","company_name":"Natera","company_slug":"natera","company_logo_url":"https://www.google.com/s2/favicons?domain=natera.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/5316f644-75b1-462f-8dac-e2ff8f5c62eb"},{"id":"57221d71-7c6d-475a-8bf7-a9a2f0b6c944","company_id":"6ea0f41a-b13e-481a-b410-5195f391f939","title":"Staff Platform Engineer, Voice AI","slug":"staff-platform-engineer-voice-ai-d5bc66dc","description":"About the Role \n Together AI is defining the infrastructure layer for the next generation of voice applications. Our Voice AI platform powers production-grade, real-time voice agents at scale — and we're looking for a Staff Platform Engineer to own the architecture that makes it possible. \n This isn't a role about maintaining what exists. 
You'll set the technical direction for how developers interact with Together's voice platform — from the real-time API primitives they build on, to the autoscaling systems that keep latency SLOs intact under unpredictable load, to the multi-provider abstraction layer that makes our platform uniquely powerful. Voice infrastructure is categorically harder than text inference: bidirectional audio streams, stateful long-lived connections, millisecond latency requirements, and complex multi-model routing don't forgive architectural shortcuts. You'll bring the judgment to get this right the first time, at scale. \n This is a foundational hire on a small, high-conviction team. The decisions you make in this role will define the platform architecture for years. \n Responsibilities \n \n Own the architecture and reliability of Together's real-time API layer — set the technical direction for WebSocket and HTTP streaming APIs powering STT and TTS at scale; establish the reliability bar (connection lifecycle, backpressure, graceful degradation, reconnection) that production voice agents — contact centers, AI agents, communication platforms — depend on. \n Lead autoscaling architecture for latency-sensitive voice workloads — design and ship orchestration systems that handle bursty, real-time traffic across tens of thousands of GPUs; solve the hard problems at the intersection of concurrent connection limits, streaming state, and hard latency ceilings that generic autoscalers weren't built for. \n Define the voice API feature surface — make the architectural calls on word-level alignment, real-time speaker diarization, audio format support (g711/mulaw, PCM, WebRTC), pronunciation controls, and multi-context WebSocket — with a clear view of what unlocks the next category of developer use cases. 
\n Build the observability platform for voice infrastructure — design the latency breakdown pipelines, audio quality signal collection, and customer-facing dashboards that give both the team and developers the instrumentation they need to operate at production quality; make debugging voice issues fast and systematic. \n Own the multi-provider abstraction layer — architect the normalization layer across model partners (Cartesia, Deepgram, Rime, and others) that delivers consistent, provider-agnostic API behavior; your design should absorb upstream variability without exposing it to developers. \n Drive the interface between API and ML serving — partner closely with ML engineering leadership to define the contract between the API layer and the model serving stack; your decisions here have direct impact on end-to-end latency and reliability SLAs. \n Raise the bar for developer experience across the platform — lead API design reviews, shape documentation strategy, define integration patterns and cookbooks; the voice developer experience should be something the industry references, not just adequate. \n Architect for the product surface that doesn't exist yet — build systems with the foresight that they become the foundation for multiple new voice products; your platform decisions should expand what's possible, not constrain it. \n \n Requirements \n \n 8+ years of experience building large-scale, real-time distributed systems — with clear ownership of systems that carried production traffic at meaningful scale; you can speak to the architectural decisions you made and defend the tradeoffs. \n Deep, battle-tested expertise in real-time streaming infrastructure — WebSocket server architecture, SSE, bidirectional streaming, connection multiplexing, stateful protocol design — you've debugged production failures in these systems and come out with durable architectural improvements. 
\n Expert-level TypeScript and Python, with strong proficiency in systems-level thinking; Rust experience is a meaningful advantage at this level given where voice infrastructure is heading. \n Senior distributed systems judgment — load balancing, autoscaling, rate limiting, and traffic shaping for latency-sensitive workloads aren't concepts you reference, they're problems you've solved under pressure. \n Deep Kubernetes expertise — custom autoscalers, resource management, and health checking for stateful, streaming services; you've built Kubernetes automation that handled edge cases the off-the-shelf tooling couldn't. \n Strong technical leadership — you set direction, influence across teams without authority, bring clarity to ambiguous problems, and leave systems and teams meaningfully better than you found them. \n Sharp product intuition for developer platforms — you have genuine opinions about API ergonom","salary_min":220000,"salary_max":280000,"location":"San Francisco, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"lead","tags":["mlops","speech","agents","distributed-systems","api-design","platform"],"apply_url":"https://job-boards.greenhouse.io/togetherai/jobs/5142176007","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-05-19T19:32:06Z","expires_at":"2026-08-14T14:02:21.42469Z","created_at":"2026-05-27T14:02:00.784235Z","updated_at":"2026-07-15T14:02:21.550768Z","company_name":"Together AI","company_slug":"together-ai","company_logo_url":"https://www.google.com/s2/favicons?domain=together.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/57221d71-7c6d-475a-8bf7-a9a2f0b6c944"},{"id":"bc38cbd7-6147-49eb-a610-64fb031af669","company_id":"6ea0f41a-b13e-481a-b410-5195f391f939","title":"Staff Machine Learning Engineer, Voice AI ","slug":"staff-machine-learning-engineer-voice-ai-049973bf","description":"About the Role \n Together AI is building the best inference 
infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability.\n We're looking for a Staff ML Engineer to drive the model serving layer for voice workloads. You'll work hands-on with inference engines like TRT-LLM and SGLang to optimize how we serve models like Whisper, Parakeet, Orpheus, and Kokoro — pushing latency and throughput to the frontier. You'll profile GPU utilization, design batching strategies for streaming audio, and ensure new model architectures can go from research to production quickly.\n This is a foundational hire on a small, high-impact team. Voice inference has unique challenges — streaming audio, tokenization, real-time latency budgets — that require dedicated ML engineering focus. You'll shape how Together serves voice models as the industry moves from pipeline architectures (ASR → LLM → TTS) toward end-to-end speech-to-speech.\n \n Own the model serving stack that powers Together's voice platform across STT, TTS, and speech-to-speech.\n Work directly with state-of-the-art accelerators (H100s, H200s, B200s) to optimize voice model inference.\n Collaborate with model partners (Cartesia, Deepgram, Rime, and others) to bring their models to production on Together's infrastructure.\n Build quality evaluation frameworks that guide model selection for customers and inform the roadmap.\n Join a small, early-stage team with outsized impact on a fast-growing product area.\n \n  \n Responsibilities \n \n Own the voice inference roadmap end-to-end — define and execute the technical strategy for optimizing STT, TTS, and speech-to-speech models across Together's infrastructure, with a clear-eyed view of where the field is heading and how to position the platform ahead of it.\n Drive best-in-class inference performance — architect and implement systems targeting leading TTFB, throughput, and GPU 
utilization for voice workloads; set the performance bar others in the industry measure against, not just catch up to.\n Lead productionization of voice models at scale — design the serving architecture for serverless and dedicated endpoints, including batching strategies, streaming inference pipelines, and memory management tailored to real-time audio; own reliability and latency SLAs.\n Build the voice evaluation platform — design a rigorous, extensible evaluation framework covering WER across accents, languages, and noise conditions for STT; naturalness, latency, and pronunciation fidelity for TTS; establish the internal benchmark methodology that informs model selection and roadmap decisions.\n Shape the architecture for next-generation model support — anticipate and enable emerging model paradigms — audio-native LLMs, codec-based architectures (SNAC, Encodec), and end-to-end speech-to-speech systems — before they're mainstream, not after.\n Serve as the technical DRI for model partner integrations — lead deep collaboration with partners such as Cartesia, Deepgram, and Rime; own the full lifecycle from integration to optimization to ongoing performance accountability.\n Diagnose and resolve the hardest performance problems in the stack — conduct systematic profiling and root-cause analysis from GPU kernel behavior to framework-level bottlenecks; drive shipped improvements with documented, measurable impact.\n Influence platform architecture across the organization — partner with platform engineering leadership to ensure the serving layer is built for the latency and reliability demands of real-time voice APIs; your technical decisions should raise the ceiling for the whole team.\n Define and scale voice fine-tuning capabilities — lead the technical direction for enabling customers to fine-tune STT and TTS models on Together's infrastructure, establishing the primitives for differentiated voice experiences.\n Lay technical foundations for a category-defining 
product surface — architect systems with enough foresight that they support multiple new voice products with minimal rework; think in terms of platforms, not point solutions.\n \n Requirements \n \n 8+ years of ML engineering experience, with a demonstrated focus on model serving, inference optimization, or ML infrastructure at production scale — including systems you've owned from design through live traffic.\n Deep, practical expertise in LLM serving engines (vLLM, SGLang, TensorRT-LLM, or equivalent) — you've modified engine internals, debugged edge cases under load, and contributed improvements back; you don't stop at the API surface.\n Expert-level Python and PyTorch proficiency, with a strong command of GPU optimization — CUDA kernels, memory hierarchies, profiling toolchains — and a track record of turning that knowledge into shipped latency or throughput wins.\n Proven system design judgment — you've made arch","salary_min":220000,"salary_max":280000,"location":"San Francisco, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"lead","tags":["pytorch","mlops","gpu","speech","llm","fine-tuning","machine-learning"],"apply_url":"https://job-boards.greenhouse.io/togetherai/jobs/5140763007","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-05-19T18:19:46Z","expires_at":"2026-08-14T14:02:21.254715Z","created_at":"2026-05-27T14:02:00.695384Z","updated_at":"2026-07-15T14:02:21.386773Z","company_name":"Together AI","company_slug":"together-ai","company_logo_url":"https://www.google.com/s2/favicons?domain=together.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/bc38cbd7-6147-49eb-a610-64fb031af669"},{"id":"b9819fc1-996f-4825-bec6-a24dd9a53bdc","company_id":"0fc88a91-688e-421d-917d-4880569dd976","title":"Research Engineer, Voice","slug":"research-engineer-voice-ae96b0ee","description":"About Inflection AI \n Inflection AI is a Public Benefit Corporation empowering people with 
human-centered, emotionally intelligent AI. We’re shaping the future of AI by combining emotional intelligence (EQ) and raw intelligence (IQ) to elevate people’s potential. Inflection AI created Pi, the world’s first emotionally intelligent AI, to help people work through decisions, emotions, and challenges. Pi is a personal AI agent powered by Inflection AI’s foundation model, proving that AI can be personal, empathetic, and contextually aware.\n About the Role \n We’re looking for a Member of Technical Staff, Voice focused on voice and audio to help advance the spoken intelligence behind Pi. In this role, you’ll work at the intersection of research and production—developing, training, and shipping neural models across the full spectrum of voice: speech synthesis, recognition, audio generation, and real-time spoken dialogue. You’ll collaborate closely with ML engineers, product teams, and infrastructure to turn cutting-edge ideas in areas like neural audio codecs, diffusion-based TTS, and multimodal foundation models into the natural, expressive voice experiences that millions of Pi users interact with every day.\n What You’ll Do \n \n Research, develop, and optimize neural models for voice and audio—including text-to-speech, automatic speech recognition, audio generation, and spoken dialogue systems.\n Build and maintain production-grade training and inference pipelines for voice models, with close attention to latency, naturalness, and scalability.\n Run experiments end-to-end: data curation, model architecture design, training, evaluation, and ablation studies.\n Collaborate with ML engineers, product teams, and infrastructure to integrate voice models into Pi’s real-time conversational stack.\n Explore and apply advances in neural audio codecs, diffusion-based synthesis, streaming architectures, and multimodal foundation models to improve Pi’s voice experience.\n Develop robust evaluation frameworks combining perceptual metrics, automated benchmarks, and 
user-facing quality signals.\n Contribute to Inflection’s research culture through publications, internal reviews, and knowledge sharing.\n \n What We’re Looking For \n \n 2-5 years of research or engineering experience (including graduate work) in audio, speech, or multimodal ML.\n Strong proficiency in PyTorch and hands-on experience training and debugging large-scale neural models on GPU/accelerator clusters.\n Solid understanding of audio and speech fundamentals: spectrograms, mel features, vocoders, codec-based representations, and signal processing.\n Demonstrated ability to take a research idea from prototype to production: equally comfortable reading papers and writing efficient, CUDA-aware training loops.\n Familiarity with modern generative architectures for audio (e.g., diffusion models, autoregressive codecs, flow-matching) and their trade-offs.\n Clear, collaborative communication: able to distill complex research into actionable insights for cross-functional partners.\n Have a bachelor’s degree or equivalent in Computer Science, Electrical Engineering, Linguistics, or a related field; MS or PhD strongly preferred.\n \n Employee Pay Disclosures \n At Inflection AI, we aim to attract and retain the best employees and compensate them in a way that appropriately and fairly values their individual contributions to the company. For this role, Inflection AI estimates a starting annual base salary to fall within the range of $225,000 to $325,000, depending on a candidate’s qualifications and level of experience. This role also includes a meaningful equity component, allowing employees to share in the long-term success of the company.\n Benefits \n Inflection AI values and supports our team’s mental, emotional, financial and physical health. We are focused on building a positive, safe, inclusive and inspiring place to work. 
Our benefits include: \n \n Robust medical, dental and vision options with employer contributions for HSA, FSA and DFSA\n 401k matching program \n Flexible Time Off, 10 paid holidays, 5 days sick leave\n Parental, Medical and Family care leave \n Generous cell-phone, wellness and office set up stipends \n Support of country-specific visa needs for international employees living in the Bay Area","salary_min":225000,"salary_max":325000,"location":"Palo Alto, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"junior","tags":["pytorch","agents","generative-ai","search","speech","gpu","diffusion-models","research"],"apply_url":"https://boards.greenhouse.io/inflectionai/jobs/4681124006?gh_jid=4681124006","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-05-12T20:45:47Z","expires_at":"2026-08-14T14:06:38.162814Z","created_at":"2026-05-14T14:05:27.875309Z","updated_at":"2026-07-15T14:06:38.310779Z","company_name":"Inflection AI","company_slug":"inflection-ai","company_logo_url":"https://www.google.com/s2/favicons?domain=inflection.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/b9819fc1-996f-4825-bec6-a24dd9a53bdc"},{"id":"e166cabe-5ba5-4fe5-a30d-688ddd5f8fc1","company_id":"5dfcd8fc-f8dd-4f46-b613-ca6da467ff4b","title":"Machine Learning Researcher, Audio","slug":"machine-learning-researcher-audio-6b0906fa","description":"MACHINE LEARNING RESEARCHER, AUDIO\n\nLocation: San Francisco, CA or Remote\n\n \n \n\n\nABOUT BLAND\n\nAt Bland.com, our mission is to empower enterprises to build AI phone agents at scale. Based in San Francisco, we are a fast-growing team reimagining how customers interact with businesses through voice. 
We have raised $65 million from leading Silicon Valley investors, including Emergence Capital, Scale Venture Partners, Y Combinator, and founders of Twilio, Affirm, and ElevenLabs.\n\n \n\nVoice is quickly becoming the primary interface between businesses and their customers. We are building the models and infrastructure that make those interactions feel natural, reliable, and genuinely human.\n\n \n \n\n\nTHE ROLE: MACHINE LEARNING RESEARCHER, AUDIO\n\nAs a Machine Learning Researcher at Bland, you'll be working on foundational research and development across the core components of our voice stack: speech-to-text, large language models, neural audio codecs, and text-to-speech. Your work will define how our agents understand, reason, and speak in real time at enterprise scale.\n\n \n\nThis is not a narrow research role. You will take ideas from theory to large-scale training to production inference systems serving millions of calls per day. You will design new modeling approaches, validate them with rigorous experimentation, and collaborate with engineering teams to deploy them into real customer environments.\n\n \n \n\n\nWHAT YOU WILL DO\n\nBuild and Scale Next-Generation TTS Systems\n\n - Design and train large scale text-to-speech models capable of expressive, controllable, human-sounding output.\n\n - Develop neural audio codec-based TTS architectures for efficient, high-fidelity generation.\n\n - Improve prosody modeling, question inflection, emotional expression, and multi-speaker robustness.\n\n - Optimize for real-time, low-latency inference in production.\n\n \n\nAdvance Speech-to-Text Modeling\n\n - Build and fine-tune large scale ASR systems robust to accents, noise, telephony artifacts, and code switching.\n\n - Leverage self-supervised pretraining and large-scale weak supervision.\n\n - Improve transcription accuracy for real-world enterprise scenarios, including structured extraction and conversational nuance.\n\n \n\nPioneer Neural Audio Codecs\n\n 
- Research and implement neural audio codecs that achieve extreme compression with minimal perceptual loss.\n\n - Explore discrete and continuous latent representations for scalable speech modeling.\n\n - Design codec architectures that enable downstream generative modeling and controllable synthesis.\n\n \n\nDevelop Scalable Training Pipelines\n\n - Curate and process massive audio datasets across languages, speakers, and environments.\n\n - Design staged training curricula and data filtering strategies.\n\n - Scale training across distributed GPU clusters focusing on cost, throughput, and reliability.\n\n \n\nRun Rigorous Experiments\n\n - Design ablation studies that isolate the impact of architectural changes.\n\n - Measure improvements using both objective metrics and perceptual evaluations.\n\n - Validate ideas quickly through focused experiments that confirm or eliminate hypotheses.\n\n \n \n\n\nWHAT MAKES YOU A GREAT FIT\n\nDeep Research Foundations\n\n - Experience with self-supervised learning, multimodal modeling, or generative modeling.\n\n - Ability to derive new formulations and implement them efficiently.\n\n \n\nExpertise in Voice Modeling\n\n - Hands-on experience building or scaling TTS, STT, or neural audio codec systems.\n\n - Familiarity with large scale speech datasets and real-world audio variability.\n\n - Strong intuition for audio quality, prosody, and conversational dynamics.\n\n \n\nSystems and Hardware Awareness\n\n - Experience training and serving large models on modern accelerators.\n\n - Knowledge of inference optimization techniques, including quantization, kernel optimization, and memory efficiency.\n\n - Understanding of real-time constraints in telephony or streaming environments.\n\n \n\nExperimental Rigor\n\n - Track record of designing controlled experiments and meaningful ablations.\n\n - Comfortable working with both offline benchmarks and live production metrics.\n\n - Ability to move quickly from hypothesis to 
validation.\n\n \n\nBuilder Mentality\n\n - Comfortable in fast-moving startup environments.\n\n - Strong ownership mindset from research through deployment.\n\n - Excited by ambiguous, unsolved problems.\n\n \n \n\n\nHOW YOU SHOW UP\n\n - You treat unsolved problems as opportunities to invent new paradigms.\n\n - You identify the single experiment that can validate an idea in days, not months.\n\n - You measure everything and let data drive decisions.\n\n - You are obsessed with making voice agents sound truly human.\n\n - You use AI tools aggressively to amplify your own impact and accelerate research cycles.\n\n \n \n\n\nBONUS POINTS\n\n - Experience with large scale distributed training.\n\n - Research publications or open source contributions in speech or language AI.\n\n - Background in real-time speech systems or telephony.\n\n ","salary_min":160000,"salary_max":250000,"location":"San Francisco, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"senior","tags":["pre-training","distributed-systems","healthcare","speech","llm","gpu","machine-learning","research"],"apply_url":"https://jobs.ashbyhq.com/bland/2e815d0d-8e7a-43cc-8853-c1b029aeb499/application","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-04-20T22:07:11.702Z","expires_at":"2026-08-14T14:08:20.678982Z","created_at":"2026-04-22T15:40:14.708917Z","updated_at":"2026-07-15T14:08:20.806669Z","company_name":"Bland AI","company_slug":"bland-ai","company_logo_url":"https://www.google.com/s2/favicons?domain=bland.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/e166cabe-5ba5-4fe5-a30d-688ddd5f8fc1"},{"id":"92e0e6b0-3459-44ec-9e1a-4e36a7b805d4","company_id":"4d985fa4-b897-4f93-9745-c332367ad86b","title":"Research Scientist - LLM ","slug":"research-scientist-llm-80f40837","description":"ABOUT RETELL AI\n\nRetell AI is using first principles to reimagine the call center with cutting-edge voice AI.\n\nThousands of 
companies now utilize Retell’s AI voice agents to handle sales, support, and logistics calls that once required large teams of human agents. Backed by Y Combinator, Alt Capital, and other leading investors, we have scaled to $60M ARR with a team of 40 people, up from $5M at the start of 2025.\n\nOur vision for 2026 is to build a modern CX platform where entire contact centers are powered by AI. Instead of basic automation that needs constant human tuning, we’re creating intelligent AI “workers” that can act as frontline agents, QA analysts, and managers, continuously executing, monitoring, and improving customer interactions.\n\nWe’re growing quickly and looking for ambitious builders who want to tackle hard technical problems, move fast, and have real impact at one of the fastest-growing voice AI startups.\n\nLet’s build the future together.\n\n - We’re a top 50 AI app in a16z list: https://tinyurl.com/5853dt2x\n\n - #4 on Brex's Fast-Growing Software Vendors of 2025: https://www.brex.com/journal/brex-benchmark-december-2025\n\n - We're also one of the top ranking startups on: https://leanaileaderboard.com/\n\n - Enterprise tech 30: https://www.wing.vc/et30/overview\n\n\n\n\nABOUT THE ROLE\n\nThis is a research-driven, high-impact role for ML researchers who want to push the boundaries of real-time AI. As a Founding Machine Learning Research Engineer at Retell, you’ll focus on advancing model capabilities for human-like voice agents operating in complex, real-world environments.\n\nYou’ll explore new approaches across LLMs and audio models, design novel evaluation methods, and prototype systems that improve reasoning, latency, and conversational quality. 
Your work will directly influence production systems, bridging cutting-edge research with real-world deployment.\n\nIf you’re excited about solving open-ended ML problems, experimenting rapidly, and shaping how voice AI systems think and perform, this is a unique opportunity to do so at scale.\n\n\n\n\nKEY RESPONSIBILITIES\n\n - Research \u0026 Experimentation – Explore and develop new techniques across LLMs and audio models to improve reasoning, latency, and conversational quality in real-time systems.\n\n - Model Training – Rapidly build and iterate on models and pipelines, turning research ideas into working prototypes. Innovate on paradigms, training methods, and inference.\n\n - Evaluation \u0026 Benchmarking – Design novel evaluation frameworks, datasets, and metrics to measure performance on complex, real-world voice tasks.\n\n - Bridge Research to Production – Collaborate closely with engineering to translate research insights into deployable systems.\n\n - Human Feedback Loops – Develop methods to incorporate human evaluation into model improvement, especially for subjective conversational quality.\n\n - Advance the Frontier – Stay at the cutting edge of ML research and bring new ideas into Retell’s product and infrastructure.\n\n\n\n\nREQUIRED\n\n - Strong ML Research Background – You've worked on advanced ML problems (like LLM pre-training and post-training, transcription model training, TTS, or multimodal systems), either in industry or academia.\n\n - Deep Technical Foundation – Comfortable with PyTorch, model architectures, and the math behind modern machine learning.\n\n - Top Academic Background – Master's degree in CS, ML, AI or related field required; PhD preferred. Equivalent research-level engineering experience also considered.\n\n\n\n\nYOU MIGHT THRIVE IF YOU\n\n - Published or Awarded – First/co-author publications at top-tier venues (NeurIPS, ICML, ICLR, ACL, Interspeech, etc.) 
or notable competition awards are a strong plus.\n\n - Experimental Mindset – You enjoy exploring open-ended problems and iterating quickly on ideas.\n\n - Bridge Theory \u0026 Practice – You can translate research into systems that work in real-world environments.\n\n - Startup-Ready – You thrive in fast-paced environments with high ownership and ambiguity.\n\n - Collaborative \u0026 Clear Communicator – You can explain complex ideas and work cross-functionally to drive impact.\n\n\n\n\nJOB DETAILS\n\n - Cash: $225,000 - $400,000 base salary\n\n - Equity: Offers Equity\n\n - Location: Redwood City, CA, US (100% Relocation Provided)\n\n - US Visas: Retell AI is open to sponsoring work authorization for qualified candidates, including H1B/H-1B, TN, L-1, E-3, F-1 (OPT/CPT), and O-1 visas.\n\n\n\n\nOTHER BENEFITS\n\n - 100% coverage for medical, dental, and vision insurance\n\n - $70/day DoorDash credit for unlimited meals and snacks\n\n - $200/month wellness reimbursement\n\n - $300/month commuter reimbursement\n\n - $75/month phone bill reimbursement\n\n - $50/month internet reimbursement\n   \n   \n\n\nCOMPENSATION PHILOSOPHY\n\n - Best Offer Upfront: Choose from three cash-equity balance options, no negotiation needed\n\n - Top 1% Talent: Above-market pay (top 5 percentile)\n\n - High Ownership: Small teams, \u003e$1M ","salary_min":225000,"salary_max":400000,"location":"San Francisco, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"lead","tags":["speech","llm","search","pytorch","pre-training","research"],"apply_url":"https://jobs.ashbyhq.com/retell-ai/b0d780eb-df25-49d0-859a-915de204a2f2/application","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-04-14T05:52:56.477Z","expires_at":"2026-08-14T14:13:49.035862Z","created_at":"2026-04-16T11:17:45.913083Z","updated_at":"2026-07-15T14:13:49.134706Z","company_name":"Retell 
AI","company_slug":"retell-ai","company_logo_url":"https://www.google.com/s2/favicons?domain=retellai.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/92e0e6b0-3459-44ec-9e1a-4e36a7b805d4"},{"id":"acc5d396-6aa2-40ba-8a49-632774606bde","company_id":"4d985fa4-b897-4f93-9745-c332367ad86b","title":"Research Scientist - Audio ","slug":"research-scientist-audio-918408c6","description":"ABOUT RETELL AI\n\nRetell AI is using first principles to reimagine the call center with cutting-edge voice AI.\n\nThousands of companies now utilize Retell’s AI voice agents to handle sales, support, and logistics calls that once required large teams of human agents. Backed by Y Combinator, Alt Capital, and other leading investors, we have scaled to $60M ARR with a team of 40 people, up from $5M at the start of 2025.\n\nOur vision for 2026 is to build a modern CX platform where entire contact centers are powered by AI. Instead of basic automation that needs constant human tuning, we’re creating intelligent AI “workers” that can act as frontline agents, QA analysts, and managers, continuously executing, monitoring, and improving customer interactions.\n\nWe’re growing quickly and looking for ambitious builders who want to tackle hard technical problems, move fast, and have real impact at one of the fastest-growing voice AI startups.\n\nLet’s build the future together.\n\n - We’re a top 50 AI app in a16z list: https://tinyurl.com/5853dt2x\n\n - #4 on Brex's Fast-Growing Software Vendors of 2025: https://www.brex.com/journal/brex-benchmark-december-2025\n\n - We're also one of the top ranking startups on: https://leanaileaderboard.com/\n\n - Enterprise tech 30: https://www.wing.vc/et30/overview\n\n\n\n\nABOUT THE ROLE\n\nThis is a research-driven, high-impact role for ML researchers who want to push the boundaries of real-time AI. 
As a Founding Machine Learning Research Engineer at Retell, you’ll focus on advancing model capabilities for human-like voice agents operating in complex, real-world environments.\n\nYou’ll explore new approaches across LLMs and audio models, design novel evaluation methods, and prototype systems that improve reasoning, latency, and conversational quality. Your work will directly influence production systems, bridging cutting-edge research with real-world deployment.\n\nIf you’re excited about solving open-ended ML problems, experimenting rapidly, and shaping how voice AI systems think and perform, this is a unique opportunity to do so at scale.\n\n\n\n\nKEY RESPONSIBILITIES\n\n - Research \u0026 Experimentation – Explore and develop new techniques across LLMs and audio models to improve reasoning, latency, and conversational quality in real-time systems.\n\n - Model Training – Rapidly build and iterate on models and pipelines, turning research ideas into working prototypes. Innovate on paradigms, training methods, and inference.\n\n - Evaluation \u0026 Benchmarking – Design novel evaluation frameworks, datasets, and metrics to measure performance on complex, real-world voice tasks.\n\n - Bridge Research to Production – Collaborate closely with engineering to translate research insights into deployable systems.\n\n - Human Feedback Loops – Develop methods to incorporate human evaluation into model improvement, especially for subjective conversational quality.\n\n - Advance the Frontier – Stay at the cutting edge of ML research and bring new ideas into Retell’s product and infrastructure.\n\n\n\n\nREQUIRED\n\n - Strong ML Research Background – You've worked on advanced ML problems (like LLM pre-training and post-training, transcription model training, TTS, or multimodal systems), either in industry or academia.\n\n - Deep Technical Foundation – Comfortable with PyTorch, model architectures, and the math behind modern machine learning.\n\n - Top Academic Background – 
Master's degree in CS, ML, AI or related field required; PhD preferred. Equivalent research-level engineering experience also considered.\n\n \n\n\nYOU MIGHT THRIVE IF YOU\n\n - Published or Awarded – First/co-author publications at top-tier venues (NeurIPS, ICML, ICLR, ACL, Interspeech, etc.) or notable competition awards are a strong plus.\n\n - Experimental Mindset – You enjoy exploring open-ended problems and iterating quickly on ideas.\n\n - Bridge Theory \u0026 Practice – You can translate research into systems that work in real-world environments.\n\n - Startup-Ready – You thrive in fast-paced environments with high ownership and ambiguity.\n\n - Collaborative \u0026 Clear Communicator – You can explain complex ideas and work cross-functionally to drive impact.\n\n\n\n\nJOB DETAILS\n\n - Cash: $225,000 - $400,000 base salary\n\n - Equity: Offers Equity\n\n - Location: Redwood City, CA, US (100% Relocation Provided)\n\n - US Visas: Retell AI is open to sponsoring work authorization for qualified candidates, including H1B/H-1B, TN, L-1, E-3, F-1 (OPT/CPT).\n\n\n\n\nOTHER BENEFITS\n\n - 100% coverage for medical, dental, and vision insurance\n\n - $70/day DoorDash credit for unlimited meals and snacks\n\n - $200/month wellness reimbursement\n\n - $300/month commuter reimbursement\n\n - $75/month phone bill reimbursement\n\n - $50/month internet reimbursement\n   \n   \n\n\nCOMPENSATION PHILOSOPHY\n\n - Best Offer Upfront: Choose from three cash-equity balance options, no negotiation needed\n\n - Top 1% Talent: Above-market pay (top 5 percentile)\n\n - High Ownership: Small teams, \u003e$1M revenue/emplo","salary_min":225000,"salary_max":400000,"location":"San Francisco, 
CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"lead","tags":["search","pre-training","speech","llm","pytorch","research"],"apply_url":"https://jobs.ashbyhq.com/retell-ai/7dbe5404-e08c-4c62-99dc-ef050534d029/application","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-04-14T05:52:52.3Z","expires_at":"2026-08-14T14:13:48.734791Z","created_at":"2026-04-16T11:17:45.838238Z","updated_at":"2026-07-15T14:13:48.913647Z","company_name":"Retell AI","company_slug":"retell-ai","company_logo_url":"https://www.google.com/s2/favicons?domain=retellai.com\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/acc5d396-6aa2-40ba-8a49-632774606bde"},{"id":"9da24f0c-dc11-4d04-8eb1-c0e2fdc44e97","company_id":"6ea0f41a-b13e-481a-b410-5195f391f939","title":"Senior Platform Engineer, Voice AI","slug":"senior-platform-engineer-voice-ai-92a1406f","description":"About the Role \n Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability.\n We're looking for a Senior Platform Engineer to own the API and infrastructure layer for voice workloads. You'll build the real-time WebSocket and HTTP APIs that developers use to ship voice experiences, design autoscaling for latency-sensitive streaming workloads, and ensure our multi-provider voice platform is reliable enough for production voice agents handling millions of calls.\n This is a foundational hire on a small, high-impact team. Voice APIs have fundamentally different infrastructure requirements than text-based inference — bidirectional audio streaming, stateful connections, tight latency SLOs, and complex multi-model routing. 
You'll define how developers interact with Together's voice platform as we grow from early customers to the default infrastructure for voice AI.\n \n Own the real-time API layer (WebSocket + HTTP streaming) that powers Together's voice platform.\n Design autoscaling and orchestration for voice workloads running on tens of thousands of GPUs.\n Build the developer experience — APIs, observability, and tooling — for a fast-growing product area.\n Work with production voice customers (contact centers, AI agents, communication platforms) to ship what they actually need.\n Join a small, early-stage team with outsized impact on a new product line.\n \n Responsibilities \n \n Build and harden real-time WebSocket and HTTP streaming APIs for STT and TTS — including connection lifecycle management, backpressure, error handling, and reconnection, at the reliability bar needed for production voice agents.\n Design and ship autoscaling for voice model endpoints that handles bursty, real-time traffic patterns — accounting for concurrent connection limits, streaming state, and hard latency ceilings.\n Implement voice-specific API features: word-level alignment, speaker diarization in realtime, audio format flexibility (g711/mulaw for telephony, PCM, WebRTC formats), pronunciation controls, and multi-context WebSocket support.\n Build voice-specific observability — latency breakdowns, audio quality signals, and dashboards that help both the team and customers debug issues.\n Own multi-model normalization across our model partners (Cartesia, Deepgram, Rime, and others), ensuring consistent API behavior regardless of the underlying provider.\n Collaborate with the ML engineering side of the team on the interface between the API layer and the model serving stack, ensuring latency and reliability requirements are met end-to-end.\n Contribute to developer experience — API design, documentation, integration cookbooks, playground and showcasing how best-in-class voice agents are built.\n 
Lay the groundwork for multiple new products down the line.\n \n Requirements \n \n 5+ years of experience building large-scale, real-time distributed systems and API services.\n Deep expertise in real-time streaming infrastructure — WebSocket server architecture, Server-Sent Events, bidirectional streaming, connection multiplexing, and stateful protocol design.\n Expert-level programming in TypeScript and Python; experience with Rust is a plus.\n Strong distributed systems fundamentals: load balancing, autoscaling, rate limiting, and traffic shaping for latency-sensitive workloads.\n Experience with Kubernetes — including custom autoscalers, resource management, and health checking for stateful services.\n Strong product sense — you care about API ergonomics and think about what developers building voice apps actually need.\n Comfort working on a small, early-stage team where you'll wear multiple hats and move fast.\n Experience with audio or media protocols (WebRTC, g711, PCM encoding) is a strong plus.\n Familiarity with ML model serving infrastructure and how inference engines work is a plus — you'll interface with the serving layer regularly.\n Full-stack experience (React, Next.js) is a nice-to-have for contributing to developer-facing tooling.\n Bachelor's or Master's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience.\n \n About Together AI \n Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. 
We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI in","salary_min":200000,"salary_max":260000,"location":"San Francisco, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"senior","tags":["distributed-systems","agents","mlops","api-design","speech","platform"],"apply_url":"https://job-boards.greenhouse.io/togetherai/jobs/5093344007","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-03-30T21:40:56Z","expires_at":"2026-08-14T14:02:20.401199Z","created_at":"2026-04-13T09:37:38.328299Z","updated_at":"2026-07-15T14:02:20.534836Z","company_name":"Together AI","company_slug":"together-ai","company_logo_url":"https://www.google.com/s2/favicons?domain=together.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/9da24f0c-dc11-4d04-8eb1-c0e2fdc44e97"},{"id":"fdeb7783-851b-48d1-810b-3d39970161b6","company_id":"6ea0f41a-b13e-481a-b410-5195f391f939","title":"Senior Machine Learning Engineer, Voice AI ","slug":"senior-machine-learning-engineer-voice-ai-e60e860b","description":"About the Role \n Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability.\n We're looking for a Senior ML Engineer to drive the model serving layer for voice workloads. You'll work hands-on with inference engines like TRT-LLM and SGLang to optimize how we serve models like Whisper, Parakeet, Orpheus, and Kokoro — pushing latency and throughput to the frontier. You'll profile GPU utilization, design batching strategies for streaming audio, and ensure new model architectures can go from research to production quickly.\n This is a foundational hire on a small, high-impact team. 
Voice inference has unique challenges — streaming audio, tokenization, real-time latency budgets — that require dedicated ML engineering focus. You'll shape how Together serves voice models as the industry moves from pipeline architectures (ASR → LLM → TTS) toward end-to-end speech-to-speech.\n \n Own the model serving stack that powers Together's voice platform across STT, TTS, and speech-to-speech.\n Work directly with state-of-the-art accelerators (H100s, H200s, B200s) to optimize voice model inference.\n Collaborate with model partners (Cartesia, Deepgram, Rime, and others) to bring their models to production on Together's infrastructure.\n Build quality evaluation frameworks that guide model selection for customers and inform the roadmap.\n Join a small, early-stage team with outsized impact on a fast-growing product area.\n \n Responsibilities \n \n Optimize inference performance for voice models (STT, TTS, speech-to-speech) — targeting best-in-class TTFB, throughput, and GPU utilization across our curated model set.\n Productionize voice models on serverless and dedicated endpoints, including batching strategies, streaming inference, and memory management tailored to audio workloads.\n Build and maintain a voice model evaluation framework — measuring WER across accents, languages, and noise conditions for STT; naturalness, latency, and pronunciation accuracy for TTS.\n Enable new model architectures in our serving stack as the field evolves, including audio-native LLMs, codec-based models (SNAC), and speech-to-speech systems.\n Collaborate with model partners to integrate and optimize their models (Cartesia, Deepgram, Rime, and others) running on Together's infrastructure.\n Profile and debug performance across the full inference stack — from GPU kernels to framework-level bottlenecks — and ship measurable improvements.\n Work with the platform engineering side of the team to ensure the serving layer meets the latency and reliability requirements of 
real-time voice APIs.\n Contribute to voice model fine-tuning capabilities (STT and TTS) as we enable customers to build differentiated voice experiences on Together.\n Lay the groundwork for multiple new products down the line.\n \n Requirements \n \n 5+ years of experience in ML engineering, with a focus on model serving, inference optimization, or ML infrastructure.\n Hands-on experience with LLM serving engines (vLLM, SGLang, TensorRT-LLM, or similar) — comfortable reading and modifying engine internals, not just using APIs.\n Strong proficiency in Python and PyTorch; experience with GPU profiling and optimization (CUDA, memory management, kernel-level debugging).\n Track record of shipping ML systems to production with measurable performance improvements.\n Strong product sense — you think about what developers building voice apps actually need, not just what's technically interesting.\n Comfort working on a small, early-stage team where you'll wear multiple hats and move fast.\n Experience with speech and audio ML (ASR, TTS architectures, audio signal processing) is a strong plus but not required — you can learn this quickly if you have strong ML engineering fundamentals.\n Familiarity with audio codecs and tokenization schemes (SNAC, Encodec, DAC) is a plus.\n Experience training or fine-tuning speech models is a plus.\n Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field, or equivalent practical experience\n \n About Together AI \n Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. 
We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancement such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey in building the next generation AI infrastructure.\n Compensation \n We offer competitive compensation, start","salary_min":200000,"salary_max":260000,"location":"San Francisco, CA","workplace":"onsite","remote_scope":"not_remote","job_type":"full-time","experience_level":"senior","tags":["speech","fine-tuning","gpu","llm","mlops","pytorch","machine-learning"],"apply_url":"https://job-boards.greenhouse.io/togetherai/jobs/5088817007","is_featured":false,"is_sticky":false,"status":"active","published_at":"2026-03-30T19:36:00Z","expires_at":"2026-08-14T14:02:20.318004Z","created_at":"2026-04-13T09:37:38.250213Z","updated_at":"2026-07-15T14:02:20.453403Z","company_name":"Together AI","company_slug":"together-ai","company_logo_url":"https://www.google.com/s2/favicons?domain=together.ai\u0026sz=128","quality_score":90,"url":"https://aidevboard.com/job/fdeb7783-851b-48d1-810b-3d39970161b6"}],"market_demand_pack":{"amount_cents":2900,"api_checkout_url":"https://aidevboard.com/api/v1/checkout?product_id=aidevboard_ai_skills_demand_pack","checkout_url":"https://aidevboard.com/market-demand-pack?qc=api-jobs-market-demand-pack\u0026utm_campaign=skills_demand_pack\u0026utm_medium=jobs_api\u0026utm_source=api","currency":"USD","description":"Full ranked public AI/ML demand CSV, source job URLs, and decision brief with market and offer angles.","fulfillment":"automatic_email_after_paid_checkout","human_checkout_url":"https://aidevboard.com/market-demand-pack?qc=api-jobs-market-demand-pack\u0026utm_campaign=skills_demand_pack\u0026utm_medium=jobs_api\u0026utm_source=api","name":"AI Market Demand Pack","next_step":"Open checkout_url for Stripe Checkout, or call api_checkout_url to get 
the non-charging checkout handoff payload.","price_usd":29,"product_id":"aidevboard_ai_skills_demand_pack","quote_url":"https://aidevboard.com/api/v1/quote?product_id=aidevboard_ai_skills_demand_pack"},"page":1,"per_page":20,"total":204,"total_pages":11}