Researcher, Models

Cartesia AI · San Francisco, CA
Full-time · Mid-level · Posted 1 year ago

About this role

ABOUT CARTESIA

Our mission is to build the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. Today, not even the best models can continuously process and reason over a year-long stream of audio, video and text (1B text tokens, 10B audio tokens and 1T video tokens), let alone do this on-device.

We're pioneering the model architectures that will make this possible. Our founding team met as PhDs at the Stanford AI Lab, where we invented State Space Models (SSMs), a new primitive for training efficient, large-scale foundation models. Our team combines deep expertise in model innovation and systems engineering, paired with a design-minded product engineering team, to build and ship cutting-edge models and experiences.

We're funded by leading investors at Index Ventures and Lightspeed Venture Partners, along with Factory, Conviction, A Star, General Catalyst, SV Angel, Databricks and others. We're fortunate to have the support of many amazing advisors, and 90+ angels across many industries, including the world's foremost experts in AI.

YOUR IMPACT

- Conduct groundbreaking research in neural network architecture design to advance the state of the art in alternative architectures (e.g., state space models, efficient transformers, hybrid architectures).
- Design novel architectures that improve model quality, inference efficiency, and adaptability across diverse deployment environments, from cloud to on-device.
- Explore and develop capabilities such as statefulness, long-range memory, and innovative conditioning mechanisms for enhancing model expressiveness and generalization.
- Investigate how architectural decisions impact model trade-offs, including scalability, robustness, latency, and energy efficiency.
- Develop new frameworks and tools to evaluate architectural innovations, benchmarking performance across research and production settings.
- Collaborate with cross-functional teams to translate architectural research into scalable and impactful systems for real-world applications.

WHAT YOU BRING

- Deep expertise in architecture design, with experience in researching or deploying advanced architectures (e.g., state space models, transformers, RNN variants, CNN variants).
- Strong understanding of how architectures interact with system constraints, including deployment in cloud environments or on-device.
- Proficiency in designing architectures that balance quality, efficiency, and adaptability across different use cases and modalities (e.g., vision, audio, text).
- Familiarity with generative modeling paradigms like autoregressive and diffusion models, and designing capabilities such as statefulness and conditioning in deep learning models.
- A proven research track record in top-tier ML/AI venues (e.g., NeurIPS, ICML, ICLR, CVPR) or demonstrable contributions to state-of-the-art architectures.
- Exceptional analytical and problem-solving skills, with a focus on experimentation and iterative refinement.
- Strong programming skills in deep learning frameworks such as PyTorch or TensorFlow, and experience with profiling tools for understanding model performance.

NICE TO HAVES

- Prior research or publications in state space models, efficient transformers or other alternative architectures.
- Research or practical experience in designing architectures for multi-modal systems.
- Early-stage startup experience or a track record of rapid innovation in R&D environments.

OUR PERKS

🍽 Lunch, dinner and snacks at the office
🏥 Fully covered medical, dental, and vision insurance for employees
🏦 401(k)
✈️ Relocation and immigration support
🦖 Your own personal Yoshi

OUR CULTURE

🏢 We’re an in-person team based out of San Francisco. We love being in the office, hanging out together, and learning from each other every day.
🚢 We ship fast. All of our work is novel and cutting edge, and execution speed is paramount. We have a high bar, and we don’t sacrifice quality or design along the way.
🤝 We support each other. We have an open & inclusive culture that’s focused on giving everyone the resources they need to succeed.
