Senior, ML Engineer - Auto Tagger

Torc Robotics · Ann Arbor, MI · $177k - $212k
Full-time · Senior · Posted 6 days ago

About this role

About the Company

At Torc, we have always believed that autonomous vehicle technology will transform how we travel, move freight, and do business. A leader in autonomous driving since 2007, Torc has spent over a decade commercializing our solutions with experienced partners. Now a part of the Daimler family, we are focused solely on developing software for automated trucks to transform how the world moves freight. Join us and catapult your career with the company that helped pioneer autonomous technology, and the first AV software company with the vision to partner directly with a truck manufacturer.

Meet The Team:

The Auto Tagger team is the engine behind our data flywheel, responsible for translating petabytes of raw, multi-modal vehicle data into a highly curated library of critical driving scenarios. By mining driving logs for long-tail events, we provide the foundational data required for safe autonomous trucking. Leveraging Pegasus logical layers, this team structures and catalogs findings into an observations database that directly accelerates development across autonomous perception, sensor fusion, and generative simulation testing.

What You'll Do:

Scenario Mining at Scale: Architect and optimize distributed data pipelines to process massive multi-sensor logs (camera, LiDAR, radar, kinematics), automatically extracting and cataloging safety-critical and long-tail driving events.
Advanced Event Tagging: Develop and tune both heuristic-based and ML-assisted algorithms (including exploring Vision-Language Models or semantic vector search) to automatically classify and describe complex environmental and behavioral scenarios.
Standardized Data Structuring: Extract and format scenario data utilizing the Pegasus layer standard (alongside open-source frameworks) to ensure semantic consistency and rigorous metadata integrity.
Data Flywheel Integration: Manage the ingestion of tagged events into the observations database, enabling high-speed querying and retrieval for ML training, regression testing, and system validation.
Cross-Functional Alignment: Operate with broad autonomy to drive consensus across organizational boundaries. Collaborate closely with downstream consumers in perception, simulation, and systems engineering to define what constitutes an "interesting scenario" and operationalize a continuous data loop.
Mentorship & Team Growth: Guide, mentor, and elevate less-experienced engineers. Lead design reviews, establish coding standards, and foster a culture of technical excellence and collaborative problem-solving.

What You'll Need to Succeed:

BS or MS in Computer Science, Robotics, Engineering, or a STEM field, with 6+ years in data engineering, ML systems, or autonomous data curation.
Core Languages: Strong Python and SQL skills, with heavy experience processing massive time-series or unstructured datasets.
ML & Dataset Curation: Hands-on machine learning and dataset curation experience, with a demonstrated history of implementing targeted datasets that measurably improve downstream model performance.
Data Exploration: Hands-on experience using Databricks (or similar platforms) for large-scale analytics, interactive querying, and making massive vehicle datasets searchable.
Cloud & Compute: Expertise in distributed compute frameworks (Ray, Spark, Beam) and cloud platforms (AWS, GCP, or Azure) for executing heavy data workloads.
AV Standards: Experience parsing complex data formats and applying scenario-description standards like Pegasus layers.
Communication: Exceptional ability to translate complex data engineering challenges into clear strategies for cross-functional stakeholders.
Technical Leadership: Proven track record of mentoring teams, driving system architecture, and defining engineering roadmaps.

Bonus Points!

Auto-labeling & VLMs: Familiarity with foundational models, auto-labeling pipelines, or zero-shot classification for scenario extraction.
Model Serving: Experience with vLLM, SGLang, or similar frameworks for highly optimized, high-throughput model serving and inference.
Semantic Inference: Experience with semantic extraction and attribute mapping to help build out a robust semantic inference engine, moving beyond standard bounding-box object detection.
Data Tooling: Familiarity with parsing robotics formats (ROS bags, MCAP) and optimizing high-performance columnar storage formats (Parquet, Arrow).
Downstream Integration: Knowledge of how scenario data feeds into generative simulation workflows, neural rendering, or sensor fusion validation.
Advanced Retrieval: Experience building semantic retrieval systems or vector databases for automotive data.

Perks of Being a Torc’r

Torc cares about our team members and we strive to provide benefits and resources to support their health, work/life balance, and fu
