Senior Product Manager, Experimentation Tooling

Lightning AI · New York, NY · $160k - $275k

Full-time · Senior · Posted 1 month ago

gpu reinforcement-learning fine-tuning distributed-systems pytorch

About this role

Who We Are

Lightning AI is the company behind PyTorch Lightning. Founded in 2019, we build an end-to-end platform for developing, training, and deploying AI systems, designed to take ideas from research to production with less friction. Through our merger with Voltage Park, a neocloud and AI Factory, Lightning AI combines developer-first software with cost-efficient, large-scale compute. Teams get the tools they need for experimentation, training, and production inference, with security, observability, and control built in. We serve solo researchers, startups, and large enterprises.

Lightning AI operates globally with offices in New York City, San Francisco, Seattle, and London, and is backed by Coatue, Index Ventures, Bain Capital Ventures, and Firstminute.

What We're Looking For

We're looking for a Senior Product Manager to own Lightning AI's experimentation and post-training product end to end, from product strategy and roadmap through launch, adoption, pricing, and go-to-market.

This is a role focused on how AI researchers and engineers turn an idea into a high-quality, validated model. You'll define the workflow for running and comparing experiments, managing fine-tuning and reinforcement-learning workloads, evaluating model quality, understanding failures, and selecting the right model or checkpoint for production. You'll work at the intersection of developer tooling, AI infrastructure, and model quality.

The right candidate understands how modern AI teams work today: notebooks, training jobs, experiment trackers, checkpoints, evaluation suites, spreadsheets, and custom internal tooling. You can identify where those workflows break down and turn them into a cohesive, minimal product experience. You should be able to move fluidly between designing an intuitive developer workflow, discussing distributed training and artifact lineage with engineers, evaluating model outputs, and explaining the product's value to customers and sales teams.

This role requires unusually high ownership. You will work directly with engineering, customers, and the executive team; develop strong product opinions; create your own product artifacts; and drive work forward without waiting for another function to define the next step.

You will join the Product Team, report to our VP of Product, and work directly with our executive team as we grow this business. This is a hybrid role based in our New York City or San Francisco office, with an in-office expectation of two days per week.

What You'll Do

Own the product vision and roadmap for post-training and experimentation: what we build, what we integrate with, what we don't build, and in what order

Understand how ML engineers and AI researchers actually work today (the jobs they run, the comparisons they make, the failures they debug, and the handoffs that break down between research and production), then build the product that makes that workflow coherent

Develop a strong point of view on where Lightning should build differentiated experiences versus integrate with the existing ecosystem of experiment trackers, evaluation frameworks, data tools, and model registries

Work directly with engineers from problem definition through architecture, implementation, and launch: understand the constraints, help shape the solutions, don't hand off requirements and wait

Use the product yourself; inspect failed workflows, read logs, identify friction, and remove it without waiting for someone to surface it

Own model evaluation as a product function: write evals, assess outputs, and let quality signals drive roadmap decisions

Design pricing and packaging in partnership with Growth and Finance: model unit economics, run experiments, and make calls that affect both adoption and margin

Build workflows that help teams collaborate: share results, compare models, move work from research into production, and maintain enough lineage that decisions can be explained and reproduced

Be the product voice in GTM: sales positioning, technical objection handling, and developer-facing content that builds credibility with ML engineers and platform teams

Define and instrument the metrics that matter across activation, iteration speed, compute consumption, retention, and expansion

What You'll Need

7+ years of product management experience, including at least 3 years building infrastructure, platform, developer-tooling, or machine-learning products.

Hands-on experience building products for ML engineers, AI researchers, or data scientists.

A detailed understanding of experimentation and post-training workflows, including training jobs, checkpoints, metrics, artifacts, experiment comparison, reproducibility, and model evaluation.

Experience with one or more modern post-training techniques, such as supervised fine-tuning, preference optimization, reinforcement learning, distributed training, or hyperparameter optimization.

Experience designing o