Build Systems Engineer (Bazel)
Full-time
Principal
Posted 4 days ago
About this role
What MatX Is Building
MatX builds custom AI accelerator silicon. The build system is the backbone: it wires together RTL, a stack of commercial EDA tools (simulation, synthesis, place-and-route, lint/CDC), and a Rust/Python software stack into one hermetic, reproducible pipeline. We run on Bazel with bzlmod, RBE, custom rules, and a small but tight set of platform-level abstractions.
You'll join a small group that owns the build graph, the toolchains, the rules that wrap each EDA tool, and the CI infrastructure that keeps thousands of targets green.
What You’ll Do Here
New EDA tool integrations. Wrap a closed-source tool in a hermetic Bazel rule with proper providers, runfiles, and execution constraints. Add a new front-end stage to an existing toolchain; add a rule for test variants that share configuration; wire a third-party generator into our Verilog graph as a first-class dep
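To make that first bullet concrete, here is a minimal sketch of the shape such a wrapper takes. Everything in it is illustrative: `eda_synth`, `SynthInfo`, the toolchain label, and the tool's flags are invented for the example, not our actual rules.

```starlark
# Illustrative sketch of a hermetic rule wrapping a closed-source synthesis tool.
SynthInfo = provider(fields = ["netlist", "reports"])

def _eda_synth_impl(ctx):
    tc = ctx.toolchains["//tools/eda:synth_toolchain_type"]  # hypothetical toolchain
    netlist = ctx.actions.declare_file(ctx.label.name + ".v")
    reports = ctx.actions.declare_directory(ctx.label.name + ".reports")
    ctx.actions.run(
        executable = tc.synth_tool,
        arguments = [src.path for src in ctx.files.srcs] +
                    ["-o", netlist.path, "-reports", reports.path],
        inputs = depset(ctx.files.srcs, transitive = [tc.runtime_files]),
        outputs = [netlist, reports],
        mnemonic = "EdaSynth",
        # Constraints live on the action itself, not in a README of flags.
        execution_requirements = {"no-sandbox": "1"},
    )
    return [
        DefaultInfo(files = depset([netlist]), runfiles = ctx.runfiles([netlist])),
        SynthInfo(netlist = netlist, reports = reports),
    ]

eda_synth = rule(
    implementation = _eda_synth_impl,
    attrs = {"srcs": attr.label_list(allow_files = [".v", ".sv"])},
    toolchains = ["//tools/eda:synth_toolchain_type"],
)
```

The provider is what lets downstream rules consume the netlist without knowing which vendor tool produced it.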
Bazel version migrations. Lead upgrades (8.x → 9.x) and the bzlmod / MODULE.bazel housekeeping that comes with them
Hermeticity work. Hunt down the implicit assumptions: system Python, system gcc, leaked /usr/bin deps, host-state in tests. Replace them with hermetic toolchains and tracked inputs
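A minimal sketch of what "replace them with hermetic toolchains" looks like in MODULE.bazel, assuming rules_python and toolchains_llvm; the version strings are placeholders, not our pins.

```starlark
# Sketch only: versions are placeholders, not actual pins.
bazel_dep(name = "rules_python", version = "0.31.0")
bazel_dep(name = "toolchains_llvm", version = "1.0.0")

# Hermetic Python: no more /usr/bin/python leaking into actions.
python = use_extension("@rules_python//python/extensions:python.bzl", "python")
python.toolchain(python_version = "3.11", is_default = True)

# Hermetic C/C++: a downloaded LLVM instead of the system gcc.
llvm = use_extension("@toolchains_llvm//toolchain/extensions:llvm.bzl", "llvm")
llvm.toolchain(llvm_version = "17.0.6")
use_repo(llvm, "llvm_toolchain")
register_toolchains("@llvm_toolchain//:all")
```

Once these resolve, leaked host deps show up as broken actions instead of silent nondeterminism, which is exactly where you want them.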
Refactors that delete code. Rewrite a fragmented family of test macros in terms of one shared rule. Remove a homegrown wrapper rule once upstream covers the case. Extract a common aspect helper used by three places that duplicated it. The good PRs net negative
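As a schematic example of that kind of refactor (all names invented for illustration): three near-identical test macros collapse into one entry point whose variants differ only in a data table.

```starlark
# Before (schematic): unit_sim_test, gate_sim_test, and power_sim_test each
# re-implemented the same wiring. After: one macro, one table of differences.
_VARIANT_DEFINES = {
    "unit": ["SIM_UNIT"],
    "gate": ["SIM_GATE", "SDF_ANNOTATE"],
    "power": ["SIM_GATE", "POWER_TRACE"],
}

def sim_test(name, srcs, variant = "unit", **kwargs):
    """Single entry point; the variant table is the only place variants differ."""
    native.sh_test(  # stand-in for the real simulation test rule
        name = name,
        srcs = srcs,
        env = {"DEFINES": ",".join(_VARIANT_DEFINES[variant])},
        **kwargs
    )
```

The diff that lands this deletes two macro definitions and every call-site divergence they had accumulated.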
Build performance. Persistent workers for slow tools, RBE configs, action graph hygiene, cache-key debugging when something silently rebuilds
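Persistent workers follow a fixed Bazel protocol: tag the action with `supports-workers` and spill its arguments into a params file that the worker reads per request. A hedged sketch, with `_lint_tool` and the `LintCheck` mnemonic invented for the example:

```starlark
def _lint_impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".log")
    args = ctx.actions.args()
    args.add(ctx.file.src)
    args.add("--out", out)
    # Always spill args to a params file; the worker consumes @file per request.
    args.use_param_file("@%s", use_always = True)
    args.set_param_file_format("multiline")
    ctx.actions.run(
        executable = ctx.executable._lint_tool,
        arguments = [args],
        inputs = [ctx.file.src],
        outputs = [out],
        mnemonic = "LintCheck",
        execution_requirements = {"supports-workers": "1"},
    )
    return [DefaultInfo(files = depset([out]))]

lint = rule(
    implementation = _lint_impl,
    attrs = {
        "src": attr.label(allow_single_file = True),
        "_lint_tool": attr.label(
            default = "//tools:lint_worker",  # hypothetical worker binary
            executable = True,
            cfg = "exec",
        ),
    },
)
```

The payoff is amortizing a slow tool's startup (JVM, license checkout, model load) across thousands of actions.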
CI infrastructure. GitHub Actions self-hosted runners on GCE COS, Buildbarn workers, monitoring, rolling upgrades
PRs are small and frequent. Median is +50/-30. Big refactors arrive as a series of mechanical commits, each individually reviewable
Reviews are real. We comment, ask questions, request changes. Reviews are how we share the build system across the team — not rubber-stamping
Negative diffs are celebrated. "Remove unused X" and "Replace ad-hoc Y with Z" are first-class contributions
You'll teach the rest of the team Bazel. Half the company writes RTL or Rust, not Starlark. Good rules let them stay in their domain. Good docstrings (and stardoc) keep them self-serve
You'll work tightly with at least one of us. Most non-trivial changes are pair-designed before code. Fast feedback loops, whiteboard sessions, no async-only collaboration
Lean on AI, but stay persnickety. We use Claude Code and similar tools heavily — for prototypes, refactors, scripts, even rule scaffolding. We also reject most of what they produce on the first pass. You'll steer the model hard toward your taste, push back on the easy answer, and review every line you commit as if you wrote it. Auto-generated PRs that pass tests but miss the point are not what we want
Who You Are
Deep build-system fluency. Rules, providers (or equivalent), aspects, toolchains, platforms, configuration/select, transitions, query. You can read a build-system file — .bzl, Buck2 BUCK, Shake Rules.hs, whatever — and predict what its action graph will look like. Bazel-native is a plus, but we hire on build-system fluency, not Bazel keyword-matching: we'll trade six weeks of Starlark ramp for the right taste. If you've done equivalent work in Buck2, Shake/Hadrian, Pants, Nix, or a homegrown Blaze-shaped system, read these bullets as concepts — Bazel is what you'll write here, but the principles port. Just be honest about your ramp on Starlark and bzlmod
bzlmod / MODULE.bazel. Module extensions, lockfile management, vendoring third-party deps cleanly
Remote execution. RBE, Buck2 RE, BuildBuddy, BuildBarn, your own — they all teach the same lessons. Cache-key debugging, Build without the Bytes, diagnosing "works locally, fails remote." If you've owned one end-to-end, the next one is a port
Comfort in Rust / Python / shell / Starlark. You might read all four in any given week
Bonus Points If You Share These Principles
Build graph is the source of truth. "If two things must stay in sync, make one depend on the other." Allergic to parallel lists in workflow YAML, Python arrays, and .bzl dicts that drift
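In practice that principle usually means one Starlark table that everything else derives from, instead of copies in CI YAML and BUILD files. A sketch with invented names:

```starlark
# tools.bzl -- the one table. BUILD files, the CI matrix generator, and the
# docs target all load this instead of keeping their own drifting copies.
EDA_TOOLS = {
    "synth": "//tools/eda:synth",
    "lint": "//tools/eda:lint",
    "cdc": "//tools/eda:cdc",
}

def all_tool_smoke_tests():
    """Materialize one smoke test per tool from the single table."""
    for short_name, label in EDA_TOOLS.items():
        native.sh_test(
            name = short_name + "_smoke",
            srcs = ["smoke.sh"],
            args = ["$(location %s)" % label],
            data = [label],
        )
```

Adding a tool means editing one dict; everything downstream picks it up because it depends on the table rather than mirroring it.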
Don't parse what you can generate. If a tool has the structured data internally, have it write structured output. Parsing human-readable reports is a temporary bridge, not a design
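Concretely, that means the rule asks the tool for machine-readable output as a declared file instead of scraping its human-readable log. The flag names below are hypothetical; the shape is the point.

```starlark
def _timing_impl(ctx):
    report_json = ctx.actions.declare_file(ctx.label.name + ".timing.json")
    ctx.actions.run(
        executable = ctx.executable._sta_tool,  # hypothetical STA binary
        arguments = [
            ctx.file.netlist.path,
            "--report-format=json",   # invented flags: "emit JSON" is the idea
            "--report-out=" + report_json.path,
        ],
        inputs = [ctx.file.netlist],
        outputs = [report_json],
        mnemonic = "StaReport",
    )
    # Downstream consumers get structured JSON, not a .log file to regex.
    return [DefaultInfo(files = depset([report_json]))]
```

When the vendor tool truly can't emit structured output, the regex lives in exactly one adapter rule, clearly marked as a bridge.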
Split build from check. A rule that produces artifacts always succeeds; a separate _test target gates on quality. Empty dashboards because the build broke are unacceptable
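In BUILD terms the pattern splits into two targets; both rule names below are hypothetical.

```starlark
# load(...) of the hypothetical rules omitted.

eda_lint_report(          # runs the linter, always succeeds, emits the report
    name = "core_lint",
    srcs = [":core_rtl"],
)

lint_gate_test(           # separate _test target: fails on unwaived violations
    name = "core_lint_test",
    report = ":core_lint",
    waivers = "lint_waivers.txt",
)
```

Dashboards read `:core_lint`'s artifact, so they stay populated even on the day `:core_lint_test` goes red.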
Let Bazel parallelize, not the orchestrator. One bazel build --keep_going over N matrix jobs that each warm up Bazel
Encode execution constraints in the rule, not the invocation. No README accumulating per-tool --strategy=..., --remote_download_outputs=..., --sandbox_debug incantations. execution_requirements belongs on the action
Compose at the boundary. Dev and prod differ only in where things run, not in how they're built