Silicon Architect, Diffusion ASICs

Normal Computing · New York, NY
Full-time · Mid-level · Posted 9 hours ago

About this role

NORMAL COMPUTING | INCREDIBLE OPPORTUNITIES

The Normal Team builds foundational software and hardware that help move technology forward, supporting the semiconductor industry, critical AI infrastructure, and the broader systems that power our world. We work as one team across New York, San Francisco, Copenhagen, Seoul, and London.

YOUR ROLE IN OUR MISSION

Look at the AI accelerator roadmaps coming out of every major silicon company right now and you will notice something strange: they are all building the same chip. Bigger systolic arrays. More HBM. More of the same architecture, scaled harder. The industry has placed a collective bet that the way to win the next decade of AI inference is to refine the GPU paradigm until it cannot be refined any further. We know that bet is wrong.

Normal is building ASICs purpose-built for image and video diffusion inference, grounded in the physics of computation rather than the assumptions everyone else has inherited. The compute substrate has to be invented, not specified, and we are looking for the person who wants to help invent it.

You will work directly alongside our lead architect and research engineers, contributing across the full architecture stack: compute core microarchitecture, memory subsystem, interconnect, and the FPGA prototyping that proves the decisions before silicon. The team is small. The scope is wide. The architecture is being shaped now, not refined, and your contributions will be visible in the chip when it tapes out.

If the appeal of working on a chip that has to be invented is greater to you than iterating on one that already exists, keep reading.

RESPONSIBILITIES

- Help define the architecture and microarchitecture of novel AI accelerator compute blocks. PE array design, datapath organization, and support for efficiency techniques such as sparsity exploitation and reduced-precision computation. The compute tile is the surface where Normal's research advantages have to show up in silicon, and you are one of the people responsible for making sure they do.
- Translate workload analysis and research findings into hardware specifications. Identify where architectural innovation creates the most leverage, define the structures that realize it, and produce microarchitecture documents unambiguous enough for RTL engineers to implement against. You work closely with them through implementation, not over the wall from it.
- Reason across the full stack and defend PPA tradeoffs at every level. Move between algorithm-level workload behavior, memory hierarchy, on-chip interconnect, and physical design constraints. Make the call when the data is incomplete, and articulate why under scrutiny from the lead architect and the research team.
- Partner with the compiler lead on ISA co-design. The compute tile must be compilable and programmable, not just simulatable. The programming model and the microarchitecture are defined together, and you are accountable for both sides meeting in the middle.
- Own the FPGA prototyping work. Scope what the FPGA implementation actually proves, drive the implementation through to bring-up, and use it to de-risk architecture decisions before tapeout. You decide which questions are worth answering in FPGA versus cycle-accurate simulation.
- Stay current with the AI accelerator research landscape and be able to articulate clearly where Normal's approach differs from existing solutions and why that matters. This is a research-adjacent seat and you are expected to read, not just consume.

WHAT WE'RE LOOKING FOR

- A degree in Electrical Engineering, Computer Engineering, Computer Science, or equivalent work experience. PhD welcome but not required; the bar is the work, not the credential.
- Substantial experience in architecture or microarchitecture of high-performance digital systems. AI accelerators, compute engines, or similarly complex logic. You have shaped the structures inside a chip, not just consumed them from the outside.
- Fluency moving between algorithm-level analysis and hardware specification. You can read a profile of a workload and translate it into datapath widths, pipeline stages, and area/power estimates without losing the thread on either side.
- Experience with simulation-driven architecture. You have used cycle-accurate or analytical models to make and defend design decisions before RTL exists, and you know which questions each tool can answer and which it cannot.
- Familiarity with quantization and reduced-precision approaches for inference and their implementation implications. You understand the cost of a bit at the hardware level, not just the model level.
- Experience writing microarchitecture specifications and working closely with RTL engineers through implementation. Your specs are read, not just filed.
- Proficiency in Python or C++ for performance modeling and analysis, and familiarity with SystemVerilog or equivalent RTL.
- Comfort operating in an environment whe
