Senior Software Engineer - SRE Core Infrastructure
full-time
senior
Posted 1 month ago
About us
PhysicsX is a deep-tech company with roots in numerical physics and Formula One, dedicated to accelerating hardware innovation at the speed of software.
We are building an AI-driven simulation software stack for engineering and manufacturing across advanced industries. By enabling high-fidelity, multi-physics simulation through AI inference across the entire engineering lifecycle, PhysicsX unlocks new levels of optimization and automation in design, manufacturing, and operations — empowering engineers to push the boundaries of possibility. Our customers include leading innovators in Aerospace & Defense, Materials, Energy, Semiconductors, and Automotive.
Senior Software Engineer – SRE Core Infra | London (Hybrid) | Engineering | Full Time
The Role
PhysicsX is growing rapidly, and so is the infrastructure that underpins our platform. We are building core infrastructure that is reliable, secure, scalable and reproducible across multiple cloud providers and on-premises environments. As the platform evolves to serve increasingly complex engineering workloads, the foundational infrastructure layer becomes ever more critical.
We are looking for a Senior Software Engineer to join our Platform SRE Core Infrastructure team. This role is responsible for the design, provisioning and operation of the shared infrastructure that the entire PhysicsX platform depends on. You will work across infrastructure as code, Kubernetes cluster management, secrets management, GPU drivers, networking and multi-tenancy architecture to ensure the platform is dependable, scalable and secure.
This is a role for an engineer who combines deep infrastructure expertise with a reliability engineering mindset and an appreciation for the developer experience of the teams that consume the platform.
What You Will Do
Own the design and delivery of core infrastructure across multiple cloud providers (AWS, GCP, Azure) and on-premises environments using Terraform and Crossplane.
Architect and operate Kubernetes clusters supporting both single-tenant and multi-tenant workloads, with a strong emphasis on isolation, performance and reliability.
Define and implement infrastructure provisioning patterns using Crossplane compositions and Terraform modules, ensuring reproducibility and auditability across environments.
Design and operate secrets management solutions, including dynamic secret provisioning, rotation and fine-grained access control integrated with cluster identity.
Manage and maintain GPU driver configurations and accelerated compute node pools, ensuring compatibility and performance for AI and simulation workloads.
Own cluster networking design, including CNI selection, Istio service mesh integration, ingress strategy and cross-cluster connectivity.
Implement and maintain vCluster-based multi-tenancy to provide strong workload isolation within shared infrastructure.
Develop lightweight Kubernetes Operators or controllers where automation of infrastructure lifecycle tasks requires it.
Establish SLOs and reliability targets for core infrastructure components and lead the response to production incidents.
Partner with security and platform teams to enforce infrastructure governance, network policies and compliance controls.
Contribute to and uphold engineering standards across the platform organisation.
What You Bring to the Table
Kubernetes depth – 5 or more years of professional experience operating Kubernetes in production. You have a thorough understanding of cluster architecture, the scheduler, networking, storage and the API lifecycle. Kubernetes certifications such as CKAD, CKA or CKS are highly desirable.
Crossplane expertise – significant hands-on experience designing and operating Crossplane compositions, providers and managed resources in production environments.
Terraform proficiency – strong experience authoring, structuring and operating Terraform at scale, including state management, module design and CI integration.
Multi-cloud and on-premises – practical experience operating infrastructure across more than one cloud provider and in on-premises environments, with an understanding of the differences in identity, networking and storage.
Multi-tenancy architecture – experience designing and implementing both single-tenant and multi-tenant Kubernetes architectures, with strong views on isolation, resource governance and operational overhead.
Secrets management – experience with tools such as Vault, External Secrets Operator or cloud-native secret stores, including dynamic provisioning and rotation.
Networking – solid knowledge of Kubernetes networking, CNI plugins, Istio service mesh and ingress patterns. Experience with cross-cluster or hybrid connectivity is valuable.
vCluster and virtual clusters – experience using vCluster or similar tooling to provide lightweight, isolated Kubernetes environments within shared clusters.
GPU and accelerated compute – familiarity with GPU dr