Mistral AI
Mistral Cloud – Site Reliability Engineer Overview
| Company Name | Mistral AI |
| Job Role | Mistral Cloud – Site Reliability Engineer |
| Qualifications | Not Specified |
| Category | IT Jobs |
| Job Type | Full Time |
| Location | London |
Mistral AI is looking for an experienced Site Reliability Engineer to help shape the dependability, scalability, and speed of its cloud platform and the applications used by customers. The role sits within the Engineering & Infra organization and works closely with software engineering and product teams to make sure internal and external users get a reliable, high-performing experience. The company builds AI products and infrastructure designed to simplify work, support learning and creativity, and serve enterprise needs both in cloud and on-premises environments.
What you will be doing
- Build, operate, and improve infrastructure that can scale efficiently, remain highly available, and continue functioning even when components fail.
- Support production systems day to day, including incident handling, response to interruptions, user administration, data extraction tasks, and infrastructure scaling.
- Strengthen monitoring, alerting, and incident response processes so problems are detected quickly and downtime is minimized.
- Create and maintain the operational workflows and technical tooling used for continuous integration and delivery, container-based deployments, orchestration, logging, monitoring, and alerting for both customer APIs and large-scale training workloads.
- Join on-call coverage occasionally, respond to incidents, and perform post-incident analysis to identify and eliminate recurring causes.
- Improve automation, deployment, and orchestration practices across the infrastructure environment.
- Work alongside software engineers to build systems that allow model-training experiments to be run safely and in a reproducible way.
- Contribute to the development of the cloud platform by helping define the layer that connects scientific work, engineering, and infrastructure.
- Design new tools and workflows that raise reliability, availability, and performance, including scripts, refactoring work, API-driven features, internal applications, and dashboards.
- Collaborate with security specialists to ensure infrastructure follows strong security practices and meets compliance requirements.
- Document operational procedures and team processes so knowledge is preserved and shared consistently.
- Contribute beyond the core role through open-source work, research publications, blog posts, and conference participation.
What the company is looking for
- A master's degree in computer science, engineering, or a related field.
- At least five years of experience in DevOps or site reliability engineering.
- Strong hands-on experience with bare-metal infrastructure and distributed systems that must stay available under load.
- Experience dealing with reliability issues in critical environments, including troubleshooting live systems, identifying root causes, and participating in on-call rotations.
- Experience working toward reliability targets such as service-level agreements, supported by observability and alerting practices.
- Practical experience with CI/CD, containerization, and orchestration tools such as Docker and Kubernetes.
- Knowledge of observability and operations tooling such as Prometheus, Grafana, ELK Stack, and Datadog.
- Familiarity with infrastructure-as-code tools such as Terraform or CloudFormation.
- Ability to script and develop in languages such as Python, Go, or Bash, plus familiarity with software development best practices.
- Strong understanding of networking, security, and system administration.
- Excellent communication and problem-solving skills.
- A self-motivated approach and the ability to thrive in a fast-paced startup environment.
- Experience in AI or machine learning environments would be especially valuable.
- Experience with high-performance computing systems and workload managers such as Slurm would be a plus.
- Experience with modern AI infrastructure providers or platforms such as Fluidstack, CoreWeave, or Vast would also strengthen an application.
Hiring process
- Introductory call lasting 30 minutes.
- Interview with the hiring manager lasting 30 minutes.
- Technical interview focused on system design lasting 45 minutes.
- Technical deep-dive interview lasting 60 minutes.
- Culture-fit discussion lasting 30 minutes.
- Reference checks.
Team culture
The team emphasizes a culture built around rigorous thinking, boldness, customer success, early shipping, rapid iteration, and humility. Candidates are expected to align with these values and contribute to a collaborative, low-ego environment.
Location and working arrangement
This position is mainly based in one of the company's European offices, specifically Paris or London. Candidates who already live in those locations, or who are willing to relocate, will be prioritized. The company places a strong emphasis on in-person collaboration to support communication and team relationships.
Remote candidates may also be considered if they are based in one of the countries listed for the role: France, the UK, Germany, Belgium, the Netherlands, Spain, or Italy. For remote hires, the company requires an initial onboarding visit to the Paris headquarters for the first week, with accommodation and travel paid for, and then expects at least three days in the Paris office every six weeks, again with travel and lodging covered.
What is offered
- Competitive salary together with equity.
- Health insurance.
- Allowance for sports or fitness activities.
- Meal vouchers.
- Generous parental leave.
- Visa sponsorship.
- For remote hires who need to travel to Paris for onboarding and recurring office visits, accommodation and travel expenses are covered.
How to apply
Applicants can submit their application through the job posting's application link. By applying, candidates agree to the company's Applicant Privacy Policy.
To apply for this job please visit jobs.lever.co.