Job offer

Site Reliability Engineer

Man Group is seeking a Site Reliability Engineer to ensure the reliability, availability, and performance of the company’s technology platform and to work on innovative projects. The successful candidate will be part of a high-performing team and will have the opportunity to develop and grow at various levels within the company.

Abroad

Man Investments AG

100%

The role

Join our high-performing Site Reliability Engineering (SRE) team and play a key role in ensuring the reliability, availability, and performance of the technology that powers Man Group’s hedge funds, lending, custody, and banking operations. This is an opportunity to work on groundbreaking projects and help shape the future of our platform.

Role responsibilities

- Ensure the reliability and performance of critical systems across the global infrastructure through proactive monitoring and rapid incident response
- Design and implement observability solutions using tools such as Prometheus, Datadog, ELK, and Loki to provide insights and enable data-driven decisions
- Automate operational tasks and build self-service capabilities to eliminate routine work and improve efficiency
- Develop and maintain SLIs, SLOs, and error budgets to drive reliability improvements and inform engineering priorities
- Participate in incident response efforts, conduct blame-free post-mortems, and implement preventive measures to reduce errors
- Collaborate with development teams to improve system design, deployment practices, and operational excellence
- Configure and roll out major infrastructure upgrades; manage compute/server utilization and high-performance distributed systems
- Contribute to capacity planning and performance budgeting to ensure systems meet business requirements
- Manage multiple ELK clusters hosting hundreds of terabytes of log data, telemetry, and APM data

Key competencies

Required

- Strong understanding of SRE principles, including SLIs, SLOs, error budgets, and reliability testing
- Experience with observability and monitoring tools such as Prometheus, Grafana, ELK, Loki, or similar
- Proficiency in automation tools (Ansible, Terraform) and scripting/programming languages (Python, Go, PowerShell)
- Strong troubleshooting and problem-solving skills in distributed systems, with the ability to diagnose complex issues under pressure
- Experience with infrastructure, containers, on-call rotations, and post-incident reviews
- Familiarity with Kubernetes and container orchestration

Advantageous

- Experience with CI/CD pipelines and source code workflows (Git, Jenkins, TeamCity, GitLab)
- Administration of Linux and Windows systems and experience with cloud technologies (AWS/Azure)
- Understanding of network concepts, load balancing, and distributed architectures
- Knowledge of AIOps/MLOps (Splunk, Elastic, Grafana, NDP-Peers)
- Familiarity with internal communication and collaboration tools
- Previous experience with Man Group

Benefits

- Modern office space on the OPD campus with easy access to public transportation and amenities
- Hybrid work model
- 28-day vacation package
- 21 days of paid vacation
- Premium pension contribution
- Competitive benefits package
- Additional compensation for long-term service and volunteer work
- Additional benefits
- Opportunities for professional development, including internal tech talks
- Sponsorship and engagement with employee resources

Job details

Found on: March 31, 2026

Employer: Man Investments AG

Job percentage: 100%

Place of work: Abroad

Source: https://job-boards.eu.greenhouse.io/mangroup/jobs/4714467101

About the employer:

Man Investments AG is a global active investment manager.

The company offers alternative and traditional investment solutions.

The Swiss headquarters are located in Pfäffikon (SZ).

Man Group is listed on the London Stock Exchange and is part of the FTSE 250 Index.