Site Reliability Engineer (Linux Platform Operations) (m/f/d)

As a Site Reliability Engineer, you operate and evolve Linux-based production platforms that power critical business services at scale. You focus on automation, reliability, and reducing operational overhead while enabling teams to work more independently.

Responsibilities

Take full ownership of Linux-based production systems and keep them reliable, secure, and high-performing
Automate operational tasks (e.g., patching, provisioning, deployments) to eliminate manual effort and improve efficiency
Standardize and optimize deployment and configuration processes for scalability and consistency
Lead incident response and drive root cause analysis and long-term fixes
Manage and automate access and identity processes with a strong focus on security and auditability
Maintain and improve core Linux infrastructure services essential for platform operations
Collaborate with engineering teams to enhance observability and shared operational practices
Analyze complex systems end-to-end and simplify them to improve reliability and performance
Drive the modernization of operations towards automation, scalability, and self-service models
Adapt quickly to changing environments and deliver pragmatic, effective solutions

Requirements

5+ years of experience in Linux-based production environments
Strong expertise in Linux systems engineering, performance tuning, and lifecycle management
Strong understanding of reliability concepts (SLOs, SLAs, performance, capacity)
Solid scripting and automation skills (e.g., Bash, Python) with a continuous improvement mindset
Hands-on experience with configuration management (e.g., Salt, Ansible) and Infrastructure as Code (e.g., Terraform)
Experience with CI/CD tools (e.g., GitLab, Jenkins) and automated deployments
Good knowledge of monitoring and observability tools (e.g., Zabbix, Grafana, ELK)
Proven experience in incident management, root cause analysis, and postmortems
Experience with security practices, including patching and access control
Knowledge of core traffic services (DNS, load balancing, CDN)
Basic experience with container and cloud technologies (Docker, Kubernetes, AWS)

Benefits

We value diversity and treat all applications equally – regardless of gender, background, age, religion, disability, or sexual orientation. Different perspectives enrich our team and make EVENTIM stronger.
