Site Reliability Engineer (SRE)

at Pragmatike

Remote
IT & Software Development
DevOps
Site Reliability Engineering (SRE)
Production

Employment type:

Flexible hours
Full-time

Skills:

Python
AWS
CI/CD
Jenkins
Kubernetes
Terraform
Linux
Orchestration
Grafana
GitLab
Scalability

Responsibilities

  • Design, implement, and maintain scalable, resilient AWS infrastructure
  • Develop and manage CI/CD pipelines and infrastructure-as-code (Terraform or similar)
  • Set up and optimize monitoring, alerting, and incident response processes
  • Proactively identify and resolve performance, reliability, and security issues
  • Collaborate with development teams to integrate SRE best practices into their workflows
  • Conduct post-mortems and root cause analyses on incidents
  • Participate in on-call rotations to support 24/7 system reliability

Requirements

  • 5+ years of experience as an SRE or in a similar role
  • Deep knowledge of AWS services (EC2, ECS, RDS, Lambda, S3, etc.)
  • Proficient in infrastructure-as-code tools (Terraform, CloudFormation, etc.)
  • Solid experience with Linux systems administration and networking concepts
  • Strong programming/scripting skills (Python, Bash, Go, etc.)
  • Experience with CI/CD tools (GitLab CI, Jenkins, etc.)
  • Familiarity with observability tools (Prometheus, Grafana, Datadog, etc.)

Nice To Have

  • Experience with container orchestration (ECS, EKS, or Kubernetes)
  • Understanding of security best practices in cloud environments
  • Exposure to incident management frameworks (SRE handbook, etc.)

Why Join Us

  • 100% remote work with flexible hours
  • High-impact role with autonomy and ownership
  • Collaborative and international engineering team
  • Cutting-edge tech stack with a strong focus on reliability and automation

Salary currency: EUR
Remote model: Remote