Senior Site Reliability Engineer
Remote
Full Time
#Engineering
#Kubernetes
#Ansible
#Helm
#Kustomize
#Prometheus
#Grafana
#AWS
#Terraform
#RabbitMQ
#Kafka
Oomnitza is the developer of a highly versatile Enterprise Technology Management platform that automates and orchestrates critical IT business processes. Our SaaS solution uses agentless integrations and low-code workflows to help innovative companies streamline tasks like onboarding, offboarding, and audit readiness. By reducing manual effort and redundant spending, we empower organizations to improve efficiency and mitigate cyber risk.
We are currently looking for a Senior Site Reliability Engineer to join our growing, dynamic team. In this role, you will apply DevSecOps methodologies to ensure our global, large-scale distributed microservices remain secure, performant, and highly available.
Key outcomes
- Analyze platform and application metrics to drive continuous performance tuning and proactive fault detection.
- Collaborate with our engineering teams to enhance services through rigorous release procedures and testing.
- Automate infrastructure and system uplifts to create sustainable, efficient services.
- Manage the system landscape by balancing feature delivery speed against well-defined service-level objectives to ensure maximum availability.
- Implement practices and technologies that align with our commitments to security, compliance, and system uptime.
- Coordinate and execute system upgrades while mentoring other engineers to foster company-wide process improvements.
- Participate in a rotating on-call schedule to maintain our service standards.
Requirements
- Extensive experience with Kubernetes for container orchestration, including scaling and troubleshooting production clusters.
- Proficiency in configuration management and automation tools such as Ansible, Helm, and Kustomize.
- Hands-on experience monitoring system health and optimizing performance using Prometheus and Grafana.
- Deep knowledge of AWS cloud services, including VPC, IAM, EC2, and S3.
- Expertise in Terraform for infrastructure as code to ensure repeatable and efficient deployments.
- Familiarity with messaging and event-streaming systems such as Kafka and RabbitMQ.
- Strong background in managing MySQL databases and Amazon RDS.
- Proficiency in at least one high-level programming language such as Python, Go, or JavaScript.
- A solid understanding of networking, security protocols, and high-uptime environment management.
- Excellent cross-functional collaboration skills and a proactive approach to problem-solving.
- Please note that this position requires you to be located in Ireland.
Benefits
- Comprehensive dental, vision, and life insurance, including coverage for spouses and dependents.
- Equity compensation plan.
- Flexible work hours and a remote-first environment.
- Work-from-home equipment allowance and your choice of Mac or PC hardware.
- Opportunities for professional development and the chance to work directly with our founders.
- Participation in a fast-growing, venture-backed company with a collaborative and progressive culture.
How to apply
If you are a motivated engineer who enjoys solving complex challenges at scale, we invite you to apply and join our team. We look forward to reviewing your background and discussing how your unique perspective can help us continue to build robust, reliable systems.



