Work model: Remote
Business trips: Occasional, to Copenhagen
Assignment Type: B2B
Project Length: Long-term
Start Date: ASAP
Project Language: English
About the role:
A unique opportunity to join a dynamic, ambitious, international company as a Site Reliability Engineer, working alongside many skilled colleagues.
You will join a distributed team that develops a product platform that helps other product teams deliver cloud-native functionality in a consistent manner.
Responsibilities:
Define and maintain containers for Kubernetes (both in Azure and local developer environments).
Create Helm charts used for deploying our product in Azure.
Be responsible for our CI/CD processes on GitHub Actions, focusing on quality, efficiency, and automation.
Develop and maintain our authentication and authorization functionality (OpenID Connect and OAuth2).
Be responsible for logging, telemetry, and driving improvements in CI/CD and observability.
Maintain internal deployments used by developers.
Enhance the quality and cadence of release processes.
Collaborate with the development team to improve the deployment platform.
Must have:
5+ years of experience in a similar position working on a SaaS product.
Hands-on experience with cloud solutions in production, either as a cloud software developer who has worked on a SaaS solution, or as a cloud-ops engineer who has been responsible for operating a SaaS solution.
Hands-on experience with Kubernetes.
Experience with logging and tracing tools for effective troubleshooting and debugging.
Experience in optimizing system performance, scalability, and efficiency to handle growing workloads.
Expertise in incident management, including the ability to diagnose and resolve incidents quickly and efficiently.
Knowledge of:
Infrastructure as Code principles.
monitoring tools like Prometheus, Grafana, or similar solutions to ensure visibility into system performance and health.
security best practices and the ability to incorporate security considerations into the design and operation of systems.
reliability engineering principles (e.g., Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets).
Strong communication skills to effectively collaborate with cross-functional teams, including developers, operations, and other stakeholders.
Ability to document processes, procedures, and system architecture comprehensively.
Strong analytical and problem-solving skills, with the ability to diagnose complex issues and implement effective solutions.
Willingness to adapt to evolving technologies and industry best practices, with a commitment to continuous learning.
We offer:
Long-term cooperation
Transparent relationships built on trust and fair play
Medicover card and Multisport card on preferential terms
Internal referral bonus