Industry: Public transport 

Work model: 100% remote

Business trips: Yes, once every 2-3 months to DK

Project length: 6 months (+possible extensions)

Remuneration: up to 200 PLN/h +VAT, depending on experience

Assignment type: B2B  

Start: ASAP/Flexible

We are looking for an experienced MLOps/LLMOps specialist with a strong background in cloud engineering to elevate our technological capabilities. We currently operate a well-managed Kubernetes cluster, overseen by a dedicated Kubernetes team that monitors application and container consumption, performance, and more via Prometheus and Grafana. Our goal is to advance this cluster from its current nascent stage of ML maturity to a fully operational state. We are poised to start training models on Databricks and deploying them with MLflow on Azure, and we need your expertise to make this a reality. The role requires close collaboration with Cloud Architects, AI Engineers, and Full Stack Developers.

Tech stack: Azure AI Search, Azure OpenAI, React, TypeScript, Azure, microservices, Azure Kubernetes Service (AKS), GitHub, Jenkins.

Responsibilities:

  • Deploy and serve fine-tuned open-source large language models and automatic speech recognition models.

  • Utilize Kubernetes and Docker to optimize the deployment and management of ML models, ensuring efficient use of GPU resources.

  • Build and maintain robust infrastructure on Azure, leveraging automation tools such as Terraform and Bicep for efficient resource management and deployment.

  • Implement continuous observability, monitoring, and A/B testing strategies to evaluate and enhance LLM and ASR model performance and reliability.


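The A/B testing strategy mentioned above can be illustrated with a deterministic traffic splitter between two model variants. This is only a sketch; the function name and split logic are assumptions for illustration, not part of the project's stack:

```python
import hashlib

def assign_variant(request_id: str, treatment_share: float = 0.1) -> str:
    """Route a request to the 'control' or 'treatment' model variant.

    The assignment is derived from a hash of the request ID, so the same
    ID always lands in the same bucket; `treatment_share` controls what
    fraction of traffic sees the new model.
    """
    digest = hashlib.sha256(request_id.encode()).digest()
    # Map the first 8 bytes of the hash to a float in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if bucket < treatment_share else "control"
```

Hashing the request ID (rather than sampling randomly per request) keeps routing sticky, which simplifies collecting per-variant latency and quality metrics in a monitoring stack such as Prometheus/Grafana.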
Required Skills:

  • In-depth experience with MLflow, Databricks, and Kubernetes, with a focus on model deployment and management in microservice architectures, enhancing operational efficiency and scalability in production environments.

  • Proficiency in Docker containerization and Kubernetes orchestration for deploying ML models in production environments.

  • Strong understanding of A/B testing methodologies.

  • Skilled in deploying GPU-based ML/DL models for parallel inference, optimizing computational efficiency and performance for near real-time processing in production settings (preferably using CUDA).

  • Experience with open-source LLMs and Whisper is a significant plus.

  • Knowledge of networking concepts and experience in configuring and optimizing network resources within Azure.

  • Proven track record in MLOps, with a focus on MLflow and Azure technologies.

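To make the GPU parallel-inference requirement above concrete: a common serving pattern is micro-batching, where incoming items are grouped so the model runs one forward pass per batch rather than per request. A minimal, framework-free sketch follows; `run_model` is a hypothetical stand-in for the actual batched model call:

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

def batched_infer(items: Iterable[T],
                  run_model: Callable[[List[T]], List[R]],
                  max_batch: int = 8) -> Iterator[R]:
    """Group items into batches of up to `max_batch` and run the model
    once per batch; a batched forward pass amortizes per-call overhead
    and keeps the accelerator busy."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == max_batch:
            yield from run_model(batch)
            batch = []
    if batch:  # flush the final partial batch
        yield from run_model(batch)
```

In a real near real-time deployment the batching would additionally be bounded by a wait timeout so latency stays predictable under light traffic; that detail is omitted here for brevity.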
We offer:

  • Technical growth and education.

  • International projects in Scandinavian business culture.

  • Long-term cooperation across multiple projects and sectors.

  • Transparently built relations based on trust and fair play.