Machine Learning Operations Engineer (MLOps Engineer)
KL Gateway, Kerinchi, Kuala Lumpur, Malaysia
Full Time
RTA
Experienced
Duties and Responsibilities:
- Provides deep technical expertise in cloud infrastructure design and API development for business environments.
- Bridges the gap between data scientists and software engineers, enabling the efficient and reliable delivery of ML-powered solutions.
- Ensures solutions are well designed for maintainability, ease of integration, and testing across multiple platforms.
- Possesses strong proficiency in development and testing practices common to the industry.
- Works closely with data scientists, ML engineers, and other stakeholders to deploy ML models.
- Sets up and maintains cloud and edge infrastructure for ML model deployment.
- Designs, implements, and maintains scalable infrastructure for ML workloads.
- Good verbal and written communication skills
- Collaborative and team-oriented
Bachelor's degree in Computer Science, Engineering, or a related subject, and/or equivalent formal training or work experience
Work Experience / Skills Requirement(s):
1. Cloud Infrastructure & Kubernetes
- Minimum 2 years of hands-on experience managing cloud infrastructure (e.g., AWS, GCP, Azure) in a production environment
- Hands-on experience with Kubernetes for container orchestration, scaling and deployment of ML services
- Familiar with Helm charts, ConfigMaps, Secrets, and autoscaling strategies
- Proficient in building and maintaining RESTful or gRPC APIs for ML inference and data services
- Experience with message queue integration, such as RabbitMQ or ZeroMQ, for asynchronous communication, job queuing, or real-time model inference pipelines
- Proven experience working with relational databases (RDBMS) such as Microsoft SQL Server and PostgreSQL.
- Proficient in schema design, writing complex queries, stored procedures, indexing strategies, and query optimization.
- Hands-on experience with vector search and embedding-based retrieval systems.
- Practical knowledge using FAISS, LanceDB, or Qdrant for building similarity search or semantic search pipelines.
- Understanding of vector indexing strategies (e.g., HNSW, IVF), embedding dimensionality management, and integration with model inference pipelines.
- Demonstrated expertise in building scalable and maintainable API services using Python frameworks such as Flask, FastAPI, or Litestar.
- Fluent in HTML, CSS, and JavaScript for building simple web-based dashboards and monitoring interfaces.
- Experience with Go, C++, or Rust is a strong plus, especially for performance-critical or low-latency inference applications.
- Experience in integrating models using NCNN, MNN, or ONNX Runtime Mobile on mobile and edge devices.
- Familiarity with quantization, model optimization, and mobile inference profiling tools.
- Experience with Docker/Podman, CI/CD pipelines, Git, and ML lifecycle tools such as MLflow, Airflow, or Kubeflow.
- Exposure to model versioning, A/B testing, and automated re-training workflows.
- Ability to set up monitoring (e.g., Prometheus, Grafana) and logging (e.g., ELK stack, Loki) to track model performance and system health.
- Strong analytical and troubleshooting skills.
- Able to work closely with data scientists, backend engineers, and DevOps to deploy and maintain reliable ML systems.
- Excellent communication and documentation habits.