Machine Learning Operations Engineer (MLOps Engineer)

KL Gateway, Kerinchi, Kuala Lumpur, Malaysia
Full Time
RTA
Experienced
Duties and Responsibilities:
  • Provides deep technical expertise in cloud infrastructure design and API development for business environments.
  • Bridges the gap between data scientists and software engineers, enabling efficient and reliable delivery of ML-powered solutions.
  • Ensures solutions are well designed for maintainability, ease of integration, and testing across multiple platforms.
  • Possesses strong proficiency in development and testing practices common to the industry.
Summary of Principal Job Responsibility & Specific Job Duties and Responsibilities:
  • Working closely with data scientists, ML engineers, and other stakeholders to deploy ML models
  • Setting up and maintaining cloud and edge infrastructure for ML model deployment
  • Designing, implementing, and maintaining scalable infrastructure for ML workloads
  • Good verbal and written communication skills
  • Collaborative and team-oriented
Academic Qualification(s):
Bachelor's degree in Computer Science, Engineering, or a related subject, and/or equivalent formal training or work experience

Work Experience / Skills Requirement(s):

1. Cloud Infrastructure & Kubernetes
  • Minimum 2 years of hands-on experience managing cloud infrastructure (e.g., AWS, GCP, Azure) in a production environment
  • Hands-on experience with Kubernetes for container orchestration, scaling, and deployment of ML services
  • Familiar with Helm charts, ConfigMaps, Secrets, and autoscaling strategies
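For illustration only, a small Python sketch of programmatic scaling with the official kubernetes client; the Deployment name and namespace below are hypothetical placeholders, not part of this posting.

    # Illustrative sketch: scale an ML inference Deployment using the official
    # kubernetes Python client. Names below are placeholders.
    from kubernetes import client, config

    config.load_kube_config()            # use load_incluster_config() when running inside a pod
    apps = client.AppsV1Api()

    apps.patch_namespaced_deployment_scale(
        name="ml-inference",             # hypothetical Deployment name
        namespace="default",
        body={"spec": {"replicas": 3}},  # scale out to 3 replicas
    )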
2. API Development & Messaging Integration
  • Proficient in building and maintaining RESTful or gRPC APIs for ML inference and data services (a minimal sketch follows this list)
  • Experience integrating message queues such as RabbitMQ or ZeroMQ for asynchronous communication, job queuing, or real-time model inference pipelines
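As a minimal sketch of the inference-API requirement above, assuming FastAPI; the endpoint, request schema, and scoring stub are hypothetical:

    # Minimal FastAPI inference-service sketch; the request schema and the
    # stand-in "model" are placeholders for illustration only.
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class PredictRequest(BaseModel):
        features: list[float]          # hypothetical flat feature vector

    class PredictResponse(BaseModel):
        score: float

    @app.post("/predict", response_model=PredictResponse)
    def predict(req: PredictRequest) -> PredictResponse:
        # A real service would call a loaded model here; this stub returns
        # the mean of the inputs as a stand-in score.
        score = sum(req.features) / max(len(req.features), 1)
        return PredictResponse(score=score)

Such a service would typically be served with an ASGI server (for example, "uvicorn main:app") behind the messaging or gateway layer described above.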
3. System Design, Database & Software Architecture
  • Proven experience working with relational databases (RDBMS) such as Microsoft SQL Server and PostgreSQL.
  • Proficient in schema design, writing complex queries, stored procedures, indexing strategies, and query optimization.
  • Hands-on experience with vector search and embedding-based retrieval systems.
  • Practical knowledge of FAISS, LanceDB, or Qdrant for building similarity or semantic search pipelines (see the sketch after this list).
  • Understanding of vector indexing strategies (e.g., HNSW, IVF), embedding dimensionality management, and integration with model inference pipelines.
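As a point of reference for the vector-search items above, a minimal FAISS sketch; the embedding dimension and random vectors are placeholders:

    # Minimal FAISS similarity-search sketch; data and dimensionality are
    # placeholders for illustration only.
    import faiss
    import numpy as np

    dim = 128                                             # hypothetical embedding size
    xb = np.random.random((1000, dim)).astype("float32")  # stored embeddings
    xq = np.random.random((5, dim)).astype("float32")     # query embeddings

    index = faiss.IndexFlatL2(dim)   # exact L2 index; IVF or HNSW indexes trade recall for speed
    index.add(xb)

    distances, ids = index.search(xq, 4)                  # top-4 nearest neighbours per query
    print(ids)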
4. Programming Languages
  • Demonstrated expertise in building scalable and maintainable API services using Python frameworks such as Flask, FastAPI, or Litestar.
  • Fluent in HTML, CSS, and JavaScript for building simple web-based dashboards and monitoring interfaces.
  • Experience with Go, C++, or Rust is a strong plus, especially for performance-critical or low-latency inference applications.
5. Edge AI Deployment
  • Experience integrating models with NCNN, MNN, or ONNX Runtime Mobile on mobile and edge devices (see the sketch after this list).
  • Familiarity with quantization, model optimization, and mobile inference profiling tools.
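A minimal ONNX Runtime inference sketch for the edge-deployment items above; the model file and input shape are hypothetical:

    # Minimal ONNX Runtime inference sketch; "model.onnx" and the input shape
    # are placeholders for illustration only.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name

    x = np.random.random((1, 3, 224, 224)).astype(np.float32)  # dummy image tensor
    outputs = session.run(None, {input_name: x})
    print(outputs[0].shape)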
6. MLOps & Tooling
  • Experience with Docker/Podman, CI/CD pipelines, Git, and ML lifecycle tools such as MLflow, Airflow, or Kubeflow (an MLflow sketch follows this list).
  • Exposure to model versioning, A/B testing, and automated re-training workflows.
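A minimal MLflow tracking sketch for the lifecycle-tooling items above; the experiment name, parameter, and metric are placeholders:

    # Minimal MLflow tracking sketch; experiment name, parameter, and metric
    # values are placeholders for illustration only.
    import mlflow

    mlflow.set_experiment("demo-experiment")
    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)
        mlflow.log_metric("val_accuracy", 0.93)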
7. Monitoring & Logging
  • Ability to set up monitoring (e.g., Prometheus, Grafana) and logging (e.g., ELK stack, Loki) to track model performance and system health.
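A minimal Prometheus instrumentation sketch with the official Python client; the metric names, port, and simulated latency are placeholders:

    # Minimal Prometheus instrumentation sketch; metric names, port, and the
    # simulated workload are placeholders for illustration only.
    import random
    import time
    from prometheus_client import Counter, Histogram, start_http_server

    REQUESTS = Counter("inference_requests_total", "Total inference requests")
    LATENCY = Histogram("inference_latency_seconds", "Inference latency in seconds")

    start_http_server(8000)            # exposes /metrics for Prometheus to scrape

    while True:
        REQUESTS.inc()
        with LATENCY.time():
            time.sleep(random.uniform(0.01, 0.1))   # stand-in for model inference work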
8. Soft Skills & Collaboration
  • Strong analytical and troubleshooting skills.
  • Able to work closely with data scientists, backend engineers, and DevOps to deploy and maintain reliable ML systems.
  • Excellent communication and documentation habits.