AI Engineering and MLOps: Designing, Deploying, and Scaling AI Systems

Program Description

While this outline serves as a foundational framework with use cases from multiple industries and functions, the final program is fully customized to your industry and internal workflows.

Participants work on real-world problems, not generic examples. We conduct a pre-workshop alignment session to incorporate your organization's specific datasets, pain points, and proprietary use cases directly into the curriculum.

Learning Objectives

Program Details

Content

Day 1: AI Systems Design & The Engineering Lifecycle

  • Moving from “Notebooks” to “Microservices.” Understanding the core components of a production AI system: Feature Stores, Model Registries, and Metadata Tracking.
  • Scenario (Banking): A technical lead audits a legacy credit scoring system and architects a transition to a real-time, event-driven feature store to enable instant loan approvals.
  • Hands-on: “The Architecture Blueprint” – Designing a multi-tier AI system architecture that separates data ingestion, training, and inference layers for high availability.
  • Expected Impact: Technical clarity on selecting the right tools (e.g., MLflow, Kubeflow) vs. cloud-native managed services (AWS/Azure/GCP).
  • Ensuring “Reproducibility” in AI. Deep dive into Data Version Control (DVC) and automated data validation pipelines to prevent “Garbage In, Garbage Out.”
  • Demo (Manufacturing): An automated “Data Guardrail” in a factory sensor pipeline that halts model retraining if it detects faulty calibration data from a specific assembly line.
  • Hands-on: Setting up a Git-integrated Data Versioning (DVC) workflow to track changes in a large-scale e-commerce transaction dataset.
  • Expected Impact: Elimination of the “It worked on my machine” problem; 100% auditability of data lineage.
  • Beyond standard DevOps. Implementing Continuous Training (CT) – where models automatically retrain and validate when performance drops or new data arrives.
  • Scenario (Retail): A recommendation engine that automatically retrains itself when it detects a 5% drop in click-through rate during a sudden Malaysian festive sale (e.g., Shopee 11.11).
  • Hands-on: Coding a GitHub Action or GitLab CI pipeline that triggers a model validation suite every time a code change or data update is committed.
  • Expected Impact: Significant reduction in manual intervention; faster “Time-to-Production” for model improvements.
  • Implementing “Privacy-by-Design.” Technical methods for PII masking, k-anonymity, and the newly mandated Data Protection Impact Assessments (DPIA) within the AI pipeline.
  • Scenario (HR/Operations): Building a secure talent analytics pipeline where NRICs and sensitive personal data are automatically encrypted and pseudonymized at the point of ingestion.
  • Hands-on: Implementing an automated “PII Scanner” node in a Python data pipeline that flags and redacts sensitive Malaysian identifiers before data storage.
  • Expected Impact: 100% compliance with Malaysian PDPA 2.0; structural protection against multi-million ringgit fines and data breaches.
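The PII-scanning exercise from Day 1 can be sketched in a few lines of Python. This is a minimal illustration, not the workshop's actual pipeline: the NRIC regex, field names, and redaction token are all illustrative assumptions, and a production pipeline would use a vetted PII-detection library rather than a single pattern.

```python
import re

# Illustrative pattern for a Malaysian NRIC (YYMMDD-PB-####).
# Real pipelines should combine multiple detectors, not one regex.
NRIC_PATTERN = re.compile(r"\b\d{6}-\d{2}-\d{4}\b")

def redact_pii(record: dict) -> dict:
    """Return a copy of the record with NRIC-like values masked
    before the data reaches storage."""
    clean = {}
    for key, value in record.items():
        if isinstance(value, str):
            clean[key] = NRIC_PATTERN.sub("[REDACTED-NRIC]", value)
        else:
            clean[key] = value
    return clean

if __name__ == "__main__":
    raw = {"name": "A. Tan", "note": "Applicant NRIC 900101-14-5678 verified"}
    print(redact_pii(raw)["note"])
```

In the workshop, a node like this sits at the point of ingestion, so sensitive identifiers never reach downstream storage in clear text.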

Day 2: LLMOps, Observability, and Scaling

  • The unique challenges of GenAI. Architecting for Retrieval-Augmented Generation (RAG), managing Vector Databases (Pinecone, Weaviate), and optimizing GPU utilization.
  • Demo (Customer Experience): Architecting a “Sovereign RAG” system for a Malaysian telco that answers customer queries using internal PDFs while keeping data strictly within local cloud regions.
  • Hands-on: Configuring an end-to-end RAG pipeline – linking a Vector Database to an LLM and setting up automated “Hallucination Checks” for output quality.
  • Expected Impact: Transition from simple chatbots to reliable, context-aware enterprise AI agents.
  • Choosing the right “Inference Strategy.” Comparing Batch vs. Real-time vs. Edge deployment. Mastering Docker and Kubernetes for AI containerization.
  • Scenario (Logistics/E-commerce): Deploying a high-concurrency demand forecasting model that must handle 10,000+ requests per second during peak “12.12” traffic without latency.
  • Hands-on: Containerizing a Python-based model using Docker and simulating a high-load deployment to test auto-scaling triggers and load balancing.
  • Expected Impact: Technical mastery over infrastructure cost-optimization and system resilience.
  • The “Post-Deployment” crisis. Monitoring for Concept Drift (market changes) and Data Drift (input changes). Setting up automated alerts and “Circuit Breakers.”
  • Demo (Finance/Risk): A monitoring dashboard that flags a “Drift Alert” when the profile of mortgage applicants shifts significantly following an interest rate hike by Bank Negara Malaysia.
  • Hands-on: Setting up an automated monitoring loop in Python that calculates Population Stability Index (PSI) and triggers an email alert if the model’s confidence drops.
  • Expected Impact: Proactive risk management; ensuring AI remains accurate and fair long after its initial launch.
  • Implementing the National AI Governance & Ethics (AIGE) principles (Fairness, Accountability, Transparency). Designing “Emergency Shut-off” mechanisms for AI agents.
  • The Framework: Prioritizing the “AI Backlog” based on Technical Debt, Scalability, and Regulatory Risk.
  • Hands-on: Co-creating an “Enterprise AI Playbook” – defining technical KPIs for MLOps maturity and a phased 3-6 month roadmap for departmental AI scaling.
  • Expected Impact: A clear, sustainable path toward transforming your organization into a technically mature, AI-First enterprise.
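The drift-monitoring exercise from Day 2 can be sketched as a Population Stability Index (PSI) check. This is a minimal sketch under common conventions: decile binning from the baseline distribution and the widely used rule-of-thumb alert threshold of 0.2 are assumptions, not the program's prescribed values.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline (expected) and a live (actual) distribution.
    Bin edges come from the baseline's quantiles; a small epsilon
    guards against empty bins in the log ratio."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch out-of-range live values
    exp_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    act_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    eps = 1e-6
    exp_pct = np.clip(exp_pct, eps, None)
    act_pct = np.clip(act_pct, eps, None)
    return float(np.sum((act_pct - exp_pct) * np.log(act_pct / exp_pct)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    baseline = rng.normal(0, 1, 10_000)         # training-time applicant profile
    live = rng.normal(0.5, 1, 10_000)           # simulated post-rate-hike shift
    psi = population_stability_index(baseline, live)
    # PSI above ~0.2 is a common rule-of-thumb trigger for a drift alert
    print(f"PSI = {psi:.3f}", "ALERT" if psi > 0.2 else "stable")
```

In the hands-on session, a loop like this runs on a schedule and wires the alert condition to an email or paging hook instead of a print statement.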

List of Deliverables

Prerequisites

Who Should Attend

Training Methodology

100% HRDC-Claimable

This program is fully registered and compliant with HRDC (Human Resource Development Corporation) requirements under the SBL-Khas scheme, allowing Malaysian employers to offset the training costs against their levy.

Certification of Completion

Participants who successfully complete the program will be awarded a “Professional Certificate in AI Engineering & MLOps Leadership.”

Post-Workshop Consulting (Optional)

For organizations looking to bridge the gap between training and execution, we offer optional, paid consulting services. These engagements provide expertise and technical support for specific pilot development or full-scale operational integration of the data- and AI-driven use cases established during the program.

Contact us for In-House Training
