Azure MLOps Challenge Blog: Part 6

Welcome back! 🙌

In my previous post, we completed the fifth module of the Azure MLOps Challenge. We were tasked with updating our GitHub Actions workflow to run unit tests and linting on our code, but we took it a step further and added manual approvals and dependency scanning with Snyk. In this post I’ll show you how to complete the sixth module of the Azure MLOps Challenge. We’ll add a production dataset to Azure Machine Learning and add a new training job to our GitHub Actions workflow to trigger the training of a production model. We’ll also add a new environment to our GitHub repo and add a required reviewer to the environment.

This content is based on Challenge 5 of the MLOps Challenge Labs. I’ll provide an overview of the key concepts and my solution to the challenge. If you want to follow along yourself, you can find instructions at the MLOps challenge documentation site.

Here is a brief description of the technologies you’ll encounter in this challenge that weren’t covered in the previous posts.

  • Azure Machine Learning Environments: a collection of resources used for model training and deployment. An environment contains the Python packages, environment variables, software settings and operating system that are required to run your model. You can create environments from a set of curated base images, or from your own custom Docker image. You can also create environments from a Conda specification file that defines the packages to be installed in the environment.
  • GitHub Actions Environments: a structured way to manage deployment targets such as production, staging, or development. The real power of environments lies in their protection rules and secrets. These rules ensure that deployments only proceed when all conditions are met, offering a secure and flexible way to manage deployment workflows. Whether you’re pushing a new feature to staging or deploying a hotfix to production, GitHub Environments has got you covered.

Let’s get started! 🚀

Challenge 5: Work with environments

The objectives of the sixth challenge are as follows:

  • Set up a development and production environment.
  • Add a required reviewer.
  • Add environments to a GitHub Actions workflow.

We created the dev environment in the previous challenge, so we can use that knowledge to create the prod environment using the same process, including adding yourself as a required reviewer. As per the challenge instructions, we’ll add the Azure credentials directly to the GitHub environment secrets and delete the repo-level secrets. This is a good practice to follow as it ensures that the secrets are only available to workflow jobs that reference the environment, and for prod only after the required reviewer has approved the job. In a production environment you would use separate Service Principals for each environment, scoped to the minimum permissions required in the respective Azure Subscription and/or Resource Group. In the context of our learning challenge, we’ll use the same Service Principal for both environments.
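
If you prefer the command line to the repo settings page, moving the secrets can be sketched with the GitHub CLI. This is a minimal sketch that assumes gh is installed and authenticated against the repo. The run helper and the DRY_RUN switch are my own additions for illustration, and by default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: assumes the GitHub CLI (gh) is installed and authenticated.
# With DRY_RUN=1 (the default here) commands are printed, not executed.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Secret names match the ones read by the workflow's Azure login step.
AZURE_SECRETS="AZURE_CLIENT_ID AZURE_TENANT_ID AZURE_SUBSCRIPTION_ID"

add_environment_secrets() {
  local environment="$1" name
  for name in $AZURE_SECRETS; do
    # Without --body, gh prompts for the value, which keeps it out of shell history.
    run gh secret set "$name" --env "$environment"
  done
}

delete_repo_secrets() {
  local name
  for name in $AZURE_SECRETS; do
    # No --env flag, so this targets the repo-level secret.
    run gh secret delete "$name"
  done
}

add_environment_secrets dev
add_environment_secrets prod
delete_repo_secrets
```

Run it once as-is to review the printed commands, then again with DRY_RUN=0 to apply them.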

The next step is to add the prod environment to our Service Principal’s “Federated credentials” list. This is the same process we followed in the previous challenge to add the dev environment. The complete “Subject Identifier” should look like this:

prod: repo:[your github account]/[your repo]:environment:prod

dev: repo:[your github account]/[your repo]:environment:dev
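
The same subject identifiers can also be written into credential definition files for the Azure CLI instead of being typed into the portal. The sketch below only generates the JSON. The account name, repo name and credential name are placeholders I made up, and the issuer and audience values are the standard ones for GitHub Actions OIDC as far as I know.

```shell
#!/usr/bin/env bash
# Sketch: build a federated-credential definition for one GitHub environment.
# "my-account", "my-repo" and the "github-<env>" name are hypothetical placeholders.

federated_credential_json() {
  local owner="$1" repo="$2" environment="$3"
  cat <<EOF
{
  "name": "github-${environment}",
  "issuer": "https://token.actions.githubusercontent.com",
  "subject": "repo:${owner}/${repo}:environment:${environment}",
  "audiences": ["api://AzureADTokenExchange"]
}
EOF
}

# Write the prod definition to a file; nothing is sent to Azure here.
federated_credential_json "my-account" "my-repo" "prod" > credential-prod.json

# To register it (requires az login and the app registration's ID in APP_ID):
#   az ad app federated-credential create --id "$APP_ID" --parameters credential-prod.json
```

Keeping one file per environment makes it easy to see which subjects the Service Principal trusts.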

At this point you can remove the required reviewer rule from the dev environment and add the Azure environment secrets if you haven’t already done so. It’s also a good idea to remove any unused credentials from the Service Principal’s “Federated credentials” list.

Now we’ll add the diabetes-prod-folder to our Azure Machine Learning Workspace datasets. This is the folder that contains the diabetes-prod.csv file that we’ll use to train our prod model. This CSV contains a larger dataset to train the model. Create a new yml file named dataset-prod.yml with the following contents:

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
type: uri_folder
name: diabetes-prod-folder
version: 1
path: production/data
description: Dataset pointing to diabetes-prod CSV in the repo. Data will be uploaded to default datastore.

The Azure Machine Learning CLI command to create the data asset is:

az ml data create --file dataset-prod.yml --workspace-name myWorkspace --resource-group myResourceGroup 

A successful JSON response should look like this:

{
  "creation_context": {
    "created_at": "2023-08-19T23:26:52.918654+00:00",
    "created_by": "",
    "created_by_type": "User",
    "last_modified_at": "2023-08-19T23:26:52.927832+00:00"
  },
  "description": "Dataset pointing to diabetes-prod CSV in the repo. Data will be uploaded to default datastore.",
  "id": "/subscriptions/[subId]/resourceGroups/myResourceGroup/providers/Microsoft.MachineLearningServices/workspaces/myWorkspace/data/diabetes-prod-folder/versions/1",
  "name": "diabetes-prod-folder",
  "path": "azureml://subscriptions/[subId]/resourcegroups/myResourceGroup/workspaces/myWorkspace/datastores/workspaceblobstore/paths/LocalUpload/386681db2f7cd59e8c96f5ee80b212db/data/",
  "properties": {},
  "resourceGroup": "myResourceGroup",
  "tags": {},
  "type": "uri_folder",
  "version": "1"
}
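
If the command fails instead, one thing worth checking is the local folder named in path. As far as I know, the CLI resolves that path relative to the YAML file. The helper below is a small sketch of a pre-flight check. The has_csv function is hypothetical and not part of the challenge, and it only confirms that the folder exists and holds at least one CSV file.

```shell
#!/usr/bin/env bash
# Sketch: pre-flight check for the folder referenced by dataset-prod.yml.
# has_csv is a hypothetical helper, not part of the Azure CLI.

has_csv() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing folder: $dir" >&2
    return 1
  fi
  # An unmatched glob stays unexpanded, so test that the first match really exists.
  set -- "$dir"/*.csv
  if [ ! -e "$1" ]; then
    echo "no CSV files in: $dir" >&2
    return 1
  fi
  echo "ok: $dir"
}

# Example usage, run from the folder that holds dataset-prod.yml:
#   has_csv production/data && az ml data create --file dataset-prod.yml \
#     --workspace-name myWorkspace --resource-group myResourceGroup
```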

We also need to create a new YAML file for the prod training job in the src folder, next to the model folder that holds the training code. This file will be called job-prod.yml, with the following code:

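$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: model
# The script name train.py is an assumption; use the training script from the earlier parts.
command: >-
  python train.py
  --training_data ${{inputs.training_data}}
  --reg_rate ${{inputs.reg_rate}}
inputs:
  training_data:
    type: uri_folder
    path: azureml:diabetes-prod-folder:1
  reg_rate: 0.01
environment: azureml:AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest
compute: azureml:myCompute
experiment_name: diabetes-classification-production
description: Train a classification model to predict diabetes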

We’re using the same compute target (myCompute) as the dev training job. In a production environment we would use a different compute target (cluster) to support the scale of the datasets.
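
The training data input points at the data asset with an azureml:<name>:<version> reference. The sketch below splits such a reference into its parts so they can be passed to az ml data show before the job is submitted. The two helper functions are my own illustration, and they do not handle the azureml:<name>@latest form.

```shell
#!/usr/bin/env bash
# Sketch: split an "azureml:<name>:<version>" reference into name and version.
# asset_name and asset_version are hypothetical helpers for illustration.

asset_name() {
  local ref="${1#azureml:}"   # drop the azureml: prefix
  echo "${ref%%:*}"           # keep everything before the first colon
}

asset_version() {
  local ref="${1#azureml:}"
  echo "${ref##*:}"           # keep everything after the last colon
}

ref="azureml:diabetes-prod-folder:1"
echo "name=$(asset_name "$ref") version=$(asset_version "$ref")"

# To confirm the asset exists before submitting the job (requires az login):
#   az ml data show --name "$(asset_name "$ref")" --version "$(asset_version "$ref")" \
#     --workspace-name myWorkspace --resource-group myResourceGroup
```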

Finally, we’ll add another job to the existing GitHub Actions workflow to trigger the prod training job. This job will look like this:

  # The job id train-prod is an assumption, chosen to mirror train-dev.
  train-prod:
    runs-on: ubuntu-latest
    needs: [train-dev]
    environment:
      name: prod
    steps:
      - name: Check out repo
        uses: actions/checkout@main
      - name: Install az ml extension
        run: az extension add -n ml -y
      - name: "Az CLI login"
        uses: azure/login@v1
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Trigger Azure ML job
        run: az ml job create --file src/job-prod.yml --workspace-name ${{ env.workspace-name }} --resource-group ${{ env.resource-group }} --stream # output job logs to the console

We’re ready to commit and push our changes! After a few minutes we’ll see the DevSecOps jobs have completed, the dev training job has completed successfully, and the prod model job is waiting for approval:

And once all the jobs have finished we’ll see the following:


In this post, we walked through the steps to meet the success criteria of the sixth module of the Azure MLOps Challenge. We created a new prod environment in our GitHub repo, added a required reviewer and added our Azure credentials as Environment secrets. Lastly, we created a new dataset in Azure Machine Learning and added a new training job to our GitHub Actions workflow to trigger the prod training job.

In the next post, we’ll tackle the seventh and final module of the Azure MLOps Challenge which is to deploy and test our model.

Happy learning and good luck! 🚀
