How to integrate Machine Learning with DevOps

Hello everyone. Have you ever asked yourself, "Can I integrate Machine Learning with DevOps?" The answer is yes: we can integrate ML with DevOps, and the combination is called MLOps. MLOps can be achieved in various ways and for various use cases. This article explains one of them.

Problem Overview :

  • Create a Docker container image that has Python installed, along with all the essential libraries required for training a machine learning model.
  • Create a number of jobs in Jenkins to test, notify, rebuild, and tweak the machine learning model in order to reach the desired accuracy.

Solution Overview :

We are going to build a chain of jobs in order to reach the desired accuracy for a given dataset. Before going ahead, we have to create a Dockerfile that builds an image with the required configuration.

  • Job 1 : Job 1 keeps an eye on the GitHub repository. As soon as the developer pushes something, this job automatically copies everything into a folder on my base OS (I am using RHEL 8 in a VM as my base OS).
  • Job 2 : The success of Job 1 triggers Job 2. This job launches the Docker container that serves as the workspace for Jenkins.
  • Job 3 : After the container has launched successfully, Jenkins triggers this job. It looks for the file pushed by the developer that contains the main code to train the model (main.py in my case) and runs it.
  • Job 4 : Job 4 notifies the developer that main.py has errors that caused Job 3 to fail.
  • Job 5 : If main.py runs successfully but gives lower accuracy than the developer desires, Jenkins should automatically tweak the model and try, by trial and error, to increase the accuracy. To make this possible, the developer pushes one more file along with main.py: rebuild.py. It lets Jenkins test and rebuild the model again and again until the desired accuracy is achieved.
  • Job 6 : This job is the success notifier. As soon as Jenkins achieves the desired accuracy, it notifies the developer about the success of the model and the accuracy achieved.
  • Job 7 : This job is another failure notifier, but it reports the failure of rebuild.py.
  • Job 8 : Job 8 is the monitoring job. It keeps an eye on the running container; if it finds that the container has crashed, it immediately launches a new container with the same configuration.

So let's get started . . .

Github link : https://github.com/ajinkya48765/MLOps-Project-1

First of all, we have to write our Dockerfile in order to create the Python workspace.

Creating a Dockerfile :

# Base image
FROM centos

# Python 3.6 and the EPEL repository
RUN yum install -y python36
RUN yum install -y epel-release

# Compilers and headers needed to build Python packages
RUN yum groupinstall -y 'development tools'
RUN yum install -y python36-devel

# Libraries used for training the model
RUN pip3 install keras
RUN pip3 install numpy
RUN pip3 install pandas
RUN pip3 install pillow
RUN pip3 install opencv-python
RUN pip3 install tensorflow

Use this Dockerfile to build the Docker image from which you will launch your container. To build the image, use:

docker build -t <name>:<tag> <location of docker file>

Keep in mind that the file containing the image instructions should be named Dockerfile, because that is the default name docker build looks for (any other name has to be passed with the -f option). Now that our image is ready, let's start creating the jobs.

Setting up git and github environment :

First of all, clone this repo: https://github.com/ajinkya48765/Integration-of-ML-with-DevOps.git. Now configure a post-commit hook so that the developer doesn't need to push the code manually after every commit.
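A Git hook is just an executable file, so the post-commit hook can be a small Python script saved as .git/hooks/post-commit. The sketch below is only one way to write it. It assumes the remote is named origin and that the current branch is the one to push; neither detail comes from the repository itself.

```python
#!/usr/bin/env python3
"""Hypothetical .git/hooks/post-commit: push right after every commit."""
import subprocess


def push_command(remote="origin", branch="HEAD"):
    # "origin" and "HEAD" (the current branch) are assumed defaults.
    return ["git", "push", remote, branch]


def main():
    # Run the push and hand git's exit code back to the caller.
    result = subprocess.run(push_command())
    return result.returncode
```

To use it as a hook, the file would end with a call to main() and be made executable with chmod +x.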

[Screenshot: setting up git and env]

Job 1 (Github watcher) :

As our GitHub workspace is ready, we will copy everything from GitHub to the base OS.

[Screenshot: Github watcher]

I trigger this job through Poll SCM, so that option is selected in the image.

[Screenshot: Job 1 ( github watcher )]

In this way we have created our first job, which keeps an eye on GitHub and copies all the content to a folder in the base OS.

Job 2 (Image_launcher) :

When Job 1 runs successfully, Jenkins triggers Job 2. This is basically an image launcher that depends on the code provided by the developer. For example, if someone provides code for a CNN, then a container whose image has the CNN-related libraries is launched.
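One possible way to pick the image is to look at the imports in the pushed code. The sketch below is not the exact logic of my job. The image names and the rule "keras or tensorflow means deep learning" are assumptions made for illustration.

```python
import re

# Hypothetical image names; replace them with the images you actually built.
IMAGES = {
    "deep_learning": "mlops-keras:v1",
    "classic_ml": "mlops-sklearn:v1",
}

# Imports that suggest the code trains a neural network such as a CNN.
DEEP_LEARNING_IMPORT = re.compile(
    r"^\s*(?:from|import)\s+(?:keras|tensorflow)\b", re.MULTILINE
)


def choose_image(source_code):
    """Return the image whose libraries match the imports in source_code."""
    if DEEP_LEARNING_IMPORT.search(source_code):
        return IMAGES["deep_learning"]
    return IMAGES["classic_ml"]
```

The job can then pass the returned name to docker run when it launches the container.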

[Screenshots: Job 2 ( Image launcher )]

Job 3 (main_runner) :

This job runs the main.py file and finds the accuracy of the model. It is triggered by Job 2:

[Screenshot: main runner]

Now we have to add the code for running the main.py file.
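A rough sketch of what this step can look like in Python is given here. It assumes that main.py writes its accuracy as a percentage to a file named accuracy.txt and that the desired accuracy is 90; both are hypothetical choices, so adjust them to your own main.py.

```python
import subprocess
import sys
from pathlib import Path


def run_main(script="main.py"):
    """Run the training script; raises CalledProcessError if it fails."""
    subprocess.run([sys.executable, script], check=True)


def parse_accuracy(text):
    """Convert the text written by main.py (e.g. "87.5") to a float."""
    return float(text.strip())


def read_accuracy(path="accuracy.txt"):
    """Read the accuracy file that main.py is assumed to write."""
    return parse_accuracy(Path(path).read_text())


def needs_rebuild(accuracy, desired=90.0):
    """True when the model is below the desired accuracy (assumed 90)."""
    return accuracy < desired
```

If run_main raises, the job fails and the failure notifier takes over; if needs_rebuild returns True, the rebuild job is the next one to call.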

[Screenshots: Job 3 ( main runner )]

In case this code has errors, the developer should be informed, so for that purpose I have used a post-build trigger here.

[Screenshot: Job 3 ( main runner )]

On failure of Job 3, Job 4 will be triggered.

Job 4 (main_failed_notifier) :

As I said earlier during job chaining, if the main file has errors and fails to run, the developer should be informed. So I have used a Python script to send an email to the developer.
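A minimal mail script could look like the sketch below, which uses Python's standard smtplib. The SMTP host, the port and the wording of the message are assumptions; use the values of your own mail provider.

```python
import smtplib
from email.message import EmailMessage


def build_failure_mail(sender, recipient, job_name):
    """Compose the notification; the wording here is only an example."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Jenkins job {job_name} failed"
    msg.set_content(f"{job_name} failed. Please check main.py and push a fix.")
    return msg


def send_mail(msg, password, host="smtp.gmail.com", port=587):
    """Send msg over SMTP with STARTTLS (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(msg["From"], password)
        server.send_message(msg)
```

The job would call build_failure_mail with the developer's address and then pass the result to send_mail.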

[Screenshots: failed notifier]

Job 5 (Rebuild Runner) :

If main.py fails to achieve the desired accuracy, it calls this job. The developer provides a tweak script (rebuild.py) that keeps trying different values and monitoring the accuracy. If it finds an accuracy greater than the desired threshold, this job succeeds.
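The idea behind rebuild.py can be sketched as a simple loop. This is not the actual rebuild.py from the repository. The train callable and the list of candidate hyperparameters are placeholders for whatever your model needs.

```python
def rebuild_until(train, candidates, desired):
    """Try each hyperparameter set until the desired accuracy is reached.

    train is any callable that takes a dict of hyperparameters and returns
    the accuracy of the model trained with them.
    Returns (params, accuracy) of the first run that meets the target, or
    of the best run seen if none does.
    """
    best_params, best_accuracy = None, float("-inf")
    for params in candidates:
        accuracy = train(params)
        if accuracy > best_accuracy:
            best_params, best_accuracy = params, accuracy
        if accuracy >= desired:
            break  # target reached, stop retraining
    return best_params, best_accuracy
```

For example, candidates could be a list such as [{"epochs": 5}, {"epochs": 10}], and the job is marked as failed when the returned accuracy is still below the desired value.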

[Screenshot: rebuild runner]

Since this job is triggered remotely, we have to configure an authentication token here.
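With the "Trigger builds remotely" option, Jenkins starts the job when the URL JENKINS_URL/job/<job name>/build?token=<token> is requested. The sketch below builds and calls that URL from Python. The Jenkins address, job name and token are placeholders, and depending on the security settings Jenkins may also ask for user credentials.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def trigger_url(jenkins_url, job_name, token):
    """Build the "Trigger builds remotely" URL for a job."""
    base = jenkins_url.rstrip("/")
    query = urlencode({"token": token})
    return f"{base}/job/{quote(job_name)}/build?{query}"


def trigger_job(jenkins_url, job_name, token):
    """Request the URL; Jenkins queues a build if the token matches."""
    with urlopen(trigger_url(jenkins_url, job_name, token)) as response:
        return response.status
```

The main runner can call trigger_job when the accuracy is too low, and the rebuild runner can use it in the same way to call the success notifier.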

[Screenshot: rebuild runner]

We have to run the rebuild.py file inside this job.

[Screenshot: rebuild runner]

Now we have run our code, but what if it has errors? We again have to send an email to the developer about the failure of rebuild.py. And if we get the desired accuracy, the job should trigger Job 6 to inform the developer about the success. For this purpose we have to provide post-build triggers.

[Screenshot: rebuild runner]

Job 6 (Success Notifier) :

This job sends an email to the developers about the success of the training and the accuracy of the model. First we have to configure the remote trigger authentication token.

[Screenshots: success notifier]

Job 7 (Rebuild_failed_Notifier) :

If rebuild.py has an error or gets interrupted, this job comes into action and informs the developer.

[Screenshots: rebuild failed notifier]

Job 8 (Monitor) :

We have built such a massive setup, but what if our Docker container goes down? We can't do anything in that condition. So I have created this job, which continuously keeps an eye on the container. If it finds the container stopped, it automatically launches a new one.
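The check itself can be sketched with the docker command line as shown here. The container name and image are placeholders, and a real job would add the same volumes and options that the image launcher uses, so treat this as an outline rather than my exact job.

```python
import subprocess


def is_running(name, run=subprocess.run):
    """Return True if a running container is named exactly name."""
    result = run(
        ["docker", "ps", "--filter", f"name={name}", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True,
    )
    # The name filter matches substrings, so compare the names exactly.
    return name in result.stdout.split()


def ensure_running(name, image, run=subprocess.run):
    """Launch a replacement container when name is not running."""
    if is_running(name, run):
        return False
    # Remove a stopped container with the same name, if any, then relaunch.
    run(["docker", "rm", "-f", name], capture_output=True, text=True)
    run(["docker", "run", "-dit", "--name", name, image], check=True)
    return True
```

Jenkins can run ensure_running on a schedule, for example every minute with the "Build periodically" trigger.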

[Screenshot: monitor]

This is all about this project.

Conclusion :

I used Jenkins, Git, GitHub, and Docker to integrate machine learning with DevOps. Projects of this kind save a developer much of the time otherwise spent on trial and error and testing.

Thank You.