Move your data science environment and models to Azure

The why and what of migrating your data science workloads to Azure  

Published: 03 November 2022

Are you still developing your data science solutions on your local machine? Perhaps it is time to think about moving them to the Azure cloud. This article explains what Azure has to offer for data science. We will cover the following subjects:

  • Why you need an Azure data science environment
  • What data science tools does Azure offer?

Are you curious why you should use data science in the first place? Find out in our previous article, where we explored the basic concepts of the data scientist role.

Why you need an Azure data science environment  

At Intercept, we believe moving to Azure offers the following benefits:  

  • A single maintainable platform 
  • A secure solution
  • Increased collaboration
  • Pre-installed packages

First of all, Azure provides you with a single platform for your entire data science solution. As it integrates with many Azure services, you can build the complete solution on one platform. For example, you can store your data using storage accounts, run machine learning pipelines in Azure Machine Learning, and give each employee access using role-based access control. Since your entire technology stack runs in one place on the Azure cloud, maintaining your data science solution is more efficient and transparent.  

Next, security is of utmost importance for any data science solution. Azure offers plenty of tools to keep your solution secure and resilient to outside attacks. For example, you can use Azure Key Vault to store secrets such as passwords and connection strings, so they never have to appear in your code or notebooks. To control who can access your data science environment, you can use role-based access control (RBAC). Together, these tools help you secure your data science environment in a user-friendly manner.
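To make the Key Vault idea concrete, here is a minimal Python sketch of reading a secret at runtime instead of hard-coding it. It assumes the `azure-identity` and `azure-keyvault-secrets` packages are installed and that the signed-in identity has been granted read access to the vault's secrets. The helper names `vault_url` and `read_secret` are our own for illustration, and the URL format shown applies to the Azure public cloud.

```python
def vault_url(vault_name: str) -> str:
    """Build the Key Vault URL for a vault name (Azure public cloud format)."""
    return f"https://{vault_name}.vault.azure.net"


def read_secret(vault_name: str, secret_name: str) -> str:
    """Fetch the current value of one secret from Azure Key Vault."""
    # Imported here so this module still loads where the Azure SDK is absent.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    # DefaultAzureCredential tries several sign-in methods in turn,
    # for example a managed identity or a local Azure CLI login.
    client = SecretClient(
        vault_url=vault_url(vault_name),
        credential=DefaultAzureCredential(),
    )
    return client.get_secret(secret_name).value
```

A call such as `read_secret("my-vault", "db-password")` (both names are placeholders) would then return the password without it ever being stored in the notebook. Whether the call succeeds depends on the access the caller has been given, which is where RBAC comes in.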

Azure offers a single platform for building, developing, and deploying models. Besides improving maintainability, this aids collaboration as well. Picture this: when everyone in your team works on the same platform, there is no longer any need to transfer files, datasets, or outcomes between colleagues.

Do you want to use specific packages in your data science workloads? Collaborating on the same platform ensures all needed packages can be pre-installed and ready to use. On Azure, team members do not each need to install frameworks or ML packages separately. Once the packages are installed in the shared environment, anybody in your team can run your code without spending time setting up a local environment.

 

What data science tools does Azure offer?

Azure offers various data science and analytics tools. Curious to know which tool is right for your use case? In this section, we will talk about five commonly used data science tools on Azure:

  1. Azure Machine Learning (AML) 
  2. Azure Cognitive Services 
  3. Azure Databricks 
  4. Azure Synapse Analytics 
  5. Azure Data Science Virtual Machine (DSVM)

Azure Machine Learning (AML) 

At Intercept, we love using AML for our data science workloads because it is user-friendly. AML is an end-to-end data science platform in the cloud that helps you accelerate and manage your entire data science project. With AML, you develop data science workloads in Python and run your code in Jupyter notebooks. The computational power comes from compute clusters, which also allow you to perform distributed training of your data science models. Because the Python SDK comes pre-installed, your notebooks are ready to run immediately after deployment.

Azure Cognitive Services 

If you have a specific business challenge around decision-making, language, vision, or speech, it is worth investigating Azure Cognitive Services. Azure Cognitive Services offers Artificial Intelligence (AI) services on Azure, using pre-built and pre-trained machine learning models from Microsoft. From your data science and analytics environment, you can call these services in your programming language of choice through REST APIs and client library SDKs. Azure Cognitive Services is a great option for adding cognitive intelligence to your applications, even when you don’t have AI or data science knowledge. However, it’s good to note that the algorithms used within Azure Cognitive Services cannot be modified.
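As an illustration of the REST route, the following Python sketch prepares a sentiment analysis request for the language service using only the standard library. It assumes the Text Analytics v3.1 REST path and a key-based resource. The endpoint and key values are placeholders you would replace with those of your own resource, and the function name `build_sentiment_request` is ours, not part of any SDK.

```python
import json
import urllib.request


def build_sentiment_request(endpoint: str, key: str, text: str) -> urllib.request.Request:
    """Prepare (but do not send) a sentiment analysis request for one text."""
    # Assumed API version: Text Analytics v3.1; newer versions use other paths.
    url = endpoint.rstrip("/") + "/text/analytics/v3.1/sentiment"
    # The service accepts a batch of documents; here we send a single one.
    body = {"documents": [{"id": "1", "language": "en", "text": text}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,  # key of your own resource
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` sends it, and the service replies with JSON containing a sentiment label and confidence scores per document. In a real project, the key would typically come from Azure Key Vault rather than from the source code.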

Azure Databricks 

If you want to run large-scale machine learning workflows, Azure Databricks might be worth looking into. Because Azure Databricks is built on Apache Spark, you will have a distributed data processing framework at your fingertips. On Azure Databricks, you can set up your Spark environment within minutes, enjoy autoscaling, and collaborate with your colleagues using notebooks. In addition, it integrates with many other Azure services, which makes it easy to fit into an existing Azure environment.

Azure Synapse Analytics 

Do you need an all-in-one platform? Then Azure Synapse Analytics is the tool for you. It integrates well with the rest of the Azure platform, so you can connect to all services needed for an end-to-end data science project. Azure Synapse Analytics also bundles data integration, warehousing, analytics, and data science tools in one product.

Azure Data Science Virtual Machine (DSVM)

If you want an easy start with moving your data science solution to the cloud, a single Data Science Virtual Machine (DSVM) on Azure might be the right choice for you. A DSVM is a cloud-based counterpart to your local environment: a virtual machine with a Python SDK and other data science and ML tools pre-installed. Using an Azure DSVM gives you more scalability than developing locally. However, in our experience, for a complete data science solution, we prefer Azure Machine Learning combined with a regular virtual machine that serves as cloud-based compute integrated with AML.

 

Conclusion on data tools 

In conclusion, there is almost always a suitable tool for moving your data science and analytics workload to Azure. This is because of the wide variety of tools available in Azure, the ease of setting them up and maintaining them, and their strong security features.

 

How Intercept tackles the move to Azure    

Following the DLM approach, Intercept tackles your data science project step by step. Together with our Data Scientist, we guide you through your business requirements and translate them into a Data Design. This Data Design acts as the blueprint for your data science project. Schedule a meeting with us to identify which challenges you’d like to solve with data science.