In this post I want to share an alternative way to copy an Azure Data Factory pipeline to Synapse Studio, because I think it can be useful.
For those who are not aware, Synapse Studio is the frontend that comes with Azure Synapse Analytics. You can find out more about it in another post I did, which was a five-minute crash course about Synapse Studio.
By the end of this post, you will know one way to copy the objects used for an Azure Data Factory pipeline to Synapse Studio, which works as long as both are configured to use Git.
Azure Data Factory example
For this example, I decided to use the pipeline objects that I created for another post, which showed an Azure Test Plans example for Azure Data Factory. It uses a mapping data flow, as you can see below.
In order to use this method, both Azure Data Factory and Azure Synapse Analytics need to be set up to use source control. For this demo I have them stored in Azure Repos within Azure DevOps.
However, they can just as easily be in a GitHub Enterprise repository instead.
Copy an Azure Data Factory pipeline to Synapse Studio
How I did the copy was very simple: I just copied all the individual objects from the Azure Data Factory repository to the Azure Synapse repository, using the same folder structure.
Below are the objects I needed for the pipeline in the Azure Data Factory repository: the linked services, the datasets, the data flow and, of course, the pipeline itself. They are stored as separate JSON files.
I copied the JSON files from the Azure Data Factory repository to the same locations in the Azure Synapse workspace repository, as you can see below, making sure they went into the same branch that I was working on in Synapse Studio.
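In case it helps, below is a minimal sketch in Python of that copy step, assuming you have local clones of both repositories checked out to the right branches. The repository paths are hypothetical; the folder names are the ones Azure Data Factory creates in its Git repository.

```python
import shutil
from pathlib import Path

# Hypothetical paths to local clones of the two Git repositories,
# both checked out to the branch you are working on.
adf_repo = Path("azure-data-factory-repo")
synapse_repo = Path("synapse-workspace-repo")

# The object folders that Azure Data Factory creates in its Git repository.
for folder in ["linkedService", "dataset", "dataflow", "pipeline"]:
    target = synapse_repo / folder
    target.mkdir(exist_ok=True)
    # Copy every JSON file; in practice you might copy only the
    # files that your pipeline actually needs.
    for json_file in (adf_repo / folder).glob("*.json"):
        shutil.copy(json_file, target / json_file.name)
```

After the copy, you commit and push the changes to the Synapse repository so that the objects show up in Synapse Studio.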
You can see that there are some extra objects in the Azure Synapse repository, which get added by default when you connect an Azure Synapse workspace to a Git repository within Synapse Studio. One key point is that this happens even if you do not select the option to import existing resources into Git.
Testing in Synapse Studio
Now, copying the Azure Data Factory objects this way is all well and good, but does it work?
Well, to test this thoroughly I recreated the two Azure SQL databases that were used in the initial data flow, with the source database based on the AdventureWorksLT sample database and the other database blank.
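For reference, a hedged sketch of creating the source database programmatically with the azure-mgmt-sql package is shown below. The subscription, resource group, server and database names are placeholders, and it assumes the logical server already exists; creating the databases through the Azure Portal works just as well.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient
from azure.mgmt.sql.models import Database

# Placeholder subscription id, resource group and server names.
client = SqlManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Create a database seeded with the AdventureWorksLT sample data.
poller = client.databases.begin_create_or_update(
    "my-resource-group",
    "my-sql-server",
    "SourceDb",
    Database(location="uksouth", sample_name="AdventureWorksLT"),
)
poller.result()  # wait for the deployment to finish
```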
Afterwards, I opened up Synapse Studio and went to the Manage hub, where I changed the linked services for the two databases to connect to the new ones.
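If you would rather script this step than use the Manage hub, below is a sketch of the same change made directly to the copied JSON. The file path, server and database names are hypothetical; for an Azure SQL Database linked service the connection details normally live under properties.typeProperties.connectionString.

```python
import json
from pathlib import Path

# Hypothetical path to a copied linked service definition in the
# local clone of the Synapse workspace repository.
ls_path = Path("synapse-workspace-repo/linkedService/SourceAzureSqlDb.json")

definition = json.loads(ls_path.read_text())

# Point the linked service at the recreated database
# (hypothetical server and database names).
definition["properties"]["typeProperties"]["connectionString"] = (
    "Server=tcp:my-new-server.database.windows.net,1433;"
    "Initial Catalog=SourceDb;"
)

ls_path.write_text(json.dumps(definition, indent=4))
```

Remember to commit and push the change so Synapse Studio picks it up.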
Once I had done that, I went into the Develop hub in Synapse Studio. I then opened the new Data flow and enabled Data flow debug.
I then tested the connection to the dataset, as you can see below. In addition, I was able to preview the data.
Afterwards, I went to the Integrate hub. From there I ran the pipeline in Synapse Studio by clicking Debug, which succeeded, as you can see below.
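As an aside, you can also start a run without the Debug button. Below is a hedged sketch using the azure-synapse-artifacts package; note that this starts a triggered run against the workspace's published artifacts rather than a debug run against your Git branch, and the workspace endpoint and pipeline name are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.synapse.artifacts import ArtifactsClient

# Placeholder workspace endpoint; the pipeline name is whatever
# you called the copied pipeline.
client = ArtifactsClient(
    credential=DefaultAzureCredential(),
    endpoint="https://my-workspace.dev.azuresynapse.net",
)

run = client.pipeline.create_pipeline_run("CopiedPipeline")
print(run.run_id)  # you can follow the run in the Monitor hub
```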
To be absolutely sure, I went into the Azure SQL database that was used for the destination in the Azure Portal. To clarify some terminology here: in pipelines and data flows the destination is called the sink.
I then logged into the Query editor and ran a query to make sure that new rows were in the database.
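The exact query does not matter much; anything that shows rows have arrived will do. As a hedged sketch, here is the same check scripted with pyodbc, with hypothetical connection details and a table name taken from the AdventureWorksLT schema.

```python
import pyodbc

# Hypothetical connection details for the destination (sink) database.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=tcp:my-new-server.database.windows.net,1433;"
    "DATABASE=DestinationDb;UID=sqladmin;PWD=<password>"
)

# Hypothetical table name; any table the pipeline loads will do.
cursor = conn.execute("SELECT COUNT(*) FROM SalesLT.Address;")
print(f"Rows in destination: {cursor.fetchone()[0]}")
```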
This confirmed that the method worked, because that database was blank before we ran the pipeline.
DataWeekender lightning talks
In reality, something simple and effective like this can be explained within ten minutes as a lightning talk. With this in mind, if you have something like this that you want to share with the community, feel free to submit a lightning talk session to DataWeekender v4.2.
I thought I had better mention this since the call for speakers is still open. You can get to the Sessionize page by clicking on this DataWeekender v4.2 call for speakers link or on the image below.
Final words about copying an Azure Data Factory pipeline to Synapse Studio
I hope this post about an alternative way to copy an Azure Data Factory pipeline to Synapse Studio helps some of you.
I like this method, because it shows a simple and effective way to copy objects from Azure Data Factory to Azure Synapse.
I discovered this whilst looking to create more Azure DevOps templates after a previous post, which introduced Azure DevOps templates for Data Platform deployments. So, expect more templates to appear on my GitHub site.
Of course, if you have any comments or queries about this post feel free to reach out to me.