This post covers one way to update both ends of Azure Synapse Link for SQL Server 2022 using Azure DevOps, as shown at the Data Toboggan conference.
You can now watch the video for the ‘Create Database Project for use with Azure Synapse Link for SQL Server 2022‘ session online. Clicking on the session name will open up the video.
By the end of this post you will know how to deploy database updates to both the SQL Server database and the Azure Synapse dedicated SQL Pool that are used as part of Azure Synapse Link for SQL Server 2022, using a pipeline in Azure DevOps.
This keeps both the SQL Server database and the dedicated SQL Pool consistent.
Plus, some of you will find out how to manipulate text files in Azure DevOps pipelines. Because in this post I show you how a pipeline can manipulate a database project to deploy only certain tables to a different platform.
Azure Synapse Link for SQL Server 2022 recap
Just to recap, Azure Synapse Link for SQL Server 2022 transfers data from a SQL Server 2022 database to an Azure Synapse Analytics dedicated SQL Pool. As represented in the red squares in the below diagram. Which I showed in my previous post with an example based on the WideWorldImporters sample database.
I am doing this post since I promised that I would during my session at Data Toboggan. Plus, I promised it in my last post that covered how to create a database project for use with Azure Synapse Link for SQL Server 2022.
A couple of key points about this post before I go any further.
First, Azure Synapse Link for SQL Server 2022 recently came out of preview and is now generally available.
Secondly, in a lot of my posts I tend to demo using YAML pipelines. However, since I did the demo using GUI pipelines at Data Toboggan, I will use those for this post.
Below is a diagram that highlights what steps this post covers:
Azure DevOps repository
In my last post I showed how to create a database project for use with Azure Synapse Link for SQL Server 2022. Plus, how to make the folder that contains the database project a Git repository.
Following on from that post, I need to synchronize my local Git repository that contains the database project with a new one in Azure Repos before I start working with pipelines.
I explained how to do this in detail in a post I wrote about creating a dacpac for an Azure Synapse Analytics dedicated SQL Pool using Azure DevOps.
So, instead of repeating myself I recommend that you read the below two sections within that post. Which show how you create the repository in Azure DevOps and then copy your database project to it using Azure Data Studio.
You can open the post by clicking on either of the sections in the below list.
- Create Git repository for SQL Pool in Azure DevOps
- Synchronize Azure Synapse SQL Pool repository to Azure DevOps
One thing to note is that for my demo the repository name is different.
Before building the pipeline
Before building the pipeline, I stop the Azure Synapse Link for SQL Server 2022 that I set up. So that I can test that both methods work afterwards.
I do this by going into Synapse Studio, going to the Integrate section and selecting the link. I can then click on Stop.
Next, I made sure that all my sensitive information relating to connections was stored in Azure Key Vault. To reduce the risk of exposing sensitive details in my pipeline.
I cover how you can use Azure Key Vault with Azure DevOps in a blog post called ‘Keep your Azure Synapse secrets secret in Azure DevOps’.
Finally, I set up the service connection for my Azure DevOps project in Project Settings. So that it can connect to the Azure services that I wanted to update. Try to give these service connections a name which does not expose subscription information.
Azure Synapse Link for SQL Server 2022 build pipeline
After I did this and my database project was in Azure Repos I used the Azure Pipelines Classic Editor to create a build pipeline. Which is the GUI-based method you can use to create a build pipeline. As opposed to the code-based YAML pipelines.
I selected the classic editor by going into Azure Pipelines and then clicking on the ‘New Pipeline’ button. From there I selected the ‘Use the classic editor’ option.
Once done, I selected the agent I wanted to use to run the pipeline. For the benefit of the demo, I selected a self-hosted agent that was running on my laptop. Because it could connect to my SQL Server 2022 VM in Azure and my Azure Synapse Analytics dedicated SQL Pool.
After doing the above, I linked the pipeline to the variable group that I had created in the variable groups section.
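For reference, a rough YAML equivalent of these two settings would look something like the below sketch. The pool and variable group names are placeholders for my own setup.

```yaml
# Hypothetical YAML equivalent of the agent and variable group settings
pool:
  name: 'SelfHostedPool'         # self-hosted agent that can reach both databases

variables:
  - group: 'SynapseLinkSecrets'  # variable group linked to Azure Key Vault
```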
Tasks in the Classic Editor
From there, I created tasks in the Classic Editor to create two separate dacpacs.
One dacpac to deploy updates to the SQL Server 2022 database and another to deploy updates to only a couple of tables that are in the database project to the dedicated SQL Pool. Like in the below example.
Even though the above is fairly self-documenting for some people, here is a diagram of what these tasks essentially do:
It is a lot to take in. Plus, some parts of this are complex. So, I will break these tasks down next.
Breakdown of build pipeline for Azure Synapse Link for SQL Server 2022 tasks
First of all, I build the dacpac for the SQL Server 2022 database in the “Build Database Project” task. Which creates a dacpac I can use to deploy the updates to the database in SQL Server 2022. I then publish the artifact to Azure DevOps in the next task.
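As a sketch, the below shows what roughly equivalent YAML tasks could look like for this part. The task versions, project file name and artifact name are assumptions based on my setup.

```yaml
# Hypothetical YAML equivalent of the first two tasks
- task: MSBuild@1
  displayName: 'Build Database Project'
  inputs:
    solution: 'WideWorldImportersDBProject.sqlproj'   # produces the SQL Server dacpac

- task: PublishBuildArtifacts@1
  displayName: 'Publish SQL Server dacpac'
  inputs:
    PathtoPublish: '$(Build.ArtifactStagingDirectory)'
    ArtifactName: 'SQLServerDacpac'                   # placeholder artifact name
```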
Afterwards a “Copy files” task copies the contents of the repository in Azure Repos to another location on the agent. So that I can change the files in the copied version to prepare it to create a second dacpac.
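In YAML form, that copy step could be sketched as below, assuming the staging folder matches the one that the later PowerShell tasks change directory into.

```yaml
# Hypothetical YAML equivalent of the "Copy files" task
- task: CopyFiles@2
  displayName: 'Copy repository to staging area'
  inputs:
    SourceFolder: '$(Build.SourcesDirectory)'
    Contents: '**'
    TargetFolder: '$(Build.ArtifactStagingDirectory)\staging'  # copied version to manipulate
```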
PowerShell tasks
Following that I run some PowerShell tasks. Because the second dacpac needs to be based on a database project for a different target platform. Plus, it needs to exclude various tables because there are only two tables in the dedicated SQL Pool.
The first one changes the target platform to be for a dedicated SQL Pool. It does this with the below code, which changes the Database Schema Provider (DSP) line.
# Change Target Platform in staging area
cd ..\a\staging
$Old_DSP = 'Microsoft.Data.Tools.Schema.Sql.Sql130DatabaseSchemaProvider'
$New_DSP = 'Microsoft.Data.Tools.Schema.Sql.SqlDwDatabaseSchemaProvider'
Rename-Item WideWorldImportersDBProject.sqlproj WideWorldImportersSQLPoolProject.sqlproj
(Get-Content "WideWorldImportersSQLPoolProject.sqlproj") -replace $Old_DSP, $New_DSP | Set-Content "WideWorldImportersSQLPoolProject.sqlproj"
The second one adds a list of SQL files in the database project to exclude. I had to do this so that it only deployed updates to the two database tables that are used as part of Azure Synapse Link for SQL Server 2022.
It is important to note here that you must keep the schema files for whichever tables you want to update. Here is a sample of the code to give you an idea of how it looks.
# Add excluded files
cd ..\a\staging
$Old_EOF = '</Project>'
$New_EOF = '<ItemGroup>
<Build Remove="Purchasing\**" />
<Build Remove="Sequences\**" />
<Build Remove="Storage\**" />
{MORE FOLDERS}
<Build Remove="Sales\Tables\BuyingGroups_Archive.sql" />
{MORE TABLES(FILES) IN THE SALES FOLDER}
</ItemGroup>
</Project>'
(Get-Content "WideWorldImportersSQLPoolProject.sqlproj") -replace $Old_EOF, $New_EOF | Set-Content "WideWorldImportersSQLPoolProject.sqlproj"
It then edits the SQL file for the Sales schema to remove content that can cause problems. As per the below example.
cd ..\a\staging\Security
# Remove erroneous content
(Get-Content "Sales.sql") | Where-Object { -not $_.Contains('GO') } | Set-Content "Sales.sql"
(Get-Content "Sales.sql") | Where-Object { -not $_.Contains('EXECUTE sp_addextendedproperty') } | Set-Content "Sales.sql"
# Then gracefully add a GO
$Old_EOF = ';'
$New_EOF = ';
GO'
(Get-Content "Sales.sql") -replace $Old_EOF, $New_EOF | Set-Content "Sales.sql"
type Sales.sql   # show the edited file in the pipeline log
It then does the same for the SQL files for the OrderLines and Orders tables. Remembering to use escape characters where required so that the special characters are treated literally in the comparisons. For example, the below code shows backslashes (\) being used as escape characters, followed by a replacement that swaps the default constraint for a plain NOT NULL (my own choice here, since the sysdatetime() default does not carry over to the dedicated SQL Pool).
$Old_LastEditedWhen = 'CONSTRAINT \[DF_Sales_OrderLines_LastEditedWhen\] DEFAULT \(sysdatetime\(\)\) NOT NULL'
(Get-Content "OrderLines.sql") -replace $Old_LastEditedWhen, 'NOT NULL' | Set-Content "OrderLines.sql"
Now, I must note one important point here. If a SQL file in the database project contains column types which are not supported by clustered columnstore indexes in dedicated SQL Pools, creating the dacpac for the SQL Pool will fail.
For example, the NVARCHAR(MAX) columns in the Orders table. You can work around this by converting the table to be a heap instead in the copied version of the database project. As you can see below.
$Old_EOF = 'ON \[USERDATA\] TEXTIMAGE_ON \[USERDATA\];'
$New_EOF = '
WITH
(
DISTRIBUTION = ROUND_ROBIN,
HEAP
)
GO'
(Get-Content "Orders.sql") -replace $Old_EOF, $New_EOF | Set-Content "Orders.sql"
Dedicated SQL Pool dacpac tasks
Anyway, once all the file manipulations have been done another task is run that creates a second dacpac based on the copied version of the database project. Which I also publish to Azure DevOps.
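For completeness, below is a sketch of what this second build could look like in YAML, using the renamed project file from the staging area. The paths and artifact name are assumptions, so adjust them to wherever your dacpac is output.

```yaml
# Hypothetical second build based on the manipulated project
- task: MSBuild@1
  displayName: 'Build SQL Pool Database Project'
  inputs:
    solution: '$(Build.ArtifactStagingDirectory)\staging\WideWorldImportersSQLPoolProject.sqlproj'

- task: PublishBuildArtifacts@1
  displayName: 'Publish SQL Pool dacpac'
  inputs:
    PathtoPublish: '$(Build.ArtifactStagingDirectory)\staging'  # adjust to the dacpac output folder
    ArtifactName: 'SQLPoolDacpac'                               # placeholder artifact name
```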
Once this process is completed, I have two dacpac files. One that will update the SQL Server 2022 database in its entirety. Another that will update only the required tables in the dedicated SQL Pool.
From there, we can deploy the updates using the Releases feature.
Deploying updates to both ends of Azure Synapse Link for SQL Server 2022 using Azure DevOps
To deploy the updates to both SQL Server 2022 and the dedicated SQL Pool using the dacpacs created by the build pipeline I used the Releases feature.
In the Releases feature I first setup the release to use the latest version of the artifacts in the Artifacts section. Afterwards I added two stages as you can see below.
The first stage deploys to the SQL Server 2022 database using the below tasks.
The second stage deploys to the dedicated SQL Pool.
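To give an idea of the tasks involved, below is a rough YAML sketch of the two stages. The task names exist in Azure DevOps, but all of the server, database and variable names are placeholders that would come from my Key Vault-backed variable group.

```yaml
# Stage 1: deploy the full dacpac to the SQL Server 2022 instance
- task: SqlDacpacDeploymentOnMachineGroup@0
  inputs:
    TaskType: 'dacpac'
    DacpacFile: '$(System.DefaultWorkingDirectory)\**\WideWorldImportersDBProject.dacpac'
    TargetMethod: 'server'
    ServerName: '$(SQLServerName)'       # placeholder variable names
    DatabaseName: 'WideWorldImporters'
    AuthScheme: 'sqlServerAuthentication'
    SqlUsername: '$(SQLUserName)'
    SqlPassword: '$(SQLPassword)'

# Stage 2: deploy the trimmed-down dacpac to the dedicated SQL Pool
- task: SqlAzureDacpacDeployment@1
  inputs:
    azureSubscription: 'MyServiceConnection'   # placeholder service connection name
    ServerName: '$(SynapseWorkspace).sql.azuresynapse.net'
    DatabaseName: '$(SQLPoolName)'
    SqlUsername: '$(SQLPoolUserName)'
    SqlPassword: '$(SQLPoolPassword)'
    DacpacFile: '$(System.DefaultWorkingDirectory)\**\WideWorldImportersSQLPoolProject.dacpac'
```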
Once the release had finished I started the Azure Synapse Link for SQL Server 2022 again in Synapse Studio to verify that it worked.
To make this release work seamlessly with the build pipeline on an ongoing basis I enabled Continuous Deployment. So that every time a new build finished in the Classic Editor a new release took place straight afterwards.
On a side note, you can do both the build and release stages together in a YAML pipeline. So that you can update both ends of Azure Synapse Link for SQL Server 2022 using Azure DevOps with one YAML file. Instead of various GUIs.
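As a rough outline, such a YAML pipeline could be structured like the below skeleton, with the earlier build tasks in the first stage and the deployments in the second. All of the names are placeholders.

```yaml
# Skeleton of a single multi-stage YAML pipeline
stages:
  - stage: Build
    jobs:
      - job: BuildDacpacs
        steps: []   # build both dacpacs and publish them as pipeline artifacts

  - stage: Deploy
    dependsOn: Build
    jobs:
      - deployment: DeployBothEnds
        environment: 'Production'
        strategy:
          runOnce:
            deploy:
              steps: []   # deploy to SQL Server 2022, then to the dedicated SQL Pool
```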
Final words
Well, I hope this post about how to update both ends of Azure Synapse Link for SQL Server 2022 using Azure DevOps helps some of you.
Plus, I hope the section on how to manipulate database project files in a pipeline inspires some of you to come up with some interesting solutions of your own.
Of course, if you have any comments or queries about this post feel free to reach out to me.