
Unit tests on Microsoft Fabric items


In this post I want to share various ways that you can perform unit tests on Microsoft Fabric items, since a lot of people ask about unit testing in Microsoft Fabric these days and I want to provide some options.

To clarify, this post covers unit tests: tests on individual components, as opposed to integration tests, which test multiple components together.

In reality, there are many ways you can perform unit testing on items in Microsoft Fabric. Which one you decide to implement depends on a lot of factors, including complexity and personal preference.

With this in mind, I decided to highlight various methods in this post in order to answer some questions and highlight some possibilities. I share them in the below sections:

  • Unit testing Power BI reports and semantic models
  • Unit testing Microsoft Fabric Lakehouses, Data Warehouses and KQL Database tables
  • Unit tests on a Microsoft Fabric Data Warehouse
  • Unit testing Data Pipelines and Dataflows Gen2

I must stress this is a selective list of techniques you can look to work with, and others are available. Along the way I share plenty of links so that you can find out more about any of the mentioned methods that are of interest to you.

Unit testing Power BI reports and semantic models

I made a clear decision to group unit testing Power BI reports and semantic models (formerly datasets) together in this post, since they are both components of a Power BI Desktop project. However, you can test both separately, as you will see in this section.

Over the years people have come up with various ways to unit test Power BI reports and semantic models. Since Power BI Desktop projects were introduced, even more innovative ways to perform unit testing have become available.

For example, the below method, which I first covered in a post about Power BI Project (PBIP) and Azure DevOps CI performance tests, can perform unit tests against your Power BI reports and semantic models.

Unit testing Microsoft Fabric Power BI reports and semantic models

I must stress though that there are alternative options available to perform unit tests.

For instance, the Power BI Project (PBIP) and Azure DevOps build pipelines for continuous integration guide by Microsoft highlights working with the Best Practice Rules within a deployment pipeline in order to check semantic models.

However, depending on your requirements you can look to work with semantic link. Taking it one step further, you can potentially look to use semantic link with another recognized Python framework, such as pytest.
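
To give you a rough idea of what that could look like, below is a minimal sketch that pairs semantic link (sempy) with a pytest-style assertion in a Fabric notebook. It is only an illustration; "Sales Model" and "Total Sales" are placeholder names for your own semantic model and measure.

# A minimal sketch, not the only approach: evaluate a measure from a semantic
# model with semantic link (sempy) and assert on the result in pytest style.
# "Sales Model" and "Total Sales" are placeholders for your own model.
# %pip install semantic-link   # if sempy is not already available in your runtime
import sempy.fabric as fabric

def test_total_sales_is_never_negative():
    # evaluate_measure returns a pandas-style dataframe with one column per measure
    result = fabric.evaluate_measure(dataset="Sales Model", measure="Total Sales")
    assert (result["Total Sales"] >= 0).all(), "Total Sales returned a negative value"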

You can also look to work with the Invoke-DQVTesting PowerShell module created by John Kerski. John covers automating this in a post that covers Invoke-DQVTesting and orchestration.

At the end of the day, there are various ways you can perform unit testing for both Power BI reports and semantic models. Like I mentioned earlier in the post, you must consider various factors, including what you need to test and your personal preferences.

Unit testing Microsoft Fabric Lakehouses, Data Warehouses and KQL Database tables

I decided to cover ways that you can perform unit testing on Microsoft Fabric Lakehouses, Data Warehouses and KQL Database tables together for one very good reason.

Which is to highlight the fact that you can work with the same unit testing strategy thanks to the power of shortcuts within Lakehouses, like in the below example, which shows one table from a KQL Database and another from a Data Warehouse.

Lakehouse with multiple shortcuts

However, I must stress that in order for KQL Database tables to be added as shortcuts you must enable OneLake availability.

Enabling OneLake availability for a KQL Database table

Once they are added into a Lakehouse you can work with all three of them in notebooks with the testing framework of your choice. In fact, you can work with all three of them together in a notebook if you so wish.

Testing Microsoft Fabric Lakehouses, Data Warehouses and KQL Database tables with pytest

To demonstrate that pytest works with all three tables I created a new notebook that was connected to the Lakehouse and installed pytest with the below code.

%pip install pytest

I then ran the below code to load the data from the tables into Spark dataframes.

geographydf = spark.sql("SELECT * FROM AzDoUnitTestingLH.Geography")
metricstabledf = spark.sql("SELECT * FROM AzDoUnitTestingLH.MetricsTable")
publicholidaysdf = spark.sql("SELECT * FROM AzDoUnitTestingLH.publicholidays")

Afterwards, I ran the below code to convert the Spark dataframes to pandas dataframes so that I could work with the isnull functionality in this example.

# Convert the spark dataframes to pandas dataframes
pd_geographydf = geographydf.toPandas()
pd_metricsdf = metricstabledf.toPandas()
pd_publicholidaysdf = publicholidaysdf.toPandas()

I then created a test function that checks for missing values.

def test_no_missing_values(test_dataframe):
    df = test_dataframe
    missing = df.isnull().sum().sum()
    assert missing == 0, f"Found {missing} missing values"

Afterwards, I ran the function with the below syntax for all three dataframes.

test_no_missing_values(pd_publicholidaysdf)

Two of the tables returned an assertion error, including the one below for the publicholidays sample table.

AssertionError: Found 70621 missing values

I checked the publicholidays sample table to verify this figure. I confirmed that the figure is accurate due to missing values in the isPaidTimeOff and countryRegionCode columns.
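
If you want to see exactly where those missing values sit, a quick per-column count on the pandas dataframe shows it:

# Count missing values per column to see which columns contain the nulls
print(pd_publicholidaysdf.isnull().sum())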

Of course, this is just one example of working with a testing framework, because there are various other ways you can perform unit tests for the contents of Microsoft Fabric Lakehouses, Data Warehouses and KQL Database tables within notebooks.

Those of you who have worked with other data engineering services can experiment with testing frameworks you have worked with in the past to see if they are more suitable.

I say this because I suspect that more experienced data engineers will want to build a more comprehensive framework.
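
As a small step in that direction, the same check could be parametrized so that pytest collects it as three separate tests. Below is a minimal sketch, assuming the pandas dataframes from the earlier cells are available in the test module.

import pytest

# Run the same missing-value check against all three pandas dataframes,
# with a readable test id for each table
@pytest.mark.parametrize(
    "dataframe",
    [pd_geographydf, pd_metricsdf, pd_publicholidaysdf],
    ids=["Geography", "MetricsTable", "publicholidays"],
)
def test_no_missing_values(dataframe):
    missing = dataframe.isnull().sum().sum()
    assert missing == 0, f"Found {missing} missing values"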

One final note relating to unit testing tables is that you can also look to create a shortcut to a mirrored database to perform tests if there is a requirement.

Unit tests on a Microsoft Fabric Data Warehouse

When looking to perform unit tests on a Microsoft Fabric Data Warehouse you have a couple of interesting options.

The first option that I want to highlight is that those working with dbt can look to work with both its data testing and unit testing functionality. Microsoft shows an example of a test in action in its tutorial that covers how to set up dbt for Fabric Data Warehouse.

The second option is to perform unit tests with tSQLt. I will not be covering in detail how you can use tSQLt to perform unit tests on Microsoft Fabric Data Warehouses in this post, due to its level of complexity and the niche nature of the target audience.

In short, what you can do is first deploy updates, in a dacpac file based on a database project that contains the schema details of your Data Warehouse, to a SQL Server database and perform your tSQLt tests there.

Afterwards, you can deploy a dacpac containing the same schema to a Microsoft Fabric Data Warehouse. I created the below diagram to help visualize one way you can work with tSQLt.

As you can see, it is a reasonably complex process. You can simplify the process by creating one dacpac for updates and setting the deployment to ignore errors, but that can introduce other issues further down the line.

Those who are curious about this process can look at an older post of mine, which covers unit tests for an Azure Synapse Analytics dedicated SQL Pool. You can adapt the logic contained within that post.

For further information about creating a second database project for the tSQLt items, so that your data schema is kept separate from the tSQLt objects, I recommend a great blog post by Nikolai Thomassen.

Unit testing Data Pipelines and Dataflows Gen2

Microsoft has a testing framework for Data Factory available to create unit tests for Microsoft Fabric Data Pipelines, which also allows you to write unit tests for pipelines in Azure Data Factory and Azure Synapse Analytics.

Alternatively, you can copy your Microsoft Fabric Data Pipelines and Dataflows to an alternative workspace. Once done, you can change the destinations to locations in the alternative workspace, run them and then test the results there.

You can achieve this more easily with Microsoft Fabric Data Pipelines, since they are supported by Microsoft Fabric Git integration.

Since Microsoft Fabric Dataflows Gen2 are not yet supported by Microsoft Fabric Git integration, I suggest exporting your Dataflows with the export template feature.

This downloads a Power Query template locally for you so that you can import it into a newly created Dataflow in your alternative workspace. Once done, you can change the destinations and then publish your updated Dataflow.

I am looking forward to Dataflows Gen2 support in Microsoft Fabric Git integration, to allow for a more integrated process.

Final words

I hope that this post inspires some of you to explore various ways that you can perform unit tests on Microsoft Fabric items, because I know it is a topic that often gets talked about.

That is why I wanted to provide awareness of some of the options available for various Microsoft Fabric items.

Of course, if you have any comments or queries about this post feel free to reach out to me.
