
The great “number of workspaces for medallion architecture in Microsoft Fabric” debate


In this post I want to share my thoughts about the great “number of workspaces for medallion architecture in Microsoft Fabric” debate.

I was asked about it this week during the Learn Together session I did alongside Shabnam Watson. It is also a highly debated topic in our community, so I wanted to share my thoughts about it.

My personal opinion is that it depends. The number you choose depends on a variety of factors, which I intend to cover in this post.

By the end of this post, you will know my personal opinions as to why, plus plenty of things to consider when deciding on the number of workspaces to implement.

Along the way I also share plenty of links.

Medallion architecture recap

Before I go any further, I will do a short(ish) recap about the medallion architecture in Microsoft Fabric, both as a refresher for those who know it and to help those studying for the DP-600 Microsoft Fabric exam.

Basically, the medallion architecture is a suggested architecture paradigm where you ingest data and transform it in various layers, sometimes referred to as zones.

The aim is to ensure that data is reliable and consistent enough to be consumed elsewhere, and to give more peace of mind that the relevant data is stored securely (more on that later).

In reality, the medallion architecture is based on concepts that have been around for years.

Typically, there are three layers recommended when working with the medallion architecture, which are commonly known as bronze, silver and gold.

Three layers typically recommended for the medallion architecture

For those wondering, the colors for the layers above were created using the official color codes.

Typical data flow in the medallion architecture

Anyway, these layers have also been known by other names over the years, and still are by some. Typically, data flows between the layers as follows:

  • Source data gets ingested into the bronze layer in its original format, or the closest format to it.
  • Once in the bronze layer, data is extracted and transformed into a cleansed state in the silver layer. Typically, things that occur at this stage include removing duplicates and converting null values to a standardized value.
  • Afterwards, the data is extracted and transformed from the silver layer to the gold layer. During this stage data tends to be aggregated and prepared to be consumed elsewhere.
    For example, Power BI can consume it for BI purposes, or Microsoft Purview can consume it for assessments, like the ESG data estate data I covered in a previous post.
    One recommendation by Microsoft is that the data conforms to the common star schema design in this layer.
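To make the flow above a bit more concrete, here is a minimal sketch in plain Python. The sample rows, the column names and the "Unknown" placeholder are all assumptions made up for illustration. In practice this kind of logic would more likely run in a Fabric notebook or pipeline against Lakehouse tables, so treat it as a sketch of the idea rather than how it must be implemented.

```python
from collections import defaultdict

# Hypothetical raw rows as they might land in the bronze layer.
bronze = [
    {"order_id": 1, "region": "North", "amount": 100.0},
    {"order_id": 1, "region": "North", "amount": 100.0},  # duplicate row
    {"order_id": 2, "region": None, "amount": 50.0},      # missing region
    {"order_id": 3, "region": "South", "amount": 25.0},
]


def to_silver(rows, null_region="Unknown"):
    """Remove duplicate orders and replace null regions with a standard value."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["order_id"] in seen:
            continue  # skip rows already seen for this order_id
        seen.add(row["order_id"])
        region = row["region"] if row["region"] is not None else null_region
        cleaned.append({**row, "region": region})
    return cleaned


def to_gold(rows):
    """Aggregate the cleansed rows into a total amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

The bronze rows are left untouched, the silver step only cleans, and the gold step only aggregates, which mirrors the separation between the layers described above.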

Now, the reason I said typically is because the above is advisory and you can customize it to suit your business needs.

Anyway, you can find out more about this by going through the “Organize a Fabric lakehouse using medallion architecture design” Microsoft Learn module. Alternatively, you can find resources that explain this on the Microsoft Fabric Career Hub.

The great “number of workspaces for medallion architecture in Microsoft Fabric” debate

I know there has been a lot of debate about the number of workspaces required when looking to implement the medallion architecture.

Some people prefer to have one workspace for all three layers, and others recommend a separate workspace for each layer.

Personally, I think it depends on a few different things. I know it sounds like the typical answer from a data platform engineer, but it really does because there are so many factors involved.

Below are some factors which in my opinion you need to consider when thinking about the number of workspaces required for your medallion architecture.

Environment you are creating your medallion architecture in

You need to consider the environment you are creating your medallion architecture in. For example, creating in your own Microsoft Fabric environment is fine.

However, if the intention is to deploy to multiple workspaces elsewhere, you should test that as well.

Sensitivity of data in your medallion architecture

You must consider sensitivity of data. For example, dummy or sanitized data are good candidates for deploying the medallion architecture to a single workspace.

If you intend to deploy production data to a single workspace, you must take security very seriously for all the layers concerned.

For highly sensitive data, or data where there is a clear separation of duties required for each layer, multiple workspaces are more appropriate.

For example, you can end up in a situation where different personas want access to the different layers for various purposes, such as data scientists and developers requiring access to the raw layer and business analysts wanting access to the gold layer.

In this instance, to make sure that personas only get access to the data that they are required to work with, you can configure the layers in different workspaces.
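As a rough sketch of that idea, the mapping between personas, layers and workspaces can be written down explicitly before any access is granted. The workspace names below are hypothetical, and I have assumed the raw layer means bronze. It only models who should see what and does not call any Microsoft Fabric API.

```python
# Hypothetical workspace names, one per medallion layer.
LAYER_WORKSPACES = {
    "bronze": "ws-sales-bronze",
    "silver": "ws-sales-silver",
    "gold": "ws-sales-gold",
}

# Personas from the example above, mapped to the layers they need.
# Assumption: the "raw layer" is the bronze layer.
PERSONA_LAYERS = {
    "data_scientist": {"bronze"},
    "developer": {"bronze"},
    "business_analyst": {"gold"},
}


def workspaces_for(persona):
    """Return the sorted workspace names a persona should get access to."""
    layers = PERSONA_LAYERS.get(persona, set())
    return sorted(LAYER_WORKSPACES[layer] for layer in layers)


for persona in PERSONA_LAYERS:
    print(persona, "->", workspaces_for(persona))
```

With a single workspace every persona would map to the same place, which is why the separation only works when each layer has its own workspace.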

Governance requirements for each layer

You must consider governance requirements. For example, when clear separation of duties is required, like in the previous example, then multiple workspaces are a more appropriate solution.

Plus, you need to consider any Enterprise Architecture principles that are in place, or requirements from your security team.

Capacity requirements for each layer in the medallion architecture

Another point to consider is the capacity requirements for each layer.

For instance, you need an F64 Fabric capacity or above to ingest data sources with either trusted workspaces or managed virtual networks.

With this in mind, you must ask yourself if all the layers require these security features, and even whether they should.

In addition, there are other capacity considerations, including Copilot for Data Factory usage and the data sovereignty for each layer.

Fabric items used for each layer

In my previous diagram I showed the different layers represented as Lakehouses.

However, you can potentially look to implement Data Warehouses for the silver or gold layers, which can be beneficial if T-SQL is required or there is a strong preference to perform in-place Warehouse restores.

Medallion architecture with Data Warehouses

If so, one important point to consider is whether you want to move your Warehouses to different workspaces in order to avoid accidental restores. Even more so since it was announced recently that a UX for restores is coming soon.

Administering multiple Microsoft Fabric workspaces

When you are considering implementing multiple workspaces, you must ensure you are on top of Microsoft Fabric administration, because it does increase the complexity of your Microsoft Fabric estate.

Even more so when you need to implement separate workspaces to cater for Development, Test, Acceptance and Production (DTAP) environments, like I mentioned in a previous post, because then the number of workspaces you must support multiplies quickly.

It makes for some interesting mathematics. Especially if the intention is to do it for every data product.
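As a rough sketch of that mathematics, assume the total is simply the number of workspaces per data product, times the number of environments, times the number of data products. The function name and the figure of ten data products are made up for illustration.

```python
def workspace_count(workspaces_per_product, environments, data_products):
    """Total workspaces to administer, assuming every data product gets
    the same number of workspaces in every environment."""
    return workspaces_per_product * environments * data_products


DTAP_ENVIRONMENTS = 4   # Development, Test, Acceptance and Production
DATA_PRODUCTS = 10      # hypothetical number of data products

# One shared workspace for all three layers.
single = workspace_count(1, DTAP_ENVIRONMENTS, DATA_PRODUCTS)

# A separate workspace for each of the bronze, silver and gold layers.
per_layer = workspace_count(3, DTAP_ENVIRONMENTS, DATA_PRODUCTS)

print(single, per_layer)
```

Under these assumptions, splitting the layers into separate workspaces takes you from 40 workspaces to 120, which is worth keeping in mind before committing to it for every data product.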

Final words about the great “number of workspaces for medallion architecture in Microsoft Fabric” debate

I hope my post about the great “number of workspaces for medallion architecture in Microsoft Fabric” debate has given you food for thought.

To clarify, there are a lot of things to consider when deciding on the number of workspaces to implement as part of a medallion architecture. Luckily, there are plenty of resources online to help you make the decision.

Of course, if you have any comments or queries about this post feel free to reach out to me.

Published in DP-600, Microsoft Fabric
