Review of the ESG metrics capability in Microsoft Fabric after the Microsoft Ignite announcement

In this post I want to share my review of the ESG metrics capability in Microsoft Fabric after the Microsoft Ignite announcement that the Industry Solutions workload is now Generally Available (GA).

Microsoft Fabric architecture diagram with Industry Solutions highlighted, which contains the ESG metrics capability

I am doing this because there is a new Environmental, Social and Governance (ESG) metrics capability that is part of the Sustainability data solutions in Fabric, and it is in public preview.

This capability is going to be important for a lot of organizations because it can help produce the Environmental, Social and Governance metrics that are required for the Corporate Sustainability Reporting Directive (CSRD), which will apply to a lot of companies over the next couple of years.

To clarify, I abbreviate Environmental, Social and Governance to ESG in this post for easier reading.

I wanted to compare the updates with the previous method, which I covered in an older post about testing the new ESG data estate capability for sustainability reporting in Microsoft Fabric, because there are significant changes.

In fact, there are so many changes that you can consider my older post obsolete, with this post being the new definitive one for ESG metrics within Microsoft Fabric.

By the end of this post, you will know my findings and some insights. Along the way I share plenty of links.

About the new ESG metrics capability in Microsoft Fabric

In the past when you deployed the ESG data estate capability it deployed everything to enable you to generate ESG metrics and to generate the output for Microsoft Purview.

This has now changed with the functionality effectively split in two. Now the ESG data estate capability only deploys the Lakehouses, along with the capability to generate demo data and the ESG schema tables in the Processed Lakehouse.

All of the logic to transform and aggregate the data has been moved to a new ESG metrics capability, including all the additional logic to generate the output required for Microsoft Purview.

At the moment the new ESG metrics capability is in preview, probably because it is new and subject to change. It is worth noting that the same applies to the Social and governance insights capability, which I suspect is for a similar reason.

However, the environmental data and insights and the Microsoft Azure emissions insights capabilities are both now GA.

Deploying the new ESG Metrics

In order to deploy ESG Metrics you must first deploy the initial ESG data estate. To start the process, you must deploy the Sustainability data solutions in Fabric in a workspace.

My personal preference is to select Industry Solutions in the bottom-left workload menu, because it provides quick access to additional options and learning materials.

Selecting the Industry Solutions workload whilst in a workspace

After clicking to deploy Sustainability solutions the new Sustainability welcome screen appears.

Sustainability solutions screen, where you can select both ESG data estate and ESG metrics

To perform a clean install, you click on ESG data estate. This still shows the same Lakehouse diagram on the right-hand side as before, along with the significantly reduced list of items it now deploys.

ESG data estate deployment screen

You can think of the three Lakehouses it shows as being similar to the Medallion architecture, because the Processed Lakehouse contains nicely defined delta tables based on a well-defined ESG schema, and the Computed Lakehouse contains aggregations.
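
One way to picture that comparison is the small sketch below. It is only an illustration: the bronze, silver and gold labels, the description of the Raw Lakehouse and the `layer_for` helper are assumptions made for this example and are not part of the deployed capability.

```python
# Illustrative only: an assumed mapping of the three Lakehouses onto
# Medallion-style layers. The layer labels are not part of the product.
MEDALLION_VIEW = {
    "Raw Lakehouse": ("bronze", "data as ingested from source systems"),
    "Processed Lakehouse": ("silver", "delta tables based on the ESG schema"),
    "Computed Lakehouse": ("gold", "aggregations"),
}


def layer_for(lakehouse: str) -> str:
    """Return the assumed Medallion layer for a Lakehouse name."""
    layer, _description = MEDALLION_VIEW[lakehouse]
    return layer


# Print the assumed mapping, one Lakehouse per line.
for name, (layer, description) in MEDALLION_VIEW.items():
    print(f"{name}: {layer} ({description})")
```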

To deploy the new ESG Metrics capability you can open the new Sustainability welcome screen again by selecting the Sustainability solution item, which is created in the workspace when the Sustainability solution is first deployed.

Selecting the Sustainability solution item in the Fabric workspace

From there, you can repeat the steps above to deploy the ESG metrics capability.

Observations about the new ESG metrics capability

When you look at the list of items that are deployed with the new ESG metrics capability you can see it is more than just a separation of the previously created items, which consisted mostly of notebooks and a sample Power BI report.

One immediate difference is that new items are provided with this new capability, including a new dashboard to view metrics and some helpful Data Pipelines.

Plus, the notebooks appear to be completely rewritten. They now contain more PySpark syntax and less SQL. In fact, it appears to be a completely reimagined way of creating the relevant ESG metrics.

According to the prebuilt ESG metrics library article, sixty-one prebuilt metric definitions are now included, as opposed to the thirty-two that were available when it was all part of ESG Data Estate.

Getting started with the ESG metrics capability in your workspace

Below is an example of how you can populate a workspace with ESG metrics. It is split into two parts: the first focuses on configuring the initial ESG data estate and the second focuses on the ESG metrics capability itself.

Configuring the ESG data estate

Before you look to configure ESG metrics you need ESG data estate deployed and configured. Here is one way you can look to do that.

  1. Deploy the ESG data estate capability.
  2. Create empty ESG data model schema tables in the ProcessedESGData_LH Lakehouse.

In reality, you can look to customize your ESG data estate deployment depending on your requirements. For example, after creating schema tables you can look to create reference values.

Configuring ESG metrics

After deploying the ESG data estate and creating the tables in the ProcessedESGData_LH Lakehouse you can start working with the ESG metrics capability.

Below is one way you can do it. However, be aware that this is all fully customizable and you can add additional functionality if required. Plus, the steps and order are subject to change depending on the situation and preferences.

  1. Deploy the ESG metrics capability.
  2. Ingest data into the Raw Lakehouse. Microsoft provides a guide on how you can integrate ESG data into data Lakehouse.
  3. Check the metrics stored in the “metrics_definitions_config.json” file in the Config subfolder of the ConfigAndDemoData_LH Lakehouse. Modify if required and then run the LoadDefinitionsForMetrics_INTB notebook to load metric definitions into the ComputedESGMetrics_LH Lakehouse.
  4. Ingest, transform and load data into the ESG schema tables you created in the ProcessedESGData_LH Lakehouse.
  5. Configure authentication for the prebuilt DatasetForMetricsMeasures and DatasetForMetricsDashboard Semantic Models that are provided with the capability, as documented in the prerequisites to compute metrics data.
  6. Run the ExecuteToCreateAggregates_DTPL Data Pipeline to generate aggregate tables, specifying which sets of aggregate tables you require in the pipeline parameters.
  7. Refresh your DatasetForMetricsMeasures Semantic Model.
  8. Run the GenerateOutputForMetrics_INTB notebook to generate output data, which is stored in the ComputedESGMetrics table in the ComputedESGMetrics_LH Lakehouse.
  9. Run the TranslateMetricsOutputForReport_INTB notebook to extract and transform the data stored in the ComputedESGMetrics table into another table called ComputedESGMetricsForDashboard, which is the data source for the prebuilt dashboard that comes with the ESG Metrics capability.
  10. Refresh your DatasetForMetricsDashboard Semantic Model and then open the dashboard to view the updated data.
  11. Publish metrics data for auditing in Compliance Manager, which was shown during the sustainability session presented by Mahesh Narayanan during Microsoft Ignite.
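
If you want to track where you are in the steps above, they can be sketched as a simple checklist in Python. This is only a planning aid under stated assumptions: the `Step` class and the `next_step` and `checklist` helpers are hypothetical names made up for this example, and nothing here calls Fabric or runs any notebook or pipeline. The item names are the ones mentioned in this post.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Step:
    number: int
    action: str
    item: str  # Fabric item named in this post; "" when none is named


# The eleven steps above, in order. Purely descriptive, nothing is executed.
RUNBOOK = [
    Step(1, "Deploy the ESG metrics capability", ""),
    Step(2, "Ingest data into the Raw Lakehouse", ""),
    Step(3, "Check and load metric definitions", "LoadDefinitionsForMetrics_INTB"),
    Step(4, "Load data into the ESG schema tables", "ProcessedESGData_LH"),
    Step(5, "Configure Semantic Model authentication", "DatasetForMetricsMeasures"),
    Step(6, "Generate aggregate tables", "ExecuteToCreateAggregates_DTPL"),
    Step(7, "Refresh the measures Semantic Model", "DatasetForMetricsMeasures"),
    Step(8, "Generate metrics output", "GenerateOutputForMetrics_INTB"),
    Step(9, "Translate output for the dashboard", "TranslateMetricsOutputForReport_INTB"),
    Step(10, "Refresh the dashboard Semantic Model", "DatasetForMetricsDashboard"),
    Step(11, "Publish metrics data for Compliance Manager", "TranslateOutputOfMetricsForCM"),
]


def next_step(completed: Set[int]) -> Optional[Step]:
    """Return the first step, in order, that is not yet completed."""
    for step in RUNBOOK:
        if step.number not in completed:
            return step
    return None


def checklist(completed: Set[int]) -> str:
    """Render the runbook as a text checklist."""
    lines = []
    for step in RUNBOOK:
        mark = "x" if step.number in completed else " "
        suffix = f" ({step.item})" if step.item else ""
        lines.append(f"[{mark}] {step.number}. {step.action}{suffix}")
    return "\n".join(lines)


# Example: steps one to three are done, so step four is next.
print(checklist({1, 2, 3}))
```

Keeping the order in one list makes it easy to rearrange, since the steps and their order are subject to change depending on the situation.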

Microsoft Purview

In step eleven of the previous section, I mention publishing the metrics data for consumption by Microsoft Purview. It is worth noting that the new TranslateOutputOfMetricsForCM notebook now prepares the output differently.

It now outputs the metrics into folders with descriptive names relating to the individual points, as opposed to folders named after the requirement points like I showed in a previous post.

However, the points are still defined the same way in the “metadata.json” file, as you can see below:

New naming convention for published metrics data for Purview

Testing the ESG metrics capability with a sample pipeline

Microsoft clearly understands that looking into how the ESG metrics capability in Microsoft Fabric works can be daunting, which is why they provide a sample pipeline that allows you to explore the ESG metrics capability with demo data.

This pipeline can perform the majority of the ESG metrics steps that I detailed previously, as you can see in the example below.

ExecuteComputationForMetrics_DTPL sample pipeline

I say the majority of the process because it loads demo data directly into the nicely defined delta tables in the Processed Lakehouse and bypasses populating the Raw Lakehouse.

However, it does cover how to work with the demo data to prepare the final metrics for both Purview Compliance Manager and the metrics dashboard.

One key point I will highlight here is that you do not need to create the ESG tables in the Processed Lakehouse before running this pipeline. It will add the tables as part of the first activity.

Configure everything correctly before starting the Data Pipeline. Also, be aware that it can take a while to complete depending on your workspace configuration.

For example, it took fifty minutes for me to perform an initial deployment in a workspace with the default settings. So, I strongly recommend either fine-tuning or refactoring the pipeline.

You can also make a copy of this pipeline and adapt it for real-world use. One way to do this is to replace the activity to load demo data with activities to ingest data into the Raw Lakehouse and then process it into the ESG schema tables in the Processed Lakehouse.
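
To illustrate the idea, the swap can be sketched in plain Python as a list of activity names. This is a rough sketch, not the real pipeline definition: all of the activity names below are hypothetical placeholders, and `replace_activity` is a helper made up for this example. In practice you would make the change in the Data Pipeline editor.

```python
from typing import List


def replace_activity(activities: List[str], old: str, replacements: List[str]) -> List[str]:
    """Return a new activity list with `old` swapped for `replacements`, keeping the order."""
    if old not in activities:
        raise ValueError(f"Activity not found: {old}")
    result: List[str] = []
    for name in activities:
        if name == old:
            result.extend(replacements)
        else:
            result.append(name)
    return result


# Hypothetical activity names standing in for a copy of the sample pipeline.
sample_pipeline = [
    "Create ESG schema tables",
    "Load demo data",
    "Load metric definitions",
    "Generate aggregates",
    "Generate metrics output",
]

# Replace the demo data load with ingestion and processing activities.
real_world_pipeline = replace_activity(
    sample_pipeline,
    "Load demo data",
    ["Ingest data into Raw Lakehouse", "Process data into ESG schema tables"],
)

for position, name in enumerate(real_world_pipeline, start=1):
    print(position, name)
```

The original list is left untouched, which mirrors working on a copy of the sample pipeline instead of changing the one that was deployed.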

Final words about the ESG metrics capability in Microsoft Fabric

I hope this review of the ESG metrics capability in Microsoft Fabric provides an interesting summary for you. I must admit that, having worked with the initial ESG data estate capability, which aimed to deliver everything, I am impressed with the new ESG metrics capability.

To me it made sense to separate the logic to build the initial ESG data estate and the logic to generate the ESG metrics into two capabilities, because it makes it easier to customize and update the new ESG metrics capability.

Plus, I really like the effort that was put in to create and document all the new logic, to help people get up to speed with the offering better.

Anyway, my advice is to go through the documentation for both the ESG data estate and the ESG metrics capabilities in detail, in order to understand the new ESG metrics capability better and see how it can be customized for your environment.

Of course, if you have any comments or queries about this post feel free to reach out to me.
