In this post I want to cover how to create an assessment in Purview to view ESG data estate data in Fabric.
To be more precise, I cover how to create an assessment in Purview that ingests ESG data estate data collected in Microsoft Fabric, in order to perform an audit for the Corporate Sustainability Reporting Directive (CSRD).
CSRD reporting is becoming a mandatory requirement for a lot of companies now.
To help with some terminology here, ESG data estate is one of the capabilities offered as part of the Sustainability data solutions in Microsoft Fabric.
In fact, you can consider this post a follow-up to a previous post, where I covered testing of the new ESG data estate capability for sustainability reporting in Microsoft Fabric.
This is because you can only perform the steps here after populating your ESG data estate, and I show how you do them in this post.
I show images for locations in the new Purview Portal in this post. However, a lot of the links I share also show you the locations in the classic Compliance Manager portal.
Fiftieth Microsoft Fabric post
I am very proud to publish this post now for a variety of reasons, including the fact that this appears to be my fiftieth Microsoft Fabric related blog post.
Plus, I get to publish this post just after Microsoft Build, where Satya Nadella mentioned sustainability during the keynote, since it is an important topic at the moment.
By the end of this post, you will know how to create an assessment in Purview to view ESG data estate data created as part of the Sustainability data solutions in Fabric. Along the way I share plenty of links.
One key point I want to highlight beforehand is that the Sustainability data solutions in Microsoft Fabric are currently in preview and are subject to change.
Preparing ESG data estate in Microsoft Fabric
First, I had to run additional notebooks against the Lakehouse that contained the demo tables I had created in my previous post. These notebooks extract the data to a new location in the Lakehouse and transform it so that a Purview connector can read its contents.
So, I ran the notebooks recommended in the Microsoft guide on how to publish metrics data for downstream application consumption, specifying the year 2023.
Doing this created a new subfolder in the ‘Files’ section of my Lakehouse called ‘ReportingData’, which contained a subfolder for 2023.
In addition to individual subfolders for various topics required for CSRD, it also created a ‘metadata.json’ file.
According to the ESG data estate guide on how to publish metrics data for downstream application consumption, this file contains the metadata for all the metric extracts. It is used by the connector you create in Purview Compliance Manager.
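If you want to confirm the output before moving on to Purview, a quick check along the following lines can help. This is only a minimal sketch: the `summarize_extract` helper is a hypothetical name of my own, and I am assuming that ‘metadata.json’ sits alongside the topic subfolders in the year folder. The demo builds a throwaway folder that mimics the layout. In a Fabric notebook you would instead point it at the mounted Lakehouse path, which is typically under `/lakehouse/default/Files` when a default Lakehouse is attached.

```python
from pathlib import Path
import tempfile


def summarize_extract(year_folder: Path) -> dict:
    """List the topic subfolders and report whether metadata.json exists.

    Assumption: metadata.json lives directly inside the year folder.
    """
    topics = sorted(p.name for p in year_folder.iterdir() if p.is_dir())
    has_metadata = (year_folder / "metadata.json").is_file()
    return {"topics": topics, "has_metadata": has_metadata}


# Demo against a temporary folder that mimics ReportingData/2023.
with tempfile.TemporaryDirectory() as tmp:
    demo_year_folder = Path(tmp) / "ReportingData" / "2023"
    (demo_year_folder / "E3_4_1_1").mkdir(parents=True)
    (demo_year_folder / "metadata.json").write_text("{}")
    print(summarize_extract(demo_year_folder))
```

If `has_metadata` comes back as `False`, it is worth rerunning the notebooks before creating the connector.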
Create Purview service principal
Once all the notebooks had completed I went into Microsoft Entra to create an app identity, which automatically creates a service principal for the identity. In addition, I added it to a new Microsoft Entra group.
Once I had created the app identity and added it to a new group I went back into Microsoft Fabric. I then enabled the ‘Service principals can use Fabric APIs’ setting and added the new security group to it, as you can see below.
After I had done this I went into the workspace and shared the computed Lakehouse with the Purview app (service principal), giving it only the permission to read all Spark data. This allows it to read the files in the Lakehouse.
Before setting up a Purview connector
I had to do a couple of other things before I set up the Purview connector, which may also apply to you depending on your environment.
First of all, I had a message come up relating to permissions even though I was an existing admin in Purview.
To resolve this, I created a new group and added my account to it. I then assigned the new group to the Data Connector Admin role. You can manage role groups in the ‘Settings’ section within the new Purview Portal.
In addition, I had to turn on auditing as advised by Microsoft.
Microsoft documents how to turn on auditing. However, due to my environment I got the error message below relating to Enable-OrganizationCustomization. To resolve this I had to click on OK and wait for a while until the setting was enabled.
Be aware that patience is required because this can take some time.
Set up a Purview connector to import ESG data estate data in Fabric
Once the above items had been resolved I was ready to set up a Purview connector to import sustainability data, so that Purview can consume the data stored in the Microsoft Fabric Lakehouse.
I did this in the new Purview Portal by going to the Settings section as below.
First I had to enter a connector name. On the next screen I entered the URL to the Lakehouse path where the metrics are stored, plus an organizational unit.
You can find the organizational unit by looking for the ‘organizationalUnit’ property in the ‘metadata.json’ file I mentioned earlier in the post, like in the example below.
"organizationalUnit": {"id": "3", "name": "Contoso USA"}
One key point to remember is that it is perfectly valid for the organizational unit to be empty.
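If you would rather pull the value out with code than scroll through the file, something like the below works. Treat it as a rough sketch: `get_organizational_unit` is a hypothetical helper, and I am assuming the ‘organizationalUnit’ property sits at the top level of ‘metadata.json’, which may not match the real file. It returns `None` when the property is missing or empty, since an empty organizational unit is valid.

```python
import json
from typing import Optional


def get_organizational_unit(metadata_text: str) -> Optional[dict]:
    """Return the organizationalUnit object, or None if missing or empty.

    Assumption: the property is at the top level of metadata.json.
    """
    metadata = json.loads(metadata_text)
    unit = metadata.get("organizationalUnit")
    return unit or None


# Sample content based on the example shown above.
sample = '{"organizationalUnit": {"id": "3", "name": "Contoso USA"}}'
unit = get_organizational_unit(sample)
print(unit["name"] if unit else "(no organizational unit)")
```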
Anyway, after clicking next I entered the Client ID for the service principal along with a secret for it. Just a reminder that when you copy the secret from the Azure Portal, copy the contents of the ‘Value’ column and not the ‘Secret ID’.
After validating the connection I went on to create the connector. Once done, I was ready to create the assessment.
Create an assessment in Purview to view ESG data estate data
I created a new assessment by going into the Compliance Manager solution within the new Purview Portal.
Creating this assessment is fairly straightforward once everything else is set up, as you can see in the steps below.
- Click on add assessment.
- Select the ‘Corporate Sustainability Reporting Directive’ regulation.
- Give the assessment a name and add it to a group.
- Select the service and then click Finish.
I then waited a day for the data to appear, as Microsoft recommends in the guide on how to view metrics data against CSRD disclosure requirements in the Compliance Manager assessment.
I then returned to the Corporate Sustainability Reporting Directive Assessment in Purview.
Even though there is nothing on this page to signify that data had been ingested, I was able to view the ingested data by selecting ‘View all improvement actions’ at the bottom of the screen and searching for ‘E3-4-1-a’.
Just to check that the figures were legitimate I went back into the Microsoft Fabric Lakehouse that contains the ESG data estate data and navigated to Files->ReportingData->2023->E3_4_1_1.
I then opened up the JSON file for the metrics data. As you can see below, it contained the same values.
{"ReportingYear":2023,"TotalWaterConsumption":1526470.86888}
{"ReportingYear":2022,"TotalWaterConsumption":6102779.403360002}
{"ReportingYear":2021,"TotalWaterConsumption":null}
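If you have more than a handful of metrics to compare, you can load the file with a few lines of Python instead of reading it by eye. The sketch below assumes the file holds one JSON object per line, as in the extract above. The `load_metrics` helper is a name I made up for illustration.

```python
import json

# The three records shown above, one JSON object per line.
metrics_text = """{"ReportingYear":2023,"TotalWaterConsumption":1526470.86888}
{"ReportingYear":2022,"TotalWaterConsumption":6102779.403360002}
{"ReportingYear":2021,"TotalWaterConsumption":null}"""


def load_metrics(text: str) -> dict:
    """Map each reporting year to its total water consumption.

    JSON null becomes None, so years without data stay visible.
    """
    records = (json.loads(line) for line in text.splitlines() if line.strip())
    return {rec["ReportingYear"]: rec["TotalWaterConsumption"] for rec in records}


for year, value in sorted(load_metrics(metrics_text).items()):
    print(year, "no data" if value is None else value)
```

Keeping the null year as `None` instead of dropping it makes it easier to spot gaps when you compare against what Compliance Manager shows.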
Now the data is there I can work with it to audit CSRD disclosure metrics data.
Final words on how to create an assessment in Purview to view ESG data estate data in Fabric
I hope that sharing how to create an assessment in Purview to view ESG data estate data in Fabric helps some of you. I also hope it encourages more of you to look into this as well.
Especially now that a lot of companies are affected by the Corporate Sustainability Reporting Directive (CSRD), because this can help you report on it.
You can learn more about the Sustainability data solutions in Microsoft Fabric, along with other things, during the training day I will be co-presenting at Data Ceili.
There I will be co-presenting “Microsoft Fabric and its place in the Microsoft Intelligent Data Platform” alongside Pragati Jain.
Of course, if you have any comments or queries about this post feel free to reach out to me.