Leverage the power of Purview Data Governance Metadata Insights through self-service analytics.

Purview data governance metadata overview

Microsoft Purview Data Governance provides comprehensive enterprise data governance solutions designed to help organizations across all industries manage and improve the health of their data estates. It offers a range of out-of-the-box reports that give customers insight into the health of their data governance controls. However, every customer has unique reporting needs to support specific use cases, assess their data estate health, align with OKRs, and govern their data assets to meet business goals. While Purview’s standard reports meet common data governance metadata requirements, they may not address every customer's unique needs. Customization of the standard reports is not enabled, because it would add complexity and weight to them.

To support democratized and responsible data governance practices, we are publishing all data governance metadata to OneLake. This opens up limitless opportunities for Purview Data Governance customers to analyze their entire data estate metadata, derive insights, and integrate those insights with other metadata to enhance the overall health of their data estate.

By empowering data analysts and stewards to analyze and derive insights from Microsoft Purview Data Governance metadata, customers gain full flexibility in choosing their own tools and compute power (bring your own compute, BYOC). This enables them to better manage and improve the health of their data estate, create leadership reports, drive fact-based decision-making, and foster a culture of data governance throughout their organization.

Subscribe to data governance metadata

To subscribe to Purview data governance metadata, please refer to the Microsoft Purview Learn documentation. A step-by-step guide is provided in the document to assist you through the process.

Two 3NF (Third Normal Form) data models, domains and dimensions, have been published for subscription to your OneLake workspace. The complete set of data governance metadata is refreshed daily, and subscribers can configure the refresh schedule to suit their needs. The Purview data governance metadata set includes:

  • Data governance domains
  • Data products
  • Data assets
  • Glossary terms
  • Custom attributes
  • Data subscription and access
  • Data health controls
  • Data health actions
  • Data Quality rules
  • Data Quality scores
  • Data assets schema
  • Critical data elements
  • Objectives and Key Results (OKR)
  • Schedules and alerts
  • Data Quality actions
  • Health controls actions
  • Roles
  • And more to come.

Create a semantic model to derive insights

To create a semantic model, you first need to create OneLake shortcuts for both the domains model and the dimensions model. Select the target path and the shortcut path to create each shortcut. The target path is the location the shortcut points to, while the shortcut path is the location where the shortcut appears. In OneLake, shortcuts appear as folders, and any workload or service with access to OneLake can use them.

A semantic model is required before building any reports with Purview data governance metadata. Begin by selecting the tables to include in the semantic model. You can also add other semantic modeling properties, such as hierarchies and descriptions. These properties are then used to define the tables in the Power BI semantic model.

Derive insights from metadata

Data Governance Metadata Insights involve using metadata to assess the current state of an organization's data estate health and to manage, control, and improve governance practices. In a well-governed data ecosystem, metadata plays a crucial role in understanding how data is created, accessed, stored, and utilized, offering key insights to ensure compliance, maintain data quality, and enhance operational efficiency. It provides business context for data, including data definitions, business rules, use cases, glossaries, and descriptions, ensuring that data is consistently understood and interpreted across the organization.

For instance, a metadata description might clarify that a column labeled "Customer ID" is a unique identifier used across all departments. More broadly, operational metadata—such as tracking data subscription requests, access validity, data freshness, and the data lifecycle (from origin to current state)—is also considered part of governance metadata.

Data governance metadata also provides information about policies, standards, and rules applied to data to ensure compliance with governance frameworks. This includes specifying who can access the data, how it should be handled, and how long it should be retained. For example, a policy might state that only users in the finance department can access sensitive payroll data, along with specific rules for data access approval and retention periods.

There are limitless opportunities to derive insights from Microsoft Purview data governance metadata. Here are a few examples:

  • Data Quality Metrics: Insights into accuracy, completeness, consistency, and freshness help organizations identify and resolve data quality issues proactively. For instance, if metadata shows a key data field is only 80% complete, actions can be taken to fill gaps or improve data collection processes. Alerts can be configured using tools like Data Activator, Power Automate, or custom-built in-house solutions.
  • Regulatory Compliance: Insights into privacy, data retention, classification, and access policies help organizations stay compliant with regulatory requirements, supporting responsible data governance. Much of this information can be reported and used to generate alerts if any predefined policies are violated.
  • Ownership and Stewardship: Metadata provides insights into who is accountable for governance tasks, such as data product ownership and stewardship. A report, for example, can highlight the data stewards responsible for managing sales data and outline necessary governance actions.
  • Governance and Curation: Metadata related to data domains, products, critical data elements, glossary terms, and data health offers leadership a clear view of the overall health of their data estate and the maturity of their governance practices.
  • Data Health Controls: Metadata on thresholds, scores, and the actions raised for rule failures or missed thresholds enables continuous improvement of the organization’s data estate health.
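The completeness example in the list above can be sketched in a few lines of Python. This is a minimal illustration, not Purview's scoring logic: the function names, the 95% threshold, and the sample values are all assumptions made for the example.

```python
from typing import Iterable, Optional


def completeness(values: Iterable[Optional[str]]) -> float:
    """Return the share of values that are populated (not None or blank)."""
    items = list(values)
    if not items:
        return 0.0
    filled = sum(1 for v in items if v is not None and str(v).strip() != "")
    return filled / len(items)


def needs_action(values: Iterable[Optional[str]], threshold: float = 0.95) -> bool:
    """True when completeness falls below the threshold (hypothetical default)."""
    return completeness(values) < threshold


# Hypothetical sample: ten email values, two of them missing or blank.
emails = ["a@x.com", "b@x.com", None, "c@x.com", "", "d@x.com",
          "e@x.com", "f@x.com", "g@x.com", "h@x.com"]

print(f"completeness = {completeness(emails):.0%}")
print("needs action:", needs_action(emails))
```

In practice the input would come from the Data Quality scores published to OneLake rather than from raw column values; the sketch only shows how a completeness figure relates to a threshold decision.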

Monitoring and alerting

Initiating alerting and monitoring using data governance metadata involves setting up systems that continuously track and assess metadata, triggering alerts or actions when governance policies, data health scores, or data quality metrics deviate from expected standards. Continuous monitoring and alerting ensure that all data governance aspects are maintained without requiring manual oversight.

To begin, identify the key metrics you want to monitor within your data governance framework. These metrics will define what needs to be tracked and what should trigger alerts. Below are some examples of what can be monitored to improve data estate health:

  • Thresholds for Data Accuracy, Completeness, and Consistency: Set minimum thresholds for these metrics, and trigger alerts if data falls below the defined levels. For example, monitor the completeness of customer records to ensure that certain fields (e.g., name, email) are consistently populated at 100%.
  • Data Health Control Scores: Establish rules and thresholds to validate control scores, with alerts triggered when they drop below the expected levels.
  • Unauthorized Access: Track unauthorized access by linking access or subscription metadata with your system's physical access telemetry. This allows you to detect discrepancies and trigger alerts when unauthorized access occurs.
  • Incomplete or Missing Lineage: Compare lineage metadata with your data flow metadata and trigger alerts for inconsistencies that may disrupt data flow tracking.
  • Data Access Policy Violations: Monitor access to personally identifiable information (PII) or sensitive data. Track who is accessing the data and for how long, and ensure that access stays within policy limitations. Set rules for data access approval, and trigger alerts if access is granted without proper authorization or in violation of the policy.
  • Retention Policy Compliance: Set alerts for data that is not archived or deleted in accordance with established retention policies. You can link Purview data governance metadata with your system's data to track when data exceeds its retention period or is incorrectly archived.
  • Data Freshness and Processing Delays: Use data freshness scores to monitor and prevent delays in data processing. For example, if data from a key source is not refreshed within the agreed SLA or scheduled time, an alert can be triggered.
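Several of the checks listed above reduce to comparing metadata against a threshold, or comparing two sets of records. The following Python sketch shows one possible shape for such checks. The `AssetStatus` fields, the default thresholds, and the sample data are hypothetical and do not reflect the published Purview schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AssetStatus:
    """One row of (hypothetical) health metadata for a data asset."""
    name: str
    completeness: float        # share of populated values, 0.0 to 1.0
    control_score: float       # health control score, 0.0 to 1.0
    last_refreshed: datetime   # time of the last successful refresh


def evaluate(asset, now, min_completeness=1.0, min_control_score=0.8,
             max_age=timedelta(hours=24)):
    """Return a list of alert messages for one asset; empty means healthy."""
    alerts = []
    if asset.completeness < min_completeness:
        alerts.append(f"{asset.name}: completeness {asset.completeness:.0%} "
                      f"is below {min_completeness:.0%}")
    if asset.control_score < min_control_score:
        alerts.append(f"{asset.name}: control score {asset.control_score:.2f} "
                      f"is below {min_control_score:.2f}")
    if now - asset.last_refreshed > max_age:
        alerts.append(f"{asset.name}: not refreshed within {max_age}")
    return alerts


def unauthorized_access(approved, observed):
    """Return observed (user, asset) pairs that have no approved subscription."""
    return sorted(set(observed) - set(approved))


# Hypothetical sample data.
now = datetime(2025, 1, 2, 12, 0)
customers = AssetStatus("customers", 0.97, 0.85, datetime(2025, 1, 1, 6, 0))
for message in evaluate(customers, now):
    print(message)

approved = [("ana", "payroll"), ("raj", "sales")]     # from subscription metadata
observed = [("ana", "payroll"), ("lee", "payroll")]   # from access telemetry
print(unauthorized_access(approved, observed))
```

The set difference in `unauthorized_access` is the simplest form of linking subscription metadata with access telemetry. A real implementation would also need to account for access validity periods.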

Automate Alerts Using Data Activator

To make alerting efficient, you can use Fabric Data Activator to connect to the Purview metadata in OneLake. Data Activator is a no-code Microsoft Fabric experience that empowers business analysts to drive actions automatically from data. It provides a single place to define actionable patterns in your data. These patterns can range from simple thresholds (such as a value being exceeded) to more complex patterns over time (such as a value trending down).
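To make the two kinds of pattern concrete, here is a small Python sketch of a simple threshold check and a "trending down" check. It only illustrates the ideas and says nothing about how Data Activator evaluates patterns internally. The function names, the window size, and the sample scores are assumptions.

```python
from typing import Sequence


def exceeds_threshold(values: Sequence[float], limit: float) -> bool:
    """Simple pattern: the latest value is above the limit."""
    return bool(values) and values[-1] > limit


def is_trending_down(values: Sequence[float], window: int = 3) -> bool:
    """Pattern over time: each of the last `window` steps is a decrease."""
    if window < 1 or len(values) < window + 1:
        return False
    recent = values[-(window + 1):]
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))


# Hypothetical daily data quality scores for one data product.
scores = [0.92, 0.90, 0.87, 0.81]
print("trending down:", is_trending_down(scores))

# Hypothetical count of open data health actions, with a limit of 10.
open_actions = [3, 7, 12]
print("threshold exceeded:", exceeds_threshold(open_actions, 10))
```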

When Data Activator detects an actionable pattern, it triggers an action. This action can be an email or a Teams alert sent to the relevant person in your organization. You can define alert conditions using metadata tags or monitoring thresholds. As an example, consider AI model output produced from a candidate dataset in which 69% of the candidate ages were between 18 and 34. A bias detection rule is configured to send an alert to the responsible data steward if the percentage of candidates under 35 years old exceeds 50%.
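The bias detection rule described above can be expressed as a short Python check. This is a sketch under stated assumptions, not the Data Activator configuration itself: the function names are hypothetical, and the sample ages are invented so that 69% of candidates fall under 35, matching the figure in the example.

```python
def share_under(ages, limit):
    """Return the fraction of ages strictly below `limit`."""
    if not ages:
        return 0.0
    return sum(1 for age in ages if age < limit) / len(ages)


def bias_alert(ages, age_limit=35, max_share=0.5):
    """Return an alert message if too many candidates are under `age_limit`."""
    share = share_under(ages, age_limit)
    if share > max_share:
        return (f"Bias alert: {share:.0%} of candidates are under {age_limit} "
                f"(limit {max_share:.0%})")
    return None


# Hypothetical dataset: 69 of 100 candidates are under 35.
ages = [25] * 69 + [45] * 31
print(bias_alert(ages))
```

In a real setup, the returned message would be routed to the responsible data steward by email or Teams rather than printed.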

Create a Report and generate a summary with Copilot

You can create a dashboard to display data governance and quality status, data estate controls status, compliance metrics, and any triggered alerts, allowing data stewards and governance teams to monitor the ongoing performance of their data estate. Such a report can also integrate Copilot to generate summaries and recommendations from the report content.

For leadership and business data users, you can generate a summary with Copilot. You have the flexibility to refine or guide the summary by customizing prompts, such as 'Summarize this page using bullet points' or 'Provide a summary of underperforming data estate controls.' You can also use Microsoft Power Automate to:

  • Integrate data alerts
  • Export and email reports to leadership
  • Export paginated reports and save them to OneDrive or SharePoint.

Conclusion

Metadata insights are crucial for effective data governance, providing visibility into the structure, usage, and lifecycle of data. By leveraging metadata, organizations can enhance data quality, ensure compliance, and improve the management of data assets. This approach not only safeguards data integrity and security but also fosters trust and transparency across the organization.

Purview data governance metadata enables Chief Data Officers (CDOs) and data governance leaders to monitor the progress of strategic initiatives closely. Combining the metadata generated by Purview with insights from other sources, such as time series data, allows organizations to create comprehensive reports. For example, such reports could offer CDOs and leaders a one-page overview with drill-down functionality that summarizes ongoing data governance initiatives, shows their current status, and tracks the progress of data governance KPIs and metrics over time. This provides a holistic view of the entire data governance program rather than focusing on isolated components. Additionally, it enables the gamification of data governance, encouraging engagement. For instance, domain owners can create reports to track how well their data assets and products are curated according to the organization's metadata policies and review progress in data governance meetings.

Using metadata for alerting and monitoring allows organizations to establish a robust governance framework that proactively identifies and addresses issues before they impact business operations. This proactive approach ensures data quality, compliance, and security with minimal manual intervention, fostering trust and reliability throughout the data ecosystem.

Organizations can also use their own computing resources and tools to analyze Purview metadata, gaining valuable insights to refine their governance practices and improve the health of their data estate. By linking Purview metadata with raw and system data, they can uncover patterns and address anomalies in a democratized manner. Purview metadata in OneLake facilitates the creation of custom reports on data governance, compliance, and data health, offering limitless opportunities for analytics and strengthening governance practices.

Note: This article is co-authored with Marco Österlin, twoday Data & AI, Copenhagen, Denmark.


