This is the third article in a five-part series that examines the successful adoption of enterprise Artificial Intelligence (AI). This blog discusses AI and data governance best practices and what governance foundations must be in place for successful AI implementation. The other articles in the series address AI strategy, building a foundation for AI success, ethics in AI, and MNP’s delivery framework. If you missed the earlier blogs, start from the beginning, “An Introduction Building an Artificial Intelligence Strategy.” You can find it here.
Beyond a strong AI strategy and a proper foundation, there’s another secret to successfully scaling enterprise AI: a clear AI/data governance framework. AI and machine learning (ML) run on data. If you aren’t properly managing your datasets in terms of compliance, integrity, quality, context, and connection with other datasets, or if your data is siloed and inconsistently classified, then you can’t scale AI. We have seen clients far into their AI journey who did not put a proper governance strategy and policies in place from the beginning and who later suffered from uncontrolled usage, poorly performing AI applications, and an organizational structure unable to support them.
According to Gartner, through 2022, only 20% of analytics insights will deliver business outcomes. According to MIT Sloan, this low rate of insight-to-outcome translation is “because most companies aren’t following a set of established best practices, operating instead from a mostly haphazard and unproven playbook.”
Or, as reported in the 2020 State of Data Governance and Automation, “Many organizations have started their data governance journeys to achieve data intelligence, but they have not automated their data operations to create sustainable and repeatable practices. Without an accurate, high-quality, real-time enterprise data pipeline, it can be difficult to uncover the necessary insights to make optimal business decisions.”
Prashanth Southekal – business analytics author, professor, and head of the Data for Business Performance Institute – states that in his experience, “most companies have a lot of resources, they have the technology and very smart people, and they have tons and tons of data. But [success] isn’t about data collection, it’s about data management and insight.”
Not all data is created equal, so treat it accordingly. A company’s data hierarchy needs to be properly identified and supported. Some data, deemed mission-critical, receives “VID” (very important data) treatment and is stored in easily accessible, high-quality systems. Other data is stored in the cheapest possible manner to minimize costs. Some functionally applicable data sits somewhere in between.
It’s critical to understand exactly who is accountable for managing that data and to ensure that a common definition of the data is upheld across the organization. At every point, humans should understand what the AI is doing and what data is going into the AI solution. A good governance framework ensures that the AI solution can be trusted and that control of the solution always stays in the hands of the organization.
An AI/data governance framework brings efficiency and transparency to your AI implementation and is part of a sound data strategy. It’s an ongoing process that starts before an AI project begins, runs through the various project stages, and continues once the AI is scaled. At MNP, we employ a five-step approach to AI governance.
The first step, Opportunity Identification, is to identify the areas where intelligent technology can benefit the organization and to set expectations for what it must achieve. Essentially, this is where an organization scopes its project, identifies all the benefits and risks, and ensures that the upside outweighs any downsides.
Once the project is scoped and the goals are set, identify the required data sources. Here, an organization must make every effort to ensure that its data is unbiased. This could mean collecting data from a variety of sources or using industry-standard techniques for unbalanced or biased data, such as under- and over-sampling. Beyond bias within the data source, it is equally important to ensure that there is no bias in how the data is collected. Finally, if it is ultimately impossible to remove bias from the data, keep that bias in mind and account for it when making predictions.
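To make under- and over-sampling concrete, here is a minimal sketch in plain Python. It assumes the simplest variant of each technique (random duplication of minority-class rows, random removal of majority-class rows); the function name `resample` and its parameters are illustrative, not part of any MNP tooling or specific library.

```python
import random
from collections import defaultdict


def resample(rows, label_of, strategy="over", seed=0):
    """Balance classes by random over- or under-sampling.

    rows      -- list of records (any type)
    label_of  -- function returning the class label of a record
    strategy  -- "over":  duplicate rows until every class matches the largest
                 "under": drop rows until every class matches the smallest
    seed      -- fixed seed so the result is reproducible
    """
    if strategy not in ("over", "under"):
        raise ValueError("strategy must be 'over' or 'under'")
    if not rows:
        return []

    rng = random.Random(seed)

    # Group the records by class label.
    groups = defaultdict(list)
    for row in rows:
        groups[label_of(row)].append(row)

    sizes = [len(group) for group in groups.values()]
    target = max(sizes) if strategy == "over" else min(sizes)

    balanced = []
    for group in groups.values():
        if len(group) >= target:
            # Under-sampling (or already at target): pick rows without replacement.
            balanced.extend(rng.sample(group, target))
        else:
            # Over-sampling: keep every original row, then add random duplicates.
            balanced.extend(group + rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)
    return balanced
```

For a dataset with 90 rows of one class and 10 of another, `strategy="over"` would return 90 rows of each class and `strategy="under"` would return 10 of each. Over-sampling repeats the same minority rows, and under-sampling discards data, so either choice should be recorded as part of the governance documentation.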
Data security and privacy are crucial for ensuring the integrity of an AI solution and for being able to trust its predictions. This step considers jurisdictions and the regulations that apply in them (such as HIPAA, GDPR, and regional privacy laws). All data storage, data transfer processes, and applications must be evaluated for security.
Furthermore, some procedures can be put in place, such as regularly backing up data so that it is not lost, as well as conducting periodic scans to ensure the data is as expected.
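A periodic scan of this kind can be sketched in a few lines of Python. The example below is an assumption about what "as expected" might mean in practice: required fields are present, numeric fields fall inside an agreed range, and a backup copy still matches the live data. The names `scan_records` and `fingerprint`, and the field names in the usage note, are hypothetical.

```python
import hashlib
import json


def scan_records(records, required_fields, ranges):
    """Return a list of (row_index, problem) tuples.

    An empty list means every record passed the checks.
    required_fields -- field names that must be present and non-empty
    ranges          -- {field: (low, high)} bounds for numeric fields
    """
    problems = []
    for index, record in enumerate(records):
        for field in required_fields:
            if record.get(field) in (None, ""):
                problems.append((index, f"missing {field}"))
        for field, (low, high) in ranges.items():
            value = record.get(field)
            if value is None:
                continue  # absence is handled by required_fields
            if not isinstance(value, (int, float)):
                problems.append((index, f"{field} is not numeric: {value!r}"))
            elif not low <= value <= high:
                problems.append((index, f"{field} out of range: {value}"))
    return problems


def fingerprint(records):
    """SHA-256 digest of the records, used to compare a backup with its source."""
    canonical = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

A scheduled job could call `scan_records(rows, ["customer_id"], {"age": (0, 120)})` and alert the data owner whenever the returned list is not empty, and could compare `fingerprint(live_rows)` with `fingerprint(backup_rows)` immediately after each backup to confirm the copy is complete.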
Additionally, it is important to maintain the privacy of those whose data is being collected, or those who will be affected by the results of the AI solution.
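One common way to protect individuals in a training dataset is pseudonymization: replacing direct identifiers with keyed hashes and dropping fields the model does not need. The sketch below uses Python's standard `hmac` and `hashlib` modules. The article does not prescribe this technique; it is shown only as an illustration, and the function names, field names, and key handling are assumptions.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 hash (HMAC).

    The same value and key always give the same token, so records can
    still be joined, but the original value cannot be read back from it.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def protect_record(record, secret_key, id_fields=("email",), drop_fields=("name",)):
    """Return a copy of the record that is safer to pass to an AI pipeline.

    id_fields   -- hypothetical identifier fields to pseudonymize
    drop_fields -- hypothetical fields the model does not need at all
    """
    safe = {}
    for field, value in record.items():
        if field in drop_fields:
            continue
        if field in id_fields:
            safe[field] = pseudonymize(str(value), secret_key)
        else:
            safe[field] = value
    return safe
```

The secret key would need to be stored separately from the data, for example in a secrets manager. Pseudonymized data can still count as personal data under regulations such as GDPR, so a step like this supports a privacy review and does not replace one.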
At the Solution Delivery stage, the organization plans out and executes the implementation of the AI solution. This step involves a lot of exploration and experimentation on the collected data. The goal is to find the best possible solution for the project. This is an iterative process and involves many cycles of creating a solution, testing it out, improving it further, testing it again, and so on.
At the end of this step, the ideal result would be a fully-fledged AI solution that satisfies all the goals and expectations according to the metrics of success laid out in the Opportunity Identification step.
This step refers to how well the people developing, owning, and using the AI solution understand how it works, what it does, and what it is supposed to do. People must be able to critically evaluate any AI solution and not just blindly trust that it will do what is expected.
Our five-step AI Governance Framework outlines critical procedures to ensure control of the AI solution being developed. For each step, there are important questions that need to be answered as well as a role that owns that step. For example, when it comes to Solution Delivery, the AI Engineers are responsible for asking the appropriate questions and then answering them. We have created a simple checklist for project managers to follow during each step while creating the governance strategy for the organization.
You can’t have a data strategy without building in governance. The steps are only a rudimentary outline to help you think about how your organization governs its data. AI/data governance is multi-faceted and therefore complex to architect and establish.
To go deeper into creating a data strategy and laying the foundation for AI adoption, consider joining us for our five-day immersive AI workshop. The workshop will cover how to assess and build an AI strategy customized for your organization as well as how to get full value from your implementation and avoid unnecessary risk.
Connect with an MNP advisor to discuss your Artificial Intelligence strategy.
Dev Mishra leads the National Data Engineering and AI/ML Practice, a team of 15+ Senior Managers, Managers, and Azure Data Consultants. He and his team are focused on Azure, Databricks, and Alteryx, among other leading technologies. He has 16+ years of experience delivering large-scale transformation projects that leverage advanced analytics techniques for leading global organizations in North America.
Dawar Aziz Ahmad is a Machine Learning Engineer with the Data Engineering and AI & ML practice at MNP. He has experience developing machine learning models, data ETL pipelines and applications for some of the world’s leading technology companies. Dawar’s focus is on Machine Learning and Natural Language Processing.