This is a real-life Machine Learning use case about integrated risk.
Speakers: Thomas Rengersen, Product Owner of the Governance Risk and Compliance Tool for Rabobank, and Thomas Alderse Baas, Co-Founder and Director of The Bowmen Group.
ML in GRC 2021: Virtual Conference.
ML in GRC: Supporting Human Decision Making for Regulatory Adherence with Machine Learning
1. Machine Learning in GRC
Supporting Human Decision Making for Regulatory Adherence with Machine Learning
Thomas Alderse Baas
taldersebaas@bowmengroup.com
3. Approach to Building ML Applications
Keywords: start small, quick results, build trust, augmented intelligence to support human decision
making, increase workforce productivity, focus on qualitative / quantitative value, clearly defined success
criteria
The approach consists of five steps:
• Introduction to Machine Learning for stakeholders
• Identify use cases
• Sort use cases and plot them on a roadmap based on priority, complexity and business value
• Select the initial use case and build a Proof of Concept
• Deploy to Production; integrate and automate, predict and measure impact
4. Use Case: Regulatory Intelligence - The Challenge
Companies are receiving an ever-increasing number of regulatory items from various data providers and
issuing authorities, and must determine the relevance and impact of each item to their organisation.
5. Use Case: Regulatory Intelligence - The Numbers
On average, 450 new items per month are received. For each item, applicability, relevance and impact
to the organisation need to be determined based on the content.
Intelligence Type   Description                                               Percentage of Total
Regulatory          Applicable to the organisation; requires follow-up        25%
Informational       Applicable to the organisation; no follow-up required     40%
Not Relevant        Not applicable to the organisation                        35%
7. Machine Learning Use Case - Research Question
Can we determine the Intelligence Type based on the raw Content Provider data with a recall of
at least 75% and a precision of at least 70% for the "Regulatory" class?
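Given a held-out set of labelled items, the success criterion above can be checked directly. Below is a minimal, dependency-free sketch; the labels and predictions are invented for illustration:

```python
def class_precision_recall(y_true, y_pred, target):
    """Precision and recall for a single class in a multi-class setting."""
    tp = sum(t == target and p == target for t, p in zip(y_true, y_pred))
    fp = sum(t != target and p == target for t, p in zip(y_true, y_pred))
    fn = sum(t == target and p != target for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical gold labels and model predictions for a held-out set.
y_true = ["Regulatory", "Informational", "Not Relevant", "Regulatory",
          "Regulatory", "Informational", "Not Relevant", "Regulatory"]
y_pred = ["Regulatory", "Informational", "Informational", "Regulatory",
          "Regulatory", "Regulatory", "Not Relevant", "Regulatory"]

precision, recall = class_precision_recall(y_true, y_pred, "Regulatory")
# The model "passes" only if both thresholds from the research question hold.
success = recall >= 0.75 and precision >= 0.70
```

Scoring only the "Regulatory" class matters here: it is the smallest class (25% of items) but the one where a missed item has real compliance consequences.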
8. Machine Learning Process
We use the data provided by the content provider (stored in the GRC tool). Except for the prediction value, we will not use any data manually modified by the end user.
Research Question: Can we determine the Intelligence Type based on the raw Content Provider data with a recall of at least 75% and a precision of at least 70% for the "Regulatory" class?
The process:
• State the problem as an ML task
• Learn from data: data wrangling and feature engineering; create models, test and evaluate
• Deploy to production: integrate with the GRC tool, automate predictions, visualise model performance in the GRC tool, automate retraining
• Predict
• Measure the impact
9. Learn from Data: Generate Models
Regulatory data from the content provider, stored as records in the GRC Tool, is imported into BigML, where data wrangling and feature engineering are performed. Various models are then generated using various algorithms and hyperparameter settings. Each model is tested and evaluated, and the best performing model(s) are selected and used.
New data is run through the selected model to obtain predictions, which are automatically imported into the GRC tool and validated with the business. The model is periodically retrained and re-evaluated.
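The "generate several models, evaluate, keep the best" step happens inside the BigML platform; the sketch below reproduces the idea generically with scikit-learn (not BigML's actual API). The sample titles and labels are made up for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented intelligence-item titles and labels, standing in for the
# regulatory data imported from the GRC tool.
titles = [
    "New AML directive requires customer due diligence update",
    "Consultation paper on outsourcing guidelines published",
    "Annual report of the supervisory authority released",
    "Capital requirements regulation amendment enters into force",
    "Speech by central bank governor on digital currencies",
    "Updated reporting templates for liquidity coverage ratio",
]
labels = ["Regulatory", "Informational", "Not Relevant",
          "Regulatory", "Not Relevant", "Regulatory"]

# Candidate algorithms; in practice many more algorithm/hyperparameter
# combinations would be tried.
candidates = {
    "logreg": make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000)),
    "nb": make_pipeline(TfidfVectorizer(), MultinomialNB()),
}

# Evaluate each candidate with cross-validation and keep the best.
scores = {name: cross_val_score(model, titles, labels, cv=2).mean()
          for name, model in candidates.items()}
best_name = max(scores, key=scores.get)
best_model = candidates[best_name].fit(titles, labels)
```

On real data the evaluation metric would be the per-class recall/precision from the research question rather than plain accuracy.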
10. Anomaly Detection
Apart from a model, an anomaly detector has been built to understand if the new data is
recognised by the model. A high anomaly score indicates that the new data is very different
from the data the model has seen during training.
If many items have a low confidence score (which BigML provides together with a
prediction) and a high anomaly score, it can be an indication that the model needs to be
retrained.
A “Trust Level” field has been created to combine the confidence score and the anomaly
score into a Red, Amber or Green status.
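A minimal sketch of how such a combined field could work; the thresholds below are illustrative, not the values actually used:

```python
def trust_level(confidence, anomaly_score, conf_ok=0.80, anomaly_ok=0.60):
    """Combine a prediction's confidence score and anomaly score into a
    Red/Amber/Green trust status. Thresholds are hypothetical."""
    if confidence >= conf_ok and anomaly_score < anomaly_ok:
        return "Green"   # confident prediction on data familiar to the model
    if confidence < conf_ok and anomaly_score >= anomaly_ok:
        return "Red"     # low confidence on unfamiliar data: review, maybe retrain
    return "Amber"       # mixed signals: manual review advised
```

Monitoring the share of Red items over time then gives a simple, visual trigger for retraining the model.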
11. Machine Learning Workflow
The machine learning workflow shows all steps that have been taken from the source up until the
model. To make retraining the model as easy as possible, the steps in the workflow have all been
automated in BigML.
14. Results
The implementation of the first ML use case gave us a number of positive outcomes:
• Significant decrease in time required to keep the rules up-to-date (retraining the model vs analysing
and maintaining the manual business rules)
• Improvement of model / prediction quality over time vs equal or decrease in quality of manual business
rules with growing complexity
• Automation of work that is tedious and complex when done manually (i.e. analysing and writing the
growing number of business rules)
• Realisation that the data quality should be improved. This resulted in a project to improve the process
and formal definitions.
• Increased awareness of, and interest in, ML, triggering ideas for other use cases
15. Next Steps
The stakeholders are very enthusiastic about the results. A number of actions have been defined:
• Increase knowledge of ML within the team to better understand the capabilities and limitations of ML
• Determine what should be done to optimally incorporate the ML outcomes into the existing business
process
• Further improve data quality, and thereby the quality of the model and its predictions
• Identify and implement new use cases
• Find opportunities to apply ML in other domains
16. Next Steps – Potential Use Cases
• Improvement of Intelligence Type use case: There are a number of things that can be done to improve the
current use case, including improving the quality of the historical data, using the full text of the intelligence
item (instead of only the title and abstract), and fully automating the model retraining process.
• Predict intelligence item assignment: Each intelligence item is assigned to a team of specialised experts.
This assignment is also done using manual business rules, based on the content of the item. Often, the items
are not assigned to the correct team, negatively impacting efficiency and leading to frustration with the
experts.
• Quick Scan automation: For each relevant intelligence item a quick scan needs to be performed. Using ML,
this could be automated.
• Impact Assessment automation: For each relevant intelligence item with relevance medium or high, a full
impact assessment needs to be performed. This includes linking it to any relevant objects, such as business
processes, policies, risks and/or controls. These linkages can be automatically proposed.
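One simple way such linkages could be proposed is by text similarity between the intelligence item and each object's description. The sketch below uses plain word overlap (Jaccard similarity); a real system would likely use embeddings or a trained model, and the object names and threshold are hypothetical:

```python
def propose_links(item_text, objects, threshold=0.2):
    """Propose candidate objects (processes, policies, risks, controls)
    for an intelligence item, ranked by word-overlap similarity.
    `objects` maps object name -> description text."""
    item_words = set(item_text.lower().split())
    proposals = []
    for name, description in objects.items():
        words = set(description.lower().split())
        overlap = len(item_words & words) / len(item_words | words)
        if overlap >= threshold:
            proposals.append((name, round(overlap, 2)))
    return sorted(proposals, key=lambda p: p[1], reverse=True)

# Hypothetical objects from the GRC tool and an incoming item.
objects = {
    "AML policy": "customer due diligence and aml screening policy",
    "Liquidity risk": "liquidity coverage ratio monitoring",
}
links = propose_links("update to aml customer screening policy", objects)
```

The proposals would be shown to the expert as suggestions to confirm or reject, keeping the human in the loop, consistent with the augmented-intelligence approach above.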