How Nationwide Automates Enterprise Data Quality at Scale
November 26, 2024
One of the world’s largest insurance and financial services companies, Nationwide has more than 25,000 employees, is #75 on the Fortune 500 list, and does over $60B in sales annually. High-quality financial data drives key business decisions, reports, and analytics for the entire organization.
With Anomalo, Databricks, and Alation, Nationwide was able to build a best-of-breed solution for data governance. At the 2024 Databricks Data + AI World Tour Chicago, we talked to Mike Randall, Director – Enterprise Data Governance at Nationwide, about the results he’s seen, tips and best practices for data teams, and what’s next as data governance meets generative AI.
The Challenge: Complex data assembly lines, brittle manual rules
Nationwide handles critical financial data at a massive scale. Their data estate clocks in at over 13,000 databases across the enterprise, about 5,000 in production. Data at Nationwide flows through many complex transformations before being used for analytics and compliance. “You can think of it as the assembly line for a car,” Mike said.
Mike has been working in data quality at Nationwide for over 30 years. “I started out looking at BIN numbers to make sure they were keyed in correctly. I had a terminal in front of me, I had policies in front of me, I had books.” That was decades ago, but many data quality processes still felt stuck in time, relying on manual checks. For example, before adopting Anomalo, Nationwide’s marketing team had over 3,000 custom business rules to maintain.
The result was that data quality was mostly “reactive,” Mike said. Issues might not be detected until the final output went to a report or regulator. Cue a scramble to trace the root cause back through the assembly line. “At a company like Nationwide, which has so many different business units and layers, trying to find the right teams to work with is a struggle.”
As Mike worked on applying Six Sigma principles to data, he knew there had to be a better way to catch and fix issues proactively. He was looking for a cutting-edge tool to help the team focus—scaling impact without scaling hours, costs, or personnel.
The Solution: Anomalo’s AI-powered monitoring + Databricks and Alation integrations
The enterprise data office evaluated several different solutions for data quality, but what stood out was Anomalo’s AI-powered monitoring. It learns which data changes represent normal fluctuations, and which are true deviations worth investigating. “I put my name behind the Anomalo selection because Anomalo was the one bringing statistical analysis to the field,” Mike said.
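To make the idea of separating normal fluctuation from true deviation concrete, here is a minimal sketch using a robust z-score over a metric's recent history. This is not Anomalo's method, which the article does not describe in detail. The function name `is_anomalous`, the 3.5 threshold, and the sample row counts are all assumptions chosen for illustration.

```python
from statistics import median


def is_anomalous(history, today, threshold=3.5):
    """Flag `today` if it sits far outside the spread seen in `history`."""
    center = median(history)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median(abs(x - center) for x in history)
    if mad == 0:
        # No variation in history, so any change at all is a deviation.
        return today != center
    # 0.6745 rescales MAD so the score is comparable to a z-score
    # when the underlying data is roughly normal.
    robust_z = 0.6745 * (today - center) / mad
    return abs(robust_z) > threshold


# Hypothetical daily row counts for one table.
daily_row_counts = [10120, 9980, 10050, 10210, 9940, 10075, 10130]

print(is_anomalous(daily_row_counts, 10160))  # within normal day-to-day variation
print(is_anomalous(daily_row_counts, 4200))   # a sharp drop worth investigating
```

The point of learning the threshold from history, rather than hard-coding a rule like "row count must exceed 10,000," is that the check adapts as the table's normal behavior changes.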
Crucially, Anomalo also had deep partnerships and integrations with Databricks, Databricks Unity Catalog, and Alation, all key parts of Nationwide’s infrastructure.
- Databricks: “We’re getting a huge footprint of Databricks within Nationwide. We’re using Anomalo within Delta Live Tables and workflows, as well as Unity Catalog. And as we build out our Bronze, Silver, and Gold layers, we’re putting Anomalo in each of the different layers to make sure quality meets our expectations.”
- Alation: “We’re also using Alation as our data catalog for our metadata to support business users. What I like about Anomalo is, they also have an integration with Alation and we can link the quality of the data on those tables within Alation so people aren’t having to jump to different places, and they know the quality of the data is good.”
The Results: “The best day of my life in data governance”
“When we first brought Anomalo in, we put it on our most important analytics databases, and within that the most important table,” Mike said. “We had a really quick standup—within months we were up and running. It was probably the fastest I’ve ever seen anything at Nationwide move, in my 31 years.”
This kicked off a major shift in the organization’s approach to data quality. “In the past, everything was reactive,” Mike explained. “People found a problem, we dug through to find a root cause, and it always started with reporting. With Anomalo, we’re really excited because it’s helping us find those problems proactively. Being able to proactively address and catch issues and automate it, not having to have people write code, is a huge win.”
Since implementing Anomalo, Nationwide has been able to find and stop issues at the source, before they affect downstream reports and analytics. “We have some really good examples where Anomalo has detected a problem, it saved us from sending a bad report to regulators because of some business data, and we’ve been able to kick jobs off to fix the problem,” Mike said.
Going back to the marketing team that had 3,000 custom business rules, Mike shared the following:
“Since they started implementing Anomalo, Anomalo has actually found more data quality problems with its out-of-the-box, turnkey solution than their 3,000 rules.”
Now that the right tools are in place, it’s much easier for everyone to work together and implement best practices. “We have a Data Risk community at Nationwide, and they’ve created a policy: for our most important assets, you have to have it cataloged in Alation and you have to have data quality with Anomalo. That was the best day of my life in data governance, when they came up with that standard and policy.”
Breaking it Down: Advice for data teams
Mike shared the following advice for data governance leaders, data engineers, and data analysts looking to stand up data quality within their own organizations:
- Get some wins out of the gate. Standing up a fast and impactful POC with Anomalo was key to adoption. Mike advises, “Do the analytical stuff first, because it’s easy.”
- Data quality is a partnership between business and IT. “They have to work together, otherwise you’re only solving half of the equation.” Everyone’s job is quality, no matter where you are in the organization. “Whether you’re on the front lines entering something into a database, or an engineer, or anywhere else in the business, you have a responsibility to maintain the quality.”
- Finding the issue is only part of the battle. You then have to be able to remove the band-aids and duct tape from the system and get to the root cause. “With Anomalo we’re bringing everyone together and making remediation plans, so for instance, finance can talk to IT and make sure they’ve resolved the problem,” Mike said.
- Focus on your most important data. “We use the term ‘appropriate data quality,’” Mike said. “There are data assets that don’t need the same level of data quality as those used for reporting or regulatory reasons. Out of the 5,000 databases that we have, only 100 fall into that top tier.” With Anomalo, you can apply out-of-the-box data observability to all your data and then use deep data quality monitoring across your most important data assets.
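The tiering idea in the last point can be sketched as a simple policy: every asset gets baseline observability, and only assets used for reporting or regulatory purposes get the deeper checks. The check names, the `DataAsset` fields, and the asset names below are hypothetical and are not taken from Anomalo or from Nationwide's setup.

```python
from dataclasses import dataclass

# Hypothetical check names, for illustration only.
BASELINE_CHECKS = ["freshness", "row_volume"]
DEEP_CHECKS = BASELINE_CHECKS + ["null_rates", "distribution_drift", "business_rules"]


@dataclass
class DataAsset:
    name: str
    regulatory: bool = False  # feeds regulatory filings
    reporting: bool = False   # feeds key business reports


def checks_for(asset):
    """Every asset gets baseline observability; top-tier assets get deep checks."""
    if asset.regulatory or asset.reporting:
        return DEEP_CHECKS
    return BASELINE_CHECKS


assets = [DataAsset("finance.ledger", regulatory=True), DataAsset("sandbox.scratch")]
plan = {asset.name: checks_for(asset) for asset in assets}

for name, checks in plan.items():
    print(name, "->", ", ".join(checks))
```

Keeping the tier decision in one small function makes the policy easy to audit, and changing which assets count as top tier does not require touching the checks themselves.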
Looking Ahead: Unstructured data monitoring
What does the future of data governance hold? It’s clear that GenAI is becoming more and more critical, and data feeds these models—specifically, unstructured data from the enterprise.
“Data governance hasn’t really focused on unstructured data. We have to now,” Mike said. “One of our goals at Nationwide next year is to get our arms wrapped around what it means to govern unstructured data. And the main difference, compared to governing structured data, is going to be how we approach data quality.”
Monitoring unstructured data isn’t about finding nulls or missing data. It’s fundamentally different, and might require answering questions like “Does this document have PII?” or “Is there abusive language in it?” There are also more basic questions: what is drift in unstructured data, and how do you measure it? Mike’s take: “It’s going to come down to what the use case is and the business context. Every GenAI implementation may require a different definition of precision and quality.”
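As a rough illustration of what a check like “Does this document have PII?” might look like at its simplest, the sketch below scans text for two pattern types. It is an assumption-laden toy and not how Anomalo's product works: the patterns cover only US-style Social Security numbers and email addresses, and the names `PII_PATTERNS` and `find_pii` are invented for this example. Real PII detection needs far broader coverage and usually more than regular expressions.

```python
import re

# Illustrative patterns only: US-style SSNs and email addresses.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def find_pii(text):
    """Return the sorted names of PII patterns that match somewhere in `text`."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
    )


print(find_pii("Claim filed by jane.doe@example.com, SSN 123-45-6789."))
print(find_pii("Policy renewed for another twelve months."))
```

Even this toy version shows why the definition of quality depends on the use case: a document that fails this check might be unusable for training a model yet perfectly valid in a claims system.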
In June, Anomalo launched an unstructured data monitoring product to solve this problem hand in hand with customers like Nationwide. “It’s going to be an exciting journey,” Mike said. “It’s a new world for everybody.”
To learn more about our integration with Databricks, connect to Anomalo on Partner Connect. For more information about Anomalo and to explore how data quality will drive your business forward, request a demo.