6 min read · Tomash Bukowiecki

Automated KPIs Changed This Agency

When DC's regulatory agency couldn't answer basic performance questions without two analysts and three days, we built the infrastructure that made every KPI public. Here's exactly how.

The director of DC’s Department of Consumer and Regulatory Affairs needed a number. Specifically, she needed to know how many building permits were currently backlogged beyond their SLA — broken out by unit, by type, and by month. The question took two analysts three days to answer. When they came back, the numbers didn’t match.

That was the moment the conversation shifted from “we should probably improve our data” to “this is broken and we’re fixing it now.”

What 160 Hours a Month of Manual KPI Assembly Actually Looks Like

DCRA operated five distinct business units: Buildings, Business Licensing, Inspections, Residential, and Surveyors. Each had its own systems, its own data formats, and its own way of tracking performance. When the agency needed to report KPIs — which it did regularly, to the Mayor’s office and to the DC Council — the process looked like this:

An analyst in each unit would pull data from their system. They’d format it into a spreadsheet. Those spreadsheets would get emailed to a central analyst, who would reconcile them, check for obvious errors, and assemble a master report. Then someone would format that into a presentation deck.

This consumed roughly 160 hours of analyst time every month. And the results were still inconsistent — different units used different definitions for the same metrics, and the reconciliation step introduced its own errors when an analyst had to decide which conflicting number was “right.”

The worst part: by the time the report was assembled, the data was already stale. Leadership was making decisions based on a snapshot from three weeks ago.

The Real Problem Wasn’t the Spreadsheets

It would have been easy to look at this and say: “We need a dashboard.” That’s what most consultants would have proposed — wire up some BI tool to the existing systems, build some charts, call it done.

But the dashboards would have inherited every problem the spreadsheets had. The underlying issue was structural: there was no single place where DCRA’s data lived in a consistent, trustworthy format. You can’t build reliable dashboards on unreliable data.

So we built the data warehouse first.

Building the Foundation Nobody Sees

The data warehouse wasn’t glamorous work. It meant connecting to five separate source systems, understanding each one’s schema and quirks, building extraction pipelines, and — critically — defining a unified data model that could represent permits, licenses, inspections, and complaints in a way that was consistent across all five units.

Some of the problems we hit:

Definition mismatches. The Buildings unit counted a “completed permit” differently than Inspections counted a “completed inspection.” Same word, different meanings. We had to sit down with each unit and agree on canonical definitions before writing a single line of ETL.
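One way to make those canonical definitions stick is to encode them once, in code, so every unit's numbers are computed the same way. This is an illustrative sketch, not the project's actual ETL; the status values and field names are hypothetical:

```python
# Hypothetical canonical definitions, agreed with each unit before any ETL.
# "Completed" means different things for permits and inspections, so each
# gets its own explicit rule instead of relying on the shared word.

COMPLETED_PERMIT_STATUSES = {"issued", "closed"}       # agreed across units
COMPLETED_INSPECTION_RESULTS = {"passed", "failed"}    # a visit happened; "failed" still counts

def is_completed_permit(record: dict) -> bool:
    """A permit is 'completed' only when its final status is in the agreed set."""
    return record.get("status", "").lower() in COMPLETED_PERMIT_STATUSES

def is_completed_inspection(record: dict) -> bool:
    """An inspection is 'completed' when the inspector recorded any result."""
    return record.get("result", "").lower() in COMPLETED_INSPECTION_RESULTS

permits = [
    {"id": 1, "status": "Issued"},
    {"id": 2, "status": "pending"},
]
completed = [p for p in permits if is_completed_permit(p)]  # only permit 1
```

The point of putting this in code rather than a policy document: when a definition changes, it changes in exactly one place, and every downstream report picks it up.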

Historical data gaps. One unit had clean data going back four years. Another had six months and a bunch of Access databases from the early 2000s that no one could explain. We made a pragmatic call: start with what’s clean, backfill what’s recoverable, and clearly mark what’s missing rather than pretending it doesn’t exist.

Schema drift. Source systems changed without warning. A vendor update would add a column, rename a field, or change a date format. We built automated schema monitoring into the ingestion pipelines — if the upstream schema changed, we knew about it before the downstream reports broke.
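The drift check itself can be very simple. A minimal sketch of the idea, assuming each ingestion job can record a baseline of column names and types and compare the source against it on every run (column names here are invented for illustration):

```python
# Hypothetical schema-drift check: compare the columns a source currently
# exposes against a recorded baseline and flag additions, removals, or type
# changes before downstream reports break.

EXPECTED_SCHEMA = {
    "permit_id": "int",
    "permit_type": "str",
    "issued_date": "date",
}

def detect_schema_drift(observed: dict) -> dict:
    """Return columns added, removed, or retyped relative to the baseline."""
    added = sorted(set(observed) - set(EXPECTED_SCHEMA))
    removed = sorted(set(EXPECTED_SCHEMA) - set(observed))
    retyped = sorted(
        col for col in set(observed) & set(EXPECTED_SCHEMA)
        if observed[col] != EXPECTED_SCHEMA[col]
    )
    return {"added": added, "removed": removed, "retyped": retyped}

# A vendor update renames issued_date and adds a fee column:
drift = detect_schema_drift(
    {"permit_id": "int", "permit_type": "str", "issue_dt": "date", "fee": "float"}
)
# drift == {"added": ["fee", "issue_dt"], "removed": ["issued_date"], "retyped": []}
```

A rename shows up as one removal plus one addition, which is exactly the signal you want: a human decides whether it's a rename or two unrelated changes, but the pipeline refuses to guess silently.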

None of this is the kind of work that makes a good slide deck. But without it, the dashboards would have been just as unreliable as the spreadsheets they replaced.

From Warehouse to Automated KPIs

With the data warehouse in place, automating the KPI pipeline was almost straightforward. Almost.

The nightly pipeline pulled fresh data from all five units, applied the agreed-upon metric definitions, and pushed results into Tableau dashboards. What previously required a team of analysts working for days now ran unattended overnight.

But “automated” doesn’t mean “fire and forget.” We built validation checks into every stage of the pipeline:

  • Completeness checks: Did every source system deliver data? If a source was missing, the pipeline flagged it rather than silently reporting zeros.
  • Range checks: Are today’s numbers within a reasonable range of yesterday’s? A permit count that suddenly doubled probably means a data problem, not a productivity miracle.
  • Reconciliation: The pipeline automatically compared its totals against the source systems to catch transformation errors.
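The three checks above can be sketched in a few lines. This is an illustrative minimum, with invented unit names and a made-up tolerance threshold, not the production implementation:

```python
# Minimal sketches of the three validation stages: completeness, range,
# and reconciliation. Thresholds and names are illustrative assumptions.

def completeness_check(deliveries: dict) -> list:
    """Flag any source unit that delivered no rows, instead of letting
    a missing feed silently report zeros."""
    return [unit for unit, rows in deliveries.items() if not rows]

def range_check(today: int, yesterday: int, tolerance: float = 0.5) -> bool:
    """True if today's count is within +/-50% of yesterday's. A count that
    suddenly doubles is treated as a data problem, not a productivity win."""
    if yesterday == 0:
        return today == 0
    return abs(today - yesterday) / yesterday <= tolerance

def reconciliation_check(pipeline_total: int, source_total: int) -> bool:
    """Pipeline output must match the source system's own total exactly."""
    return pipeline_total == source_total

deliveries = {"Buildings": 412, "Inspections": 0, "Licensing": 88}
missing = completeness_check(deliveries)   # ["Inspections"] gets flagged
```

In practice each failed check would halt publication of the affected dashboard and page an analyst, rather than pushing suspect numbers to leadership.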

These checks caught real problems. In the first month alone, we identified a source system that was silently dropping records during peak load — an issue that had been invisibly corrupting reports for months.

Making It Public

The internal dashboards were a clear win. But the agency had a bigger ambition: making performance data available to the public. DC residents and businesses who interacted with DCRA — permit applicants, licensed businesses, property owners — deserved to see how the agency was performing.

This added a layer of complexity. Internal dashboards can be messy and annotated; public dashboards need to be clear, contextualized, and honest. We designed the public-facing views to show:

  • Current processing times by permit type (with historical averages for context)
  • Inspection completion rates
  • License application status metrics
  • Trend lines so users could see if things were getting better or worse

The dashboards went live and had 30,000 users in the first month. No marketing, no press release — just a link on the agency’s website. People wanted this data.

The Part Most People Miss

The dashboards got the attention. The director could show them to the Mayor’s office. Reporters cited them. Residents used them to check on their permits.

But the real transformation was quieter. Those 160 hours a month of analyst time didn’t just disappear — they got redirected. The analysts who had been assembling reports were now analyzing data. Instead of answering “how many permits are backlogged?” they could investigate why permits were backlogged in specific categories, and what could be done about it.

One analysis led to a machine learning model for permit risk scoring that identified low-risk permits for expedited processing — a 15% efficiency improvement across all permit types. That analysis would have been impossible when the team’s entire bandwidth was consumed by manual report assembly.

What This Means If You’re Not a Government Agency

The specifics here are government-specific, but the pattern isn’t. If any of this sounds familiar, you’re probably dealing with the same structural problem:

  • Multiple systems, no single source of truth. Whether it’s departments using different CRMs or warehouse data that doesn’t match accounting data, the root issue is the same.
  • Manual report assembly consuming analyst time. If smart people are spending days formatting spreadsheets instead of finding insights, you’re burning your most expensive resource on your lowest-value task.
  • Stakeholder trust in the numbers is low. When people routinely question report accuracy, they stop using the data for decisions. And that’s when gut-feel management takes over.

The fix follows the same sequence every time: unify the data first, automate the reporting second, and only then build the dashboards and analytics that people actually see.

Starting with the dashboards — which is what most vendors will sell you — just gives you a prettier version of the same unreliable numbers.


This engagement was conducted prior to founding Blue Ridge Dataworks. Read the full case study for detailed metrics and technical approach.

data-engineering · dashboards · government · KPIs

Want to discuss this for your business?

Book a free 30-minute strategy call. No pitch — just an honest look at where your data stands.
