Safety Indicators and Metrics

Metric Hygiene Explained: 4 Data Defects

Metric hygiene keeps safety dashboards credible by removing four data defects that make trends, comparisons, and executive decisions unreliable.

5 min read

Key takeaways

  1. Audit duplicate counts before comparing sites, because one unresolved field condition can appear in several trackers and inflate perceived action.
  2. Mark every definition change in the trend line so leaders do not confuse reporting drift with real safety improvement.
  3. Replace generic denominators when the exposure question needs task counts, permit counts, entries, inspections, or other risk-specific units.
  4. Calibrate classifications with real cases so the dashboard compares risk patterns rather than the labeling habits of different supervisors.
  5. Use Andreza Araujo's safety culture work to turn safety metrics into decision-grade evidence, not attractive charts with weak foundations.

Metric hygiene is the discipline of keeping safety indicators clean enough for decisions, which means each metric has a stable definition, reliable source, current owner, clear calculation rule, and visible limit. Without hygiene, dashboards can look precise while quietly mixing duplicates, stale records, weak denominators, and inconsistent classifications.

A safety dashboard can be visually polished and still be unsafe as a management instrument. The risk appears when leaders treat a number as neutral evidence even though it was built from records whose definitions, ownership, and timing changed without anyone noticing.

The thesis is simple enough to test in any monthly review. Metric hygiene matters because a dirty indicator does not merely describe risk badly. It can send budget, supervision, audits, and corrective action toward the wrong problem.

Definition

Metric hygiene refers to the routines that keep safety data fit for operational and executive decisions. It includes metric definitions, source ownership, refresh dates, calculation logic, classification rules, denominator discipline, and review cadence. A hygienic metric can be challenged and rebuilt by another competent person using the same rule.

Andreza Araujo's work on safety culture treats culture as observable decision patterns, not as declared intention. The same logic applies to indicators. If leaders claim to manage risk but tolerate numbers nobody can explain, the decision pattern is not disciplined risk management. It is confidence borrowed from formatting.

What are the 4 data defects in metric hygiene?

The four common data defects are duplicate counts, stale definitions, weak denominators, and mixed classifications. Each defect changes the meaning of the indicator before the board, EHS manager, or supervisor ever discusses the trend.

  • Duplicate counts: The same event, action, observation, or exposure appears more than once because systems, departments, or sites record it separately.
  • Stale definitions: The metric name stays the same while the rule behind it changes, which makes month-to-month comparison unreliable.
  • Weak denominators: The rate uses hours, headcount, shifts, tasks, or exposure units that do not match the risk being measured.
  • Mixed classifications: Different people classify similar events differently, so the dashboard compares judgment habits instead of risk patterns.

Why do duplicate counts distort safety decisions?

Duplicate counts distort decisions because they inflate activity and hide whether risk actually changed. A corrective action may appear in an audit tracker, a maintenance backlog, and a management review list, even though the field condition is only one unresolved weakness.

The trap is not only administrative. When duplicate actions create an impression of high activity, leaders may believe the system is responding faster than it is. In the opposite direction, duplicate incidents can make a site look worse than another site whose recording practice is simply cleaner.

A practical hygiene test is to sample ten recent items and ask whether each one represents a unique condition, a unique event, or a repeated record of the same issue. If the team cannot answer that question in under 30 minutes, the metric is not ready for comparison across sites.
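
The sampling test above can be partly automated. The following is a minimal sketch, assuming records can be exported from each tracker with a location and a condition description; the field names, tracker names, and sample records are hypothetical and not taken from any specific system. It groups records that describe the same field condition so a reviewer can see how many unique conditions sit behind the raw count.

```python
from collections import defaultdict

# Hypothetical export from three trackers; all field names are assumptions.
records = [
    {"tracker": "audit", "record_id": "A-101",
     "location": "Boiler room", "condition": "Missing guard on pump P-3"},
    {"tracker": "maintenance", "record_id": "M-778",
     "location": "boiler room ", "condition": "missing guard on pump P-3"},
    {"tracker": "management_review", "record_id": "R-12",
     "location": "Boiler Room", "condition": "Missing guard on pump P-3"},
    {"tracker": "audit", "record_id": "A-102",
     "location": "Warehouse", "condition": "Blocked emergency exit"},
]


def condition_key(record):
    """Normalize location and condition text so repeated records share a key."""
    return (record["location"].strip().lower(),
            record["condition"].strip().lower())


def group_by_condition(items):
    """Map each normalized condition to the record IDs that describe it."""
    groups = defaultdict(list)
    for record in items:
        groups[condition_key(record)].append(record["record_id"])
    return dict(groups)


groups = group_by_condition(records)
unique_conditions = len(groups)
duplicated = {key: ids for key, ids in groups.items() if len(ids) > 1}

print(f"{len(records)} records describe {unique_conditions} unique conditions")
for key, ids in duplicated.items():
    print(key, "->", ids)
```

Exact text matching after normalization is a deliberately simple rule. Real trackers rarely describe the same condition in identical words, so a shared item ID entered at the source is likely to be more reliable than any matching done afterward.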

How do stale definitions weaken trend analysis?

Stale definitions weaken trend analysis when the dashboard compares periods that were not measured by the same rule. A near-miss rate, action closure rate, or leading indicator may look improved because the reporting threshold changed, not because control quality improved.

This is why a safety metric dictionary matters. The article on building a safety metric dictionary explains how each indicator should carry its definition, source, owner, and calculation rule. Without that discipline, the metric name becomes a label detached from evidence.

Across 25+ years in executive EHS roles, Andreza Araujo has seen dashboards fail when leadership reviews the line chart but not the rule behind the line. A clean trend requires a stable definition, or at least a visible note showing when the rule changed.
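
One way to make the rule change visible is to keep a definition log next to the data and flag the first period measured under a new rule. The sketch below assumes a simple log of effective dates per metric; the rules, dates, and values are hypothetical and serve only to illustrate the idea.

```python
from datetime import date

# Hypothetical definition log for one metric; rules and dates are illustrative.
definition_log = [
    {"effective": date(2024, 1, 1),
     "rule": "Near miss: any unplanned event without injury"},
    {"effective": date(2024, 7, 1),
     "rule": "Near miss: only events with serious injury potential"},
]

# Hypothetical monthly counts for the same metric name.
monthly_values = {
    date(2024, 5, 1): 42,
    date(2024, 6, 1): 40,
    date(2024, 7, 1): 18,
    date(2024, 8, 1): 17,
}


def rule_version(month, log):
    """Return the index of the latest definition in effect for the month."""
    version = None
    ordered = sorted(log, key=lambda entry: entry["effective"])
    for index, entry in enumerate(ordered):
        if entry["effective"] <= month:
            version = index
    return version


def annotate_trend(values, log):
    """Attach the rule version to each point and flag where the rule changed."""
    annotated = []
    previous = None
    for month in sorted(values):
        version = rule_version(month, log)
        annotated.append({
            "month": month,
            "value": values[month],
            "version": version,
            "trend_break": previous is not None and version != previous,
        })
        previous = version
    return annotated


trend = annotate_trend(monthly_values, definition_log)
breaks = [row["month"] for row in trend if row["trend_break"]]
print("Trend breaks:", breaks)
```

In this illustration the drop from 40 to 18 coincides with the rule change, so the chart should show a break at July rather than a continuous improving line.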

When is the denominator the real problem?

The denominator is the real problem when the rate appears mathematical but does not represent exposure. TRIR (total recordable incident rate) uses work hours, which helps with regulatory comparison but makes a poor denominator for many fatal-risk questions. A confined-space metric, for example, may need entries, permits, rescue drills, or atmospheric tests rather than total site hours.

Weak denominators create false fairness. A small maintenance team doing high-energy work can disappear inside site-wide hours, while a large low-risk administrative population dilutes the signal. The cleaner question is not which denominator is easiest to obtain. The cleaner question is which denominator matches the exposure the metric claims to measure.
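
A short worked example shows how the choice of denominator can reverse a comparison. The sketch below uses the common 200,000-hour base for hours-based rates and an assumed base of 1,000 confined-space entries for the exposure-based rate. Both sites and all of their figures are hypothetical.

```python
def rate_per_hours(events, hours, base_hours=200_000):
    """Hours-based rate, using the common 200,000-hour convention."""
    return events * base_hours / hours


def rate_per_exposure(events, exposure_units, per=1_000):
    """Exposure-based rate, e.g. events per 1,000 confined-space entries."""
    return events * per / exposure_units


# Hypothetical sites: same number of confined-space events, different exposure.
sites = {
    "Site A": {"events": 2, "hours": 200_000, "entries": 1_000},
    "Site B": {"events": 2, "hours": 1_000_000, "entries": 100},
}

results = {}
for name, data in sites.items():
    results[name] = {
        "per_200k_hours": rate_per_hours(data["events"], data["hours"]),
        "per_1k_entries": rate_per_exposure(data["events"], data["entries"]),
    }
    print(name, results[name])
```

On site-wide hours, Site B looks five times better than Site A (0.4 against 2.0). On entries, Site B is ten times worse (20 against 2 events per 1,000 entries), because its few confined-space entries were diluted by a large volume of unrelated hours.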

For high-consequence risk, connect hygiene to control health. The article on SIF rate, TRIR, and precursors shows why different safety questions need different metric families instead of one universal dashboard number.

How do mixed classifications hide field reality?

Mixed classifications hide field reality when similar events receive different labels across supervisors, departments, or countries. One site may classify a dropped object as a near miss, another as a property event, and another as a housekeeping observation. The database then compares classification habits rather than risk.

James Reason's work on latent conditions is useful here because classification should help the organization see system weakness before harm, not only count visible injuries after harm. If the label depends mostly on who entered the record, the organization loses the pattern that the data was supposed to reveal.

The fix is not a longer list of categories. Start with a small decision tree, calibrate reviewers with real examples, and hold a monthly disagreement review where borderline cases are discussed. The goal is not perfect taxonomy. The goal is enough consistency for trend, comparison, and learning to mean something.
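
The monthly disagreement review needs an agenda, and a calibration round can produce one. The sketch below assumes several reviewers label the same real cases; the case names, site names, and labels are hypothetical. It scores each case by the share of reviewers who chose the most common label and lists the cases with the least agreement first.

```python
from collections import Counter

# Hypothetical calibration round: each site reviewer labels the same cases.
labels = {
    "case-01 dropped wrench from scaffold": {
        "site_a": "near miss", "site_b": "property event", "site_c": "near miss"},
    "case-02 oil spill on walkway": {
        "site_a": "housekeeping", "site_b": "housekeeping", "site_c": "housekeeping"},
    "case-03 unlatched gate at pit": {
        "site_a": "near miss", "site_b": "housekeeping", "site_c": "unsafe condition"},
}


def case_agreement(reviewer_labels):
    """Share of reviewers who chose the most common label for one case."""
    counts = Counter(reviewer_labels.values())
    top_count = counts.most_common(1)[0][1]
    return top_count / len(reviewer_labels)


def disagreement_agenda(all_labels, threshold=1.0):
    """Cases with agreement below the threshold, least agreement first."""
    scored = sorted((case_agreement(r), case) for case, r in all_labels.items())
    return [case for score, case in scored if score < threshold]


agenda = disagreement_agenda(labels)
for case in agenda:
    print(f"{case}: {case_agreement(labels[case]):.0%} agreement")
```

Simple percent agreement is used here because it is easy to explain in a review meeting. It does not correct for agreement that would occur by chance, so teams that want a stricter measure may prefer a chance-corrected statistic.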

How to differentiate data defects in practice

Question | Likely defect | First hygiene action
Can one field condition appear in three trackers? | Duplicate counts | Create a unique item ID and link related records.
Did the rule change while the chart stayed continuous? | Stale definitions | Add a definition log and mark the trend break.
Does the rate use generic hours where the hazard calls for task, permit, or exposure counts? | Weak denominators | Replace generic hours with task, permit, or exposure counts where needed.
Would two sites classify the same case differently? | Mixed classifications | Run calibration using recent examples before the monthly review.

When should EHS fix metric hygiene first?

EHS should fix metric hygiene before launching a new dashboard, comparing sites, linking safety KPIs to bonuses, or using indicators to justify investment. The cost of cleaning data after leaders have already acted is higher because the organization must correct both the number and the decision built on that number.

A useful minimum standard is a 30-day hygiene sprint for the ten metrics used most often in management review. Check definitions, owners, sources, refresh dates, denominator logic, duplicate risk, and classification consistency. If a metric cannot pass that review, label it as directional rather than decision grade.
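
The sprint review can be recorded as a simple checklist per metric. The sketch below is one possible structure, assuming each of the seven checks named above is answered yes or no; the class name, field names, and the example metric are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class HygieneReview:
    """One sprint review record; field names are illustrative assumptions."""
    metric: str
    definition_stable: bool
    owner_named: bool
    source_reliable: bool
    refresh_current: bool
    denominator_matches_exposure: bool
    duplicates_checked: bool
    classification_calibrated: bool

    def failed_checks(self):
        """Names of the checks this metric did not pass."""
        return [f.name for f in fields(self)
                if f.name != "metric" and not getattr(self, f.name)]

    def grade(self):
        """Decision grade only when every check passes, otherwise directional."""
        return "decision grade" if not self.failed_checks() else "directional"


review = HygieneReview(
    metric="Confined-space near-miss rate",
    definition_stable=True,
    owner_named=True,
    source_reliable=True,
    refresh_current=True,
    denominator_matches_exposure=False,  # still uses site-wide hours
    duplicates_checked=True,
    classification_calibrated=True,
)

print(review.metric, "->", review.grade(), review.failed_checks())
```

Requiring every check to pass is a strict choice made for this sketch. An organization might reasonably weight the checks differently, as long as the label shown on the dashboard reflects the result.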

In Safety Culture: From Theory to Practice, Andreza Araujo argues that leaders reveal culture through repeated choices. Choosing to clean the evidence before asking people to trust the dashboard is one of those choices. It tells the organization that safety is not managed by attractive charts, but by decisions whose evidence can survive scrutiny.


Frequently asked questions

What is metric hygiene in safety?
Metric hygiene in safety is the routine of keeping indicators clean enough for decisions. It checks whether each metric has a stable definition, reliable source, named owner, current refresh date, clear calculation rule, and known limitation. Without hygiene, a dashboard can look precise while mixing duplicates, stale definitions, weak denominators, and inconsistent classifications.
How often should EHS review metric hygiene?
EHS should review metric hygiene before every major dashboard launch and then at least monthly for the indicators used in management review. A deeper quarterly review is useful for cross-site comparisons, bonus-linked KPIs, and metrics that influence investment decisions. The review should test definitions, duplicates, denominators, sources, owners, and classification consistency.
What is the biggest warning sign of poor metric hygiene?
The biggest warning sign is when leaders cannot explain how a metric is calculated or what changed in the rule behind the trend. Other warning signs include unexplained site-to-site differences, sudden improvement after a reporting change, repeated action counts, old data sources, and categories that different supervisors interpret differently.
Is metric hygiene the same as a safety metric dictionary?
No. A safety metric dictionary is one tool inside metric hygiene. The dictionary stores definitions, sources, owners, and calculation rules, while metric hygiene also includes duplicate checks, denominator tests, refresh discipline, classification calibration, and review routines. This related process is expanded in the article on building a safety metric dictionary.
How does safety culture affect dashboard quality?
Safety culture affects dashboard quality because leaders decide whether weak data is challenged or accepted. As Andreza Araujo argues in Safety Culture: From Theory to Practice, culture appears in repeated decisions. A leadership team that reviews attractive charts without testing the evidence is teaching the organization that appearance matters more than risk truth.

About the author

Andreza Araújo

Safety Culture Expert | Senior EHS Executive

Andreza Araújo is a safety culture expert and senior EHS executive with more than 25 years of experience in environment, health and safety. She is a Civil Engineer and Occupational Safety Engineer from Unicamp, holds a Master's degree in Environmental Diplomacy from the University of Geneva, and completed sustainability studies at IMD Switzerland. Andreza has served in Global Head of EHS roles in Fortune 500 environments, leading cultural transformation programs across multinational operations. She has represented Brazil as a speaker at the United Nations in Paris and has spoken at the International Labour Organization in Turin. She is the author of more than 16 books on safety culture in Portuguese, Spanish, English and German. Her work has earned more than 10 EHS awards, including two recognitions from Indra Nooyi, former PepsiCo CEO.

  • Civil & Safety Engineer (Unicamp)
  • M.A. Environmental Diplomacy (University of Geneva)
  • Sustainability Cert (IMD Switzerland)
  • People Management & Coaching (Ohio University)
  • UN Paris speaker representative for Brazil
  • ILO Turin speaker
  • LinkedIn Top Voice
  • Indra Nooyi PepsiCo CEO recognition (2x)

