Safety Metric Dictionary: Build It in 21 Days
A practical 21-day method for EHS managers to define safety metrics, owners, formulas, thresholds, and data-quality rules before dashboard review.

Key takeaways
- Define each safety metric with one formula, one owner, one data steward, and one escalation rule before redesigning dashboard visuals.
- Limit the first dictionary to 12 to 18 metrics so the team fixes governance quality before expanding the reporting catalog.
- Separate activity volume from risk quality, because observation counts and audit totals can rise while critical controls remain weak.
- Audit formulas, contractor rules, extraction dates, and thresholds before comparing sites, since small data defects can distort executive safety decisions.
- Use Andreza Araújo's safety culture advisory work when your dashboard needs to reveal risk clearly without rewarding silence or cosmetic performance.
A safety dashboard does not fail only because the numbers are late. It fails when the same number carries 3 meanings in 3 meetings. A plant manager reads near-miss rate as participation, the EHS manager reads it as exposure visibility, and the operations manager reads it as evidence that the crew is creating paperwork. After 2 review cycles, the dashboard becomes a negotiation over language instead of a decision tool.
A safety metric dictionary fixes that problem before it reaches the board pack. It defines each metric, its formula, owner, source system, frequency, threshold, interpretation rule, and escalation path. ISO 45001:2018 specifies performance evaluation as part of the occupational health and safety management system, while OSHA describes leading indicators as preventive measures that can reveal whether safety activities are working before harm occurs. Those 2 anchors matter because a metric that cannot be defined and acted on is not a control signal, even when it looks precise.
Across 25+ years of executive EHS work, Andreza Araújo has seen the same pattern in multinational operations: leaders rarely lack data, but they often lack agreement about what the data means. The thesis of this guide is narrow and practical. A safety metric dictionary should be built before dashboard redesign, because dashboards only display definitions that the organization has already disciplined.
Key Takeaways
- A safety metric dictionary gives every KPI one definition, one formula, one owner, and one decision rule.
- The first 21 days should cover no more than 12 to 18 metrics, because the goal is governance quality, not catalog size.
- Lagging indicators need data-quality rules, while leading indicators need interpretation rules that prevent activity counting from replacing risk insight.
- Each metric needs a threshold, a review cadence, and an escalation owner, otherwise the dashboard creates awareness without action.
- Andreza Araújo's work on safety culture reinforces the same principle: measurement must make risk visible without rewarding silence.
Step 1: Select the 12 to 18 metrics that drive real decisions
Start with the metrics that already appear in executive, site, and supervisor reviews. Do not begin by asking what else could be measured, because that question expands the list before the organization has fixed the meaning of the existing numbers. In most mid-size operations, the first dictionary should contain 12 to 18 entries, including 4 lagging indicators, 4 leading indicators, 2 control-verification indicators, and 2 to 4 participation or quality indicators.
The practical test is simple enough for a 60-minute working session. If a metric has been discussed in the last 90 days, keep it in the first version. If no one can name a decision that depends on it, park it for version 2. This prevents the dictionary from becoming a glossary that nobody uses.
Connect this selection to leading indicator quality: weak leading indicators usually survive because they are easy to count, not because they change risk.
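The selection test from Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `CandidateMetric` fields, the `split_for_version_1` function, and the sample metrics are all hypothetical. It also assumes that a metric must pass both checks (discussed within 90 days and tied to a named decision) to stay in version 1, which is one reasonable reading of the test.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateMetric:
    name: str
    days_since_last_discussed: Optional[int]  # None means never discussed in a review
    dependent_decision: Optional[str]         # None means nobody can name a decision


def split_for_version_1(candidates, window_days=90):
    """Return (keep, park) name lists using the 90-day and named-decision test."""
    keep, park = [], []
    for metric in candidates:
        discussed_recently = (
            metric.days_since_last_discussed is not None
            and metric.days_since_last_discussed <= window_days
        )
        # Assumption: both conditions are required to stay in version 1.
        if discussed_recently and metric.dependent_decision:
            keep.append(metric.name)
        else:
            park.append(metric.name)
    return keep, park


# Hypothetical candidates for illustration only.
candidates = [
    CandidateMetric("TRIR", 20, "Review case classification"),
    CandidateMetric("Toolbox talks held", 45, None),
    CandidateMetric("Ergonomic survey score", 200, "Fund workstation redesign"),
]
keep, park = split_for_version_1(candidates)
print("Version 1:", keep)
print("Parked for version 2:", park)
```

Parked metrics are not deleted. They stay on a list for version 2, once the first set has stable definitions.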
Step 2: Write the operational definition in plain language
Each metric needs a definition that a supervisor, EHS analyst, and plant manager would interpret in the same way. "Safety observation completed" is not enough. The dictionary should say whether the observation counts only after field discussion, whether anonymous entries qualify, whether duplicate observations are excluded, and whether the observation must include a risk category.
Use 2 lines for the first definition. The first line says what the metric measures. The second line says what it does not measure. That second line prevents political misuse. For example, near-miss rate measures reported precursor events per hour worked, but it does not measure the true number of precursor events that occurred, because reporting culture and supervisor response shape the numerator.
Step 3: Lock the formula before dashboard work starts
Formula drift is one of the quietest sources of safety distortion. One site calculates TRIR on employee hours only, another includes contractors, and a third removes restricted-work cases after medical review. OSHA recordkeeping rules define recordable cases for the United States, but global companies still need a corporate rule for contractor inclusion, temporary labor, and joint ventures when comparing sites.
Write the formula with numerator, denominator, multiplier, inclusion rule, exclusion rule, and example calculation. If the rate uses 200,000 hours, state that base. If a global report uses 1,000,000 hours, state that base too. A metric dictionary can carry both, but it must not allow managers to choose the denominator that makes the month look better.
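As a worked example of a locked formula, the sketch below computes one rate on both bases mentioned above and makes the contractor rule an explicit argument rather than a local habit. The function names and the case and hour figures are hypothetical. The real inclusion and exclusion rules would come from the corporate dictionary entry.

```python
def incident_rate(cases, hours_worked, base_hours):
    """Rate = cases * base_hours / hours_worked."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return cases * base_hours / hours_worked


def site_rate(employee_cases, employee_hours,
              contractor_cases, contractor_hours,
              include_contractors, base_hours=200_000):
    """Apply the written contractor-inclusion rule before calculating the rate."""
    cases = employee_cases + (contractor_cases if include_contractors else 0)
    hours = employee_hours + (contractor_hours if include_contractors else 0)
    return incident_rate(cases, hours, base_hours)


# Hypothetical site: 2 employee cases in 300,000 hours,
# 1 contractor case in 100,000 hours.
site = dict(employee_cases=2, employee_hours=300_000,
            contractor_cases=1, contractor_hours=100_000)

with_contractors_200k = site_rate(**site, include_contractors=True)
with_contractors_1m = site_rate(**site, include_contractors=True,
                                base_hours=1_000_000)
employees_only_200k = site_rate(**site, include_contractors=False)

print(with_contractors_200k)          # 3 cases * 200,000 / 400,000 hours = 1.5
print(with_contractors_1m)            # 3 cases * 1,000,000 / 400,000 hours = 7.5
print(round(employees_only_200k, 2))  # 2 cases * 200,000 / 300,000 hours = 1.33
```

The same site produces 1.5, 7.5, or 1.33 depending on the base and the contractor rule, which is why the dictionary has to state both before sites are compared.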
This is where metric hygiene and data defects become operational, because a clean chart depends on a clean formula before it depends on software.
Step 4: Assign one accountable owner and one data steward
A metric owner decides what the number means and what action follows. A data steward protects the source, timing, and completeness of the data. One person may hold both roles at a small site, but the dictionary should still name both responsibilities because confusion between interpretation and data entry creates avoidable conflict.
For a corrective-action aging metric, the owner may be the EHS manager while the steward may be the system administrator or EHS analyst. For a critical-control verification metric, the owner may be operations, because field control quality depends on line execution. That distinction matters. If EHS owns every metric, line leaders can treat safety performance as a department report instead of a management responsibility.
Step 5: Define the source system and the extraction date
The dictionary should name the source system, the exact field used, and the extraction date. A monthly dashboard built on data pulled at 8 a.m. on the first business day will not match a site report extracted at 5 p.m. after late closures. Without that rule, the review spends 15 minutes debating why 2 charts disagree.
Set a single extraction rule for each metric. For monthly safety review, the cleanest rule is often "data extracted on the third business day at 12:00 local site time." If the organization works across 5 or more countries, add the time zone and cut-off treatment for late incident classification. ISO 45001:2018 expects documented information where needed for confidence in the process, and metric extraction rules are exactly that kind of discipline.
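The extraction rule quoted above ("third business day at 12:00 local site time") can be computed rather than remembered. The sketch below is a simplified illustration: the function names are hypothetical, business days are assumed to be Monday to Friday, and public holidays are ignored, so a real rule would need a site holiday calendar. The site time zone is passed in as a fixed UTC offset to keep the example self-contained.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo


def nth_business_day(year: int, month: int, n: int = 3) -> date:
    """Return the n-th weekday (Mon-Fri) of the month. Holidays are ignored."""
    day = date(year, month, 1)
    count = 0
    while True:
        if day.weekday() < 5:  # 0 to 4 are Monday to Friday
            count += 1
            if count == n:
                return day
        day += timedelta(days=1)


def extraction_cutoff(year: int, month: int, site_tz: tzinfo,
                      n: int = 3, at: time = time(12, 0)) -> datetime:
    """Timestamp at which the monthly data pull is frozen, in site local time."""
    return datetime.combine(nth_business_day(year, month, n), at, tzinfo=site_tz)


# Hypothetical site at UTC-3; data for February 2025 is pulled in March 2025.
site_tz = timezone(timedelta(hours=-3))
cutoff = extraction_cutoff(2025, 3, site_tz)
print(cutoff.isoformat())  # 2025-03-05T12:00:00-03:00
```

Storing the cut-off as a time-zone-aware timestamp makes late incident classifications easy to flag: any record changed after `cutoff` belongs to the next extraction.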
Step 6: Add a threshold that triggers discussion, not decoration
Every metric needs an interpretation threshold. Green, amber, and red bands are useful only when they trigger different questions. A TRIR threshold may trigger a review of case classification, but a leading indicator threshold should trigger a review of whether risk controls are being tested in the right places. Treating both thresholds the same way is a common executive error.
Use 3 layers: attention threshold, escalation threshold, and stop-and-review threshold. For example, if critical-control verification drops below 95 percent completion, the EHS manager reviews missing checks. Below 90 percent, the operations manager explains the gap. Below 80 percent, the site pauses the affected high-risk work until the control-verification backlog is understood. The exact values must fit the operation, but the escalation logic must be written before the first red cell appears.
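The three-layer logic above can be written down as a small lookup so the escalation rule exists before the first red cell does. This sketch uses the example values from the text (95, 90, and 80 percent). The `THRESHOLDS` table and the `classify` function are hypothetical names, and the values would need to be tuned to the operation.

```python
# (upper bound, level, expected response), ordered from most to least severe.
THRESHOLDS = [
    (80.0, "stop_and_review", "site pauses the affected high-risk work"),
    (90.0, "escalation", "operations manager explains the gap"),
    (95.0, "attention", "EHS manager reviews missing checks"),
]


def classify(completion_pct: float):
    """Map a critical-control verification completion rate to a response layer."""
    for upper_bound, level, response in THRESHOLDS:
        if completion_pct < upper_bound:  # "below X percent" is a strict comparison
            return level, response
    return "ok", "no escalation required"


for pct in (96.0, 93.0, 85.0, 75.0):
    level, response = classify(pct)
    print(f"{pct:.0f}% -> {level}: {response}")
```

Because the table is ordered from most to least severe, a value of 75 percent returns only the stop-and-review response rather than all three.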
For control-heavy indicators, connect the dictionary to control assurance evidence, because a threshold that cannot be verified in the field is only a dashboard color.
Step 7: Separate activity volume from risk quality
Many leading indicators reward volume. The crew completed 120 observations, 40 toolbox talks, and 18 audits. None of those numbers proves that the highest-risk work was touched. OSHA's leading indicator guidance emphasizes preventive activity, but the metric dictionary still has to distinguish activity from risk quality because volume can become a substitute for judgment.
Add a quality field to every activity-based metric. For observations, the quality field may be "percentage linked to a critical risk." For toolbox talks, it may be "percentage based on a current site exposure rather than a generic topic." For audits, it may be "percentage with field evidence attached." This is the difference between counting work and seeing whether work reduced exposure.
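One way to keep volume and quality side by side is to report both from the same records. The sketch below is illustrative only: the `volume_and_quality` function, the `linked_to_critical_risk` flag, and the sample records are hypothetical stand-ins for whatever field the source system actually carries.

```python
def volume_and_quality(records, quality_flag):
    """Return (count, percent of records where quality_flag is true)."""
    total = len(records)
    if total == 0:
        return 0, None  # no activity means no quality percentage to report
    qualified = sum(1 for record in records if record.get(quality_flag))
    return total, round(100 * qualified / total, 1)


# Hypothetical month: 120 observations, 30 of them linked to a critical risk.
observations = [
    {"id": i, "linked_to_critical_risk": i < 30}
    for i in range(120)
]
count, pct_linked = volume_and_quality(observations, "linked_to_critical_risk")
print(f"{count} observations, {pct_linked}% linked to a critical risk")
```

In this made-up month, the volume figure of 120 looks healthy while the quality field shows that only a quarter of the observations touched a critical risk.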
Step 8: Define the review cadence and decision forum
A metric without a forum becomes archival data. The dictionary should state whether the metric is reviewed daily by supervisors, weekly by operations, monthly by site leadership, or quarterly by the executive team. It should also state which decision is expected in that forum.
Do not send every metric to every forum. The board does not need every housekeeping finding, and the shift supervisor does not need a quarterly enterprise heat map. Good cadence design protects attention. In Andreza Araújo's book Safety Culture: From Theory to Practice, the management routine around safety is treated as part of culture, not as administrative ceremony. The metric dictionary should make that routine explicit.
If the organization already has a monthly review, align this work with monthly safety metric review steps so the dictionary supports the meeting instead of becoming another spreadsheet.
Step 9: Add a misuse warning for each sensitive metric
Some metrics are dangerous because they invite the wrong behavior. Zero-accident targets can suppress reporting. Observation quotas can create shallow entries. Bonus-linked injury rates can turn case management into reputation management. A good dictionary names the misuse risk in direct language so leaders cannot claim surprise later.
Write the warning as one sentence. "Do not use this metric to rank supervisors unless reporting quality is independently reviewed." "Do not treat lower near-miss reporting as safer performance without worker voice evidence." "Do not connect injury rate to bonus payout unless underreporting controls are active." These warnings are not legal decoration. They protect the integrity of the signal.
Step 10: Test the dictionary in one review cycle before scaling
After 21 days, use the dictionary in one real review. Ask each participant to interpret 3 metrics before the meeting starts. If interpretations differ, revise the definition, formula, threshold, or owner. The first version is successful when disagreements become visible before decisions are made, not when the document looks complete.
The minimum pilot package should include the dictionary, 1 dashboard page, 1 threshold table, and 1 decision log. After the review, record which metric led to a decision, which metric only created discussion, and which metric should be removed. That test keeps the dictionary tied to risk control rather than document control.
Example structure for the first dictionary
| Field | What to write | Why it matters |
|---|---|---|
| Definition | 1 measure and 1 exclusion | Prevents 2 meanings for the same KPI |
| Formula | Numerator, denominator, multiplier, and example | Stops site-by-site calculation drift |
| Owner | One decision owner and one data steward | Separates accountability from data entry |
| Threshold | Attention, escalation, and stop-and-review levels | Turns numbers into action rules |
| Misuse warning | 1 sentence naming the distortion risk | Protects reporting culture and signal quality |
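The table above maps naturally onto a small record type, which makes it easy to check that no entry reaches the dashboard with a blank field. This is a minimal sketch under stated assumptions: the `MetricEntry` class, its field names, and the sample near-miss entry (including its threshold wording) are hypothetical and only mirror the table rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricEntry:
    name: str
    measures: str                   # definition line 1: what the metric measures
    does_not_measure: str           # definition line 2: the explicit exclusion
    numerator: str
    denominator: str
    multiplier: int
    owner: str                      # decides meaning and follow-up action
    data_steward: str               # protects source, timing, and completeness
    attention_threshold: str
    escalation_threshold: str
    stop_and_review_threshold: str
    misuse_warning: str

    def missing_fields(self):
        """List fields left empty, so incomplete entries are caught early."""
        return [key for key, value in vars(self).items() if value in ("", None)]


# Hypothetical entry; the misuse warning is deliberately left blank.
near_miss = MetricEntry(
    name="Near-miss rate",
    measures="Reported precursor events per hour worked",
    does_not_measure="The true number of precursor events that occurred",
    numerator="Near misses reported in the period",
    denominator="Hours worked in the period",
    multiplier=200_000,
    owner="EHS manager",
    data_steward="EHS analyst",
    attention_threshold="Reporting drops for 2 consecutive months",
    escalation_threshold="Reporting drops for 3 consecutive months",
    stop_and_review_threshold="No reports from a high-risk area in a month",
    misuse_warning="",
)
print(near_miss.missing_fields())  # ['misuse_warning']
```

A spreadsheet with one row per metric and one column per field would serve the same purpose. The point is that every field is mandatory.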
Final checklist for the EHS manager
- Limit version 1 to 12 to 18 metrics.
- Write one definition, one formula, one owner, and one threshold for each metric.
- Include contractor and temporary-labor rules where rates are compared across sites.
- Add one misuse warning to every injury, observation, and bonus-sensitive metric.
- Test the dictionary in one real monthly review before scaling it to other plants.
A safety metric dictionary is not a documentation exercise. It is a governance tool for deciding what the organization will believe, challenge, and act on when safety numbers move. If your dashboard currently creates more argument than decisions, Andreza Araújo's safety culture advisory work can help rebuild the measurement routine around risk visibility, leadership accountability, and field evidence.
About the author
Andreza Araújo
Safety Culture Expert | Senior EHS Executive
Andreza Araújo is a safety culture expert and senior EHS executive with more than 25 years of experience in environment, health and safety. She is a Civil Engineer and Occupational Safety Engineer from Unicamp, holds a Master's degree in Environmental Diplomacy from the University of Geneva, and completed sustainability studies at IMD Switzerland. Andreza has served in Global Head of EHS roles in Fortune 500 environments, leading cultural transformation programs across multinational operations. She has represented Brazil as a speaker at the United Nations in Paris and has spoken at the International Labour Organization in Turin. She is the author of more than 16 books on safety culture in Portuguese, Spanish, English and German. Her work has earned more than 10 EHS awards, including two recognitions from Indra Nooyi, former PepsiCo CEO.
- Civil & Safety Engineer (Unicamp)
- M.A. Environmental Diplomacy (University of Geneva)
- Sustainability Cert (IMD Switzerland)
- People Management & Coaching (Ohio University)
- UN Paris speaker representative for Brazil
- ILO Turin speaker
- LinkedIn Top Voice
- Indra Nooyi PepsiCo CEO recognition (2x)