Severity Weighting Model: Build It in 30 Days

Build a severity weighting model that helps EHS managers rank SIF exposure, failed controls, and weak signals before clean metrics mislead leaders.

By Andreza Araújo | June 14, 2026 | 8 min read | Updated June 17, 2026

Key takeaways

01 Diagnose dashboard distortion by separating frequency metrics from severity signals, especially when 1 high-potential near miss outweighs several minor cases.
02 Build a 5-level severity scale that supervisors can apply in minutes, with SIF potential and failed critical controls clearly defined.
03 Test the model against 12 months of old data before leaders use it, so the score reflects real exposure rather than spreadsheet theater.
04 Assign review thresholds from local supervisor to executive visibility, including a 48-hour escalation rule for the highest weighted events.
05 Apply Andreza Araujo's safety culture diagnostics when clean numbers hide serious exposure and your EHS dashboard needs decision-quality evidence.

A severity weighting model is a safety metric method that assigns more weight to events involving fatality, permanent disability, serious injury and fatality (SIF) exposure, or failed critical controls, instead of treating every event as one equal count. The model helps an EHS manager see where a 2026 dashboard looks clean while serious risk is still moving through the operation.

This severity layer also protects leaders from recordable injury rate traps, especially when low-consequence volume distracts attention from high-potential events.

The International Labour Organization (ILO) reported in 2023 that nearly 3 million people die each year from work-related accidents and diseases, while another 395 million sustain non-fatal work injuries. This guide shows how to build a severity weighting model in 30 days so the safety dashboard stops rewarding low counts and starts exposing the events that can change a life.

Why raw incident counts distort safety decisions

Severity weighting improves prioritization, but it still needs the right exposure base. For that reason, EHS teams should compare exposure hours, task volume and control verification before deciding whether a weighted rate truly represents the work.

Raw counts distort safety decisions because they give the same visual weight to a first-aid cut, a high-potential near miss, and a failed control that could have killed someone. A plant with 3 hand cuts may look worse than a plant with 1 dropped object near miss, even though the second site may be carrying the more serious exposure.

The ILO's 2023 figures on the scale of work-related harm are a reminder that injury prevention cannot be reduced to neat monthly totals. When a dashboard treats all cases as equal, it protects the number more than the worker.

As Andreza Araujo argues in *Muito Além do Zero* (in English, *Far Beyond Zero*), lagging indicators look in the rearview mirror and can protect the number instead of protecting life. That position matters when leaders celebrate 90 days without a recordable while a failed isolation, an uncontrolled suspended load, or a near fall is quietly normalized.

The practical move is to separate frequency from severity. Keep TRIR (total recordable incident rate), DART (days away, restricted, or transferred), LTIFR (lost-time injury frequency rate), and near-miss volume for trend analysis, but add a weighted severity score whose purpose is to tell leaders where the next serious injury or fatality could come from.

Step 1: Define the decision the model must support

Step 1 is to name the decision before naming the formula. In the first 2 days, decide whether the model will support monthly executive review, weekly EHS prioritization, critical-control assurance, contractor governance, or capital allocation, because each decision needs a different level of precision.

ISO 45001:2018 specifies requirements for an occupational health and safety management system that manages risks and improves OH&S performance. A severity model only earns its place if it helps that management system choose work, money, supervision, and escalation faster than the old dashboard did.

Across 25+ years leading EHS at multinationals, Andreza Araujo has observed that measurement fails when it is built for reporting rather than decision. A 12-column spreadsheet can still be useless if no one can say which exposure needs action by Friday.

Write one decision sentence before touching Excel or Power BI. For example, "This model will rank the top 10 weekly exposures that require field verification by the EHS manager and operations manager."

Step 2: Build a 5-level severity scale

Step 2 is to create a 5-level severity scale with plain definitions. A useful first version is Level 1 for minor first aid, Level 2 for medical treatment or restricted work, Level 3 for lost-time injury or serious property damage, Level 4 for SIF potential, and Level 5 for fatality, permanent disability, or multiple-person exposure.
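As a minimal sketch, the starter scale above could be encoded so every site classifies against the same five values. The names `Severity` and `is_sif_exposure` are hypothetical, and treating Levels 4 and 5 together as SIF exposure is an assumption of this example rather than a rule from the text.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Five-level starter scale, using the plain definitions from Step 2."""
    MINOR_FIRST_AID = 1        # minor first aid
    MEDICAL_OR_RESTRICTED = 2  # medical treatment or restricted work
    LOST_TIME_OR_PROPERTY = 3  # lost-time injury or serious property damage
    SIF_POTENTIAL = 4          # serious injury or fatality potential
    FATALITY_OR_PERMANENT = 5  # fatality, permanent disability, multiple-person exposure


def is_sif_exposure(level: Severity) -> bool:
    """Assumption of this sketch: Levels 4 and 5 both count as SIF exposure."""
    return level >= Severity.SIF_POTENTIAL
```

Using an integer-backed enum keeps the level usable in arithmetic later while preventing free-text classifications such as "serious-ish".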

The trap is making the scale too elegant. If a supervisor needs 15 minutes to classify one event, the model will be abandoned or delegated to one analyst whose interpretation becomes the hidden standard.

In *Diagnóstico de Cultura de Segurança*, Andreza Araujo's central metric thesis is that quantity is not the same as quality and commitment. The same applies here: a high volume of weak classifications does not create risk intelligence.

Use a one-page guide with 2 examples per level. A dropped wrench inside an exclusion zone may be Level 2, while the same dropped wrench outside the exclusion zone but passing within 1 meter of a worker may become Level 4 because the control failed where life was exposed.

Step 3: Add probability without hiding severity

Step 3 is to add probability as a second axis, not as a way to dilute severity. Use 4 simple probability bands: rare, possible, likely, and recurring, then record the evidence behind the choice, such as 3 similar near misses in 60 days or 2 failed inspections in the same area.
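One way to keep the evidence attached to the band is sketched below. `Probability` and `ProbabilityRating` are hypothetical names, and numbering the bands 1 to 4 is an assumption made so the band can be multiplied later.

```python
from dataclasses import dataclass
from enum import IntEnum


class Probability(IntEnum):
    """Four probability bands; the numeric values are an assumption."""
    RARE = 1
    POSSIBLE = 2
    LIKELY = 3
    RECURRING = 4


@dataclass(frozen=True)
class ProbabilityRating:
    """A band plus the evidence that justifies it."""
    band: Probability
    evidence: str

    def __post_init__(self) -> None:
        # Refuse a band that arrives without a recorded reason.
        if not self.evidence.strip():
            raise ValueError("a probability band needs recorded evidence")


rating = ProbabilityRating(Probability.RECURRING, "3 similar near misses in 60 days")
```

Rejecting an empty evidence field is a design choice of this sketch: it makes the reviewer write down why the band was chosen instead of picking a number by feel.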

HSE's HSG65 guidance explains the Plan, Do, Check, Act approach and links health and safety management to ordinary management discipline. Severity weighting belongs in the Check step, because it tests whether the organization is seeing the right signals before harm appears in the lagging metrics.

The mistake is averaging severity and probability until a Level 5 event with "rare" probability becomes moderate. That arithmetic may look objective, but it can bury the exact exposure the executive team needed to see.

Set a rule that any Level 5 credible exposure is automatically escalated, even if probability is low. The score may guide sequence, but the escalation rule protects against the dashboard making a fatal risk look administratively acceptable.
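The difference between averaging and a hard escalation rule can be shown in a few lines. This is an illustrative sketch: the function names are hypothetical, and the bands are assumed to be numbered 1 (rare) to 4 (recurring).

```python
def averaged_score(severity: int, probability: int) -> float:
    """The averaging approach the text warns against."""
    return (severity + probability) / 2


def must_escalate(severity: int, credible: bool) -> bool:
    """Escalation depends on severity and credibility only, never on probability."""
    return credible and severity >= 5


# A credible Level 5 exposure rated "rare" (band 1).
print(averaged_score(5, 1))    # 3.0, which reads as moderate on a 1 to 5 scale
print(must_escalate(5, True))  # True
```

Because `must_escalate` never receives the probability band, a low-probability rating cannot talk the event out of escalation.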

Step 4: Weight critical controls separately

Step 4 is to score failed critical controls as a distinct factor. Give a separate multiplier when the event involves failed isolation, bypassed machine guarding, missing fall protection, ineffective confined-space monitoring, failed lockout/tagout (LOTO) verification, or a permit-to-work that did not match the real task.

This is where many dashboards lose the plot. They count the event but ignore whether the barrier that failed was the final credible defense between routine work and a life-changing outcome.

During her tenure at PepsiCo South America, where the accident ratio fell 50% in 6 months, Andreza Araujo learned that visible improvement depends on changing how leaders react to weak signals, not only on asking for more reports. A failed critical control is one of those weak signals.

Use a multiplier of 1.5 or 2.0 for critical-control failure in the first version. The exact number matters less than the governance rule that sends those cases into the weekly control-assurance review, where the owner must prove restoration.
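A hedged sketch of the multiplier and its governance rule follows. The control labels in `CRITICAL_CONTROLS` are illustrative strings based on the examples in Step 4, and both function names are hypothetical.

```python
# Illustrative labels only; a real site would use its own critical-control register.
CRITICAL_CONTROLS = frozenset({
    "isolation",
    "machine_guarding",
    "fall_protection",
    "confined_space_monitoring",
    "loto_verification",
    "permit_to_work",
})


def control_multiplier(failed_controls: set[str], critical: float = 2.0) -> float:
    """Return the critical-control multiplier, or 1.0 when no critical control failed."""
    if critical not in (1.5, 2.0):
        raise ValueError("the first version uses a multiplier of 1.5 or 2.0")
    return critical if CRITICAL_CONTROLS & failed_controls else 1.0


def needs_control_review(failed_controls: set[str]) -> bool:
    """Any failed critical control is routed to the weekly control-assurance review."""
    return bool(CRITICAL_CONTROLS & failed_controls)
```

Keeping the review flag separate from the multiplier reflects the point above: the routing rule matters more than whether the number is 1.5 or 2.0.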

Step 5: Create the scoring formula

Step 5 is to convert the scale into a transparent formula. A practical starter formula is Severity Score = severity level x probability band x critical-control multiplier, with an added escalation flag for every credible SIF exposure.

Do not pretend the formula is science if the input quality is weak. The purpose is disciplined prioritization, not mathematical theater, because a flawed score presented with 2 decimal places can mislead faster than a qualitative review.

A simple example is useful, with the probability bands numbered 1 (rare), 2 (possible), 3 (likely), and 4 (recurring). A Level 4 near miss with possible recurrence, scored as 4 x 2 x 2.0 because a critical control failed, produces a score of 16 and triggers review. A Level 2 minor injury with a recurring pattern, scored as 2 x 4 x 1.0, produces 8 and remains important, but it does not outrank the SIF exposure.
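The starter formula and the two worked cases can be reproduced with a short sketch. `severity_score` and `sif_flag` are hypothetical names, the bands are assumed to run from 1 (rare) to 4 (recurring), and treating Level 4 and above as the SIF flag is an assumption of this example.

```python
def severity_score(level: int, probability: int, multiplier: float = 1.0) -> float:
    """Severity Score = severity level x probability band x critical-control multiplier."""
    if not 1 <= level <= 5:
        raise ValueError("severity level must be between 1 and 5")
    if not 1 <= probability <= 4:
        raise ValueError("probability band must be between 1 and 4")
    if multiplier < 1.0:
        raise ValueError("the multiplier never reduces a score")
    return level * probability * multiplier


def sif_flag(level: int) -> bool:
    """Escalation flag kept outside the arithmetic (assumed to start at Level 4)."""
    return level >= 4


near_miss = severity_score(4, 2, 2.0)     # Level 4, possible, critical control failed
minor_injury = severity_score(2, 4, 1.0)  # Level 2, recurring, no critical control failed
print(near_miss, sif_flag(4))      # 16.0 True
print(minor_injury, sif_flag(2))   # 8.0 False
```

The flag is returned separately from the score on purpose, so a low probability band can lower the ranking but cannot switch off the escalation.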

Connect the formula to the safety metric dictionary so every site uses the same definitions. Without a shared dictionary, severity weighting becomes a debate about vocabulary rather than a decision tool.

Step 6: Test the model on 12 months of old data

Step 6 is to back-test the model on 12 months of events before using it for leadership decisions. The goal is not to rewrite history, but to check whether the new score surfaces the cases that experienced EHS people already know were serious.

Pull recordables, first-aid cases, near misses, high-potential observations, failed inspections, permit deviations, and critical-control verification failures. If the data set has fewer than 50 records, extend the window to 18 or 24 months so the test is not driven by one odd month.
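The window rule above (12 months, extended to 18 or 24 when there are fewer than 50 records) can be sketched as follows. The function names are hypothetical, and starting each window on the first day of a month is a simplifying assumption.

```python
from datetime import date


def months_before(day: date, months: int) -> date:
    """First day of the month that lies `months` months before `day`."""
    index = day.year * 12 + (day.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def pick_backtest_window(event_dates, today, min_records=50, windows=(12, 18, 24)):
    """Return (months, record_count) for the shortest window with enough records.

    If no window reaches `min_records`, the longest window is returned.
    """
    months, count = 0, 0
    for months in windows:
        start = months_before(today, months)
        count = sum(1 for d in event_dates if start <= d <= today)
        if count >= min_records:
            break
    return months, count


# 30 recent records plus 25 older ones: 12 months is too thin, 18 months is enough.
events = [date(2026, 1, 15)] * 30 + [date(2025, 3, 10)] * 25
print(pick_backtest_window(events, date(2026, 6, 14)))  # (18, 55)
```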

In more than 250 cultural-transformation projects supported by Andreza Araujo's team, a repeated pattern appears: leaders trust dashboards faster when the tool explains past events they remember. If the model ranks only paperwork deviations at the top, the formula needs adjustment.
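A simple back-test check is to ask what share of the events that experienced EHS people remember as serious actually land in the top of the weighted ranking. The sketch below is illustrative: `backtest_recall`, the event IDs, and the scores are all hypothetical.

```python
def backtest_recall(scores: dict[str, float], known_serious: set[str], top_n: int = 10) -> float:
    """Share of known-serious events that appear in the top `top_n` weighted scores."""
    if not known_serious:
        raise ValueError("list at least one event the team already considers serious")
    ranked = sorted(scores, key=scores.get, reverse=True)[:top_n]
    return len(known_serious.intersection(ranked)) / len(known_serious)


# Hypothetical scored events from the back-test window.
scores = {"E1": 16.0, "E2": 8.0, "E3": 3.0, "E4": 12.0}
print(backtest_recall(scores, {"E1", "E4"}, top_n=2))  # 1.0
print(backtest_recall(scores, {"E1", "E3"}, top_n=2))  # 0.5
```

A low value does not prove the formula is wrong, but it is a prompt to look at which events displaced the remembered cases, for example paperwork deviations.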

Compare the weighted ranking with indicator triangulation, including worker voice, field verification, lagging results, and control evidence. A severity model is stronger when it agrees with several evidence streams, and more useful when disagreement forces a focused review.

Step 7: Set review thresholds and ownership

Step 7 is to assign thresholds and owners before the first live dashboard goes out. A common structure is score 1 to 5 for local supervisor review, 6 to 10 for EHS manager review, 11 to 15 for operations manager escalation, and 16 or above for executive visibility within 48 hours.
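The threshold ladder above maps directly to a lookup. This sketch uses a hypothetical `review_owner` function and makes one assumption the text does not state: fractional scores that fall between bands (for example 13.5 with a 1.5 multiplier) move to the band above 5 or 10, while executive visibility starts exactly at 16.

```python
def review_owner(score: float) -> str:
    """Map a weighted score to the review level from the threshold ladder."""
    if score >= 16:
        return "executive visibility within 48 hours"
    if score > 10:
        return "operations manager escalation"
    if score > 5:
        return "EHS manager review"
    return "local supervisor review"


for value in (3, 8, 13.5, 16):
    print(value, "->", review_owner(value))
```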

The threshold is not a punishment ladder. It is a response-time ladder, which means the organization is deciding in advance how fast serious exposure deserves attention.

As Andreza Araujo argues in *Safety Culture: From Theory to Practice*, safety culture is tested by what happens when pressure rises. A threshold without ownership fails under pressure because everyone sees the red signal and no one owns the next move.

Put the threshold table inside the monthly EHS governance pack and the weekly operations meeting. The model should point to a named owner, a due date, and the evidence required to close the exposure, not to a generic "follow up" note.
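The "named owner, due date, evidence" rule can be enforced at the point where an action is recorded. The sketch below is an assumption-laden illustration: `ExposureAction` and its fields are hypothetical, and the list of rejected phrases is only an example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ExposureAction:
    """One action attached to a scored exposure."""
    event_id: str
    owner: str
    due: date
    closure_evidence: str

    def __post_init__(self) -> None:
        if not self.owner.strip():
            raise ValueError("every exposure needs a named owner")
        # Reject the generic note the text warns about.
        if self.closure_evidence.strip().lower() in {"", "follow up", "follow-up"}:
            raise ValueError("state the evidence required to close the exposure")


action = ExposureAction(
    event_id="E1",
    owner="Operations manager, Line 2",
    due=date(2026, 6, 19),
    closure_evidence="Isolation re-verified in the field and signed off by the permit issuer",
)
```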

Step 8: Calibrate the model every month

Step 8 is to run a 30-minute calibration meeting every month for the first 3 months, then quarterly after the model stabilizes. Calibration compares 5 to 10 recently scored events and asks whether different people would classify them the same way.

OSHA describes leading indicators as proactive and preventive measures that can reveal potential problems before harm occurs. Severity weighting becomes a stronger leading signal when calibration reduces personal bias and keeps the same event from receiving 3 different scores across 3 sites.
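For the calibration meeting, a small check can show which events different sites scored differently. This is a sketch with hypothetical function names, site labels, and sample ratings.

```python
def level_spread(ratings: dict[str, dict[str, int]]) -> dict[str, int]:
    """For each event, the gap between the highest and lowest severity level assigned."""
    return {
        event: max(by_site.values()) - min(by_site.values())
        for event, by_site in ratings.items()
    }


def agreement_rate(ratings: dict[str, dict[str, int]]) -> float:
    """Share of events on which every rater chose the same level."""
    spreads = level_spread(ratings)
    return sum(1 for gap in spreads.values() if gap == 0) / len(spreads)


# Hypothetical calibration sample: the same two events scored by three sites.
sample = {
    "dropped-wrench": {"site_a": 4, "site_b": 4, "site_c": 4},
    "missed-isolation": {"site_a": 5, "site_b": 4, "site_c": 3},
}
print(level_spread(sample))    # {'dropped-wrench': 0, 'missed-isolation': 2}
print(agreement_rate(sample))  # 0.5
```

Events with a non-zero spread are the natural agenda for the 30-minute session, since they show where the one-page guide is being read differently.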

Companies often downplay calibration because it feels like administration. In practice, calibration is where the company discovers whether supervisors, EHS analysts, and operations managers share the same understanding of SIF exposure.

Use the first calibration to compare the weighted model with leading indicator quality. If the highest scores come from weak data, fix the data capture. If strong field signals never reach the model, fix the reporting channel.

Comparison: raw counts vs severity weighting

A raw-count dashboard measures volume, while a severity-weighted dashboard measures decision urgency. The comparison matters because a site can improve its recordable count by 30% and still carry an uncontrolled SIF exposure that deserves immediate executive attention.

Dashboard choice	What it shows	What it hides	Best use
Raw incident count	Number of cases in a period	Severity, failed controls, and SIF potential	Basic trend tracking
TRIR or LTIFR	Recordable or lost-time frequency	High-potential near misses and underreporting	External comparison with caution
Severity weighting	Decision urgency based on harm potential	Data quality problems if definitions are weak	Weekly prioritization and escalation
Control assurance score	Whether critical controls still work	Injury outcomes when no event occurred	Barrier restoration and field verification
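The contrast between the first and third rows can be made concrete with the earlier example of 3 hand cuts versus 1 dropped-object near miss. The sketch below applies the Step 5 formula (severity level x probability band x critical-control multiplier); the specific levels, bands, and multipliers assigned to each event are assumptions for illustration.

```python
# Each event is (severity level, probability band, critical-control multiplier).
site_a = [(1, 2, 1.0)] * 3  # three hand cuts, assumed Level 1 and "possible"
site_b = [(4, 2, 2.0)]      # one dropped-object near miss with a failed critical control


def raw_count(events) -> int:
    """What a raw-count dashboard shows."""
    return len(events)


def top_weighted_score(events) -> float:
    """Highest weighted score at the site, used here as the urgency signal."""
    return max(level * band * mult for level, band, mult in events)


print(raw_count(site_a), top_weighted_score(site_a))  # 3 2.0
print(raw_count(site_b), top_weighted_score(site_b))  # 1 16.0
```

Site A looks worse by count, while Site B carries the exposure that would reach executive visibility under the Step 7 thresholds.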

The best dashboard does not replace one view with another. It combines weighted severity, control assurance evidence, leading indicators, and lagging indicators so leaders do not confuse a quiet month with a safe system.

Conclusion: weight what can change a life

A severity weighting model helps EHS managers move from counting events to ranking exposures, especially when SIF potential, failed critical controls, and weak signals are present but injury frequency still looks acceptable.

If your dashboard still rewards low numbers while field leaders worry about serious exposure, start with this 30-day model and connect it to Andreza Araujo's safety culture diagnostics, Safety School resources, or ACS Global Ventures consulting.

Frequently asked questions

What is a severity weighting model in safety metrics?

A severity weighting model gives a higher score to events with greater harm potential, such as SIF exposure, permanent disability risk, fatality potential, or failed critical controls. It does not replace TRIR, LTIFR, DART, or near-miss volume. It adds a decision layer so leaders can see which events require faster escalation, stronger control verification, or executive review.

How long does it take to build a severity weighting model?

A practical first version can be built in 30 days. Use the first week to define the decision and severity scale, the second week to add probability and critical-control multipliers, the third week to test 12 months of data, and the fourth week to set thresholds, owners, and calibration routines.

Should SIF potential always outrank minor injury frequency?

Credible SIF potential should receive automatic escalation even when probability looks low. Minor injury frequency still matters because patterns can reveal weak controls, but a high-potential event with failed isolation, fall protection, lifting control, or machine guarding deserves a different governance path. Andreza Araujo's critique in *Muito Além do Zero* supports this distinction between protecting the number and protecting life.

What is the difference between severity weighting and TRIR?

TRIR measures recordable incident frequency, usually normalized by hours worked. Severity weighting ranks events by harm potential, recurrence probability, and failed controls. TRIR helps with external comparison, while severity weighting helps managers decide what to verify, fund, and escalate this week.

How does severity weighting connect with leading indicators?

Severity weighting improves leading indicators by telling the organization which weak signals deserve priority. A leading indicator may count inspections, observations, or near-miss reports, while severity weighting asks whether those signals involve serious exposure, failed critical controls, or repeated patterns. It pairs naturally with leading indicator quality audits.

About the author

Andreza Araújo

Safety Culture Expert | Senior EHS Executive

Andreza Araújo is a safety culture expert and senior EHS executive with more than 25 years of experience in environment, health and safety. She is a Civil Engineer and Occupational Safety Engineer from Unicamp, holds a Master's degree in Environmental Diplomacy from the University of Geneva, and completed sustainability studies at IMD Switzerland. Andreza has served in Global Head of EHS roles in Fortune 500 environments, leading cultural transformation programs across multinational operations. She has represented Brazil as a speaker at the United Nations in Paris and has spoken at the International Labour Organization in Turin. She is the author of more than 16 books on safety culture in Portuguese, Spanish, English and German. Her work has earned more than 10 EHS awards, including two recognitions from Indra Nooyi, former PepsiCo CEO.

Civil & Safety Engineer (Unicamp)
M.A. Environmental Diplomacy (University of Geneva)
Sustainability Cert (IMD Switzerland)
People Management & Coaching (Ohio University)
UN Paris speaker representative for Brazil
ILO Turin speaker
LinkedIn Top Voice
Indra Nooyi PepsiCo CEO recognition (2x)
