Control Assurance: Audits vs Checks vs Field Evidence

Control assurance needs more than audit scores, because leaders need current proof that critical controls still work where real work happens.

By Andreza Araújo

Key takeaways

  1. Separate audit scores, control checks and field evidence because each one answers a different safety decision.
  2. Use audit scores for management-system discipline, not as proof that fatal-risk barriers are working today.
  3. Prioritize control checks when leaders need current evidence that critical controls are present, functional and owned.
  4. Add field evidence to expose work as performed, especially when dashboards are green but reporting is quiet.
  5. Request an Andreza Araújo safety-culture diagnostic when executive confidence depends on proving control, not protecting a score.

Control assurance is the difference between a safety dashboard that looks calm and a safety system that can prove its barriers still work. Audit scores, control checks and field evidence all have value, but they answer different questions. Treating them as interchangeable is how leaders end up with green reports while risk is quietly accumulating in the work.

The central thesis is simple enough to test in any plant, logistics site or construction project. Audit scores tell leaders whether the management system has evidence. Control checks tell them whether a defined barrier exists and is working today. Field evidence tells them whether the work, as performed by real people under real pressure, matches the story in the procedure, the training file and the monthly dashboard.

Across 25+ years leading environment, health and safety (EHS) work in multinational organizations, Andreza Araújo has seen that executives often ask for more metrics when they actually need better proof. In her Portuguese title *Diagnóstico de Cultura de Segurança* (Safety Culture Diagnosis), she argues that quantity is not a synonym for quality and commitment in health and safety. That position matters here because a metric pack can grow every month while the organization still has no reliable answer to a more serious question: which critical controls would fail if production pressure increased tomorrow?

Evaluation Criteria For Safety Decisions

The right metric method depends on the decision being made. A board that is deciding whether to fund machine guarding upgrades needs a different signal than an EHS manager who is testing whether permit-to-work quality improved after supervisor coaching. Because the decision changes, the proof standard also changes.

Use five criteria before choosing the method. First, decide whether the question is about compliance, control effectiveness, behavior under pressure, trend movement or executive accountability. Second, define the time horizon, since an annual audit and a weekly field verification cannot carry the same signal. Third, test whether the method can reveal weak signals before injury data appears. Fourth, decide who can challenge the result. Fifth, identify whether the number can be gamed, especially when targets influence bonuses, reputation or promotion.

This is where metric hygiene becomes a governance issue rather than a data-cleaning exercise. If definitions change by site, if missing observations are quietly excluded, or if green status depends on self-reported closure, the dashboard may reward the easiest story instead of the safest work.
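The effect of quietly excluding missing observations can be shown with simple arithmetic. The sketch below is illustrative only: the function name `pass_rate`, the `count_missing_as_fail` flag and the sample week are assumptions, not part of any real reporting system.

```python
from typing import Optional


def pass_rate(results: list[Optional[bool]], count_missing_as_fail: bool) -> float:
    """Share of checks that passed.

    results: True = passed, False = failed, None = check never performed.
    """
    if count_missing_as_fail:
        denominator = len(results)
    else:
        # Missing checks silently drop out of the denominator.
        denominator = sum(r is not None for r in results)
    if denominator == 0:
        raise ValueError("no checks to score")
    passed = sum(r is True for r in results)
    return passed / denominator


# Hypothetical week: 10 planned checks, 6 passed, 1 failed, 3 never performed.
week = [True] * 6 + [False] + [None] * 3

lenient = pass_rate(week, count_missing_as_fail=False)  # 6 / 7
strict = pass_rate(week, count_missing_as_fail=True)    # 6 / 10
print(f"missing excluded: {lenient:.0%}, missing counted: {strict:.0%}")
# prints "missing excluded: 86%, missing counted: 60%"
```

The same field data produces a comfortable number or an uncomfortable one depending on a definition that rarely appears on the dashboard itself.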

Option 1: Audit Scores

Audit scores are useful when leaders need a structured view of management-system discipline. ISO 45001:2018 covers monitoring, measurement, analysis and performance evaluation, together with internal audit, under Clause 9 (Performance evaluation) of the occupational health and safety management system, which means audit evidence belongs in the assurance architecture. The weakness begins when a high score is treated as proof that the field is controlled.

An audit score usually measures whether evidence exists, whether procedures are documented, whether training records are complete and whether responsible people can explain the system. Those are legitimate questions. They protect against improvisation, undocumented decisions and the slow erosion of accountability. They also create a common language across sites, which is helpful when a regional EHS director needs comparability across countries.

The trap is that audit scores are often periodic, prepared and heavily mediated by documentation. A site can score well because the right binder exists, because the interviewees are rehearsed, or because the audit samples miss the high-risk work that happens at night, during maintenance or under contractor pressure. As Andreza Araújo often emphasizes in cultural diagnosis work, the official version of safety and the operated version of safety can diverge sharply, especially when the organization has learned to protect the score.

Choose audit scores when the decision is about system maturity, certification readiness, governance coverage or consistency between sites. Do not choose them as the primary proof for serious injury and fatality prevention unless they are paired with real-time verification of the barriers that would prevent the event.

Option 2: Control Checks

Control checks are stronger than audit scores when the question is whether a named barrier is present, functional and owned. A control check does not ask whether the procedure exists. It asks whether the interlock works, whether the isolation was tested, whether the rescue kit is complete, whether the scaffold tag matches the actual condition, or whether the supervisor has authority to stop work when the control is missing.
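One way to keep the three conditions (present, functional, owned) from collapsing into a single tick box is to record them separately. The record layout below is a minimal sketch; the class name `ControlCheck` and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ControlCheck:
    """One verification of a named barrier (hypothetical record layout)."""
    control: str            # e.g. "machine interlock"
    present: bool           # the barrier physically exists at the job
    functional: bool        # it was tested and works today
    owner: Optional[str]    # named person with authority to stop work

    def gaps(self) -> list[str]:
        """Return the reasons this check cannot be reported as passed."""
        found = []
        if not self.present:
            found.append("not present")
        if not self.functional:
            found.append("not functional")
        if not self.owner:
            found.append("no owner")
        return found

    def passed(self) -> bool:
        # A check passes only when all three conditions hold.
        return not self.gaps()


check = ControlCheck("scaffold tag", present=True, functional=True, owner=None)
print(check.passed(), check.gaps())  # False ['no owner']
```

Keeping the gaps as a list means a failed check already says which of the three conditions broke, which is the information a leader needs to act.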

This method fits high-risk work because it moves from paperwork to barrier condition. It also improves executive conversations. A plant manager can debate whether a 94 percent audit score is good enough, but it is much harder to normalize a failed control check on a fatal-risk task. The discussion becomes concrete, which is why control verification belongs beside serious injury and fatality (SIF) prevention rather than buried in a generic compliance dashboard.

There is still a failure mode. Control checks become theater when they turn into tick-box routines, especially when the verifier is rushed, lacks technical competence, or feels pressure to avoid stopping production. The article on safety KPIs, bonuses and control checks explains why incentives can distort the signal when the organization rewards green status more than honest escalation.

Choose control checks when the decision involves fatal-risk exposure, contractor work, permit-to-work quality, barrier restoration after incidents, or capital prioritization. The method is weaker for culture diagnosis by itself because it can tell leaders what failed, although it may not explain why people accepted the failure.

Option 3: Field Evidence

Field evidence is the most revealing option when leaders need to know whether work as done matches work as imagined. It includes structured observations, worker interviews, photos of actual conditions, pre-task conversations, near-miss narratives, supervisor debriefs and evidence from deviations that did not become injuries. The point is not to collect more stories, but to test whether the system behaves as designed when real constraints appear.

This is the option most leaders underestimate because it is less tidy than an audit score. It requires judgment, calibration and time in the operation. It can also expose uncomfortable contradictions. The procedure may say that a task requires two people, while the night shift routinely performs it with one. The dashboard may show closed corrective actions, while the same crew keeps building informal workarounds because the original fix slowed the job without reducing risk.

Field evidence is especially valuable for weak signals. The board may see no injuries, but the field may already be showing equipment bypasses, repeated delays in closing high-risk actions, unexplained silence in near-miss reporting, or supervisors who avoid escalation because escalation is interpreted as poor performance. That is why weak-signal metrics for boards should include evidence from the worksite, not only aggregate injury rates.

Choose field evidence when the decision is about cultural truth, operational drift, leadership behavior, procedure usability or the credibility of reported metrics. Its weakness is comparability, since qualitative evidence must be coded carefully if leaders want to compare one site with another. In practice, the solution is not to discard field evidence, but to standardize the sampling method and keep the evidence close to the decision.
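Standardizing the sampling method can be as simple as drawing the same number of tasks from every shift with a recorded random seed. The sketch below assumes hypothetical task lists and a hypothetical helper named `sample_visits`; it is one possible approach, not a prescribed method.

```python
import random


def sample_visits(tasks_by_shift, per_shift, seed):
    """Pick the same number of tasks from every shift.

    tasks_by_shift: dict mapping shift name -> list of planned high-risk tasks.
    per_shift: how many tasks to observe in each shift (assumed sample size).
    seed: fixed seed so reviewers can reproduce and audit the draw.
    """
    rng = random.Random(seed)
    plan = {}
    for shift in sorted(tasks_by_shift):  # sorted so the draw order is stable
        tasks = tasks_by_shift[shift]
        plan[shift] = rng.sample(tasks, min(per_shift, len(tasks)))
    return plan


# Hypothetical task lists for one site.
planned = {
    "day": ["confined-space entry", "forklift loading", "roof access", "lockout"],
    "night": ["conveyor maintenance", "lockout", "tank cleaning"],
}
plan = sample_visits(planned, per_shift=2, seed=7)
print(plan)
```

Because night work is sampled at the same rate as day work, the evidence is harder to skew toward the shifts that are easiest to visit.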

Decision Matrix

The comparison below is a practical way to decide which option should lead the assurance process. Most mature organizations need all three, although one method should be primary for each decision.

| Decision need | Audit scores | Control checks | Field evidence |
|---|---|---|---|
| Certification or system discipline | Strong | Supporting | Supporting |
| Fatal-risk barrier confidence | Weak alone | Strong | Strong when sampled near work |
| Executive dashboard credibility | Supporting | Strong for critical controls | Strong for weak signals |
| Culture diagnosis | Weak alone | Supporting | Strong |
| Cross-site comparability | Strong | Moderate if definitions are stable | Moderate if coded consistently |
| Resistance to gaming | Moderate to weak | Moderate | Strong when triangulated |

The matrix creates a practical rule. Audit scores are the best entry point for governance, control checks are the best entry point for high-risk reliability, and field evidence is the best entry point for truth. When leaders confuse those roles, they ask a weak instrument to answer a strong question.
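For teams that want the matrix available inside a reporting tool, it can be held as a small lookup. The ratings below are copied from the matrix; the names `MATRIX`, `METHODS` and `strong_methods` are illustrative assumptions.

```python
METHODS = ("Audit scores", "Control checks", "Field evidence")

# Ratings per decision need, in the same order as METHODS.
MATRIX = {
    "Certification or system discipline": ("Strong", "Supporting", "Supporting"),
    "Fatal-risk barrier confidence": ("Weak alone", "Strong", "Strong when sampled near work"),
    "Executive dashboard credibility": ("Supporting", "Strong for critical controls", "Strong for weak signals"),
    "Culture diagnosis": ("Weak alone", "Supporting", "Strong"),
    "Cross-site comparability": ("Strong", "Moderate if definitions are stable", "Moderate if coded consistently"),
    "Resistance to gaming": ("Moderate to weak", "Moderate", "Strong when triangulated"),
}


def strong_methods(need):
    """Return the methods whose rating for this decision need starts with 'Strong'."""
    ratings = MATRIX[need]
    return [method for method, rating in zip(METHODS, ratings) if rating.startswith("Strong")]


print(strong_methods("Fatal-risk barrier confidence"))
# ['Control checks', 'Field evidence']
```

The lookup deliberately returns every strong candidate rather than a single winner, since choosing the primary method for a decision remains a judgment call.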

Recommendations By Context

For a board or C-level team, the best monthly view is not a single safety score. It is a three-layer assurance pack that separates system discipline, critical-control status and field truth. The board should see the audit trend, but it should also see failed control checks on fatal-risk work and a short narrative of the weak signals that do not yet appear in injury statistics.

For an EHS manager, control checks should carry more weight than audit scores when resources are limited. A low audit score may reveal system disorder, but a failed control on energized work, confined space, working at height or vehicle-pedestrian separation tells the EHS manager where loss potential is alive today. The article on SIF rate, TRIR (total recordable incident rate) and precursor indicators expands this point because severe-risk metrics need a different logic from general recordability.

For a site manager, field evidence should be used to challenge comfort. If the site has low injury rates, high audit scores and little reporting, that silence should not be celebrated. In more than 250 cultural transformation projects supported by Andreza Araújo's team, one recurring pattern is that organizations often find their most important risk information after they stop treating bad news as an embarrassment.

For a regional EHS analyst, statistical tools still matter. A signal that appears in one field visit may be anecdotal, while a signal repeated across sites deserves escalation. That is why statistical process control (SPC), run charts and heat maps should sit beside the assurance model. Trend tools show movement, but field and control evidence explain what the movement means.
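As one example of a trend tool, a standard p-chart can flag sites whose share of failed control checks sits above the upper control limit, computed as the pooled failure proportion plus three standard errors for that site's sample size. The sketch below assumes hypothetical site counts; the function name `p_chart_flags` is illustrative.

```python
import math


def p_chart_flags(counts):
    """Flag sites above the p-chart upper control limit.

    counts: dict mapping site -> (failed_checks, total_checks).
    Returns: dict mapping site -> True if its failure share exceeds the limit.
    """
    total_failed = sum(failed for failed, _ in counts.values())
    total_checks = sum(n for _, n in counts.values())
    p_bar = total_failed / total_checks  # pooled failure proportion

    flags = {}
    for site, (failed, n) in counts.items():
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        upper_limit = min(1.0, p_bar + 3 * sigma)
        flags[site] = failed / n > upper_limit
    return flags


# Hypothetical month of control checks at four sites.
month = {"A": (4, 100), "B": (5, 100), "C": (6, 100), "D": (20, 100)}
print(p_chart_flags(month))
# {'A': False, 'B': False, 'C': False, 'D': True}
```

A flag only says that site D's result is unlikely to be ordinary variation. Whether it reflects worse controls or more honest checking still has to come from field evidence.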

Board Questions That Expose Weak Assurance

A board does not need to become a technical audit team, but it does need better questions. The first question is whether the organization can name its critical controls for fatal-risk scenarios and prove their condition with current evidence. If the answer is a procedure list, the assurance model is still immature.

The second question is whether red signals are increasing because risk is getting worse or because reporting is becoming healthier. Andreza Araújo's critique in *Muito Além do Zero* (Far Beyond Zero) is useful here because chasing the absence of accidents can end up protecting the number instead of protecting life. A company that punishes red signals will eventually get fewer red signals, although that does not mean it got safer.

The third question is whether the same evidence would survive an unannounced visit to the field. This is where many executive dashboards fail. They are built from data that has already passed through filters, summaries and status meetings, which can remove the hesitation, conflict and uncertainty that made the original field signal valuable.

The fourth question is who owns the decision after assurance fails. If a failed control check creates a dashboard note but no resource decision, no design correction and no operating authority, the organization has recorded risk rather than controlled it. That is a governance failure, not a reporting gap.

Where To Start In The Next 30 Days

Start by choosing one fatal-risk scenario and mapping the three proof layers. Do not begin with every site and every metric. Select one scenario such as machine intervention, loading dock traffic, confined-space entry, work at height or energized maintenance. List the audit evidence, the critical controls and the field evidence that would prove whether the work is controlled.

During the first week, clean the definitions. Decide what counts as a control check, what counts as field evidence, who can verify the item and what makes the evidence unacceptable. During the second week, sample the work without warning the team to prepare a performance. During the third week, compare the official score with the control condition and field reality. During the fourth week, present the contradictions to leadership as decisions, not as observations.

The most important output is not a prettier dashboard. It is a sharper conversation about risk. If the audit score is green but field evidence shows shortcuts, leaders should fund supervision, redesign or workload correction. If control checks are failing repeatedly, leaders should stop treating the problem as awareness and identify the constraint that makes the control hard to maintain. If field evidence is silent, leaders should investigate whether people believe it is safe to tell the truth.
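The three rules in the paragraph above can be written down so that each contradiction arrives at leadership as a decision. This is a minimal sketch: the function name, its parameters and the `repeat_threshold` of two failures are assumptions chosen for illustration, since the article does not define what counts as "repeatedly".

```python
def leadership_decisions(audit_green, field_shortcuts, failed_checks,
                         field_reports, repeat_threshold=2):
    """Translate contradictions between the three proof layers into decisions.

    audit_green: True if the audit score is reported as green.
    field_shortcuts: True if field evidence shows shortcuts.
    failed_checks: number of failed checks on the same control in the period.
    field_reports: number of field reports received in the period.
    repeat_threshold: assumed cut-off for "failing repeatedly".
    """
    decisions = []
    if audit_green and field_shortcuts:
        decisions.append("fund supervision, redesign or workload correction")
    if failed_checks >= repeat_threshold:
        decisions.append("identify the constraint that makes the control hard to maintain")
    if field_reports == 0:
        decisions.append("investigate whether people believe it is safe to tell the truth")
    return decisions


print(leadership_decisions(audit_green=True, field_shortcuts=True,
                           failed_checks=3, field_reports=5))
# ['fund supervision, redesign or workload correction',
#  'identify the constraint that makes the control hard to maintain']
```

An empty list does not prove the work is controlled. It only means none of these three contradictions appeared in the evidence that was collected.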

Control assurance works when the organization stops asking one metric to carry every meaning. Audit scores protect system discipline, control checks protect barrier reliability, and field evidence protects truth. The executive task is to keep those meanings separate long enough for the right decision to become visible.


Frequently asked questions

What is control assurance in safety?
Control assurance is the disciplined process of proving that the controls meant to prevent serious harm are defined, present, functional and owned. It goes beyond checking whether a procedure exists.

Are audit scores enough for a safety dashboard?
Audit scores are useful for governance and system discipline, but they are not enough for a safety dashboard that guides serious-risk decisions. They should be paired with control checks and field evidence.

When should leaders use field evidence?
Leaders should use field evidence when they need to know whether work as performed matches the procedure, the training file and the dashboard. It is especially useful for weak signals, operational drift and culture diagnosis.

How does this connect with Andreza Araújo's safety-culture work?
Andreza Araújo's books and diagnostic work argue that good indicators do not automatically prove good practices. This article applies that thesis to executive assurance, where evidence quality matters more than score volume.

What should a board ask about safety metrics?
A board should ask which critical controls prevent fatal events, how recently those controls were verified, what field evidence challenges the dashboard and who owns the decision when assurance fails.

About the author

Andreza Araújo

Safety Culture Expert | Senior EHS Executive

Andreza Araújo is a safety culture expert and senior EHS executive with more than 25 years of experience in environment, health and safety. She is a Civil Engineer and Occupational Safety Engineer from Unicamp, holds a Master's degree in Environmental Diplomacy from the University of Geneva, and completed sustainability studies at IMD Switzerland. Andreza has served in Global Head of EHS roles in Fortune 500 environments, leading cultural transformation programs across multinational operations. She has represented Brazil as a speaker at the United Nations in Paris and has spoken at the International Labour Organization in Turin. She is the author of more than 16 books on safety culture in Portuguese, Spanish, English and German. Her work has earned more than 10 EHS awards, including two recognitions from Indra Nooyi, former PepsiCo CEO.

  • Civil & Safety Engineer (Unicamp)
  • M.A. Environmental Diplomacy (University of Geneva)
  • Sustainability Cert (IMD Switzerland)
  • People Management & Coaching (Ohio University)
  • UN Paris speaker representative for Brazil
  • ILO Turin speaker
  • LinkedIn Top Voice
  • Indra Nooyi PepsiCo CEO recognition (2x)
