Triaging Critical Incidents
This guide builds on Setting up Monitoring and Validation and demonstrates how to approach investigating, triaging, and resolving critical incidents using Validio's automated Root Cause Analysis feature.

Source validators with detected incidents
Understanding Incidents in Validio
An incident is a data quality issue detected by a Validator when data deviates from configured thresholds. Incidents represent potential problems that require investigation and resolution to maintain data reliability. Each incident has a severity level and a priority to help you triage and determine which to address first.
Validio automatically categorises incidents into different severity levels (High, Medium, Low) based on their deviation. The severity level can indicate:
- High - Significant problems that may impact business operations and data consumers
- Medium - Moderate issues that require attention but are not urgent
- Low - Minor deviations that should be monitored
Incidents inherit the priority (Critical, High, Medium, Low, None) that you assigned to the sources and validators that detect them.
For detailed information about how incidents work, see About Validator Incidents.
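As a rough illustration of how priority and severity can work together during triage, the sketch below orders a handful of incidents by inherited priority first and severity second. The `Incident` record, the numeric ranks, the tie-breaking rule, and every source name other than gold__sales_summary are assumptions made for this example; they are not part of Validio's API.

```python
from dataclasses import dataclass

# Levels are the ones named in this guide; the numeric ranks are illustrative.
PRIORITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3, "None": 4}
SEVERITY_RANK = {"High": 0, "Medium": 1, "Low": 2}


@dataclass
class Incident:
    """Hypothetical record for this sketch; not a Validio API object."""
    validator: str
    source: str
    priority: str
    severity: str


def triage_order(incidents):
    """Sort by inherited priority first, then by severity (an assumed rule)."""
    return sorted(
        incidents,
        key=lambda i: (PRIORITY_RANK[i.priority], SEVERITY_RANK[i.severity]),
    )


incidents = [
    Incident("Row count", "bronze__orders", "Low", "High"),
    Incident('Mean of "total_sales_amount"', "gold__sales_summary", "Critical", "High"),
    Incident("Freshness", "silver__customers", "High", "Medium"),
    Incident("Null count", "gold__sales_summary", "Critical", "Low"),
]

queue = triage_order(incidents)
for incident in queue:
    print(f"{incident.priority:<8} {incident.severity:<6} {incident.source}: {incident.validator}")
```

With this ordering, a low-severity incident on a critical source is handled before a high-severity incident on a low-priority source. Whether that trade-off suits your team is a judgment call.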
Prerequisites
Before starting this tutorial, ensure you have:
- At least one configured source with active validators
- Appropriate permissions to view and manage incidents in Validio
Investigating Incidents
One of your validators on a critical source (gold__sales_summary) has detected a series of high-severity incidents.
To start your investigation:
- Navigate to the details page for the validator (Mean of "total_sales_amount"). The validator details page lists all the incidents discovered by this validator. To help you understand the scope and scale of the problem, the table includes the actual values and deviations that triggered each incident.

Mean of "total_sales_amount" validator details
- Select one of the incidents, and change its Status from Triage to Investigating.

Updating the status of an incident
- On the same incident, click Debug to open the debug panel, then click Load samples to view its anomalous records.

Investigating with Debug
When investigating individual incidents, the debug feature generates sample SQL code from captured incidents, allowing you to investigate the exact data causing quality issues directly in your database environment. See Debugging an Incident.
- Click the ⋮ menu and select Open group details to see more information about the incidents in this group.

Viewing the incident group overview
The incident group details page provides an overview of all the incidents and their status. The page also includes an Activity log where you can record your findings with Comments, and the Root Cause Analysis tab where you can review Validio's analysis of the incident group.
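To make the debug step more concrete, here is a minimal sketch of the kind of query you might run in your own database when following up on a mean validator. It is not the SQL that Validio generates. The in-memory SQLite table, the `sale_date` column, the sample rows, and the threshold bounds are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical bounds for the daily mean; real thresholds come from the validator.
LOWER_BOUND, UPPER_BOUND = 90.0, 110.0

# In-memory stand-in for the warehouse table, filled with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gold__sales_summary (sale_date TEXT, total_sales_amount REAL)"
)
conn.executemany(
    "INSERT INTO gold__sales_summary VALUES (?, ?)",
    [
        ("2024-01-01", 98.0), ("2024-01-01", 104.0),
        ("2024-01-02", 101.0), ("2024-01-02", 99.0),
        ("2024-01-03", 12.0), ("2024-01-03", 15.0),  # the anomalous day
    ],
)

# Find the days whose mean falls outside the assumed bounds.
anomalous_days = conn.execute(
    """
    SELECT sale_date, AVG(total_sales_amount) AS mean_amount
    FROM gold__sales_summary
    GROUP BY sale_date
    HAVING AVG(total_sales_amount) NOT BETWEEN ? AND ?
    ORDER BY sale_date
    """,
    (LOWER_BOUND, UPPER_BOUND),
).fetchall()

# Pull the underlying records for each flagged day.
for sale_date, mean_amount in anomalous_days:
    rows = conn.execute(
        "SELECT total_sales_amount FROM gold__sales_summary WHERE sale_date = ?",
        (sale_date,),
    ).fetchall()
    print(f"{sale_date}: mean={mean_amount:.2f}, rows={[r[0] for r in rows]}")
```

Aggregating first and then drilling into the flagged window keeps the sample small, which mirrors the idea of inspecting only the records behind an incident.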
Leveraging Root Cause Analysis

Root cause analysis
Validio Root Cause Analysis (RCA) automatically identifies relationships between incidents to help you understand the underlying causes and downstream impacts.
To triage critical incidents with RCA:
- Navigate to the Root cause tab. Validio's automated analysis of the incident group includes a lineage map and an RCA relationships table.
- Review the lineage map, which provides insight into:
  - Data flow relationships - How data moves through your pipeline
  - Upstream dependencies - Sources that may have caused the incident
  - Downstream impacts - Systems and processes affected by the issue
  - Correlation patterns - Related incidents occurring simultaneously
- Review the RCA relationships table, which categorizes relationships to help explain the root cause of the incident:
  - Causal - Direct cause-and-effect relationships
  - Correlational - Incidents that occurred around the same time
  - Impact-based - Downstream effects of upstream issues
For more detailed information about the lineage map and RCA relationships, see Root Cause Analysis.
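The idea behind reading a lineage map for upstream dependencies can be sketched as a graph walk: start at the source with the incident and look upstream for other sources that also have incidents. The example below is only an illustration of that reasoning, not a description of how Validio's RCA works. The lineage dictionary, the set of sources with incidents, and every source name other than gold__sales_summary are hypothetical.

```python
from collections import deque

# Hypothetical lineage: each source maps to the sources it reads from.
UPSTREAMS = {
    "gold__sales_summary": ["silver__orders", "silver__customers"],
    "silver__orders": ["bronze__orders"],
    "silver__customers": ["bronze__customers"],
    "bronze__orders": [],
    "bronze__customers": [],
}

# Hypothetical set of sources that currently have open incidents.
SOURCES_WITH_INCIDENTS = {"gold__sales_summary", "silver__orders", "bronze__orders"}


def upstream_candidates(start):
    """Walk upstream breadth-first from `start`.

    Returns (source, hops) pairs for upstream sources that also have
    incidents, nearest first.
    """
    seen = {start}
    queue = deque([(start, 0)])
    candidates = []
    while queue:
        source, depth = queue.popleft()
        for parent in UPSTREAMS.get(source, []):
            if parent in seen:
                continue
            seen.add(parent)
            if parent in SOURCES_WITH_INCIDENTS:
                candidates.append((parent, depth + 1))
            queue.append((parent, depth + 1))
    return candidates


candidates = upstream_candidates("gold__sales_summary")
for source, hops in candidates:
    print(f"{source} ({hops} hop(s) upstream)")
```

In this made-up graph, the farthest upstream source with an incident (bronze__orders) is a natural first place to look, since problems there can propagate to everything downstream.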
Updating Incident Status
Based on your investigation and Validio's root cause analysis, update the status and document your findings.
- Choose the appropriate Status based on your investigation:
  - Resolved - Issue has been fixed and validated
  - False Positive - Incident was incorrectly flagged (provides model feedback)
  - Investigating - Requires additional analysis or external coordination
  - Triage - Initial assessment complete, awaiting detailed investigation
- Document your reasoning when changing the status. Add comments that include resolution details or notes for future reference.
Keeping incident statuses up to date is important for improving Validio's anomaly detection algorithms. For example, marking false positives retrains the model, reducing noise and improving accuracy over time. For more information, see Model Feedback and Retraining and Managing Incidents.
Next Steps
This guide demonstrated how to investigate and triage critical incidents in Validio. Next, you will set up notification rules and channels so that you are alerted to future incidents.