Managing Incident Groups

You can manage individual incidents or all incidents in a group using its Group Details page.

Incident Group Actions

At the top of the page, you have buttons for bulk actions on all incidents in the group:

  • Owner–Update or assign ownership to a user who will manage and resolve the incident group.
  • Status–Update the incident group status to track its progress towards resolution.
  • Comments–Add notes to the incident group to facilitate resolving the issue.
  • Mute–Silence repeated notifications from this incident group for a period of time or until you click Unmute. For more information, see Muting Incident Notifications.

📘

Note

If you have notification rules to track when an incident occurs, the notification includes a link directly to the Incident details page where you can manage it.

Update the Incident Owner

The ownership of sources and validators defaults to the user who created the resource. An incident group automatically inherits its owner from the validator that detected the incident, if that validator has an owner. You can then assign a new owner to the incident group during investigation and triage.
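
The inheritance rule can be pictured with a minimal Python sketch. The `Validator` and `IncidentGroup` classes and the `inherit_owner` function are hypothetical names chosen for illustration; they are not part of a Validio API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Validator:
    # Hypothetical model: the owner is usually the creating user, but may be unset.
    name: str
    owner: Optional[str] = None


@dataclass
class IncidentGroup:
    validator: Validator
    owner: Optional[str] = None


def inherit_owner(group: IncidentGroup) -> IncidentGroup:
    """Copy the validator's owner onto the group, if the validator has one."""
    if group.owner is None and group.validator.owner is not None:
        group.owner = group.validator.owner
    return group


group = inherit_owner(IncidentGroup(Validator("row_count_check", owner="alice")))
print(group.owner)  # alice

# Reassigning during triage replaces the inherited owner.
group.owner = "bob"
```

A group whose validator has no owner stays unowned in this sketch until someone assigns one.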

To change or assign an owner:

  1. Click the Owner button at the top of the page.
  2. Select a new owner to apply to the incident group.

Update the Incident Status

You can use the incident status to track the progress of the incident resolution and retrain the anomaly detection algorithms.

To change the status of all the incidents in the group:

  1. Click the Status button at the top of the page.
  2. Select the new status to apply to the incident group.

To change the status of an individual incident:

  1. Under the Status column, click the Triage button for the incident you want to update.
  2. Select the new status to apply to the incident.
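
The difference between the group-level action and the per-incident action can be sketched as follows. This is an illustrative model only: the `Status`, `Incident`, and `IncidentGroup` names are assumptions, and the status values are the four documented on this page.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Status(Enum):
    TRIAGE = "Triage"
    INVESTIGATING = "Investigating"
    RESOLVED = "Resolved"
    FALSE_POSITIVE = "False Positive"


@dataclass
class Incident:
    value: float
    status: Status = Status.TRIAGE  # new incidents start in Triage


@dataclass
class IncidentGroup:
    incidents: List[Incident] = field(default_factory=list)

    def set_status(self, status: Status) -> None:
        """Bulk action: apply one status to every incident in the group."""
        for incident in self.incidents:
            incident.status = status


group = IncidentGroup([Incident(10.0), Incident(12.5), Incident(11.0)])
group.set_status(Status.INVESTIGATING)        # like the Status button at the top
group.incidents[1].status = Status.RESOLVED   # like the Status column for one row
print([i.status.value for i in group.incidents])
# ['Investigating', 'Resolved', 'Investigating']
```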

The following table lists the available status options:

| Status | Description |
| --- | --- |
| Triage | The default status for new incidents. Indicates that the incident requires review. |
| Investigating | The incident is currently being addressed. |
| Resolved | The incident has been resolved. |
| False Positive | The incident has been addressed and is not an anomaly. |

📘

Note

Changing the status of a detected incident to False Positive provides feedback to retrain the anomaly detection algorithms so that they are less likely to wrongly flag similar data points as incidents in the future. This feedback cannot be undone. For more information, see Model Retraining.

Comment on Incidents

Comments are recorded as part of the activities log on the page. You can edit or delete comments.

If you configured notifications to Slack or Microsoft Teams, notification messages posted in those channels include a Comment button. Commenting in Slack or Microsoft Teams also updates the activities log for the relevant incident in Validio.

Also, commenting on incidents in Validio will post a message to the relevant threads in those channels. The message will include the user who commented on the incident, the comment they wrote, and a link back to the incident group details.
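
As a rough illustration of the three parts such a message carries (the commenting user, the comment text, and a link back to the group), here is a minimal sketch. The `format_comment_message` function, the message layout, and the URL are all hypothetical; the actual message format that Validio posts is not specified here.

```python
def format_comment_message(user: str, comment: str, group_url: str) -> str:
    """Build a plain-text thread reply from the three documented parts."""
    return f"{user} commented: {comment}\nView incident group: {group_url}"


message = format_comment_message(
    user="alice",
    comment="Upstream load was delayed; re-running the job.",
    group_url="https://validio.example.com/incident-groups/123",  # placeholder URL
)
print(message)
```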

Group Overview Tab

The Group Details > Overview tab provides a comprehensive summary of the incident, including the current status and owner, with a graph showing the validator metric values over time and a table of the individual incidents. The Overview also displays a log of activities, which includes when the incidents were reported, how many incidents were reported, and a timeline of comments that have been added to the group.

The group summary includes the following information:

| Field | Description |
| --- | --- |
| Priority | The priority (High, Medium, Low) is automatically determined based on the severity of the incidents in the group and how long the incident has been ongoing. |
| First Seen | The date when the first incident in the group occurred. |
| Last Seen | The date when the last incident in the group occurred. |
| Source | The source where the incident occurred. You can click on the source to navigate to its details page. |
| Validator | The validator and metric that captured the incident. You can click on the validator to navigate to its details page. |

Metric Graph

The metric graph displays a history of the field values tracked by the validator. You can see when the incident occurred and the values before and after the incident.

The graph includes information about the severity of the incidents (High, Medium, Low) and a count of the occurrences of each severity. When you hover over a data point in the graph, a tooltip displays the time that the incident occurred, its Value, and its Upper and Lower boundaries.
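
To make the severity counts and tooltip fields concrete, the following sketch computes both from a small, made-up list of incident data points. The `DataPoint` class, the helper functions, the tooltip layout, and the sample values are assumptions for illustration, not Validio's data model.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class DataPoint:
    time: datetime
    value: float
    lower: float   # lower boundary at this point
    upper: float   # upper boundary at this point
    severity: str  # "High", "Medium", or "Low"


def severity_counts(points):
    """Count occurrences of each severity, including levels with zero hits."""
    counts = Counter(p.severity for p in points)
    return {level: counts.get(level, 0) for level in ("High", "Medium", "Low")}


def tooltip(point):
    """Render the fields the hover tooltip is described as showing."""
    return (f"{point.time:%Y-%m-%d %H:%M} | Value: {point.value} | "
            f"Upper: {point.upper} | Lower: {point.lower}")


points = [
    DataPoint(datetime(2024, 1, 1, 9, 0), 130.0, 80.0, 120.0, "High"),
    DataPoint(datetime(2024, 1, 1, 10, 0), 75.0, 80.0, 120.0, "Low"),
    DataPoint(datetime(2024, 1, 1, 11, 0), 125.0, 80.0, 120.0, "Low"),
]
print(severity_counts(points))  # {'High': 1, 'Medium': 0, 'Low': 2}
print(tooltip(points[0]))
```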

Incident Table

The incident table lists the individual incidents in the group and includes the following information:

| Column Name | Description |
| --- | --- |
| Value | The value of the validator metric that caused the incident. |
| Deviation | The prominence of the incident, defined as the difference between Value and the breached boundary. |
| Status | The progress of the incident resolution: Triage, Investigating, Resolved, or False Positive. |
| Severity | The severity of the incident: High, Medium, or Low. |
| Seen At | Relative time when the incident was seen. |
| Reported At | Relative time when the incident was reported. |
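
The Deviation column can be illustrated with a short calculation. This is a sketch under stated assumptions: the `deviation` function is hypothetical, and reporting the result as a non-negative magnitude (and 0.0 for values within the boundaries) is a choice made here, since the sign convention is not documented on this page.

```python
def deviation(value: float, lower: float, upper: float) -> float:
    """Distance from value to the boundary it breached; 0.0 if within bounds."""
    if value > upper:
        return value - upper   # breached the upper boundary
    if value < lower:
        return lower - value   # breached the lower boundary
    return 0.0                 # no breach, so no incident


print(deviation(130.0, lower=80.0, upper=120.0))  # 10.0
print(deviation(75.0, lower=80.0, upper=120.0))   # 5.0
```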

You can use the Debug button to find information to help you troubleshoot the incident. The information that you see depends on the type of source. Debug is not available for all source types. For more information, see Debugging an Incident.

Root Cause Tab

The Group Details > Root Cause tab provides an analysis of the current incident group to help you troubleshoot and resolve the incident. Root cause uses data lineage to trace where the incident occurs, what causes it, and its impacts on related upstream and downstream assets. Validio uses information from these incident groups to identify causal and correlational relationships among them.

For more information, see Root Cause Analysis.

Past Groups Tab

The Group Details > Past Groups tab provides a list of past occurrences of similar incident groups, to give context on how often the same incident has been seen and whether it happens at a regular frequency. You can also use this tab to perform batch operations on all similar incident groups.