Anomaly Detection

VectorFlow continuously monitors your pipeline metrics and automatically detects statistical anomalies -- unusual spikes or drops that may indicate problems with your data pipelines.

What anomalies are detected

The anomaly detector monitors three core metrics for each deployed pipeline:

Metric	Spike anomaly	Drop anomaly
Events In (throughput)	Throughput spike	Throughput drop
Errors Total	Error rate spike	-- (drops are expected)
Latency Mean (ms)	Latency spike	-- (drops are expected)

Error and latency drops are not flagged because a decrease in errors or latency is a positive signal, not an anomaly.
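The table above can be read as a lookup from metric and direction to anomaly type. The following is a minimal sketch of that idea, not VectorFlow's actual code: the metric identifiers come from the Enabled metrics setting, while the `ANOMALY_TYPES` table, the type labels, and the `anomaly_type` function are hypothetical names chosen for illustration.

```python
from typing import Optional

# Hypothetical lookup: (metric, direction) -> anomaly type label.
# Combinations missing from the table (error and latency drops) are not flagged.
ANOMALY_TYPES = {
    ("eventsIn", "spike"): "throughput_spike",
    ("eventsIn", "drop"): "throughput_drop",
    ("errorsTotal", "spike"): "error_rate_spike",
    ("latencyMeanMs", "spike"): "latency_spike",
}


def anomaly_type(metric: str, current: float, mean: float) -> Optional[str]:
    """Return the anomaly type for a value that already exceeded the threshold.

    Returns None when the direction of the deviation is not flagged
    for this metric.
    """
    direction = "spike" if current > mean else "drop"
    return ANOMALY_TYPES.get((metric, direction))
```

Under these assumptions, a large drop in `errorsTotal` yields `None`, while the same drop in `eventsIn` yields a throughput drop.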

Sigma-based detection methodology

Anomaly detection uses a statistical sigma (standard deviation) approach:

Baseline computation -- VectorFlow computes the mean and standard deviation of each metric over a rolling historical window (default: 7 days).
Current comparison -- The most recent metric value is compared against the baseline.
Deviation factor -- The number of standard deviations the current value is from the mean is calculated: deviation = |current - mean| / stddev.
Threshold check -- If the deviation factor exceeds the configured sigma threshold, an anomaly is raised.

A minimum standard deviation floor (default: 5% of the mean) prevents false positives on metrics that are nearly constant. For example, if throughput has been steady at 1000 events/interval with near-zero variance, the floor raises the effective standard deviation to 50 events. A fluctuation of 50 events is then a deviation of only 1 sigma, well below the default 3-sigma threshold, so it is not flagged.

At least 24 data points are required to compute a reliable baseline. New pipelines will not generate anomalies until enough historical data has been collected.
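The steps above, together with the standard deviation floor and the minimum data point requirement, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than VectorFlow's implementation: the function names and parameters are hypothetical, and the use of the population standard deviation is an assumption, since the documentation does not say which estimator is used.

```python
import statistics
from typing import Optional, Sequence


def deviation_factor(
    history: Sequence[float],
    current: float,
    min_stddev_pct: float = 0.05,  # documented default floor: 5% of the mean
    min_points: int = 24,          # documented minimum baseline size
) -> Optional[float]:
    """Return |current - mean| / stddev, or None if no baseline is available."""
    if len(history) < min_points:
        return None  # not enough history yet (e.g. a new pipeline)
    mean = statistics.fmean(history)
    stddev = statistics.pstdev(history)  # assumption: population stddev
    # Apply the floor so nearly constant metrics do not produce huge factors.
    stddev = max(stddev, abs(mean) * min_stddev_pct)
    if stddev == 0:
        return None  # mean and stddev are both zero; the factor is undefined
    return abs(current - mean) / stddev


def is_anomaly(
    history: Sequence[float], current: float, sigma_threshold: float = 3.0
) -> bool:
    """Threshold check: flag only when the factor exceeds the sigma threshold."""
    factor = deviation_factor(history, current)
    return factor is not None and factor > sigma_threshold
```

With 24 samples of exactly 1000 events, the floor gives an effective standard deviation of 50, so a current value of 1050 is about 1 sigma away and is not flagged, while a value of 1200 is about 4 sigma away and is.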

Sensitivity presets

The sigma threshold controls how sensitive the detector is. Lower values catch more anomalies but may produce more false positives:

Preset	Sigma threshold	Description
Sensitive	2.0 sigma	Catches subtle changes. Higher false positive rate.
Moderate	2.5 sigma	Middle-ground sensitivity suitable for most environments.
Balanced	3.0 sigma (default)	Standard statistical significance. Good for stable pipelines.
Relaxed	4.0 sigma	Only flags extreme outliers. Minimal false positives.

The sigma threshold and other parameters can be configured in Settings > System by a super admin. Changes are picked up within 60 seconds.
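To see how the presets differ in practice, the sketch below lists which presets would flag a given deviation factor. The threshold values are taken from the table above; the `SENSITIVITY_PRESETS` dictionary and the `presets_that_flag` helper are hypothetical names used only for this illustration.

```python
from typing import List

# Preset name -> sigma threshold, as listed in the presets table.
SENSITIVITY_PRESETS = {
    "sensitive": 2.0,
    "moderate": 2.5,
    "balanced": 3.0,  # default
    "relaxed": 4.0,
}


def presets_that_flag(deviation: float) -> List[str]:
    """Return the presets under which this deviation factor raises an anomaly."""
    return [
        name
        for name, threshold in SENSITIVITY_PRESETS.items()
        if deviation > threshold
    ]
```

For example, a deviation of 2.7 sigma would be flagged under the Sensitive and Moderate presets but not under Balanced or Relaxed.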

Additional configuration options

Setting	Default	Description
Baseline window	7 days	How much historical data is used for computing the baseline.
Sigma threshold	3.0	Number of standard deviations to trigger an anomaly.
Min stddev floor	5%	Minimum standard deviation as a percentage of the mean.
Dedup window	4 hours	Cooldown before creating a duplicate anomaly for the same pipeline and type.
Enabled metrics	`eventsIn`, `errorsTotal`, `latencyMeanMs`	Which metrics to monitor.

Severity levels

Each detected anomaly is assigned a severity based on how far the metric has deviated:

Severity	Condition
Warning	Deviation is between the sigma threshold and threshold + 1
Critical	Deviation exceeds sigma threshold + 1

For example, with a 3-sigma threshold, a deviation of 3.5 sigma is a warning and a deviation of 4.2 sigma is critical.
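The severity rule can be written as a small function. This is a sketch, not the product's code: the `severity` function is a hypothetical name, and treating a deviation of exactly threshold + 1 as a warning is an assumption, because the table does not state which side the boundary falls on.

```python
from typing import Optional


def severity(deviation: float, sigma_threshold: float = 3.0) -> Optional[str]:
    """Map a deviation factor to a severity, or None if no anomaly is raised."""
    if deviation <= sigma_threshold:
        return None  # within the threshold: nothing to report
    if deviation > sigma_threshold + 1:
        return "critical"
    # Assumption: exactly threshold + 1 still counts as a warning.
    return "warning"
```

With the default threshold, this reproduces the example above: 3.5 sigma maps to a warning and 4.2 sigma maps to critical.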

Viewing anomalies

Anomalies appear in the Anomalies section of the environment dashboard. The list shows:

Pipeline name -- Which pipeline the anomaly was detected on
Type -- The anomaly type (throughput drop, throughput spike, error rate spike, latency spike)
Severity -- Warning or critical
Message -- A human-readable description including the current value, baseline mean, standard deviation, and sigma factor
Detected at -- When the anomaly was first detected

Open anomaly counts are also shown as badges on pipeline cards throughout the UI.

Acknowledging and dismissing anomalies

Anomalies have three statuses:

Open -- Newly detected, awaiting review
Acknowledged -- A team member has reviewed the anomaly and is investigating
Dismissed -- The anomaly has been resolved or determined to be a false positive

From the anomaly list:

Click Acknowledge to mark an anomaly as under investigation
Click Dismiss to close the anomaly

Acknowledging or dismissing anomalies requires the Editor role or above.

Deduplication

To avoid alert fatigue, the detector will not create a new anomaly if an open or acknowledged anomaly already exists for the same pipeline and anomaly type within the deduplication window (default: 4 hours). This means you will see at most one active anomaly per pipeline per type at any given time.
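The deduplication check described above could look roughly like the sketch below. The `Anomaly` record, its field names, and the `should_create` function are hypothetical; they only illustrate the rule that an open or acknowledged anomaly for the same pipeline and type inside the window suppresses a new one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable


@dataclass
class Anomaly:
    pipeline_id: str
    anomaly_type: str
    status: str  # "open", "acknowledged", or "dismissed"
    detected_at: datetime


def should_create(
    existing: Iterable[Anomaly],
    pipeline_id: str,
    anomaly_type: str,
    now: datetime,
    dedup_window: timedelta = timedelta(hours=4),  # documented default
) -> bool:
    """Return False if an active anomaly of this type exists within the window."""
    cutoff = now - dedup_window
    for anomaly in existing:
        if (
            anomaly.pipeline_id == pipeline_id
            and anomaly.anomaly_type == anomaly_type
            and anomaly.status in ("open", "acknowledged")
            and anomaly.detected_at >= cutoff
        ):
            return False
    return True
```

In this sketch, a dismissed anomaly does not block a new one, and neither does an anomaly of a different type on the same pipeline.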

Detection schedule

The anomaly detector runs as a background job on the leader server instance every 5 minutes. It evaluates all deployed (non-draft) pipelines using two kinds of optimized SQL queries:

A batch query to fetch the latest metric values for all pipelines
Per-pipeline baseline queries (cached for 15 minutes) to compute mean and standard deviation

This design ensures detection scales efficiently even with hundreds of pipelines.
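The 15-minute baseline cache can be pictured as a simple time-to-live cache in front of the per-pipeline baseline query. The sketch below is an assumption about how such a cache might work, not a description of VectorFlow's internals: `BaselineCache`, its `loader` callback, and the injectable `clock` are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple

Baseline = Tuple[float, float]  # (mean, stddev)
Key = Tuple[str, str]           # (pipeline_id, metric)


class BaselineCache:
    """Cache baselines per pipeline and metric for a fixed time-to-live."""

    def __init__(
        self,
        loader: Callable[[str, str], Baseline],  # stands in for the SQL query
        ttl_seconds: float = 15 * 60,            # documented cache duration
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Key, Tuple[float, Baseline]] = {}

    def get(self, pipeline_id: str, metric: str) -> Baseline:
        key = (pipeline_id, metric)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh: skip the baseline query
        baseline = self._loader(pipeline_id, metric)
        self._entries[key] = (now, baseline)
        return baseline
```

With a 5-minute detection interval and a 15-minute cache, a design like this would run the baseline query for a given pipeline and metric on roughly every third detection cycle rather than on every cycle.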
