Z-score anomaly detection

I wanted to find times when the available memory of the machines changed significantly. Instead of using a fixed threshold, I decided on a dynamic approach that flags anomalies using the Z-score, which measures how many standard deviations a value is from its historical mean.

Here’s the VictoriaMetrics (MetricsQL) query. It uses ClickHouse metrics, but the same approach should work with any value you track.

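with (
    # ratio of memory tracked by ClickHouse to memory available on the host
    q = (ClickHouseMetrics_MemoryTracking / ClickHouseAsyncMetrics_OSMemoryAvailable),
    # recent level: average over the last 5 hours
    qnow = avg_over_time(q[5h]),
    # baseline: average and standard deviation over the last 14 days
    qavg = avg_over_time(q[14d]),
    qstd = stddev_over_time(q[14d]),
    # Z-score: distance of the recent level from the baseline, in standard deviations
    qz = ((qnow - qavg) / qstd),
    absqz = abs(qz)
) absqz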

You can filter on absqz > 3 to keep only the series whose 5-hour average is more than three standard deviations away from its 14-day average.
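
To make the arithmetic concrete, here is a minimal Python sketch of the same calculation outside of VictoriaMetrics. The function names `z_score` and `is_anomaly` and the sample values are hypothetical, chosen for illustration. It assumes `history` stands in for the 14-day window and `recent` for the 5-hour window, and it uses the population standard deviation, which is what `stddev_over_time` computes. Returning 0 for a perfectly flat history is a choice made here, not something the query does.

```python
from statistics import fmean, pstdev


def z_score(history, recent):
    """Distance of the recent mean from the historical mean, in standard deviations."""
    mean = fmean(history)
    std = pstdev(history)  # population standard deviation
    if std == 0:
        # Assumption for this sketch: a flat history is treated as "no deviation"
        # instead of dividing by zero.
        return 0.0
    return (fmean(recent) - mean) / std


def is_anomaly(history, recent, threshold=3.0):
    """True when the absolute Z-score exceeds the threshold (3 in the text above)."""
    return abs(z_score(history, recent)) > threshold


if __name__ == "__main__":
    # Hypothetical samples: a stable ratio with one jump at the end.
    history = [10.0] * 99 + [20.0]
    recent = history[-1:]  # the long window includes the recent samples
    print(z_score(history, recent), is_anomaly(history, recent))
```

Because the long window includes the most recent samples, a sustained shift will gradually pull the baseline toward the new level, so the score falls back below the threshold over time.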