Methodology

How PowerOut.ai measures AI model performance

What We Measure

PowerOut.ai makes real API calls to 385 AI models across 4 provider gateways every 5 minutes. Each call is a lightweight inference request — a short prompt that requires the model to generate a response. We measure the complete request lifecycle:

  • Response Time (TTFT) — Time from request sent to first token received. This is the latency a user experiences before the model starts "typing."
  • Total Time — Time from request sent to last token received.
  • Connect Time — TCP/TLS handshake duration. Isolates network latency from model processing time.
  • HTTP Status — Whether the API returned successfully, errored, or timed out.
  • Token Counts — Input and output token counts reported by the provider.
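The timing metrics above can be sketched against a streaming response. This is an illustrative example only, not PowerOut.ai's actual probe code; `stream_response` is a hypothetical stand-in for any iterator that yields tokens as they arrive from the provider:

```python
import time

def measure_check(stream_response):
    """Measure TTFT and total time over a stream of output tokens.

    `stream_response` is any iterator yielding tokens as they arrive
    (a hypothetical stand-in for a provider's streaming API response).
    """
    start = time.monotonic()
    ttft = None
    tokens = 0
    for _ in stream_response:
        if ttft is None:
            # First token received: this is the latency a user feels.
            ttft = time.monotonic() - start
        tokens += 1
    total = time.monotonic() - start
    return {"ttft_s": ttft, "total_s": total, "output_tokens": tokens}
```

Connect time and HTTP status would come from the HTTP client itself rather than the token loop.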

How We Classify Status

Each model is classified into one of four states based on consecutive check results:

  • Healthy — Model responded successfully with normal latency.
  • Degraded — 1–2 consecutive failures. Could be transient. Not classified as an outage.
  • Down — 3+ consecutive failures. Confirmed outage. Incident created.
  • Recovering — Model was down but has started responding. Needs 2 consecutive successes to return to healthy.

This prevents false alarms from transient network blips. A single failed check does NOT trigger an outage — it takes 3 consecutive failures (15 minutes at default check intervals).
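The classification rules above amount to a small state machine. The sketch below is one way to express them, assuming the checker tracks consecutive failures and successes per model (the function name and signature are illustrative, not the platform's actual code):

```python
def classify(prev_status, consecutive_failures, consecutive_successes):
    """Map consecutive check results to a status, per the thresholds above."""
    if consecutive_failures >= 3:
        return "down"        # confirmed outage, incident created
    if consecutive_failures >= 1:
        return "degraded"    # could be transient, not yet an outage
    # No current failures: the model is responding.
    if prev_status in ("down", "recovering"):
        # Requires 2 consecutive successes to return to healthy.
        return "healthy" if consecutive_successes >= 2 else "recovering"
    return "healthy"
```

Note that a single failure only ever moves a model to degraded, which is what suppresses false alarms from transient blips.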

Error Classification

When a check fails, we classify the error to distinguish provider outages from other issues:

  • Credential Failure (401/403) — Our API key is invalid or expired. This is our problem, not a provider outage. These checks are quarantined from status calculations.
  • Rate Limited (429) — We've been throttled. Not a provider outage.
  • Provider Error (5xx) — Server-side issue at the provider.
  • Timeout — Request didn't complete within 30 seconds.
  • Network Error — Connection couldn't be established.

Network Baseline

We maintain baseline pings to Cloudflare (1.1.1.1), Google DNS, and Quad9 DNS alongside model checks. If baseline pings degrade simultaneously with model checks, it indicates a network issue on our end — not a provider outage. This helps us distinguish "our internet is slow" from "the provider is down."
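One simple way to apply the baseline signal is majority voting: if most baseline targets fail in the same round as model checks, blame the local network. This is a hedged sketch of the idea, not the platform's actual logic:

```python
# Baseline targets named in the methodology: Cloudflare, Google DNS, Quad9.
BASELINE_TARGETS = ["1.1.1.1", "8.8.8.8", "9.9.9.9"]

def suspect_local_network(baseline_ok, model_ok):
    """baseline_ok / model_ok: dicts of target -> bool for the latest round.

    If a majority of baseline pings failed at the same time as model
    checks, treat it as a local network issue, not a provider outage.
    """
    baseline_down = sum(not ok for ok in baseline_ok.values())
    models_down = sum(not ok for ok in model_ok.values())
    return baseline_down > len(baseline_ok) // 2 and models_down > 0
```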

Data Pipeline

Every check result is written to disk immediately as a JSONL record with a SHA-256 checksum. A directory watcher tails these files and pushes records to MongoDB within seconds. Completed files are uploaded to S3 for durable backup. The raw data on disk is the authoritative source of truth — MongoDB can be rebuilt from it at any time.
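The write path can be sketched as an append of one JSON line per check, with a SHA-256 checksum over the serialized payload. The record layout below is hypothetical (the real schema is not published); it only illustrates the JSONL-plus-checksum pattern:

```python
import hashlib
import json

def write_check_record(path, record):
    """Append a check result as a JSONL line with a SHA-256 checksum.

    The checksum covers the canonically serialized payload, so corruption
    can be detected when files are tailed into MongoDB or restored from S3.
    """
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    checksum = hashlib.sha256(payload.encode()).hexdigest()
    line = json.dumps({"record": record, "sha256": checksum}, sort_keys=True)
    with open(path, "a") as f:
        f.write(line + "\n")
    return checksum
```

Canonical serialization (sorted keys, fixed separators) matters here: the verifier must reproduce the exact bytes that were hashed.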

AI100 Index

The AI100 is a composite score representing the health of the AI ecosystem. It is calculated as an average over a rolling time window across all monitored models, weighted by status: healthy = 100%, recovering = 75%, degraded = 50%, down = 0%.

The score is available at six time windows: 1 minute, 5 minutes, 1 hour, 1 day, 1 week, and 1 month. The 5-minute window is the default "hero" score displayed on the homepage.
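The weighted average is straightforward to state in code. A minimal sketch, using the status weights given above (the function name is illustrative):

```python
# Status weights from the methodology: healthy 100%, recovering 75%,
# degraded 50%, down 0%.
STATUS_WEIGHTS = {"healthy": 1.0, "recovering": 0.75, "degraded": 0.5, "down": 0.0}

def ai100(statuses):
    """Average status weight across monitored models, scaled to 0-100."""
    if not statuses:
        return None
    return 100 * sum(STATUS_WEIGHTS[s] for s in statuses) / len(statuses)
```

For example, a fleet of one healthy and one down model scores 50; add a degraded and a recovering model and the score becomes 100 × (1 + 0 + 0.5 + 0.75) / 4 = 56.25.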

Independence

PowerOut.ai is not affiliated with any AI provider. We pay for our own API access and receive no compensation or preferential treatment from any provider. Our monitoring infrastructure runs on independent servers with independent network connectivity.

Open Data

Aggregate methodology and scores are published openly. Raw per-check telemetry data is retained as our proprietary dataset — this historical data is the foundation of the platform and cannot be replicated retroactively.