See what your LLM calls cost

One command. No signup. No cloud.

Local observability for your AI projects. Track costs, tokens, and latency for OpenAI, Anthropic, Gemini, Ollama, and more.

npx llmflow

Then point your SDK at localhost:8080/v1

[Screenshot: LLMFlow dashboard]

Cost Tracking: Real-time pricing for 2000+ models
Request Logging: Every request, response, and latency
Hierarchical Traces: Agents, chains, tools in a tree view
OpenTelemetry: LangChain, LlamaIndex, OTLP support
Zero Config: No API keys, no setup, no accounts
Local Storage: SQLite. Your data stays local.

Quick Start

Up and running in 30 seconds

1. Start LLMFlow

# Using npx (recommended)
npx llmflow

# Or with Docker
docker run -p 3000:3000 -p 8080:8080 helgesverre/llmflow

2. Point your SDK

# Python
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8080/v1")

// JavaScript
const client = new OpenAI({ baseURL: "http://localhost:8080/v1" });

3. View your traces

Open localhost:3000 to see costs, tokens, latency, and full request/response details.

Integrations

Works with your existing tools. Pick your integration method.

Point your SDK's base URL at LLMFlow. Works with any OpenAI-compatible client.

OpenAI http://localhost:8080/v1
Anthropic http://localhost:8080/anthropic/v1
Gemini http://localhost:8080/gemini/v1
Ollama http://localhost:8080/ollama/v1
Groq http://localhost:8080/groq/v1
Mistral http://localhost:8080/mistral/v1
Azure OpenAI http://localhost:8080/azure/v1
Together http://localhost:8080/together/v1
OpenRouter http://localhost:8080/openrouter/v1
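
For example, an OpenAI-compatible client only needs its base URL swapped. A minimal sketch, assuming the listed paths accept the same requests as the upstream provider and forward your real API key (the Groq key placeholder and model name here are illustrative):

# Python: route Groq traffic through LLMFlow
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/groq/v1",
    api_key="gsk_...",  # your Groq API key; LLMFlow proxies it upstream
)
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)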

Send traces from LangChain, LlamaIndex, OpenLLMetry, or any OpenTelemetry-instrumented app.

Python (OpenLLMetry / Traceloop)

from traceloop.sdk import Traceloop

Traceloop.init(
    api_endpoint="http://localhost:3000/v1/traces",
    disable_batch=True
)
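
Once Traceloop.init has run, OpenLLMetry auto-instruments supported SDKs, so an ordinary client call should show up as a trace in LLMFlow. A minimal sketch, assuming the OpenAI Python SDK and an API key in the environment:

# Ordinary SDK usage; OpenLLMetry exports the resulting span to LLMFlow
from openai import OpenAI

client = OpenAI()
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Ping"}],
)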

Python (Manual OTLP)

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

# Export spans to LLMFlow's OTLP endpoint and install the provider globally
provider = TracerProvider()
exporter = OTLPSpanExporter(endpoint="http://localhost:3000/v1/traces")
provider.add_span_processor(SimpleSpanProcessor(exporter))
trace.set_tracer_provider(provider)
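
Spans created through the global tracer then land in LLMFlow. A minimal sketch; the span name and gen_ai.* attributes follow OpenTelemetry's GenAI conventions and are illustrative, not a documented LLMFlow schema:

# Wrap an LLM call in a manually created span
tracer = trace.get_tracer("my-app")
with tracer.start_as_current_span("chat gpt-4o-mini") as span:
    span.set_attribute("gen_ai.request.model", "gpt-4o-mini")
    span.set_attribute("gen_ai.usage.input_tokens", 42)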

JavaScript / TypeScript

import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { BasicTracerProvider, SimpleSpanProcessor } from '@opentelemetry/sdk-trace-base';

// Export spans to LLMFlow and register the provider globally
const provider = new BasicTracerProvider();
const exporter = new OTLPTraceExporter({
    url: 'http://localhost:3000/v1/traces'
});
provider.addSpanProcessor(new SimpleSpanProcessor(exporter));
provider.register();

Go

import "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp"

exporter, _ := otlptracehttp.New(ctx,
    otlptracehttp.WithEndpoint("localhost:3000"),
    otlptracehttp.WithURLPath("/v1/traces"),
    otlptracehttp.WithInsecure(),
)

Ruby

require 'opentelemetry/sdk'
require 'opentelemetry-exporter-otlp'

OpenTelemetry::SDK.configure do |c|
  c.add_span_processor(
    OpenTelemetry::SDK::Trace::Export::SimpleSpanProcessor.new(
      OpenTelemetry::Exporter::OTLP::Exporter.new(
        endpoint: 'http://localhost:3000/v1/traces'
      )
    )
  )
end

LLMFlow accepts OTLP/HTTP JSON for traces, logs, and metrics on port 3000.
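
To smoke-test the endpoint without an SDK, you can post a single span as OTLP/HTTP JSON directly. A minimal sketch using only the Python standard library; the trace/span ids, service name, and timings are made up for illustration:

import json
import time
import urllib.request

now = time.time_ns()
payload = {
    "resourceSpans": [{
        "resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "demo-app"}}
        ]},
        "scopeSpans": [{
            "scope": {"name": "manual-test"},
            "spans": [{
                "traceId": "5b8efff798038103d269b633813fc60c",
                "spanId": "eee19b7ec3c1b174",
                "name": "llm.call",
                "kind": 1,
                "startTimeUnixNano": str(now),
                "endTimeUnixNano": str(now + 250_000_000),
            }],
        }],
    }],
}

req = urllib.request.Request(
    "http://localhost:3000/v1/traces",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)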

Track usage from AI coding assistants like Claude Code, Codex CLI, Aider, and more.

Claude Code

# Use passthrough mode
export ANTHROPIC_BASE_URL=http://localhost:8080/passthrough/anthropic
claude

Aider

# OpenAI models
aider --openai-api-base http://localhost:8080/v1

# Anthropic models
aider --anthropic-api-base http://localhost:8080/passthrough/anthropic

Codex CLI

# In ~/.codex/config.toml:
[otel.exporter."otlp-http"]
endpoint = "http://localhost:3000/v1/logs"

Cursor / Continue

# Set OpenAI base URL in settings
http://localhost:8080/v1