Telemetry Middleware#
Comprehensive observability middleware with W3C Trace Context support and optional OpenTelemetry integration for distributed tracing, request correlation, and performance monitoring.
Overview#
The Telemetry Middleware provides two modes of operation:
Minimal Mode (Default) - Zero dependencies, W3C Trace Context only
OpenTelemetry Mode - Full observability with automatic span export (auto-enabled when SDK is installed)
Both modes provide:
W3C Trace Context propagation
Request ID generation and correlation
Response timing headers
Access logging with trace correlation
Global trace context via the TraceContext helper
Quick Start#
Basic Usage (Minimal Mode)#
use Infocyph\Webrick\Middleware\TelemetryMiddleware;
$preGlobal = [
new TelemetryMiddleware($logger),
];
Response headers:
X-Response-Time: 45.2ms
Server-Timing: app;dur=45.2
X-Request-Id: 1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d
Trace-Id: a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7
Access log:
127.0.0.1 (direct) "GET /api/users" 200 1234 45.2ms id=abc123 trace=def456 span=789xyz [w3c]
Integration with RouterKernel#
Basic Setup#
use Infocyph\Webrick\Router\Kernel\RouterKernel;
use Infocyph\Webrick\Router\Matching\ShardedMatcher;
use Infocyph\Webrick\Middleware\TelemetryMiddleware;
$kernel = RouterKernel::bootWithRegistrar(
log: $logger,
matcher: ShardedMatcher::make(),
register: $register,
preGlobal: [
TelemetryMiddleware::class, // Add as first middleware
// ... other middleware
]
);
With Custom Configuration#
use Infocyph\Webrick\Middleware\TelemetryMiddleware;
$preGlobal = [
new TelemetryMiddleware(
log: $logger,
addXResponseTime: true,
addServerTiming: true,
emitRequestId: true,
emitTraceIdHeader: true,
respectIncomingTraceparent: true
),
// ... other middleware
];
$kernel = RouterKernel::bootWithRegistrar(
log: $logger,
matcher: ShardedMatcher::make(),
register: $register,
preGlobal: $preGlobal
);
OpenTelemetry Mode (Full Observability)#
Install OpenTelemetry SDK:
composer require open-telemetry/sdk open-telemetry/exporter-otlp
Configure OpenTelemetry (in bootstrap):
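use OpenTelemetry\API\Common\Time\Clock;
use OpenTelemetry\Contrib\Otlp\OtlpHttpTransportFactory;
use OpenTelemetry\Contrib\Otlp\SpanExporter;
use OpenTelemetry\SDK\Sdk;
use OpenTelemetry\SDK\Trace\SpanProcessor\BatchSpanProcessor;
use OpenTelemetry\SDK\Trace\TracerProvider;
// Class locations and constructor signatures differ between SDK releases; verify against your installed version.
// The OTLP exporter wraps a transport rather than taking the endpoint URL directly.
$transport = (new OtlpHttpTransportFactory())->create('http://localhost:4318/v1/traces', 'application/x-protobuf');
$exporter = new SpanExporter($transport);
$tracerProvider = new TracerProvider(
    new BatchSpanProcessor($exporter, Clock::getDefault())
);
// Register the provider globally so it is reachable through OpenTelemetry\API\Globals.
Sdk::builder()
    ->setTracerProvider($tracerProvider)
    ->buildAndRegisterGlobal();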
Use the middleware (one small change is required):
$preGlobal = [
new TelemetryMiddleware(
$logger,
enableOtelIntegration: true
),
];
Access log (note [otel] indicator):
127.0.0.1 (direct) "GET /api/users" 200 1234 45.2ms id=abc123 trace=def456 span=789xyz [otel]
Spans are also exported automatically to Jaeger, Zipkin, or any OTLP collector.
Constructor Reference#
public function __construct(
?LoggerInterface $log = null, // PSR-3 logger (defaults to NullLogger)
bool $addXResponseTime = true, // Add X-Response-Time header
bool $addServerTiming = true, // Add Server-Timing header
bool $emitRequestId = true, // Generate/emit request IDs
string $requestIdHeader = 'X-Request-Id', // Request ID header name
bool $respectExistingRequestId = true, // Honor incoming request IDs
?string $nelGroup = null, // NEL group name (null = disabled)
?string $nelEndpoint = null, // NEL reporting endpoint
int $nelTtlSeconds = 86400, // NEL policy TTL (seconds)
bool $nelIncludeSubdomains = true, // NEL for subdomains
bool $nelCollectSuccesses = false, // Report successful requests
bool $emitTraceIdHeader = true, // Emit Trace-Id header
string $traceIdHeader = 'Trace-Id', // Trace ID header name
bool $respectIncomingTraceparent = true, // Honor incoming traceparent
bool $emitTraceparentHeader = false, // Emit traceparent header
bool $enableOtelIntegration = false, // Enable OpenTelemetry (auto-detected)
?string $otelServiceName = null, // Service name in traces
?string $otelServiceVersion = null, // Service version in traces
)
Configuration#
Full Configuration Example#
use Infocyph\Webrick\Middleware\TelemetryMiddleware;
$preGlobal[] = new TelemetryMiddleware(
log: $logger, // PSR-3 logger
// Response timing
addXResponseTime: true, // X-Response-Time header
addServerTiming: true, // Server-Timing header
// Request ID
emitRequestId: true, // Generate/propagate request IDs
requestIdHeader: 'X-Request-Id', // Header name
respectExistingRequestId: true, // Honor incoming IDs
// Network Error Logging (NEL)
nelGroup: 'default', // NEL group name
nelEndpoint: 'https://nel.example.com/report', // NEL collector
nelTtlSeconds: 86400, // NEL policy TTL (24 hours)
nelIncludeSubdomains: true, // NEL for subdomains
nelCollectSuccesses: false, // Log successful requests (bandwidth)
// W3C Trace Context
emitTraceIdHeader: true, // Trace-Id header in response
traceIdHeader: 'Trace-Id', // Header name
respectIncomingTraceparent: true, // Honor incoming traceparent
emitTraceparentHeader: false, // Emit traceparent header (opt-in)
// OpenTelemetry (auto-detected)
enableOtelIntegration: true, // Enable OTel mode if SDK available
otelServiceName: 'my-awesome-api', // Service name in traces
otelServiceVersion: '2.0.0', // Service version
);
Minimal Configuration (Performance-Focused)#
$preGlobal[] = new TelemetryMiddleware(
log: $logger,
addXResponseTime: false, // Skip header
addServerTiming: false, // Skip header
emitRequestId: true, // Keep for correlation
emitTraceIdHeader: true, // Keep for correlation
);
Features#
1. W3C Trace Context#
Follows the W3C Trace Context specification:
traceparent: 00-a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7-1234567890abcdef-01
tracestate: congo=ucfJifl5GOE,rojo=00f067aa0ba902b7
Format: version-traceid-spanid-flags
Middleware behavior:
Generates new trace-id (32 hex chars) if missing
Creates new span-id (16 hex chars) per request
Respects incoming traceparent/tracestate headers
Validates trace IDs (non-zero, correct length)
Propagates to downstream services
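The behavior listed above can be sketched in plain PHP. This is a minimal illustration, not the middleware's actual implementation: the function names `parseTraceparent` and `resolveTraceContext` are hypothetical, the parser is deliberately strict, and the default flags value `01` for a new trace is an assumption taken from the example header.

```php
<?php
declare(strict_types=1);

/**
 * Parse a W3C traceparent value (version-traceid-spanid-flags).
 * Returns null when the value is malformed or carries an all-zero ID.
 * Hypothetical helper for illustration; not part of Webrick.
 */
function parseTraceparent(string $header): ?array
{
    $pattern = '/^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})\z/';
    if (preg_match($pattern, trim($header), $m) !== 1) {
        return null;
    }
    [, $version, $traceId, $parentId, $flags] = $m;
    // Version "ff" is forbidden, and all-zero IDs are invalid.
    if ($version === 'ff'
        || $traceId === str_repeat('0', 32)
        || $parentId === str_repeat('0', 16)) {
        return null;
    }
    return ['trace_id' => $traceId, 'parent_span_id' => $parentId, 'flags' => $flags];
}

/** Continue an incoming trace, or start a new one when the header is missing or invalid. */
function resolveTraceContext(?string $header): array
{
    $parsed = $header !== null ? parseTraceparent($header) : null;
    return [
        'trace_id'       => $parsed['trace_id'] ?? bin2hex(random_bytes(16)), // 32 hex chars
        'span_id'        => bin2hex(random_bytes(8)),                         // 16 hex chars, new per request
        'parent_span_id' => $parsed['parent_span_id'] ?? str_repeat('0', 16), // all-zero placeholder when there is no parent
        'flags'          => $parsed['flags'] ?? '01',                         // assumed default: sampled
    ];
}

$ctx = resolveTraceContext('00-a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7-1234567890abcdef-01');
```

Here `$ctx['trace_id']` keeps the incoming trace ID while `$ctx['span_id']` is freshly generated, which is what lets downstream spans attach to the same trace.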
Request attributes:
$r->getAttribute('trace.trace_id'); // a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7
$r->getAttribute('trace.span_id'); // 1234567890abcdef
$r->getAttribute('trace.parent_span_id'); // 0000000000000000
$r->getAttribute('trace.flags'); // 01
$r->getAttribute('trace.tracestate'); // congo=ucfJifl5GOE
2. Request ID Correlation#
Unique identifier per request:
X-Request-Id: 1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d
Generation:
If missing: bin2hex(random_bytes(16)) (32 hex chars)
If present: Forwards existing ID from load balancers/proxies
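A rough sketch of that decision follows. The function name `resolveRequestId`, the accepted character set, and the 128-character cap are assumptions made for this example; the middleware's own validation rules may differ.

```php
<?php
declare(strict_types=1);

/**
 * Reuse a well-formed incoming request ID, otherwise generate a new one.
 * Hypothetical helper; the validation pattern is an assumption of this sketch.
 */
function resolveRequestId(?string $incoming, bool $respectExisting = true): string
{
    if ($respectExisting
        && $incoming !== null
        && preg_match('/^[A-Za-z0-9._-]{1,128}\z/', $incoming) === 1) {
        return $incoming; // forwarded by a load balancer or proxy
    }
    return bin2hex(random_bytes(16)); // 32 hex chars
}

$forwarded = resolveRequestId('lb-42_abc'); // incoming ID is kept
$generated = resolveRequestId(null);        // new 32-char hex ID
```

Validating the incoming value before reusing it matters because the ID is later echoed into response headers and log lines.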
Stored as:
$r->getAttribute('request_id')
Use cases:
Support tickets (users report request ID)
Log correlation within single request
Request-specific debugging
3. Response Timing Headers#
X-Response-Time#
X-Response-Time: 45.2ms
Human-readable response time in milliseconds.
Server-Timing#
Server-Timing: app;dur=45.2
W3C Server Timing API format. Can be extended by other middleware:
Server-Timing: cache;dur=2.3, db;dur=12.5, app;dur=45.2
Browser DevTools integration:
Visible in Network tab → Timing
Programmatically accessible via Performance API
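As a sketch of how another middleware might extend the timing headers, the helpers below format a duration and append one metric to an existing Server-Timing value. The names `formatResponseTime` and `appendServerTiming` are hypothetical and not part of the Webrick API.

```php
<?php
declare(strict_types=1);

/** Format a duration the way the X-Response-Time example shows it (e.g. "45.2ms"). */
function formatResponseTime(float $ms): string
{
    return number_format($ms, 1, '.', '') . 'ms';
}

/** Append one metric to a Server-Timing value, keeping entries added by other middleware. */
function appendServerTiming(string $existing, string $name, float $ms): string
{
    $entry = $name . ';dur=' . number_format($ms, 1, '.', '');
    return $existing === '' ? $entry : $existing . ', ' . $entry;
}

$start = hrtime(true);                      // monotonic clock, nanoseconds
usleep(2000);                               // stand-in for handling the request
$elapsedMs = (hrtime(true) - $start) / 1e6; // nanoseconds to milliseconds

$responseTime = formatResponseTime(45.2);
$serverTiming = appendServerTiming('cache;dur=2.3, db;dur=12.5', 'app', 45.2);
```

Using `hrtime()` rather than `microtime()` keeps the measurement immune to wall-clock adjustments.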
4. Network Error Logging (NEL)#
Browser error reporting configuration:
NEL: {"report_to":"default","max_age":86400,"include_subdomains":true,"success_fraction":0.0,"failure_fraction":1.0}
Report-To: {"group":"default","max_age":86400,"endpoints":[{"url":"https://nel.example.com/report"}]}
What it monitors:
DNS failures
TCP connection errors
TLS handshake failures
HTTP protocol errors
Request timeouts
Configuration:
success_fraction: 0.0 = Only report failures (saves bandwidth)
failure_fraction: 1.0 = Report all failures
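To show how the constructor settings could map onto the two headers, here is a minimal sketch. The function `buildNelHeaders` is hypothetical, and mapping `nelCollectSuccesses: true` to a `success_fraction` of 1.0 is an assumption. The NEL specification names the policy's reference to the reporting group `report_to`, while the Report-To header itself uses `group`.

```php
<?php
declare(strict_types=1);

/**
 * Build NEL and Report-To header values from constructor-style settings.
 * Returns an empty array when NEL is disabled (group or endpoint is null).
 * Hypothetical helper; not the middleware's actual code.
 */
function buildNelHeaders(
    ?string $group,
    ?string $endpoint,
    int $ttlSeconds = 86400,
    bool $includeSubdomains = true,
    bool $collectSuccesses = false
): array {
    if ($group === null || $endpoint === null) {
        return [];
    }
    $nel = [
        'report_to'          => $group,
        'max_age'            => $ttlSeconds,
        'include_subdomains' => $includeSubdomains,
        'success_fraction'   => $collectSuccesses ? 1.0 : 0.0, // assumed mapping
        'failure_fraction'   => 1.0,
    ];
    $reportTo = [
        'group'     => $group,
        'max_age'   => $ttlSeconds,
        'endpoints' => [['url' => $endpoint]],
    ];
    return [
        'NEL'       => json_encode($nel, JSON_THROW_ON_ERROR),
        'Report-To' => json_encode($reportTo, JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES),
    ];
}

$headers = buildNelHeaders('default', 'https://nel.example.com/report');
```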
5. Access Logging#
Structured log entries with trace correlation:
Minimal mode:
127.0.0.1 (direct) "GET /api/users" 200 1234 45.2ms id=abc123 trace=def456 span=789xyz [w3c]
OpenTelemetry mode:
127.0.0.1 (direct) "GET /api/users" 200 1234 45.2ms id=abc123 trace=def456 span=789xyz [otel]
Log components:
Client IP (with proxy detection)
Connection type (direct or proxy)
HTTP method and path
Status code and response size
Request duration in milliseconds
Request ID
Trace ID and Span ID
Mode indicator ([w3c] or [otel])
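The components above line up with the sample entries as in the following sketch. The function `formatAccessLog` and its array keys are illustrative only; the middleware's internal formatter is not shown in this document.

```php
<?php
declare(strict_types=1);

/** Render one access-log line in the layout shown above. Array keys are illustrative. */
function formatAccessLog(array $e): string
{
    return sprintf(
        '%s (%s) "%s %s" %d %d %.1fms id=%s trace=%s span=%s [%s]',
        $e['ip'],
        $e['via_proxy'] ? 'proxy' : 'direct',
        $e['method'],
        $e['path'],
        $e['status'],
        $e['bytes'],
        $e['duration_ms'],
        $e['request_id'],
        $e['trace_id'],
        $e['span_id'],
        $e['otel'] ? 'otel' : 'w3c'
    );
}

$line = formatAccessLog([
    'ip' => '127.0.0.1', 'via_proxy' => false,
    'method' => 'GET', 'path' => '/api/users',
    'status' => 200, 'bytes' => 1234, 'duration_ms' => 45.2,
    'request_id' => 'abc123', 'trace_id' => 'def456', 'span_id' => '789xyz',
    'otel' => false,
]);
```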
Global Trace Context Access#
The TraceContext helper provides universal access to trace IDs throughout your application.
Basic Usage#
use Infocyph\Webrick\Support\TraceContext;
// In controllers
$traceId = TraceContext::getTraceId(); // 32 hex chars
$spanId = TraceContext::getSpanId(); // 16 hex chars
$requestId = TraceContext::getRequestId(); // 32 hex chars
// Get all context
$context = TraceContext::getAll();
// ['trace_id' => 'abc...', 'span_id' => 'def...', 'request_id' => 'xyz...']
// Formatted for logging
$context = TraceContext::getLogContext();
// "trace=abc123 span=def456 request=xyz789"
In Controllers#
final class UserController
{
public function __construct(
private LoggerInterface $logger
) {}
public function show(int $id): Response
{
$this->logger->info('Fetching user', [
'user_id' => $id,
'trace_id' => TraceContext::getTraceId(),
'request_id' => TraceContext::getRequestId(),
]);
$user = UserRepository::find($id);
return Response::json($user);
}
}
In Database Queries#
Add trace context as SQL comments for query tracking:
final class UserRepository
{
public static function find(int $id): ?array
{
$context = TraceContext::getLogContext();
$sql = "
/* {$context} */
SELECT * FROM users WHERE id = ?
";
return DB::queryOne($sql, [$id]);
}
}
MySQL slow query log:
/* trace=a4c9e2b8... span=def456... request=xyz789... */
SELECT * FROM users WHERE id = 42;
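Because a request ID can be forwarded from an incoming header, it is worth making sure the context string cannot close the SQL comment early. The sketch below is one possible safeguard; `sqlCommentSafe` and `withTraceComment` are hypothetical helpers, and whether `TraceContext::getLogContext()` already sanitizes its output is not stated here.

```php
<?php
declare(strict_types=1);

/**
 * Make a string safe to embed inside a SQL block comment.
 * Keeps only letters, digits, '=', '_', '.', '-' and spaces, which removes
 * '*' and '/' so the comment cannot be terminated by the embedded value.
 */
function sqlCommentSafe(string $context): string
{
    return preg_replace('/[^A-Za-z0-9=_. \-]/', '', $context) ?? '';
}

/** Prefix a query with a sanitized trace comment. */
function withTraceComment(string $sql, string $context): string
{
    return '/* ' . sqlCommentSafe($context) . ' */ ' . $sql;
}

$sql = withTraceComment('SELECT * FROM users WHERE id = ?', 'trace=abc span=def request=xyz');
```

In application code the second argument would be the value returned by `TraceContext::getLogContext()`.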
In Cache Operations#
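final class CacheService
{
    public function __construct(
        private LoggerInterface $logger
    ) {}
    public function get(string $key): mixed
    {
        $value = Cache::get($key);
        // Assumes Cache::get() returns null on a miss; compare against null so falsy values (0, '', false) still count as hits.
        $hit = $value !== null;
        $this->logger->debug($hit ? 'Cache hit' : 'Cache miss', array_merge(
            ['key' => $key, 'hit' => $hit],
            TraceContext::getLogArray() // Adds trace_id, span_id, request_id
        ));
        return $value;
    }
}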
In Exception Handlers#
final class ExceptionHandler
{
public function handle(\Throwable $e): Response
{
$context = TraceContext::getAll();
$this->logger->error('Unhandled exception', [
'exception' => get_class($e),
'message' => $e->getMessage(),
'trace_id' => $context['trace_id'],
'span_id' => $context['span_id'],
'request_id' => $context['request_id'],
]);
// Return trace ID to client for support
return Response::json([
'error' => 'Internal server error',
'trace_id' => $context['trace_id'],
'request_id' => $context['request_id'],
'message' => 'Please provide this trace ID to support',
], 500);
}
}
Propagate to External Services#
final class PaymentService
{
public function charge(int $amount): array
{
$response = $this->http->post('https://api.payment.com/charge', [
'headers' => TraceContext::getPropagationHeaders(),
'json' => ['amount' => $amount],
]);
return json_decode($response->getBody(), true);
}
}
Headers sent:
traceparent: 00-a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7-1234567890abcdef-01
X-Trace-Id: a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7
X-Request-Id: xyz789...
Available Methods#
TraceContext::getTraceId(); // Trace ID (32 hex)
TraceContext::getSpanId(); // Span ID (16 hex)
TraceContext::getRequestId(); // Request ID (32 hex)
TraceContext::getParentSpanId(); // Parent span ID
TraceContext::getFlags(); // Trace flags (01 = sampled)
TraceContext::getTraceState(); // Trace state (vendor data)
TraceContext::getAll(); // All context as array
TraceContext::getLogContext(); // Formatted string for logs
TraceContext::getLogArray(); // Array for structured logging
TraceContext::getTraceparent(); // W3C traceparent header value
TraceContext::getPropagationHeaders(); // HTTP headers for propagation
TraceContext::isAvailable(); // Check if context available
TraceContext::isOtelMode(); // Check if OTel mode active
TraceContext::isSampled(); // Check if trace is sampled
OpenTelemetry Integration#
Automatic Detection#
The middleware automatically detects OpenTelemetry SDK at runtime:
// Checks for these classes:
//   OpenTelemetry\API\Globals
//   OpenTelemetry\API\Trace\SpanKind
//   Infocyph\Webrick\Support\OpenTelemetryHandler
// If found: Full OTel mode
// If not found: Minimal W3C mode
With enableOtelIntegration set to true and the SDK installed, no further middleware configuration is needed.
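The runtime check can be pictured as in the sketch below. It is an illustration under assumptions, not the middleware's code: `detectTelemetryMode` is a hypothetical name, and the sketch accepts interfaces as well as classes because `class_exists()` alone returns false for an interface.

```php
<?php
declare(strict_types=1);

/**
 * Pick OTel mode only when it is enabled and every required type can be autoloaded;
 * otherwise fall back to minimal W3C mode.
 */
function detectTelemetryMode(bool $enableOtel, array $requiredTypes): string
{
    if (!$enableOtel) {
        return 'w3c';
    }
    foreach ($requiredTypes as $type) {
        // Both calls trigger the autoloader by default.
        if (!class_exists($type) && !interface_exists($type)) {
            return 'w3c';
        }
    }
    return 'otel';
}

$mode = detectTelemetryMode(true, [
    'OpenTelemetry\API\Globals',
    'OpenTelemetry\API\Trace\SpanKind',
    'Infocyph\Webrick\Support\OpenTelemetryHandler',
]);
```

Without the SDK on the autoload path, `$mode` resolves to `'w3c'`, matching the fallback described above.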
What OpenTelemetry Mode Adds#
| Feature | Minimal Mode | OTel Mode |
|---|---|---|
| W3C Trace Context | ✅ Yes | ✅ Yes |
| Request/Trace IDs | ✅ Yes | ✅ Yes |
| Timing Headers | ✅ Yes | ✅ Yes |
| Access Logging | ✅ Yes | ✅ Yes |
| Span Creation | ❌ No | ✅ Yes |
| Span Export | ❌ No | ✅ Jaeger/Zipkin/OTLP |
| Span Attributes | ❌ No | ✅ HTTP semantics |
| Exception Recording | ❌ No | ✅ Full stack traces |
| Distributed Tracing UI | ❌ No | ✅ Yes |
Span Attributes (OTel Mode)#
Automatically added to spans following OpenTelemetry semantic conventions:
HTTP attributes:
http.method - Request method (GET, POST, etc.)
http.target - Request path (/api/users/42)
http.scheme - Protocol (http, https)
http.host - Hostname (api.example.com)
http.url - Full URL
http.status_code - Response status (200, 404, etc.)
http.user_agent - Client user agent
http.route - Route name (users.show)
http.request_content_length - Request size
http.response_content_length - Response size
Network attributes:
net.peer.ip - Client IP address
net.host.port - Server port
http.flavor - HTTP version (1.1, 2.0)
Custom attributes:
enduser.id - Authenticated user ID (if available)
enduser.role - User role (if available)
client.type - Client type (web, mobile, api)
api.version - API version (if versioned)
Observability Backend Examples#
Jaeger UI#
Service: my-awesome-api
Trace: a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7
├─ Span: GET users.show (45.2ms)
├─ http.method: GET
├─ http.route: users.show
├─ http.status_code: 200
├─ net.peer.ip: 192.168.1.100
└─ enduser.id: 42
Zipkin UI#
my-awesome-api: GET /api/users/42
Duration: 45.2ms
Tags:
http.method: GET
http.target: /api/users/42
http.status_code: 200
enduser.id: 42
Force Minimal Mode#
Disable OpenTelemetry even if SDK is present:
new TelemetryMiddleware(
log: $logger,
enableOtelIntegration: false // Force minimal mode
)
Structured Logging Integration#
Monolog Processor#
Automatically add trace context to all log entries:
use Infocyph\Webrick\Support\TraceContext;
use Monolog\Processor\ProcessorInterface;
final class TraceContextProcessor implements ProcessorInterface
{
// Monolog 2.x signature; Monolog 3 passes and returns a Monolog\LogRecord instead of an array.
public function __invoke(array $record): array
{
if (TraceContext::isAvailable()) {
$record['extra'] = array_merge(
$record['extra'],
TraceContext::getLogArray()
);
$record['extra']['otel_mode'] = TraceContext::isOtelMode();
}
return $record;
}
}
// Register with Monolog
$logger->pushProcessor(new TraceContextProcessor());
Every log entry now includes:
{
"message": "User login successful",
"level": "info",
"extra": {
"trace_id": "a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7",
"span_id": "1234567890abcdef",
"request_id": "xyz789...",
"otel_mode": false
}
}
ELK Stack (Elasticsearch + Kibana)#
Search all logs from a single request:
trace_id:"a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7"
Returns:
Controller logs
Repository/database logs
Cache operation logs
External API call logs
Error logs
All correlated by trace ID!
Use Cases#
1. User Support Tickets#
// Error response includes trace ID
return Response::json([
'error' => 'Payment failed',
'trace_id' => TraceContext::getTraceId(),
'message' => 'Please provide this ID to support'
], 500);
User reports: “My payment failed, trace ID: a4c9e2b8…”
Support searches:
trace_id:"a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7"
Finds: Complete request flow with all logs!
2. Performance Debugging#
Slow query log:
/* trace=a4c9e2b8... span=def456... request=xyz789... */
SELECT * FROM orders WHERE status = 'pending' AND created_at > '2024-01-01';
Duration: 2.5 seconds
Correlation:
Find slow query in database logs
Extract trace ID from SQL comment
Search application logs for that trace ID
See what triggered the slow query
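The extraction step can be automated with a small regular expression. This is a sketch that assumes the comment layout shown above ("trace=<hex> span=... request=..."); `extractTraceId` is a hypothetical helper.

```php
<?php
declare(strict_types=1);

/** Pull the trace ID out of a SQL comment written as "trace=<hex> span=... request=...". */
function extractTraceId(string $logEntry): ?string
{
    if (preg_match('~/\*.*?\btrace=([0-9a-f]+)~s', $logEntry, $m) === 1) {
        return $m[1];
    }
    return null;
}

$entry = "/* trace=a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7 span=1234567890abcdef request=xyz789 */\n"
       . "SELECT * FROM orders WHERE status = 'pending';";
$traceId = extractTraceId($entry);
```

The returned ID can then be pasted into the log search as `trace_id:"..."`.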
3. Distributed Tracing#
Service A (API Gateway):
// Middleware generates trace context
// trace_id: a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7
// span_id: 1234567890abcdef
Service B (User Service):
// Receives traceparent header
// Creates child span with same trace_id
// New span_id: fedcba0987654321
Service C (Payment Service):
// Receives traceparent from Service B
// Creates child span with same trace_id
// New span_id: abcd1234ef567890
Result: Complete trace across all services in Jaeger/Zipkin!
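What each hop does to the traceparent header can be sketched as follows: the trace ID is carried over unchanged and only the span ID is replaced. `childTraceparent` is a hypothetical helper that assumes the incoming value has already been validated.

```php
<?php
declare(strict_types=1);

/**
 * Build the traceparent a service sends downstream: same trace ID, fresh span ID.
 * Assumes $incoming is a valid "version-traceid-spanid-flags" value.
 */
function childTraceparent(string $incoming): string
{
    [$version, $traceId, , $flags] = explode('-', $incoming);
    $spanId = bin2hex(random_bytes(8)); // new 16-hex span ID for this hop
    return sprintf('%s-%s-%s-%s', $version, $traceId, $spanId, $flags);
}

$fromGateway = '00-a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7-1234567890abcdef-01';
$toPayments  = childTraceparent($fromGateway);
```

Because every service reuses the same 32-character trace ID, the tracing backend can stitch the spans from all three services into one trace.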
4. Error Rate Monitoring#
Prometheus query:
rate(http_requests_total{status=~"5.."}[5m])
Spike detected → Trace IDs from logs → Root cause analysis
Performance#
Overhead#
Minimal mode:
~0.1ms per request
~100 bytes memory
OpenTelemetry mode:
~1-2ms per request
~100KB memory per span
Async export (non-blocking)
Optimization#
Use batch span processor in production:
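use OpenTelemetry\API\Common\Time\Clock;
use OpenTelemetry\SDK\Trace\SpanProcessor\BatchSpanProcessor;
// The constructor takes a clock followed by integer limits, not an options array.
// Parameter names and the Clock location can differ between SDK releases; verify against your installed version.
$processor = new BatchSpanProcessor(
    $exporter,
    Clock::getDefault(),
    maxQueueSize: 2048,
    scheduledDelayMillis: 5000,
    exportTimeoutMillis: 30000,
    maxExportBatchSize: 512,
);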
Sampling:
use OpenTelemetry\SDK\Trace\Sampler\ParentBased;
use OpenTelemetry\SDK\Trace\Sampler\TraceIdRatioBasedSampler;
// Sample 10% of traces
$sampler = new ParentBased(
new TraceIdRatioBasedSampler(0.1)
);
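Ratio-based sampling decides from the trace ID itself, so every service that sees the same trace reaches the same decision. The sketch below illustrates that idea only; it is not the SDK's exact algorithm, and `sampleByTraceId` is a hypothetical function.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative ratio sampler: map the last 7 hex chars of the trace ID (28 bits)
 * to a number in [0, 1) and keep the trace when it falls below the ratio.
 */
function sampleByTraceId(string $traceId, float $ratio): bool
{
    $bucket = hexdec(substr($traceId, -7)) / 0x10000000; // 0x10000000 = 2^28
    return $bucket < $ratio;
}

$traceId = 'a4c9e2b8f1d3a7e5c2b1f8e3d4a5c6b7';
$keepAtHalf  = sampleByTraceId($traceId, 0.5); // bucket is about 0.29, so kept
$keepAtTenth = sampleByTraceId($traceId, 0.1); // same bucket, so dropped
```

Wrapping the ratio sampler in `ParentBased`, as in the snippet above, additionally makes a child span follow the sampling decision already made by its parent.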
Disabling Features#
Minimal Overhead Configuration#
For maximum performance when you only need basic trace correlation:
$preGlobal = [
new TelemetryMiddleware(
log: $logger,
addXResponseTime: false, // No timing header
addServerTiming: false, // No server-timing
emitRequestId: false, // No request ID header
emitTraceIdHeader: false, // No trace header
enableOtelIntegration: false // No OTel overhead
)
];
This configuration:
✅ Still generates trace IDs internally
✅ Still makes TraceContext available
✅ Still logs access with trace correlation
❌ Doesn’t emit response headers
❌ Doesn’t create OpenTelemetry spans
Disable Specific Features#
// Disable only timing headers (keep trace context)
new TelemetryMiddleware(
log: $logger,
addXResponseTime: false,
addServerTiming: false
)
// Disable Network Error Logging
new TelemetryMiddleware(
log: $logger,
nelGroup: null, // Disables NEL
nelEndpoint: null
)
// Disable request ID propagation (but keep trace IDs)
new TelemetryMiddleware(
log: $logger,
emitRequestId: false
)
Troubleshooting#
Check OpenTelemetry Mode#
use Infocyph\Webrick\Support\TraceContext;
if (TraceContext::isOtelMode()) {
echo "✅ OpenTelemetry mode active\n";
} else {
echo "ℹ️ Minimal mode (W3C only)\n";
}
Verify Trace Context#
use Infocyph\Webrick\Support\TraceContext;
var_dump([
'available' => TraceContext::isAvailable(),
'trace_id' => TraceContext::getTraceId(),
'span_id' => TraceContext::getSpanId(),
'request_id' => TraceContext::getRequestId(),
'otel_mode' => TraceContext::isOtelMode(),
]);
Debug OpenTelemetry Setup#
# Check if SDK is installed
composer show open-telemetry/sdk
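# Check class availability (run from the project root so Composer's autoloader is found)
php -r "require 'vendor/autoload.php'; var_dump(class_exists('OpenTelemetry\\API\\Globals'));"
# Check tracer provider
php -r "require 'vendor/autoload.php'; var_dump(\OpenTelemetry\API\Globals::tracerProvider());"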
Best Practices#
1. Middleware Order (CRITICAL)#
⚠️ ALWAYS place TelemetryMiddleware FIRST or VERY EARLY in preGlobal:
// ✅ Correct - Telemetry captures everything
$preGlobal = [
TelemetryMiddleware::class, // FIRST - captures all timing
GatewayHardeningMiddleware::class,
MaintenanceModeMiddleware::class,
RequestLimitsMiddleware::class,
// ... other middleware
];
// ❌ Wrong - Misses early middleware timing
$preGlobal = [
GatewayHardeningMiddleware::class,
MaintenanceModeMiddleware::class,
TelemetryMiddleware::class, // TOO LATE
];
Why first?
Starts timing before any other processing
Wraps all middleware in try-catch for exception tracking (OTel mode)
Initializes TraceContext before other middleware/controllers need it
Captures complete request lifecycle in timing headers
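The effect of ordering on timing can be demonstrated without the framework. In this sketch, plain closures stand in for middleware; `timed` and `slowMiddleware` are hypothetical helpers, and the 2 ms sleep simulates work done by another middleware.

```php
<?php
declare(strict_types=1);

/** Wrap a handler so the time spent in it, and in everything it calls, is recorded under $label. */
function timed(string $label, callable $next, array &$timings): callable
{
    return function (string $request) use ($label, $next, &$timings): string {
        $start = hrtime(true);
        try {
            return $next($request);
        } finally {
            $timings[$label] = (hrtime(true) - $start) / 1e6; // milliseconds
        }
    };
}

/** Stand-in for any other middleware: does about 2 ms of work, then delegates. */
function slowMiddleware(callable $next): callable
{
    return function (string $request) use ($next): string {
        usleep(2000);
        return $next($request);
    };
}

$timings = [];
$handler = fn (string $request): string => 'ok';

// Telemetry first: the measurement includes slowMiddleware.
$first = timed('first', slowMiddleware($handler), $timings);
// Telemetry last: slowMiddleware runs outside the measurement.
$last = slowMiddleware(timed('last', $handler, $timings));

$first('GET /');
$last('GET /');
```

After both calls, `$timings['first']` covers the simulated 2 ms of middleware work while `$timings['last']` only covers the handler, which is the gap the "TOO LATE" ordering would hide.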
2. Always Use in preGlobal#
// ✅ Correct - Captures full request lifecycle
$preGlobal = [
TelemetryMiddleware::class,
// ... other middleware
];
// ❌ Wrong - Misses pre-processing time
$postGlobal = [
TelemetryMiddleware::class,
];
3. Use TraceContext Everywhere#
// ✅ Good - Consistent correlation
$logger->info('Action', [
'user_id' => $userId,
'trace_id' => TraceContext::getTraceId(),
]);
// ❌ Bad - No correlation
$logger->info('Action', ['user_id' => $userId]);
4. Add Trace Context to SQL Comments#
// ✅ Good - Queryable in slow query log
$sql = "/* " . TraceContext::getLogContext() . " */ SELECT * FROM users";
// ❌ Bad - Can't correlate slow queries
$sql = "SELECT * FROM users";
5. Return Trace IDs in Error Responses#
// ✅ Good - User can report trace ID for support
return Response::json([
'error' => 'Something went wrong',
'trace_id' => TraceContext::getTraceId(),
], 500);
// ❌ Bad - No way to find error in logs
return Response::json(['error' => 'Something went wrong'], 500);
6. Enable OpenTelemetry in Staging/Production#
// Development: Minimal mode (faster)
if (($_ENV['APP_ENV'] ?? 'production') === 'development') {
$middleware = new TelemetryMiddleware(
log: $logger,
enableOtelIntegration: false
);
} else {
// Staging/Production: Full observability
$middleware = new TelemetryMiddleware(
log: $logger,
enableOtelIntegration: true
);
}