Telemetry & Observability¶
Dahlke PressCenter integrates OpenTelemetry for metrics collection, providing visibility into both machine status (PLC connection, messages, symbols) and application health (Quartz jobs, runtime metrics).
Metrics are exported via OTLP/HTTP to a local collector and visualized in Grafana, accessible over Tailscale.
Architecture¶
PressCenter ──OTLP/HTTP──> OTEL Collector ──> Prometheus (metrics)
                                          ──> Loki (logs)
                                                   │
                                            Grafana (:3000)
The Dahlke.Lib.Telemetry project owns all OpenTelemetry setup — meters, instrument definitions, and OTLP exporter configuration. Other projects reference it and record measurements against its static instruments.
Important
PressCenter is an Avalonia desktop app and does not run a .NET Generic Host (IHost). The telemetry setup uses OpenTelemetrySdk.Create() (introduced in SDK 1.10.0), which is the official API for non-hosted applications. This starts the MeterProvider and its periodic exporting reader immediately, without requiring a host to call StartAsync.
Project Structure¶
Dahlke.Lib.Telemetry/
├── Dahlke.Lib.Telemetry.csproj
├── TelemetrySetup.cs                # OpenTelemetrySdk.Create() + DI registration
├── Instruments/
│   ├── PlcInstruments.cs            # Meter "Dahlke.Plc"
│   ├── JobInstruments.cs            # Meter "Dahlke.Jobs"
│   ├── AppInstruments.cs            # Meter "Dahlke.App"
│   └── StatisticsInstruments.cs     # Meter "Dahlke.Statistics"
└── Settings/
    └── TelemetrySettings.cs         # IOptions<TelemetrySettings>
Configuration¶
Telemetry is configured in appsettings.json:
{
  "Telemetry": {
    "Enabled": true,
    "ServiceName": "Dahlke.PressCenter",
    "OtlpEndpoint": "http://localhost:4318",
    "ExportIntervalMs": 15000
  }
}
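| Key | Default | Description |
|---|---|---|
| Enabled | | Master switch. When false, no OpenTelemetry SDK is created and no metrics are exported. |
| ServiceName | | Service name in exported metrics. |
| OtlpEndpoint | | OTLP/HTTP receiver base URL. The code appends /v1/metrics. |
| ExportIntervalMs | | Periodic export interval in milliseconds. |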
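As a rough illustration of how the Telemetry section maps onto a settings type, the sketch below reads the four keys with System.Text.Json. The real project binds TelemetrySettings through IOptions<TelemetrySettings>; the class body, the TelemetrySettingsReader helper, and the empty-string initializers here are assumptions made for the example, not the actual defaults.

```csharp
#nullable enable
using System.Text.Json;

// Hypothetical shape of the settings class; property names mirror the JSON keys.
public sealed class TelemetrySettings
{
    public bool Enabled { get; set; }
    public string ServiceName { get; set; } = "";   // placeholder, not the real default
    public string OtlpEndpoint { get; set; } = "";  // placeholder, not the real default
    public int ExportIntervalMs { get; set; }
}

// Hypothetical helper that extracts the "Telemetry" section from appsettings-style JSON.
public static class TelemetrySettingsReader
{
    public static TelemetrySettings FromJson(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement section = doc.RootElement.GetProperty("Telemetry");
        // Property matching is case-sensitive by default, which fits the PascalCase keys.
        return section.Deserialize<TelemetrySettings>() ?? new TelemetrySettings();
    }
}
```

Calling TelemetrySettingsReader.FromJson with the JSON shown above should yield Enabled = true and ExportIntervalMs = 15000.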
Note
When setting exporterOptions.Endpoint programmatically in the .NET OTLP SDK, the URL is used as-is — the SDK does not auto-append /v1/metrics (unlike the OTEL_EXPORTER_OTLP_ENDPOINT environment variable, which does). TelemetrySetup.cs therefore appends /v1/metrics to the configured OtlpEndpoint value. Keep OtlpEndpoint set to the base URL (e.g. http://localhost:4318).
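The path handling described in this note can be sketched as a small helper. OtlpEndpoints.ForMetrics is a hypothetical name, and the trailing-slash trimming is an assumption about how one might make the concatenation robust; the actual code in TelemetrySetup.cs may differ.

```csharp
using System;

public static class OtlpEndpoints
{
    // Hypothetical helper: turns the configured base URL into the full
    // OTLP/HTTP metrics URL that is assigned to exporterOptions.Endpoint.
    public static Uri ForMetrics(string baseUrl)
    {
        // Trim a trailing slash so "http://host:4318/" and "http://host:4318"
        // resolve to the same final URL.
        string trimmed = baseUrl.TrimEnd('/');
        return new Uri(trimmed + "/v1/metrics");
    }
}
```

With the base URL from the example configuration, ForMetrics("http://localhost:4318") produces http://localhost:4318/v1/metrics.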
Custom Instruments¶
Four custom meters expose domain-specific metrics.
Meter: Dahlke.Plc¶
PLC connection health, message flow, and operation latencies.
| Instrument | Type | Unit | Description |
|---|---|---|---|
| | ObservableGauge<int> | – | Current PLC state (0=INVALID, 1=RUN, 2=STOP) |
| | Counter<long> | reconnects | Cumulative reconnect count |
| | Counter<long> | events | CommunicationLost events fired |
| | Counter<long> | messages | Total PLC messages received |
| | ObservableGauge<int> | messages | Currently active PLC messages |
| | ObservableGauge<int> | symbols | Number of loaded PLC symbols |
| | Histogram<double> | ms | RPC method call duration |
| | Histogram<double> | ms | Symbol write duration |
| | Histogram<double> | ms | Heartbeat TryReadState latency |
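To show how the three instrument kinds in this table are typically created with System.Diagnostics.Metrics, here is a minimal sketch. The class name PlcInstrumentsSketch and the instrument names (plc.reconnects, plc.heartbeat.duration, plc.connection.state) are assumptions for illustration; only the meter name Dahlke.Plc and the state encoding come from this page.

```csharp
using System;
using System.Diagnostics.Metrics;
using System.Threading;

public sealed class PlcInstrumentsSketch : IDisposable
{
    private readonly Meter _meter = new Meter("Dahlke.Plc");
    private readonly Counter<long> _reconnects;
    private readonly Histogram<double> _heartbeatMs;
    private int _state; // 0=INVALID, 1=RUN, 2=STOP

    public PlcInstrumentsSketch()
    {
        // Instrument names are hypothetical placeholders.
        _reconnects = _meter.CreateCounter<long>("plc.reconnects", unit: "reconnects");
        _heartbeatMs = _meter.CreateHistogram<double>("plc.heartbeat.duration", unit: "ms");
        // Observable gauges are pulled: the callback runs on every collection cycle.
        _meter.CreateObservableGauge<int>("plc.connection.state", () => Volatile.Read(ref _state));
    }

    public void SetState(int state) => Volatile.Write(ref _state, state);
    public void RecordReconnect() => _reconnects.Add(1);
    public void RecordHeartbeat(double milliseconds) => _heartbeatMs.Record(milliseconds);
    public void Dispose() => _meter.Dispose();
}
```

Counters and histograms are pushed at the call site, while the gauge only stores the latest state and lets the reader sample it at each export interval.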
Meter: Dahlke.Jobs¶
Quartz job execution metrics. All instruments are tagged with job.name.
| Instrument | Type | Unit | Description |
|---|---|---|---|
| | Histogram<double> | ms | Per-job execution time |
| | Counter<long> | executions | Executions per job |
| | Counter<long> | errors | Failed executions |
| | Counter<long> | messages | Messages polled by MessagePollingJob |
Meter: Dahlke.App¶
Application-level health metrics.
| Instrument | Type | Unit | Description |
|---|---|---|---|
| | ObservableGauge<double> | seconds | Time since application start |
| | Counter<long> | errors | Unhandled exception count |
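The two application-level instruments could be wired up roughly as follows. This is a sketch under assumptions: the class name AppInstrumentsSketch, the instrument names app.uptime and app.errors.unhandled, and the HookUnhandledExceptions helper are invented for the example and are not taken from the PressCenter source.

```csharp
using System;
using System.Diagnostics;
using System.Diagnostics.Metrics;

public sealed class AppInstrumentsSketch
{
    // Started when the instruments are constructed, which approximates application start.
    private readonly Stopwatch _sinceStart = Stopwatch.StartNew();
    private readonly Counter<long> _unhandled;

    public AppInstrumentsSketch(Meter meter)
    {
        // Instrument names are hypothetical placeholders.
        _unhandled = meter.CreateCounter<long>("app.errors.unhandled", "errors");
        meter.CreateObservableGauge<double>(
            "app.uptime", () => _sinceStart.Elapsed.TotalSeconds, "seconds");
    }

    public void RecordUnhandledException() => _unhandled.Add(1);

    // Counts every unhandled exception raised in the current AppDomain.
    public void HookUnhandledExceptions() =>
        AppDomain.CurrentDomain.UnhandledException += (_, _) => RecordUnhandledException();
}
```

A caller would construct it once with new Meter("Dahlke.App") and call HookUnhandledExceptions() during startup.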
Meter: Dahlke.Statistics¶
Production statistics and power monitoring metrics. Power data is sourced from PowerMonitoringPollingJob; sheet count from TotalSheetCounter.
| Instrument | Type | Unit | Description |
|---|---|---|---|
| | ObservableGauge<long> | sheets | Cumulative total sheet count |
| | ObservableGauge<double> | W | Total active power |
| | ObservableGauge<double> | Wh | Total active energy consumed |
| | ObservableGauge<double> | Hz | Grid frequency |
| | ObservableGauge<double> | – | Total power factor |
| | ObservableGauge<double> | V | Per-phase voltage (tagged |
| | ObservableGauge<double> | A | Per-phase current (tagged |
| | ObservableGauge<long> | sheets | Good sheets produced per counter group (tagged |
| | ObservableGauge<long> | sheets | Waste sheets per counter group (tagged |
| | ObservableGauge<long> | sheets | Target sheet count per counter group (tagged |
| | ObservableGauge<int> | sheets/h | Current production speed |
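Per-phase gauges like the voltage and current rows above are usually implemented as a single observable gauge that reports one tagged measurement per phase. The sketch below illustrates that pattern; the instrument name power.voltage, the tag key phase, and the values L1/L2/L3 are assumptions, since the actual tag names are not shown on this page.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public sealed class PhaseVoltageGaugeSketch
{
    private readonly double[] _volts = new double[3];
    private readonly object _gate = new object();

    public PhaseVoltageGaugeSketch(Meter meter) =>
        // One instrument, three measurements per collection, distinguished by tag.
        meter.CreateObservableGauge<double>("power.voltage", () => Observe(), "V");

    // Called by the polling side with the latest reading for phase 0, 1 or 2.
    public void Update(int phaseIndex, double volts)
    {
        lock (_gate) { _volts[phaseIndex] = volts; }
    }

    private IEnumerable<Measurement<double>> Observe()
    {
        lock (_gate)
        {
            return new[]
            {
                new Measurement<double>(_volts[0], new KeyValuePair<string, object?>("phase", "L1")),
                new Measurement<double>(_volts[1], new KeyValuePair<string, object?>("phase", "L2")),
                new Measurement<double>(_volts[2], new KeyValuePair<string, object?>("phase", "L3")),
            };
        }
    }
}
```

Using one tagged instrument instead of three separate ones keeps the phases in a single Prometheus metric family, so a dashboard can split or aggregate by label.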
Built-in Instrumentation¶
Additionally, OpenTelemetry.Instrumentation.Process and OpenTelemetry.Instrumentation.Runtime provide:
CPU time, memory usage, thread count (process)
GC collections, heap size, thread pool queue length (runtime)
Integration Points¶
App.axaml.cs¶
Telemetry is registered during startup. AddDahlkeTelemetry creates the OpenTelemetrySdk eagerly — the MeterProvider and its periodic reader start exporting immediately:
services.AddDahlkeTelemetry(config);
The OpenTelemetrySdk instance is registered as a singleton. On shutdown, App.This_ShutdownRequested disposes it to flush any remaining metrics:
var otelSdk = Ioc.Default.GetService<OpenTelemetry.OpenTelemetrySdk>();
otelSdk?.Dispose();
The AppDomain.CurrentDomain.UnhandledException handler records unhandled errors via AppInstruments.
Warning
Do not use services.AddOpenTelemetry() from the OpenTelemetry.Extensions.Hosting package. That API registers an IHostedService that is only started by IHost — Avalonia has no host, so the MeterProvider would never activate. Use OpenTelemetrySdk.Create() from the base OpenTelemetry package instead.
AdsPlcCommunication¶
Accepts an optional PlcInstruments via constructor injection. Records:
- Symbol count after Init()
- Heartbeat latency and connection state
- Reconnect and communication-lost events
- Message received counts
- RPC and write operation durations
OfflineAdsPlcCommunication¶
Same injection pattern, so dashboards work in simulation mode. Reports state as RUN and records simulated messages.
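The optional-injection pattern used by both communication classes can be sketched as below: the instruments parameter defaults to null and every recording call is guarded with the null-conditional operator. MessageInstrumentsSketch, PlcCommunicationSketch, and the instrument name plc.messages.received are hypothetical stand-ins, not the real types.

```csharp
#nullable enable
using System.Diagnostics.Metrics;

// Hypothetical, reduced stand-in for PlcInstruments.
public sealed class MessageInstrumentsSketch
{
    private readonly Counter<long> _received;

    public MessageInstrumentsSketch(Meter meter) =>
        _received = meter.CreateCounter<long>("plc.messages.received", "messages");

    public void MessageReceived() => _received.Add(1);
}

// Hypothetical, reduced stand-in for a PLC communication class.
public sealed class PlcCommunicationSketch
{
    private readonly MessageInstrumentsSketch? _instruments;

    public int Handled { get; private set; }

    // Instruments are optional, so the class also works when telemetry is disabled.
    public PlcCommunicationSketch(MessageInstrumentsSketch? instruments = null) =>
        _instruments = instruments;

    public void OnMessage()
    {
        Handled++;
        _instruments?.MessageReceived(); // no-op when no instruments were injected
    }
}
```

Because the guard lives at each call site, the message-handling logic is identical whether or not an instruments object is present.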
Quartz Jobs¶
Each job accepts an optional JobInstruments. Execution is wrapped with a Stopwatch to record duration, count, and errors:
- MessagePollingJob: also records job.messages.polled
- PowerMonitoringPollingJob: also updates StatisticsInstruments power metrics
- MetalcolorBackgroundJob
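The Stopwatch wrapping described for the Quartz jobs might look like the following sketch. It is not the project's JobInstruments class: the type name JobInstrumentsSketch, the MeasureAsync helper, and the instrument names job.duration, job.executions, and job.errors are assumptions; only the job.name tag and the Dahlke.Jobs meter come from this page.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;
using System.Threading.Tasks;

public sealed class JobInstrumentsSketch
{
    private readonly Histogram<double> _duration;
    private readonly Counter<long> _executions;
    private readonly Counter<long> _errors;

    public JobInstrumentsSketch(Meter meter)
    {
        // Instrument names are hypothetical placeholders.
        _duration = meter.CreateHistogram<double>("job.duration", "ms");
        _executions = meter.CreateCounter<long>("job.executions", "executions");
        _errors = meter.CreateCounter<long>("job.errors", "errors");
    }

    // Runs the job body and records duration, execution count, and errors,
    // each tagged with job.name.
    public async Task MeasureAsync(string jobName, Func<Task> body)
    {
        var tag = new KeyValuePair<string, object?>("job.name", jobName);
        var stopwatch = Stopwatch.StartNew();
        try
        {
            await body();
        }
        catch
        {
            _errors.Add(1, tag);
            throw; // let the scheduler see the failure
        }
        finally
        {
            _duration.Record(stopwatch.Elapsed.TotalMilliseconds, tag);
            _executions.Add(1, tag);
        }
    }
}
```

Recording in the finally block means failed runs still contribute to the duration histogram and the execution count, so the error rate can be computed as errors divided by executions.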
TotalSheetCounter¶
Accepts an optional StatisticsInstruments via constructor. Updates production.sheets.total whenever the cumulative sheet count changes.
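One way to back production.sheets.total with the latest counter value is an observable gauge that reads a field updated atomically on each change. The sketch below assumes a class named SheetCountGaugeSketch and an Update method; the real StatisticsInstruments API may be shaped differently.

```csharp
using System.Diagnostics.Metrics;
using System.Threading;

public sealed class SheetCountGaugeSketch
{
    private long _total;

    public SheetCountGaugeSketch(Meter meter) =>
        // The callback is invoked by the metric reader at each export interval.
        meter.CreateObservableGauge<long>(
            "production.sheets.total", () => Interlocked.Read(ref _total), "sheets");

    // Called by the counter whenever the cumulative sheet count changes.
    public void Update(long newTotal) => Interlocked.Exchange(ref _total, newTotal);
}
```

Only the most recent value is exported, so frequent updates between two export cycles cost a single atomic write each and produce no extra data points.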
Docker Compose Stack¶
An observability stack is provided at docker/observability/, ready for deployment when Docker is available on the target machine.
cd docker/observability
cp .env.example .env # edit .env to set GF_SECURITY_ADMIN_PASSWORD
docker compose up -d
The Grafana admin password is read from the .env file (not committed to git). Copy .env.example and set a password before starting the stack.
| Service | Image | Port | Purpose |
|---|---|---|---|
| otel-collector | | 4318 | Receive and route telemetry |
| prometheus | | 9090 | Metrics storage |
| loki | | 3100 | Log aggregation |
| grafana | | 3000 | Dashboards (accessible via Tailscale) |
The stack includes:
- otel-collector-config.yaml: OTLP receiver, batch processor, Prometheus + Loki exporters
- prometheus.yml: scrapes otel-collector every 15 seconds
- grafana/provisioning/datasources/datasources.yaml: auto-provisions Prometheus and Loki datasources
Native Installation (Windows)¶
When Docker is not available, each component can be installed natively on Windows. This section walks through downloading, configuring, and running the full observability stack as Windows services.
Services are managed with WinSW — a single-binary, XML-configured
service wrapper (MIT license). Download WinSW-net461.exe from the
WinSW releases page and copy it into each
service directory, renaming it to match the service ID (e.g. OtelCollector.exe).
Place a matching <service-id>.xml file next to it.
Directory Layout¶
Create a base directory for the stack:
mkdir C:\Observability
mkdir C:\Observability\otel-collector
mkdir C:\Observability\prometheus
mkdir C:\Observability\prometheus\data
mkdir C:\Observability\loki
mkdir C:\Observability\loki\data
mkdir C:\Observability\grafana
Step 1: OpenTelemetry Collector¶
Download the latest release from the OpenTelemetry Collector Contrib releases page. Choose the otelcol-contrib_<version>_windows_amd64.tar.gz asset.
Extract otelcol-contrib.exe to C:\Observability\otel-collector\.
Create the configuration file at C:\Observability\otel-collector\config.yaml:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 10s
    send_batch_size: 1024

exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
    resource_to_telemetry_conversion:
      enabled: true
  otlphttp/loki:
    endpoint: http://localhost:3100/otlp

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki]
Note
Compared to the Docker Compose config, http://loki:3100 becomes http://localhost:3100 since all services run on the same machine.
Test it manually first:
C:\Observability\otel-collector\otelcol-contrib.exe --config C:\Observability\otel-collector\config.yaml
Register as a Windows service with WinSW. Copy WinSW-net461.exe to
C:\Observability\otel-collector\OtelCollector.exe and create
C:\Observability\otel-collector\OtelCollector.xml:
<service>
  <id>OtelCollector</id>
  <name>OpenTelemetry Collector</name>
  <description>Receives and routes telemetry data</description>
  <executable>%BASE%\otelcol-contrib.exe</executable>
  <arguments>--config "%BASE%\config.yaml"</arguments>
  <startmode>Automatic</startmode>
  <log mode="roll"/>
  <onfailure action="restart" delay="10 sec"/>
  <onfailure action="restart" delay="30 sec"/>
  <workingdirectory>%BASE%</workingdirectory>
</service>
C:\Observability\otel-collector\OtelCollector.exe install
C:\Observability\otel-collector\OtelCollector.exe start
Step 2: Prometheus¶
Download the latest release from the Prometheus downloads page. Choose the prometheus-<version>.windows-amd64.zip asset.
Extract prometheus.exe to C:\Observability\prometheus\.
Create the configuration file at C:\Observability\prometheus\prometheus.yml:
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: "otel-collector"
    static_configs:
      - targets: ["localhost:8889"]
Test it manually:
C:\Observability\prometheus\prometheus.exe --config.file=C:\Observability\prometheus\prometheus.yml --storage.tsdb.path=C:\Observability\prometheus\data
Register as a Windows service with WinSW. Copy WinSW-net461.exe to
C:\Observability\prometheus\Prometheus.exe and create
C:\Observability\prometheus\Prometheus.xml:
<service>
  <id>Prometheus</id>
  <name>Prometheus</name>
  <description>Prometheus metrics storage</description>
  <executable>%BASE%\prometheus.exe</executable>
  <arguments>--config.file="%BASE%\prometheus.yml" --storage.tsdb.path="%BASE%\data"</arguments>
  <startmode>Automatic</startmode>
  <log mode="roll"/>
  <onfailure action="restart" delay="10 sec"/>
  <onfailure action="restart" delay="30 sec"/>
  <workingdirectory>%BASE%</workingdirectory>
</service>
Note
WinSW locates its configuration by file name, so the renamed executable must share the same base name as the
XML file (e.g. Prometheus.exe + Prometheus.xml).
C:\Observability\prometheus\Prometheus.exe install
C:\Observability\prometheus\Prometheus.exe start
Prometheus UI will be available at http://localhost:9090.
Step 3: Loki¶
Download the latest release from the Grafana Loki releases page. Choose the loki-windows-amd64.exe.zip asset.
Extract loki-windows-amd64.exe to C:\Observability\loki\ and rename to loki.exe.
Create a minimal configuration file at C:\Observability\loki\loki-config.yaml:
auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: C:\Observability\loki\data
  storage:
    filesystem:
      chunks_directory: C:\Observability\loki\data\chunks
      rules_directory: C:\Observability\loki\data\rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: "2024-01-01"
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  allow_structured_metadata: true
Test it manually:
C:\Observability\loki\loki.exe -config.file=C:\Observability\loki\loki-config.yaml
Register as a Windows service with WinSW. Copy WinSW-net461.exe to
C:\Observability\loki\Loki.exe and create
C:\Observability\loki\Loki.xml:
<service>
  <id>Loki</id>
  <name>Grafana Loki</name>
  <description>Grafana Loki log aggregation</description>
  <executable>%BASE%\loki.exe</executable>
  <arguments>-config.file="%BASE%\loki-config.yaml"</arguments>
  <startmode>Automatic</startmode>
  <log mode="roll"/>
  <onfailure action="restart" delay="10 sec"/>
  <onfailure action="restart" delay="30 sec"/>
  <workingdirectory>%BASE%</workingdirectory>
</service>
C:\Observability\loki\Loki.exe install
C:\Observability\loki\Loki.exe start
Step 4: Grafana¶
Download the latest Windows installer (.msi) from the Grafana download page.
Run the installer. The default installation directory is C:\Program Files\GrafanaLabs\grafana.
Grafana installs itself as a Windows service automatically. After installation:
1. Open http://localhost:3000 in a browser.
2. Log in with the default credentials (admin/admin), then set a new password.
3. Add datasources manually via Connections > Data sources:
   - Prometheus:
     - Type: Prometheus
     - URL: http://localhost:9090
     - Set as default
   - Loki:
     - Type: Loki
     - URL: http://localhost:3100
Alternatively, provision datasources via file. Create C:\Program Files\GrafanaLabs\grafana\conf\provisioning\datasources\datasources.yaml:
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
    editable: true
  - name: Loki
    type: loki
    access: proxy
    url: http://localhost:3100
    editable: true
Then restart the Grafana service:
net stop Grafana
net start Grafana
Verification¶
Once all services are running, verify the stack:
# Check the OTEL Collector is listening for OTLP/HTTP (a 404 on the root path still means the receiver is up)
curl http://localhost:4318
# Check Prometheus targets (should show otel-collector as UP)
curl http://localhost:9090/api/v1/targets
# Check Loki is ready
curl http://localhost:3100/ready
# Open Grafana
Start-Process http://localhost:3000
In Grafana, navigate to Explore, select the Prometheus datasource, and query plc_connection_state to confirm metrics are flowing from PressCenter through the collector.
Windows Firewall¶
If accessing Grafana remotely via Tailscale, allow inbound traffic on port 3000:
New-NetFirewallRule -DisplayName "Grafana" -Direction Inbound -LocalPort 3000 -Protocol TCP -Action Allow
The other services (ports 4317, 4318, 8889, 9090, 3100) only need local access and should not be exposed externally.
Managing Services¶
Each WinSW-managed service supports status, start, stop, restart, and
uninstall commands via its own executable:
# Check service status
C:\Observability\otel-collector\OtelCollector.exe status
C:\Observability\prometheus\Prometheus.exe status
C:\Observability\loki\Loki.exe status
sc query Grafana
# Stop all observability services
C:\Observability\otel-collector\OtelCollector.exe stop
C:\Observability\prometheus\Prometheus.exe stop
C:\Observability\loki\Loki.exe stop
net stop Grafana
# Start all observability services
C:\Observability\otel-collector\OtelCollector.exe start
C:\Observability\prometheus\Prometheus.exe start
C:\Observability\loki\Loki.exe start
net start Grafana
# Remove a service (if uninstalling)
C:\Observability\otel-collector\OtelCollector.exe uninstall
C:\Observability\prometheus\Prometheus.exe uninstall
C:\Observability\loki\Loki.exe uninstall
Log Export¶
When telemetry is enabled, Serilog logs are exported via OTLP/HTTP to the OpenTelemetry Collector, which forwards them to Loki for storage and querying in Grafana.
The pipeline is:
Serilog ──WriteTo.OpenTelemetry──> OTEL Collector (:4318) ──otlphttp/loki──> Loki (:3100)
This is configured automatically: when Telemetry.Enabled is true and Telemetry.OtlpEndpoint is set, Log.UpdateDefaultLogger() adds the OpenTelemetry sink alongside the existing Console and File sinks.
The sink uses OtlpProtocol.HttpProtobuf and sets service.name = Dahlke.PressCenter as a resource attribute, so logs can be filtered by service in Grafana’s Explore view with the Loki datasource.
To verify, start PressCenter with the observability stack running, then open Grafana > Explore > Loki and query {service_name="Dahlke.PressCenter"}.
NuGet Packages¶
The Dahlke.Lib.Telemetry project references:
| Package | Version |
|---|---|
| | 1.11.2 |
| | 1.11.2 |
| | 1.11.0-beta.1 |
| | 1.11.0 |
Note
The base OpenTelemetry package provides OpenTelemetrySdk.Create(). Do not add OpenTelemetry.Extensions.Hosting — it is designed for ASP.NET Core / Generic Host apps and will not work in Avalonia.
Disabling Telemetry¶
Set Telemetry.Enabled to false in appsettings.json. When disabled, no OpenTelemetry SDK is created and no metrics are exported. The application otherwise behaves identically: the instrument classes are not instantiated, and the components that accept them as optional dependencies simply skip recording.