Namespace Virtufin.WorkManager.Services

Classes

ApiLifecyclePublisher

Publishes worker lifecycle events to the virtufin-api's system-events topic. Events are CloudEvents v1.0 envelopes with the WorkManager's URN as the source. The publisher is best-effort: failures are logged but never propagated, so an unreachable API does not break the WorkManager's worker state machine.
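
As a rough, language-neutral illustration of the best-effort pattern described above (written in Python, not the service's C# code), the sketch below wraps a payload in a CloudEvents v1.0 envelope and swallows publish failures. The function names, the event type string, the example URN, and the `send` callable are all hypothetical.

```python
import json
import logging
import uuid
from datetime import datetime, timezone

logger = logging.getLogger("lifecycle")


def build_lifecycle_event(source_urn, event_type, data):
    """Wrap a payload in a CloudEvents v1.0 envelope (required attributes plus time)."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source_urn,          # assumed: the WorkManager's URN
        "type": event_type,            # hypothetical event type name
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


def publish_best_effort(send, event):
    """Try to send the event; log and swallow any failure. Returns True on success."""
    try:
        send(json.dumps(event))
        return True
    except Exception:
        # Failures are logged but never propagated to the caller.
        logger.warning("lifecycle publish failed for event %s", event["id"], exc_info=True)
        return False
```

Because `publish_best_effort` never raises, a caller driving the worker state machine can ignore the return value without risk of an unreachable API interrupting a state transition.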

AppMetrics

Application-level metrics for the WorkManager service.

DaprCircuitBreakerHealthCheck

Health check that reports Dapr circuit breaker state.

DaprResiliencePipeline

Provides retry and circuit-breaker resilience policies for Dapr operations. Policy parameters are configurable through DaprResilienceOptions, which is driven by environment variables.
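
To make the combination of retry and circuit breaker concrete, here is a minimal Python sketch of the general technique. It is not the class's actual implementation; the thresholds, delays, and all names (`CircuitBreaker`, `call_with_retry`, `failure_threshold`, `break_seconds`) are assumptions chosen for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, break_seconds=30.0, clock=time.monotonic):
        self._failure_threshold = failure_threshold
        self._break_seconds = break_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def is_open(self):
        return (self._opened_at is not None
                and self._clock() - self._opened_at < self._break_seconds)

    def call(self, operation):
        if self.is_open:
            raise CircuitOpenError("circuit is open; call rejected")
        # Either closed, or the break has elapsed and this is a trial call.
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        self._failures = 0
        self._opened_at = None
        return result


def call_with_retry(breaker, operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry with exponential backoff; stop immediately if the circuit is open."""
    for attempt in range(1, max_attempts + 1):
        try:
            return breaker.call(operation)
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Placing the breaker inside the retry loop means retries stop as soon as the circuit opens, so a prolonged outage does not keep hammering the sidecar.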

LeaderLease

Persistent lease record stored at LeaseKey.

RecoveryHealthCheck

Health check that verifies worker recovery has completed successfully.

RecoveryLeaderElector

Lease-based leader election backed by the Dapr state store. Multiple WorkManager instances can race to acquire a lease for the workmanager/recovery/leader key; only the holder performs worker recovery. The lease has a TTL so a crashed instance's lease expires automatically and another instance can take over.
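
The lease mechanics can be sketched as follows. This is an illustrative Python model, assuming the state store offers a versioned compare-and-set write; `InMemoryStore`, `Lease`, and `try_acquire` are hypothetical names, and only the `workmanager/recovery/leader` key comes from the description above.

```python
import time
from dataclasses import dataclass

LEASE_KEY = "workmanager/recovery/leader"


@dataclass(frozen=True)
class Lease:
    holder: str
    expires_at: float


class InMemoryStore:
    """Stand-in for a state store with versioned (compare-and-set) writes."""

    def __init__(self):
        self._items = {}  # key -> (version, value)

    def get(self, key):
        return self._items.get(key, (0, None))

    def try_set(self, key, value, expected_version):
        current_version, _ = self._items.get(key, (0, None))
        if current_version != expected_version:
            return False  # someone else wrote first
        self._items[key] = (current_version + 1, value)
        return True


def try_acquire(store, instance_id, ttl_seconds, now=time.time):
    """Acquire or renew the lease; returns True if this instance is the holder."""
    version, lease = store.get(LEASE_KEY)
    current = now()
    if lease is not None and lease.expires_at > current and lease.holder != instance_id:
        return False  # a live lease is held by another instance
    # The lease is absent, expired, or already ours: write conditionally.
    return store.try_set(LEASE_KEY, Lease(instance_id, current + ttl_seconds), version)
```

The conditional write is what resolves the race: if two instances read the same expired lease, only the first write matches the expected version, and the second instance's attempt fails.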

RecoveryState

Tracks the state of worker recovery on startup. Used by the recovery health check to gate Kestrel readiness.

ResilientDaprPublisher

Buffers outbound pub/sub messages during Dapr outages and flushes on reconnect.
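
A minimal Python sketch of the buffer-and-flush idea is shown below. It assumes a bounded in-memory queue that drops the oldest message when full and an explicit `flush` call on reconnect; the class name, the `max_buffered` limit, and the `send` callable are hypothetical and may differ from the actual class.

```python
from collections import deque


class BufferingPublisher:
    def __init__(self, send, max_buffered=1000):
        self._send = send
        # Assumption: a bounded buffer; deque(maxlen=...) drops the oldest entry when full.
        self._buffer = deque(maxlen=max_buffered)

    @property
    def pending(self):
        return len(self._buffer)

    def publish(self, topic, message):
        """Queue the message, then try to drain the queue in order."""
        self._buffer.append((topic, message))
        self.flush()

    def flush(self):
        """Send buffered messages oldest-first; stop at the first failure.

        Returns the number of messages sent.
        """
        sent = 0
        while self._buffer:
            topic, message = self._buffer[0]
            try:
                self._send(topic, message)
            except Exception:
                break  # still offline; keep the message for the next flush
            self._buffer.popleft()
            sent += 1
        return sent
```

Peeking at the head of the queue and removing it only after a successful send preserves ordering and avoids losing a message when a send fails partway through a flush.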

WorkManagerGrpcService

gRPC service implementation for the WorkManager.

WorkManagerRecoveryHostedService

Hosted service that automatically recovers workers on application startup, gated by Enabled and coordinated across multiple WorkManager instances via RecoveryLeaderElector.

When multiple instances start concurrently, only the leader performs recovery; the others wait for the leader lease to expire (or be released) before attempting to take over. This avoids the N×W engine-load waste described in WM #4.

Kestrel readiness is gated via RecoveryHealthCheck:

  • Degraded while recovery is in progress.
  • Unhealthy if recovery failed.
  • Healthy once recovery completes (whether this instance performed it or another leader did).
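
The mapping in the list above can be written as a small function. This Python sketch uses hypothetical enum names and is not the .NET health check API; it only mirrors the three cases listed.

```python
from enum import Enum


class RecoveryPhase(Enum):
    IN_PROGRESS = "in_progress"
    FAILED = "failed"
    COMPLETED = "completed"  # by this instance or by another leader


class HealthStatus(Enum):
    HEALTHY = "Healthy"
    DEGRADED = "Degraded"
    UNHEALTHY = "Unhealthy"


def readiness_for(phase):
    """Translate the recovery phase into the health status reported for readiness."""
    if phase is RecoveryPhase.COMPLETED:
        return HealthStatus.HEALTHY
    if phase is RecoveryPhase.FAILED:
        return HealthStatus.UNHEALTHY
    return HealthStatus.DEGRADED  # recovery still in progress
```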

Interfaces

IRecoveryLeaderElector

Abstraction over the leader-election primitive. Allows the hosted service to be unit-tested without touching a real Dapr state store.

IWorkerRecoveryExecutor

Abstraction over the WorkManager's recovery entry point. Allows the hosted service to be unit-tested without mocking the sealed WorkManager class.