NativeDllEngine Workers

The NativeDllEngine loads per-architecture native shared libraries (.so / .dylib / .dll) into the NativeAOT-compiled WorkManager process. Workers implement a small C ABI defined in virtufin_worker_api_c.h and exchange CloudEvents with the engine via FlatBuffers-encoded buffers.

Scope: NativeDllEngine is in-process. A segfault, illegal instruction, or stack overflow in a native worker terminates the entire WorkManager. Authors MUST treat the worker as part of the engine's address space — defensive bounds checks, NULL checks, and input validation are required. See Caveats below.

Quick Start

1. Vendor the C ABI headers

# From your native worker project
cp /path/to/virtufin-workmanager/src/Virtufin.Worker.DevKit/Schemas/virtufin_worker_api_c.h .
cp /path/to/virtufin-workmanager/src/Virtufin.Worker.DevKit/Schemas/virtufin_worker_api_fb.h .

virtufin_worker_api_c.h is the C-only header (your worker includes this). virtufin_worker_api_fb.h is the C++ header generated from worker_api.fbs by flatc; only C++ workers need it. Both are pinned to a specific FlatBuffers runtime version via static_assert; rebuild them with flatc if you ship a newer FlatBuffers runtime.

2. Implement the Worker

The C ABI is a single header plus two exported functions, Process and FreeResult.

Option A: Pure C

No external dependencies beyond libc:

#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include "virtufin_worker_api_c.h"

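#define SOURCE_URN "urn:com.example.myworker"
#define REPLY_TYPE "com.example.myworker.reply"

/* Placeholder you implement: encodes a WorkerResponse FlatBuffer into a
 * malloc'd buffer, stores its size in *out_len, returns NULL on failure. */
uint8_t* build_minimal_worker_response(const char* id, const char* type,
                                       const char* source, int32_t* out_len);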

int32_t Process(const VirtufinHost* host,
                const uint8_t* in_buf,
                int32_t in_len,
                uint8_t** out_buf,
                int32_t* out_len) {
    /* 1. Decode the input CloudEvent. The engine passes the FlatBuffer
     *    bytes via (in_buf, in_len); pure-C workers can either parse
     *    the buffer via the FlatBuffers C runtime (vendored with
     *    virtufin_worker_api_fb.h) or treat the bytes as opaque and
     *    just respond with a fixed payload.
     *
     * 2. Use host->gateway_call() to invoke the virtufin-api Gateway
     *    for backend service calls. host->log() for structured logging.
     *
     * 3. Build a WorkerResponse FlatBuffer and return it via *out_buf.
     *    The engine calls FreeResult(*out_buf) exactly once after this
     *    function returns; you MUST use the same allocator (malloc)
     *    that FreeResult releases (free).
     */
    (void)host;
    (void)in_buf;
    (void)in_len;

    /* Minimal reply: type="com.example.myworker.reply". The engine
     * publishes this CloudEvent on the worker's reply topic. */
    static const char reply_type[] = REPLY_TYPE;
    static const char source_urn[] = SOURCE_URN;
    static const char msg_id[] = "ok";

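    /* Build the CloudEvent + WorkerResponse FlatBuffers inline, or use
     * the vendored C++ helpers if you compile as C++. */
    uint8_t* buf = build_minimal_worker_response(
        msg_id, reply_type, source_urn, out_len);
    if (buf == NULL) {
        /* Failure path: NULL buffer, zero length, non-zero return. */
        *out_buf = NULL;
        *out_len = 0;
        return 1;
    }
    *out_buf = buf;
    return 0;
}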

void FreeResult(uint8_t* result) {
    free(result);
}

Option B: C++ via FlatBuffers helpers

If you compile as C++, the vendored virtufin_worker_api_fb.h provides CreateCloudEvent, CreateWorkerResponse, and CreateExtension builders:

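#include <cstdint>
#include <cstdlib>
#include <cstring>
#include "virtufin_worker_api_c.h"
#include "virtufin_worker_api_fb.h"
using namespace Virtufin::Worker::FlatBuffers;

#define SOURCE_URN "urn:com.example.myworker"

// extern "C" keeps the export names unmangled so the engine can resolve them.
extern "C" int32_t Process(const VirtufinHost* /*host*/,
                           const uint8_t* /*in_buf*/,
                           int32_t /*in_len*/,
                           uint8_t** out_buf,
                           int32_t* out_len) {
    flatbuffers::FlatBufferBuilder fbb;
    // The *Direct builder takes C strings and creates the string offsets
    // itself; arguments follow the field order in worker_api.fbs.
    auto ce = CreateCloudEventDirect(fbb, "reply-id",
                                     "com.example.myworker.reply",
                                     SOURCE_URN, "1.0",
                                     nullptr, nullptr, nullptr, nullptr,
                                     nullptr, nullptr);
    auto resp = CreateWorkerResponse(fbb, ce, 0 /* error_message */);
    fbb.Finish(resp);

    // Copy into a malloc'd buffer so FreeResult can release it with free().
    int32_t size = static_cast<int32_t>(fbb.GetSize());
    uint8_t* bytes = static_cast<uint8_t*>(std::malloc(size));
    if (bytes == nullptr) {
        *out_buf = nullptr;
        *out_len = 0;
        return 1;
    }
    std::memcpy(bytes, fbb.GetBufferPointer(), size);
    *out_buf = bytes;
    *out_len = size;
    return 0;
}

extern "C" void FreeResult(uint8_t* result) {
    std::free(result);
}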

3. Add a <virtufin*> extension to the nuspec

The engine reads Virtufin-specific metadata from virtufin* elements inside the <metadata> element of the worker's .nuspec:

<?xml version="1.0" encoding="utf-8"?>
<package xmlns="http://schemas.microsoft.com/packaging/2013/05/nuspec.xsd">
  <metadata>
    <id>Virtufin.Worker.MyWorker</id>
    <version>0.0.1</version>
    <authors>...</authors>
    <description>...</description>
    <!-- Virtufin extensions (all optional). The first three show their
         defaults; virtufinLibrary defaults to the package id. -->
    <virtufinAbiVersion>1</virtufinAbiVersion>
    <virtufinEntryPoint>Process</virtufinEntryPoint>
    <virtufinFreeResult>FreeResult</virtufinFreeResult>
    <virtufinLibrary>myworker</virtufinLibrary>
  </metadata>
</package>

| Element | Required | Default | Notes |
| --- | --- | --- | --- |
| <id> | yes | | Standard NuGet id. Used as fallback for virtufinLibrary. |
| <version> | yes | | Standard NuGet version. Informational. |
| virtufinAbiVersion | no | 1 | Integer. The engine supports { 1 } today; higher values are rejected. |
| virtufinLibrary | no | <id> | Library basename (no extension, no lib prefix). Must match [A-Za-z0-9_.-]+, no .., length ≤ 128. |
| virtufinEntryPoint | no | Process | The name of the exported function the engine calls. |
| virtufinFreeResult | no | FreeResult | The name of the exported function that releases the result buffer. |
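
The virtufinLibrary rule can be checked locally before packaging. The following is a minimal sketch of such a check; the function name is_valid_library_name is hypothetical and this is not the engine's own validator, only a mirror of the documented rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pre-packaging check mirroring the documented rule for
 * virtufinLibrary: [A-Za-z0-9_.-]+, no "..", length <= 128. */
bool is_valid_library_name(const char *name) {
    if (name == NULL) return false;

    size_t len = strlen(name);
    if (len == 0 || len > 128) return false;

    /* Reject path-traversal style sequences anywhere in the name. */
    if (strstr(name, "..") != NULL) return false;

    for (size_t i = 0; i < len; i++) {
        char c = name[i];
        bool ok = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                  (c >= '0' && c <= '9') ||
                  c == '_' || c == '.' || c == '-';
        if (!ok) return false;
    }
    return true;
}
```

Under this sketch "myworker" and "my-worker_1.0" pass, while "", "a..b", and "lib/myworker" are rejected.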

4. Build per-architecture

The worker nupkg follows the standard NuGet convention: runtimes/<rid>/native/<file>. Supported RIDs in v1 are linux-x64 and linux-arm64. The engine looks up the entry whose RID matches RuntimeInformation.RuntimeIdentifier of the running process.

worker.nupkg/
├── Virtufin.Worker.MyWorker.nuspec
└── runtimes/
    └── linux-x64/
        └── native/
            └── libmyworker.so

Build script (Linux x86_64 example, pure-C worker):

#!/usr/bin/env bash
set -euo pipefail
RID="${1:-linux-x64}"
OUT="runtimes/${RID}/native"
mkdir -p "${OUT}"
cc -std=c11 -O2 -fPIC -shared \
   -I. \
   -o "${OUT}/libmyworker.so" \
   myworker.c

# Package as a nupkg. -j stores the .nuspec at the zip root without its
# directory path; -r adds the runtimes/ tree recursively, keeping the
# runtimes/ prefix that the package layout requires.
rm -f worker.nupkg
zip -j worker.nupkg Virtufin.Worker.MyWorker.nuspec >/dev/null
zip -r worker.nupkg runtimes >/dev/null

For Windows and macOS support, change the file extension to .dll and .dylib respectively. The engine's LibraryName.BuildFileName derives the right filename per platform.
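
As a rough illustration of per-platform naming, the sketch below maps a library basename and a RID to a file name. It assumes the common conventions (lib prefix with .so on Linux, lib prefix with .dylib on macOS, no prefix with .dll on Windows); the function build_file_name is hypothetical, and the engine's actual LibraryName.BuildFileName may differ in detail.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of per-platform library naming. Returns 0 on
 * success, -1 on a NULL argument, an unknown RID, or a too-small buffer. */
int build_file_name(const char *name, const char *rid,
                    char *out, size_t out_size) {
    const char *prefix;
    const char *ext;

    if (name == NULL || rid == NULL || out == NULL) return -1;

    if (strncmp(rid, "linux-", 6) == 0) {
        prefix = "lib"; ext = ".so";
    } else if (strncmp(rid, "osx-", 4) == 0) {
        prefix = "lib"; ext = ".dylib";
    } else if (strncmp(rid, "win-", 4) == 0) {
        prefix = "";    ext = ".dll";
    } else {
        return -1;
    }

    /* snprintf reports the length it wanted; treat truncation as failure. */
    int n = snprintf(out, out_size, "%s%s%s", prefix, name, ext);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With these assumptions, the basename myworker and the RID linux-x64 give libmyworker.so, which matches the package layout shown earlier.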

5. Deploy

WORKER_ID=$(curl -s -X POST "http://localhost:25001/v1/workers" \
  -H "Content-Type: application/json" \
  -d '{
    "mimeType": "application/x-native-dll",
    "topic": "commands.myworker",
    "group": "myworker-group",
    "codeSource": { "content": "'"$(base64 < worker.nupkg | tr -d '\n')"'" }
  }' | jq -r '.id')

curl -s -X POST "http://localhost:25001/v1/workers/$WORKER_ID/start"

Or load from URL:

curl -s -X POST "http://localhost:25001/v1/workers/$WORKER_ID/code-from-url" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://artifacts.example.com/myworker.nupkg"}'

Discovery

The engine loads the worker's library via NativeLibrary.Load and resolves two exports via NativeLibrary.GetExport:

  • entry_point (default Process) — invoked once per CloudEvent the engine delivers to the worker.
  • free_result (default FreeResult) — invoked exactly once per Process call that returned a non-null *out_buf (including on the error path).

A worker is considered "found" if both exports resolve; otherwise the engine refuses the load with a clear error.

The C ABI

Process function

int32_t Process(const VirtufinHost* host,
                const uint8_t*    in_buf,
                int32_t           in_len,
                uint8_t**         out_buf,
                int32_t*          out_len);

| Parameter | Direction | Notes |
| --- | --- | --- |
| host | in | Pointer to a VirtufinHost struct. Always non-null. See below. |
| in_buf | in | FlatBuffer-encoded CloudEvent matching worker_api.fbs. Length in in_len. |
| in_len | in | Byte length of in_buf. |
| out_buf | out | On success, set to a WorkerResponse FlatBuffer the worker allocated via malloc (or compatible). On failure, set to NULL. |
| out_len | out | Byte length of *out_buf. |

Return value: 0 on success, non-zero on internal failure (e.g. malloc returned NULL).

FreeResult function

void FreeResult(uint8_t* result);

Releases a buffer previously returned by Process via *out_buf. The engine calls this exactly once per non-null *out_buf, including on the error path. The worker MUST use the same allocator (malloc) that this function releases (free).
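
The allocation contract between Process and FreeResult can be sketched as follows. The helper return_buffer is hypothetical, and its payload argument stands in for an already-encoded WorkerResponse FlatBuffer; the point is only the pairing of malloc with free and the state of the out parameters on failure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: hands a copy of `payload` to the engine through
 * the out parameters. Returns 0 on success, non-zero on failure. */
int32_t return_buffer(const uint8_t *payload, int32_t len,
                      uint8_t **out_buf, int32_t *out_len) {
    /* Start from the failure state so every early return is consistent. */
    *out_buf = NULL;
    *out_len = 0;

    if (payload == NULL || len <= 0) return 1;

    uint8_t *copy = malloc((size_t)len);
    if (copy == NULL) return 2;      /* internal failure, nothing to free */

    memcpy(copy, payload, (size_t)len);
    *out_buf = copy;
    *out_len = len;
    return 0;
}

/* Releases with the same allocator family used above (malloc -> free). */
void FreeResult(uint8_t *result) {
    free(result);                    /* free(NULL) is a no-op */
}
```

Resetting the out parameters first means the engine never sees a stale pointer when the worker returns non-zero.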

VirtufinHost struct

typedef struct VirtufinHost {
    void*  engine_ptr;       // opaque GCHandle to the engine
    int32_t abi_version;     // currently 1
    void   (*log)(void*, int32_t, const char*);
    int32_t (*gateway_call)(void*, const char*, const char*,
                            const uint8_t*, int32_t,
                            uint8_t**, int32_t*);
    void    (*free_response)(void*, uint8_t*);
} VirtufinHost;

| Field | Type | Purpose |
| --- | --- | --- |
| engine_ptr | opaque | Workers MUST NOT dereference this. It's an opaque handle for the engine's use. |
| abi_version | int32 | Currently 1. Workers may branch on this for forward-compat. |
| log | fn | Structured logging callback. level: 0=trace, 1=debug, 2=info, 3=warn, 4=error. message is a NUL-terminated UTF-8 string valid for the duration of the call. |
| gateway_call | fn | Call a virtufin-api Gateway service. Returns 0 on success; engine-allocated response is freed by free_response. |
| free_response | fn | Releases a buffer returned by gateway_call. |
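
The callbacks might be used as in the sketch below. Two things here are assumptions, because the struct only gives parameter types: that the first argument of each callback is host->engine_ptr, and that the two strings passed to gateway_call are a service name and a method name. Confirm both against virtufin_worker_api_c.h. The call_gateway helper and the stand-in host are hypothetical and exist only so the sketch compiles and can be exercised locally.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copied from the struct definition so the sketch compiles on its own;
 * a real worker includes virtufin_worker_api_c.h instead. */
typedef struct VirtufinHost {
    void   *engine_ptr;
    int32_t abi_version;
    void    (*log)(void *, int32_t, const char *);
    int32_t (*gateway_call)(void *, const char *, const char *,
                            const uint8_t *, int32_t,
                            uint8_t **, int32_t *);
    void    (*free_response)(void *, uint8_t *);
} VirtufinHost;

enum { LOG_INFO = 2, LOG_ERROR = 4 };   /* documented log levels */

/* ASSUMPTION: engine_ptr is the first callback argument and the strings
 * are (service, method). Returns the response length, or -1 on failure. */
int32_t call_gateway(const VirtufinHost *host,
                     const char *service, const char *method,
                     const uint8_t *req, int32_t req_len) {
    uint8_t *resp = NULL;
    int32_t resp_len = 0;
    int32_t rc = host->gateway_call(host->engine_ptr, service, method,
                                    req, req_len, &resp, &resp_len);
    if (rc != 0 || resp == NULL) {
        host->log(host->engine_ptr, LOG_ERROR, "gateway call failed");
        return -1;
    }
    /* ... consume resp[0 .. resp_len) here, before releasing it ... */
    host->log(host->engine_ptr, LOG_INFO, "gateway call succeeded");
    host->free_response(host->engine_ptr, resp);  /* engine-owned buffer */
    return resp_len;
}

/* --- Stand-in host for local testing only (not part of the engine) --- */
static int g_log_count;
static int g_free_count;

static void fake_log(void *e, int32_t level, const char *msg) {
    (void)e; (void)level; (void)msg;
    g_log_count++;
}

static int32_t fake_gateway(void *e, const char *svc, const char *method,
                            const uint8_t *req, int32_t req_len,
                            uint8_t **resp, int32_t *resp_len) {
    (void)e; (void)svc; (void)method;
    *resp = malloc((size_t)req_len);    /* echo the request back */
    if (*resp == NULL) return 1;
    memcpy(*resp, req, (size_t)req_len);
    *resp_len = req_len;
    return 0;
}

static void fake_free(void *e, uint8_t *p) {
    (void)e;
    free(p);
    g_free_count++;
}

VirtufinHost make_fake_host(void) {
    VirtufinHost h = { NULL, 1, fake_log, fake_gateway, fake_free };
    return h;
}
```

The response buffer belongs to the engine, so the worker releases it through free_response rather than calling free directly.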

The struct layout is 40 bytes on 64-bit platforms (void* = 8, int32_t = 4 padded to 8, three function pointers = 24). On 32-bit platforms the layout differs and is not currently supported.
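
The stated layout can be pinned at compile time. This sketch redefines the struct locally so it stands alone; in a real worker the assertions would sit next to the include of virtufin_worker_api_c.h. The offsets follow from the arithmetic above and assume 8-byte data and function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the documented struct, for a self-contained check. */
typedef struct VirtufinHost {
    void   *engine_ptr;
    int32_t abi_version;
    void    (*log)(void *, int32_t, const char *);
    int32_t (*gateway_call)(void *, const char *, const char *,
                            const uint8_t *, int32_t,
                            uint8_t **, int32_t *);
    void    (*free_response)(void *, uint8_t *);
} VirtufinHost;

#if UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
/* 8 (pointer) + 4 (int32) + 4 (padding) + 3 * 8 (function pointers). */
static_assert(sizeof(VirtufinHost) == 40, "unexpected VirtufinHost size");
static_assert(offsetof(VirtufinHost, engine_ptr) == 0, "engine_ptr offset");
static_assert(offsetof(VirtufinHost, abi_version) == 8, "abi_version offset");
static_assert(offsetof(VirtufinHost, log) == 16, "log offset");
static_assert(offsetof(VirtufinHost, gateway_call) == 24, "gateway_call offset");
static_assert(offsetof(VirtufinHost, free_response) == 32, "free_response offset");
#else
#error "32-bit platforms are not currently supported by this ABI"
#endif
```

A build that fails one of these assertions signals a packing or toolchain mismatch before the worker is ever loaded.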

CloudEvent FlatBuffer wire format

The engine serialises the input CloudEvent into a FlatBuffer matching worker_api.fbs. See src/Virtufin.Worker.DevKit/Schemas/worker_api.fbs for the full schema. Key fields:

  • id, type, source, specversion — CloudEvents core attributes.
  • datacontenttype, dataschema, subject, time — optional core attributes.
  • data — opaque byte payload (the CloudEvent's Data field, JSON-encoded if non-string).
  • extensions — vector of (name, value) string pairs. The correlationid extension is always the first entry if present in the input.

The worker returns a WorkerResponse FlatBuffer that normally sets one of:

  • result_event: a CloudEvent to publish on the reply topic.
  • error_message: a human-readable error string; the engine surfaces this as a worker.error lifecycle event.

Setting both is also valid: the engine then publishes the result and surfaces the error.

Caveats

In-process crash isolation

A native worker runs in the same address space as the AOT-compiled WorkManager. An unrecoverable native fault (segfault, illegal instruction, stack overflow) terminates the entire WorkManager process — there is no managed try/catch that can rescue such a fault.

| Engine | Crash isolation |
| --- | --- |
| PythonEngine | Process boundary (subprocess) |
| DotNetDllEngine | None: in-process (JIT via hostfxr) |
| CSharpSourceEngine | AppDomain equivalent (Roslyn-compiled) |
| NativeDllEngine | None: in-process |

Author defensively: bounds-check every input length, validate pointers, prefer malloc-family allocation that the engine can free, and unit-test your worker against malformed inputs.
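
One cheap defensive step is a structural pre-check on the input before any decoding. The sketch below relies on the FlatBuffers layout for buffers that are not size-prefixed: the first four bytes hold a little-endian offset to the root table, and the root table begins with a 4-byte vtable offset. The function input_looks_sane is hypothetical and is not a substitute for the full FlatBuffers verifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pre-check: rejects NULL, too-short buffers, and root
 * offsets that point outside [4, in_len - 4]. */
bool input_looks_sane(const uint8_t *in_buf, int32_t in_len) {
    if (in_buf == NULL || in_len < 8) return false;

    /* Decode byte by byte to avoid alignment and endianness surprises. */
    uint32_t root = (uint32_t)in_buf[0]
                  | ((uint32_t)in_buf[1] << 8)
                  | ((uint32_t)in_buf[2] << 16)
                  | ((uint32_t)in_buf[3] << 24);

    if (root < 4u) return false;                    /* overlaps the header */
    if (root > (uint32_t)in_len - 4u) return false; /* table start past end */
    return true;
}
```

A worker can call this at the top of Process and return an error response when it fails, rather than dereferencing offsets taken from untrusted bytes.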

Unmanaged timeout

MessageHandlingTimeoutSeconds is enforced via a CancellationToken on the managed side. The engine sets the token and then calls Process. The token's cancellation does not interrupt native code — a hung Process blocks the caller until the timeout fires, at which point the engine surfaces a worker.error lifecycle event and leaves the worker in an errored state. The native call continues to run until it returns or the process crashes.

Workers that may run for a long time should honor the timeout cooperatively (e.g. via a watchdog thread setting a "should stop" flag the worker polls) or be guaranteed to return within the configured timeout.
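
The cooperative approach can be sketched with a C11 atomic flag that the worker polls between chunks of work. The watchdog itself (a thread or timer that calls request_stop when the budget runs out) is left out, and all names here are hypothetical; the example job simply sums integers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Set by a watchdog when the time budget is exhausted. */
static atomic_bool g_should_stop;

void request_stop(void) { atomic_store(&g_should_stop, true); }
void reset_stop(void)   { atomic_store(&g_should_stop, false); }

/* Hypothetical long-running job: sums `n` items and polls the flag once
 * per chunk. Returns 0 when finished, 1 when it stopped early. */
int32_t sum_with_stop_check(const int32_t *items, int32_t n,
                            int64_t *total) {
    enum { CHUNK = 1024 };
    *total = 0;
    for (int32_t i = 0; i < n; i++) {
        /* Poll at chunk boundaries to keep the check cheap. */
        if (i % CHUNK == 0 && atomic_load(&g_should_stop)) return 1;
        *total += items[i];
    }
    return 0;
}
```

The chunk size trades responsiveness against polling overhead; a worker would pick it so that one chunk finishes well inside the configured timeout.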

ABI version policy

The engine accepts workers declaring any ABI version in WorkerManifest.SupportedAbiVersions (currently { 1 }). When the WorkManager ships a new major ABI version, the previous version is retained in the supported set for at least one release cycle. Workers declaring ABIs older than the lowest supported version are refused with a clear error.

Performance

| Phase | Duration (typical) |
| --- | --- |
| First LoadCodeAsync (per-RID library resolve + dlopen) | ~10 ms |
| Per-message ProcessAsync (FlatBuffer encode + native call + decode) | ~50-200 µs |
| DisposeAsync (library unload + host struct free + temp file delete) | ~5 ms |

Compared to DotNetDllEngine, NativeDllEngine has the same in-process isolation profile; it pays a per-message FlatBuffer round-trip in exchange for having no JIT startup cost. Compared to CSharpSourceEngine, NativeDllEngine skips Roslyn compilation at load time (the library is pre-compiled) but adds a per-message FlatBuffer round-trip.

Debugging

  • The engine logs each LoadCodeAsync and DisposeAsync at info level with the resolved entry path, library basename, and ABI version.
  • Native worker logs flow through the engine's Log callback and appear in the WorkManager log as [Information] Native worker: <message> or [Warning] / [Error] depending on the worker's chosen level.
  • Set Logging:LogLevel:Virtufin.WorkManager.Engine.NativeDll to Debug to see the per-call host callback invocations.

CI Notes

The engine's integration test (Virtufin.WorkManager.Engine.NativeDll.Tests) builds a small C++ shim (libecho_worker.dylib/.so) via a BeforeTargets="Build" target that invokes Resources/build_echo_worker.sh. The script is a no-op when clang++ is not on PATH, so CI runners without a C++ toolchain (e.g. the Gitea act runner) build the test project without building the shim. Local macOS dev builds have clang++ via xcode-select and run the script normally. To force the shim build off in any environment, pass -p:BuildEchoWorkerShim=false to dotnet build.