Skip to main content

Map-Reduce Framework

tip

Phase 0 complete. stage_exec_replay with a noop consumer has landed (PR #19806). Subsequent phases are planned.

The Map-Reduce Framework is an extensible parallel block replay engine. It unifies every use case in Erigon that shares the same core pattern — execute blocks in parallel using history readers, collect results, process them in order — under a single engine with pluggable components.

Today each such use case has its own wiring: stage_custom_trace for receipt/log/trace indices, stage_exec_replay for benchmarking, debug_traceBlockByNumber for RPC traces, QMTree file reproduction, and future user-defined indexers. The Map-Reduce Framework replaces all of these with a composable engine where TracerFactory components produce trace data and Consumer components process it — independently registered, independently deployable.


Key Capabilities

Conflict-free parallel execution: Each worker has its own read-only kv.TemporalRo transaction. No shared VersionMap — zero write conflicts. Workers execute transactions and populate TxResult.Extra with tracer-specific data; the reduce side delivers results in txNum order via a min-heap.

MCP agent-extensible: AI agents can define custom map-reduce services over block history through an MCP interface — no Go required. Services survive restarts and backfill any gap automatically.

Isolated execution tiers: Extensions run in tiered environments from in-process trusted code through Starlark, WASM (Wazero), and out-of-process gRPC — with per-job resource budgets that protect the node from runaway extensions.


How It Works

┌─── Worker 0 (HistoryReaderV3, own tx) ───┐
│ │
Input Queue ───────>├─── Worker 1 (HistoryReaderV3, own tx) ───├──> ResultsQueue ──> Consumer (ordered)
(TxTasks) │ │ (min-heap)
└─── Worker N (HistoryReaderV3, own tx) ───┘

Map side: Each worker has its own read-only kv.TemporalRo transaction. No shared VersionMap — zero write conflicts. Workers execute transactions and populate TxResult.Extra with tracer-specific data.

Reduce side: Results arrive in txNum order via a min-heap. The Consumer processes them.

TracerFactory

Each worker needs its own tracer instance. A factory creates one per worker:

type TracerFactory interface {
component.Configurable
NewTracer() *tracing.Hooks
}

Factories register via init() + components.cfg:

func init() {
components.Register("tracer-calltrace", factory)
}

Built-in factories: CallTracerFactory, PrestateTracerFactory, StorageWriteTracerFactory, QmtreeTracerFactory. Tracers compose — multiple factories produce multiple tracing.Hooks that chain together (already supported by the EVM).

Consumer

Consumers implement the Component[P] provider lifecycle:

type Consumer interface {
component.Configurable
component.Activatable
component.Deactivatable
Reduce(br *BlockResult, result *TxResult, tx kv.TemporalTx) error
}

Built-in consumers:

ConsumerInputOutputUse case
NoopConsumeranynothingBenchmarking
DomainConsumerreceipts, logs, tracesSharedDomains writesstage_custom_trace
JSONStreamConsumeranyJSON lines to io.WriterCLI output, file export
RPCStreamConsumeranystreaming JSON-RPC responsedebug_traceBlockRange
MCPToolConsumeranyMCP tool resultAI agent queries
AggregateConsumeranyin-memory aggregationAnalytics, metrics
QmtreeConsumerqmtree tracer dataqmtree filesCommitment tree reproduction

Entry Points

All entry points construct a MapReduceJob and submit it to the engine via the service bus. The engine resolves components by name from the registry.

CLI:

integration stage_exec_replay --datadir=... --chain=hoodi \
--tracer=calltracer --output=jsonl --output-file=traces.jsonl

JSON-RPC:

{
"method": "debug_traceBlockRange",
"params": [{"fromBlock": "0x100000", "toBlock": "0x100100", "tracer": "callTracer"}]
}

MCP tool:

{
"name": "erigon_trace_block_range",
"parameters": {"from_block": 1000000, "to_block": 1000100, "tracer": "callTracer", "filter": {"to": "0x..."}}
}

Persistence Layer

Outputs that need to survive restarts are written to Erigon's domain storage — temporal key-value backed by MDBX and frozen snapshot files:

  • Every value is keyed by txNum for point-in-time queries (GetAsOf)
  • Domains are periodically merged into frozen snapshot files by the aggregator
  • Both live and historic data are accessible through the same TemporalTx interface

Custom extension services get their own domain namespace: ext_usdc_transfers, ext_storage_writes, etc.

Live Stream Attachment

For chain-tip data, consumers subscribe to the Execution component's StateChangeEvent via the service bus. A PersistAndStreamConsumer combines both paths:

Execution (live) ──> Consumer ──┬──> Persistent Domain (serves historic RPCs via GetAsOf)
└──> Live Stream channel (serves subscriptions + MCP streams)

An RPC client requesting data spanning both historic and live blocks sees one continuous stream.

MCP Agent-Extensible Services

The MCP interface enables AI agents to define custom map-reduce services without writing Go:

Agent: "Track all USDC transfers > $1M, persist results, notify me of new ones"

Framework:
1. Creates MapReduceJob with transfer-tracer + amount-filter
2. Backfills from genesis into a new extension domain
3. Attaches live stream consumer to chain-tip execution
4. Exposes query tool: erigon_query_usdc_large_transfers(from, to)
5. Exposes subscription: erigon_subscribe_usdc_large_transfers()

Services survive restarts. On restart, the framework detects the domain's progress, backfills any gap, and re-attaches the live stream.

Isolation Tiers

Extensions are isolated from the core node through tiered execution environments:

TierModelIsolationUse case
0In-process, compiledNone (trusted)Built-in tracers/consumers
1In-process, StarlarkMemory/CPU limitsUser-defined filters
2In-process, WASM (Wazero)Full sandboxThird-party extensions
3Out-of-process, gRPCProcess boundaryUntrusted code

Resource budgets per job: max memory, max CPU time, max workers, max domain size, rate limit on DB reads. Exceeding limits pauses or kills the job, not the node.

Implementation Phases

PhaseScopeStatus
0stage_exec_replay with noop consumerDone (PR #19806)
1TracerFactory registry + CLI --tracer/--output flagsPlanned
2Consumer abstraction (DomainConsumer, JSONStreamConsumer)Planned
3RPC entry point (debug_traceBlockRange)Planned
4Persistence + live stream (PersistAndStreamConsumer, ServiceRegistry)Planned
5MCP agent-extensible servicesFuture
6Isolation (Starlark, resource budgets)Future
7WASM runtime, out-of-process extensionsFuture