Map-Reduce Framework
Phase 0 complete. stage_exec_replay with a noop consumer has landed (PR #19806). Subsequent phases are planned.
The Map-Reduce Framework is an extensible parallel block replay engine. It unifies every use case in Erigon that shares the same core pattern — execute blocks in parallel using history readers, collect results, process them in order — under a single engine with pluggable components.
Today each such use case has its own wiring: stage_custom_trace for receipt/log/trace indices, stage_exec_replay for benchmarking, debug_traceBlockByNumber for RPC traces, QMTree file reproduction, and future user-defined indexers. The Map-Reduce Framework replaces all of these with a composable engine where TracerFactory components produce trace data and Consumer components process it — independently registered, independently deployable.
Key Capabilities
Conflict-free parallel execution:
Each worker has its own read-only kv.TemporalRo transaction. No shared VersionMap — zero write conflicts. Workers execute transactions and populate TxResult.Extra with tracer-specific data; the reduce side delivers results in txNum order via a min-heap.
MCP agent-extensible: AI agents can define custom map-reduce services over block history through an MCP interface — no Go required. Services survive restarts and backfill any gap automatically.
Isolated execution tiers: Extensions run in tiered environments from in-process trusted code through Starlark, WASM (Wazero), and out-of-process gRPC — with per-job resource budgets that protect the node from runaway extensions.
How It Works
┌─── Worker 0 (HistoryReaderV3, own tx) ───┐
│ │
Input Queue ───────>├─── Worker 1 (HistoryReaderV3, own tx) ───├──> ResultsQueue ──> Consumer (ordered)
(TxTasks) │ │ (min-heap)
└─── Worker N (HistoryReaderV3, own tx) ───┘
Map side: Each worker has its own read-only kv.TemporalRo transaction. No shared VersionMap — zero write conflicts. Workers execute transactions and populate TxResult.Extra with tracer-specific data.
Reduce side: Results arrive in txNum order via a min-heap. The Consumer processes them.
TracerFactory
Each worker needs its own tracer instance. A factory creates one per worker:
type TracerFactory interface {
component.Configurable
NewTracer() *tracing.Hooks
}
Factories register via init() + components.cfg:
func init() {
components.Register("tracer-calltrace", factory)
}
Built-in factories: CallTracerFactory, PrestateTracerFactory, StorageWriteTracerFactory, QmtreeTracerFactory. Tracers compose — multiple factories produce multiple tracing.Hooks that chain together (already supported by the EVM).
Consumer
Consumers implement the Component[P] provider lifecycle:
type Consumer interface {
component.Configurable
component.Activatable
component.Deactivatable
Reduce(br *BlockResult, result *TxResult, tx kv.TemporalTx) error
}
Built-in consumers:
| Consumer | Input | Output | Use case |
|---|---|---|---|
NoopConsumer | any | nothing | Benchmarking |
DomainConsumer | receipts, logs, traces | SharedDomains writes | stage_custom_trace |
JSONStreamConsumer | any | JSON lines to io.Writer | CLI output, file export |
RPCStreamConsumer | any | streaming JSON-RPC response | debug_traceBlockRange |
MCPToolConsumer | any | MCP tool result | AI agent queries |
AggregateConsumer | any | in-memory aggregation | Analytics, metrics |
QmtreeConsumer | qmtree tracer data | qmtree files | Commitment tree reproduction |
Entry Points
All entry points construct a MapReduceJob and submit it to the engine via the service bus. The engine resolves components by name from the registry.
CLI:
integration stage_exec_replay --datadir=... --chain=hoodi \
--tracer=calltracer --output=jsonl --output-file=traces.jsonl
JSON-RPC:
{
"method": "debug_traceBlockRange",
"params": [{"fromBlock": "0x100000", "toBlock": "0x100100", "tracer": "callTracer"}]
}
MCP tool:
{
"name": "erigon_trace_block_range",
"parameters": {"from_block": 1000000, "to_block": 1000100, "tracer": "callTracer", "filter": {"to": "0x..."}}
}
Persistence Layer
Outputs that need to survive restarts are written to Erigon's domain storage — temporal key-value backed by MDBX and frozen snapshot files:
- Every value is keyed by
txNumfor point-in-time queries (GetAsOf) - Domains are periodically merged into frozen snapshot files by the aggregator
- Both live and historic data are accessible through the same
TemporalTxinterface
Custom extension services get their own domain namespace: ext_usdc_transfers, ext_storage_writes, etc.
Live Stream Attachment
For chain-tip data, consumers subscribe to the Execution component's StateChangeEvent via the service bus. A PersistAndStreamConsumer combines both paths:
Execution (live) ──> Consumer ──┬──> Persistent Domain (serves historic RPCs via GetAsOf)
└──> Live Stream channel (serves subscriptions + MCP streams)
An RPC client requesting data spanning both historic and live blocks sees one continuous stream.
MCP Agent-Extensible Services
The MCP interface enables AI agents to define custom map-reduce services without writing Go:
Agent: "Track all USDC transfers > $1M, persist results, notify me of new ones"
Framework:
1. Creates MapReduceJob with transfer-tracer + amount-filter
2. Backfills from genesis into a new extension domain
3. Attaches live stream consumer to chain-tip execution
4. Exposes query tool: erigon_query_usdc_large_transfers(from, to)
5. Exposes subscription: erigon_subscribe_usdc_large_transfers()
Services survive restarts. On restart, the framework detects the domain's progress, backfills any gap, and re-attaches the live stream.
Isolation Tiers
Extensions are isolated from the core node through tiered execution environments:
| Tier | Model | Isolation | Use case |
|---|---|---|---|
| 0 | In-process, compiled | None (trusted) | Built-in tracers/consumers |
| 1 | In-process, Starlark | Memory/CPU limits | User-defined filters |
| 2 | In-process, WASM (Wazero) | Full sandbox | Third-party extensions |
| 3 | Out-of-process, gRPC | Process boundary | Untrusted code |
Resource budgets per job: max memory, max CPU time, max workers, max domain size, rate limit on DB reads. Exceeding limits pauses or kills the job, not the node.
Implementation Phases
| Phase | Scope | Status |
|---|---|---|
| 0 | stage_exec_replay with noop consumer | Done (PR #19806) |
| 1 | TracerFactory registry + CLI --tracer/--output flags | Planned |
| 2 | Consumer abstraction (DomainConsumer, JSONStreamConsumer) | Planned |
| 3 | RPC entry point (debug_traceBlockRange) | Planned |
| 4 | Persistence + live stream (PersistAndStreamConsumer, ServiceRegistry) | Planned |
| 5 | MCP agent-extensible services | Future |
| 6 | Isolation (Starlark, resource budgets) | Future |
| 7 | WASM runtime, out-of-process extensions | Future |