Version 1.1 — March 2026
UltraDAG is a minimal cryptocurrency built on a leaderless DAG-BFT consensus protocol. The entire consensus core is 1,100 lines of Rust across five files. The protocol achieves Byzantine fault tolerance through descendant coverage finality — a vertex is finalized when 2f+1 distinct validators have built on top of it. This implicit voting mechanism eliminates leader election, view changes, and explicit vote messages. The system has been validated through 373 automated tests (all passing) and a 4-node Fly.io testnet with 1800+ consensus rounds. UltraDAG demonstrates that a complete, working cryptocurrency with a 21 million supply cap, halving schedule, and validator staking can be built with radical simplicity.
Traditional Byzantine Fault Tolerant (BFT) consensus protocols — such as PBFT, Tendermint, and HotStuff — operate in a leader-based paradigm. In each round or view, a designated leader proposes a block, and other validators vote on it. This creates three fundamental limitations:

1. The leader is a throughput and latency bottleneck: every proposal flows through a single node per view.
2. Leader failure requires view-change machinery (timeouts, re-election, state transfer), historically the most complex and bug-prone part of these protocols.
3. Non-leader validators contribute little bandwidth while waiting to vote, leaving most of the network's capacity idle.
DAG-based consensus protocols address these limitations by allowing all validators to produce blocks (vertices) concurrently. Recent protocols such as DAG-Rider, Tusk, Bullshark, and Shoal++ have demonstrated that DAG structures can achieve consensus without explicit voting rounds.
UltraDAG implements a complete, working cryptocurrency using a custom leaderless DAG-BFT protocol. The entire consensus rule can be stated in three sentences:

1. Every round, each validator produces exactly one signed vertex referencing all current DAG tips as parents.
2. A vertex is finalized when ⌈2n/3⌉ distinct validators have produced a descendant of it and all of its parents are finalized.
3. Finalized vertices are applied to state in deterministic order: by round, then ancestor count, then hash.
Everything else in this paper — the round gate, stall recovery, deterministic ordering, state derivation — is implementation detail required to make these three sentences operational. The protocol’s correctness can be reasoned about entirely from these three rules.
The complete consensus implementation is contained in five files totaling 1,887 lines of Rust, of which 1,100 lines are production code and 787 are inline unit tests:
| File | Production | Tests | Total |
|---|---|---|---|
| vertex.rs | 90 | 142 | 232 |
| dag.rs | 609 | 288 | 897 |
| finality.rs | 212 | 163 | 375 |
| ordering.rs | 69 | 98 | 167 |
| validator_set.rs | 120 | 96 | 216 |
| Total | 1,100 | 787 | 1,887 |
For comparison:
| System | Approx. Consensus Lines |
|---|---|
| Narwhal/Tusk (Mysten Labs) | ~15,000 |
| Bullshark | ~20,000 (incl. Narwhal) |
| Shoal++ | ~30,000 (incl. Bullshark + pipelining + reputation) |
| UltraDAG | 1,100 |
This is not an accident. It is the result of deliberate omissions.
No separate mempool layer. Narwhal introduced a dedicated data-availability layer where transactions are disseminated independently of consensus ordering. UltraDAG bundles transactions directly into vertices. This couples throughput to round timing (see Section 15) but eliminates an entire subsystem (~5,000 lines in Narwhal). For the target use case — IoT micropayments — per-node transaction volume is modest and a separate mempool layer adds complexity without proportional benefit.
No leader or anchor selection. Bullshark and Shoal++ designate “anchor” vertices in even rounds whose causal history determines the commit order. This requires leader-election logic, fallback paths for missing anchors, and careful interaction between the anchor schedule and the DAG structure. UltraDAG replaces all of this with a single descendant-coverage check: a vertex is final when enough validators have built on it. No vertex is special.
No wave structure. DAG-Rider organizes rounds into “waves” of 4 rounds each, with a common-coin-based leader per wave. Shoal++ pipelines waves for lower latency. UltraDAG has no waves — every round is identical. Round r works exactly like round r+1. This eliminates the wave-management state machine entirely.
No reputation system. Shoal++ includes a validator reputation mechanism that tracks responsiveness and adjusts leader selection accordingly (~2,000 lines). UltraDAG handles stall recovery in 8 lines: after 3 consecutive round skips, the validator produces unconditionally. This is sufficient for networks with known, cooperating validators.
UltraDAG is not optimized for maximum throughput. It is optimized for minimum correct implementation.
The 27x reduction in consensus code (1,100 vs. ~30,000 lines) directly reduces the attack surface. Every line of consensus code is a potential source of consensus bugs — the most catastrophic class of failure in a distributed ledger. A protocol that can be fully described in three sentences can be fully audited by a single engineer in a single day. This property is worth more than any throughput benchmark for networks where correctness is the primary requirement.
Let V = {v1, v2, …, vn} be the set of n validators. Each validator vi holds an Ed25519 keypair (ski, pki) and is identified by an address:
addr_i = Blake3(pk_i)
We assume the standard BFT fault model: at most f validators are Byzantine, where n ≥ 3f + 1.
The protocol assumes partial synchrony: there exists an unknown Global Stabilization Time (GST) after which all messages between honest validators are delivered within a bounded delay δ. Before GST, messages may be delayed arbitrarily.
| Primitive | Algorithm | Purpose |
|---|---|---|
| Digital Signatures | Ed25519 (ed25519-dalek 2.2.0) | Vertex and transaction authentication |
| Hashing | Blake3 | Address derivation, vertex identity, Merkle trees |
| Replay Prevention | NETWORK_ID prefix | Cross-network signature isolation |
The core data structure is a directed acyclic graph G = (V, E) where V is the set of all accepted vertices and E = {(u, v) : H(u) ∈ v.parents}. Each vertex is a tuple:
v = (block, parents, round, validator, pub_key, signature)
Where block contains transactions, parents references all DAG tips at time of creation, round is the logical round number, and signature is an Ed25519 signature over the vertex’s signable bytes.
H(v) = Blake3(block_hash || round_LE64 || validator || parent_0 || ... || parent_k)
signable(v) = NETWORK_ID || block_hash || parent_0 || ... || parent_k || round_LE64 || validator
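The definitions above are plain byte concatenations, which can be sketched in a few lines of std-only Rust. The function name, the example network id, and the fixed 32-byte types are illustrative, not taken from the actual implementation:

```rust
// Minimal sketch of signable-byte construction per the definition above.
// Names and the NETWORK_ID value are illustrative.
fn signable_bytes(
    network_id: &[u8],
    block_hash: &[u8; 32],
    parents: &[[u8; 32]],
    round: u64,
    validator: &[u8; 32],
) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(network_id);           // NETWORK_ID prefix (replay isolation)
    out.extend_from_slice(block_hash);           // block_hash
    for p in parents {
        out.extend_from_slice(p);                // parent_0 .. parent_k
    }
    out.extend_from_slice(&round.to_le_bytes()); // round_LE64
    out.extend_from_slice(validator);            // validator address
    out
}

fn main() {
    // 16-byte network id + 32 + 2*32 + 8 + 32 = 152 bytes
    let bytes = signable_bytes(b"example-net-id!!", &[0; 32], &[[1; 32], [2; 32]], 7, &[3; 32]);
    assert_eq!(bytes.len(), 152);
}
```

Because NETWORK_ID is the first field, a signature produced on one network can never verify on another, which is the cross-network replay isolation described in Section 3.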
Each honest validator produces exactly one vertex per round. The validator uses optimistic responsiveness: it waits for either the round timer (default: 5 seconds) or a notification that a new vertex has arrived. When a notification fires and the previous round has quorum, the validator produces immediately without waiting for the timer.
The implementation uses tokio::select! between the round timer and round_notify (fired on every new vertex insertion). The timer resets after early production to prevent double-firing. This achieves sub-second finality latency under normal conditions, while the timer guarantees progress even without notifications.
A vertex v is accepted into the DAG if and only if all of the following hold:

1. The Ed25519 signature verifies over signable(v) against the claimed public key.
2. Every parent referenced by v already exists in the local DAG.
3. No other vertex from the same validator at the same round exists (equivocation check).
4. v.round is within MAX_FUTURE_ROUNDS (10) of the local DAG round.
When a vertex fails insertion due to missing parents, the receiving node initiates a recursive parent fetch protocol to ensure DAG convergence across partitioned nodes:
1. A GetParents message is sent to the source peer containing the missing parent hashes (capped at 32 per request).
2. The peer responds with ParentVertices containing the requested vertices from its local DAG.
3. On receipt, resolve_orphans() iterates the orphan buffer and attempts to insert buffered vertices whose parents now exist.
4. If orphans persist, the node falls back to GetDagVertices to trigger bulk re-sync.

This protocol guarantees that two honest nodes will eventually converge to the same DAG state after any transient network partition, provided they can exchange messages.
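The core of orphan resolution is a fixed-point loop: keep inserting buffered vertices whose parents have become available until no more progress is possible. A toy std-only sketch (u64 stands in for vertex hashes; the real code re-runs the full acceptance checks on each insertion):

```rust
use std::collections::HashMap;

// Toy sketch of resolve_orphans(). `dag` and `orphans` map a vertex
// hash to its parent hashes. Real code would also re-verify signature,
// equivocation, and round bound before each insertion.
fn resolve_orphans(dag: &mut HashMap<u64, Vec<u64>>, orphans: &mut HashMap<u64, Vec<u64>>) {
    loop {
        // Collect buffered vertices whose parents all exist in the DAG now.
        let ready: Vec<u64> = orphans
            .iter()
            .filter(|(_, parents)| parents.iter().all(|p| dag.contains_key(p)))
            .map(|(hash, _)| *hash)
            .collect();
        if ready.is_empty() {
            break; // fixed point: remaining orphans still have missing parents
        }
        for hash in ready {
            let parents = orphans.remove(&hash).unwrap();
            dag.insert(hash, parents);
        }
    }
}

fn main() {
    let mut dag = HashMap::from([(1, vec![])]);
    // Vertex 3 arrived before 2, which arrived before its parent 1 was known.
    let mut orphans = HashMap::from([(3, vec![2]), (2, vec![1])]);
    resolve_orphans(&mut dag, &mut orphans);
    assert!(dag.contains_key(&2) && dag.contains_key(&3));
    assert!(orphans.is_empty());
}
```

Each pass unlocks at most one more layer of the buffered chain, so a chain of k orphans resolves in k passes.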
For a vertex v in DAG state G:
DV(v, G) = { u.validator : u ∈ descendants(v, G) }
The set of distinct validator addresses that have produced at least one descendant of v.
q(n) = ⌈2n/3⌉ = ⌊(2n + 2) / 3⌋
When a configured_validators count is set, the threshold uses that fixed count instead of the dynamically registered count.
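The threshold computation fits in one integer expression using the identity above. A minimal sketch (the parameter shape is illustrative; the real code reads the configured count from node configuration):

```rust
// Quorum threshold: ceil(2n/3) via the identity ⌈2n/3⌉ = ⌊(2n + 2)/3⌋.
// `configured` overrides the dynamic registration count when --validators N
// is set (parameter names are illustrative).
fn quorum(registered: usize, configured: Option<usize>) -> usize {
    let n = configured.unwrap_or(registered);
    (2 * n + 2) / 3
}

fn main() {
    assert_eq!(quorum(4, None), 3);      // n = 3f+1 = 4 → quorum 2f+1 = 3
    assert_eq!(quorum(7, None), 5);
    assert_eq!(quorum(100, Some(4)), 3); // fixed denominator ignores stale registrations
}
```

The last assertion is the phantom-validator defense from Section 13: a hundred stale registrations cannot inflate the quorum past what the four configured validators can satisfy.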
FINALIZED(v) ⇔ |DV(v, G)| ≥ ⌈2n/3⌉ ∧ (∀p ∈ v.parents : FINALIZED(p))
Rather than recomputing DV(v, G) via BFS on every finality check (O(V) per vertex), UltraDAG maintains a precomputed map descendant_validators: HashMap<Hash, HashSet<Address>>. When a new vertex v is inserted into the DAG:
For each ancestor a of v (discovered by walking parent links), v's validator address is added to descendant_validators[a]. This gives O(1) finality lookups via descendant_validators[hash].len(), with amortized O(V) total work across all inserts. Benchmark: 10,000 vertices finalized in 21 ms (previously 47,000 ms).
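The incremental index can be sketched with std collections only. Here u64 stands in for the Blake3 vertex hash and &str for the validator address; the real map is descendant_validators: HashMap<Hash, HashSet<Address>>, and the real implementation may deduplicate the ancestor walk differently:

```rust
use std::collections::{HashMap, HashSet};

// Sketch of the incremental descendant-validator index (toy types).
#[derive(Default)]
struct Dag {
    parents: HashMap<u64, Vec<u64>>,
    descendant_validators: HashMap<u64, HashSet<&'static str>>,
}

impl Dag {
    // On insert, walk v's ancestors once and record v's validator for
    // each, so finality checks never need a per-vertex BFS.
    fn insert(&mut self, hash: u64, parent_hashes: Vec<u64>, validator: &'static str) {
        let mut stack = parent_hashes.clone();
        let mut seen = HashSet::new();
        while let Some(ancestor) = stack.pop() {
            if !seen.insert(ancestor) {
                continue; // already visited via another path
            }
            self.descendant_validators.entry(ancestor).or_default().insert(validator);
            if let Some(ps) = self.parents.get(&ancestor) {
                stack.extend(ps.iter().copied());
            }
        }
        self.parents.insert(hash, parent_hashes);
        self.descendant_validators.entry(hash).or_default();
    }

    // O(1) lookup of |DV(v, G)|.
    fn coverage(&self, hash: u64) -> usize {
        self.descendant_validators.get(&hash).map_or(0, HashSet::len)
    }
}

fn main() {
    let mut dag = Dag::default();
    dag.insert(0, vec![], "alice");  // genesis
    dag.insert(1, vec![0], "bob");   // builds on genesis
    dag.insert(2, vec![1], "carol"); // builds on bob's vertex
    assert_eq!(dag.coverage(0), 2);  // bob and carol are descendants
    assert_eq!(dag.coverage(1), 1);
}
```

Finality then reduces to comparing coverage(hash) against the quorum threshold, with no graph traversal on the hot path.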
The find_newly_finalized() procedure uses forward propagation instead of per-tip BFS:
1. Candidate vertices are scanned for descendant_validator_count ≥ threshold.
2. Candidates are sorted by (round, hash) for deterministic ancestor-first ordering.
3. Multiple passes run in a loop until no new vertices are finalized, guaranteeing the parent finality invariant.
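The multi-pass procedure can be sketched as a fixed-point loop over toy types (u64 hashes, precomputed coverage counts; names are illustrative, not the actual implementation):

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

// Sketch of find_newly_finalized(): finalize a vertex only when its
// descendant coverage meets the threshold AND every parent is already
// finalized; repeat passes until a fixed point.
fn find_newly_finalized(
    parents: &HashMap<u64, Vec<u64>>,
    coverage: &HashMap<u64, usize>, // precomputed |DV(v, G)| per vertex
    rounds: &HashMap<u64, u64>,
    finalized: &mut HashSet<u64>,
    threshold: usize,
) -> Vec<u64> {
    let mut newly = Vec::new();
    loop {
        // (round, hash) ordering makes each pass deterministic, ancestor-first.
        let pass: BTreeSet<(u64, u64)> = parents
            .keys()
            .copied()
            .filter(|h| !finalized.contains(h))
            .filter(|h| coverage.get(h).copied().unwrap_or(0) >= threshold)
            .filter(|h| parents[h].iter().all(|p| finalized.contains(p)))
            .map(|h| (rounds[&h], h))
            .collect();
        if pass.is_empty() {
            break; // no progress: the parent-finality invariant holds for the rest
        }
        for (_, h) in pass {
            finalized.insert(h);
            newly.push(h);
        }
    }
    newly
}

fn main() {
    // Chain 0 ← 1 ← 2 with threshold 3; vertex 2 lacks coverage.
    let parents = HashMap::from([(0, vec![]), (1, vec![0]), (2, vec![1])]);
    let coverage = HashMap::from([(0, 3), (1, 3), (2, 1)]);
    let rounds = HashMap::from([(0, 0), (1, 1), (2, 2)]);
    let mut finalized = HashSet::new();
    let newly = find_newly_finalized(&parents, &coverage, &rounds, &mut finalized, 3);
    assert_eq!(newly, vec![0, 1]); // 1 finalizes only after 0, on the second pass
}
```

The loop terminates because each pass either finalizes at least one vertex or breaks, and the candidate filter enforces the parent-finality invariant by construction.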
A vertex may only be finalized if all its parents are already finalized. This ensures that when vertex v is finalized, all state changes from v’s causal history have already been committed to the state engine. Without this guarantee, transactions could be applied against an incomplete state, producing non-deterministic results across nodes.
Two equivocating vertices (same validator, same round, different hash) cannot both be finalized. The equivocation detection rule ensures at most one vertex from a given validator in a given round exists in any honest node’s DAG.
If vertex v is finalized at node A, then |DV(v, G_A)| ≥ ⌈2n/3⌉. If a conflicting vertex v′ were finalized at node B, then |DV(v′, G_B)| ≥ ⌈2n/3⌉. By the quorum intersection property:
|DV(v, G_A)| + |DV(v', G_B)| ≥ 2 · ⌈2n/3⌉ > n + f
Therefore DV(v, G_A) ∩ DV(v′, G_B) must contain at least one honest validator. This honest validator's DAG would contain both conflicting vertices, triggering equivocation detection — a contradiction.
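Instantiating the inequality with the fault bound makes the intersection size explicit. With n = 3f + 1, the quorum is ⌈2n/3⌉ = ⌈(6f + 2)/3⌉ = 2f + 1, so:

```latex
|DV(v, G_A) \cap DV(v', G_B)|
  \;\ge\; |DV(v, G_A)| + |DV(v', G_B)| - n
  \;\ge\; 2\left\lceil \tfrac{2n}{3} \right\rceil - n
  \;=\; 2(2f + 1) - (3f + 1)
  \;=\; f + 1
```

An intersection of at least f + 1 validators must contain at least one honest validator, since at most f are Byzantine.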
For transaction-level conflicts, safety is ensured by deterministic ordering: the vertex finalized first wins.
Note: This is an argument sketch, not a formal proof.
If at least ⌈2n/3⌉ validators are honest and connected (after GST), the protocol makes progress. Each honest validator produces one vertex per round. After GST, vertices propagate within δ time. A vertex produced in round r accumulates honest descendants in rounds r+1 and r+2, reaching the finality threshold within 2–3 rounds.
The stall recovery mechanism ensures liveness during bootstrap: after 3 consecutive round skips (due to the 2f+1 gate), a validator produces unconditionally. The implementation is 8 lines of Rust — compare to Shoal++’s ~2,000-line reputation-based leader recovery.
Theoretical: With 4 validators producing at the same round, finality lag should stabilize at 2–3 rounds. In practice, round desynchronization between nodes currently increases this lag (see Section 14).
When a vertex v is submitted but another vertex v′ from the same validator in the same round already exists:

1. The new vertex is rejected and never enters the DAG.
2. The offending validator is permanently banned.
3. Both conflicting vertices are broadcast to all peers as EquivocationEvidence messages.

UltraDAG uses an account-based model with balance (in satoshis, 1 UDAG = 10⁸ sats) and nonce (transaction counter for replay protection) per address.
Finalized vertices are applied to the state engine in deterministic order:
order(v1, v2): round (ascending) as primary key → ancestor count (ascending) as secondary key → H(v) lexicographic as tiebreaker
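The three-level sort key translates directly into a comparator. A sketch with illustrative field names (the actual struct layout may differ):

```rust
use std::cmp::Ordering;

// Sketch of the deterministic apply order: round ascending, then
// ancestor count ascending, then hash lexicographic (field names illustrative).
struct FinalizedVertex {
    round: u64,
    ancestor_count: u64,
    hash: [u8; 32],
}

fn apply_order(a: &FinalizedVertex, b: &FinalizedVertex) -> Ordering {
    a.round
        .cmp(&b.round)
        .then(a.ancestor_count.cmp(&b.ancestor_count))
        .then(a.hash.cmp(&b.hash))
}

fn main() {
    let mut vs = vec![
        FinalizedVertex { round: 1, ancestor_count: 5, hash: [2; 32] },
        FinalizedVertex { round: 1, ancestor_count: 5, hash: [1; 32] },
        FinalizedVertex { round: 0, ancestor_count: 0, hash: [9; 32] },
    ];
    vs.sort_by(apply_order);
    assert_eq!(vs[0].round, 0);      // earlier round first
    assert_eq!(vs[1].hash, [1; 32]); // equal keys fall back to hash order
}
```

Because every level of the key is derived from finalized DAG content, all honest nodes compute the same total order and therefore the same state.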
| Parameter | Value |
|---|---|
| Maximum supply | 21,000,000 UDAG |
| Smallest unit | 1 satoshi = 10⁻⁸ UDAG |
| Initial block reward | 50 UDAG per round (total emission) |
| Halving interval | Every 210,000 finalized rounds |
| Default round time | 5 seconds |
| Supply cap enforcement | Reward capped at MAX_SUPPLY boundary |
| Allocation | Amount | Purpose |
|---|---|---|
| Developer allocation | 1,050,000 UDAG (5%) | Protocol development |
| Faucet reserve | 1,000,000 UDAG | Testnet only |
| Mining rewards | ~19,000,000 UDAG (95%) | Validator rewards over time |
The developer allocation is credited at genesis to a deterministic address derived from the seed "ultradag-dev-addr-testnet-v1". This allocation is visible and auditable from block 0. There is no VC funding, no presale, and no hidden allocations. The 5% developer allocation is the only non-mined supply. For mainnet, this will be replaced with an offline-generated keypair stored in a hardware wallet.
| Parameter | Value |
|---|---|
| Minimum stake | 10,000 UDAG |
| Unstaking cooldown | 2,016 rounds (~2.8 hours at 5 s rounds) |
| Slashing penalty | 50% on equivocation |
| Reward distribution | Proportional to stake |
| Epoch length | 210,000 rounds (~12 days at 5s rounds) |
| Max active validators | 21 (top stakers by amount) |
The active validator set is recalculated at epoch boundaries (every 210,000 finalized rounds). At each epoch transition:
1. StateEngine::recalculate_active_set() selects the top 21 stakers by amount.
2. sync_epoch_validators() bridges the active set to the FinalityTracker: it replaces the validator allowlist and sets configured_validators to the active set size (fixing the quorum denominator).

This ensures quorum intersection holds across epoch boundaries: the old set must achieve finality on the transition vertex before the new set begins producing.
reward(h) = ⌊50 × 10⁸ / 2^⌊h / 210000⌋⌋
Total emission per round = reward(height). This amount is split among validators:
- If total_stake > 0, each active validator receives reward(h) × (own_stake / total_stake), proportional to its stake.
- If total_stake == 0, each validator receives the full reward(h) per round. This ensures the network can operate before any validators have staked, maintaining backward compatibility with non-staking deployments.

The fallback mechanism allows validators to earn rewards immediately upon joining the network, which they can then stake to participate in proportional reward distribution. Reward = 0 after 64 halvings. Slashed stake is burned (removed from total_supply).
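The halving schedule reduces to one function. A sketch (constant names are illustrative; a right shift by k equals floor division by 2^k, matching the floor in the formula above):

```rust
// Sketch of the emission schedule: 50 UDAG (in sats) halving every
// 210,000 finalized rounds, zero after 64 halvings.
const INITIAL_REWARD_SATS: u64 = 50 * 100_000_000;
const HALVING_INTERVAL: u64 = 210_000;

fn reward(height: u64) -> u64 {
    let halvings = height / HALVING_INTERVAL;
    if halvings >= 64 {
        0 // shifting a u64 by >= 64 bits is undefined; emission has ended
    } else {
        INITIAL_REWARD_SATS >> halvings
    }
}

fn main() {
    assert_eq!(reward(0), 5_000_000_000);        // 50 UDAG
    assert_eq!(reward(209_999), 5_000_000_000);  // last round before first halving
    assert_eq!(reward(210_000), 2_500_000_000);  // 25 UDAG
    assert_eq!(reward(64 * 210_000), 0);         // emission ends
}
```

The explicit `halvings >= 64` guard is what the 64-halving cutoff in the text implies: beyond it the shift would be invalid and the reward is defined as zero.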
Peers communicate over TCP with 4-byte big-endian length-prefixed JSON messages (max 4 MB). Each connection is split into independent PeerReader and PeerWriter halves.
| Message | Direction | Description |
|---|---|---|
| Hello | Bidirectional | Version, current DAG round, listen port |
| DagProposal | Broadcast | New signed DAG vertex |
| GetDagVertices | Request | Request vertices from a given round |
| DagVertices | Response | Batch of DAG vertices for sync |
| NewTx | Broadcast | New transaction for mempool |
| GetPeers / Peers | Request/Response | Gossip-based peer discovery |
| GetParents | Request | Request specific vertices by hash (missing parent resolution) |
| ParentVertices | Response | Requested parent vertices for DAG convergence |
| EquivocationEvidence | Broadcast | Two conflicting vertices as proof |
| CheckpointProposal | Broadcast | Validator proposes checkpoint, requests co-signatures |
| CheckpointSignatureMsg | Broadcast | Co-signature on a verified checkpoint |
| GetCheckpoint | Request | Request latest checkpoint for fast-sync |
| CheckpointSync | Response | Checkpoint + suffix + state for new node sync |
| Ping / Pong | Keepalive | Connection liveness |
```
ultradag-node         CLI binary: validator loop + HTTP RPC
└── ultradag-network  TCP P2P: peer discovery, DAG relay, sync
    └── ultradag-coin Core: consensus, state, crypto, persistence
```
Built on Tokio for async I/O. All shared state (DAG, finality tracker, state engine, mempool) is protected by tokio::sync::RwLock with short lock scopes. Write locks are never held across I/O operations.
All node state is periodically saved to disk using atomic file operations (write to .tmp, then rename). State is saved every 10 rounds and on graceful shutdown.
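The write-to-tmp-then-rename pattern can be sketched in a few lines of std-only Rust (function and path names are illustrative). rename() within a single filesystem is atomic on POSIX, so a crash mid-save leaves either the old snapshot or the new one, never a torn file:

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Sketch of atomic state persistence (names illustrative).
fn save_atomic(path: &Path, bytes: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut file = fs::File::create(&tmp)?;
    file.write_all(bytes)?;
    file.sync_all()?; // flush data to disk before publishing the rename
    fs::rename(&tmp, path)
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("ultradag_state_demo.json");
    save_atomic(&path, b"{\"round\":10}")?;
    assert_eq!(fs::read(&path)?, b"{\"round\":10}".to_vec());
    fs::remove_file(&path)
}
```

Readers therefore never observe a partially written state file, which is what makes the every-10-rounds save schedule safe against mid-write crashes.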
The --validators N flag fixes the quorum threshold at ⌈2N/3⌉ regardless of dynamic registrations. This prevents phantom validator inflation — a class of bugs where stale registrations raise the quorum beyond what active validators can satisfy.
Every CHECKPOINT_INTERVAL finalized rounds (default: 1,000), validators produce a signed checkpoint capturing the state_root (Blake3 hash of the full state snapshot), the dag_tip (hash of the last finalized vertex), and the total_supply. Checkpoints require quorum (⌈2n/3⌉) validator signatures to be accepted.
New nodes fast-sync by requesting the latest accepted checkpoint from a peer, verifying the quorum signatures and state_root integrity, applying the state snapshot, and inserting the suffix DAG from checkpoint.round to present. This reduces sync time from O(all history) to O(suffix since last checkpoint).
Equivocation evidence is stored in a separate, permanently retained store that survives DAG pruning, ensuring slashing proofs remain available regardless of how much history has been pruned. The --archive flag disables pruning entirely for archive nodes and block explorers.
| Attack | Defense |
|---|---|
| Equivocation | One vertex per validator per round; permanent ban + evidence broadcast |
| Network replay | NETWORK_ID prefix in all signable bytes |
| DAG corruption (phantom parents) | Parent existence check before insertion |
| Memory exhaustion (future rounds) | MAX_FUTURE_ROUNDS = 10; vertices beyond this bound are rejected |
| Message flooding DoS | 4 MB max message size; 10K mempool cap with fee eviction |
| Nothing-at-stake | Equivocation detection + permanent ban |
| Phantom validator inflation | --validators N configured count fixes quorum denominator |
| Non-deterministic finality | BTreeSet iteration; deterministic hash ordering |
| Clock drift attack | 2f+1 round gate prevents ahead-of-network advancement. Round bound (MAX_FUTURE_ROUNDS=10) prevents future-round flooding. Slow-clock validators skip rounds and catch up via stall recovery. Drift degrades DAG density but cannot violate safety. |
| Orphan buffer exhaustion | Hard cap: 1,000 entries AND 50 MB byte limit. Vertices exceeding either limit silently dropped. Per-peer rate limiting (future work) would further mitigate. |
| Sync poisoning | Every vertex in a DagVertices response is verified (Ed25519 signature, equivocation check, parent existence, round bound) before insertion. Same acceptance rules as live proposals. |
| DAG partition divergence | Recursive parent fetch via GetParents/ParentVertices ensures convergence after network partitions. Orphan buffer (1K entries, 50 MB) prevents memory exhaustion. Stall recovery triggers bulk re-sync when finality lags >10 rounds. |
- Optimistic responsiveness: tokio::select! on vertex notifications, with the round timer as fallback; sub-second finality under normal conditions.
- Epoch transitions: sync_epoch_validators() bridges the StateEngine active set to the FinalityTracker allowlist at epoch boundaries; quorum intersection is maintained across transitions.

A 4-node Fly.io testnet (ams region) runs continuously with a permissioned validator set:
| Metric | Value |
|---|---|
| DAG round | 330+ |
| Last finalized round | 182 |
| Validator count | 4 (permissioned allowlist) |
| Genesis supply | 2,050,000 UDAG (dev 1,050,000 + faucet 1,000,000) |
| Current supply | 2,059,550 UDAG |
| Avg round time | 5.0s (timer fallback) |
| Tests passing | 373 |
Known issue: Nodes currently produce at independent round numbers due to asynchronous startup, preventing in-round quorum. Finality happens via descendant accumulation across rounds, resulting in elevated finality lag. Round synchronization is planned to address this.
Throughput is coupled to round timing. Because transactions are bundled directly into vertices (no separate data-availability layer), maximum throughput is:
TPS_max = (max_txs_per_vertex × validators_per_round) / round_duration
With 4 validators and a maximum of 10,000 transactions per vertex:
| Round Duration | Theoretical Max TPS |
|---|---|
| 5 seconds | 8,000 |
| 2 seconds | 20,000 |
| 1 second | 40,000 |
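The table values follow directly from the formula. A one-line sketch confirming the arithmetic:

```rust
// Throughput ceiling implied by TPS_max above: with transactions bundled
// into vertices, TPS scales with round frequency, not bandwidth.
fn tps_max(max_txs_per_vertex: u64, validators: u64, round_secs: u64) -> u64 {
    max_txs_per_vertex * validators / round_secs
}

fn main() {
    assert_eq!(tps_max(10_000, 4, 5), 8_000);
    assert_eq!(tps_max(10_000, 4, 2), 20_000);
    assert_eq!(tps_max(10_000, 4, 1), 40_000);
}
```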
By comparison, Narwhal decouples data availability from ordering — validators disseminate transaction batches continuously between rounds, so throughput scales with bandwidth rather than round timing. Under identical hardware, Narwhal achieves 100,000+ TPS by saturating network bandwidth independent of consensus latency.
No pipelining. In UltraDAG, consensus and DAG construction are sequential. Shoal++ pipelines these operations: while round r’s vertices are being finalized, round r+1’s vertices are already being built and disseminated, approximately halving effective finality latency.
Modest per-node transaction volume. IoT devices generate transactions at rates measured in transactions per second, not thousands. A sensor reporting readings every 5 seconds, a smart meter settling micropayments every minute — these workloads fit comfortably within a single vertex per round.
Device constraints favor auditability. An embedded device that participates as a light client or validator must be able to understand and verify the consensus protocol. A protocol with 1,100 lines of consensus logic can be compiled to a small binary, audited on the target platform, and reasoned about in resource-constrained environments. A protocol with 30,000 lines cannot.
Code complexity is attack surface. The 27x reduction from Shoal++ to UltraDAG directly reduces the number of places where a bug could cause a safety violation, liveness failure, or state divergence. For networks where the cost of a consensus bug exceeds the cost of lower throughput, this tradeoff is unambiguously correct.
Round timing is tunable. The --round-ms flag allows operators to choose their position on the latency-throughput curve. A 1-second round with 4 validators can handle 40,000 TPS theoretical max, which exceeds most L1 chains in production today.
| Property | PBFT | Tendermint | HotStuff | DAG-Rider | Narwhal/Tusk | Bullshark | Shoal++ | UltraDAG |
|---|---|---|---|---|---|---|---|---|
| Leader | Per-view | Round-robin | Rotating | None | None + leader | Anchor | Anchor + rep. | None |
| Finality | 3 phases | 2 phases | Pipeline | Wave-based | Separate | 2 rounds | 1 round | Desc. coverage |
| Votes | Explicit | Explicit | Threshold | Implicit | Mixed | Implicit | Implicit | Implicit |
| Messages | O(n²) | O(n²) | O(n) | O(n) | O(n) | O(n) | O(n) | O(n) |
| Consensus code | ~5,000 | ~10,000 | ~8,000 | ~10,000 | ~15,000 | ~20,000 | ~30,000 | 1,100 |
| 3-sentence rule? | No | No | No | No | No | No | No | Yes |
| Separate mempool | No | No | No | No | Yes | Yes | Yes | No |
| Responsive | No | No | Yes | No | No | No | Yes | Yes |
| Waves/anchors | N/A | N/A | N/A | 4-round waves | N/A | 2-round anchors | Pipelined | None |
The “3-sentence rule” row captures a qualitative property: can the complete consensus rule be stated in three sentences that a competent distributed systems engineer can verify for correctness? For UltraDAG, yes (Section 2.1). For protocols with wave structures, anchor selection, reputation systems, or view changes, the rule set is inherently more complex.
- Epoch-based validator rotation via sync_epoch_validators(); active set capped at 21 validators.
- Missing-parent resolution via GetParents/ParentVertices.

UltraDAG demonstrates that a complete, working cryptocurrency can be built on a leaderless DAG-BFT consensus protocol with minimal complexity. The entire consensus core — 1,100 lines of Rust across five files — implements DAG construction, BFT finality via descendant coverage, deterministic ordering, validator management, and Ed25519-signed vertices. The protocol can be stated in three sentences and fully audited in a day.
The protocol’s safety relies on the standard BFT quorum intersection property — the same foundation used by PBFT, Tendermint, and HotStuff — applied to an implicit voting mechanism where DAG topology replaces explicit vote messages. While a formal safety proof remains future work, the system has been thoroughly tested with 373 automated tests (all passing) and validated on a 4-node Fly.io testnet. Round synchronization between nodes is a known area for improvement (see Section 14).
UltraDAG is not the fastest DAG-BFT protocol. It is the simplest correct one. For networks where auditability, small binary size, and minimal attack surface matter more than maximum throughput — IoT micropayments, embedded systems, resource-constrained validators — this is the right tradeoff.