BIP452: P2P UTXO Set Sharing #2137

fjahr commented at 9:54 PM on April 10, 2026: contributor

This BIP draft describes the sharing of a full UTXO set via the p2p network.

Design summary:

Uses a new service bit to signal ability to share one or more UTXO sets
Introduces four new P2P messages: one round trip to get information on the available UTXO sets and one round trip for each chunk and its associated metadata
UTXO sets are downloaded in chunks of 3.9 MiB
For each chunk there is a merkle proof which shows that the chunk is part of the same merkle tree; this prevents potential DoS/OOM attack vectors
The root of the merkle tree can be known through a trusted information source (assumeutxo params in Bitcoin Core), or multiple peers could be asked and the mechanism only used if there is agreement on the value, similar to compact block filters

The one part I am not so sure about yet: This references Bitcoin Core and its features or RPCs in a few places now. I am aware that this is not ideal for a specification that targets a wider audience, but the reality is that assumeutxo seems to be implemented only in Bitcoin Core, and mentioning RPCs from the workflow there seems the clearest way to describe this. But I am happy to generalize this further, and I would be very happy to receive some guidance on what level of referencing assumeutxo is acceptable here, since it is obviously the main current motivation. One concrete example: The Bitcoin Core PR that I will use as a reference implementation will rely on assumeutxo params rather than comparing multiple peer values of the merkle root. Is this discrepancy an issue?

Mailing List post: https://groups.google.com/g/bitcoindev/c/rThmyI8ZN3Q/m/TJBc1xRbAQAJ

Proposed implementation: https://github.com/bitcoin/bitcoin/pull/35054

fjahr force-pushed on Apr 10, 2026

in bip-XXXX.md:143 in 4bc39c8b17

 138 | +| `serialized_hash` | `uint256` | 32 | The UTXO set serialized hash. |
 139 | +| `data_length` | `uint64_t` | 8 | Total size of the serialized UTXO set in bytes (header + body). |
 140 | +| `merkle_root` | `uint256` | 32 | Root of the Merkle tree computed over chunk hashes. |
 141 | +
 142 | +A requesting node MUST ignore entries whose `serialized_hash` does not match a known
 143 | +utxo set hash for the corresponding height.

ajtowns commented at 10:37 PM on April 10, 2026:

I think it would be better for a client that supports this mechanism to hardcode the merkle root instead of the straight serialized hash, and to drop serialized_hash from this message.

fjahr commented at 8:56 PM on April 11, 2026:

Hm, yeah, I think that makes sense. I was struggling to find the right path between building on top of the existing assumeutxo data we already have and extending it. I am adding the merkle root to the assumeutxo data in my PR, so checking the serialized_hash is just belt and suspenders there, and it makes sense to make this explicit here as well.

ajtowns commented at 2:24 AM on April 12, 2026:

If you wanted to change the contents of the utxo dump as well, it would be nice to include the header chain for the block where the snapshot was taken. Then you could do an import without needing to connect to the network at all, I think.

fjahr commented at 3:21 PM on May 5, 2026:

I think it would be better for a client that supports this mechanism to hardcode the merkle root instead of the straight serialized hash, and to drop serialized_hash from this message.

Dropped serialized_hash mentions from the BIP completely in this new version.

it would be nice to include the header chain for the block where the snapshot was taken

Hm, I find this a bit odd. For the purpose of this BIP we are connected to the network necessarily, so we can just as well fetch the headers over the network in the way we usually do and that should be simpler code-wise. For loading the file directly it would make more sense to me but still we are connecting to the network afterwards. I will need to think about this more but thankfully we have the versioning of the dump file so we can extend the format easily in the future and it wouldn't even need to affect this BIP I think.

in bip-XXXX.md:169 in 4bc39c8b17

 164 | +|-------|------|------|-------------|
 165 | +| `height` | `uint32_t` | 4 | Block height this data corresponds to. |
 166 | +| `block_hash` | `uint256` | 32 | Block hash this data corresponds to. |
 167 | +| `chunk_index` | `uint32_t` | 4 | Zero-based index of this chunk. |
 168 | +| `proof_length` | `compact_size` | 1–9 | Number of hashes in the Merkle proof. |
 169 | +| `proof_hashes` | `uint256[]` | 32 × `proof_length` | Sibling hashes from leaf to root. |

ajtowns commented at 10:42 PM on April 10, 2026:

Rather than a proof, it might be better to just request getutxoset <height> <hash> 0xFFFFFFFF once to get the full list of chunk hashes -- that should be about 74kB for a 9GB utxo set, and should stay under 4MB until the utxo set is >450GB. Then each chunk is just <height> <hash> <number> <data>.
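The size estimate above can be reproduced with a little arithmetic. This is a rough sketch, assuming 32-byte chunk hashes, chunks of 3,900,000 bytes as in the draft text, and decimal units (1 GB = 10^9 bytes); the function name is made up for illustration.

```python
HASH_SIZE = 32           # bytes per chunk hash (uint256)
CHUNK_SIZE = 3_900_000   # bytes per chunk, assumed from the draft


def chunk_hash_list_size(utxo_set_bytes: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Return the size in bytes of the full list of chunk hashes."""
    num_chunks = -(-utxo_set_bytes // chunk_size)  # ceiling division
    return num_chunks * HASH_SIZE


print(chunk_hash_list_size(9_000_000_000))    # 73856, about 74 kB
print(chunk_hash_list_size(450_000_000_000))  # 3692320, still under 4 MB
```

Under these assumptions the list only exceeds 4,000,000 bytes once the set grows past roughly 487 GB, so the ">450GB" figure is on the conservative side.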

fjahr commented at 8:51 PM on April 11, 2026:

Huh, interesting idea to get all the chunk hashes first, I didn't think of that. It might make sense to do this with a separate message type even instead of hacking getutxoset as you described. I will have to think about it a little more.

fjahr commented at 3:21 PM on May 5, 2026:

The fetching of the chunk hash list is now specified with a new message pair, getutxotree/utxotree. I liked this a lot more when combining it with dropping the discovery approach and asking for the specific UTXO set directly. Let me know what you think.

murchandamus commented at 11:33 PM on April 10, 2026: member

I was going to complain that I haven’t seen a discussion about this proposal on the mailing list… but you did that already for me. If you already know that you should send it to the mailing list first, I don’t know why you opened the PR first, though. :stuck_out_tongue:

fjahr commented at 9:04 PM on April 11, 2026: contributor

I don’t know why you opened the PR first

It wasn't clear to me that a ML post was a prerequisite to open the PR. I just thought it was necessary to do this at some point before the BIP could get merged/assigned a number. I think it makes sense to have a place for more detail-oriented commentary in addition to the high-level discussion happening on the ML, in case ML readers have such feedback but would rather use the more convenient inline commenting/suggestion features in GitHub. I was also looking for feedback on my assumeutxo related question, e.g. can I assume knowledge of a feature that is only implemented in Bitcoin Core or should I define this in the BIP. The ML doesn't seem like the right place to ask about this.

I will close this for now and re-open when I have made the ML post and given it some reasonable time for discussion.

fjahr closed this on Apr 11, 2026

murchandamus commented at 5:44 PM on April 12, 2026: member

Thanks, and sorry, I might have come off as more gruff than intended — tone is hard in written text. Obviously, you’ve been around the block and your proposal reads well-considered, but we have been getting a lot of premature submissions out of the blue here, where then BIP Editors become the first line of feedback. Personally, it’s been taking a growing part of my work hours to even just get through all submissions to the repository. So, we have become a bit more insistent on proposals actually being posted to the list first, and I’d like to avoid giving the impression that I’m playing favorites.

A hypothetical optimal order might be:

1. Discuss your idea with a couple colleagues
2. Post about the idea on the ML
3. Compile a first draft in a PR against your personal fork of the BIPs repository
4. Have someone give it a read
5. Send your draft to the ML
6. Open a PR here

In your case it sounds like you’d be able to skip directly to 5, and we can of course reopen this PR, when there has been an ML discussion.

in bip-XXXX.md:135 in 4bc39c8b17 outdated

 130 | +| `count` | `compact_size` | 1–9 | Number of available UTXO sets. |
 131 | +
 132 | +For each available UTXO set:
 133 | +
 134 | +| Field | Type | Size | Description |
 135 | +|-------|------|------|-------------|

luke-jr commented at 10:07 PM on April 13, 2026:

Since the format has a version number, it would make sense to include it here.

fjahr commented at 3:21 PM on May 5, 2026:

Done

in bip-XXXX.md:191 in 4bc39c8b17

 186 | +
 187 | +1. The requesting node identifies peers advertising `NODE_UTXO_SET`.
 188 | +2. The requesting node sends `getutxosetinfo` to one or more of these peers.
 189 | +3. Each peer responds with `utxosetinfo`. The requesting node verifies that the advertised
 190 | +   `serialized_hash` matches a known UTXO set hash, compares `merkle_root` values across peers,
 191 | +   and selects a UTXO set whose Merkle root has agreement among multiple peers.

luke-jr commented at 10:09 PM on April 13, 2026:

I think this process defeats the point of the service bit. Having "at least one" UTXO set only "works" while there are only a small number of UTXO set options to download. If it's not sufficient to just try peers until you find the UTXO set you want, then we probably need a way to advertise specific UTXO sets.

fjahr commented at 3:22 PM on May 5, 2026:

Thanks, I do think it should be sufficient to try peers until we find the UTXO set we want and I dropped the discovery approach, see my other comment as well.

in bip-XXXX.md:210 in 4bc39c8b17

 205 | +
 206 | +**Serialized hash in `utxosetinfo`:** The requesting node should have access to a known UTXO set hash
 207 | +before initiating the process. Including the serialized hash in the advertisement lets the requester
 208 | +immediately filter out peers claiming a different UTXO set state before downloading any data.
 209 | +
 210 | +**Discovery before download:** The `getutxosetinfo`/`utxosetinfo` exchange lets the requesting node

luke-jr commented at 10:11 PM on April 13, 2026:

I'm not sure discovery is a useful feature. It enables a misguided implementation to blindly trust whatever nodes provide, and harms node privacy by adding a way to fingerprint nodes.

If you know what you'll accept in advance, you can simply request that snapshot, and the node will either provide it or (potentially) say it can't.

fjahr commented at 3:24 PM on May 5, 2026:

I dropped the discovery approach. I think some fingerprinting potential remains through probing, but it is smaller than with discovery, and this also better matches the approach we generally follow in the network, for example in BIP157.

in bip-XXXX.md:227 in 4bc39c8b17 outdated

 222 | +
 223 | +**3.9 MB chunk size:** The number balances round trips (~2,560 for a ~10 GB set) against memory usage
 224 | +for buffering and verifying a single chunk. Smaller chunks would increase protocol overhead; larger
 225 | +chunks would increase memory pressure on constrained devices commonly used to run Bitcoin nodes.
 226 | +Together with the additional message overhead, the `utxoset` message including the chunk data also
 227 | +sits just below the theoretical maximum block size which means any implementation should be able to

svanstaa commented at 5:18 PM on April 20, 2026:

It also happens to sit just below the maximum P2P message size MAX_PROTOCOL_MESSAGE_LENGTH, so it may be clearer to refer to that instead of block size

fjahr commented at 3:24 PM on May 5, 2026:

Yeah, but this was a conscious decision actually. MAX_PROTOCOL_MESSAGE_LENGTH is a Bitcoin Core implementation-specific value. A different implementation may have a higher value for this. But every implementation will at least need to be able to receive the biggest possible block. So I think it's better to anchor it to that.

in bip-XXXX.md:9 in 4bc39c8b17 outdated

   0 | @@ -0,0 +1,244 @@
   1 | +```
   2 | +  BIP: ?
   3 | +  Layer: Peer Services
   4 | +  Title: P2P UTXO Set Sharing
   5 | +  Authors: Fabian Jahr <fjahr@protonmail.com>
   6 | +  Status: Draft
   7 | +  Type: Specification
   8 | +  Assigned: ?
   9 | +  Discussion: ?

jonatack commented at 5:35 PM on April 20, 2026:

Please link here to the top post of the mail list thread, if you decide to open this PR later after discussion there.

fjahr commented at 3:25 PM on May 5, 2026:

I wanted to address the comments I already got before posting and I was a bit busy for a while, will make the post now with this push.

jonatack commented at 6:55 PM on May 5, 2026:

Updated the PR description with links to the ML post and to the proposed implementation (edit: still needs updating in your header).

jonatack added the label New BIP on Apr 20, 2026

in bip-XXXX.md:121 in 4bc39c8b17

 116 | +
 117 | +#### `getutxosetinfo`
 118 | +
 119 | +Sent to discover which UTXO sets a peer can serve. This message has an empty payload.
 120 | +
 121 | +A node that has advertised `NODE_UTXO_SET` MUST respond with `utxosetinfo`. A node that has not

danielabrozzoni commented at 3:31 PM on April 29, 2026:

nit: I think it would be clearer to repeat here that we should disconnect a node that doesn't respond to our getutxoset, as stated above:

A node signaling NODE_UTXO_SET MUST respond to getutxosetinfo messages and MUST be capable of serving all UTXO sets it advertises in its utxosetinfo response. A node that fails to meet these obligations SHOULD be disconnected.

fjahr commented at 3:25 PM on May 5, 2026:

With the latest changes I felt it would be more appropriate to remove this general statement instead:

A node that fails to meet these obligations SHOULD be disconnected.

Without discovery it is just harder for us to decide if the peer is misbehaving or not. It is more precise anyway to clarify this explicitly for each situation.

in bip-XXXX.md:183 in 4bc39c8b17

 178 | +disconnect the peer. A node SHOULD also disconnect a peer that sends a `utxoset` message with fields
 179 | +(`chunk_index`, `height`, `block_hash`) that do not match the outstanding request.
 180 | +
 181 | +After all chunks have been received, the node MUST compute the serialized hash and compare it against a
 182 | +known UTXO set hash. If this check fails, the node MUST discard all data and
 183 | +SHOULD disconnect all peers that advertised the corresponding Merkle root.

danielabrozzoni commented at 5:33 PM on April 29, 2026:

I'm trying to understand why we "should" disconnect here, but we "must" disconnect if the peer sends us a utxoset message that fails the proof verification (L176). Is it because in the second case, the peer is surely trying to trick us, while in the first case, we probably have a bug on our end?

I'm trying to understand how it can happen that our chunks pass the individual verifications, but the final utxo set hash mismatches, and whether in this case it would be a problem on our end, or peers trying to trick us.

fjahr commented at 3:25 PM on May 5, 2026:

Yeah, the intention was that we are stricter with a clear inconsistency in what the peer has given us vs. something that is probably a problem on our side. But I have revisited the disconnection policies and I think they are a lot more consistent now. If I got everything right, we must disconnect if the peer sent us something clearly inconsistent, we should disconnect if the peer sent us something we didn't ask for, and if evidence points to the issue being on our side I am now not saying anything about disconnecting.
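The three cases described in this reply could be summarized as a small lookup. This is only an illustrative sketch; the enum and function names are hypothetical and are not taken from the BIP or from the Bitcoin Core PR.

```python
from enum import Enum


class Fault(Enum):
    INCONSISTENT_DATA = "peer sent something clearly inconsistent"
    UNSOLICITED = "peer sent something we did not ask for"
    LOCAL_PROBLEM = "evidence points to an issue on our side"


class Action(Enum):
    MUST_DISCONNECT = "MUST disconnect"
    SHOULD_DISCONNECT = "SHOULD disconnect"
    UNSPECIFIED = "no disconnection requirement stated"


# Mapping as described in the comment above (assumed reading, not spec text).
POLICY = {
    Fault.INCONSISTENT_DATA: Action.MUST_DISCONNECT,
    Fault.UNSOLICITED: Action.SHOULD_DISCONNECT,
    Fault.LOCAL_PROBLEM: Action.UNSPECIFIED,
}


def disconnect_policy(fault: Fault) -> Action:
    """Look up the requirement level for a given kind of fault."""
    return POLICY[fault]
```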

murchandamus commented at 9:30 PM on April 30, 2026: member

Since this is already getting more discussion here than many submissions to the mailing list, I’m going to reopen it and turn it into draft. Please set it to “Ready for Review” when you have posted to the mailing list.

murchandamus reopened this on Apr 30, 2026

murchandamus marked this as a draft on Apr 30, 2026

fjahr force-pushed on May 5, 2026

evoskuil commented at 3:58 PM on May 5, 2026: contributor

Concept NACK. It's bad enough that nodes are formalizing this off network, but incorporating it into p2p is another level of awful.

fjahr force-pushed on May 5, 2026

fjahr marked this as ready for review on May 5, 2026

in bip-XXXX.md:11 in 07e74d26c0

   6 | +  Status: Draft
   7 | +  Type: Specification
   8 | +  Assigned: ?
   9 | +  Discussion: 2026-05-06: https://groups.google.com/g/bitcoindev/c/rThmyI8ZN3Q
  10 | +  Version: 0.2.0
  11 | +  License: BSD-2-Clause

murchandamus commented at 9:55 PM on May 12, 2026:

The Headers have a fixed order, please update to:

  Assigned: ?
  License: BSD-2-Clause
  Discussion: 2026-05-06: https://groups.google.com/g/bitcoindev/c/rThmyI8ZN3Q
  Version: 0.2.0

fjahr commented at 8:32 PM on May 17, 2026:

Done

in bip-XXXX.md:19 in 07e74d26c0

  14 | +## Abstract
  15 | +
  16 | +This BIP defines a P2P protocol extension for sharing full UTXO sets between peers. It introduces
  17 | +a new service bit `NODE_UTXO_SET`, four new P2P messages (`getutxotree`, `utxotree`, `getutxoset`,
  18 | +`utxoset`), and a chunk-hash list anchored to a Merkle root known to the requesting node, enabling
  19 | +per-chunk verification. This allows nodes to bootstrap from a recent height by obtaining the

murchandamus commented at 9:57 PM on May 12, 2026:

I assume that the nodes are bootstrapping from scratch to a recent height rather than from a recent height? Maybe:

per-chunk verification. This allows bootstrapping nodes to leapfrog to a recent height by obtaining the

fjahr commented at 8:38 PM on May 17, 2026:

Ok, to me these sound like they mean the same but I am no native speaker and this is probably clearer.

in bip-XXXX.md:42 in 07e74d26c0

  37 | +
  38 | +### Service Bit
  39 | +
  40 | +| Name | Bit | Description |
  41 | +|------|-----|-------------|
  42 | +| `NODE_UTXO_SET` | 12 (0x1000) | The node can serve complete UTXO set data for at least one height. |

murchandamus commented at 10:07 PM on May 12, 2026:

This is in conflict with the Utreexo proposal which allocates

Bit 12 to NODE_UTREEXO
Bit 13 to NODE_UTREEXO_ARCHIVE

per the BIP183 draft.

fjahr commented at 8:39 PM on May 17, 2026:

Moved to take Bit 14, thanks for giving me the heads up. But I will most likely change this to use BIP 434 instead unless there are any late issues arising with it.

in bip-XXXX.md:46 in 07e74d26c0

  41 | +|------|-----|-------------|
  42 | +| `NODE_UTXO_SET` | 12 (0x1000) | The node can serve complete UTXO set data for at least one height. |
  43 | +
  44 | +A node MUST NOT set this bit unless it has at least one full UTXO set available to serve.
  45 | +A node signaling `NODE_UTXO_SET` MUST be capable of responding to `getutxotree` and `getutxoset`
  46 | +requests for every UTXO set it is willing to serve, including the full chunk-hash list and every

murchandamus commented at 10:14 PM on May 12, 2026:

requests for every UTXO set that it is willing to serve, including the full chunk-hash list and every

fjahr commented at 8:39 PM on May 17, 2026:

Done

in bip-XXXX.md:91 in 07e74d26c0

  86 | +
  87 | +The serialized UTXO set (header + body) is split into chunks of exactly 3,900,000 bytes (3.9 MB). The
  88 | +last chunk contains the remaining bytes and may be smaller.
  89 | +
  90 | +The leaf hash for each chunk is `SHA256d(chunk_data)`. The tree is built as a balanced binary tree. When
  91 | +the number of nodes at a level is odd, the last node is duplicated before hashing the next level.

murchandamus commented at 10:23 PM on May 12, 2026:

I guess this is safe because no left_child and right_child should ever have the same data, but I dimly remember that this construction had some downsides in the Merkle tree for transactions in a block. (Some node implementation accepted a block with the same transaction repeated, or smth?) Would it perhaps be better to have a dedicated value to add into the hash instead of repeating the left_child, e.g., SHA256d(left_child || SHA256d(odd_number_of_leaves_no_right_child))?

fjahr commented at 8:39 PM on May 17, 2026:

Hm, while the situation is a bit different than with txs, it's a good point to raise and one way or the other, there should be an additional check added to the BIP. Generally, I think any such attack should be able to get caught based on the utxotree response. The attack you describe would make it necessary for the attacker to have the same hash in the list twice. So we should be able to catch it with a duplication check of the chunk hash list in the utxotree response. The alternative you are suggesting should probably be to use SHA256d("") instead of the duplication but I still think we cannot avoid doing a similar check. If an attacker sends us the list with SHA256d("") included at the end we need to identify it early, otherwise there could be a confusion about what the correct number of chunks is. The attacker could not send the final ghost chunk to us and we would keep asking peers for it and get disconnected most likely, failing to complete the snapshot even though we have all the data already (the same would be possible for the current approach BTW). At least using an empty hash as the sentinel value would ensure that we don't append any nonsense data to the snapshot if we get a response for the ghost chunk.

I did some (re) reading on the latest preferred approaches for merkle trees and it appears that domain separation through tagged hashing is valid (as in Taproot) or just promoting the odd leaf up a level. Of the two I much prefer the latter due to ease of implementation. I have thus changed the draft to use this.

I am still undecided if this is the best approach since I see a little bit of confusion potential as people might just assume the merkle tree works the same as the one for txs. The uniqueness check seems like a small price to pay. But the new version is cleaner. Happy to reconsider this change if anyone feels strongly about this or if I missed anything worth considering here.
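To make the difference between the two constructions concrete, here is a minimal sketch. It assumes a non-empty list of 32-byte chunk hashes and SHA256d for inner nodes; the function names are illustrative and the exact node hashing in the BIP may differ.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used for the chunk leaf hashes in the draft."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def root_duplicate_last(hashes: list[bytes]) -> bytes:
    """Transaction-style tree: an odd level duplicates its last node."""
    level = list(hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def root_promote_odd(hashes: list[bytes]) -> bytes:
    """Alternative: an unpaired last node moves up a level unchanged."""
    level = list(hashes)
    while len(level) > 1:
        paired = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            paired.append(level[-1])
        level = paired
    return level[0]


a, b, c = (sha256d(x) for x in (b"chunk0", b"chunk1", b"chunk2"))
# Duplication makes [a, b, c] and [a, b, c, c] indistinguishable by root.
print(root_duplicate_last([a, b, c]) == root_duplicate_last([a, b, c, c]))  # True
# Promotion gives the two lists different roots.
print(root_promote_odd([a, b, c]) == root_promote_odd([a, b, c, c]))        # False
```

With duplication, a uniqueness check on the chunk hash list is what rules out the ambiguous second list; with promotion the roots already differ.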

ajtowns commented at 4:40 AM on May 18, 2026:

The utxotree message include data_length which (along with the fixed 3.9MB data sizes) makes the merkle tree structure deterministic. If you make the hardcoded hash identifying the utxo set be the hash of all the data in the utxotree message, that should suffice to prevent messing around with the chunk structure?

Also, num_chunks is redundant given data_length and the 3.9MB constant. If you want to support chunks breaking at sizes other than 3.9MB, it would be better to have 3,900,000 be included in the message rather than num_chunks IMO.

BTW, I don't think you need a tree structure here btw; you could just hash the utxotree message (ie concatenate the chunk hashes) instead.
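Both points can be sketched briefly. This assumes the fixed 3,900,000-byte chunk size and hashes only the concatenated chunk hashes; whether other utxotree fields such as data_length would also be covered by the hash is left open here, and all names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 3_900_000  # bytes, the fixed chunk size from the draft


def expected_num_chunks(data_length: int, chunk_size: int = CHUNK_SIZE) -> int:
    """Number of chunks implied by data_length (ceiling division)."""
    return -(-data_length // chunk_size)


def chunk_list_is_plausible(data_length: int, chunk_hashes: list[bytes]) -> bool:
    """Check that the hash list length matches what data_length implies."""
    return len(chunk_hashes) == expected_num_chunks(data_length)


def flat_commitment(chunk_hashes: list[bytes]) -> bytes:
    """Tree-free identifier: SHA256d over the concatenated chunk hashes."""
    joined = b"".join(chunk_hashes)
    return hashlib.sha256(hashlib.sha256(joined).digest()).digest()
```

A requester that knows data_length can reject a list with a missing or extra "ghost" entry before requesting any chunk, since the expected count is fully determined.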

fjahr commented at 6:00 PM on May 18, 2026:

The utxotree message include data_length which (along with the fixed 3.9MB data sizes) makes the merkle tree structure deterministic. If you make the hardcoded hash identifying the utxo set be the hash of all the data in the utxotree message, that should suffice to prevent messing around with the chunk structure?

Yes, it does. It is another variant of a (very small) check in addition to the validation of the tree. So is checking the uniqueness of all the hashes in the list. My point was that there are several of these integrity checks that are possible to fix the malleability of the final data and even catch it early, and I don't really have much of a preference between them. But I liked the idea of changing the tree construction because that seems to solve the issue without any additional integrity check.

Also, num_chunks is redundant given data_length and the 3.9MB constant. If you want to support chunks breaking at sizes other than 3.9MB, it would be better to have 3,900,000 be included in the message rather than num_chunks IMO.

Yeah, I actually thought about removing num_chunks from utxotree because of the redundancy, and I am not sure why I didn't do it yet, so I am doing it with the next push. My intention with this wasn't to support different chunk sizes, so I think the removal addresses your comment about the 3.9 MB number. I thought about parameterization, but I don't see much value in it, given the rationale I have in the BIP about the 3.9 MB choice. But if people find it compelling, I would be open to parameterizing this.

BTW, I don't think you need a tree structure here btw; you could just hash the utxotree message (ie concatenate the chunk hashes) instead.

You are correct, of course! The tree structure is a relic of the inclusion proofs that were part of the utxoset message. I am still not fully convinced we should change to concatenated hashes, though. Merkle trees are well understood in Bitcoin and are not that much more complex than simple concatenation. Keeping Merkle trees still leaves the option open for inclusion proofs if they should be useful somehow down the line. But I don't feel strongly about this and can define a different way of combining chunk hashes if reviewers prefer that.

ajtowns commented at 5:49 AM on May 23, 2026:

If you want to reuse the hash for inclusion proofs, then probably it would be best to divide chunks into individual utxos, and definitely to avoid having individual utxos split across chunks (meaning you'd need 8MB to prove some utxo's inclusion).

fjahr commented at 10:35 PM on June 3, 2026:

Sorry, I meant inclusion of the chunks like I had been using in the original version of the BIP. But I thought about this a bit, if this could be useful and how much overhead this would create. I don't think it's worth it without an actual use-case though. We would give up a lot of simplicity without the uniform size of the chunks for example.

in bip-XXXX.md:127 in 07e74d26c0

 122 | +| `version` | `uint16_t` | 2 | Format version of the serialized UTXO set. |
 123 | +| `data_length` | `uint64_t` | 8 | Total size of the serialized UTXO set in bytes (header + body). |
 124 | +| `num_chunks` | `compact_size` | 1–9 | Number of chunks the serialized UTXO set is split into. |
 125 | +| `chunk_hashes` | `uint256[]` | 32 × `num_chunks` | The ordered list of chunk hashes. |
 126 | +
 127 | +Upon receiving a `utxotree` message, the node MUST recompute the Merkle root from

murchandamus commented at 10:26 PM on May 12, 2026:

Stumbled over this the first time, because “node” could apply to either side.

Upon receiving a `utxotree` message, the requesting node MUST recompute the Merkle root from

fjahr commented at 8:39 PM on May 17, 2026:

Done

in bip-XXXX.md:131 in 07e74d26c0 outdated

 126 | +
 127 | +Upon receiving a `utxotree` message, the node MUST recompute the Merkle root from
 128 | +`chunk_hashes` and compare it against the Merkle root it knows for the corresponding UTXO set. If
 129 | +the roots do not match, the node MUST discard the response and MUST disconnect the peer.
 130 | +
 131 | +#### `getutxoset`

murchandamus commented at 10:28 PM on May 12, 2026:

getutxoset feels a bit odd, when the message actually requests a chunk. Also, Cluster Mempool makes extensive use of the term "chunk", and I was wondering whether this overlap could cause confusion in the future.

fjahr commented at 8:39 PM on May 17, 2026:

Chunk is just a very generic term in computer science, it's part of the http spec as well and thus we have the term appearing several times in the libevent replacement code as well (both client and server) which can not be avoided. Initially, I also wasn't the happiest about it but I just couldn't find a different term for it that felt right, especially considering how http uses the term and how that analogy seems to match pretty well. This also explains the message naming: The message transports the UTXO set, chunks are just an aspect of the transport mechanism in the http analogy. At least that's how I felt most comfortable reasoning about it. I think Cluster Mempool would have had a wider pick of fitting terminology to choose from but that ship has sailed. Happy to still consider a renaming if anyone has a good suggestion but all the alternatives I could think of didn't seem to fit well enough. I also obviously prefer shorter naming in order to not make squeezing it into 12 characters too awkward.

murchandamus commented at 5:34 PM on June 4, 2026:

I don’t feel that strongly about it, your response resolves it for me.

in bip-XXXX.md:134 in 07e74d26c0

 129 | +the roots do not match, the node MUST discard the response and MUST disconnect the peer.
 130 | +
 131 | +#### `getutxoset`
 132 | +
 133 | +Sent to request a single chunk of UTXO set data. The requesting node MUST have received a `utxotree`
 134 | +for the corresponding UTXO set before sending this message.

murchandamus commented at 10:30 PM on May 12, 2026:

Should there perhaps also be an addition that the serving node must not reply if it has not previously sent a utxotree message to the requesting peer?

murchandamus commented at 10:33 PM on May 12, 2026:

I also checked that the Utreexo BIPs don’t use the term "utxotree", but I worry that "utreexo" and "utxotree" are extremely similar and that will cause headaches in the future.

fjahr commented at 8:40 PM on May 17, 2026:

Should there perhaps also be an addition that the serving node must not reply if it has not previously sent a utxotree message to the requesting peer?

I think this would require that the serving node keeps track of every request which opens new DoS possibilities. So I would rather not require this.

I also checked that the Utreexo BIPs don’t use the term "utxotree", but I worry that "utreexo" and "utxotree" are extremely similar and that will cause headaches in the future.

Just like with the other naming response, overall happy to consider renaming if there is a good suggestion. I also didn't like this option too much due to the closeness to utreexo but I also couldn't find a better name. Using chunks in the msg name here was ruled out by me for the same reason I laid out above: chunks are a generic term within transport and don't have anything to do with the content of what is transported IMO. In the end both proposals deal with trees of UTXOs to some degree so it's hard to differentiate. I guess if there was a unique brand for this proposal that could be used but I am not creative enough, best I got are "minishare" and "UTorrentXO" but nobody wants that :p

in bip-XXXX.md:185 in 07e74d26c0

 180 | +discretion to manage resource consumption.
 181 | +
 182 | +## Rationale
 183 | +
 184 | +**Usage of service bit 12:** Service bits allow selective peer discovery through
 185 | +DNS seeds and addr relay. Bit 12 is chosen as the next unassigned bit after `NODE_P2P_V2` (bit 11, BIP 324).

murchandamus commented at 10:43 PM on May 12, 2026:

As noted above, your service flag should perhaps be moved to bit 14, unless the Utreexo project is willing to move instead.

fjahr commented at 8:40 PM on May 17, 2026:

As mentioned above, moved to take Bit 14 but I will most likely change this to use BIP 434 instead in the next push.

in bip-XXXX.md:194 in 07e74d26c0 outdated

 189 | +`getutxotree`. The serving node responds only if it can serve that specific UTXO set.
 190 | +
 191 | +**Per-chunk verification:** The chunk-hash list returned in `utxotree` enables each chunk to be verified
 192 | +by direct lookup against the accepted list as it arrives, allowing immediate detection of corrupt data,
 193 | +peer switching without data loss, and parallel download from multiple peers. The list itself is small
 194 | +(~80 KB for a ~10 GB set). The specified serialization is deterministic, so all honest nodes produce

murchandamus commented at 10:48 PM on May 12, 2026:

If a node is expected to source chunks from multiple different peers, is it really necessary to receive the entire tree description of 80 KB from each of the peers?

ajtowns commented at 1:43 PM on May 13, 2026:

The spec says "The requesting node sends getutxotree for the desired block hash to one or more of these peers." so I think the answer is "no, it is not necessary to receive the entire tree description of 80 KB from each of the peers" -- you only send requests to the number of peers you want to receive response from. Any attempt to give different responses will (should) result in them not hashing back to your known merkle root, so all valid descriptions will be identical, AIUI.
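Once a single chunk-hash list has been accepted against the known Merkle root, each chunk can be checked on arrival regardless of which peer sent it. A minimal sketch, assuming SHA256d leaf hashes and fixed-size chunks with a possibly shorter last chunk; the function and parameter names are illustrative only.

```python
import hashlib

CHUNK_SIZE = 3_900_000  # bytes, the fixed chunk size from the draft


def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def verify_chunk(index: int, chunk_data: bytes, chunk_hashes: list[bytes],
                 data_length: int, chunk_size: int = CHUNK_SIZE) -> bool:
    """Check one received chunk against the already accepted hash list."""
    n = len(chunk_hashes)
    if not 0 <= index < n:
        return False
    # Every chunk is full-size except the last, which holds the remainder.
    expected_len = chunk_size if index < n - 1 else data_length - chunk_size * (n - 1)
    if len(chunk_data) != expected_len:
        return False
    return sha256d(chunk_data) == chunk_hashes[index]
```

Because the check depends only on the accepted list, chunks can be requested from different peers in parallel and a failing chunk can simply be re-requested elsewhere.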

murchandamus commented at 6:25 PM on May 13, 2026:

I read line 133,134

Sent to request a single chunk of UTXO set data. The requesting node MUST have received a utxotree for the corresponding UTXO set before sending this message.

as the serving node not being permitted to respond to getutxoset calls for a specific tree unless it previously sent a utxotree message to the same peer, but maybe I misinterpreted that. It seems to me that both aspects of the question should be clarified:

- Must a peer send getutxotree before being eligible to receive responses to getutxoset for the same tree?
- Is it necessary to retrieve the utxotree from multiple peers before requesting chunks?

ajtowns commented at 9:17 PM on May 14, 2026:

Perhaps it would be better to send getutxotree to one node (repeating until you get a valid response), and then send getutxoset to any nodes that support utxo set sharing, with the response utxoset <hash> <n> <empty> indicating "i don't have that utxoset data" ? So instead of getutxotree / utxotree to establish whether a peer has the data you want, you send getutxoset / utxoset and either get an explicit nope or data you actually want?

fjahr commented at 8:40 PM on May 17, 2026:

@murchandamus I tightened the language you were not happy with, primarily in the Protocol Flow section. Please let me know if it works better for you now. @ajtowns It's an interesting suggestion. I am not convinced so far since I don't really like overloading original message semantics (it's a new protocol, we may at least do it right in the beginning). I also think the explicit nope doesn't gain us that much and at the same time moves us backwards with regards to @luke-jr 's concerns. Did you have specific thoughts on this? If we went with your suggestion then I would almost feel better about adding the -info discovery approach again. But I haven't thought about it that much, I will continue to ponder this.

murchandamus commented at 6:10 PM on June 4, 2026:

Reading the diff, I think there is still one thing that could be clearer: from the perspective of the serving node, is it required that a peer has first sent getutxotree before it responds to getutxoset messages or will it respond to getutxoset even if the peer did not send getutxotree? Presumably it’s the latter, since getutxotree is a heavy message and you say that it can be requested only a single time, but I think the protocol is only specific on the requester side so far.

murchandamus commented at 10:55 PM on May 12, 2026: member

I gave this submission a first quick read. I noticed that there doesn’t seem to be a Backwards Compatibility section. Please add one.

murchandamus added the label PR Author action required on May 12, 2026

in bip-XXXX.md:110 in 07e74d26c0 outdated

 105 | +
 106 | +| Field | Type | Size | Description |
 107 | +|-------|------|------|-------------|
 108 | +| `block_hash` | `uint256` | 32 | Block hash identifying the requested UTXO set. |
 109 | +
 110 | +A node that has advertised `NODE_UTXO_SET` and can serve the requested UTXO set MUST respond with

stickies-v commented at 12:31 PM on May 13, 2026:

nit

A node that has advertised `NODE_UTXO_SET` and can serve the requested UTXO set SHOULD respond with

fjahr commented at 8:40 PM on May 17, 2026:

I looked at many other BIPs, particularly 152 and 157 in terms of the language they use for similar situations and that's how I ended up with MUST here. Do you have a specific rationale for this change? I guess you are not feeling too strongly about it (nit), so I am leaving it as is for now.

in bip-XXXX.md:90 in 07e74d26c0 outdated

  85 | +#### Chunk Merkle Tree
  86 | +
  87 | +The serialized UTXO set (header + body) is split into chunks of exactly 3,900,000 bytes (3.9 MB). The
  88 | +last chunk contains the remaining bytes and may be smaller.
  89 | +
  90 | +The leaf hash for each chunk is `SHA256d(chunk_data)`. The tree is built as a balanced binary tree. When

stickies-v commented at 12:32 PM on May 13, 2026:

nit: when reading this, I was wondering if I missed a section. A very brief (non-title) introduction of the tree we're building before talking about its leaves would read more naturally.

fjahr commented at 8:40 PM on May 17, 2026:

Improved the wording here, I hope it's clearer now.

stickies-v commented at 12:43 PM on May 13, 2026: none

I found the BIP pretty easy to grok and think it's well written. At this point, I do not support implementing this in Bitcoin Core, which afaik is the only current use case, but I think it does make sense to formalize these ideas into a BIP, given that a lot of people have thought about and discussed this and it is generally better to preserve that kind of knowledge.

My main conceptual concern with this BIP is that there's a non-trivial data availability problem. Being able to serve the utxo set for a specific height will most likely incur a significant storage and/or computation overhead. I think this means the network would have to converge on a small subset of available heights (cf. the "blessed" heights in Bitcoin Core). Serving just the tip is another option, but given the size of the UTXO set and the frequency at which the tip changes, that seems problematic for nodes that are unlikely to be able to download the utxo set within 10 minutes. As a "Peer Services" BIP, I'm not sure that concern can be resolved here, but perhaps it's worth at least mentioning?

ajtowns commented at 1:37 PM on May 13, 2026: contributor

Being able to serve the utxo set for a specific height will most likely incur a significant storage and/or computation overhead. I think this means the network would have to converge on a small subset of available heights

I don't believe this is a major problem; automatically storing and serving the utxo sets at fixed heights, eg every 12500 blocks (12 weeks / 3 months), and only keeping the most recent ~4 means that your node's storage requirements increase by ~50GB (~7% of the 700GB used for historical block data), and means that you can expect to download up to about ~40GB (12GB utxo set plus 12500 blocks at 2MB each) before you can validate blocks at the tip with a new install, assuming new point releases are made to update hardcoded assumeutxo data shortly after block x*12500 is mined.
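The arithmetic behind those estimates, as a quick sketch. All inputs are the round figures from this comment (12 GB per utxo set, 2 MB per block, 700 GB of historical block data), not measurements:

```python
UTXO_SET_GB = 12          # assumed size of one serialized utxo set
SNAPSHOTS_KEPT = 4        # most recent snapshots retained
INTERVAL_BLOCKS = 12_500  # blocks between snapshot heights
BLOCK_MB = 2              # assumed average serialized block size
HISTORICAL_GB = 700       # assumed size of historical block data

# Extra disk space for a serving node.
extra_storage_gb = UTXO_SET_GB * SNAPSHOTS_KEPT
extra_storage_pct = 100 * extra_storage_gb / HISTORICAL_GB

# Worst case for a new node: the snapshot plus a full interval of blocks.
max_download_gb = UTXO_SET_GB + INTERVAL_BLOCKS * BLOCK_MB / 1000

print(extra_storage_gb, round(extra_storage_pct, 1), max_download_gb)
```

That gives 48 GB of extra storage (about 6.9% of 700 GB) and a worst-case download of 37 GB, which is where the rounded ~50 GB, ~7% and ~40 GB come from.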

stickies-v commented at 2:02 PM on May 13, 2026: none

I don't believe this is a major problem; automatically storing and serving the utxo sets at fixed heights, eg every 12500 blocks (12 weeks / 3 months), and only keeping the most recent ~4

That's what I would consider as "blessed" heights in my comment, and I agree it's a solution. It still requires coordination as to which heights "the network" is able to serve, either through implementation details or BIP specification.

murchandamus commented at 6:32 PM on May 13, 2026: member

I think this means the network would have to converge on a small subset of available heights (cf. the "blessed" heights in Bitcoin Core). Serving just the tip is another option, but given the size of the UTXO set and the frequency at which the tip changes, that seems problematic for nodes that are unlikely to be able to download the utxo set within 10 minutes. As a "Peer Services" BIP, I'm not sure that concern can be resolved here, but perhaps it's worth at least mentioning?

I was assuming that the UTXO set snapshot heights would correspond to the UTXO set root commitments shipped with the Bitcoin Core releases, and volunteering nodes would likely keep the last 2-3 of those on hand. Alternatively I was wondering whether nodes might automatically keep UTXO set snapshots of, e.g., the latest couple modulo %20'000 blocks on top of the ones shipped with the Bitcoin Core releases, and would be willing to serve those if it had at least been buried by say 4000 blocks.

ajtowns commented at 9:20 PM on May 14, 2026: contributor

would be willing to serve those if it had at least been buried by say 4000 blocks.

If you're storing the utxoset yourself, you probably want to take the snapshot while it's still close to the tip, to avoid too much churn. Whether you start serving that immediately or wait is another matter though.

murchandamus commented at 6:46 PM on May 15, 2026: member

Yeah, snapshotting while it is on top is certainly the easiest, but I figured that it might not be great to start serving the UTXO set if it is too close to the tip.

fjahr force-pushed on May 17, 2026

fjahr commented at 8:40 PM on May 17, 2026: contributor

@murchandamus @ajtowns @stickies-v

Re: The question about which snapshot heights should be served.

Thank you all for kicking up this conversation. My idea here was to keep the language in the BIP broad enough to accommodate both of the most likely scenarios (IMO). Initially we can certainly assume Bitcoin Core would be the first to adopt this and serve the embedded assumeutxo snapshots that are renewed once every release. Then, at some later time, be it Bitcoin Core or another project, there could be some effort to share snapshots closer to the tip on some "rolling" basis. Specifying this right now appears to be tricky and IMO outside of the scope I had in mind for this BIP. I would prefer it if the specification of the "rolling" snapshots could be added as some extension of this BIP. But I can also go down this rabbit hole if you feel strongly that this should be part of this BIP.

fjahr commented at 8:45 PM on May 17, 2026: contributor

Addressed feedback from @murchandamus @ajtowns @stickies-v , thank you all for commenting!

Aside from making some of the language more explicit in several places this change makes the merkle tree use promotion of odd nodes rather than duplicating them in order to avoid adding any additional Merkle tree pre-checks. Service bit is moved to 14 since I missed the potential collision with Utreexo. I also added a minimal backwards compatibility section.
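To illustrate the promotion rule, here is a minimal sketch of the root computation. The leaf hash `SHA256d(chunk_data)` is from the draft; hashing inner nodes as `SHA256d(left || right)` is an assumption made for this example, and the function names are illustrative only:

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, as used for the chunk leaf hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Merkle root where an unpaired node is promoted, not duplicated."""
    if not leaves:
        raise ValueError("at least one chunk hash is required")
    level = list(leaves)
    while len(level) > 1:
        nxt = []
        # Hash complete pairs (assumed inner-node rule: SHA256d(left || right)).
        for i in range(0, len(level) - 1, 2):
            nxt.append(sha256d(level[i] + level[i + 1]))
        # An odd node at the end moves up one level unchanged.
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
    return level[0]


chunk_hashes = [sha256d(bytes([i])) for i in range(3)]
root = merkle_root(chunk_hashes)
```

With three leaves, the third one is promoted and the root is the hash of the first pair's parent together with that leaf, so no leaf is ever hashed with a copy of itself.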

stickies-v commented at 4:43 PM on May 18, 2026: none

Specifying this right now appears to be tricky and IMO outside of the scope I had in mind for this BIP. I would prefer it if the specification of the "rolling" snapshots could be added as some extension of this BIP. But I can also go down this rabbit hole if you feel strongly that this should be part of this BIP.

I think it's not unreasonable to extend or compose BIPs in this case, but I think it means the NODE_UTXO_SET service bit (defined in this BIP) becomes pretty useless to identify helpful nodes to connect to without redefining it. If it turns out we can actually agree on a good availability definition right away, that would make the service bit meaningfully more useful.

fjahr force-pushed on May 18, 2026

fjahr commented at 6:35 PM on May 18, 2026: contributor

I think it's not unreasonable to extend or compose BIPs in this case, but I think it means the NODE_UTXO_SET service bit (defined in this BIP) becomes pretty useless to identify helpful nodes to connect to without redefining it. If it turns out we can actually agree on a good availability definition right away, that would make the service bit meaningfully more useful.

Thanks, but I don't understand why it makes the service bit useless if this BIP leaves it to the implementations which heights they typically want to share. It still helps to exclude the majority of nodes that don't share a UTXO set at all. That's about the same level of usefulness as NODE_NETWORK_LIMITED I would say.

Trying to come up with a perfect number from the start seems like it would likely invite endless bikeshedding and would put additional pressure on implementations that want to use it.

Also note that I am considering changing the signalling to use BIP434 when that is actually implemented in Bitcoin Core. For discovery this isn't a straightforward decision, but it would definitely be interesting for adding the possibility of additional signalling for "recent utxo sets available" or something if people would like that. But that seems perfect to be part of the extension as well to me.

fjahr commented at 6:43 PM on May 18, 2026: contributor

Pushed the removal of num_chunks from utxotree msg.

fjahr commented at 8:47 PM on May 19, 2026: contributor

@murchandamus @jonatack I see there is still the "PR author action required" flag set here. Is there something in particular that causes this to remain, or can it be removed?

jonatack removed the label PR Author action required on May 19, 2026

jonatack commented at 9:08 PM on May 19, 2026: member

@fjahr removed, will review.

murchandamus commented at 11:52 PM on May 19, 2026: member

Thanks, but I don't understand why it makes the service bit useless if this BIP leaves it to the implementations which heights they typically want to share. It still helps to exclude the majority of nodes that don't share a UTXO set at all. That's about the same level of usefulness as NODE_NETWORK_LIMITED I would say.

Given the expected small subsets of nodes that would serve this data, a service bit would be useful, but the nodes then serving a variety of different UTXO sets would be very odd indeed. Presumably, it should be obvious which height is served by the version of the node, but I would much prefer if the service bit indicated the ability to serve specific recent heights. I don’t think the comparison to NODE_NETWORK_LIMITED is accurate. That service bit indicates that a node can serve exactly the latest 288 blocks (it must serve at least 288, but is not supposed to serve more for fingerprinting reasons), so the service bit very clearly indicates exactly what is offered.

fjahr commented at 9:41 AM on May 21, 2026: contributor

Given the expected small subsets of nodes that would serve this data, a service bit would be useful, but the nodes then serving a variety of different UTXO sets would be very odd indeed. Presumably, it should be obvious which height is served by the version of the node, but I would much prefer if the service bit indicated the ability to serve specific recent heights.

Sure, but how large would that variety really be if we assume only the current frequency of assumeutxo params being added? I think connecting to a handful of nodes should usually already yield at least one that has the right UTXO set. I also had this thought about the version and that may be possible for an implementation to explore but it doesn't seem relevant in the BIP context since the version isn't specified and doesn't seem to be appropriate to be used in this way IMO.

I don’t think the comparison to NODE_NETWORK_LIMITED is accurate. That service bit indicates that a node can serve exactly the latest 288 blocks (it must serve at least 288, but is not supposed to serve more for fingerprinting reasons), so the service bit very clearly indicates exactly what is offered.

Right, from a global perspective it is not 1-to-1 comparable because there could be multiple potential UTXO set candidates. But from the serving node's perspective, with core's current assumeutxo model, there is also only one latest (most interesting) utxo set to serve, so I think the analogy still works well from that perspective. Anyway, these semantics won't resolve the issue and I do realize the discovering node is the perspective that matters more.

Do I understand correctly that your point is that the BIP should indeed specify a recent height signalling mechanism right away? I am just not convinced that it is smart to tie this directly to the service bit. As I mentioned before, it puts a lot of pressure on selecting a number which all nodes that want to use this feature then have to comply with, but they may not want to or be able to do so. In core, implementing this is pretty straightforward since https://github.com/bitcoin/bitcoin/pull/33477. But I am not sure if this makes the feature bit too heavily influenced by the core context, making it less likely to be adopted by other implementations with other requirements.

What do people think about making what I had planned as an extension an optional part of this BIP? As in specifying an optional signalling via BIP434 if a recent utxo set is being served and a definition for how the most recent utxo set to be served is calculated.

stickies-v commented at 11:16 AM on May 21, 2026: none

but how large would that variety be really if we assume only the current frequency of assumeutxo params being added.

Sure, but then you're implicitly embedding Bitcoin Core implementation and release details in this BIP, and I thought the point was to not do that. Not doing that or making it explicit would both seem like better options.

What do people think about making what I had planned as an extension an optional part of this BIP?

Does this BIP need to define the service bit? If not, it can be defined in the optional extension(s)?

fjahr commented at 12:47 PM on May 21, 2026: contributor

Sure, but then you're implicitly embedding Bitcoin Core implementation and release details in this BIP, and I thought the point was to not do that. Not doing that or making it explicit would both seem like better options.

Fair take, but I just meant to use it as an example. Any similarly low-frequency updating of the shared set would have the same effect. It probably makes more sense to distinguish between low-frequency and high-frequency routes that implementations might take. Low frequency would mean that the utxo set sharing parameters are updated with releases, allowing for AssumeUTXO-style security through embedding the hash etc. I still think discovery for this usage model would work well enough with the BIP as-is due to the lower churn in shared utxo sets.

The high-frequency case has different requirements: It would be very helpful to define when exactly the utxo set is generated and shared to have unison. And since this frequency would be much higher than releases it also seems more likely that this would use the comparison of multiple peers to check if the root matches. So this puts pressure on this model to find more peers. But of course it would also be possible to use it with a trusted root that is supplied by the user. I just think the likelihood shifts.

Does this BIP need to define the service bit? If not, it can be defined in the optional extension(s)?

It seems to me like the BIP then would be much less usable without the extension in any case, so if the BIP doesn't work without the extension what would be the point of separating them this way? This BIP isn't that complex that it would require splitting up (like BIP340/341/342 for example) unless it's splitting off parts that would be considered optional, like I have been laying it out in my suggestion above.

Looking at other relevant BIP examples (144, 157, 152) it seems like typically we define a service bit together with the wire format.

So it seems better to me to directly add what I had in mind for the extension here than remove any discovery mechanism. But if people agree that such a split is an improvement, I can still do it.

stickies-v commented at 2:55 PM on May 21, 2026: none

what would be the point of separating them this way?

That a single service bit could unambiguously be used to find nodes that speak the protocol and are likely to serve at the heights you'd expect them to. If it turns out there is interest in having multiple height frequencies, they could all reference this BIP and have their own service bit. It would be good to try and limit the number of UTXO set service bits to one regardless, but then at least we don't need to decide on that in this BIP, which seems to be your preference.

I don't think I've got anything more to contribute on the topic, so I'll probably conclude my review here. Might pop back in case of major changes.

murchandamus commented at 4:02 PM on May 21, 2026: member

Do I understand correctly that your point is that the BIP should indeed specify a recent height signalling mechanism right away? I am just not convinced that it is smart to tie this directly to the service bit.

My understanding so far was that each Bitcoin Core release was hardcoding one height for which it supports assumeutxo. So, whenever there is a release, all existing nodes of prior versions with the service bit would actually not serve the new hardcoded height’s UTXO set. Presumably, a node upon the update would generate the new UTXO set and start serving it, but that would still leave us in the situation that most nodes exhibiting the service bit would not actually be able to help new nodes that are coming online with the data they are looking for. The ensuing need to churn through different nodes until one is found that has the data seems like a departure from how service bits were previously used, although I agree knowing who in general offers the service is very useful to cut down the cycling. That’s why combining the service bit with scheduled heights that can be served if they are sufficiently buried and e.g., nodes committing to be able to serve the latest two heights would make more sense to me.

fjahr commented at 8:28 PM on May 21, 2026: contributor

That’s why combining the service bit with scheduled heights that can be served if they are sufficiently buried and e.g., nodes committing to be able to serve the latest two heights would make more sense to me.

Sure, I understand now what you mean by "it makes sense". It would make discovery very straightforward for the nodes that use the high-frequency model. But if we make this mandatory, we are trading this for a lot higher minimum complexity for implementation on the server side and it doesn't change the situation for the nodes that still want to use the low-frequency mechanism, they would still have the exact same effort to find a serving node that has their release-blessed UTXO set.

I am curious what your rationale is for keeping the latest two heights specifically. FWIW, in my core implementation I signal with just one snapshot which allows for the downloading node to then become a seeder after activation without further action, similar to what Torrent does by default. I think this allows for a nice dynamic of higher availability with the slow-frequency model.

Since there is interest in this here now, we can go into the high-frequency serving independent of if it goes in here or an extension. Maybe with the bikeshedding out of the way I will feel more comfortable including it in here directly ;)

The requirements I can think of are:

- Fairly recent height so sync is actually fast, so frequent refreshes but also not too frequent (mentioned were every 12,500 and every 20,000 blocks, my pick would have been around the 12,500 mark or possibly even lower)
- Needs to be buried reasonably so it isn't re-orged out and there is some buffer for the serving node to actually build the snapshot before the first nodes start asking for it (4,000 blocks was mentioned, I would have gone lower here, more like a week maybe)
- A reasonable buffer before deletion so that a node that started downloading the snapshot doesn't get rugged midway. This is why I think keeping a minimum of 2 actually makes sense, just so that there is an overlap and the most recent doesn't just disappear if the downloader starts right at the end of a snapshot's availability. I wouldn't want to keep more though (4 was mentioned) as I don't think it would be needed, and storage-wise we would need to account for one more anyway, which may currently be being built or already built but not served yet since it isn't buried enough.

Anything I am missing so far?

murchandamus commented at 12:59 AM on May 22, 2026: member

I am curious what your rationale is for keeping the latest two heights specifically.

If we would be doing the blessed heights, I was thinking the last two blessed heights, because we would not serve the latest while it isn’t mature enough, in which case the penultimate blessed height would be the one that nodes would be interested in. Once the latest is mature enough, one would even be enough, but we’d eventually store the newest blessed height while it is immature. That’s how I arrived at two. Storing older blessed heights seems pointless. Beyond that, I’d expect an assumeutxo supporting node to serve the hardcoded heights for the latest or latest two maintained major branches. We wouldn’t want to keep too many heights, because of the additional storage, and because they presumably would not be demanded. That said, I am not sure whether I support the blessed heights, these are just musings how I’d go about it, if we were to do it.

ajtowns commented at 2:17 AM on May 22, 2026: contributor

That said, I am not sure whether I support the blessed heights, these are just musings how I’d go about it, if we were to do it.

I think it would be good if the service bit did imply which heights were "blessed". I would suggest the blessed heights should be:

nodes advertising NODE_UTXO_SET will provide the utxo sets at heights H1 and H2, calculated as follows:
- N = current tip height
- M = N - 2016 (height two weeks ago)
- K = 13*1008 (approximately 3 months worth of blocks)
- H1 = M-M%K (most recent block at a multiple of K that's at least two weeks ago)
- H2 = H1 - K
- The utxo set at H2 need not be provided if M >= H1+K

Currently H1 would be 943488 (2026-04-03) and H2 would be 930384 (2026-01-01). Those compare to the current hardcoded heights of 935000, 910000, 880000, 840000 (deltas of 40k, 30k, 25k blocks vs 13,104).

Alternatively, setting K=11000 (~11 weeks, ~2.5 months) would give blocks 946000 (2026-04-21) and 935000 (2026-02-05) which would be compatible with core v31.0 (the updated assumeutxo values haven't been backported to v30.x).
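The calculation above can be sketched as follows. The tip height of 950,000 is a hypothetical value chosen to fall within the range that yields the heights quoted here; the function and parameter names are illustrative:

```python
BURIAL = 2016  # blocks a snapshot height must be buried (about two weeks)


def blessed_heights(tip_height: int, k: int = 13 * 1008) -> tuple[int, int]:
    """Return (H1, H2) for a given tip height N and snapshot interval K."""
    m = tip_height - BURIAL  # M = N - 2016
    h1 = m - m % k           # most recent multiple of K at or below M
    h2 = h1 - k              # the snapshot height before that
    return h1, h2


print(blessed_heights(950_000))            # K = 13104
print(blessed_heights(950_000, k=11_000))  # K = 11000
```

For that tip height, the first call gives 943488 and 930384, and the second gives 946000 and 935000, matching the values above.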

If we took this approach, nodes advertising this service would need to store three versions of the utxo set (ie, the utxo set at tip, the utxo set at H1, and either the utxo set at H2 or the next utxo set "H0" that will become H1 within 2016 blocks).

Something in the range of 10k to 15k blocks seems manageable from both a bandwidth perspective (on average, the additional 7k to 9.5k blocks at 2MB serialized size each is roughly similar to the size of the utxo set) and as far as doing point releases to update the hardcoded utxo set hashes (roughly once per quarter).

If the p2p network were only advertising one utxo set, rather than the two most recent ones, the 31.0 hardcoded hash would already be out of date, and we'd need to make point releases precisely within the 2 week period between a new H1 block being mined and it becoming the new p2p utxo set.

EDIT to add:

What do people think about making what I had planned as an extension an optional part of this BIP? As in specifying an optional signalling via BIP434 if a recent utxo set is being served and a definition for how the most recent utxo set to be served is calculated.

I think either a single bip with (FEATURE = p2p messages are available; service bit = these scheduled utxo sets are available) or two bips (one defining the p2p messages/FEATURE, the other defining the service bit and schedule) would work fine. If you want to defer defining the schedule, then two bips would probably be better, since you could finish off the first one without delaying?

jonatack commented at 12:30 PM on May 25, 2026: member

From a BIP editor point of view, this seems complete and ready for a number, modulo your decision on using BIP 434 and the structure (extension, one BIP vs two, etc.)

What do people think about making what I had planned as an extension an optional part of this BIP? As in specifying an optional signalling via BIP434 if a recent utxo set is being served and a definition for how the most recent utxo set to be served is calculated.

I think either a single bip with (FEATURE = p2p messages are available; service bit = these scheduled utxo sets are available) or two bips (one defining the p2p messages/FEATURE, the other defining the service bit and schedule) would work fine. If you want to defer defining the schedule, then two bips would probably be better, since you could finish off the first one without delaying?

fjahr force-pushed on Jun 3, 2026

Add BIP UTXO set sharing 23c4b35ce9

fjahr force-pushed on Jun 3, 2026

fjahr commented at 10:26 PM on June 3, 2026: contributor

Push to v0.5.0: Since there seemed to be a pretty unanimous preference to have the scheduled heights specified with the service bit, I added this to the BIP now. I took @ajtowns 's suggestions almost completely but extended K a little bit to be 7 difficulty adjustment periods instead of the suggested 6.5. The cosmetic aspect of this is to not have a fraction of a difficulty adjustment period, but I also think it's good to be a bit more conservative and make sure H2 is not disappearing after less than 6 months if there is a bunch of new hash power coming online.
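A rough sketch of the 6-month reasoning, assuming the H1/H2 schedule suggested earlier in this thread is otherwise unchanged: a snapshot is then served as H1 for K blocks and as H2 for another K blocks, so 2K blocks in total. The 9.5-minute block interval is a hypothetical figure for a period of rising hash power:

```python
PERIOD = 2016  # blocks per difficulty adjustment period


def served_days(k: int, minutes_per_block: float = 10.0) -> float:
    """Days a snapshot stays available: K blocks as H1 plus K blocks as H2."""
    return 2 * k * minutes_per_block / (60 * 24)


k_suggested = 13 * 1008  # 6.5 periods = 13104 blocks
k_chosen = 7 * PERIOD    # 7 periods = 14112 blocks

print(served_days(k_suggested), served_days(k_chosen))
print(served_days(k_suggested, 9.5), served_days(k_chosen, 9.5))
```

At the 10-minute target, 6.5 periods give 182 days and 7 periods give 196 days. With blocks arriving faster than the target, the shorter K drops below half a year while the longer one keeps some margin.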

I think this also settles the need for a BIP extension. I did also add feature negotiation for the p2p message availability like @ajtowns suggested and following the spirit of BIP434 in general, however I guess we could also move forward without it, given the service bit.

I don't suggest any course of action for the downloading node if not both H1 and H2 are available because it seems unreasonable for the downloader to request a UTXO set it doesn't actually want just to be pedantic.

I also added an acknowledgements section, thanks again everyone for contributing thoughtful improvement suggestions!

murchandamus renamed this:
~~BIP draft: UTXO set sharing~~
BIP452: P2P UTXO Set Sharing
on Jun 4, 2026

murchandamus commented at 5:47 PM on June 4, 2026: member

Let’s refer to this as BIP452. Please update the file name as well as the BIP and Assigned headers correspondingly. Please add a table entry to the README for your document.

in bip-XXXX.md:77 in 23c4b35ce9

  72 | +snapshot before it is requested.
  73 | +
  74 | +### Feature Negotiation
  75 | +
  76 | +Support for the messages in this document is negotiated per connection via the BIP 434 `feature`
  77 | +message, using `featureid` `BIPXXXX` (TODO) and empty `featuredata`. A node implementing these

murchandamus commented at 6:16 PM on June 4, 2026:

I thought that the featuredata here would include the heights that the serving node can serve.

If all nodes that implement the protocol send this feature message, the only other way to find out whether a node can serve the release-hardcoded heights would be by asking getutxotree or getutxoset for a chunk of the corresponding height. What’s the rationale for it being empty? Is it meant to make it more expensive for surveillants to learn what heights each node serves?

If a user wants to download the UTXO set for the trusted hash hardcoded in the latest release, how do they find peers that serve it?

ajtowns commented at 6:16 AM on June 5, 2026:

I think the idea is that:

- the node has trusted heights hardcoded with the corresponding block and utxo set hash, eg 935'000, 0000000000000000000147034958af1652b2b91bba607beacc5e72a56f0fb5ee, e4b90ef9eae834f56c4b64d2d50143cee10ad87994c614d7d04125e2a6025050
- the node downloads the block headers to make sure the block is an ancestor of the tip (and is sufficiently recent that the utxo set is expected to be available)
- the node uses the service bit to find peers that likely have the utxo set available
- the node requests the utxoset details by hash, then requests utxoset chunks
- the node maintains connections to a sufficient number of peers that respond to those chunk requests to keep downloads efficient

So knowing the heights doesn't add anything useful there, I think -- at best it saves you maybe 100 bytes of message data before you realise they can't send you chunks?

in bip-XXXX.md:8 in 23c4b35ce9

   0 | @@ -0,0 +1,296 @@
   1 | +```
   2 | +  BIP: ?
   3 | +  Layer: Peer Services
   4 | +  Title: P2P UTXO Set Sharing
   5 | +  Authors: Fabian Jahr <fjahr@protonmail.com>
   6 | +  Status: Draft
   7 | +  Type: Specification
   8 | +  Assigned: ?

murchandamus commented at 6:19 PM on June 4, 2026:

  BIP: 452
  Layer: Peer Services
  Title: P2P UTXO Set Sharing
  Authors: Fabian Jahr <fjahr@protonmail.com>
  Status: Draft
  Type: Specification
  Assigned: 2026-06-04

in bip-XXXX.md:95 in 23c4b35ce9

  90 | +**Header (55 bytes):**
  91 | +
  92 | +| Field | Type | Size | Description |
  93 | +|-------|------|------|-------------|
  94 | +| `magic` | `bytes` | 5 | `0x7574786fff` (ASCII `utxo` + `0xff`). |
  95 | +| `version` | `uint16_t` | 2 | Format version. |

ajtowns commented at 6:23 AM on June 5, 2026:

Should specify the value (0x0200?) here? This is effectively standardizing/documenting bitcoin core's current format.
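
To make the question concrete, here is a minimal sketch of reading the first two header fields. The magic bytes come from the table above; the little-endian encoding of `version` is an assumption (the quoted table does not state byte order), and `parse_magic_and_version` is a hypothetical helper name.

```python
import struct

MAGIC = bytes.fromhex("7574786fff")  # ASCII "utxo" + 0xff, per the header table


def parse_magic_and_version(header: bytes) -> int:
    """Validate the 5-byte magic and return the 2-byte format version."""
    if len(header) < 7:
        raise ValueError("header too short for magic + version")
    if header[:5] != MAGIC:
        raise ValueError("bad magic")
    # Assumption: uint16_t is serialized little-endian.
    (version,) = struct.unpack_from("<H", header, 5)
    return version
```

Under that assumption the byte sequence `02 00` decodes to version 2. Whichever value the BIP pins, a reader would presumably reject any other version at this point.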

in bip-XXXX.md:144 in 23c4b35ce9

 139 | +
 140 | +Sent to request the chunk-hash list for a specific UTXO set.
 141 | +
 142 | +| Field | Type | Size | Description |
 143 | +|-------|------|------|-------------|
 144 | +| `block_hash` | `uint256` | 32 | Block hash identifying the requested UTXO set. |

ajtowns commented at 6:34 AM on June 5, 2026:

I think getutxotree should be taking a "chunk merkle tree root hash" as input, rather than a block_hash. You want the utxotree that matches the hash that was hardcoded into your node software, not any random utxo serialization that a peer might have. Particularly applies if the utxotree format changes in future -- even an honest peer with the same utxo set won't be helpful if they give you the utxos in a different format that you can't decode correctly.

in bip-XXXX.md:164 in 23c4b35ce9

 159 | +| `data_length` | `uint64_t` | 8 | Total size of the serialized UTXO set in bytes (header + body). |
 160 | +| `chunk_hashes` | `uint256[]` | 32 × N | The ordered list of N chunk hashes, where N = `ceil(data_length / 3,900,000)`. |
 161 | +
 162 | +Upon receiving a `utxotree` message, the requesting node MUST recompute the Merkle root from
 163 | +`chunk_hashes` and compare it against the Merkle root it knows for the corresponding UTXO set. If
 164 | +the roots do not match, the node MUST discard the response and MUST disconnect the peer.

ajtowns commented at 6:40 AM on June 5, 2026:

I believe that if you don't include data_length in the merkle root, then you get ambiguity between (eg) 1000 chunks totalling 3.9GB of data and 999 chunks totalling 3,892,200,064 bytes of data, where the final 64-byte chunk is the concatenation of the hashes of the original final two chunks.
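
A scaled-down illustration of that ambiguity, using 100-byte chunks instead of 3,900,000-byte ones. The tree construction here is an assumption, since the draft's exact definition is not quoted in this thread: single SHA256 for both leaves and inner nodes, with an unpaired node promoted unchanged to the next level. Whether the collision is exact for the real protocol depends on the actual construction. `root_with_length` is one hypothetical way to commit to `data_length`, not the draft's.

```python
import hashlib


def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()


def chunk_hashes(data: bytes, chunk_size: int) -> list:
    """Hash each fixed-size chunk; the last chunk may be shorter."""
    return [sha256(data[i:i + chunk_size]) for i in range(0, len(data), chunk_size)]


def merkle_root(hashes: list) -> bytes:
    """Assumed construction: pair-wise SHA256, odd node promoted unchanged.

    Expects a non-empty list of hashes.
    """
    level = list(hashes)
    while len(level) > 1:
        nxt = [sha256(level[i] + level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return level[0]


def root_with_length(hashes: list, data_length: int) -> bytes:
    """Hypothetical fix: bind the root to data_length."""
    return sha256(merkle_root(hashes) + data_length.to_bytes(8, "little"))


CHUNK = 100
honest = bytes(range(200)) + bytes(200)             # 400 bytes -> 4 chunks
h_honest = chunk_hashes(honest, CHUNK)
# Replace the last two chunks by one 64-byte chunk holding their two hashes.
forged = honest[:200] + h_honest[2] + h_honest[3]   # 264 bytes -> 3 chunks
h_forged = chunk_hashes(forged, CHUNK)

same_root = merkle_root(h_honest) == merkle_root(h_forged)
same_committed = (root_with_length(h_honest, len(honest))
                  == root_with_length(h_forged, len(forged)))
```

With these assumptions `same_root` is true while `same_committed` is false: the plain root cannot tell the 4-chunk data from the 3-chunk data, but a root that also commits to the length can.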

in bip-XXXX.md:158 in 23c4b35ce9

 153 | +metadata.
 154 | +
 155 | +| Field | Type | Size | Description |
 156 | +|-------|------|------|-------------|
 157 | +| `block_hash` | `uint256` | 32 | Block hash this data corresponds to. |
 158 | +| `version` | `uint16_t` | 2 | Format version of the serialized UTXO set. |

ajtowns commented at 6:43 AM on June 5, 2026:

These are redundant -- you're already checking them when you verify the chunk_hashes match the hardcoded hash.

in bip-XXXX.md:173 in 23c4b35ce9

 168 | +Sent to request a single chunk of UTXO set data. The requesting node MUST have received a `utxotree`
 169 | +for the corresponding UTXO set (from any peer) before sending this message.
 170 | +
 171 | +| Field | Type | Size | Description |
 172 | +|-------|------|------|-------------|
 173 | +| `block_hash` | `uint256` | 32 | Block hash identifying the requested UTXO set. |

ajtowns commented at 6:44 AM on June 5, 2026:

Should be the chunk merkle root hash -- what if you're serving utxo sets for block B in both format 2 and some future format 3?

in bip-XXXX.md:148 in 23c4b35ce9

 143 | +|-------|------|------|-------------|
 144 | +| `block_hash` | `uint256` | 32 | Block hash identifying the requested UTXO set. |
 145 | +
 146 | +A node that has advertised `NODE_UTXO_SET` and can serve the requested UTXO set MUST respond with
 147 | +`utxotree`. If the serving node cannot fulfill the request, it MUST NOT respond. The requesting
 148 | +node SHOULD apply a reasonable timeout and try another peer.

ajtowns commented at 6:48 AM on June 5, 2026:

I think a notfound message would be better here, really. Failing that, perhaps the serving node should disconnect, rather than simply not responding? If you're trying to obtain your first utxo set, and this connection can't do it, there's not much benefit to either party in keeping the connection open, is there? Likewise for inability to respond to getutxoset requests.
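
A rough requester-side sketch of the difference. Everything here is hypothetical: `fetch_utxotree` and the `ask` callback are illustrative names, and a reply of `None` stands in for "no response before the timeout expired". It is not taken from the draft or the reference implementation.

```python
from typing import Callable, Optional, Tuple

# A reply is ("utxotree", payload), ("notfound", None), or None for a timeout.
Reply = Optional[Tuple[str, object]]


def fetch_utxotree(peers: list, ask: Callable[[str], Reply]):
    """Try peers in order; return (peer, payload, timed_out_peers).

    `ask(peer)` is a hypothetical blocking request that returns a Reply.
    """
    timed_out = []
    for peer in peers:
        reply = ask(peer)
        if reply is None:
            # Silent peer: we only learn this after waiting out the full timeout.
            timed_out.append(peer)
            continue
        kind, payload = reply
        if kind == "utxotree":
            return peer, payload, timed_out
        # Explicit notfound: move on to the next peer immediately.
    return None, None, timed_out
```

With an explicit negative reply (or a disconnect) the requester moves on right away; with silence, each unhelpful peer costs one full timeout before the next one is tried.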

ajtowns commented at 6:56 AM on June 5, 2026: contributor

I think this protocol should rely on the "merkle root hashes" it defines for identifying the utxo set (both at the protocol level and hardcoded in node implementations that are willing to use this protocol to get a particular utxo set), rather than block hashes. (see also)

Currently I think the protocol isn't quite secure unless nodes also hardcode and check the expected length of the decoded utxo set, and it would be better to include that in the merkle root hash calculation.

The "timeout and retry" on failure behaviour seems like it could be improved to me.

Otherwise, looks good.

in bip-XXXX.md:44 in 23c4b35ce9

  39 | +
  40 | +### Service Bit
  41 | +
  42 | +| Name | Bit | Description |
  43 | +|------|-----|-------------|
  44 | +| `NODE_UTXO_SET` | 14 (0x4000) | The node serves complete UTXO set data for the scheduled heights (see [Scheduled UTXO Set Heights](#scheduled-utxo-set-heights)). |

ajtowns commented at 10:58 AM on June 5, 2026:

Should setting this bit also indicate that you are able to serve blocks between H2 and the tip? That is up to ~30k blocks which might be 60GB to 120GB vs NODE_NETWORK_LIMITED which only guarantees 288 blocks. Or will nodes just need to find NODE_NETWORK peers to finish IBD in general?

(At height N=961631, H2 is 931392 which is 30,239 blocks earlier)
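
The arithmetic behind those figures, as a small sketch. `NODE_NETWORK` (bit 0) and `NODE_NETWORK_LIMITED` (bit 10, at least the last 288 blocks) are the existing service bits; `NODE_UTXO_SET` is the bit proposed in this draft. The 2 MB to 4 MB per-block range is an assumption inferred from the 60GB to 120GB estimate above, and `can_serve_gap` is a hypothetical helper.

```python
NODE_NETWORK = 1 << 0            # serves the full block history
NODE_NETWORK_LIMITED = 1 << 10   # serves at least the last 288 blocks
NODE_UTXO_SET = 1 << 14          # proposed in this draft

LIMITED_WINDOW = 288


def can_serve_gap(services: int, gap: int) -> bool:
    """Whether a peer's flags guarantee it can serve `gap` blocks below the tip."""
    if services & NODE_NETWORK:
        return True
    return bool(services & NODE_NETWORK_LIMITED) and gap <= LIMITED_WINDOW


gap = 961_631 - 931_392                      # blocks between H2 and the tip
low_gb = gap * 2_000_000 / 1e9               # assuming ~2 MB per block
high_gb = gap * 4_000_000 / 1e9              # assuming ~4 MB per block

pruned_server = NODE_NETWORK_LIMITED | NODE_UTXO_SET
full_server = NODE_NETWORK | NODE_UTXO_SET
```

Here `gap` is 30,239 and the size range comes out at roughly 60 GB to 121 GB. A peer advertising only `NODE_NETWORK_LIMITED` alongside `NODE_UTXO_SET` gives no guarantee for a gap that large, while a `NODE_NETWORK` peer does.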

murchandamus commented at 9:55 PM on June 5, 2026:

I don’t think this proposal needs to solve that, when NODE_NETWORK already solves it.

murchandamus added the label PR Author action required on Jun 5, 2026