net: cap future-MTP headers commitments #35208

l0rinc commented at 10:35 AM on May 5, 2026: contributor

Problem

Headers presync computes m_max_commitments from the elapsed time since the chain-start MTP plus MAX_FUTURE_BLOCK_TIME. When the local clock is more than MAX_FUTURE_BLOCK_TIME behind the chain-start MTP, that elapsed value is negative, but it is used in arithmetic assigned to the unsigned commitment cap. This can turn the intended zero bound into a large cap, letting low-work headers presync continue instead of aborting when the commitment cap is reached.

Fix

Assign an explicit zero commitment cap when the chain-start MTP is still too far in the future for a locally acceptable extension to exist.

The branch is split into two bisect-friendly commits: a regression test pinning the current behavior, and the minimal fix, which flips those test assertions.

The headers_sync_state fuzz target also now allows the same clock-skew case instead of always choosing a time at or after the chain-start MTP.

DrahtBot added the label P2P on May 5, 2026

DrahtBot commented at 10:36 AM on May 5, 2026: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35208.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	optout21, vasild, dergoegge
Approach NACK	hodlinator

If your review is incorrectly listed, please copy-paste <code></code> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#35114 (test: NodeClockContext follow-ups by seduless)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

in src/test/fuzz/headerssync.cpp:66 in 529757189b

  59 | @@ -60,7 +60,11 @@ FUZZ_TARGET(headers_sync_state, .init = initialize_headers_sync_state_fuzz)
  60 |      CBlockHeader genesis_header{Params().GenesisBlock()};
  61 |      CBlockIndex start_index(genesis_header);
  62 |  
  63 | -    NodeClockContext clock_ctx{ConsumeTime(fuzzed_data_provider, /*min=*/start_index.GetMedianTimePast())};
  64 | +    const int64_t chain_start_mtp{start_index.GetMedianTimePast()};
  65 | +    const NodeSeconds now{fuzzed_data_provider.ConsumeBool() ?
  66 | +                              ConsumeTime(fuzzed_data_provider, /*min=*/chain_start_mtp) :
  67 | +                              NodeSeconds{std::chrono::seconds{chain_start_mtp - MAX_FUTURE_BLOCK_TIME - 1}}};

maflcko commented at 12:49 PM on May 5, 2026:

nit: I think instead of a branch, this can be a plain ConsumeTime call with just the min adjusted to go further back. This makes the fuzz format a bit more static, and the fuzz engine should still be able to figure out the edge case. You may also extend the input space by allowing time further back than one MAX_FUTURE_BLOCK_TIME?

l0rinc force-pushed on May 5, 2026

dergoegge approved

dergoegge commented at 10:32 AM on May 8, 2026: member

utACK e201bdb74b994ebb34999a38612401167b02c4a9

optout21 commented at 2:41 PM on May 11, 2026: contributor

reACK 9274f925a7ca64d114e1172957a9f94e1ed1796b

Prev: ACK e201bdb74b994ebb34999a38612401167b02c4a9

Good catch of a boundary condition issue, fixed correctly.

Verified build and the test passing locally. Reproduced the failure of the new test with the pre-change production code. Note that the auto->int64_t change is not strictly needed, but it is correct, and it's safer to be explicit (the compiler resolves auto to signed int64_t (at least with c++/linux)).

Ran the test with these variations:

variant	type	conditional present	test result
pre-change	auto	no	fail
	uint64_t	no	fail
	int64_t	no	fail
	auto	yes	pass
	uint64_t	yes	fail
post-change	int64_t	yes	pass

in src/headerssync.cpp:43 in e201bdb74b outdated

  37 | @@ -38,9 +38,9 @@ HeadersSyncState::HeadersSyncState(NodeId id,
  38 |      // exceeds this bound, because it's not possible for a consensus-valid
  39 |      // chain to be longer than this (at the current time -- in the future we
  40 |      // could try again, if necessary, to sync a longer chain).
  41 | -    const auto max_seconds_since_start{(Ticks<std::chrono::seconds>(NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start.GetMedianTimePast()}}))
  42 | -                                       + MAX_FUTURE_BLOCK_TIME};
  43 | -    m_max_commitments = 6 * max_seconds_since_start / m_params.commitment_period;
  44 | +    const int64_t max_seconds_since_start{
  45 | +        Ticks<std::chrono::seconds>(NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start.GetMedianTimePast()}}) + MAX_FUTURE_BLOCK_TIME};
  46 | +    m_max_commitments = max_seconds_since_start > 0 ? 6 * max_seconds_since_start / m_params.commitment_period : 0;

hodlinator commented at 6:34 PM on May 11, 2026:

This approach of setting up m_max_commitments for a later failure feels safe in the context of pre-existing code, but the error is quite indirect, and the cause of the error becomes unclear.

Have you considered throwing an exception from the constructor instead?

<details><summary>Potential diff</summary>

diff --git a/src/headerssync.cpp b/src/headerssync.cpp
index 30a57702c8..a34fb37ca7 100644
--- a/src/headerssync.cpp
+++ b/src/headerssync.cpp
@@ -6,10 +6,13 @@
 
 #include <logging.h>
 #include <pow.h>
+#include <tinyformat.h>
 #include <util/check.h>
 #include <util/time.h>
 #include <util/vector.h>
 
+#include <stdexcept>
+
 // Our memory analysis in headerssync-params.py assumes this many bytes for a
 // CompressedHeader (we should re-calculate parameters if we compress further).
 static_assert(sizeof(CompressedHeader) == 48);
@@ -40,7 +43,10 @@ HeadersSyncState::HeadersSyncState(NodeId id,
     // could try again, if necessary, to sync a longer chain).
     const int64_t max_seconds_since_start{
         Ticks<std::chrono::seconds>(NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start.GetMedianTimePast()}}) + MAX_FUTURE_BLOCK_TIME};
-    m_max_commitments = max_seconds_since_start > 0 ? 6 * max_seconds_since_start / m_params.commitment_period : 0;
+    if (max_seconds_since_start <= 0) {
+        throw std::runtime_error{strprintf("System clock is earlier than chain-start's MTP minus %d seconds", MAX_FUTURE_BLOCK_TIME)};
+    }
+    m_max_commitments = 6 * max_seconds_since_start / m_params.commitment_period;
 
     LogDebug(BCLog::NET, "Initial headers sync started with peer=%d: height=%i, max_commitments=%i, min_work=%s\n", m_id, m_current_height, m_max_commitments, m_minimum_required_work.ToString());
 }
diff --git a/src/test/fuzz/headerssync.cpp b/src/test/fuzz/headerssync.cpp
index 742a496707..0c363c0183 100644
--- a/src/test/fuzz/headerssync.cpp
+++ b/src/test/fuzz/headerssync.cpp
@@ -16,6 +16,8 @@
 #include <validation.h>
 
 #include <iterator>
+#include <memory>
+#include <stdexcept>
 #include <vector>
 
 static void initialize_headers_sync_state_fuzz()
@@ -70,11 +72,16 @@ FUZZ_TARGET(headers_sync_state, .init = initialize_headers_sync_state_fuzz)
         .redownload_buffer_size = fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, Params().HeadersSync().redownload_buffer_size * 2),
     };
     arith_uint256 min_work{UintToArith256(ConsumeUInt256(fuzzed_data_provider))};
-    FuzzedHeadersSyncState headers_sync(
-        params,
-        /*commit_offset=*/fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, params.commitment_period - 1),
-        /*chain_start=*/start_index,
-        /*minimum_required_work=*/min_work);
+    std::unique_ptr<FuzzedHeadersSyncState> headers_sync;
+    try {
+        headers_sync = std::make_unique<FuzzedHeadersSyncState>(
+            params,
+            /*commit_offset=*/fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, params.commitment_period - 1),
+            /*chain_start=*/start_index,
+            /*minimum_required_work=*/min_work);
+    } catch (const std::runtime_error& e) {
+        assert(strcmp(e.what(), "System clock is earlier than chain-start's MTP minus 7200 seconds") == 0);
+    }
 
     // Store headers for potential redownload phase.
     std::vector<CBlockHeader> all_headers;
@@ -105,14 +112,14 @@ FUZZ_TARGET(headers_sync_state, .init = initialize_headers_sync_state_fuzz)
         }
 
         if (headers.empty()) return;
-        auto result = headers_sync.ProcessNextHeaders(headers, fuzzed_data_provider.ConsumeBool());
+        auto result = headers_sync->ProcessNextHeaders(headers, fuzzed_data_provider.ConsumeBool());
         requested_more = result.request_more;
 
         if (result.request_more) {
             if (presync) {
                 all_headers.insert(all_headers.cend(), headers.cbegin(), headers.cend());
 
-                if (headers_sync.GetState() == HeadersSyncState::State::REDOWNLOAD) {
+                if (headers_sync->GetState() == HeadersSyncState::State::REDOWNLOAD) {
                     presync = false;
                     redownloaded_it = all_headers.cbegin();
 
@@ -122,7 +129,7 @@ FUZZ_TARGET(headers_sync_state, .init = initialize_headers_sync_state_fuzz)
                 }
             }
 
-            (void)headers_sync.NextHeadersRequestLocator();
+            (void)headers_sync->NextHeadersRequestLocator();
         }
     }
 }
diff --git a/src/test/headers_sync_chainwork_tests.cpp b/src/test/headers_sync_chainwork_tests.cpp
index a4b917fada..dd71f8a2a2 100644
--- a/src/test/headers_sync_chainwork_tests.cpp
+++ b/src/test/headers_sync_chainwork_tests.cpp
@@ -14,6 +14,7 @@
 #include <validation.h>
 
 #include <cstddef>
+#include <stdexcept>
 #include <vector>
 
 #include <boost/test/unit_test.hpp>
@@ -145,13 +146,8 @@ BOOST_FIXTURE_TEST_SUITE(headers_sync_chainwork_tests, HeadersGeneratorSetup)
 BOOST_AUTO_TEST_CASE(future_chain_start_mtp_bounds_commitments)
 {
     const NodeClockContext clock_ctx{std::chrono::seconds{genesis.GetBlockTime() - MAX_FUTURE_BLOCK_TIME - 1}};
-    HeadersSyncState hss{CreateState(/*commitment_period=*/1)};
-
-    CHECK_RESULT(hss.ProcessNextHeaders({{FirstChain().front()}}, /*full_headers_message=*/true),
-        hss, /*exp_state=*/State::FINAL,
-        /*exp_success=*/false, /*exp_request_more=*/false,
-        /*exp_headers_size=*/0, /*exp_pow_validated_prev=*/std::nullopt,
-        /*exp_locator_hash=*/std::nullopt);
+    BOOST_CHECK_EXCEPTION(CreateState(/*commitment_period=*/1), std::runtime_error,
+        HasReason("System clock is earlier than chain-start's MTP minus 7200 seconds"));
 }
 
 BOOST_AUTO_TEST_CASE(sneaky_redownload)

</details>

If we figure out the current approach in the PR is more desirable, maybe we could repeat the m_max_commitments value in the log output when aborting and raise the log level to info?

-            LogDebug(BCLog::NET, "Initial headers sync aborted with peer=%d: exceeded max commitments at height=%i (presync phase)\n", m_id, next_height);
+            LogInfo("Initial headers sync aborted with peer=%d: exceeded max commitments (%d) at height=%i (presync phase)", m_id, m_max_commitments, next_height);

A third alternative would be to emit a LogWarning() when we detect this condition in the constructor, I think that's more appropriate if you want to avoid adding the exception.

l0rinc commented at 10:01 PM on May 11, 2026:

Thanks for the review! I liked the logging ideas, they fit the existing method better than exceptions (which would also need a unique_ptr + try/catch in the fuzz target).

Split into three commits following #35260: test pinning current behavior, minimal fix (cap at zero + upfront abort, test assertions flip), and a final commit unifying the logs (new LogWarning for the clock-skew abort, non-continuous-headers raised to LogWarning, and exceeded max commitments to LogInfo with m_max_commitments included).

hodlinator commented at 7:51 AM on May 12, 2026:

Thanks for incorporating my feedback (so far ;) ).

A few more suggestions:

Adding a pure refactor commit which makes HeadersSyncState::m_max_commitments const and extracts the calculation to a local function (ComputeMaxCommitments()).
Add a warning in ComputeMaxCommitments() that indicates that we expect syncing to fail later due to the system clock vs chain start MTP being off.
Introduce the warning in m_max_commitments == 0 in earlier commit (see other comment).
Apply the LogDebug -> LogInfo change to all abort reasons and adjust commit message to mention rate limiting etc.

https://github.com/bitcoin/bitcoin/compare/master...hodlinator:bitcoin:pr/35208_suggestions2

git range-diff d84c8cc6...78f53789 --creation-factor=90

l0rinc commented at 8:18 AM on May 12, 2026:

This PR is meant to fix a security issue, I would prefer doing the refactors (which sound reasonable to me) in a follow-up instead.

hodlinator commented at 6:45 PM on May 11, 2026: contributor

Concept ACK e201bdb74b994ebb34999a38612401167b02c4a9

Good to tighten up the code!

l0rinc force-pushed on May 11, 2026

in src/headerssync.cpp:158 in d84c8cc6ec

 154 |          // Somehow our peer gave us a header that doesn't connect.
 155 |          // This might be benign -- perhaps our peer reorged away from the chain
 156 |          // they were on. Give up on this sync for now (likely we will start a
 157 |          // new sync with a new starting point).
 158 | -        LogDebug(BCLog::NET, "Initial headers sync aborted with peer=%d: non-continuous headers at height=%i (presync phase)\n", m_id, m_current_height);
 159 | +        LogWarning("Initial headers sync aborted with peer=%d: non-continuous headers at height=%i (presync phase)", m_id, m_current_height);

hodlinator commented at 6:35 AM on May 12, 2026:

Not sure why you're touching this case of a remote peer sending us non-contiguous headers. If you insist, should be LogInfo() as LogWarning() is currently intended for local configuration-errors. As per developer-notes.md:

LogWarning(fmt, params...) should be used in place of LogInfo for severe problems that the node admin should address, but are not severe enough to warrant shutting down the node (e.g., system time appears to be wrong, unknown soft fork appears to have activated).

The above also means that using LogWarning() for the m_max_commitments == 0 case is encouraged. 👍

l0rinc commented at 8:00 AM on May 12, 2026:

Reverted to debug, the logging changes can be done in a follow-up - ping me if you push it.

in src/headerssync.cpp:150 in d84c8cc6ec outdated

 144 | @@ -145,12 +145,17 @@ bool HeadersSyncState::ValidateAndStoreHeadersCommitments(std::span<const CBlock
 145 |      Assume(m_download_state == State::PRESYNC);
 146 |      if (m_download_state != State::PRESYNC) return false;
 147 |  
 148 | +    if (m_max_commitments == 0) {
 149 | +        LogWarning("Initial headers sync aborted with peer=%d: local clock behind chain-start MTP (presync phase)", m_id);
 150 | +        return false;

hodlinator commented at 6:46 AM on May 12, 2026:

nit: I think it makes more sense to add the LogWarning() in the same commit as return false. The conditions above are ones with Assume()s and are more like logic errors, hence the lack of logging.

l0rinc commented at 8:18 AM on May 12, 2026:

Done

l0rinc force-pushed on May 12, 2026

hodlinator approved

hodlinator commented at 11:42 AM on May 12, 2026: contributor

ACK f90578b977758ffa19b623747a8faa083dc61f57

It's unfortunate that we don't error out from the constructor and instead set the invalid value 0, but as already discussed (https://github.com/bitcoin/bitcoin/pull/35208#discussion_r3221253576), doing so might be very disruptive to surrounding code. The new abort condition in HeadersSyncState::ValidateAndStoreHeadersCommitments() ensures we fail early during the sync with a clear error at least.

DrahtBot requested review from dergoegge on May 12, 2026

DrahtBot requested review from optout21 on May 12, 2026

in src/test/fuzz/headerssync.cpp:63 in f90578b977 outdated

  59 | @@ -60,7 +60,7 @@ FUZZ_TARGET(headers_sync_state, .init = initialize_headers_sync_state_fuzz)
  60 |      CBlockHeader genesis_header{Params().GenesisBlock()};
  61 |      CBlockIndex start_index(genesis_header);
  62 |  
  63 | -    NodeClockContext clock_ctx{ConsumeTime(fuzzed_data_provider, /*min=*/start_index.GetMedianTimePast())};
  64 | +    NodeClockContext clock_ctx{ConsumeTime(fuzzed_data_provider, /*min=*/start_index.GetMedianTimePast() - 2 * MAX_FUTURE_BLOCK_TIME)};

vasild commented at 1:13 PM on May 13, 2026:

I guess 2 is just arbitrary here, right? Might as well have been e.g. 3 or 4?

hodlinator commented at 6:57 PM on May 13, 2026:

It's just to go beyond the MAX_FUTURE_BLOCK_TIME in src/headerssync.cpp.

l0rinc commented at 9:30 PM on May 13, 2026:

Yes, see @maflcko's comment in #35208 (review)

in src/headerssync.cpp:42 in f90578b977

  37 | @@ -38,9 +38,9 @@ HeadersSyncState::HeadersSyncState(NodeId id,
  38 |      // exceeds this bound, because it's not possible for a consensus-valid
  39 |      // chain to be longer than this (at the current time -- in the future we
  40 |      // could try again, if necessary, to sync a longer chain).
  41 | -    const auto max_seconds_since_start{(Ticks<std::chrono::seconds>(NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start.GetMedianTimePast()}}))
  42 | -                                       + MAX_FUTURE_BLOCK_TIME};
  43 | -    m_max_commitments = 6 * max_seconds_since_start / m_params.commitment_period;
  44 | +    const int64_t max_seconds_since_start{
  45 | +        Ticks<std::chrono::seconds>(NodeClock::now() - NodeSeconds{std::chrono::seconds{chain_start.GetMedianTimePast()}}) + MAX_FUTURE_BLOCK_TIME};

vasild commented at 1:27 PM on May 13, 2026:

This:

Removes the unnecessary braces around Ticks: (Ticks<...>(...)) -> Ticks<...>(...).
Changes auto to int64_t (before the variable ended as long long and after - just long). The commit message reads "Keep the elapsed value signed ...", however that implies that before this PR the value was changed from signed to unsigned. Correct me if I am wrong, but this is not the case?
Reformats the entire expression.

All 3 seem unnecessary. I would revert the changes to this line. It is confusing why change those if not needed.

l0rinc commented at 9:32 PM on May 13, 2026:

The int64_t was made explicit to help us see the signed-to-unsigned change here. The formatting and parentheses are easy to ignore; it's quite a simple expression.

l0rinc commented at 10:09 AM on May 14, 2026:

Reverted the formatting and parenthesis but kept the auto to int64_t change and explained it in the commit message.

hodlinator commented at 5:51 PM on May 19, 2026:

You're still changing the indent of + MAX_FUTURE_BLOCK_TIME - not sure what column you are aiming for? Better to change it to something sane or avoid touching that line.

https://github.com/bitcoin/bitcoin/blob/9274f925a7ca64d114e1172957a9f94e1ed1796b/src/headerssync.cpp#L41-L42

l0rinc commented at 12:52 PM on May 20, 2026:

I’d prefer not to keep iterating on whitespace here. This line is touched because the PR intentionally makes the signed elapsed value explicit for the security fix; the continuation indentation is incidental and does not change behavior. I’d rather keep this PR focused on the bug fix than force-push another version for formatting churn.

in src/headerssync.cpp:149 in f90578b977

 144 | @@ -145,6 +145,11 @@ bool HeadersSyncState::ValidateAndStoreHeadersCommitments(std::span<const CBlock
 145 |      Assume(m_download_state == State::PRESYNC);
 146 |      if (m_download_state != State::PRESYNC) return false;
 147 |  
 148 | +    if (m_max_commitments == 0) {
 149 | +        LogWarning("Initial headers sync aborted with peer=%d: local clock behind chain-start MTP (presync phase)", m_id);

vasild commented at 3:13 PM on May 13, 2026:

Why LogWarning()? The code just below uses LogDebug(BCLog::NET, "Initial headers sync aborted with peer=%d.... Also the log in HeadersSyncState::ValidateAndProcessSingleHeader() due to exceeded m_max_commitments uses LogDebug().

hodlinator commented at 6:55 PM on May 13, 2026:

m_max_commitments == 0 only happens when the system clock is really off compared to the MTP of the starting block. Thus the issue is local to the machine running the code, making a LogWarning() motivated, so hopefully the user can rectify the situation. In other cases where we abort it is rather the remote peer which is misbehaving, so a warning would be wrong.

vasild commented at 8:48 AM on May 14, 2026:

I agree. Just saying that in master, in ValidateAndProcessSingleHeader() if m_max_commitments is 0 then LogDebug() will be used. Maybe that deserves an upgrade to LogWarning() as you describe, but that is unrelated to the integer underflow. Also, that is not strictly just for 0, smaller values of m_max_commitments indicate that the local clock is close to 2h behind MTP, e.g. 1h48min. Does that also deserve a warning?

The message could be made more useful by showing the timestamps as well, e.g.: "local clock (value of local clock here) more than 2 hours behind chain-start MTP (value of MTP + height here)"

l0rinc commented at 10:10 AM on May 14, 2026:

I have reverted this, we can adjust in a follow-up. Added back the commitment_period to the test to exercise this path.

in src/headerssync.cpp:151 in f90578b977

 144 | @@ -145,6 +145,11 @@ bool HeadersSyncState::ValidateAndStoreHeadersCommitments(std::span<const CBlock
 145 |      Assume(m_download_state == State::PRESYNC);
 146 |      if (m_download_state != State::PRESYNC) return false;
 147 |  
 148 | +    if (m_max_commitments == 0) {
 149 | +        LogWarning("Initial headers sync aborted with peer=%d: local clock behind chain-start MTP (presync phase)", m_id);
 150 | +        return false;
 151 | +    }

vasild commented at 3:16 PM on May 13, 2026:

I do not fully understand why this addition is necessary. The commit message reads:

Abort presync upfront on the first header batch in that case with a LogWarning, instead of waiting for the random commitment offset to reach a commitment slot.

But that does not quite answer the question "Why?" for me. Is it just an optimization?

This is not strictly related to the unsigned underflow fixed in the constructor of HeadersSyncState. The fix is legit even without this.

Further, is this causing an unintended change of behavior:

before this PR if m_max_commitments ended up 0 (the current time is exactly 2h behind MTP) all calls to ValidateAndProcessSingleHeader() might have returned true and thus ValidateAndStoreHeadersCommitments() might have returned true as well.
after this PR if m_max_commitments ends up 0, then ValidateAndStoreHeadersCommitments() will never return true.

l0rinc commented at 9:37 PM on May 13, 2026:

It is not required to fix, just to detect and log the inconsistency early. Letting ValidateAndStoreHeadersCommitments() return true in that state would keep presync alive despite us already knowing no valid commitment can be reached under the current local time bound.

vasild commented at 5:14 AM on May 14, 2026:

What about if the job is done (we get out of presync because enough PoW was reached) even if m_max_commitments would have been triggered but was not because the check is not done for every block:

    if (next_height % m_params.commitment_period == m_commit_offset) { 
        m_header_commitments.push_back(m_hasher(current.GetHash()) & 1);
        if (m_header_commitments.size() > m_max_commitments) { // return false, abort
            ...
        } 
    } 

    m_current_chain_work += GetBlockProof(current);

Correct me if I am wrong. It looks to me that in master, m_header_commitments might be empty and m_max_commitments equal to 0 in which case the next time the first if above is entered, we would abort the operation. However that first if is not entered for every block and the current chain work grows for every block. So, it might happen that we successfully get out of presync after processing a few blocks even when m_header_commitments.size() == m_max_commitments == 0. And this PR changes that.

l0rinc commented at 10:11 AM on May 14, 2026:

I have reverted this to avoid diverting the change, we can do these in a follow-up instead, thanks for the comments.

hodlinator commented at 5:59 PM on May 19, 2026:

m_max_commitments == 0 rightfully is a valid value on master is an excellent point that I missed during earlier review. Credits to @vasild!

It highlights that attempting to set an invalid value to induce a later failure is brittle.

What is a valid number of header commitments if our system time is stuck at Jan 1st 1970? Surely it is not zero?

In kernel/chainparams.cpp we have:

consensus.nMinimumChainWork = uint256{"0000000000000000000000000000000000000001128750f82f4c366153a3a030"};
consensus.defaultAssumeValid = uint256{"00...177c478e094bfad51ba5ac"}; // 938343

If we had the block height of 938343 as a constant instead of a comment we could fall back to using that to calculate roughly how many commitments we need from the chain_start height (when the system clock is unreliable). 938343 is the height we reach the minimum chainwork, unless some major re-org action has been happening.

With an upper bound of zero m_max_commitments, the commitment_period should eventually cause us to abort upon pushing the first one. But this seems very sloppy unless I'm misunderstanding something? I don't see a reason to rush-ACK + merge this in it's current state.

If we really want to keep the invalid value with later abort, maybe m_max_commitments could be made std::optional, or we could set it to some invalid value like ~0ULL and add back the new early abort.

l0rinc commented at 2:03 PM on May 20, 2026:

The goal of this PR is to fix a potential underflow, not fix every problem with headers sync commitment calculation algorithm.

hodlinator commented at 7:25 PM on May 20, 2026:

Please see my suggestions branch at https://github.com/bitcoin/bitcoin/compare/master...hodlinator:bitcoin:pr/35208_suggestions

It uses nullopt to flag that m_max_commitments is unusable instead of setting it to 0 which is a valid value for when the system clock is exactly MAX_FUTURE_BLOCK_TIME behind the chain start, found by @vasild.
It re-adds the early abort in HeadersSyncState::ValidateAndStoreHeadersCommitments() similar to your f90578b977758ffa19b623747a8faa083dc61f57 push which I previously acked. Aborting in this place should be minimally disruptive to surrounding code, and we are able to log a clear reason for why its happening. It also means we abort reliably, unlike with the current (9274f925a7ca64d114e1172957a9f94e1ed1796b) which may succeed, see inline comment about the commit message.
The current state of this PR manipulates the commitment period in the test to the unrealistic value of 1 which makes HeadersSyncState::ValidateAndProcessSingleHeader() fail on the first header whereas in 99.9+% of real-world scenarios it would not. In my branch there is no need for manipulating the period.
The test case is added after the others in headers_sync_chainwork_tests.cpp, see other inline comment.

vasild commented at 3:49 PM on May 13, 2026: contributor

Concept ACK as of f90578b977

I think a simpler one-liner would suffice:

- m_max_commitments = 6 * max_seconds_since_start / m_params.commitment_period;
+ m_max_commitments = max_seconds_since_start > 0 ? 6 * max_seconds_since_start / m_params.commitment_period : 0;

and I find it hard to fully grasp the implication of the other changes.

l0rinc commented at 9:37 PM on May 13, 2026: contributor

Thanks for the comments, I'll investigate them in more detail soon.

test: cover future chain-start MTP boundary in headers presync

When the local clock sits more than `MAX_FUTURE_BLOCK_TIME` behind the chain-start MTP, the elapsed value used to derive `m_max_commitments` is negative.
The later division by the unsigned commitment period converts that negative value to a large unsigned bound.
With `commitment_period=1`, presync accepts the peer's first commitment slot and requests more, where it should abort.

Pin that pre-fix behavior in a regression test that the subsequent fix will flip.

3a70822391

net: cap future-MTP headers commitments at zero

Make the elapsed-time value as `int64_t` explicit for clarity and assign an explicit zero commitment cap when the chain-start MTP is still too far in the future.
This avoids converting a negative elapsed value through the unsigned commitment-period arithmetic into a large `m_max_commitments` bound.

With a zero cap, the existing commitment-limit check aborts once the first commitment slot is reached.
Flip the regression test to expect that abort, and broaden the `headers_sync_state` fuzz target so it covers the same clock-skew case.

Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>

9274f925a7

l0rinc force-pushed on May 14, 2026

vasild approved

vasild commented at 1:02 PM on May 14, 2026: contributor

ACK 9274f925a7ca64d114e1172957a9f94e1ed1796b

I was wondering how to exercise this on master without any changes to the non-test-code (like the early quit when max commitments is zero) and couldn't figure it out. Setting commitment period to 1 is cool - it forces the random commit offset to be 0 and so next_height % m_params.commitment_period == m_commit_offset becomes next_height % 1 == 0 which is always true.

DrahtBot requested review from hodlinator on May 14, 2026

l0rinc requested review from maflcko on May 18, 2026

dergoegge commented at 4:02 PM on May 19, 2026: member

utACK 9274f925a7ca64d114e1172957a9f94e1ed1796b

hodlinator changes_requested

hodlinator commented at 6:22 PM on May 19, 2026: contributor

Approach NACK 9274f925a7ca64d114e1172957a9f94e1ed1796b

in src/test/headers_sync_chainwork_tests.cpp:145 in 9274f925a7

 141 | @@ -141,6 +142,18 @@ std::vector<CBlockHeader> HeadersGeneratorSetup::GenerateHeaders(
 142 |  //    when the chain a peer provides has too little work.
 143 |  BOOST_FIXTURE_TEST_SUITE(headers_sync_chainwork_tests, HeadersGeneratorSetup)
 144 |  
 145 | +BOOST_AUTO_TEST_CASE(future_chain_start_mtp_bounds_commitments)

hodlinator commented at 6:53 PM on May 20, 2026:

Please put this test case after the pre-existing ones or in another file. There's a comment block right above which describes the test suites and the existing test cases, and splicing this in before they start makes things disorderly.

https://github.com/bitcoin/bitcoin/blob/9274f925a7ca64d114e1172957a9f94e1ed1796b/src/test/headers_sync_chainwork_tests.cpp#L133-L146

Sorry for not catching this earlier.

in src/headerssync.cpp:1 in 9274f925a7

hodlinator commented at 7:21 PM on May 20, 2026:

The commit message in 9274f925a7ca64d114e1172957a9f94e1ed1796b states:

With a zero cap, the existing commitment-limit check aborts once the first commitment slot is reached.

It is not necessarily true that we will abort, in fact it is somewhat unlikely in this case:

A node manages to complete IBD with a valid system clock. (Maybe the datadir is copied from another machine).
It is then only intermittently online, now with an invalid clock.
If we go offline for more than 144 blocks, then came back online, we would not meet the dynamic minimum_chain_work threshold in PeerManagerImpl::TryLowWorkHeadersSync() and therefore re-enter HeadersSyncState even though we are outside IBD.
It depends on whether the range of block heights we sync headers for in order to reach the minimum chainwork do overlap with the commitment_period (currently 641 blocks on mainnet).
If there is no overlap, we never try to store a commitment, and the node will complete the HeadersSyncState successfully.

l0rinc commented at 7:41 PM on May 20, 2026: contributor

@hodlinator has worked more with headers than I have, has stronger opinions about how it should be done, and already has a branch ready with the alternative. Could you please open a new PR that reflects the way you imagine it? I'll review it - seems cleaner than going back and forth here.

l0rinc closed this on May 20, 2026

hodlinator commented at 8:07 PM on May 21, 2026: contributor

Thanks @l0rinc, sorry for any frustration. Opened #35351 which does take a more hardline approach of not allowing HeadersSyncState to complete construction if the system clock is lagging far behind, at least in the first published iteration.

l0rinc deleted the branch on Jul 1, 2026