ConnectBlock: don't serialize block hash twice #23819

pull jb55 wants to merge 2 commits into bitcoin:master from jb55:faster_block_connected_usdt changing 1 files +7 −4
  1. jb55 commented at 9:06 PM on December 19, 2021: contributor

    In the validation:block_connected tracepoint, we call block->GetHash(), which ends up calling CBlockHeader::GetHash(), executing around 8000 serialization instructions. We don't need to do this extra work, because block->GetHash() is already called further up in the function. Let's save that value as a local variable and re-use it in our tracepoint so there is no unnecessary tracepoint overhead.

    Shave off an extra 100 or so instructions from the validation:block_connected tracepoint by reusing a nearby GetTimeMicros(). This brings the tracepoint down to 54 instructions. Still high, but much better than the previous ~154 and 8000 instructions which it was originally.

    Signed-off-by: William Casarin jb55@jb55.com

  2. jb55 commented at 9:09 PM on December 19, 2021: contributor
  3. DrahtBot added the label Validation on Dec 19, 2021
  4. 0xB10C commented at 11:48 AM on December 20, 2021: contributor

    Concept ACK, will review in the next days.

    ref: #23724

  5. fanquake cross-referenced this on Dec 30, 2021 from issue build: add systemtap's sys/sdt.h as depends for GUIX builds with USDT tracepoints by 0xB10C
  6. theStack approved
  7. theStack commented at 9:55 PM on January 1, 2022: contributor

    Code-review ACK b60c8fdc0772e78cb4c8b937f35afb3e25d3df56

    Not related to the tracepoint overhead, but maybe still worth mentioning: note that in the PR branch, block.GetHash() is still called more than once in CChainState::ConnectBlock, as there is another invocation at the very start: https://github.com/bitcoin/bitcoin/blob/a1e04e10799791a728cce4430808517598c8742c/src/validation.cpp#L1869 (As far as I know, we enforce compiling with assertions on, i.e. NDEBUG never being set)

  8. jb55 force-pushed on Jan 4, 2022
  9. jb55 commented at 5:04 PM on January 4, 2022: contributor

    On Sat, Jan 01, 2022 at 01:55:29PM -0800, Sebastian Falbesoner wrote:

    Not related to the tracepoint overhead, but maybe still worth mentioning: note that in the PR branch, block.GetHash() is still called more than once in CChainState::ConnectBlock, as there is another invocation at the very start: https://github.com/bitcoin/bitcoin/blob/a1e04e10799791a728cce4430808517598c8742c/src/validation.cpp#L1869 (As far as I know, we enforce compiling with assertions on, i.e. NDEBUG never being set)

    great catch. I've moved the blockHash variable to above the assert, so we shave off ~16,000 instructions now.

  10. theStack approved
  11. theStack commented at 6:57 PM on January 4, 2022: contributor

    Code review re-ACK 8f8b1969e9d15117fe89ecc377d83e3f7b8dc5c1

    Slightly off-topic, but could you elaborate how exactly you measure the number of saved instructions (and on which architecture)? Do you statically analyse the compiled binary or do it dynamically via perf or alike?

  12. jb55 commented at 7:05 PM on January 4, 2022: contributor

    On Tue, Jan 04, 2022 at 10:57:28AM -0800, Sebastian Falbesoner wrote:

    Slightly off-topic, but could you elaborate how exactly you measure the number of saved instructions (and on which architecture)? Do you statically analyse the compiled binary or do it dynamically via perf or alike?

    See #23724 (comment) for the method and #23724 (comment) for where I determined that GetHash hits 8000 instructions.

  13. fanquake requested review from laanwj on Jan 5, 2022
  14. fanquake referenced this in commit 542e405a85 on Jan 10, 2022
  15. sidhujag referenced this in commit 5ca60a8fdc on Jan 10, 2022
  16. luke-jr approved
  17. luke-jr commented at 11:11 PM on January 11, 2022: member

    utACK

    I'm surprised we're not caching block hashes like we do for transactions, though?

  18. jb55 commented at 12:55 AM on January 12, 2022: contributor

    On Tue, Jan 11, 2022 at 03:11:55PM -0800, Luke Dashjr wrote:

    I'm surprised we're not caching block hashes like we do for transactions, though?

    Where is it cached for transactions? I see CMutableTransaction::GetHash but it looks roughly the same as CBlockHeader::GetHash.

  19. luke-jr commented at 1:02 AM on January 12, 2022: member

    Check CTransaction

  20. jb55 commented at 2:39 AM on January 12, 2022: contributor

    On Tue, Jan 11, 2022 at 05:03:06PM -0800, Luke Dashjr wrote:

    Check CTransaction

    gotcha, I see ComputeHash in the constructor. Yes it looks like CBlockHeader doesn't have caching, I'm guessing because it's a mutable thing?

  21. MarcoFalke renamed this:
    tracing/block_connected: don't serialize block hash twice
    ConnectBlock: don't serialize block hash twice
    on Jan 12, 2022
  22. MarcoFalke added the label Refactoring on Jan 12, 2022
  23. MarcoFalke commented at 7:48 AM on January 12, 2022: member

    Looks like this is also refactoring ConnectBlock, which is not tracing related? -> Renamed pull title

  24. in src/validation.cpp:1870 in 8f8b1969e9 outdated
    1865 | @@ -1866,7 +1866,10 @@ bool CChainState::ConnectBlock(const CBlock& block, BlockValidationState& state,
    1866 |  {
    1867 |      AssertLockHeld(cs_main);
    1868 |      assert(pindex);
    1869 | -    assert(*pindex->phashBlock == block.GetHash());
    1870 | +
    1871 | +    uint256 blockHash = block.GetHash();
    


    MarcoFalke commented at 7:51 AM on January 12, 2022:
        const uint256 block_hash{block.GetHash()};
    

    style nit

  25. MarcoFalke approved
  26. MarcoFalke commented at 7:52 AM on January 12, 2022: member

    lgtm. Left a style nit

  27. block_connected: don't serialize block hash twice
    In the validation:block_connected tracepoint, we call block->GetHash(),
    which ends up calling CBlockHeader::GetHash(), executing around 8000
    serialization instructions. We don't need to do this extra work, because
    block->GetHash() is already called further up in the function. Let's
    save that value as a local variable and re-use it in our tracepoint so
    there is no unnecessary tracepoint overhead.
    
    Signed-off-by: William Casarin <jb55@jb55.com>
    80e1c55687
  28. block_connected: re-use previous GetTimeMicros
    Shave off an extra 100 or so instructions from the
    validation:block_connected tracepoint by reusing a nearby
    GetTimeMicros(). This brings the tracepoint down to 54 instructions.
    Still high, but much better than the previous ~154 and 8000 instructions
    which it was originally.
    
    Signed-off-by: William Casarin <jb55@jb55.com>
    eb8b22d517
  29. jb55 force-pushed on Jan 12, 2022
  30. theStack approved
  31. theStack commented at 5:32 PM on January 12, 2022: contributor

    re-ACK eb8b22d5176d7abc6f93b4473df446105ca595e6

  32. 0xB10C commented at 11:15 AM on January 13, 2022: contributor

    ACK eb8b22d5176d7abc6f93b4473df446105ca595e6

    I'm measuring 7 instructions. Code changes look good to me. Still needs refactoring in the PR title :)

    (gdb) b src/validation:2161
    Breakpoint 1 at 0x2c57b0: file src/validation.cpp, line 2161.
    (gdb) b src/validation:
    Breakpoint 2 at 0x2c57de: file src/validation.cpp, line 2172.
    (gdb) run
    [..]
    
    Thread 22 "b-msghand" hit Breakpoint 1, CChainState::ConnectBlock (this=0x555555d19980, block=..., state=...,
        pindex=<optimized out>, view=..., fJustCheck=false) at validation.cpp:2161
    2161	    LogPrint(BCLog::BENCH, "    - Index writing: %.2fms [%.2fs (%.2fms/blk)]\n", MILLI * (nTime5 - nTime4), nTimeIndex * MICRO, nTimeIndex * MILLI / nBlocksTotal);
    (gdb) n
    2163	    TRACE6(validation, block_connected,
    (gdb) record btrace
    (gdb) c
    Continuing.
    
    Thread 22 "b-msghand" hit Breakpoint 2, CCheckQueueControl<CScriptCheck>::~CCheckQueueControl (
        this=<optimized out>, __in_chrg=<optimized out>) at validation.cpp:2172
    2172	    return true;
    (gdb) record info
    Undefined record command: "info".  Try "help record".
    (gdb) info record
    [..]
    Recorded 7 instructions in 1 functions (0 gaps) for thread 22 (Thread 0x7fff2e7fc640 (LWP 508792)).
    

    @jb55 you mention 54 instructions in eb8b22d5176d7abc6f93b4473df446105ca595e6. I think your measurements include the instructions from line 2161. I'm starting the measurement in line 2163.

    https://github.com/bitcoin/bitcoin/blob/eb8b22d5176d7abc6f93b4473df446105ca595e6/src/validation.cpp#L2161-L2172

  33. fanquake added this to the milestone 23.0 on Jan 18, 2022
  34. MarcoFalke cross-referenced this on Feb 8, 2022 from issue Enforce Taproot script flags whenever WITNESS is set by MarcoFalke
  35. laanwj commented at 1:09 PM on February 17, 2022: member

    Code review ACK eb8b22d5176d7abc6f93b4473df446105ca595e6

  36. laanwj merged this on Feb 17, 2022
  37. laanwj closed this on Feb 17, 2022

  38. sidhujag referenced this in commit 504a590bf7 on Feb 18, 2022
  39. JSKitty cross-referenced this on Oct 15, 2022 from issue backport: partial bitcoin#23819 ConnectBlock: don't serialize block hash twice by JSKitty
  40. PastaPastaPasta referenced this in commit 70717c901d on Oct 16, 2022
  41. Fabcien referenced this in commit c3ed96a41c on Nov 30, 2022
  42. andrewtoth cross-referenced this on Dec 31, 2022 from issue p2p: cache block hash in ProcessGetBlockData and CMPCTBLOCK processing by andrewtoth
  43. bitcoin locked this on Feb 17, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-20 06:53 UTC