refactor: Replace boost::mutex,condition_var,chrono with std equivalents in scheduler #18234

pull ajtowns wants to merge 6 commits into bitcoin:master from ajtowns:202002-scheduler-deboost changing 14 files +151 −117
  1. ajtowns commented at 5:09 AM on March 1, 2020: contributor

    Replacing boost functionality with C++11 stuff.

    Motivated by #18227, but should stand alone. Changing from boost::condition_var to std::condition_var means threadGroup.interrupt_all isn't enough to interrupt serviceQueue anymore, so that means calling stop() before join_all() is needed. And the existing reverselock.h code doesn't work with sync.h's DebugLock code (because the reversed lock won't be removed from g_lockstack which then leads to incorrect potential deadlock warnings), so I've replaced that with a dedicated class and macro that's aware of our debug lock behaviour.

    Fixes #16027, Fixes #14200, Fixes #18227

  2. fanquake added the label Refactoring on Mar 1, 2020
  3. ajtowns cross-referenced this on Mar 1, 2020 from issue scheduler test causing pthread_cond_timedwait: Invalid argument error by amitiuttarwar
  4. fanquake commented at 5:14 AM on March 1, 2020: member

    cc @theuni. I know you have been working on similar refactors recently.

  5. ajtowns force-pushed on Mar 1, 2020
  6. ajtowns force-pushed on Mar 1, 2020
  7. DrahtBot commented at 10:03 AM on March 1, 2020: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #18271 (scheduler: Workaround negative nsecs bug in boost's wait_until by luke-jr)
    • #18264 ([WIP] build: Remove Boost Chrono by fanquake)
    • #16117 (util: Replace boost sleep with std sleep by MarcoFalke)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  8. DrahtBot cross-referenced this on Mar 1, 2020 from issue util: Replace boost sleep with std sleep by MarcoFalke
  9. in src/sync.h:213 in 3cb2664971 outdated
     208 | +        const int line;
     209 | +     };
     210 | +     friend class reverse_lock;
     211 |  };
     212 |  
     213 | +#define REVERSE_LOCK(g) decltype(g)::reverse_lock PASTE2(revlock, __COUNTER__)(g, #g, __FILE__, __LINE__)
    


    elichai commented at 1:29 PM on March 1, 2020:

    Pending comments on my post here: #18088 (comment) Maybe replace it with PASTE2(revlock,PASTE2(__FILE__,__LINE__))


    ajtowns commented at 1:40 AM on March 2, 2020:

    We already use __COUNTER__ for LOCK() in sync.h, so reusing it here seems fine. Using __FILE__ seems unlikely to generate a usable symbol name too :)

  10. in src/sync.cpp:63 in 42a5326a4d outdated
      59 | @@ -60,6 +60,11 @@ struct CLockLocation {
      60 |              mutexName, sourceFile, itostr(sourceLine), (fTry ? " (TRY)" : ""), m_thread_name);
      61 |      }
      62 |  
      63 | +    std::string Name() const
    


    elichai commented at 2:24 PM on March 1, 2020:

    Maybe change to const std::string& Name() so that it will only copy/allocate it if needed?


    MarcoFalke commented at 3:04 PM on March 1, 2020:

    elichai commented at 3:10 PM on March 1, 2020:

    oh god. thanks. (for some reason I thought/hoped C++ helps you making sure you're not violating the lifetime when using references) I think I should stop using Rust lol

  11. practicalswift commented at 8:14 PM on March 1, 2020: contributor

    Concept ACK

    Thanks for working on de-boosting! :)

  12. in src/test/txindex_tests.cpp:75 in 42a5326a4d outdated
      69 | @@ -70,9 +70,6 @@ BOOST_FIXTURE_TEST_CASE(txindex_initial_sync, TestChain100Setup)
      70 |      // shutdown sequence (c.f. Shutdown() in init.cpp)
      71 |      txindex.Stop();
      72 |  
      73 | -    threadGroup.interrupt_all();
      74 | -    threadGroup.join_all();
      75 | -
    


    jonatack commented at 2:22 PM on March 2, 2020:

    In 69460c9 could these be also removed in src/test/transaction_tests.cpp L::457-458 if they are handled with TestingSetup::~TestingSetup(); the test compiles and runs without them.

    --- a/src/test/transaction_tests.cpp
    +++ b/src/test/transaction_tests.cpp
    @@ -453,9 +453,6 @@ BOOST_AUTO_TEST_CASE(test_big_witness_transaction)
    
         bool controlCheck = control.Wait();
         assert(controlCheck);
    -
    -    threadGroup.interrupt_all();
    -    threadGroup.join_all();
     }
    

    MarcoFalke commented at 5:08 PM on March 2, 2020:

    Not sure why this should be safe to remove. The scheduler needs to be stopped before this test ends, so the interrupt_all should be replaced by a scheduler.stop(). I think that otherwise you reintroduce bug #15410


    jonatack commented at 5:19 PM on March 2, 2020:

    Thanks. I was just now looking at src/test/checkqueue_tests.cpp which has similar code.


    ajtowns commented at 3:36 AM on March 4, 2020:

    @jonatack In transaction_tests, that's a locally declared threadGroup, and it's a BasicTestingSetup too, so the TestingSetup destructor wouldn't execute, and even if it did, wouldn't clean up that threadGroup. @MarcoFalke Ah, you're quite right. Have undeleted the code and added a comment.


    jonatack commented at 2:57 PM on March 5, 2020:

    @ajtowns right, makes sense. Thanks for the explanation.


    MarcoFalke commented at 3:00 PM on March 17, 2020:

    in commit 306f71b4eb4a0fd8e64f47dc008bc235b80b13d9:

    After adding the scheduler.stop(), they are again redundant and confusing. I'd suggest to remove the threadGroup calls


    ajtowns commented at 6:32 PM on March 17, 2020:

    Calling stop() doesn't actually cause the thread to stop, you need to wait for it to complete. Could do that by looping and waiting for AreThreadsServicingQueue() to be false, but that still leaves the possibility you continue on and destruct the object in between --nThreadsServicingQueue; and newTaskScheduled.notify_one(); at the end of serviceQueue. Could drop the interrupt_all but if some other thread in the group needed an interrupt before being joined, that'd cause a hang too... So until we change something else, this doesn't seem redundant to me?


    MarcoFalke commented at 6:34 PM on March 17, 2020:

    Ups, correct.

  13. jonatack commented at 4:16 PM on March 2, 2020: contributor

    ACK 42a5326 I reviewed the changes and surrounding code in detail, built/ran tests for each commit, ran bitcoind for 2 days (debian), and shut down/restarted bitcoind several times. This looks good.

  14. theuni commented at 4:30 PM on March 2, 2020: member

    This doesn't seem to take into account that boost::condition_variable is interruptible, and can't simply be swapped out. Have I missed some prior work that makes this ok?

    Ok, I see that in the description. Re-reviewing.

  15. theuni commented at 5:03 PM on March 2, 2020: member

    I would've expected this to require more work, but it looks like the scheduler was already surprisingly interruptible. Concept ACK. @TheBlueMatt might be aware of more scheduling minefields.

  16. in src/scheduler.cpp:18 in 42a5326a4d outdated
      14 | @@ -16,22 +15,14 @@ CScheduler::CScheduler() : nThreadsServicingQueue(0), stopRequested(false), stop
      15 |  
      16 |  CScheduler::~CScheduler()
      17 |  {
      18 | +    LOCK(newTaskMutex);
    


    theuni commented at 5:50 PM on March 2, 2020:

    Adding a lock in a destructor just for an assertion is a shame. Maybe this should be behind a debug guard?


    ajtowns commented at 5:27 AM on March 3, 2020:

    It's probably not useful in the first place, the lock's about to be destroyed, so if anyone else even has a reference to it there's a bug, and threadsafety annotations don't apply to constructors and destructors for the same reason.

  17. in src/init.cpp:209 in 42a5326a4d outdated
     206 | @@ -207,6 +207,7 @@ void Shutdown(NodeContext& node)
     207 |  
     208 |      // After everything has been shut down, but before things get flushed, stop the
     209 |      // CScheduler/checkqueue threadGroup
     210 | +    if (node.scheduler) node.scheduler->stop();
    


    theuni commented at 6:52 PM on March 2, 2020:

    Are these guaranteed to fully execute in-order? Can an interruption-point hit while the scheduler is still tearing down?


    ajtowns commented at 4:02 AM on March 4, 2020:

    Not sure what you're asking; node.scheduler->stop() just sets a boolean to say "exit the loop" and notifies everything waiting on the condition var to prevent it staying asleep, all the tear down happens in the thread. The interruption could certainly happen while it's tearing down, stop(); interrupt_all() happen in one thread, and only afterwards does the other thread even wake up. I don't think boost interrupt will have an effect anymore though, because serviceQueue doesn't use any boost primitives and boost interrupt only sets a flag which is checked by other boost things.

  18. MarcoFalke renamed this:
    scheduler.cpp: Replace boost::mutex,condition_var,chrono with std equivalents
    refactor: Replace boost::mutex,condition_var,chrono with std equivalents in scheduler
    on Mar 3, 2020
  19. MarcoFalke cross-referenced this on Mar 3, 2020 from issue Future PRs by jnewbery
  20. ajtowns force-pushed on Mar 4, 2020
  21. MarcoFalke added this to the "In progress" column in a project

  22. fanquake cross-referenced this on Mar 5, 2020 from issue build: Remove Boost Chrono by fanquake
  23. MarcoFalke added this to the milestone 0.20.0 on Mar 5, 2020
  24. MarcoFalke cross-referenced this on Mar 5, 2020 from issue scheduler: crash after releasing wallet by fanquake
  25. MarcoFalke commented at 9:03 PM on March 5, 2020: member

    Should add

    Fixes [#16027](/github-metadata-backup-bitcoin-bitcoin/16027/), Fixes [#14200](/github-metadata-backup-bitcoin-bitcoin/14200/), Fixes [#18227](/github-metadata-backup-bitcoin-bitcoin/18227/)
    

    to OP?

  26. DrahtBot cross-referenced this on Mar 6, 2020 from issue scheduler: Workaround negative nsecs bug in boost's wait_until by luke-jr
  27. fanquake commented at 7:42 AM on March 6, 2020: member

    @ajtowns Can you rebase on master now that #16117 is merged. Also, feel free to include a commit here to kill the last DHAVE_WORKING_BOOST_SLEEP_FOR.

  28. DrahtBot added the label Needs rebase on Mar 6, 2020
  29. scheduler: don't rely on boost interrupt on shutdown
    Calling interrupt_all() will immediately stop the scheduler, so it's
    safe to invoke stop() beforehand, and this removes the reliance on boost
    to interrupt serviceQueue().
    306f71b4eb
  30. sync.h: add REVERSE_LOCK b9c4260127
  31. scheduler: switch from boost to std
    Changes from boost::chrono to std::chrono, boost::condition_var to
    std::condition_var, boost::mutex to sync.h Mutex, and reverselock.h to
    sync.h REVERSE_LOCK. Also adds threadsafety annotations to CScheduler
    members.
    d0ebd93270
  32. Drop unused reverselock.h cea19f6859
  33. scheduler_tests: re-enable mockforward test 294937b39d
  34. MarcoFalke removed the label Needs rebase on Mar 6, 2020
  35. MarcoFalke added the label Needs rebase on Mar 6, 2020
  36. lint-cppcheck: Remove -DHAVE_WORKING_BOOST_SLEEP_FOR 70a6b529f3
  37. ajtowns force-pushed on Mar 6, 2020
  38. DrahtBot removed the label Needs rebase on Mar 6, 2020
  39. laanwj commented at 7:08 PM on March 6, 2020: member

    Great work! I'm really glad this is happening, thanks for working on this. ACK 70a6b529f306ff72ea1badf25e970a92b2b17ab3

  40. laanwj merged this on Mar 6, 2020
  41. laanwj closed this on Mar 6, 2020

  42. luke-jr cross-referenced this on Mar 6, 2020 from issue [0.19] scheduler: Workaround negative nsecs bug in boost's wait_until by luke-jr
  43. MarcoFalke moved this from the "In progress" to the "Done" column in a project

  44. in src/sync.cpp:163 in b9c4260127 outdated
     159 | @@ -155,6 +160,18 @@ void EnterCritical(const char* pszName, const char* pszFile, int nLine, void* cs
     160 |      push_lock(cs, CLockLocation(pszName, pszFile, nLine, fTry, util::ThreadGetInternalName()));
     161 |  }
     162 |  
     163 | +void CheckLastCritical(void* cs, std::string& lockname, const char* guardname, const char* file, int line)
    


    MarcoFalke commented at 3:05 PM on March 17, 2020:

    in commit b9c426012770d166e6ebfab27689be44e6e89aa5:

    Why is this not called GetLastLockName, which better describes what this function does, imo.

  45. in src/sync.h:206 in b9c4260127 outdated
     201 | +        reverse_lock(reverse_lock const&);
     202 | +        reverse_lock& operator=(reverse_lock const&);
     203 | +
     204 | +        UniqueLock& lock;
     205 | +        UniqueLock templock;
     206 | +        std::string lockname;
    


    MarcoFalke commented at 3:09 PM on March 17, 2020:

    in commit b9c426012770d166e6ebfab27689be44e6e89aa5:

    Should be m_lockname. Also for other members?

  46. in src/sync.h:205 in b9c4260127 outdated
     200 | +     private:
     201 | +        reverse_lock(reverse_lock const&);
     202 | +        reverse_lock& operator=(reverse_lock const&);
     203 | +
     204 | +        UniqueLock& lock;
     205 | +        UniqueLock templock;
    


    MarcoFalke commented at 4:40 PM on March 17, 2020:

    in commit b9c426012770d166e6ebfab27689be44e6e89aa5:

    I think it would help readers of the code to briefly explain why this is needed. Also, m_templock? ;)

  47. in src/scheduler.h:10 in d0ebd93270 outdated
       6 | @@ -7,11 +7,12 @@
       7 |  
       8 |  //
       9 |  // NOTE:
      10 | -// boost::thread / boost::chrono should be ported to std::thread / std::chrono
      11 | +// boost::thread should be ported to std::thread
    


    MarcoFalke commented at 4:45 PM on March 17, 2020:

    in commit d0ebd93270758ea97ea956b8821e17a2d001ea94:

    Actually boost::thread is no longer needed an can be replaced with std::thread in this file and the scheduler tests. I know that the thread group is still boost, but the scheduler doesn't care about that.

  48. MarcoFalke commented at 4:48 PM on March 17, 2020: member

    ACK 70a6b529f306ff72ea1badf25e970a92b2b17ab3, just style-nits 🌩

    <details><summary>Show signature and timestamp</summary>

    Signature:

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA512
    
    ACK 70a6b529f306ff72ea1badf25e970a92b2b17ab3, just style-nits 🌩
    -----BEGIN PGP SIGNATURE-----
    
    iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
    pUjbywwAhqXP0sWwiqNLLsI0h9fBTK3DQ7qRxX2cxV9RmUoiok2nKutVPGef2v4L
    brJk/asu9hCJ5nb9/SMf6NRqnyhJ4lbTbt961Tu1au0LNfloCUfvfb8JxY42yDXD
    zxzKYI/vQeUnaNBMINmZ9rOLkLMYqnxXb7fjejNL9l8txmE4jmRp53lvZj34IfDr
    0Mhu3rq+FwLFgvwMWPEbPR1jG/AN2XaW0sHMMe1Xpw/intRxJLh3BSpPHZ9+AjIF
    kWF6Dqp4Onf2dO4jmfaB5ipQI9IYFFZh3+jD3wePBD9kZuL2iqsfXvfblm+rXGZ4
    XMOSXjJoNreRp/hH14gcGuKQrOOiraemT15rH4XiC/DUI6lLMa1FWMNGn1Pob5A0
    EwjIUwnHVNbCqpJIUcGXLEHTZ0pE/1cQ6nBKEFoPcTEgk7D1gH5mX4wP3UBdPw57
    Vka9bjoTAOLqdy7B2FE8qg+dNiVqaDm9eRxixS1uqDY+lWHfNkeubg9SaE6LkC//
    5qT991xc
    =GGMb
    -----END PGP SIGNATURE-----
    

    Timestamp of file with hash 8b7a37edb1d87523e3df70995e0c51ee4dd4dafc50f78c0b12195a66e49c4ce2 -

    </details>

  49. hebasto cross-referenced this on Mar 20, 2020 from issue wallet: Remove calls to Chain::Lock methods by ryanofsky
  50. MarkLTZ cross-referenced this on Apr 4, 2020 from issue Bitcoin PR tracking by MarkLTZ
  51. UdjinM6 cross-referenced this on May 23, 2020 from issue Fix ProcessNewBlock vs EnforceBestChainLock deadlocks in ActivateBestChain by UdjinM6
  52. jnewbery cross-referenced this on May 26, 2020 from issue wallet: Remove boost from PeriodicFlush by MarcoFalke
  53. deadalnix referenced this in commit 014e5697a6 on Jun 11, 2020
  54. deadalnix referenced this in commit fba3cd6167 on Jun 11, 2020
  55. deadalnix referenced this in commit 1af5e7c22b on Jun 11, 2020
  56. deadalnix referenced this in commit 879fe2a322 on Jun 11, 2020
  57. MarcoFalke referenced this in commit 89a6bb9245 on Aug 28, 2020
  58. div72 cross-referenced this on Sep 15, 2020 from issue gridcoinresearch crashes after the machine is woken up by bzaborow
  59. kwvg referenced this in commit 54f16ee1a0 on Jul 13, 2021
  60. kwvg referenced this in commit 6d6bc2abda on Jul 13, 2021
  61. random-zebra cross-referenced this on Aug 18, 2021 from issue refactor: Replace boost::mutex,condition_var,chrono with std equivalents in scheduler by random-zebra
  62. Fabcien referenced this in commit cdc0cfb74c on Aug 27, 2021
  63. furszy referenced this in commit c8ad2c17ab on Aug 27, 2021
  64. Bushstar cross-referenced this on Sep 6, 2021 from issue Boost usage reduction by Bushstar
  65. Bushstar cross-referenced this on Sep 10, 2021 from issue C++17 and reduce Boost usage by Bushstar
  66. PastaPastaPasta referenced this in commit 417099870a on Sep 28, 2021
  67. PastaPastaPasta referenced this in commit 9b8884b430 on Sep 28, 2021
  68. kwvg referenced this in commit efb9cdad7a on Oct 12, 2021
  69. str4d cross-referenced this on Dec 6, 2021 from issue Backport upstream PRs that remove Boost usage by str4d
  70. bitcoin locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-20 06:54 UTC