sync: detect double lock from the same thread #19337

pull vasild wants to merge 2 commits into bitcoin:master from vasild:detect_double_lock changing 3 files +112 −11
  1. vasild commented at 1:49 PM on June 20, 2020: contributor

    Double lock of the same (non-recursive) mutex from the same thread would produce an undefined behavior. Detect this from DEBUG_LOCKORDER and react similarly to the deadlock detection.

    This came up during discussion in another, related PR: #19238 (review).

  2. vasild cross-referenced this on Jun 20, 2020 from issue refactor: Make CAddrMan::cs non-recursive by hebasto
  3. hebasto commented at 1:54 PM on June 20, 2020: member

    Concept ACK.

  4. in src/sync.cpp:225 in 401d3b759e outdated
     214 |  }
     215 | +template void EnterCritical(const char*, const char*, int, Mutex*, bool);
     216 | +template void EnterCritical(const char*, const char*, int, RecursiveMutex*, bool);
     217 | +template void EnterCritical(const char*, const char*, int, std::mutex*, bool);
     218 | +template void EnterCritical(const char*, const char*, int, std::recursive_mutex*, bool);
     219 | +template void EnterCritical(const char*, const char*, int, boost::mutex*, bool);
    


    MarcoFalke commented at 1:54 PM on June 20, 2020:

    no need to add dead code or code that is only used in tests


    vasild commented at 2:10 PM on June 20, 2020:

    These instantiations are required even if tests do not exist. Without them it wouldn't link with errors like:

    ld: error: undefined symbol: void EnterCritical<boost::mutex>(char const*, char const*, int, boost::mutex*, bool)
    >>> referenced by checkqueue.h:184 (src/checkqueue.h:184)
    >>>               libbitcoin_server_a-validation.o:(CCheckQueueControl<CScriptCheck>::CCheckQueueControl(CCheckQueue<CScriptCheck>*)) in archive libbitcoin_server.a
    

    hebasto commented at 2:20 PM on June 20, 2020:

    Yeah, this is solved in #18710 :)

    EDIT: I mean boost::mutex


    vasild commented at 7:07 AM on June 22, 2020:

    So when these two PRs meet and if that was the last usage of boost::mutex, then this line can be removed: template void EnterCritical(const char*, const char*, int, boost::mutex*, bool);.


    MarcoFalke commented at 2:03 PM on August 5, 2020:

    I tried compiling locally without it and it worked #19337 (comment)


    vasild commented at 8:51 AM on January 26, 2021:

    Just for reference - the two PRs met in master and afterwards that line was removed in #21010.

    Thanks, @fanquake!

  5. MarcoFalke commented at 1:54 PM on June 20, 2020: member

    Concept ACK

  6. MarcoFalke added the label Refactoring on Jun 20, 2020
  7. MarcoFalke removed the label Refactoring on Jun 20, 2020
  8. MarcoFalke added the label Utils/log/libs on Jun 20, 2020
  9. hebasto commented at 2:53 PM on June 20, 2020: member
  10. hebasto commented at 3:08 PM on June 20, 2020: member

    Tested 401d3b759e14a96332a033485257d2e993c2b944 with the following diff:

    --- a/src/addrman.h
    +++ b/src/addrman.h
    @@ -178,7 +178,7 @@ class CAddrMan
     friend class CAddrManTest;
     protected:
         //! critical section to protect the inner data structures
    -    mutable RecursiveMutex cs;
    +    mutable Mutex cs;
     
     private:
         //! last used nId
    
    $ ./src/qt/bitcoin-qt -regtest
    Assertion failed: detected double lock at sync.cpp:153, details in debug log.
    Aborted (core dumped)
    

    Nice!

    Some non-blocking nits:

    • could place boost headers into a separate group
    • in commit 401d3b759e14a96332a033485257d2e993c2b944 message "... would produce an undefined behavior" -- mind rewording "would" -> "is"
    • could use unnamed namespace instead of static
  11. hebasto commented at 3:33 PM on June 20, 2020: member

    Travis error:

    It is reliably reproducible locally.

  12. hebasto commented at 3:48 PM on June 20, 2020: member

    Travis error:

    It is reliably reproducible locally. @vasild ~idk exactly why but~ moving the potential_deadlock_detected test case to the end fixes the problem.

    UPDATE: at the end of the potential_deadlock_detected test case the lock stack is not empty.

  13. hebasto commented at 8:27 PM on June 20, 2020: member

    Travis error:

    * https://travis-ci.org/github/bitcoin/bitcoin/jobs/700351407#L4430

    Fixed in #19340.

  14. hebasto cross-referenced this on Jun 20, 2020 from issue Preserve the `LockData` initial state if "potential deadlock detected" exception thrown by hebasto
  15. DrahtBot commented at 10:06 PM on June 20, 2020: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #19983 (Drop some TSan suppressions by hebasto)
    • #19982 (test: Fix inconsistent lock order in wallet_tests/CreateWalletFromFile by hebasto)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  16. jnewbery commented at 6:32 PM on June 21, 2020: member

    Concept ACK

  17. practicalswift commented at 9:00 AM on June 22, 2020: contributor

    Concept ACK

    Nice!

  18. vasild force-pushed on Jun 22, 2020
  19. vasild commented at 10:17 AM on June 22, 2020: contributor
    * could place boost headers into a separate group

    Done.

    * in commit [401d3b7](https://github.com/bitcoin/bitcoin/commit/401d3b759e14a96332a033485257d2e993c2b944) message "... would produce an undefined behavior" -- mind rewording "would" -> "is"

    Done.

    * could use unnamed namespace instead of `static`

    The rest of the functions in the file use static, so for consistency I also used static in the newly introduced function. Maybe replace all static with an unnamed namespace in sync.cpp? If yes, then that would go as a separate commit, or maybe even a separate PR.

    Reviewers: this PR exposed an existent deficiency which is fixed in #19340.

  20. hebasto approved
  21. hebasto commented at 10:22 AM on June 22, 2020: member

    ACK d32d85590e695fbe3acab0ce9a5525572a9bfa77

    But #19340 should be considering for reviewing before this PR merging.

  22. vasild marked this as a draft on Jun 22, 2020
  23. vasild commented at 11:04 AM on June 22, 2020: contributor

    I converted this PR to "draft" so that it does not get merged accidentally. CI tests passed this time! I guess the order in which the tests are executed matters. #19340 will fix that and should be merged before this PR.

  24. DrahtBot cross-referenced this on Jun 22, 2020 from issue Fix mistakenly swapped "previous" and "current" lock orders by hebasto
  25. promag commented at 8:23 PM on June 22, 2020: member

    Concept ACK, change LGTM.

  26. DrahtBot cross-referenced this on Jul 28, 2020 from issue Remove mempool global by MarcoFalke
  27. DrahtBot cross-referenced this on Jul 30, 2020 from issue Disable fee estimation in blocksonly mode (by removing the fee estimates global) by darosior
  28. hebasto commented at 3:06 PM on August 4, 2020: member

    @vasild

    I converted this PR to "draft" so that it does not get merged accidentally. CI tests passed this time! I guess the order in which the tests are executed matters. #19340 will fix that and should be merged before this PR.

    #19340 is merged :)

  29. vasild marked this as ready for review on Aug 5, 2020
  30. sync: make EnterCritical() & push_lock() type safe
    The functions `EnterCritical()` and `push_lock()` take a pointer to a
    mutex, but that pointer used to be of type `void*` because we use a few
    different types for mutexes. This `void*` argument was not type safe
    because somebody could have send a pointer to anything that is not a
    mutex. Furthermore it wouldn't allow to check whether the passed mutex
    is recursive or not.
    
    Thus, change the functions to templated ones so that we can implement
    stricter checks for non-recursive mutexes. This also simplifies the
    callers of `EnterCritical()`.
    4df6567e4c
  31. vasild force-pushed on Aug 5, 2020
  32. vasild commented at 7:51 AM on August 5, 2020: contributor

    Reopened this PR now that #19340 is merged and rebased to resolve a conflict.

  33. in src/test/sync_tests.cpp:60 in c269e0a224 outdated
      55 | +                return strcmp(e.what(), "double lock detected") == 0;
      56 | +            });
      57 | +    } else {
      58 | +        BOOST_CHECK_NO_THROW(TestDoubleLock2(m));
      59 | +    }
      60 | +    LEAVE_CRITICAL_SECTION(m);
    


    hebasto commented at 8:09 AM on August 5, 2020:
        LEAVE_CRITICAL_SECTION(m);
        BOOST_CHECK(LockStackEmpty());
    

    vasild commented at 8:20 AM on August 5, 2020:

    Done!

  34. hebasto approved
  35. hebasto commented at 8:09 AM on August 5, 2020: member

    ACK c269e0a22423294b81c7a60862bc20f6ccf80f1c, tested on Linux Mint 20 (x86_64).

  36. vasild force-pushed on Aug 5, 2020
  37. hebasto approved
  38. hebasto commented at 8:22 AM on August 5, 2020: member

    re-ACK b1f94072a1cc3ca67ce13658bb981d03480ef38e

  39. in src/test/sync_tests.cpp:41 in b1f94072a1 outdated
      36 | +#ifdef DEBUG_LOCKORDER
      37 | +template <typename MutexType>
      38 | +void TestDoubleLock2(MutexType& m)
      39 | +{
      40 | +    ENTER_CRITICAL_SECTION(m);
      41 | +    LEAVE_CRITICAL_SECTION(m);
    


    MarcoFalke commented at 8:30 AM on August 5, 2020:

    Couldn't this use LOCK(m), since ENTER_CRITICAL_SECTION is mostly only used internally to the sync module and other code used LOCK


    vasild commented at 12:32 PM on August 5, 2020:

    LOCK(m) does not compile if m is std::mutex:

    src/sync.h:132:59: error: no type named 'UniqueLock' in 'std::__1::mutex'
    template <typename Mutex, typename Base = typename Mutex::UniqueLock>
                                              ~~~~~~~~~~~~~~~~^~~~~~~~~~
    

    MarcoFalke commented at 2:02 PM on August 5, 2020:

    Though, we never use LOCK or even ENTER_CRITICAL_SECTION on std::mutex, so what is the point of testing for that? #19337 (comment)

  40. in src/test/sync_tests.cpp:51 in b1f94072a1 outdated
      46 | +{
      47 | +    const bool prev = g_debug_lockorder_abort;
      48 | +    g_debug_lockorder_abort = false;
      49 | +
      50 | +    MutexType m;
      51 | +    ENTER_CRITICAL_SECTION(m);
    


    MarcoFalke commented at 8:31 AM on August 5, 2020:
        {
            LOCK(m);
    

    Same here. Should be equivalent?


    vasild commented at 12:35 PM on August 5, 2020:

    Same as above - it does not compile.

  41. in src/test/sync_tests.cpp:53 in b1f94072a1 outdated
      48 | +    g_debug_lockorder_abort = false;
      49 | +
      50 | +    MutexType m;
      51 | +    ENTER_CRITICAL_SECTION(m);
      52 | +    if (should_throw) {
      53 | +        BOOST_CHECK_EXCEPTION(
    


    MarcoFalke commented at 8:32 AM on August 5, 2020:

    Could limit the #ifdef DEBUG_LOCKORDER to just this one line/statement? Benefit would be that BOOST_CHECK_NO_THROW will be run for non-debug builds as well.


    vasild commented at 12:43 PM on August 5, 2020:

    Without DEBUG_LOCKORDER we would actually do the double lock and once executed then we are in "undefined behavior" state. We don't want to do that because it may randomly crash the test executable, overwrite random portions of memory for the excitement of subsequent tests, etc.

    https://github.com/bitcoin/bitcoin/blob/b1f94072a1cc3ca67ce13658bb981d03480ef38e/src/test/sync_tests.cpp#L93-L95


    MarcoFalke commented at 2:02 PM on August 5, 2020:

    I compiled and tested locally with sanitizers and it passed. Am I missing something? #19337 (comment)

  42. in src/sync.cpp:156 in b1f94072a1 outdated
     152 | +    }
     153 | +    // We throw or abort() here, never actually locking the mutex. So remove the entry
     154 | +    // from the lock stack which we just added in push_lock(). Tests in sync_tests.cpp
     155 | +    // continue executing past the exception and we don't want one test to leave a
     156 | +    // bricked stack for a subsequent test that could execute in the same thread.
     157 | +    lock_stack.pop_back();
    


    MarcoFalke commented at 8:39 AM on August 5, 2020:

    Would be nice if the potential_deadlock_detected and double_lock_detected helpers used the same style to pop_back the just added lock. (Either pass in a mutable reference or create a copy)


    vasild commented at 12:52 PM on August 5, 2020:

    Done - used the same as used when calling potential_deadlock_detected() in master - a copy.

  43. in src/sync.cpp:154 in b1f94072a1 outdated
     155 | +    // continue executing past the exception and we don't want one test to leave a
     156 | +    // bricked stack for a subsequent test that could execute in the same thread.
     157 | +    lock_stack.pop_back();
     158 | +    if (g_debug_lockorder_abort) {
     159 | +        tfm::format(std::cerr, "Assertion failed: detected double lock at %s:%i, details in debug log.\n", __FILE__, __LINE__);
     160 | +        abort();
    


    MarcoFalke commented at 8:41 AM on August 5, 2020:
            std::abort();
    

    Should be the same symbol, but slightly clearer.


    promag commented at 9:14 AM on August 5, 2020:

    There's more cases of s/abort()/std::abort so a follow up to to do this is also fine.


    vasild commented at 12:53 PM on August 5, 2020:

    I am leaving it as is for consistency with potential_deadlock_detected() which uses abort(). As @promag says, a subsequent patch could change all occurrences.

  44. MarcoFalke approved
  45. MarcoFalke commented at 8:41 AM on August 5, 2020: member

    ACK, only style and test nits. Feel free to ingore.

  46. in src/sync.cpp:187 in b1f94072a1 outdated
     186 | +            }
     187 | +            // It is not a recursive mutex and it appears in the stack two times:
     188 | +            // at position `j` and at the end (which we added just before this loop).
     189 | +            // Can't allow locking the same (non-recursive) mutex two times from the
     190 | +            // same thread as that results in an undefined behavior.
     191 | +            double_lock_detected(c, lock_stack);
    


    promag commented at 9:22 AM on August 5, 2020:

    Add assert(false) after this? Or some other "unreachable" assertion. Below, after potential_deadlock_detected() there's a comment, could also add the assertion there.


    vasild commented at 12:54 PM on August 5, 2020:

    I added a comment, like done for potential_deadlock_detected().

  47. in src/sync.cpp:171 in b1f94072a1 outdated
     175 |      LockStack& lock_stack = lockdata.m_lock_stacks[std::this_thread::get_id()];
     176 |      lock_stack.emplace_back(c, locklocation);
     177 | -    for (const LockStackItem& i : lock_stack) {
     178 | -        if (i.first == c)
     179 | -            break;
     180 | +    for (size_t j = 0; j < lock_stack.size() - 1; ++j) {
    


    promag commented at 9:24 AM on August 5, 2020:

    Could keep range loop? When lock_stack is changed the iteration stops.


    vasild commented at 1:01 PM on August 5, 2020:

    We need to iterate on all but the last element.


    promag commented at 1:06 PM on August 5, 2020:

    Right, for double_lock_detected the last element should be skipped. But now lines L187-200 don't run for the last element.


    promag commented at 9:48 PM on August 5, 2020:

    Nevermind, the last element was already skipped with if (i.first == c) break;.

  48. vasild commented at 1:01 PM on August 5, 2020: contributor

    Addressed some review suggestions.

  49. vasild force-pushed on Aug 5, 2020
  50. MarcoFalke commented at 2:02 PM on August 5, 2020: member

    <details><summary>a diff</summary>

    diff --git a/src/sync.cpp b/src/sync.cpp
    index d020b4e334..55cc47191f 100644
    --- a/src/sync.cpp
    +++ b/src/sync.cpp
    @@ -220,9 +220,6 @@ void EnterCritical(const char* pszName, const char* pszFile, int nLine, MutexTyp
     }
     template void EnterCritical(const char*, const char*, int, Mutex*, bool);
     template void EnterCritical(const char*, const char*, int, RecursiveMutex*, bool);
    -template void EnterCritical(const char*, const char*, int, std::mutex*, bool);
    -template void EnterCritical(const char*, const char*, int, std::recursive_mutex*, bool);
    -template void EnterCritical(const char*, const char*, int, boost::mutex*, bool);
     
     void CheckLastCritical(void* cs, std::string& lockname, const char* guardname, const char* file, int line)
     {
    diff --git a/src/test/sync_tests.cpp b/src/test/sync_tests.cpp
    index 2ecc78bbed..9a3c276ed2 100644
    --- a/src/test/sync_tests.cpp
    +++ b/src/test/sync_tests.cpp
    @@ -33,37 +33,44 @@ void TestPotentialDeadLockDetected(MutexType& mutex1, MutexType& mutex2)
         #endif
     }
     
    -#ifdef DEBUG_LOCKORDER
     template <typename MutexType>
     void TestDoubleLock2(MutexType& m)
     {
    -    ENTER_CRITICAL_SECTION(m);
    -    LEAVE_CRITICAL_SECTION(m);
    +    LOCK(m);
     }
     
     template <typename MutexType>
     void TestDoubleLock(bool should_throw)
     {
    +#ifdef DEBUG_LOCKORDER
         const bool prev = g_debug_lockorder_abort;
         g_debug_lockorder_abort = false;
    +#endif /* DEBUG_LOCKORDER */
     
         MutexType m;
    -    ENTER_CRITICAL_SECTION(m);
    -    if (should_throw) {
    -        BOOST_CHECK_EXCEPTION(
    -            TestDoubleLock2(m), std::logic_error, [](const std::logic_error& e) {
    -                return strcmp(e.what(), "double lock detected") == 0;
    -            });
    -    } else {
    -        BOOST_CHECK_NO_THROW(TestDoubleLock2(m));
    +    {
    +        LOCK(m);
    +        if (should_throw) {
    +#ifdef DEBUG_LOCKORDER
    +            /* Double lock would produce an undefined behavior. Thus, we only do that if
    +         * DEBUG_LOCKORDER is activated to detect it. We don't want non-DEBUG_LOCKORDER
    +         * build to produce tests that exhibit known undefined behavior. */
    +            BOOST_CHECK_EXCEPTION(
    +                TestDoubleLock2(m), std::logic_error, [](const std::logic_error& e) {
    +                    return strcmp(e.what(), "double lock detected") == 0;
    +                });
    +#endif /* DEBUG_LOCKORDER */
    +        } else {
    +            BOOST_CHECK_NO_THROW(TestDoubleLock2(m));
    +        }
         }
    -    LEAVE_CRITICAL_SECTION(m);
     
         BOOST_CHECK(LockStackEmpty());
     
    +#ifdef DEBUG_LOCKORDER
         g_debug_lockorder_abort = prev;
    -}
     #endif /* DEBUG_LOCKORDER */
    +}
     } // namespace
     
     BOOST_FIXTURE_TEST_SUITE(sync_tests, BasicTestingSetup)
    @@ -90,34 +97,14 @@ BOOST_AUTO_TEST_CASE(potential_deadlock_detected)
         #endif
     }
     
    -/* Double lock would produce an undefined behavior. Thus, we only do that if
    - * DEBUG_LOCKORDER is activated to detect it. We don't want non-DEBUG_LOCKORDER
    - * build to produce tests that exhibit known undefined behavior. */
    -#ifdef DEBUG_LOCKORDER
     BOOST_AUTO_TEST_CASE(double_lock_mutex)
     {
         TestDoubleLock<Mutex>(true /* should throw */);
     }
     
    -BOOST_AUTO_TEST_CASE(double_lock_std_mutex)
    -{
    -    TestDoubleLock<std::mutex>(true /* should throw */);
    -}
    -
    -BOOST_AUTO_TEST_CASE(double_lock_boost_mutex)
    -{
    -    TestDoubleLock<boost::mutex>(true /* should throw */);
    -}
    -
     BOOST_AUTO_TEST_CASE(double_lock_recursive_mutex)
     {
         TestDoubleLock<RecursiveMutex>(false /* should not throw */);
     }
     
    -BOOST_AUTO_TEST_CASE(double_lock_std_recursive_mutex)
    -{
    -    TestDoubleLock<std::recursive_mutex>(false /* should not throw */);
    -}
    -#endif /* DEBUG_LOCKORDER */
    -
     BOOST_AUTO_TEST_SUITE_END()
    

    </details>

  51. vasild commented at 12:21 PM on August 6, 2020: contributor

    @MarcoFalke the above patch does not compile for me (neither with clang 11 nor with gcc 10) with an error like

    ld: error: undefined symbol: void EnterCritical<std::mutex>(char const*, char const*, int, std::mutex*, bool)
    

    I think you should also get it for ./configure --enable-debug when the explicit instantiation of EnterCritical<std::mutex>() is removed. Here is how we end up needing that instantiation even though we never use std::mutex directly:

    typedef AnnotatedMixin<std::mutex> Mutex;
    
    template <typename PARENT>
    class LOCKABLE AnnotatedMixin : public PARENT
    {
    ...
        using UniqueLock = std::unique_lock<PARENT>;
    

    So when one defines Mutex m; this ends up being AnnotatedMixin<std::mutex> m; and AnnotatedMixin<std::mutex>::UniqueLock is std::unique_lock<std::mutex>.

    LOCK(m) expands to DebugLock<AnnotatedMixin<std::mutex>> criticalblock123(m, ...) which expands to UniqueLock<AnnotatedMixin<std::mutex>> criticalblock123(m, ...) which expands to (with default 2nd template arg): UniqueLock<AnnotatedMixin<std::mutex>, AnnotatedMixin<std::mutex>::UniqueLock> criticalblock123(m, ...) which expands to UniqueLock<AnnotatedMixin<std::mutex>, std::unique_lock<std::mutex>> criticalblock123(m, ...).

    Then the constructor of the above calls the Enter() method which calls EnterCritical(name, file, line, Base::mutex()) which expands to EnterCritical(name, file, line, std::unique_lock<std::mutex>::mutex()). The mutex() method (4th argument) returns a pointer to std::mutex.

    This is how we end up calling EnterCritical() with a 4th argument of type pointer to std::mutex.

    Thus we need the explicit instantiation template void EnterCritical(const char*, const char*, int, std::mutex*, bool);. Same applies to std::recursive_mutex and boost::mutex.

  52. vasild referenced this in commit 2d547b4f7c on Aug 6, 2020
  53. vasild commented at 12:59 PM on August 6, 2020: contributor

    I pushed the patch to my travis instance and it breaks the build, similarly to what I observe locally: https://travis-ci.org/github/vasild/bitcoin/jobs/715483824#L3800

  54. MarcoFalke commented at 8:29 AM on August 7, 2020: member

    Obviously I didn't compile with --enable-debug (shame!). Please disregard the sync.cpp patch, but the other hunks should compile?

  55. sync: detect double lock from the same thread
    Double lock of the same (non-recursive) mutex from the same thread
    is producing an undefined behavior. Detect this from DEBUG_LOCKORDER
    and react similarly to the deadlock detection.
    95975dd08d
  56. vasild force-pushed on Aug 10, 2020
  57. vasild commented at 4:43 PM on August 10, 2020: contributor

    @MarcoFalke, I looked carefully at the diff above and took some of it but not all. Here are the reasons:

    • Indeed we never do LOCK() or ENTER_CRITICAL_SECTION() on std::mutex or std::recursive_mutex directly, so no need to test for that. Thus, I removed the tests double_lock_std_mutex and double_lock_std_recursive_mutex.

    • However we do call ENTER_CRITICAL_SECTION() on boost::mutex. So, I left the test double_lock_boost_mutex. So, TestDoubleLock2() cannot use LOCK(m) because it does not compile.

    • I do not think it makes sense to do these tests for non-DEBUG_LOCKORDER build because they will decay to:

      • if should_throw == true
      mutex lock
      if (should_throw) {
          // nothing
      }
      mutex unlock
      
      • if should_throw == false
      recursive mutex lock
      recursive mutex lock
      recursive mutex unlock
      recursive mutex unlock
      

      (with none of the code from sync.cpp exercised)

  58. hebasto approved
  59. hebasto commented at 10:29 AM on August 15, 2020: member

    re-ACK 95975dd08d8fdaaeaf28e0d06b861ce2748c17b6

  60. hebasto commented at 10:30 AM on August 15, 2020: member

    @ajtowns @ryanofsky @practicalswift Mind looking into this?

  61. MarcoFalke cross-referenced this on Aug 27, 2020 from issue test: Data race detected when running rpc_blockchain.py by hebasto
  62. hebasto cross-referenced this on Sep 9, 2020 from issue scripted-diff: Restore AssertLockHeld after #19668, remove LockAssertion by ryanofsky
  63. DrahtBot cross-referenced this on Sep 20, 2020 from issue Drop some TSan suppressions by hebasto
  64. DrahtBot cross-referenced this on Sep 20, 2020 from issue test: Fix inconsistent lock order in wallet_tests/CreateWallet by hebasto
  65. hebasto cross-referenced this on Oct 14, 2020 from issue p2p, refactor: Use Mutex type for some mutexes in CNode class by hebasto
  66. laanwj commented at 10:14 PM on November 23, 2020: member

    Given how much of the current work on mutexes is converting recursive mutexes to non-resursive ones, it would be nice to get this in.

  67. laanwj commented at 3:44 PM on November 25, 2020: member

    code review ACK 95975dd08d8fdaaeaf28e0d06b861ce2748c17b6

  68. laanwj merged this on Nov 25, 2020
  69. laanwj closed this on Nov 25, 2020

  70. vasild deleted the branch on Nov 25, 2020
  71. sidhujag referenced this in commit e7388e9c10 on Nov 25, 2020
  72. in src/sync.cpp:153 in 95975dd08d
     149 | +            LogPrintf(" (*)"); /* Continued */
     150 | +        }
     151 | +        LogPrintf(" %s\n", i.second.ToString());
     152 | +    }
     153 | +    if (g_debug_lockorder_abort) {
     154 | +        tfm::format(std::cerr, "Assertion failed: detected double lock at %s:%i, details in debug log.\n", __FILE__, __LINE__);
    


    MarcoFalke commented at 10:00 AM on November 26, 2020:

    This will always blame the bug on sync.cpp itself:

    detected double lock at sync.cpp:153
    

    vasild commented at 1:48 PM on November 26, 2020:
  73. MarcoFalke commented at 10:08 AM on November 26, 2020: member

    review ack

  74. in src/sync.cpp:182 in 95975dd08d
     181 | +            // It is not a recursive mutex and it appears in the stack two times:
     182 | +            // at position `j` and at the end (which we added just before this loop).
     183 | +            // Can't allow locking the same (non-recursive) mutex two times from the
     184 | +            // same thread as that results in an undefined behavior.
     185 | +            auto lock_stack_copy = lock_stack;
     186 | +            lock_stack.pop_back();
    


    promag commented at 9:38 AM on December 2, 2020:

    @vasild why copy and pop here?


    vasild commented at 5:57 AM on December 3, 2020:
                auto lock_stack_copy = lock_stack;
                lock_stack.pop_back();
                double_lock_detected(c, lock_stack_copy);
    

    That's subtle:

    1. double_lock_detected() can abort() in which case it does not matter, but it can also throw and tests catch this exception and continue working and executing other tests. So the lock stack needs to be correct after the throw because it is going to be reused by subsequent tests. Notice that we did not acquire the offending mutex that is at the top of the lock stack - we only detected a request to acquire it, which if executed would cause a double lock. So the topmost element needs to be removed before the throw.
    2. double_lock_detected() needs the full lock stack, including the offending-but-not-actually-acquired-mutex because it prints some of its elements. It will want to print that last one.
  75. fanquake referenced this in commit f827e151a2 on Jan 26, 2021
  76. fanquake cross-referenced this on Jan 26, 2021 from issue refactor: remove straggling boost::mutex usage by fanquake
  77. MarcoFalke referenced this in commit 280d0bd0bd on Jan 26, 2021
  78. remyers referenced this in commit 3edbc4f0ad on Jan 26, 2021
  79. bitcoin locked this on Aug 16, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-20 06:54 UTC