Potential deadlock on {cs_main, pnode->cs_vSend} #4493

issue ashleyholman opened this issue on July 9, 2014
  1. ashleyholman commented at 11:02 AM on July 9, 2014: contributor

    2014-07-09 10:10:33 POTENTIAL DEADLOCK DETECTED 2014-07-09 10:10:33 Previous lock order was: 2014-07-09 10:10:33 pnode->cs_vRecvMsg net.cpp:1541 2014-07-09 10:10:33 (1) cs_main main.cpp:3887 2014-07-09 10:10:33 (2) cs_vSend net.h:480 2014-07-09 10:10:33 Current lock order is: 2014-07-09 10:10:33 (2) pnode->cs_vSend net.cpp:1560 2014-07-09 10:10:33 (1) cs_main main.cpp:4387 2014-07-09 10:10:33 (1) cs_main main.cpp:1320

    The problem seems to be:

    • Many paths of execution will grab cs_main and then cs_vSend, eg. ProcessMessage("tx") locks cs_main and then can call pfrom->PushMessage if it rejects the tx. PushMessage() obtains pnode->cs_vSend. Another example: CheckBlock() will lock cs_main and then can call PushGetBlocks() which does a call to PushMessage() which locks pnode->cs_vSend.
    • As far as I can tell, there is just a single execution path that obtains these locks in the opposite order: ThreadMessageHandler() locks pnode->cs_vSend and then calls SendMessages() which locks cs_main in order to call IsInitialBlockDownload().

    The least painful fix seems to be to change the SendMessages() code to not need cs_main. Otherwise, ThreadMessageHandler() would need to lock cs_main prior to its SendMessages calls, which might be expensive use of the lock. Any ideas on making the IsInitialBlockDownload() check available to SendMessages() without it having to lock cs_main every time? Maybe caching it is possible?

  2. ashleyholman renamed this:
    Potential deadlock on {cs_main, cs_vSend}
    Potential deadlock on {cs_main, pnode->cs_vSend}
    on Jul 9, 2014
  3. ashleyholman commented at 11:18 AM on July 9, 2014: contributor

    Oh, I've just realised that SendMessages() does a TRY_LOCK() on cs_main first, and if it can't get the lock it returns early before calling IsInitialBlockDownload(). So does that mean its not a deadlock risk in actuality? Although the LOCK() calls are in the opposite order, there may be no chance of deadlock?

  4. ashleyholman cross-referenced this on Jul 9, 2014 from issue ThreadMessageHandler: replace MilliSleep() with condition variable by ashleyholman
  5. laanwj commented at 1:12 PM on July 9, 2014: member

    Well as long as there is one place where both locks are acquired (in any order) that's not a TRY_LOCK, there could be a deadlock there. If they are all TRY_LOCKS then indeed a deadlock cannot happen, but maybe livelock or thread starvation could (if both threads refuse to give up the other lock after getting the other one fails).

    As for IsInitialBlockDownload: one idea would be to change it to return a global flag that is cleared only after the initial block download (+ check at initialization time). This would be more efficient than executing the checks every time as well.

  6. laanwj added the label Bug on Jul 31, 2014
  7. unknown cross-referenced this on Jan 8, 2016 from issue POTENTIAL DEADLOCK FYI by ghost
  8. laanwj cross-referenced this on Feb 16, 2016 from issue Potential deadlock assertion in net code on 0.12RC3 by GIJensen
  9. sipa commented at 5:56 PM on January 23, 2017: member

    I'm going to close this, assuming that it was fixed by #7846 or other locking changes in recent time. Feel free to reopen if other problems are found.

  10. sipa closed this on Jan 23, 2017

  11. mcelrath commented at 6:17 PM on January 30, 2017: none

    Seen today on 0.13.2:

    2017-01-30 18:03:10 receive version message: /Satoshi:0.13.1/: version 70014, blocks=1087141, us=38.121.165.30:32483, peer=8
    2017-01-30 18:03:10 AdvertiseLocal: advertising address 38.121.165.30:18333
    2017-01-30 18:03:10 connect() to [2001:0:9d38:90d7:2cfe:1245:ac28:731a]:18333 failed: Network is unreachable (101)
    2017-01-30 18:03:37 POTENTIAL DEADLOCK DETECTED
    2017-01-30 18:03:37 Previous lock order was:
    2017-01-30 18:03:37  pnode->cs_vRecvMsg  net.cpp:1877 (TRY)
    2017-01-30 18:03:37  (1) cs_main  main.cpp:5911
    2017-01-30 18:03:37  (2) cs_vSend  net.cpp:2585
    2017-01-30 18:03:37 Current lock order is:
    2017-01-30 18:03:37  (2) pnode->cs_vSend  net.cpp:1896 (TRY)
    2017-01-30 18:03:37  (1) cs_main  main.cpp:6483 (TRY)
    2017-01-30 18:03:37  (1) cs_main  main.cpp:1745
    2017-01-30 18:03:37 UpdateTip: new best=000000000000093a78ec61c44009ff74a44de94fd6e6b0fff2422804d180c082 height=1087142 version=0x20000000 log2_work=68.931904 tx=13040484 date='2017-01-30 18:03:19' progress=1.000000 cache=2.1MiB(9691tx) warning='1 of last 100 blocks have unexpected version'
    2017-01-30 18:03:51 UpdateTip: new best=00000000000005f96e01b15ac4cfb95859e3eecf7f4885274e1f24b3d7e3fde9 height=1087143 version=0x20000000 log2_work=68.931923 tx=13040485 date='2017-01-30 18:03:37' progress=1.000000 cache=2.1MiB(9692tx) warning='1 of last 100 blocks have unexpected version'
    2017-01-30 18:04:11 UpdateTip: new best=00000000000008bf5f8409d19de20f248f5b61e391071afd69bc9119efa3c0a4 height=1087144 version=0x20000000 log2_work=68.931941 tx=13040487 date='2017-01-30 18:03:51' progress=1.000000 cache=2.1MiB(9694tx) warning='1 of last 100 blocks have unexpected version'
    2017-01-30 18:04:35 connect() to [2001:0:9d38:6ab8:248d:2e3a:983b:b04a]:18333 failed: Network is unreachable (101)
    2017-01-30 18:05:58 POTENTIAL DEADLOCK DETECTED
    2017-01-30 18:06:00 Previous lock order was:
    2017-01-30 18:06:05  pnode->cs_vRecvMsg  net.cpp:1877 (TRY)
    2017-01-30 18:06:07  (1) cs_main  main.cpp:5911
    2017-01-30 18:06:13  (2) cs_vSend  net.cpp:2585
    2017-01-30 18:06:16 Current lock order is:
    2017-01-30 18:06:28  (2) pnode->cs_vSend  net.cpp:1896 (TRY)
    2017-01-30 18:06:31  (1) cs_main  main.cpp:6483 (TRY)
    2017-01-30 18:06:32  (1) cs_main  main.cpp:1745
    2017-01-30 18:08:58 UpdateTip: new best=000000000000095216f28d2cd715abee070cd83079411a93463ec61fa2a782eb height=1087145 version=0x20000000 log2_work=68.931959 tx=13040491 date='2017-01-30 18:05:29' progress=1.000000 cache=2.1MiB(9701tx) warning='1 of last 100 blocks have unexpected version'
    2017-01-30 18:08:58 POTENTIAL DEADLOCK DETECTED
    2017-01-30 18:08:58 Previous lock order was:
    2017-01-30 18:08:58  pnode->cs_vRecvMsg  net.cpp:1877 (TRY)
    2017-01-30 18:08:58  (1) cs_main  main.cpp:5911
    2017-01-30 18:08:58  (2) cs_vSend  net.cpp:2585
    2017-01-30 18:08:58 Current lock order is:
    2017-01-30 18:08:58  (2) pnode->cs_vSend  net.cpp:1896 (TRY)
    2017-01-30 18:08:58  (1) cs_main  main.cpp:6483 (TRY)
    2017-01-30 18:08:58  (1) cs_main  main.cpp:1745
    2017-01-30 18:08:58 UpdateTip: new best=000000000000024c5841a34b630828a27114c895ac7de8f1cb3376ec7d7a35a1 height=1087146 version=0x20000000 log2_work=68.931977 tx=13040493 date='2017-01-30 18:08:04' progress=1.000000 cache=2.1MiB(9703tx) warning='1 of last 100 blocks have unexpected version'
    

    Bitcoin did seem to sync successfully on testnet, and I'm not entirely sure if these message indicated a real problem?

  12. TheBlueMatt commented at 6:43 PM on January 30, 2017: contributor

    Yes, this is a spurious warning on many recent versions of Bitcoin Core, however, it is believed to have been "fixed" in master/will be fixed in 0.14.

  13. unknown cross-referenced this on Jun 8, 2018 from issue Error "POTENTIAL DEADLOCK DETECTED" by ghost
  14. bitcoin locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-20 06:55 UTC