util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17) #20452

pull practicalswift wants to merge 1 commits into bitcoin:master from practicalswift:remove-atoi changing 15 files +145 −52
  1. practicalswift commented at 12:05 PM on November 22, 2020: contributor

    Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17).

  2. laanwj commented at 12:15 PM on November 22, 2020: member

    Note that we have a function ParseInt32 (as well as 64 and UInt variants) for this and I once tried to use it in more places, see #17385.

    However these didn't turn out to be actually locale-independent. Maybe this can replace them eventually.

  3. in src/util/strencodings.h:95 in be0efcd213 outdated
      85 | +        }
      86 | +        s = s.substr(1);
      87 | +    }
      88 | +    auto [_, error_condition] = std::from_chars(s.data(), s.data() + s.size(), result);
      89 | +    if (error_condition != std::errc{}) {
      90 | +        return 0;
    


    laanwj commented at 12:16 PM on November 22, 2020:

    Please, do not just return 0 in case of a parse error (same comment above!) but use explicit error feedback. The current formulation encourages ignoring errors. Most C++17ish would be to return an Option. (then handle it at the call site, if it wants to squash any errors to 0, fine)


    laanwj commented at 1:06 PM on November 22, 2020:

    I think you need to check the returned pointer against last as well, otherwise it ignores any remainder.


    practicalswift commented at 2:12 PM on November 22, 2020:

    I agree on both for a sane safe interface, but since this function (LocaleIndependentAtoi) is meant as a quirk-by-quirk compatible to the not-entirely-sane atoi(…) I'm afraid we need to return a T (instead of std::optional<T>) and ignore any remainder, no? :)

  4. laanwj commented at 12:23 PM on November 22, 2020: member

    locale-independent std::from_chars(…) (C++17).

    This is good to hear at least. From https://en.cppreference.com/w/cpp/utility/from_chars#Notes :

    Unlike other parsing functions in C++ and C libraries, std::from_chars is locale-independent, non-allocating, and non-throwing.

    Concept ACK.

  5. in src/util/strencodings.h:86 in be0efcd213 outdated
      76 | +template <typename T>
      77 | +T ToIntegral(const std::string& str)
      78 | +{
      79 | +    T result;
      80 | +    // Emulate atoi(...) handling of white space and leading +/-.
      81 | +    std::string s = TrimString(str);
    


    laanwj commented at 12:32 PM on November 22, 2020:

    Making a copy of the string here, as well as below for the +/- check, is kind of efficient. Not sure it matters here, it's not like this function is used in any performance critical places at the moment.

    But as the function takes a range, we could do better.


    practicalswift commented at 4:15 PM on December 1, 2020:

    As a general rule I try to stay away from pointer arithmetic (rationale here): that's the reason for the technically extraneous copy which I think is insignificant in this context :)

    Personally I find code that avoids pointer arithmetic much easier to reason about from a safety perspective.

    That said: I'll let other chime in :)

  6. DrahtBot commented at 1:18 PM on November 22, 2020: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #15936 (Unify bitcoin-qt and bitcoind persistent settings by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. laanwj commented at 1:29 PM on November 22, 2020: member

    I think we need to make a choice here (either one or both):

    • Make an emulation of all broken and surprising atoi behavior, but call it something else than ToIntegral, something like LocaleIndependentAtoi, make it clear it should not be used in new code, it's only meant for preserving atoi behavior for backward compatibility reasons.
    • Make a sane integer parsing function that can eventually replace Parse[U]IntXX or their contents in a locale-independent way. I'm fine with the name then.
  8. DrahtBot added the label GUI on Nov 22, 2020
  9. DrahtBot added the label P2P on Nov 22, 2020
  10. DrahtBot added the label Utils/log/libs on Nov 22, 2020
  11. practicalswift force-pushed on Nov 22, 2020
  12. practicalswift commented at 2:07 PM on November 22, 2020: contributor

    @laanwj

    I think we need to make a choice here (either one or both):

    • Make an emulation of all broken and surprising atoi behavior, but call it something else than ToIntegral, something like LocaleIndependentAtoi, make it clear it should not be used in new code, it's only meant for preserving atoi behavior for backward compatibility reasons.
    • Make a sane integer parsing function that can eventually replace Parse[U]IntXX or their contents in a locale-independent way. I'm fine with the name then.

    Good point. I thought about those options as well. I think we should do both. This PR is meant as a pure refactoring: it is not meant to change any behaviour that is defined by atoi.

    I've now made it more clear that this PR is meant as a quirk-by-quirk compatible replacement for atoi by calling the function LocaleIndependentAtoi :)

    Good point about naming: we should keep the name ToIntegral for a future sane version that doesn't emulate atoi or any other legacy function.

  13. DrahtBot cross-referenced this on Nov 22, 2020 from issue Extend support for nested commands to bitcoin-cli by jonasschnelli
  14. practicalswift force-pushed on Nov 22, 2020
  15. practicalswift force-pushed on Nov 22, 2020
  16. laanwj commented at 5:48 PM on November 22, 2020: member

    Okay, agree. I guess it could replace atoi as well as atoi64 in that case (as it's parametrized on type)?

  17. practicalswift force-pushed on Nov 22, 2020
  18. DrahtBot cross-referenced this on Nov 22, 2020 from issue Replace uses of boost::trim* with locale-independent alternatives by Empact
  19. practicalswift commented at 7:48 PM on November 22, 2020: contributor

    @laanwj Done! I've now replaced all instances of atoi64(s) with LocaleIndependentAtoi<int64_t>(s).

  20. DrahtBot cross-referenced this on Nov 23, 2020 from issue refactor: Implement missing error checking for ArgsManager flags by ryanofsky
  21. in src/util/strencodings.h:80 in 6a6a71e098 outdated
      69 | @@ -68,8 +70,25 @@ std::string EncodeBase32(Span<const unsigned char> input, bool pad = true);
      70 |  std::string EncodeBase32(const std::string& str, bool pad = true);
      71 |  
      72 |  void SplitHostPort(std::string in, int& portOut, std::string& hostOut);
      73 | -int64_t atoi64(const std::string& str);
      74 | -int atoi(const std::string& str);
      75 | +
      76 | +template <typename T>
    


    laanwj commented at 9:54 AM on November 23, 2020:

    Please add a comment what this function does (e.g.: replicate the exact behavior of atoi for backwards compatibility) and that new code should use the Parse[U]IntXX functions with parse error feedback.


    practicalswift commented at 3:50 PM on November 23, 2020:

    Done! :)

  22. laanwj commented at 9:55 AM on November 23, 2020: member

    @laanwj Done! I've now replaced all instances of atoi64(s) with LocaleIndependentAtoi<int64_t>(s).

    Thanks, looks good to me now except the documentation nit.

  23. practicalswift force-pushed on Nov 23, 2020
  24. practicalswift force-pushed on Nov 23, 2020
  25. practicalswift cross-referenced this on Nov 24, 2020 from issue test: Add testing of ParseInt/ParseUInt edge cases with leading +/-/0:s by practicalswift
  26. practicalswift renamed this:
    Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
    util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
    on Nov 24, 2020
  27. MarcoFalke referenced this in commit 9402159b9b on Nov 24, 2020
  28. practicalswift force-pushed on Nov 24, 2020
  29. sidhujag referenced this in commit 3a1441004a on Nov 24, 2020
  30. laanwj commented at 2:28 PM on November 25, 2020: member

    I don't get the CI error. Is charconv non-standard in some way?

    In file included from primitives/transaction.cpp:10:0:
    ./util/strencodings.h:16:10: fatal error: charconv: No such file or directory
     #include <charconv>
              ^~~~~~~~~~
    compilation terminated.
    
  31. practicalswift commented at 3:09 PM on December 1, 2020: contributor

    @laanwj I think the CI failures are from relatively older distros, and that C++17 library support lagged behind C++17 compiler support for a while in C++17 infancy.

    According to cppreference the elementary string conversions (<charconv>) were added in GCC libstdc++ 8, Clang libc++ 7 and MSVC Standard Library 19.14.

  32. MarcoFalke removed the label GUI on Dec 1, 2020
  33. MarcoFalke removed the label P2P on Dec 1, 2020
  34. MarcoFalke added the label Waiting for author on Dec 1, 2020
  35. MarcoFalke commented at 3:51 PM on December 1, 2020: member

    ok, this is blocked on #20460

  36. DrahtBot cross-referenced this on Dec 4, 2020 from issue fuzz: Link all targets once by MarcoFalke
  37. practicalswift commented at 12:57 PM on December 15, 2020: contributor

    Could remove "Waiting for author"? I don't know if we have any "Blocked on another PR/issue" (or "Waiting for toolchain upgrade") tag, but this PR is blocked on #20460 :)

  38. MarcoFalke removed the label Waiting for author on Dec 15, 2020
  39. MarcoFalke added the label Waiting for other on Dec 15, 2020
  40. DrahtBot added the label Needs rebase on Dec 15, 2020
  41. practicalswift force-pushed on Dec 16, 2020
  42. DrahtBot removed the label Needs rebase on Dec 16, 2020
  43. DrahtBot cross-referenced this on Feb 17, 2021 from issue MOVEONLY: CWallet transaction code out of wallet.cpp/.h by ryanofsky
  44. DrahtBot cross-referenced this on Feb 17, 2021 from issue refactor: Make CWalletTx sync state type-safe by ryanofsky
  45. DrahtBot cross-referenced this on Mar 2, 2021 from issue net, refactor: pass uint16 CService::port as uint16 by jonatack
  46. practicalswift force-pushed on Mar 2, 2021
  47. DrahtBot added the label Needs rebase on Mar 19, 2021
  48. practicalswift force-pushed on Mar 19, 2021
  49. DrahtBot removed the label Needs rebase on Mar 19, 2021
  50. practicalswift force-pushed on Mar 21, 2021
  51. DrahtBot cross-referenced this on Apr 11, 2021 from issue refactor: Add ALLOW_LIST flags and enforce usage in CheckArgFlags by ryanofsky
  52. DrahtBot cross-referenced this on Apr 11, 2021 from issue interfaces: Expose settings.json methods to GUI by ryanofsky
  53. DrahtBot cross-referenced this on Apr 12, 2021 from issue util: Fix -norpcwhitelist, -norpcallowip, and similar corner case behavior by ryanofsky
  54. DrahtBot cross-referenced this on Apr 13, 2021 from issue refactor: Remove settings merge reverse precedence code by ryanofsky
  55. DrahtBot cross-referenced this on Apr 13, 2021 from issue util: Forbid ambiguous multiple assignments in config file by ryanofsky
  56. DrahtBot cross-referenced this on Apr 19, 2021 from issue refactor: Move more stuff to blockstorage by MarcoFalke
  57. practicalswift force-pushed on Apr 24, 2021
  58. DrahtBot added the label Needs rebase on May 5, 2021
  59. practicalswift force-pushed on May 12, 2021
  60. practicalswift force-pushed on May 12, 2021
  61. DrahtBot removed the label Needs rebase on May 12, 2021
  62. DrahtBot cross-referenced this on May 27, 2021 from issue Validate port-options by amadeuszpawlik
  63. DrahtBot added the label Needs rebase on May 30, 2021
  64. practicalswift force-pushed on Jun 13, 2021
  65. practicalswift commented at 8:12 AM on June 13, 2021: contributor

    Rebased!

    Could this PR get a 23.0 milestone like the related PR #20457 ("util: Make Parse{Int,UInt}{32,64} use locale independent std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}")? :)

  66. MarcoFalke added this to the milestone 23.0 on Jun 13, 2021
  67. DrahtBot removed the label Needs rebase on Jun 13, 2021
  68. DrahtBot cross-referenced this on Jun 13, 2021 from issue util: make ParseMoney return a std::optional<CAmount> by fanquake
  69. practicalswift force-pushed on Jun 13, 2021
  70. practicalswift force-pushed on Jun 22, 2021
  71. practicalswift force-pushed on Jun 22, 2021
  72. practicalswift force-pushed on Jun 22, 2021
  73. PastaPastaPasta referenced this in commit 46a05ad537 on Jun 27, 2021
  74. PastaPastaPasta referenced this in commit f059da98e6 on Jun 28, 2021
  75. PastaPastaPasta referenced this in commit 840d3332e4 on Jun 29, 2021
  76. practicalswift force-pushed on Jun 30, 2021
  77. PastaPastaPasta referenced this in commit 1c80f26441 on Jul 1, 2021
  78. PastaPastaPasta referenced this in commit dea0bfecd1 on Jul 1, 2021
  79. PastaPastaPasta referenced this in commit 1621639e85 on Jul 15, 2021
  80. practicalswift force-pushed on Aug 15, 2021
  81. DrahtBot added the label Needs rebase on Aug 24, 2021
  82. practicalswift force-pushed on Sep 5, 2021
  83. DrahtBot removed the label Needs rebase on Sep 5, 2021
  84. DrahtBot cross-referenced this on Sep 11, 2021 from issue wallet: refactor: inline functions `{Read,Write}OrderPos` by theStack
  85. DrahtBot added the label Needs rebase on Sep 17, 2021
  86. PastaPastaPasta referenced this in commit 0bb4bb8db0 on Sep 17, 2021
  87. practicalswift force-pushed on Sep 18, 2021
  88. DrahtBot removed the label Needs rebase on Sep 18, 2021
  89. PastaPastaPasta referenced this in commit 4485b3ea20 on Sep 19, 2021
  90. PastaPastaPasta referenced this in commit 40f2d3e692 on Sep 21, 2021
  91. fanquake cross-referenced this on Sep 22, 2021 from issue release: increase minimum compiler and lib(std)c++ requirements by fanquake
  92. PastaPastaPasta referenced this in commit 2328ead832 on Sep 24, 2021
  93. fanquake referenced this in commit 67eae69f3f on Sep 28, 2021
  94. MarcoFalke removed the label Waiting for other on Sep 28, 2021
  95. practicalswift commented at 2:19 PM on September 29, 2021: contributor

    Now that #20460 has been merged I think this PR should be ready for final review :)

    See also related PR #20457.

  96. practicalswift force-pushed on Sep 30, 2021
  97. Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
    test: Add test cases for LocaleIndependentAtoi
    
    fuzz: Assert legacy atoi(s) == LocaleIndependentAtoi<int>(s)
    
    fuzz: Assert legacy atoi64(s) == LocaleIndependentAtoi<int64_t>(s)
    4343f114cc
  98. practicalswift force-pushed on Sep 30, 2021
  99. jonatack commented at 3:33 PM on September 30, 2021: contributor

    Concept ACK

  100. laanwj commented at 4:57 PM on September 30, 2021: member

    Code review ACK 4343f114cc661cf031ec915538c11b9b030e2e15

  101. in src/rpc/mining.cpp:705 in 4343f114cc
     701 | @@ -702,7 +702,7 @@ static RPCHelpMan getblocktemplate()
     702 |              std::string lpstr = lpval.get_str();
     703 |  
     704 |              hashWatchedChain = ParseHashV(lpstr.substr(0, 64), "longpollid");
     705 | -            nTransactionsUpdatedLastLP = atoi64(lpstr.substr(64));
     706 | +            nTransactionsUpdatedLastLP = LocaleIndependentAtoi<int64_t>(lpstr.substr(64));
    


    jonatack commented at 8:20 PM on September 30, 2021:

    nTransactionsUpdatedLastLP is unsigned int; should this be defined or cast to that?


    laanwj commented at 7:07 AM on October 1, 2021:

    Yes it probably needs the same kind of ugly cast as nTimeSmart in the wallet code. In the longer run the field should imo be converted to an explicitly sized type, and parsed as that, but keeping the behavior exactly the same makes sense here to keep it uncontroversial.


    practicalswift commented at 2:19 PM on October 1, 2021:

    @jonatack Good point! I suggest we tackle that in a follow-up PR to keep this PR pure refactoring only :)

  102. jonatack commented at 8:25 PM on September 30, 2021: contributor

    Code review ACK 4343f114cc661cf031ec915538c11b9b030e2e15

  103. in src/util/strencodings.h:76 in 4343f114cc
      73 | -int atoi(const std::string& str);
      74 | +
      75 | +// LocaleIndependentAtoi is provided for backwards compatibility reasons.
      76 | +//
      77 | +// New code should use the ParseInt64/ParseUInt64/ParseInt32/ParseUInt32 functions
      78 | +// which provide parse error feedback.
    


    laanwj commented at 7:01 AM on October 1, 2021:

    This is a really important comment. I hope we can get rid of these functions entirely at some point. The implicit error handling behavior of atoi and atoi64 is pretty much always undesirable. (so I would normally comment "these function names are long and clunky, please shorten them" but it's good in this case, we want using them to be ugly :smile: )


    MarcoFalke commented at 10:17 AM on October 5, 2021:

    The linter says to use ToIntegral, this contradicts by saying to use ParseInt*??


    practicalswift commented at 11:55 AM on October 5, 2021:

    ToIntegral sounds like a good suggestion now that it is part of master :)

  104. laanwj merged this on Oct 4, 2021
  105. laanwj closed this on Oct 4, 2021

  106. sidhujag referenced this in commit 95062277f7 on Oct 4, 2021
  107. in src/core_read.cpp:72 in 4343f114cc
      68 | @@ -69,7 +69,7 @@ CScript ParseScript(const std::string& s)
      69 |              (w->front() == '-' && w->size() > 1 && std::all_of(w->begin()+1, w->end(), ::IsDigit)))
      70 |          {
      71 |              // Number
      72 | -            int64_t n = atoi64(*w);
      73 | +            int64_t n = LocaleIndependentAtoi<int64_t>(*w);
    


    MarcoFalke commented at 10:03 AM on October 5, 2021:

    The condition literally checks for the required format for ParseIntegral<>, so why not use that?


    practicalswift commented at 11:54 AM on October 5, 2021:

    Sounds like a good idea now that ParseIntegral is part of master :)


    MarcoFalke commented at 12:03 PM on October 5, 2021:

    Sorry, I was wrong. This is not a refactor.

  108. in src/qt/rpcconsole.cpp:253 in 4343f114cc
     249 | @@ -250,7 +250,7 @@ bool RPCConsole::RPCParseCommandLine(interfaces::Node* node, std::string &strRes
     250 |                                      for(char argch: curarg)
     251 |                                          if (!IsDigit(argch))
     252 |                                              throw std::runtime_error("Invalid result query");
     253 | -                                    subelement = lastResult[atoi(curarg.c_str())];
     254 | +                                    subelement = lastResult[LocaleIndependentAtoi<int>(curarg)];
    


    MarcoFalke commented at 10:04 AM on October 5, 2021:

    Same


    MarcoFalke commented at 12:03 PM on October 5, 2021:

    Sorry, I was wrong. This is not a refactor.

  109. in src/util/moneystr.cpp:80 in 4343f114cc
      76 | @@ -77,8 +77,7 @@ std::optional<CAmount> ParseMoney(const std::string& money_string)
      77 |          return std::nullopt;
      78 |      if (nUnits < 0 || nUnits > COIN)
      79 |          return std::nullopt;
      80 | -    int64_t nWhole = atoi64(strWhole);
      81 | -
      82 | +    int64_t nWhole = LocaleIndependentAtoi<int64_t>(strWhole);
    


    MarcoFalke commented at 10:36 AM on October 5, 2021:

    same


    MarcoFalke commented at 12:03 PM on October 5, 2021:

    Sorry, I was wrong. This is not a refactor.

  110. MarcoFalke commented at 10:37 AM on October 5, 2021: member

    concept ack

  111. MarcoFalke commented at 11:11 AM on October 5, 2021: member

    Follow up in #23184

  112. MarcoFalke commented at 12:04 PM on October 5, 2021: member

    withdrawn my comments

  113. kwvg referenced this in commit 91f5c0d329 on Oct 12, 2021
  114. jamesob referenced this in commit 5b78055d59 on Dec 22, 2021
  115. jamesob cross-referenced this on Dec 22, 2021 from issue Restore atoi64/GetIntArg saturating behavior by jamesob
  116. jamesob referenced this in commit fc11f8882c on Dec 22, 2021
  117. jamesob referenced this in commit 6d2a15adff on Dec 22, 2021
  118. jamesob referenced this in commit 1d229c783c on Jan 3, 2022
  119. jamesob referenced this in commit ac1a5b113c on Jan 4, 2022
  120. ryanofsky referenced this in commit 4a15ddc19c on Jan 11, 2022
  121. ryanofsky cross-referenced this on Jan 11, 2022 from issue util: Restore GetIntArg saturating behavior by ryanofsky
  122. ryanofsky referenced this in commit 6184764899 on Jan 11, 2022
  123. ryanofsky referenced this in commit d5c0e2ddd8 on Jan 11, 2022
  124. ryanofsky referenced this in commit 389c47a78e on Jan 11, 2022
  125. ryanofsky referenced this in commit 1b0b1ddc42 on Jan 11, 2022
  126. ryanofsky referenced this in commit b5c9bb5cb9 on Jan 12, 2022
  127. MarcoFalke referenced this in commit 290ff5ef6d on Jan 12, 2022
  128. rebroad referenced this in commit 4c76d815b2 on Feb 3, 2022
  129. barton2526 cross-referenced this on Aug 8, 2022 from issue util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17) by barton2526
  130. bitcoin locked this on Aug 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-19 06:53 UTC