Remove double serialization; use software encoder for fee estimation #21966

pull sipa wants to merge 6 commits into bitcoin:master from sipa:202105_softfloat changing 11 files +258 −160
  1. sipa commented at 1:38 AM on May 17, 2021: member

    Based on #21981.

    This adds a software-based platform-independent float/double encoder/decoder (platform independent in the sense that it only uses arithmetic and library calls, but never inspects the binary representation). This should strengthen our guarantee that encoded float/double values are portable across platforms. It then removes the functionality to serialize doubles from serialize.h, and replaces its only (non-test) use for fee estimation data serialization with the software encoder.

    At least on x86/ARM, the only difference should be how certain NaN values are encoded/decoded (but not whether they are NaN or not).

    It comes with tests that verify on is_iec559 platforms (which are the only ones we support, at least for now) that the serialized bytes exactly match the binary representation of floats in memory (for non-NaN).

  2. sipa force-pushed on May 17, 2021
  3. sipa force-pushed on May 17, 2021
  4. DrahtBot added the label Build system on May 17, 2021
  5. DrahtBot added the label Utils/log/libs on May 17, 2021
  6. sipa removed the label Build system on May 17, 2021
  7. sipa force-pushed on May 17, 2021
  8. DrahtBot commented at 3:43 AM on May 17, 2021: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #21981 (Remove unused float serialization by MarcoFalke)
    • #21969 (refactor: Switch serialize to uint8_t (Bundle 1/2) by MarcoFalke)
    • #10443 (Add fee_est tool for debugging fee estimation code by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  9. DrahtBot cross-referenced this on May 17, 2021 from issue Document that `ser_float_to_uint32` is not the inverse of `ser_uint32_to_float` by practicalswift
  10. practicalswift commented at 9:40 AM on May 17, 2021: contributor

    Strong Concept ACK

  11. laanwj commented at 10:23 AM on May 17, 2021: member

    While I think this is intriguing technically, I'm ~0 on this conceptually

    I think using floating point in places where 100% precision or portability across platforms is important is mistaken in the first place. This gives the wrong impression.

    It raises deeper questions for me: do we really need floating point in Bitcoin at all? Narrower: do we really need to serialize/deserialize it? (clearly it's already not used for anything consensus critical)

  12. MarcoFalke commented at 10:28 AM on May 17, 2021: member

    Haven't checked, but I presume it is only used for fees.dat?

  13. laanwj commented at 10:42 AM on May 17, 2021: member

    If so, I'd prefer to find an alternative way of serializing those values specifically (as fixed-point or numerator / denominator pairs of integers) and dropping the general float/double (de)serialization.

  14. practicalswift commented at 2:08 PM on May 17, 2021: contributor

    Narrower: do we really need to serialize/deserialize it?

    As MarcoFalke discovered ser_float_to_uint32 and ser_uint32_to_float are currently unused (#21981), and AFAICT TxConfirmStats::Write and TxConfirmStats::Read are the only remaining users of ser_double_to_uint64 and ser_uint64_to_double.

    Looks promising! :)

  15. DrahtBot cross-referenced this on May 17, 2021 from issue Remove unused float serialization by MarcoFalke
  16. sipa commented at 6:10 PM on May 17, 2021: member

    @laanwj That"s fair. I agree that avoiding serialization of floating point values directly is much more desirable.

    My thinking here is that it effectively gives us a well-specified serialization with testable properties without needing to break compatibility with existing files.

    How about rebasing this PR on top of #21981, and making it remove serialization support for double too in serialization.h, and instead make the feedata writing/reading code invoke EncodeDouble/DecodeDouble directly?

  17. laanwj commented at 7:39 AM on May 18, 2021: member

    How about rebasing this PR on top of #21981, and making it remove serialization support for double too in serialization.h, and instead make the feedata writing/reading code invoke EncodeDouble/DecodeDouble directly?

    I was about to comment this! Let's make this code private to the single case where it is used? It keeps the current format but also prevents future serialization of these types :smile:

  18. sipa force-pushed on May 18, 2021
  19. sipa renamed this:
    Software float encoding
    Remove double serialization; use software encoder for fee estimation
    on May 18, 2021
  20. sipa force-pushed on May 18, 2021
  21. sipa commented at 7:55 PM on May 18, 2021: member

    @laanwj Done.

  22. practicalswift commented at 10:06 PM on May 18, 2021: contributor

    cr ACK 892522deba52f35fe5b08e2d7aebeb80600f9cdc: patch looks correct :)

    Very happy to see ser_double_to_uint64/ser_uint64_to_double go and src/compat/assumptions.h shrink :)

  23. DrahtBot cross-referenced this on May 19, 2021 from issue refactor: Switch serialize to uint8_t (Bundle 1/2) by MarcoFalke
  24. laanwj commented at 8:23 AM on May 19, 2021: member

    Thanks! Happy to see this now.

    Code review ACK 892522deba52f35fe5b08e2d7aebeb80600f9cdc Code review re-ACK 66545da2008cd9e806e41b74522ded259cd64f86

  25. in src/test/serfloat_tests.cpp:31 in ae87a8aa4f outdated
      23 | +    } else {
      24 | +        // Everything else is.
      25 | +        BOOST_CHECK(!std::isnan(f2));
      26 | +        uint64_t i2 = EncodeDouble(f2);
      27 | +        BOOST_CHECK_EQUAL(f, f2);
      28 | +        BOOST_CHECK_EQUAL(i, i2);
    


    MarcoFalke commented at 6:15 AM on May 20, 2021:

    would it make sense to also assert equality with the "legacy" serialization helpers in both this unit test and the fuzz test?

    Otherwise we might run into a corrupted fees.dat after an upgrade. A problem worse than the problem this pull is trying to solve (making fees.dat architecture independent).


    MarcoFalke commented at 6:20 AM on May 20, 2021:

    Oh sorry, I see that is what the "should produce exactly the same as the in-memory representation for non-NaN" test is doing.


    MarcoFalke commented at 6:23 AM on May 20, 2021:

    Though, it looks like the fuzz test isn't fuzzing the in-memory representation, only the round-trip?


    sipa commented at 11:15 PM on May 24, 2021:

    Added back.


    MarcoFalke commented at 10:22 AM on June 7, 2021:

    Ah thanks, though the fuzzed_data_provider.ConsumeFloatingPoint only consumes "nice" floats, not odd ones. So the fuzzer is only testing the happy path.


    MarcoFalke commented at 11:56 AM on June 7, 2021:

    Fixed in #22180

  26. in src/test/serfloat_tests.cpp:41 in ae87a8aa4f outdated
      36 | +    BOOST_CHECK_EQUAL(TestDouble(0.0), 0);
      37 | +    BOOST_CHECK_EQUAL(TestDouble(-0.0), 0x8000000000000000);
      38 | +    BOOST_CHECK_EQUAL(TestDouble(std::numeric_limits<double>::infinity()), 0x7ff0000000000000);
      39 | +    BOOST_CHECK_EQUAL(TestDouble(-std::numeric_limits<double>::infinity()), 0xfff0000000000000);
      40 | +
      41 | +    if (std::numeric_limits<float>::is_iec559) {
    


    MarcoFalke commented at 6:24 AM on May 20, 2021:

    Any reason to check for float, but then test double?


    sipa commented at 11:15 PM on May 24, 2021:

    The technical term for this would be an "oops". Fixed.

  27. Remove unused float serialization e40224d0c7
  28. Add platform-independent float encoder/decoder 2be4cd94f4
  29. Add unit tests for serfloat module bda33f98e2
  30. Convert existing float encoding tests afd964d70b
  31. Convert uses of double-serialization to {En,De}codeDouble fff1cae43a
  32. Remove support for double serialization 66545da200
  33. sipa force-pushed on May 24, 2021
  34. in src/test/fuzz/float.cpp:24 in 66545da200
      37 | -        stream << f;
      38 | -        float f_deserialized;
      39 | -        stream >> f_deserialized;
      40 | -        assert(f == f_deserialized);
      41 | +        uint64_t encoded = EncodeDouble(d);
      42 | +        if constexpr (std::numeric_limits<double>::is_iec559) {
    


    practicalswift commented at 7:08 PM on May 25, 2021:

    Note to other reviewers: We assume this to hold true in assumptions.h:

    static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 double assumed");
    
  35. in src/test/serfloat_tests.cpp:50 in 66545da200
      45 | +    BOOST_CHECK_EQUAL(TestDouble(2.0), 0x4000000000000000ULL);
      46 | +    BOOST_CHECK_EQUAL(TestDouble(4.0), 0x4010000000000000ULL);
      47 | +    BOOST_CHECK_EQUAL(TestDouble(785.066650390625), 0x4088888880000000ULL);
      48 | +
      49 | +    // Roundtrip test on IEC559-compatible systems
      50 | +    if (std::numeric_limits<double>::is_iec559) {
    


    practicalswift commented at 7:08 PM on May 25, 2021:

    Note to other reviewers: We assume this to hold true in assumptions.h:

    static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 double assumed");
    

    sipa commented at 7:11 PM on May 25, 2021:

    I should remove that assumption now, it's no longer needed.


    practicalswift commented at 7:30 PM on May 25, 2021:

    I think we need that assumption as long as we're doing floating-point division by zero? I still think we do that in ConnectBlock, CreateTransaction and EstimateMedianVal :)


    sipa commented at 7:35 PM on May 25, 2021:

    Ah, good point.


    laanwj commented at 7:52 AM on May 26, 2021:

    Maybe a weaker assumption could do there. In any case, it's out of scope for this PR. Happy to leave it as it is for the foreseeable future unless anyone has good reason to work on porting bitcoind to non-IEC559 platforms.

    (in which case I heartily suggest: please get rid of all floating point code. it shouldn't be necessary in financial-ish code)

  36. practicalswift commented at 7:09 PM on May 25, 2021: contributor

    cr re-ACK 66545da2008cd9e806e41b74522ded259cd64f86

  37. laanwj merged this on May 26, 2021
  38. laanwj closed this on May 26, 2021

  39. DrahtBot cross-referenced this on May 26, 2021 from issue Add fee_est tool for debugging fee estimation code by ryanofsky
  40. sidhujag referenced this in commit efa10235fa on May 27, 2021
  41. MarcoFalke commented at 11:36 AM on June 7, 2021: member

    review ACK

  42. kwvg referenced this in commit 15c67f4fef on Jun 16, 2021
  43. kwvg referenced this in commit c8cd8bfcaa on Jun 16, 2021
  44. kwvg cross-referenced this on Jun 16, 2021 from issue partial bitcoin#15638, #21966, #16889, merge #14555, #20499, #14074, #17073: util refactoring by kwvg
  45. kwvg referenced this in commit cf8d5a958b on Jun 16, 2021
  46. kwvg referenced this in commit 43f1909376 on Jun 16, 2021
  47. kwvg referenced this in commit f42dcbbfb1 on Jun 24, 2021
  48. kwvg referenced this in commit d129516f22 on Jun 25, 2021
  49. kwvg referenced this in commit e8d09709c9 on Jun 25, 2021
  50. kwvg referenced this in commit eecf91cd44 on Jun 26, 2021
  51. kwvg referenced this in commit 441a37f26b on Jun 26, 2021
  52. kwvg referenced this in commit 7d319145ab on Jun 27, 2021
  53. kwvg referenced this in commit f946c68f83 on Jun 27, 2021
  54. UdjinM6 referenced this in commit 7d664c7c53 on Jun 27, 2021
  55. gades referenced this in commit 4915046193 on May 1, 2022
  56. gwillen referenced this in commit 95ce7de90f on Jun 1, 2022
  57. bitcoin locked this on Aug 18, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-20 06:54 UTC