base32 algorithm with a basic unit_test #2

DangerousFreedom1984 · 2023-06-06T20:19:23Z

Single-file base32 implementation using the alphabet specified by Jamtis.

UkoeHB · 2023-06-06T22:03:54Z

src/common/base32.cpp

+
+
+}  // namespace base32
+}  // namespace tools


The ends of files need newlines. If you glance over the PR on GitHub you will see a nice symbol complaining about it.

add space

rbrunner7 · 2023-06-09T09:05:17Z

A quick comment about the fact that this PR has now already two commits:

In my experience, the established workflow for PRs, especially small ones like this one, is not to make new commits for it when changing something (for whatever reason, be it review comments or be it on your own), but amend the one commit using git commit --amend and then make a force-push with git push --force.

Commenting on the PR can become difficult with changes distributed over several commits, and the commit history of the branch we merge into becomes longer with no clear advantage.

rbrunner7 · 2023-06-09T11:25:22Z

src/common/base32.cpp

+static constexpr char base32_monero_alphabet[] = {'x', 'm', 'r', 'b', 'a', 's', 'e', '3', '2', 'c', 'd',
+                                                  'f', 'g', 'h', 'i', 'j', 'k', 'n', 'p', 'q', 't', 'u',
+                                                  'w', 'y', '0', '1', '4', '5', '6', '7', '8', '9'};
+//-------------------------------------------------------------------------------------------------------------------


I see here a very good opportunity to explain why we use a different alphabet than almost anybody else for "our" Base32 variant, and mention which characters are missing or different.

That was the agreed alphabet proposed by tevador after some research and discussion: https://gist.github.com/tevador/50160d160d24cfc6c52ae02eb3d17024#35-base32-encoding

Well, yes, I know, and you know, but a reader new to the code and to Seraphis in general will not. Hence my suggestion: why not add some comments?
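As a starting point for such a comment, the characters that the alphabet leaves out can be derived mechanically. The sketch below is not part of the PR: `jamtis_alphabet` is just the PR's `base32_monero_alphabet` written as a string literal, and `missing_symbols` is a hypothetical helper. It only shows which characters are absent; the reasons for the choice are in the linked gist.

```cpp
#include <cstddef>
#include <string>

// The 32 symbols of base32_monero_alphabet from this PR, in index order.
static constexpr char jamtis_alphabet[] = "xmrbase32cdfghijknpqtuwy01456789";
static_assert(sizeof(jamtis_alphabet) - 1 == 32, "base32 needs exactly 32 symbols");

// Hypothetical helper: returns the lowercase letters and digits that are
// NOT part of the alphabet, in the order a-z, 0-9.
std::string missing_symbols()
{
    const std::string alphabet(jamtis_alphabet);
    const std::string candidates = "abcdefghijklmnopqrstuvwxyz0123456789";

    std::string missing;
    for (const char c : candidates)
    {
        if (alphabet.find(c) == std::string::npos)
            missing.push_back(c);
    }
    return missing;  // "lovz": all ten digits are present, four letters are not
}
```

A header comment in base32.cpp could then state that the letters l, o, v and z are not used, and point to the gist for the rationale.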

rbrunner7 · 2023-06-09T11:35:17Z

src/common/base32.cpp

+ *
+ *  Adapted from https://github.com/ahmed-masud/libbase32,
+ *  commit 79761b2b79b0545697945efe0987a8d3004512f9.
+ *  Quite different now.


Quite different now.

I wonder quite a bit about this. Why, explained on a high conceptual level, did the code have to change much? The different alphabet is clear of course, but beyond that?

I have a hard time comparing your code with the forked original anyway, which makes it very hard to quickly decide whether the code is still correct without basically looking at everything. The original is this file, right? https://github.com/ahmed-masud/libbase32/blob/master/base32.c

OK, a misunderstanding on my side: this line is not from you, @DangerousFreedom1984, but was already present in the code that you forked:

Adapted from https://github.com/ahmed-masud/libbase32

As far as I see you don't give info in this PR where you got the code from, unfortunately, but I think I found it here: https://github.com/tplgy/cppcodec/blob/master/cppcodec/detail/base32.hpp , with parts from other files of that repo copied in, to have it all in one file I suppose.

Which still leads me to the question: That file from Ahmed Masud I linked above is wonderfully small, and wonderfully readable. Is it really not fit for purpose?

Yes, I copied most of the code from https://github.com/tplgy/cppcodec/blob/master/cppcodec/detail/base32.hpp as it seems to me the most reliable and efficient base32 library.
The code from Ahmed does not seem to be as efficient as this one as they improved it. I may be wrong though.
Since both are MIT licenses I believe that the correct way to copy is by pasting the exact header in our header file, right?

Since both are MIT licenses I believe that the correct way to copy is by pasting the exact header in our header file, right?

I am a bit unsure myself. The wording of the license is a bit different, maybe just a different iteration. At least we are on the safe side if we copy verbatim, I guess.

rbrunner7 · 2023-06-09T11:46:04Z

src/common/base32.h

+size_t decoded_max_size(size_t encoded_size) noexcept;
+size_t encoded_size(size_t binary_size) noexcept;
+void decode_block(std::string& decoded, const alphabet_index_t* idx);
+void decode_tail(std::string& decoded, const alphabet_index_t* idx, size_t idx_len);


Do you happen to know what "decode the tail" means, and whether we really need this?

Yes, we need it. It correctly decodes the last remaining (partial) block. You can check the idea of the algorithm here:
https://herongyang.com/Encoding/Base32-Encoding-Algorithm.html

I believe that this is the typical PR that needs different implementations, comparisons and performance tests. I don't expect this exact code to be used in the final version, but so far I do believe that it is correct and optimized, and that it will serve our purposes for the moment.
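To make the "tail" concrete: base32 maps 5 input bytes to 8 symbols, so an input whose length is not a multiple of 5 ends in a shorter group. The following sketch assumes an unpadded variant (no trailing '=' characters); the helper names are hypothetical and are not the PR's `encoded_size` or `decode_tail`.

```cpp
#include <cstddef>

// Assumed block sizes for base32: 5 input bytes map to 8 output symbols.
constexpr std::size_t kBinaryBlock = 5;
constexpr std::size_t kEncodedBlock = 8;

// Hypothetical helper: number of symbols in the final, partial group for
// `leftover` bytes (0..4). Each symbol carries 5 bits, so round up.
constexpr std::size_t tail_symbols(std::size_t leftover)
{
    return (leftover * 8 + 4) / 5;  // 0->0, 1->2, 2->4, 3->5, 4->7
}

// Hypothetical helper: total symbols for binary_size bytes without padding.
constexpr std::size_t unpadded_encoded_size(std::size_t binary_size)
{
    return (binary_size / kBinaryBlock) * kEncodedBlock
        + tail_symbols(binary_size % kBinaryBlock);
}
```

Under these assumptions a 4-byte tail yields 7 symbols, which appears consistent with the 7-symbol group `gskwr02` that ends the test vector in tests/unit_tests/base32.cpp.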

UkoeHB · 2023-06-09T20:05:41Z

A quick comment about the fact that this PR has now already two commits:

It is useful to have multiple commits during review, otherwise you need to re-review the entire PR after every update.

rbrunner7 · 2023-06-11T06:02:29Z

It is useful to have multiple commits during review, otherwise you need to re-review the entire PR after every update

It's easy to overlook, but GitHub solves exactly that problem: It gives you a view that shows only the changes resulting from a force-push. This screenshot is from one of my PRs:

The "Compare" link to the right gives you this handy view of the first changes I did based on early review comments.

Here is a PR from @jeffro256 with 4 such after-review-comments force-pushes, here one from @moneromooo-monero with 6, and here one from @j-berman with 4. They all seemed to stick to a single commit over the time the PRs were revised.

vtnerd

Most of these are benign.

vtnerd · 2023-09-10T16:01:26Z

src/common/base32.cpp

+
+    if (alphabet_index_info::is_invalid(*alphabet_index_ptr))
+    {
+        throw std::invalid_argument("decode: Symbol error");


We've typically returned bool in these types of (simple/basic) functions.

Is there any problem with using 'throw'?
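For comparison, a bool-returning decoder along the lines vtnerd describes might look like the sketch below. It is not the PR's decoder: `try_symbol_value` and `try_decode` are hypothetical names, the bit order is assumed to be most-significant-first as in RFC 4648, and leftover padding bits and invalid input lengths are deliberately not checked.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

static constexpr char jamtis_alphabet[] = "xmrbase32cdfghijknpqtuwy01456789";

// Hypothetical helper: looks up the 5-bit value of symbol c.
// Returns false (leaving value_out untouched) if c is not in the alphabet.
bool try_symbol_value(const char c, std::uint8_t &value_out) noexcept
{
    for (std::size_t i = 0; i < sizeof(jamtis_alphabet) - 1; ++i)
    {
        if (jamtis_alphabet[i] == c)
        {
            value_out = static_cast<std::uint8_t>(i);
            return true;
        }
    }
    return false;
}

// Hypothetical decoder: returns false on the first invalid symbol instead of
// throwing. decoded_out is only modified on success.
bool try_decode(const std::string &encoded, std::string &decoded_out)
{
    std::string result;
    std::uint32_t buffer = 0;  // pending bits live in the low end of buffer
    int bits = 0;              // number of pending bits (always < 8 between symbols)

    for (const char c : encoded)
    {
        std::uint8_t value = 0;
        if (!try_symbol_value(c, value))
            return false;

        buffer = (buffer << 5) | value;
        bits += 5;
        if (bits >= 8)
        {
            bits -= 8;
            result.push_back(static_cast<char>((buffer >> bits) & 0xFF));
        }
    }

    decoded_out = std::move(result);
    return true;
}
```

With this shape the caller checks the return value, and a test for invalid characters reduces to asserting that `try_decode` returns false for input containing, say, the letter l.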

vtnerd · 2023-09-10T16:02:19Z

tests/unit_tests/base32.cpp

+
+    ASSERT_EQ(encoded_test, "gskwr0fmgskwr0fmgskwr0fmgskwr0fmgskwr0fmgskwr0fmgskwr0fmgskwr0fmgskwr0fmgskwr0fmgskwr0fmgskwr02");
+
+    base32::decode(encoded_test, recovered_test);


There should be some tests for invalid characters that "return false".

vtnerd · 2023-09-10T16:02:36Z

tests/unit_tests/base32.cpp

+
+TEST(base32, encode_decode) 
+{
+    do_test_simple();


Nitpick, but why have the function at all?

Was dumb. More tests were created.

vtnerd · 2023-09-10T16:03:12Z

src/common/base32.cpp

+                                                  'w', 'y', '0', '1', '4', '5', '6', '7', '8', '9'};
+//-------------------------------------------------------------------------------------------------------------------
+//-------------------------------------------------------------------------------------------------------------------
+static constexpr uint8_t binary_block_size() { return 5; }


These could be constants instead of functions, the compiler doesn't have to reserve static memory for them if the address is never taken, etc.
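A minimal sketch of the two forms side by side, assuming the usual base32 block sizes of 5 bytes and 8 symbols. The `_fn` suffix is only there so that both forms can coexist in one translation unit.

```cpp
#include <cstddef>
#include <cstdint>

// Function form, as in the PR.
static constexpr std::uint8_t binary_block_size_fn() { return 5; }

// Constant form, as suggested in the review.
static constexpr std::size_t binary_block_size = 5;
static constexpr std::size_t encoded_block_size = 8;

// Both forms are usable in constant expressions.
static_assert(binary_block_size_fn() == binary_block_size, "forms must agree");
static_assert(binary_block_size * 8 == encoded_block_size * 5,
              "a block must hold the same number of bits on both sides");
```

Call sites change from `binary_block_size()` to `binary_block_size`, with no other difference in behaviour.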

vtnerd · 2023-09-10T16:04:02Z

src/common/base32.cpp

+}
+//-------------------------------------------------------------------------------------------------------------------
+//-------------------------------------------------------------------------------------------------------------------
+static constexpr char symbol(alphabet_index_t idx) { return base32_monero_alphabet[idx]; }


Nitpick again, but this function isn't really doing much.

vtnerd · 2023-09-10T18:52:00Z

src/common/base32.cpp

+        auto remaining_src_len = src_end - src;
+        if (!remaining_src_len || remaining_src_len >= binary_block_size())
+        {
+            abort();


Is this possible? Seems like a good candidate for assert. Or just throw std::logic_error.

Yeah, abort is too abrupt.
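One way to express the same check without terminating the process is to throw `std::logic_error`, since a bad remaining length here would indicate a bug in the caller's loop rather than bad user input. The helper below is a hypothetical stand-in for the check in the PR, not the PR's code.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical helper mirroring the check in the PR's tail-encoding path.
// At this point 1..4 bytes must remain; anything else means the block loop
// that ran before it is broken.
std::size_t checked_tail_length(const std::size_t remaining)
{
    constexpr std::size_t binary_block_size = 5;
    if (remaining == 0 || remaining >= binary_block_size)
        throw std::logic_error("encode tail: unexpected remaining length");
    return remaining;
}
```

An `assert()` would document the same invariant, but it is compiled out when `NDEBUG` is defined, so the choice depends on whether the check should survive in release builds.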

vtnerd · 2023-09-10T19:09:12Z

src/common/base32.cpp

+    }
+}
+//-------------------------------------------------------------------------------------------------------------------
+void encode(const std::string input, std::string& encoded_out)


const std::string& input or even better epee::span<const std::uint8_t>. It doesn't appear the null termination byte can ever be used in the encoder or decoder.
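To illustrate the by-reference signature, here is a compact encoder sketch. It is not the cppcodec-derived code from the PR: `encode_sketch` is a hypothetical name, no padding is emitted, and bits are consumed most-significant-first as in RFC 4648. `const std::string&` is used instead of `epee::span` only to keep the example self-contained.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

static constexpr char jamtis_alphabet[] = "xmrbase32cdfghijknpqtuwy01456789";

// Hypothetical encoder: takes the input by const reference, so no copy of the
// string is made at the call site.
void encode_sketch(const std::string &input, std::string &encoded_out)
{
    std::string result;
    result.reserve((input.size() * 8 + 4) / 5);

    std::uint32_t buffer = 0;  // pending bits live in the low end of buffer
    int bits = 0;              // number of pending bits (always < 5 between bytes)

    for (const char c : input)
    {
        buffer = (buffer << 8) | static_cast<std::uint8_t>(c);
        bits += 8;
        while (bits >= 5)
        {
            bits -= 5;
            result.push_back(jamtis_alphabet[(buffer >> bits) & 0x1F]);
        }
    }

    // Tail: left-align the remaining 1..4 bits and pad with zero bits.
    if (bits > 0)
        result.push_back(jamtis_alphabet[(buffer << (5 - bits)) & 0x1F]);

    encoded_out = std::move(result);
}
```

Under these assumptions five 'a' bytes encode to `gskwr0fm`, the group that repeats in the PR's test vector, which suggests the PR uses the same bit order.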

vtnerd · 2023-09-10T19:10:33Z

src/common/base32.cpp

+    encode(encoded_out, binary, binary_size);
+}
+//-------------------------------------------------------------------------------------------------------------------
+void decode(const std::string input, std::string& decoded_out)


Again, const std::string& or epee::span<const std::uint8_t>?

vtnerd · 2023-09-10T19:13:32Z

src/common/base32.cpp

+    alphabet_indexes[0] = alphabet_index_info::eof_idx;
+
+    alphabet_index_t* const alphabet_index_start = &alphabet_indexes[0];
+    alphabet_index_t* const alphabet_index_end = &alphabet_indexes[encoded_block_size()];


Perhaps mark alphabet_index const* const for both?

I could do that, but then I would have to use const_cast in this line: alphabet_index_ptr = const_cast<alphabet_index_t*>(alphabet_index_start);
I don't know whether that makes things any safer.
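One possible way around the const_cast, sketched under the assumption that the write cursor can be initialised from the array itself rather than from the start pointer: the bounds become pointers to const, and the cursor stays mutable. `fill_indexes` is a hypothetical stand-in for the PR's loop, and `alphabet_index_t` is assumed to be an 8-bit unsigned type.

```cpp
#include <cstddef>
#include <cstdint>

using alphabet_index_t = std::uint8_t;  // assumed type, for illustration only

// Hypothetical stand-in for the PR's index buffer handling: writes 0..7 into
// the buffer and returns how many slots were written.
std::size_t fill_indexes(alphabet_index_t (&indexes)[8])
{
    // The bounds are read-only views of the buffer.
    const alphabet_index_t *const start = &indexes[0];
    const alphabet_index_t *const end = start + 8;

    // The write cursor comes from the array itself, so no const_cast is needed.
    alphabet_index_t *cursor = &indexes[0];
    while (cursor != end)  // comparing T* with const T* is allowed
    {
        *cursor = static_cast<alphabet_index_t>(cursor - start);
        ++cursor;
    }
    return static_cast<std::size_t>(cursor - start);
}
```

Whether this is worth the change is a judgment call; it only guarantees that nothing writes through `start` or `end`.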

vtnerd · 2023-09-10T19:23:43Z

src/common/base32.cpp

+
+    if (last_index_ptr != alphabet_index_start)
+    {
+        if (alphabet_index_ptr >= alphabet_index_end)


Is this even possible given the logic of the function? An assert() is OK if this should never fail.

Yeah, better.

DangerousFreedom1984 · 2023-09-13T22:54:50Z

Thank you for the review, @vtnerd. I also included @jeffro256's unit tests from PR #6.

base32 algorithm with a basic unit_test

0a9da59

UkoeHB requested changes Jun 6, 2023

Update base32.cpp

e69d685

add space

rbrunner7 reviewed Jun 9, 2023

vtnerd reviewed Sep 10, 2023

jeffro256 mentioned this pull request Sep 10, 2023

common: add Jamtis base32 encoding #6

Merged

fix vtnerd corrections and add jeffro unit_test

a88c7fe

rbrunner7 mentioned this pull request Sep 15, 2023

Choosing between the 2 PRs implementing base32 support seraphis-migration/wallet3#60

Closed

DangerousFreedom1984 closed this Sep 18, 2023
