[stdlib] Implement Python-like base64 encoding/decoding #3513
This PR implements a few features to make Mojo's base64 encoding and decoding conform more closely to Python's implementation:

- `b64decode` now ignores whitespace in the input, fixing "[BUG] b64decode does not handle whitespaces" #3446.
- `b64encode` and `b64decode` implement an optional `altchars` argument which allows the caller to substitute two characters for the default `+/`, useful for URL-safe encoding.
- `b64decode` implements an optional `validate` argument which, when `True`, will cause the function to raise an `Error` if the input contains any characters outside the alphabet. When `False` (the default), these characters are ignored like whitespace.
- `b64decode` now raises an `Error` if the input's length (after whitespace / invalid character removal) isn't divisible by 4.

Note about my comment about "leaky abstraction" in `b64encode`: I tried using `String.unsafe_ptr()` instead of `String._buffer` directly, since this was the function used for `StringLiteral` originally, but it does not return the correct underlying `UInt`s corresponding to the characters in the string. I ran into this issue when switching from a `StringLiteral` alias to a `String` variable for the alphabet, since I need to modify the alphabet according to the `altchars` argument.

I'm not sure if this is a bug or just my misunderstanding of `UnsafePointer`. Here is a small example that exhibits what I mean:

If this is a bug and I should open an issue for it, let me know.
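
For reference, the Python behavior that the changes described above aim to match can be checked against the standard library. This is only a sketch of Python's own `base64` module, not of the Mojo code in this PR; the helper `raises_binascii_error` and the sample inputs are hypothetical names chosen for illustration.

```python
import base64
import binascii


def raises_binascii_error(data: bytes, **kwargs) -> bool:
    """Hypothetical helper: report whether Python's b64decode rejects the input."""
    try:
        base64.b64decode(data, **kwargs)
    except binascii.Error:
        return True
    return False


# Whitespace is skipped by default (validate=False).
decoded_with_newline = base64.b64decode(b"aGVs\nbG8=")

# altchars replaces the two characters normally used for "+" and "/".
standard = base64.b64encode(b"\xfb\xff")
url_safe = base64.b64encode(b"\xfb\xff", altchars=b"-_")
round_trip = base64.b64decode(url_safe, altchars=b"-_")

# validate=True rejects any character outside the alphabet, whitespace included.
strict_rejects_newline = raises_binascii_error(b"aGVs\nbG8=", validate=True)

# Seven significant characters is not a multiple of 4, so decoding fails.
short_input_rejected = raises_binascii_error(b"aGVsbG8")
```

In Python the failure is reported as `binascii.Error`; the description above says the Mojo functions raise an `Error` in the corresponding cases.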