[stdlib] Implement Python-like base64 encoding/decoding #3513

Closed
hayduker wanted to merge 2 commits



hayduker commented Sep 21, 2024

This PR implements a few features to make Mojo's base64 encoding and decoding conform more closely to Python's implementation:

  • b64decode now ignores whitespace in the input, fixing #3446 ([BUG] b64decode does not handle whitespaces).
  • Both b64encode and b64decode accept an optional altchars argument, which lets the caller substitute two characters for the default + and /. This is useful for URL-safe encoding.
  • b64decode accepts an optional validate argument. When True, the function raises an Error if the input contains any characters outside the alphabet. When False (the default), these characters are ignored, just like whitespace.
  • b64decode now raises an Error if the input's length (after whitespace / invalid character removal) isn't divisible by 4.
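
For reference, here is a minimal sketch of the Python behavior these changes aim to match, using CPython's own base64 module rather than the Mojo code in this PR. The helper raises_binascii_error is a hypothetical name introduced only for this illustration.

```python
import base64
import binascii


def raises_binascii_error(data, **kwargs):
    """Return True if base64.b64decode rejects `data` with binascii.Error."""
    try:
        base64.b64decode(data, **kwargs)
    except binascii.Error:
        return True
    return False


# Whitespace is skipped by default (validate=False).
decoded_with_whitespace = base64.b64decode("QU JD\n")

# altchars replaces the two characters normally written as "+" and "/".
url_safe = base64.b64encode(b"\xfb\xff", altchars=b"-_")
round_trip = base64.b64decode(url_safe, altchars=b"-_")

# validate=True rejects any character outside the alphabet, spaces included.
strict_rejects_space = raises_binascii_error("QU JD", validate=True)

# A length that is not a multiple of 4 (after filtering) is an error.
bad_length_rejected = raises_binascii_error("QUJ")
```

Python signals both failure cases with binascii.Error; the Mojo version described above raises a plain Error instead.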

A note on my "leaky abstraction" comment in b64encode: I tried using String.unsafe_ptr() instead of accessing String._buffer directly, since unsafe_ptr() is what the original code called on the StringLiteral, but it does not return the correct underlying UInts for the characters in the string. I ran into this when switching the alphabet from a StringLiteral alias to a String variable, which I need because the alphabet has to be modified according to the altchars argument.

I'm not sure whether this is a bug or just my misunderstanding of UnsafePointer. Here is a small example that shows what I mean:

def main():
    # Pointer taken from a StringLiteral
    var l = "ABC"
    var lp = l.unsafe_ptr()
    print(lp[0])  # prints 65

    # Pointer taken from a String holding the same text
    var s = String("ABC")
    var sp = s.unsafe_ptr()
    print(sp[0])  # prints 0

If this is a bug and I should open an issue for it, let me know.

hayduker requested a review from a team as a code owner September 21, 2024 00:01

soraros commented Sep 21, 2024

FYI, there is already #3443.

@hayduker

Oh cool, not sure how I missed that. Closing this.

hayduker closed this Sep 21, 2024