Option to skip padding for base64 urlsafe encoding/decoding #73613

Open
Thorney mannequin opened this issue Feb 3, 2017 · 4 comments
Labels
3.7 (EOL): end of life; stdlib: Python modules in the Lib dir; type-bug: An unexpected behavior, bug, or error

Comments

@Thorney
Mannequin

Thorney mannequin commented Feb 3, 2017

BPO 29427
Nosy @malemburg, @loewis, @birkenfeld, @djc, @bitdancer, @mayankasthana, @puxlit, @FranklinYu, @uliludmann
PRs
  • bpo-29427: allow unpadded input and ouput in base64 module #7072
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2017-02-03.01:25:17.242>
    labels = ['3.7', 'type-bug', 'library']
    title = 'Option to skip padding for base64 urlsafe encoding/decoding'
    updated_at = <Date 2019-12-03.13:54:08.386>
    user = 'https://bugs.python.org/Thorney'

    bugs.python.org fields:

    activity = <Date 2019-12-03.13:54:08.386>
    actor = 'uliludmann'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2017-02-03.01:25:17.242>
    creator = 'Thorney'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 29427
    keywords = ['patch']
    message_count = 3.0
    messages = ['286837', '299726', '299819']
    nosy_count = 11.0
    nosy_names = ['lemburg', 'loewis', 'georg.brandl', 'nneonneo', 'djc', 'Thorney', 'r.david.murray', 'masthana', 'puxlit', 'Franklin Yu', 'uliludmann']
    pr_nums = ['7072']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue29427'
    versions = ['Python 3.6', 'Python 3.7']

    @Thorney
    Mannequin Author

    Thorney mannequin commented Feb 3, 2017

    I suggest changing the base64 module to better handle encoding schemes that don't use padding.
    RFC 4648 [1] allows other specifications that build on its base64url encoding to explicitly stipulate that padding is omitted. Dropping the padding is lossless when the length of the data is known [2].
    Various standards require this, often crypto-related ones (e.g., JWS [3] or named hashes [4]).
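
To illustrate why dropping the padding loses nothing, here is a minimal sketch using only the existing `base64` module. The helper name `restore_padding` is hypothetical and not part of the standard library; it assumes the input contains only base64url alphabet characters.

```python
import base64


def restore_padding(unpadded: bytes) -> bytes:
    """Re-append the '=' characters implied by the input length.

    Hypothetical helper for illustration only.
    """
    if len(unpadded) % 4 == 1:
        # A single leftover character carries only 6 bits, less than one byte,
        # so no valid encoding has this length.
        raise ValueError("length 1 mod 4 is never valid base64")
    return unpadded + b"=" * (-len(unpadded) % 4)


# Payload sizes 0..5 produce 0, 2, 3, 4, 6, 7 unpadded characters, so the
# number of stripped '=' signs can always be recovered from the length alone.
for n in range(6):
    data = bytes(range(n))
    padded = base64.urlsafe_b64encode(data)
    stripped = padded.rstrip(b"=")
    assert restore_padding(stripped) == padded
    assert base64.urlsafe_b64decode(restore_padding(stripped)) == data
```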

    RFC 4648 specifically makes an exception for this case, and it should be better supported in Python's standard library. A related closed issue [5] asked for the padding to be removed or altered, which would not comply with the spec. This request is different: it aims to better support the wider specification.

    Proposed behaviour, adapted from the resolution of a Ruby discussion on the same topic [6]:

    • base64.urlsafe_b64encode(s) should continue to produce padded output, but have an additional argument, padding, which defaults to True.
    • base64.urlsafe_b64decode(s) should accept both padded and unpadded inputs. It can still reject incorrectly-padded input.
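
The two bullets above could look roughly like the following sketch. It is not the patch from the linked PR; the `_proposed` function names are hypothetical wrappers around the existing functions, and the decoder assumes the input contains only alphabet characters and `=`.

```python
import base64
import binascii


def urlsafe_b64encode_proposed(s: bytes, padding: bool = True) -> bytes:
    """Encode as today, optionally dropping the trailing '=' characters."""
    encoded = base64.urlsafe_b64encode(s)
    return encoded if padding else encoded.rstrip(b"=")


def urlsafe_b64decode_proposed(s) -> bytes:
    """Accept fully padded or unpadded input; reject anything in between."""
    if isinstance(s, str):
        s = s.encode("ascii")
    stripped = s.rstrip(b"=")
    given = len(s) - len(stripped)      # '=' characters actually supplied
    expected = -len(stripped) % 4       # '=' characters a padded form needs
    if len(stripped) % 4 == 1 or given not in (0, expected):
        raise binascii.Error("Incorrect padding")
    return base64.urlsafe_b64decode(stripped + b"=" * expected)
```

With this sketch, `b"YQ"` and `b"YQ=="` both decode to `b"a"`, while the partially padded `b"YQ="` is still rejected.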

    If that sounds sensible I'd like to put a patch/PR together.

    From Wikipedia [7]:

    Some variants allow or require omitting the padding '=' signs to avoid them being confused with field separators, or require that any such padding be percent-encoded. Some libraries will encode '=' to '.'.

    Thorney (mannequin) added the 3.7 (EOL), stdlib, and type-bug labels on Feb 3, 2017
    @nneonneo
    Mannequin

    nneonneo mannequin commented Aug 4, 2017

    This sounds reasonable. I ran into a similar issue today trying to decode a JSON Web Key. Although I don't have any real say, I'd say that if you put together a patch it may have a higher chance to get reviewed.

    I wonder about the following:

    • What about adding a new kwarg to b64decode, passed through by urlsafe_b64decode, called "checkpad=True", which validates padding? Then we can just set it to False when needed.
    • At the same time it might be nice to pass "validate=False" through from urlsafe_b64decode and friends, so we can have some nicer validation of data.
    • I like adding the "padding=True" arg to encode, but it may not be necessary given the ease of ".rstrip('=')" as an alternative. Anyway, if you will add it to encode, please add it to b64encode and pass through from the variant encoders to unify the API somewhat.
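
A rough sketch of the pass-through idea from these bullets, assuming a keyword-only `padding` argument on the encoder. All `_sketch` names are hypothetical; only `base64.b64encode`, `base64.b64decode`, and their `altchars` and `validate` parameters are existing API. Note that the encoders return `bytes`, so the strip alternative has to be `.rstrip(b"=")` rather than `.rstrip('=')`.

```python
import base64
import binascii


def b64encode_sketch(s: bytes, altchars=None, *, padding: bool = True) -> bytes:
    """b64encode with an extra keyword that drops trailing '=' characters."""
    encoded = base64.b64encode(s, altchars)
    return encoded if padding else encoded.rstrip(b"=")


def urlsafe_b64encode_sketch(s: bytes, *, padding: bool = True) -> bytes:
    """Variant encoder that simply passes `padding` through."""
    return b64encode_sketch(s, altchars=b"-_", padding=padding)


def urlsafe_b64decode_validating(s) -> bytes:
    """URL-safe decode that passes validate=True through to b64decode."""
    return base64.b64decode(s, altchars=b"-_", validate=True)


# The one-line alternative mentioned above, written for bytes output:
token = base64.urlsafe_b64encode(b"\xfb\xff").rstrip(b"=")
```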

    If you are still interested in putting together a patch, post a comment. Otherwise I may work on a patch for this.

    @Thorney
    Mannequin Author

    Thorney mannequin commented Aug 6, 2017

    Hi Robert, It would be at least a week or two before I could take another look at this so please feel free to work on it. Not sure why I didn't write a patch at the time!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @alonbl

    alonbl commented Jul 2, 2022

    Hi,

    What is the status of this issue? Many applications require URL-safe base64 encoding, such as JOSE (JWT), and they either need ugly workarounds because of this bug or fail outright.

    From RFC 4648, section 5 (https://www.rfc-editor.org/rfc/rfc4648#section-5):

    """
    The pad character "=" is typically percent-encoded when used in an
    URI [9], but if the data length is known implicitly, this can be
    avoided by skipping the padding; see section 3.2.
    """

    A Python bytes/str object always knows its own length, so it meets the RFC's definition of "known implicitly".

    Please support this behavior per the RFC. There is no need for an additional parameter; the decoder only has to read up to the end of the buffer.

    Regards,
