Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

email: folding of quoted string in display_name violates RFC #80222

Closed
SwampWalker mannequin opened this issue Feb 19, 2019 · 10 comments · Fixed by #122753
Closed

email: folding of quoted string in display_name violates RFC #80222

SwampWalker mannequin opened this issue Feb 19, 2019 · 10 comments · Fixed by #122753
Labels
stdlib Python modules in the Lib dir topic-email type-security A security issue

Comments

@SwampWalker
Copy link
Mannequin

SwampWalker mannequin commented Feb 19, 2019

BPO 36041
Nosy @warsaw, @bitdancer, @SwampWalker
PRs
  • Fix bpo-36041: fix folding of quoted string in display_name violates RFC #12054
  • Files
  • address_folding_bug.py: Illustration of issue, monkey patch fix of issue, illustration of fix
  • address_folding_bug.py
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-02-19.16:45:39.064>
    labels = ['type-bug', '3.7', 'expert-email']
    title = 'email: folding of quoted string in display_name violates RFC'
    updated_at = <Date 2019-11-20.08:30:29.764>
    user = 'https://github.com/SwampWalker'

    bugs.python.org fields:

    activity = <Date 2019-11-20.08:30:29.764>
    actor = 'ronaldevers'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['email']
    creation = <Date 2019-02-19.16:45:39.064>
    creator = 'aaryn.startmail'
    dependencies = []
    files = ['48155', '48158']
    hgrepos = []
    issue_num = 36041
    keywords = ['patch']
    message_count = 7.0
    messages = ['335975', '336010', '336049', '336093', '336109', '336667', '336668']
    nosy_count = 4.0
    nosy_names = ['barry', 'r.david.murray', 'aaryn.startmail', 'ronaldevers']
    pr_nums = ['12054']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue36041'
    versions = ['Python 3.5', 'Python 3.6', 'Python 3.7']

    Linked PRs

    @SwampWalker
    Copy link
    Mannequin Author

    SwampWalker mannequin commented Feb 19, 2019

    When using a policy for an EmailMessage that triggers folding (during serialization) of a fairly long display_name in an address field, the folding process removes the quotes from the display name breaking the semantics of the field.

    In particular, for a From address's display name like r'anything@anything.com ' + 'a' * MAX_LINE_LEN the folding puts anything@anything.com unquoted immediately after the From: header. For applications that do sender verification inside and then send it to an internal SMTP server that does not perform its own sender verification this could be considered a security issue since it enables sender spoofing. Receiving mail servers might be able to detect the broken header, but experiments show that the mail gets delivered.

    Simple demonstration (reproduced in attachment) of issue:

    SMTP_POLICY = email.policy.default.clone(linesep='\r\n', max_line_length=72)
    address = Address(display_name=r'anything@anything.com ' + 'a' * 72, addr_spec='dev@local.startmail.org')
    
    message = EmailMessage(policy=SMTP_POLICY)
    message['From'] = Address(display_name=display_name, addr_spec=addr_spec)
    
    # Trigger folding (via as_string()), then parse it back in.
    msg_string = message.as_string()
    msg_bytes = msg_string.encode('utf-8')
    msg_deserialized = BytesParser(policy=SMTP_POLICY).parsebytes(msg_bytes)
    
    # Verify badness
    from_hdr = msg_deserialized['From']
    assert from_hdr != str(address)  # But they should be equal...

    @SwampWalker SwampWalker mannequin added 3.7 (EOL) end of life topic-email type-bug An unexpected behavior, bug, or error labels Feb 19, 2019
    @bitdancer
    Copy link
    Member

    Since Address itself renders it correctly (str(address)), the problem is going to take a bit of digging to find. I'm guessing the quoted_string atom is getting transformed incorrectly into something else at some point during the folding.

    @SwampWalker
    Copy link
    Mannequin Author

    SwampWalker mannequin commented Feb 20, 2019

    Hi David, the problem is in email._header_value_parser._refold_parse_tree.

    Specifically, when the parsetree renders too long, it recursively gets newparts = list(part) (the children). When it does this to a BareQuotedString node, the child nodes are unquoted and unescaped and it just happily serializes these.

    I thought I had attached a file that monkey patches the _refold_parse_tree function with a fixed version... let me try again.

    @bitdancer
    Copy link
    Member

    I'm afraid I don't have time to parse through the file you uploaded. Can you produce a pull request or a diff showing your fix? And ideally some added tests :) But whatever you can do is great, if you don't have time maybe someone else will pick it up (I unfortunately don't have time, though I should be able to do a review of a PR).

    @SwampWalker
    Copy link
    Mannequin Author

    SwampWalker mannequin commented Feb 20, 2019

    Sure thing, I'll try to produce something tomorrow.

    @SwampWalker
    Copy link
    Mannequin Author

    SwampWalker mannequin commented Feb 26, 2019

    Sorry about the delay. I opened pull request #12054 for this. Let me know if you need anything else.

    @SwampWalker
    Copy link
    Mannequin Author

    SwampWalker mannequin commented Feb 26, 2019

    Although I am not personally interested in backporting a fix for this issue, anyone that experiences this issue in python 3.5 can execute the following monkey patch to solve the issue:

    def _fix_issue_36041_3_5():
        from email._header_value_parser import QuotedString, ValueTerminal, quote_string
        import email._header_value_parser
    
        class BareQuotedString(QuotedString):
    
            token_type = 'bare-quoted-string'
    
            def __str__(self):
                return quote_string(''.join(str(x) for x in self))
    
            @property
            def value(self):
                return ''.join(str(x) for x in self)
    
            @property
            def parts(self):
                parts = list(self)
                escaped_parts = []
                for part in parts:
                    if isinstance(part, ValueTerminal):
                        escaped = quote_string(str(part))[1:-1]
                        escaped_parts.append(ValueTerminal(escaped, 'ptext'))
                    else:
                        escaped_parts.append(part)
                # Add quotes to the first and last parts.
                escaped_parts[0] = ValueTerminal('"' + str(escaped_parts[0]), 'ptext')
                escaped_parts[-1] = ValueTerminal(str(escaped_parts[-1] + '"'), 'ptext')
                return escaped_parts
    
        email._header_value_parser.BareQuotedString = BareQuotedString

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added the stdlib Python modules in the Lib dir label Nov 23, 2023
    @serhiy-storchaka
    Copy link
    Member

    I came to report a similar issue (discovered during writing tests for #118643). I thing they are the parts of the same problem.

    The address like "<one@example.com>," <two@example.com> can turn into two addresses <one@example.com>, <two@example.com> after re-folding.

    medmunds added a commit to medmunds/cpython that referenced this issue Aug 6, 2024
    Email generators using email.policy.default could incorrectly omit the
    quote ('"') characters from a quoted-string during header refolding,
    leading to invalid address headers and enabling header spoofing. This
    change restores the quote characters on a bare-quoted-string as the
    header is refolded, and escapes backslash and quote chars in the string.
    @medmunds
    Copy link
    Contributor

    medmunds commented Aug 6, 2024

    Still an issue in 3.13. #security-issue.

    @medmunds
    Copy link
    Contributor

    Ping! This security issue in modern email has PR #122753 waiting for a reviewer.

    @picnixz picnixz added type-security A security issue and removed type-bug An unexpected behavior, bug, or error 3.7 (EOL) end of life labels Dec 3, 2024
    bitdancer pushed a commit that referenced this issue Jan 19, 2025
    …122753)
    
    Email generators using email.policy.default could incorrectly omit the
    quote ('"') characters from a quoted-string during header refolding,
    leading to invalid address headers and enabling header spoofing. This
    change restores the quote characters on a bare-quoted-string as the
    header is refolded, and escapes backslash and quote chars in the string.
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jan 19, 2025
    …ing (pythonGH-122753)
    
    Email generators using email.policy.default could incorrectly omit the
    quote ('"') characters from a quoted-string during header refolding,
    leading to invalid address headers and enabling header spoofing. This
    change restores the quote characters on a bare-quoted-string as the
    header is refolded, and escapes backslash and quote chars in the string.
    (cherry picked from commit 5aaf416)
    
    Co-authored-by: Mike Edmunds <medmunds@gmail.com>
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jan 19, 2025
    …ing (pythonGH-122753)
    
    Email generators using email.policy.default could incorrectly omit the
    quote ('"') characters from a quoted-string during header refolding,
    leading to invalid address headers and enabling header spoofing. This
    change restores the quote characters on a bare-quoted-string as the
    header is refolded, and escapes backslash and quote chars in the string.
    (cherry picked from commit 5aaf416)
    
    Co-authored-by: Mike Edmunds <medmunds@gmail.com>
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jan 19, 2025
    …ing (pythonGH-122753)
    
    Email generators using email.policy.default could incorrectly omit the
    quote ('"') characters from a quoted-string during header refolding,
    leading to invalid address headers and enabling header spoofing. This
    change restores the quote characters on a bare-quoted-string as the
    header is refolded, and escapes backslash and quote chars in the string.
    (cherry picked from commit 5aaf416)
    
    Co-authored-by: Mike Edmunds <medmunds@gmail.com>
    bitdancer pushed a commit that referenced this issue Jan 19, 2025
    …ring (GH-122753) (#129007)
    
    gh-80222: Fix email address header folding with long quoted-string (GH-122753)
    
    Email generators using email.policy.default could incorrectly omit the
    quote ('"') characters from a quoted-string during header refolding,
    leading to invalid address headers and enabling header spoofing. This
    change restores the quote characters on a bare-quoted-string as the
    header is refolded, and escapes backslash and quote chars in the string.
    (cherry picked from commit 5aaf416)
    
    Co-authored-by: Mike Edmunds <medmunds@gmail.com>
    bitdancer pushed a commit that referenced this issue Jan 19, 2025
    …ring (GH-122753) (#129008)
    
    gh-80222: Fix email address header folding with long quoted-string (GH-122753)
    
    Email generators using email.policy.default could incorrectly omit the
    quote ('"') characters from a quoted-string during header refolding,
    leading to invalid address headers and enabling header spoofing. This
    change restores the quote characters on a bare-quoted-string as the
    header is refolded, and escapes backslash and quote chars in the string.
    (cherry picked from commit 5aaf416)
    
    Co-authored-by: Mike Edmunds <medmunds@gmail.com>
    srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this issue Jan 21, 2025
    …ing (python#122753)
    
    Email generators using email.policy.default could incorrectly omit the
    quote ('"') characters from a quoted-string during header refolding,
    leading to invalid address headers and enabling header spoofing. This
    change restores the quote characters on a bare-quoted-string as the
    header is refolded, and escapes backslash and quote chars in the string.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir topic-email type-security A security issue
    Projects
    None yet
    Development

    Successfully merging a pull request may close this issue.

    5 participants