-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
email: folding of quoted string in display_name violates RFC #80222
Comments
When using a policy for an EmailMessage that triggers folding (during serialization) of a fairly long display_name in an address field, the folding process removes the quotes from the display name breaking the semantics of the field. In particular, for a From address's display name like r'anything@anything.com ' + 'a' * MAX_LINE_LEN the folding puts anything@anything.com unquoted immediately after the From: header. For applications that do sender verification inside and then send it to an internal SMTP server that does not perform its own sender verification this could be considered a security issue since it enables sender spoofing. Receiving mail servers might be able to detect the broken header, but experiments show that the mail gets delivered. Simple demonstration (reproduced in attachment) of issue: SMTP_POLICY = email.policy.default.clone(linesep='\r\n', max_line_length=72)
address = Address(display_name=r'anything@anything.com ' + 'a' * 72, addr_spec='dev@local.startmail.org')
message = EmailMessage(policy=SMTP_POLICY)
message['From'] = Address(display_name=display_name, addr_spec=addr_spec)
# Trigger folding (via as_string()), then parse it back in.
msg_string = message.as_string()
msg_bytes = msg_string.encode('utf-8')
msg_deserialized = BytesParser(policy=SMTP_POLICY).parsebytes(msg_bytes)
# Verify badness
from_hdr = msg_deserialized['From']
assert from_hdr != str(address) # But they should be equal... |
Since Address itself renders it correctly (str(address)), the problem is going to take a bit of digging to find. I'm guessing the quoted_string atom is getting transformed incorrectly into something else at some point during the folding. |
Hi David, the problem is in email._header_value_parser._refold_parse_tree. Specifically, when the parsetree renders too long, it recursively gets newparts = list(part) (the children). When it does this to a BareQuotedString node, the child nodes are unquoted and unescaped and it just happily serializes these. I thought I had attached a file that monkey patches the _refold_parse_tree function with a fixed version... let me try again. |
I'm afraid I don't have time to parse through the file you uploaded. Can you produce a pull request or a diff showing your fix? And ideally some added tests :) But whatever you can do is great, if you don't have time maybe someone else will pick it up (I unfortunately don't have time, though I should be able to do a review of a PR). |
Sure thing, I'll try to produce something tomorrow. |
Sorry about the delay. I opened pull request #12054 for this. Let me know if you need anything else. |
Although I am not personally interested in backporting a fix for this issue, anyone that experiences this issue in python 3.5 can execute the following monkey patch to solve the issue: def _fix_issue_36041_3_5():
from email._header_value_parser import QuotedString, ValueTerminal, quote_string
import email._header_value_parser
class BareQuotedString(QuotedString):
token_type = 'bare-quoted-string'
def __str__(self):
return quote_string(''.join(str(x) for x in self))
@property
def value(self):
return ''.join(str(x) for x in self)
@property
def parts(self):
parts = list(self)
escaped_parts = []
for part in parts:
if isinstance(part, ValueTerminal):
escaped = quote_string(str(part))[1:-1]
escaped_parts.append(ValueTerminal(escaped, 'ptext'))
else:
escaped_parts.append(part)
# Add quotes to the first and last parts.
escaped_parts[0] = ValueTerminal('"' + str(escaped_parts[0]), 'ptext')
escaped_parts[-1] = ValueTerminal(str(escaped_parts[-1] + '"'), 'ptext')
return escaped_parts
email._header_value_parser.BareQuotedString = BareQuotedString |
I came to report a similar issue (discovered during writing tests for #118643). I thing they are the parts of the same problem. The address like |
Email generators using email.policy.default could incorrectly omit the quote ('"') characters from a quoted-string during header refolding, leading to invalid address headers and enabling header spoofing. This change restores the quote characters on a bare-quoted-string as the header is refolded, and escapes backslash and quote chars in the string.
Still an issue in 3.13. #security-issue. |
…122753) Email generators using email.policy.default could incorrectly omit the quote ('"') characters from a quoted-string during header refolding, leading to invalid address headers and enabling header spoofing. This change restores the quote characters on a bare-quoted-string as the header is refolded, and escapes backslash and quote chars in the string.
…ing (pythonGH-122753) Email generators using email.policy.default could incorrectly omit the quote ('"') characters from a quoted-string during header refolding, leading to invalid address headers and enabling header spoofing. This change restores the quote characters on a bare-quoted-string as the header is refolded, and escapes backslash and quote chars in the string. (cherry picked from commit 5aaf416) Co-authored-by: Mike Edmunds <medmunds@gmail.com>
…ing (pythonGH-122753) Email generators using email.policy.default could incorrectly omit the quote ('"') characters from a quoted-string during header refolding, leading to invalid address headers and enabling header spoofing. This change restores the quote characters on a bare-quoted-string as the header is refolded, and escapes backslash and quote chars in the string. (cherry picked from commit 5aaf416) Co-authored-by: Mike Edmunds <medmunds@gmail.com>
…ing (pythonGH-122753) Email generators using email.policy.default could incorrectly omit the quote ('"') characters from a quoted-string during header refolding, leading to invalid address headers and enabling header spoofing. This change restores the quote characters on a bare-quoted-string as the header is refolded, and escapes backslash and quote chars in the string. (cherry picked from commit 5aaf416) Co-authored-by: Mike Edmunds <medmunds@gmail.com>
…ring (GH-122753) (#129007) gh-80222: Fix email address header folding with long quoted-string (GH-122753) Email generators using email.policy.default could incorrectly omit the quote ('"') characters from a quoted-string during header refolding, leading to invalid address headers and enabling header spoofing. This change restores the quote characters on a bare-quoted-string as the header is refolded, and escapes backslash and quote chars in the string. (cherry picked from commit 5aaf416) Co-authored-by: Mike Edmunds <medmunds@gmail.com>
…ring (GH-122753) (#129008) gh-80222: Fix email address header folding with long quoted-string (GH-122753) Email generators using email.policy.default could incorrectly omit the quote ('"') characters from a quoted-string during header refolding, leading to invalid address headers and enabling header spoofing. This change restores the quote characters on a bare-quoted-string as the header is refolded, and escapes backslash and quote chars in the string. (cherry picked from commit 5aaf416) Co-authored-by: Mike Edmunds <medmunds@gmail.com>
…ing (python#122753) Email generators using email.policy.default could incorrectly omit the quote ('"') characters from a quoted-string during header refolding, leading to invalid address headers and enabling header spoofing. This change restores the quote characters on a bare-quoted-string as the header is refolded, and escapes backslash and quote chars in the string.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
Linked PRs
The text was updated successfully, but these errors were encountered: