does bencode dictionary allow duplicated keys? #153

Open
trim21 opened this issue Apr 14, 2024 · 8 comments

Comments

@trim21

trim21 commented Apr 14, 2024

Dictionaries are encoded as a 'd' followed by a list of alternating keys and their corresponding values followed by an 'e'. For example, d3:cow3:moo4:spam4:eggse corresponds to {'cow': 'moo', 'spam': 'eggs'} and d4:spaml1:a1:bee corresponds to {'spam': ['a', 'b']}. Keys must be strings and appear in sorted order (sorted as raw strings, not alphanumerics).

For example: d3:keyi1e3:keyi2ee, which would correspond to {"key": 1, b"key": 2} in Python.
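To make the question concrete, here is a minimal sketch of a bencode decoder in Python. It is not taken from any BEP or existing library; the function name `decode` is hypothetical, and it does no validation. It returns dictionaries as a list of (key, value) pairs so that a repeated key stays visible instead of being silently merged.

```python
def decode(data: bytes, pos: int = 0):
    """Decode one bencoded value starting at pos; return (value, next_pos).

    Dictionaries come back as a list of (key, value) pairs, not a dict,
    so duplicate keys are preserved. No validation is performed.
    """
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        pairs = []
        while data[pos:pos + 1] != b"e":
            key, pos = decode(data, pos)
            value, pos = decode(data, pos)
            pairs.append((key, value))
        return pairs, pos + 1
    # byte string: <length>:<bytes>
    colon = data.index(b":", pos)
    length = int(data[pos:colon])
    start = colon + 1
    return data[start:start + length], start + length


pairs, _ = decode(b"d3:keyi1e3:keyi2ee")
print(pairs)  # [(b'key', 1), (b'key', 2)]
```

A decoder that builds a real `dict` from these pairs would keep only one of the two values, which is why the validity of such input matters.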

@trim21
Author

trim21 commented May 29, 2024

Apparently a dictionary should not have duplicate keys; otherwise, after sorting, the same content can have different encodings.

For example, {"key": 1, b"key": 2} can be encoded as either d3:keyi1e3:keyi2ee or d3:keyi2e3:keyi1ee.
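The ambiguity can be reproduced with a naive encoder. The sketch below is an assumption for illustration, not a real library: `encode_naive` is a hypothetical name, and it encodes `str` keys as UTF-8 before sorting. Because Python's sort is stable, two keys that become the same bytes keep their insertion order, so two dicts that compare equal produce different output.

```python
def encode_naive(value) -> bytes:
    """Bencode ints, str/bytes, lists and dicts without rejecting duplicates."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(encode_naive(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Normalise every key to bytes, then sort by the raw bytes.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # stable: equal keys keep their order
        body = b"".join(encode_naive(k) + encode_naive(v) for k, v in items)
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


first = {"key": 1, b"key": 2}
second = {b"key": 2, "key": 1}

print(first == second)        # True: same content as Python dicts
print(encode_naive(first))    # b'd3:keyi1e3:keyi2ee'
print(encode_naive(second))   # b'd3:keyi2e3:keyi1ee'
```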

@the8472
Contributor

the8472 commented May 29, 2024

bencode itself has no concept of encoding, that's an artifact of the language you're using.

BEP 52 clarifies this.

@trim21
Author

trim21 commented May 29, 2024

> bencode itself has no concept of encoding, that's an artifact of the language you're using.
>
> BEP 52 clarifies this.

That's why I'm asking: it doesn't clarify whether d3:keyi1e3:keyi2ee is valid bencode content.

@trim21
Author

trim21 commented May 29, 2024

> bencode itself has no concept of encoding, that's an artifact of the language you're using.
>
> BEP 52 clarifies this.

Encoding itself is a concept of the programming language, but the bencode content itself is not.

@the8472
Contributor

the8472 commented May 29, 2024

> {"key": 1, b"key": 2}

What I mean is the type distinction between string and binary is something that exists in the language. If that didn't exist you couldn't have duplicates there either.

But yes, the word "unique" could be inserted somewhere.

@trim21
Author

trim21 commented May 29, 2024

> {"key": 1, b"key": 2}
>
> What I mean is the type distinction between string and binary is something that exists in the language. If that didn't exist you couldn't have duplicates there either.
>
> But yes, the word "unique" could be inserted somewhere.

The problem here is just like a URL query string: it is also made of key-value pairs, but it allows duplicate keys. Just being key-value pairs is not enough.
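The URL query comparison can be checked with Python's standard library. This is only an illustration of the analogy: `urllib.parse` keeps every value of a repeated key instead of treating the input as invalid.

```python
from urllib.parse import parse_qs, parse_qsl

query = "key=1&key=2"

# parse_qsl keeps the pairs in order, duplicates included.
print(parse_qsl(query))  # [('key', '1'), ('key', '2')]

# parse_qs groups all values of a repeated key into one list.
print(parse_qs(query))   # {'key': ['1', '2']}
```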

@the8472
Contributor

the8472 commented May 29, 2024

The spec was written by python developers (python 2 back then) who I figure understood dictionary to mean unique keys.

And yes, other implementations also require unique keys.
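One way a strict decoder could enforce this is to require that the keys of each dictionary are strictly ascending as raw bytes, which covers both the sorted-order rule and uniqueness in a single pass. The sketch below is an assumption about how such a check might look, not the code of any particular implementation; `check_dict_keys` is a hypothetical helper.

```python
def check_dict_keys(keys: list) -> None:
    """Raise ValueError unless keys (bytes) are strictly ascending.

    Python compares bytes by raw byte value, which matches the
    "sorted as raw strings" wording quoted at the top of this issue.
    """
    for prev, cur in zip(keys, keys[1:]):
        if cur == prev:
            raise ValueError(f"duplicate key: {cur!r}")
        if cur < prev:
            raise ValueError(f"keys out of order: {prev!r} before {cur!r}")


check_dict_keys([b"cow", b"spam"])      # passes silently

try:
    check_dict_keys([b"key", b"key"])   # keys of d3:keyi1e3:keyi2ee
except ValueError as exc:
    print(exc)  # duplicate key: b'key'
```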

@trim21
Author

trim21 commented May 29, 2024

> The spec was written by python developers (python 2 back then) who I figure understood dictionary to mean unique keys.
>
> And yes, other implementations also require unique keys.

Thanks for the clarification. I just sent a PR to document this; I hope it gets merged.

(I don't know python 2 well but it has a unicode type I think?)

jasonaowen added a commit to jasonaowen/bittorrent.org that referenced this issue Jan 24, 2025
    Document that duplicated keys in bencode dictionary are not allowed

    bittorrent#153

    Here d3:keyi1e3:keyi2ee and d3:keyi2e3:keyi1ee are the same dictionary but
    can have different bencode content.

    This may confuse info_hash generators: saving the torrent could result
    in a different info_hash.

bittorrent#154
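
The info_hash concern from the commit message can be sketched as follows. In BitTorrent v1 the info_hash is the SHA-1 of the bencoded info dictionary, so two byte-level encodings of the "same" dictionary hash differently. The two byte strings below are the examples from this thread, used as stand-ins for a real info dictionary.

```python
import hashlib

encoding_a = b"d3:keyi1e3:keyi2ee"
encoding_b = b"d3:keyi2e3:keyi1ee"

# The hash is computed over the raw bencoded bytes, not the decoded content.
hash_a = hashlib.sha1(encoding_a).hexdigest()
hash_b = hashlib.sha1(encoding_b).hexdigest()

print(hash_a == hash_b)  # False
```

A client that decodes such a dictionary and re-encodes it before hashing could therefore end up with a different info_hash than the one the torrent was published under.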