-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not immediately verify latest root when rotating #885
Do not immediately verify latest root when rotating #885
Conversation
@tanishqjasoria Can you also take a look? |
Yeah, I should say that someone should triple-check this to make we are not skipping some important security check. But at that point, all we are trying to do is get the latest version number of the root. Cc @lukpueh |
I guess there might be a problem. Form switching to _get_file() from _get_metadata_file(), the root.json we get is not verified. |
@tanishqjasoria Good observation, but this would be true even if with a signed and verified "latest" root metadata file: attackers who control the repository can always perform a freeze attack (limited by the expiration of the previous root, which could be as long as years). |
@trishankatdatadog I missed that! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch, @trishankatdatadog! I am not sure I understand the full consequences of your changes. Could you provide answers to my questions below?
tuf/client/updater.py
Outdated
# should refresh them here as a protective measure. See Issue #736. | ||
self._rebuild_key_and_role_db() | ||
self.consistent_snapshot = \ | ||
self.metadata['current']['root']['consistent_snapshot'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment in L1152-1154 claims that self.consistent_snapshot
is set with _update_metadata()
.
Why set it here explicitly? Is the comment above inaccurate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IDK, it's possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I somehow missed the call to _update_metadata
in the top line of below for-loop. It downloads all the root files again using _get_metadata_file
exactly how it used to fetch latest_root_metadata_file
, before your patch. That means, all the checks that I listed in my last review are performed eventually.
So I'd say, LGTM! :)
Do you have capacities to add a test that has at least one intermediate root and where the latest root is signed with keys that are not available in the beginning?
On a side-note, I re-read the relevant part of the spec and it seems a bit different from what we do here. As per the spec the client downloads root files with incrementing version numbers until the server 404s. So in the spec there is no initial download of the latest root.
And on another side-note, did you see @vladimir-v-diaz's TODO comment in the refresh
, where _update_root_metadata is called?
@lukpueh Following our discussion, I have adjusted the code to follow the spec. Please review. To answer your questions:
|
Our spec should also be updated to say: once you have turned on consistent snapshots, you should always also write versioned besides unversioned root metadata files so that outdated clients can updated to your latest root. @JustinCappos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your update, @trishankatdatadog. It does look more like the spec now. However, the code does not work yet. @JustinCappos, do we have an intern who can take over, i.e. fix the things I noted below and add some tests?
@lukpueh Sorry, I wrote on one machine, and tested on another, so there were some mistakes in translation. Does this work better for you? |
d6359e1
to
eb81246
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates! I'm unsure whether you need request.exceptions
. Otherwise it looks good.
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
fbff603
to
5fc3d90
Compare
Thanks for the update, @trishankatdatadog! This looks sound now. I'd still like to see tests before we merge. @JustinCappos, do we have a student you might want to volunteer? If not, I'll add some tests. |
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
54a101b
to
3bb4f73
Compare
This is very important to Datadog for our next update. Could we get this fixed ASAP? Thanks. |
FYI. I am working on tests... |
Thanks for helping to write the tests, @lukpueh... |
And here they are: trishankatdatadog/tuf@b170601...lukpueh:correctly-rotate-root-test @trishankatdatadog, could you please sign off your most recent commit? And would you mind adding my three tests-adding commits linked above on top of your branch? Then we can review the tests here on the same PR. |
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
…trishankatdatadog/tuf into trishankatdatadog/correctly-rotate-root Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
7f32a1e
to
cd2a361
Compare
@lukpueh I had to manually set DCO to pass, should be OK (only one commit done directly from GitHub didn't have sign off). Great test, BTW. Let me know if you need anything else! |
…ectly-rotate-root Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Thanks for the fix, @trishankatdatadog!! 💯 |
Since theupdateframework#885 the tests in TestUpdater and TestKeyRevocation fail on Appveyor Python 2.7 builds. After some live debugging, it turns out that the tests fail due to the extra amount of http requests to the simple http server (see tests/simple_server.py) that were added in theupdateframework#885. The simple server runs in a subprocess and is re-used for the entire TestCase. After a certain amount of requests it becomes unresponsive. Note that neither the subprocess exits (ps -W), nor does the port get closed (netstat -a). It just doesn't serve the request, making it time out and fail the test. The following script can be used to reproduce the issue (run in tests directory): ```python import subprocess import requests import random counter = 0 port = random.randint(30000, 45000) command = ['python', 'simple_server.py', str(port)] server_process = subprocess.Popen(command, stderr=subprocess.PIPE) url = 'http://localhost:'+str(port) + '/' sess = requests.Session() try: while True: sess.get(url, timeout=3) counter +=1 finally: print(counter) server_process.kill() ``` It fails repeatedly on the 69th request, but only if `stderr=subprocess.PIPE` is passed to Popen. Given that for each request the simple server writes about ~60 characters to stderr, e.g. ``` 127.0.0.1 - - [24/Feb/2020 12:01:23] "GET / HTTP/1.1" 200 - ``` ... it looks a lot like a full pipe buffer of size 4096. Note that the `bufsize` argument to Popen does not change anything. As a simple work around we silence the test server on Windows/Python2 to not fill the buffer.
Since theupdateframework#885 the tests in TestUpdater and TestKeyRevocation fail on Appveyor Python 2.7 builds. After some live debugging, it turns out that the tests fail due to the extra amount of http requests to the simple http server (see tests/simple_server.py) that were added in theupdateframework#885. The simple server runs in a subprocess and is re-used for the entire TestCase. After a certain amount of requests it becomes unresponsive. Note that neither the subprocess exits (ps -W), nor does the port get closed (netstat -a). It just doesn't serve the request, making it time out and fail the test. The following script can be used to reproduce the issue (run in tests directory): ```python import subprocess import requests import random counter = 0 port = random.randint(30000, 45000) command = ['python', 'simple_server.py', str(port)] server_process = subprocess.Popen(command, stderr=subprocess.PIPE) url = 'http://localhost:'+str(port) + '/' sess = requests.Session() try: while True: sess.get(url, timeout=3) counter +=1 finally: print(counter) server_process.kill() ``` It fails repeatedly on the 69th request, but only if `stderr=subprocess.PIPE` is passed to Popen. Given that for each request the simple server writes about ~60 characters to stderr, e.g. ``` 127.0.0.1 - - [24/Feb/2020 12:01:23] "GET / HTTP/1.1" 200 - ``` ... it looks a lot like a full pipe buffer of size 4096. Note that the `bufsize` argument to Popen does not change anything. As a simple work around we silence the test server on Windows/Python2 to not fill the buffer. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Since theupdateframework#885 the tests in TestUpdater and TestKeyRevocation fail on Appveyor Python 2.7 builds. After some live debugging, it turns out that the tests fail due to the extra amount of http requests to the simple http server (see tests/simple_server.py) that were added in theupdateframework#885. The simple server runs in a subprocess and is re-used for the entire TestCase. After a certain amount of requests it becomes unresponsive. Note that neither the subprocess exits (ps -W), nor does the port get closed (netstat -a). It just doesn't serve the request, making it time out and fail the test. The following script can be used to reproduce the issue (run in tests directory): ```python import subprocess import requests import random counter = 0 port = random.randint(30000, 45000) command = ['python', 'simple_server.py', str(port)] server_process = subprocess.Popen(command, stderr=subprocess.PIPE) url = 'http://localhost:'+str(port) + '/' sess = requests.Session() try: while True: sess.get(url, timeout=3) counter +=1 finally: print(counter) server_process.kill() ``` It fails repeatedly on the 69th request, but only if `stderr=subprocess.PIPE` is passed to Popen. Given that for each request the simple server writes about ~60 characters to stderr, e.g. ``` 127.0.0.1 - - [24/Feb/2020 12:01:23] "GET / HTTP/1.1" 200 - ``` ... it looks a lot like a full pipe buffer of size 4096. Note that the `bufsize` argument to Popen does not change anything. As a simple work around we silence the test server on Windows/Python2 to not fill the buffer. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Since theupdateframework#885 the tests in TestUpdater and TestKeyRevocation fail on Appveyor Python 2.7 builds. After some live debugging, it turns out that the tests fail due to the extra amount of http requests to the simple http server (see tests/simple_server.py) that were added in theupdateframework#885. The simple server runs in a subprocess and is re-used for the entire TestCase. After a certain amount of requests it becomes unresponsive. Note that neither the subprocess exits (ps -W), nor does the port get closed (netstat -a). It just doesn't serve the request, making it time out and fail the test. The following script can be used to reproduce the issue (run in tests directory): ```python import subprocess import requests import random counter = 0 port = random.randint(30000, 45000) command = ['python', 'simple_server.py', str(port)] server_process = subprocess.Popen(command, stderr=subprocess.PIPE) url = 'http://localhost:'+str(port) + '/' sess = requests.Session() try: while True: sess.get(url, timeout=3) counter +=1 finally: print(counter) server_process.kill() ``` It fails repeatedly on the 69th request, but only if `stderr=subprocess.PIPE` is passed to Popen. Given that for each request the simple server writes about ~60 characters to stderr, e.g. ... ``` 127.0.0.1 - - [24/Feb/2020 12:01:23] "GET / HTTP/1.1" 200 - ``` ... it looks a lot like a full pipe buffer of size 4096. Note that the `bufsize` argument to Popen does not change anything. As a simple work around we silence the test server on Windows/Python2 to not fill the buffer. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Since theupdateframework#885 the tests in TestUpdater and TestKeyRevocation fail on Appveyor Python 2.7 builds. After some live debugging, it turns out that the tests fail due to the extra amount of http requests to the simple http server (see tests/simple_server.py) that were added in theupdateframework#885. The simple server runs in a subprocess and is re-used for the entire TestCase. After a certain amount of requests it becomes unresponsive. Note that neither the subprocess exits (ps -W), nor does the port get closed (netstat -a). It just doesn't serve the request, making it time out and fail the test. The following script can be used to reproduce the issue (run in tests directory): ```python import subprocess import requests import random counter = 0 port = random.randint(30000, 45000) command = ['python', 'simple_server.py', str(port)] server_process = subprocess.Popen(command, stderr=subprocess.PIPE) url = 'http://localhost:'+str(port) + '/' sess = requests.Session() try: while True: sess.get(url, timeout=3) counter +=1 finally: print(counter) server_process.kill() ``` It fails repeatedly on the 69th request, but only if `stderr=subprocess.PIPE` is passed to Popen. Given that for each request the simple server writes about ~60 characters to stderr, e.g. ... ``` 127.0.0.1 - - [24/Feb/2020 12:01:23] "GET / HTTP/1.1" 200 - ``` ... it looks a lot like a full pipe buffer of size 4096. Note that the `bufsize` argument to Popen does not change anything. As a simple work around we silence the test server on Windows/Python2 to not fill the buffer. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Fixes issue #:
N/A
Description of the changes being introduced by the pull request:
When rotating the root (say from v2 to v3, then v3 to v4), the reference implementation currently immediately verifies signatures on the latest root (say, v4). This will fail when v4 is signed using v3's but not v2's keys. This PR changes the implementation so that signatures on the latest root is not immediately verified.
Please verify and check that the pull request fulfills the following
requirements: