Repeatable installs via hashing #3137

Closed
wants to merge 40 commits into from
Changes from 7 commits
Commits
e058486
Fix some docstring typos.
erikrose Sep 3, 2015
62ac258
Delete dead _copy_dist_from_dir().
erikrose Sep 11, 2015
9211d6e
Style tweaks
erikrose Sep 11, 2015
3303be0
Teach requirements parser how to parse hash options, like --sha256.
erikrose Sep 3, 2015
1e41f01
Add checks against requirements-file-dwelling hashes for most kinds o…
erikrose Sep 9, 2015
11dbb92
Switch from --sha256 etc. to a single option: --hash.
erikrose Sep 24, 2015
0c17248
Pass PEP 8 checks.
erikrose Sep 24, 2015
b0ef6ab
Fix unicode errors in unit tests of Hashes under Python 3.
erikrose Sep 25, 2015
f3f73f1
Remove the -H spelling for --hashes.
erikrose Sep 25, 2015
910b82c
--require-hashes no longer implies --no-deps.
erikrose Sep 25, 2015
4f67374
Correct the level of the Wheel Cache heading.
erikrose Oct 7, 2015
14506f8
Document hash-checking mode.
erikrose Oct 7, 2015
bf0ff80
pep8 fixes
erikrose Oct 7, 2015
c62cd71
Add --require-hashes option to pip download and pip wheel.
erikrose Oct 7, 2015
09008bf
Add `pip hash` command.
erikrose Oct 8, 2015
d477ae6
Add warning about `python setup.py install`.
erikrose Oct 8, 2015
7a0a97c
Merge 'develop' into 'hashing' to bring the latter up to date.
erikrose Oct 8, 2015
0e6058b
Change head() method to an attr in hashing exceptions. Tweak English.
erikrose Oct 8, 2015
6f828c3
Correct and clarify docs and comments.
erikrose Oct 9, 2015
52111c1
Demote package-is-already-installed log message to debug-level.
erikrose Oct 9, 2015
b95599a
Change _good_hashes() to a whitelist.
erikrose Oct 9, 2015
3824d73
Revise what hashes protect you against.
erikrose Oct 9, 2015
be4e315
Rewrap args of unpack_http_url() to match the style in send(), above.
erikrose Oct 9, 2015
304c90a
Break after initial """ in multi-paragraph docstrings in exceptions m…
erikrose Oct 9, 2015
05b7ef9
Rename "goods" to "allowed" for clarity.
erikrose Oct 11, 2015
f35ce75
Make "installation bundles" less of an official term.
erikrose Oct 11, 2015
d541304
Allow === as a pinning operator.
erikrose Oct 11, 2015
76983f3
Restore documentation about alternate hash algorithms in URLs.
erikrose Oct 12, 2015
be6dccb
Factor up the idiom of reading chunks from a file until EOF.
erikrose Oct 12, 2015
9e5e34e
Add --algorithm flag to `pip hash`.
erikrose Oct 12, 2015
4c405a0
Restore deleted _copy_dist_from_dir().
erikrose Oct 12, 2015
dcf39bf
Add imports to make the pep8 checker happy about the dead _copy_dist_…
erikrose Oct 12, 2015
7c5e503
Remove unneeded triple quotes.
erikrose Oct 12, 2015
e23f596
Consolidate hash constants in pip.utils.hashing.
erikrose Oct 12, 2015
925e4b4
Fix false hash mismatches when installing a package that has a cached…
erikrose Oct 16, 2015
622b430
Typos and docstrings
erikrose Oct 20, 2015
ee9d6fb
Modernize recommendations to not call setuptools-level things directly.
erikrose Oct 20, 2015
3af5ffa
Improve flow of --require-hashes help message.
erikrose Oct 20, 2015
f38fc90
Obey --require-hashes option in requirements files.
erikrose Oct 21, 2015
4488047
Update the wheel-cache-disabling docs with our latest understanding o…
erikrose Oct 21, 2015
42 changes: 42 additions & 0 deletions pip/cmdoptions.py
@@ -10,6 +10,7 @@
from __future__ import absolute_import

from functools import partial
import hashlib
from optparse import OptionGroup, SUPPRESS_HELP, Option
import warnings

@@ -523,6 +524,47 @@ def only_binary():
)


def _good_hashes():
    """Return names of hashlib algorithms at least as strong as sha256."""
    # Remove getattr when 2.6 dies.
Member

I don't think 2.7 had this until something like 2.7.9, so we probably can't get rid of it until we drop 2.7.

Contributor Author

https://docs.python.org/3/whatsnew/2.7.html suggests it was added in 2.7.0; it lists things added in 2.7.x separately.

Member

Oh, I'm stupid; I was confusing it with `algorithms_guaranteed`.

    algos = set(
        getattr(hashlib,
Member

This should probably move to pip/compat.py, so that we'd have something like `from pip.compat import hashlib_algorithms`.
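
A minimal sketch of what such a compat helper might look like. The name `hashlib_algorithms` comes from the comment above and `_FALLBACK` is a hypothetical name; neither is pip's actual API. The fallback tuple is the one used in the diff. Note that `hashlib.algorithms` exists only on Python 2.7, so the fallback also applies on Python 3.

```python
import hashlib

# Hypothetical compat shim (illustration only, not pip's real code).
# The fallback tuple mirrors the one passed to getattr() in the diff.
_FALLBACK = ('md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512')

# Use hashlib.algorithms where it exists (Python 2.7); otherwise fall back.
hashlib_algorithms = tuple(getattr(hashlib, 'algorithms', _FALLBACK))

print(sorted(hashlib_algorithms))
```

Callers could then import the tuple from one place instead of repeating the `getattr` dance wherever algorithm names are needed.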

                'algorithms',
                ('md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512')))
    return algos - set(['md5', 'sha1', 'sha224'])
Member

I don't think this is a good way to tell how strong these algorithms are. It assumes that Python will never add another algorithm that is weaker than sha256. Something like:

import hashlib

def _good_hashes():
    return set([x for x in hashlib.algorithms if hashlib.new(x).digest_size >= 32])

Ideally without a magic 32 in there. For that matter, this probably doesn't need to be a function at all, does it?

Contributor Author

Unfortunately, digest size is not an authoritative indicator of hash strength. I think we have to go with either a whitelist or a blacklist. I chose a blacklist so people would have the ability to use newer, stronger (or even custom) hashes without waiting for a pip update. Would you rather play it safe and go with a whitelist?

Member

Probably a whitelist then.
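
To make the trade-off in this thread concrete, here is a rough sketch of the whitelist approach. The names `STRONG_HASHES` and `is_allowed` are assumptions for illustration; the three algorithms are what remains after the diff subtracts md5, sha1 and sha224 from its fallback tuple.

```python
# Assumed whitelist (illustration only): the algorithms left over once
# md5, sha1 and sha224 are removed from the tuple in the diff.
STRONG_HASHES = ('sha256', 'sha384', 'sha512')


def is_allowed(algo):
    """Return True only if `algo` is explicitly whitelisted."""
    return algo in STRONG_HASHES


# A whitelist rejects anything it has not been told about, including
# newer algorithms, until the tuple itself is updated.
for name in ('sha256', 'md5', 'sha3_256'):
    print(name, is_allowed(name))
```

The cost of playing it safe is visible in the last case: a newer algorithm is refused until the list is extended, which is exactly what the blacklist was meant to avoid.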



def _merge_hash(option, opt_str, value, parser):
    """Given a value spelled "algo:digest", append the digest to a list
    pointed to in a dict by the algo name."""
    if not parser.values.hashes:
        parser.values.hashes = {}
    try:
        algo, digest = value.split(':', 1)
    except ValueError:
        parser.error('Arguments to %s must be a hash name '
                     'followed by a value, like --hash=sha256:abcde...' %
                     opt_str)
    goods = _good_hashes()
    if algo not in goods:
        parser.error('Allowed hash algorithms for %s are %s.' %
                     (opt_str, ', '.join(sorted(goods))))
    parser.values.hashes.setdefault(algo, []).append(digest)


hash = partial(
    Option,
    '-H', '--hash',
Member

Does this deserve a short option like -H? This feels like something users will rarely type out, especially since it requires a long hash digest that almost everyone will copy and paste. I'm not strictly opposed to it, but I think we should reserve short options for things users are likely to actually type as part of the command invocation.

    # Hash values eventually end up in InstallRequirement.hashes due to
    # __dict__ copying in process_line().
    dest='hashes',
    action='callback',
    callback=_merge_hash,
    type='string',
    help="Verify that the package's archive matches this "
         'hash before installing. Example: --hash=sha256:abcdef...')
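
For readers unfamiliar with optparse callbacks, the following standalone sketch shows how a repeated `--hash=algo:digest` option can accumulate into a dict of lists, in the spirit of `_merge_hash` above. It is an illustration only: `ALLOWED`, `merge_hash` and `parse_hashes` are hypothetical names, and the whitelist is an assumption rather than pip's actual list.

```python
from optparse import OptionParser

ALLOWED = ('sha256', 'sha384', 'sha512')  # assumed whitelist, for illustration


def merge_hash(option, opt_str, value, parser):
    """Append the digest from "algo:digest" to parser.values.hashes[algo]."""
    # optparse initialises the dest to None, so create the dict lazily.
    if not parser.values.hashes:
        parser.values.hashes = {}
    try:
        algo, digest = value.split(':', 1)
    except ValueError:
        # parser.error() prints a usage message and exits with status 2.
        parser.error('Arguments to %s must look like algo:digest' % opt_str)
    if algo not in ALLOWED:
        parser.error('Allowed hash algorithms for %s are %s.' %
                     (opt_str, ', '.join(ALLOWED)))
    parser.values.hashes.setdefault(algo, []).append(digest)


def parse_hashes(argv):
    """Parse argv and return the accumulated {algo: [digest, ...]} dict."""
    parser = OptionParser()
    parser.add_option('--hash', dest='hashes', action='callback',
                      callback=merge_hash, type='string')
    options, _ = parser.parse_args(argv)
    return options.hashes


print(parse_hashes(['--hash=sha256:abc', '--hash=sha256:def',
                    '--hash=sha512:123']))
```

Grouping digests by algorithm means several acceptable digests can be listed for one requirement, for example one per platform-specific archive.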


##########
# groups #
##########
9 changes: 9 additions & 0 deletions pip/commands/install.py
@@ -159,6 +159,14 @@ def __init__(self, *args, **kw):

        cmd_opts.add_option(cmdoptions.no_clean())

        cmd_opts.add_option(
            '--require-hashes',
Member

Should this be for pip install only? Does it make sense to be able to mandate hashes for pip download or pip wheel as well?

Contributor Author

Seems reasonable and not too hard, looking at the code; I did all the operative stuff in RequirementSet, after all. But perhaps it could be a separate PR later, since I'm about out of time to work on this. Also, nothing supports --hash options on the command line yet, which is a sad hole in the story. That will need either some serious monkeying around with optparse to support interspersed args and options, or else your aforementioned switch to click.

            dest='require_hashes',
            action='store_true',
            help='Perform a provably repeatable installation by requiring a '
                 'hash to check each package against. Implied by the presence '
                 'of a --hash option on any individual requirement')

        index_opts = cmdoptions.make_option_group(
            cmdoptions.index_group,
            self.parser,
@@ -266,6 +274,7 @@ def run(self, options, args):
            pycompile=options.compile,
            isolated=options.isolated_mode,
            wheel_cache=wheel_cache,
            require_hashes=options.require_hashes,
        )

        self.populate_requirement_set(