Package scan error when scanning pip 22.0.4 #2911

Closed
JonoYang opened this issue Apr 7, 2022 · 3 comments

@JonoYang

JonoYang commented Apr 7, 2022

I downloaded pip 22.0.4 from https://files.pythonhosted.org/packages/33/c9/e2164122d365d8f823213a53970fa3005eb16218edcfc56ca24cb6deba2b/pip-22.0.4.tar.gz and I get the following error when scanning it:

ERROR: failed to run scan plugin: packages:
Traceback (most recent call last):
  File "/home/jono/nexb/src/scancode-toolkit-2/src/scancode/cli.py", line 1055, in run_codebase_plugins
    plugin.process_codebase(codebase, **kwargs)
  File "/home/jono/nexb/src/scancode-toolkit-2/src/packagedcode/plugin_package.py", line 120, in process_codebase
    create_package_and_dep_instances(codebase, **kwargs)
  File "/home/jono/nexb/src/scancode-toolkit-2/src/packagedcode/plugin_package.py", line 202, in create_package_and_dep_instances
    for dep_instance in create_dependency_instances(
  File "/home/jono/nexb/src/scancode-toolkit-2/src/packagedcode/plugin_package.py", line 265, in create_dependency_instances
    purl = PackageURL.from_string(dependency['purl'])
  File "/home/jono/nexb/src/scancode-toolkit-2/venv/lib/python3.8/site-packages/packageurl/__init__.py", line 338, in from_string
    raise ValueError('A purl string argument is required.')
ValueError: A purl string argument is required.

From a casual glance at https://github.com/nexB/scancode-toolkit/blob/develop/src/packagedcode/plugin_package.py#L265, it appears that we are getting this issue because some dependencies do not have purl values.
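
To illustrate the suspected failure mode in isolation, here is a minimal sketch. It is not the scancode code: `parse_purl` is a hypothetical stand-in that mimics only the check visible in the traceback (a missing purl string is rejected), and the dependency mappings are made up for the example.

```python
def parse_purl(purl):
    """Hypothetical stand-in for PackageURL.from_string.

    It only mimics the empty-input check seen in the traceback and does
    no real purl parsing.
    """
    if not purl or not isinstance(purl, str):
        raise ValueError('A purl string argument is required.')
    return purl


# Made-up dependency mappings; the second one has no purl value.
dependencies = [
    {'purl': 'pkg:pypi/towncrier', 'extracted_requirement': 'towncrier'},
    {'purl': None, 'extracted_requirement': 'example-without-purl'},
]


def find_unparsable(deps):
    """Return the extracted requirements whose purl cannot be parsed."""
    failed = []
    for dep in deps:
        try:
            parse_purl(dep['purl'])
        except ValueError:
            failed.append(dep['extracted_requirement'])
    return failed


print(find_unparsable(dependencies))
```

Running something like this over the scan's dependency data would point at the records that need a closer look.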

@AyanSinhaMahapatra

The culprit is the file below:

pip-22.0.4/docs/requirements.txt:

sphinx ~= 4.2, != 4.4.0
towncrier
furo
myst_parser
sphinx-copybutton
sphinx-inline-tabs
sphinxcontrib-towncrier >= 0.2.0a0

# `docs.pipext` uses pip's internals to generate documentation. So, we install
# the current directory to make it work.
.

Here the last line is a "." which gets parsed, and the following dependency is returned:

    {
      "purl": null,
      "extracted_requirement": ".",
      "scope": "install",
      "is_runtime": true,
      "is_optional": false,
      "is_resolved": false,
      "resolved_package": {}
    }

This purl-less dependency causes the failure at dependency creation.

@pombredanne what should be the approach here to fix this?

  1. Is this something the requirements parser should handle?
  2. Should we discard these while assigning dependencies to package data/ creating dependency objects?
  3. Also this shouldn't have crashed right? Should add codebase errors and exit package/dependency creation?

@pombredanne

> Is this something the requirements parser should handle?

It does afaik. If not please submit a patch at https://github.com/nexB/pip-requirements-parser

> Should we discard these while assigning dependencies to package data/ creating dependency objects?

This sounds like the best approach. "dot" and editable requirements are not really something that can be processed further (pip-requirements-parser should provide all that is needed to determine which is which).

> Also this shouldn't have crashed right? Should add codebase errors and exit package/dependency creation?

In all cases this should not crash. IMHO we should skip the one record that failed to be assembled into a Dependency and add an error in the general case. But a "dot" should not be an error, as this should not error out at all.
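
A rough sketch of that behavior, under stated assumptions: `build_dependency` and `assemble` are hypothetical helpers (not scancode functions), records are plain dicts shaped like the dependency shown earlier in the thread, and the "dot" check is a simple string comparison rather than anything pip-requirements-parser provides.

```python
def build_dependency(record):
    """Hypothetical assembly step: it fails when the record has no purl."""
    purl = record.get('purl')
    if not purl:
        raise ValueError('A purl string argument is required.')
    return {'purl': purl, 'scope': record.get('scope')}


def assemble(records):
    """Assemble dependencies, skipping bad records instead of crashing.

    Returns a (dependencies, errors) tuple.
    """
    dependencies = []
    errors = []
    for record in records:
        # A "dot" means "install the current directory": skip it quietly,
        # without recording an error.
        if record.get('extracted_requirement') == '.':
            continue
        try:
            dependencies.append(build_dependency(record))
        except Exception as e:
            # General case: drop only this record and keep a trace of why.
            errors.append(f'Failed to assemble dependency from {record!r}: {e}')
    return dependencies, errors


records = [
    {'purl': 'pkg:pypi/furo', 'extracted_requirement': 'furo', 'scope': 'install'},
    {'purl': None, 'extracted_requirement': '.', 'scope': 'install'},
    {'purl': None, 'extracted_requirement': 'broken', 'scope': 'install'},
]
dependencies, errors = assemble(records)
print(len(dependencies), len(errors))
```

With these made-up records, one dependency is assembled, the "dot" is dropped without comment, and the remaining purl-less record produces a single error entry.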

@AyanSinhaMahapatra

AyanSinhaMahapatra commented Apr 11, 2022

> It does afaik.

So pip-requirements-parser does have a flag, is_local_path, which is set to True for this.

> "dot" and editable requirements are not really something that can be processed further

There are two places where we can drop this:

  1. If is_editable or is_local_path is True, skip adding a DependentPackage for this line to PackageData.dependencies
  2. If the purl is None, skip adding a top-level Dependency for it.

Should both be done here?
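
Both options can be sketched together. This is only an illustration: the `Requirement` dataclass and both functions are hypothetical, only the `is_editable` and `is_local_path` flag names come from the discussion above, and the purl is built in a deliberately simplified way (no name normalization, no version).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Requirement:
    """Hypothetical parsed requirement line, not the parser's real class."""
    line: str
    name: Optional[str] = None
    is_editable: bool = False
    is_local_path: bool = False


def to_dependent_packages(requirements):
    """Option 1: skip editable and local-path requirements up front."""
    packages = []
    for req in requirements:
        if req.is_editable or req.is_local_path:
            continue
        # Simplified purl for illustration only.
        purl = f'pkg:pypi/{req.name}' if req.name else None
        packages.append({'purl': purl, 'extracted_requirement': req.line})
    return packages


def to_top_level_dependencies(dependent_packages):
    """Option 2: skip anything that still has no purl."""
    return [dep for dep in dependent_packages if dep.get('purl')]


requirements = [
    Requirement(line='towncrier', name='towncrier'),
    Requirement(line='.', is_local_path=True),
    Requirement(line='-e ./src', is_editable=True, is_local_path=True),
]
packages = to_dependent_packages(requirements)
top_level = to_top_level_dependencies(packages)
print([dep['purl'] for dep in top_level])
```

Doing both is the defensive choice: the first check keeps purl-less entries out of the package data, and the second still protects dependency creation if another parser ever emits a record without a purl.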

> In all cases this should not crash.

Don't errors in process_codebase steps get collected in codebase errors rather than crashing (as happens in various scanners)?
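
For reference, the pattern being asked about looks roughly like the sketch below. It makes no claim about how scancode's own plugin runner works: `run_plugins` and both plugin classes are hypothetical, and the codebase is just a dict.

```python
import traceback


class FailingPlugin:
    """Hypothetical plugin whose processing step raises."""
    name = 'packages'

    def process_codebase(self, codebase):
        raise ValueError('A purl string argument is required.')


class OkPlugin:
    """Hypothetical plugin that runs without problems."""
    name = 'info'

    def process_codebase(self, codebase):
        codebase['processed'] = True


def run_plugins(plugins, codebase):
    """Run every plugin, collecting failures instead of stopping at the first."""
    errors = []
    for plugin in plugins:
        try:
            plugin.process_codebase(codebase)
        except Exception:
            # Keep the full traceback so the failure can still be diagnosed.
            errors.append(
                f'ERROR: failed to run plugin: {plugin.name}:\n'
                f'{traceback.format_exc()}'
            )
    return errors


codebase = {}
errors = run_plugins([FailingPlugin(), OkPlugin()], codebase)
print(len(errors), codebase)
```

In this sketch the failing plugin contributes one collected error, and the later plugin still runs. Catching at the plugin level keeps the scan alive but loses every dependency from the failing plugin, which is why skipping the single bad record is the finer-grained fix.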
