BUG: DejaCode scan_single_package for previously failed scans results in bad request #222

Open
ghsa-retrieval opened this issue Jan 8, 2025 · 3 comments
Assignees
Labels
bug (Something isn't working), design needed (Design details needed to complete the issue), enhancement (New feature or request)

Comments


ghsa-retrieval commented Jan 8, 2025

Describe the bug
When DejaCode is tasked with analyzing an SBOM, it roughly performs two steps:

  1. Create a load_sbom pipeline in ScanCode.io and import the packages into the inventory
  2. If the "Scan all packages of this product post-import" option is enabled, submit a scan_single_package pipeline for each entry in the inventory

Due to unforeseen circumstances, a scan_single_package pipeline can fail in ScanCode.io. If one then attempts to load another SBOM or rerun the scan through Action > Scan all packages, the requests fail with HTTP 400 Bad Request responses: ScanCode.io rejects the POST requests to /api/projects/ with {"name":["project with this name already exists."]}.

It seems this can only be fixed by manually deleting the failed projects in ScanCode.io. For the end user, it is not clear why the package scan does not start, and if only some packages are affected, the problem may go entirely unnoticed, resulting in incomplete data for the product in DejaCode.
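To illustrate how partially rejected submissions could be made visible instead of going unnoticed, here is a minimal sketch. It is not DejaCode code: the function name `build_scan_warning` and the idea of collecting one HTTP status code per package are assumptions made for this example only.

```python
from typing import Optional


def build_scan_warning(status_by_package: dict[str, int]) -> Optional[str]:
    """Build a user-facing warning from per-package submission results.

    `status_by_package` is a hypothetical mapping of package name to the
    HTTP status code returned when its scan was submitted.
    """
    # Treat any 4xx/5xx status as "scan was not started".
    rejected = sorted(
        name for name, status in status_by_package.items() if status >= 400
    )
    if not rejected:
        return None
    total = len(status_by_package)
    return (
        f"{len(rejected)} of {total} package scans were not started: "
        + ", ".join(rejected)
    )


if __name__ == "__main__":
    print(build_scan_warning({"range-parser@1.2.1": 400, "other-package@1.0.0": 201}))
```

A message like this would at least tell the user which packages are missing scan data for the product.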

To Reproduce
Set up DejaCode to use a ScanCode.io instance.

Steps to reproduce the behavior:

  1. Create a test project in DejaCode
  2. Navigate to Action > Load Packages from SBOMs
  3. Select an SBOM and check "Update existing packages with discovered packages data" and "Scan all packages of this product post-import"
  4. Press "Load Packages"
  5. Interrupt the ScanCode.io worker in some way after load_sbom has completed, e.g. terminate it or interrupt its connection to the DB
  6. Either repeat steps 2-4 or use Action > Scan all packages

Observe in the logs that ScanCode.io responds with Bad Request errors and that the pipeline is not restarted.

Expected behavior
The expected behavior is that DejaCode either restarts the pipeline if the project already exists, or deletes and recreates the project. Alternatively, the behavior could be changed on ScanCode.io's side so that a call for an existing project simply restarts the pipeline.
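A rough sketch of the delete-and-recreate variant follows. Everything in it is hypothetical: `ScanClient`, `find_project`, `delete_project`, `create_project` and the in-memory stand-in are names invented for this example and do not reflect the actual DejaCode integration code or the ScanCode.io API.

```python
from typing import Optional, Protocol


class ScanClient(Protocol):
    """Hypothetical minimal client interface, invented for this sketch."""

    def find_project(self, name: str) -> Optional[str]: ...
    def delete_project(self, project_id: str) -> None: ...
    def create_project(self, name: str, input_url: str) -> str: ...


def submit_scan(client: ScanClient, name: str, input_url: str) -> str:
    """Remove a stale project with the same name, then create a fresh one."""
    existing_id = client.find_project(name)
    if existing_id is not None:
        client.delete_project(existing_id)
    return client.create_project(name, input_url)


class InMemoryClient:
    """Stand-in that mimics the unique-name constraint seen in the 400 response."""

    def __init__(self) -> None:
        self.projects: dict[str, str] = {}  # project name -> project id
        self._next_id = 1

    def find_project(self, name: str) -> Optional[str]:
        return self.projects.get(name)

    def delete_project(self, project_id: str) -> None:
        self.projects = {
            n: pid for n, pid in self.projects.items() if pid != project_id
        }

    def create_project(self, name: str, input_url: str) -> str:
        if name in self.projects:
            raise ValueError("project with this name already exists.")
        project_id = f"project-{self._next_id}"
        self._next_id += 1
        self.projects[name] = project_id
        return project_id


if __name__ == "__main__":
    client = InMemoryClient()
    url = "https://registry.npmjs.org/range-parser/-/range-parser-1.2.1.tgz"
    print(submit_scan(client, "f58d1b9466.cabd8d8bf5.ca4c555982", url))
    # A second submission with the same name no longer raises.
    print(submit_scan(client, "f58d1b9466.cabd8d8bf5.ca4c555982", url))
```

Deleting first discards whatever the failed project contained, so restarting the existing pipeline may be preferable where partial results are worth keeping.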

Screenshots
Example requests and response (IPs, URLs and tokens replaced with dummy data)

POST /api/projects/ HTTP/1.0
X-Forwarded-For: 198.51.100.150, 198.51.100.166
X-Forwarded-Host: scancodeio.example.com:8080
X-Forwarded-Proto: http
Host: scancodeio.example.com
Connection: close
Content-Length: 380
X-Request-ID: 9c6363a1bd7f8bc19898eaee2d34e5a5
X-Real-IP: 198.51.100.150
X-Forwarded-Port: 443
X-Forwarded-Scheme: https
X-Scheme: https
User-Agent: python-requests/2.32.3
Accept-Encoding: gzip, deflate
Accept: */*
Authorization: Token xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Content-Type: application/json

{"name": "f58d1b9466.cabd8d8bf5.ca4c555982", "input_urls": "https://registry.npmjs.org/range-parser/-/range-parser-1.2.1.tgz", "pipeline": "scan_single_package", "execute_now": true, "webhook_url": "https://dejacode.example.com/notifications/send_scan_notification/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:XXXXXX:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/"}

HTTP/1.0 400 Bad Request
Server: gunicorn
Date: Wed, 08 Jan 2025 08:14:47 GMT
Connection: close
Content-Type: application/json
Vary: Accept
Allow: GET, POST, HEAD, OPTIONS
X-Frame-Options: DENY
Content-Length: 51
X-Content-Type-Options: nosniff
Referrer-Policy: same-origin
Cross-Origin-Opener-Policy: same-origin

{"name":["project with this name already exists."]}
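For reference, a caller could tell this specific rejection apart from other 400 errors roughly as follows. This is only a sketch that matches on the response body shown above; the function name is made up, and matching on the message text is an assumption that would break if ScanCode.io changed the wording.

```python
import json


def is_duplicate_project_error(status_code: int, body: str) -> bool:
    """Return True if a response looks like the duplicate-name rejection."""
    if status_code != 400:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(payload, dict):
        return False
    # Field errors arrive as a list of messages under the field name.
    name_errors = payload.get("name", [])
    if not isinstance(name_errors, list):
        name_errors = [name_errors]
    return any("already exists" in str(message) for message in name_errors)


if __name__ == "__main__":
    body = '{"name":["project with this name already exists."]}'
    print(is_duplicate_project_error(400, body))
```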

Context (OS, Browser, Device, etc.):
n.a.

ghsa-retrieval added the bug, design needed, and enhancement labels on Jan 8, 2025
@DennisClark
Member

@ghsa-retrieval thanks very much for providing all the pertinent details.

@ghsa-retrieval
Author

ghsa-retrieval commented Feb 7, 2025

It seems this even happens with successfully scanned projects in ScanCode.io. This can cause problems when trying to fix a package import that was seemingly not completed properly. For instance, in my test the usage policy was not assigned to some packages, but I could not scan the packages again to trigger the assignment.

@ghsa-retrieval
Author

ghsa-retrieval commented Feb 10, 2025

Ideally, one should be able to pick whether to:

  • Repeat the entire scan
  • Only repeat the scan for previously failed pipelines and reimport the results of those that already completed successfully

The latter would be significantly more efficient than rerunning the entire analysis of all packages when there are only a handful of failures.
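The two options could be expressed as a small planning step, sketched below. The names `RerunMode` and `plan_rerun` and the status strings "success" and "failure" are assumptions for illustration, not existing DejaCode or ScanCode.io identifiers.

```python
from enum import Enum


class RerunMode(Enum):
    ALL = "all"                  # repeat the entire scan
    FAILED_ONLY = "failed_only"  # rescan failures, reimport the rest


def plan_rerun(
    scan_status_by_package: dict[str, str], mode: RerunMode
) -> dict[str, list[str]]:
    """Split packages into those to rescan and those to reimport."""
    plan: dict[str, list[str]] = {"rescan": [], "reimport": []}
    for package, status in sorted(scan_status_by_package.items()):
        # Anything that is not a known success is scanned again.
        if mode is RerunMode.ALL or status != "success":
            plan["rescan"].append(package)
        else:
            plan["reimport"].append(package)
    return plan


if __name__ == "__main__":
    statuses = {"pkg-a": "success", "pkg-b": "failure", "pkg-c": "success"}
    print(plan_rerun(statuses, RerunMode.FAILED_ONLY))
    print(plan_rerun(statuses, RerunMode.ALL))
```

With only a handful of failures, the FAILED_ONLY plan keeps the rescan list short and limits the rest to a reimport of existing results.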


3 participants