KeyError exception in multiprocessing module #413
Comments
Let me look into this.
I can reproduce this with:
The error comes from here: https://github.com/nexB/scancode-toolkit/blob/645902627cd82a9e90f7f3a82038c5b5c6f8130d/src/scancode/cache.py#L119. It happens because there is a backslash in the file name.
FWIW, these are a bunch of test files for nodemon. Fix coming in soon.
Eek. Which means it's not even a valid file name on Windows, as also reported here.
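To illustrate the platform difference, here is a minimal sketch using Python 3's standard `pathlib`. The path is a made-up example, not one of the actual nodemon test files, and this is not ScanCode code.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical path: the last segment contains a literal backslash.
path = "test/fixtures/some\\file.js"

# With POSIX rules only "/" separates segments, so the backslash
# is an ordinary character inside the file name.
posix_name = PurePosixPath(path).name

# With Windows rules "\" is also a separator, so the same string
# is split into one more segment and the file name is shorter.
windows_name = PureWindowsPath(path).name

print(repr(posix_name))
print(repr(windows_name))
```

Any code that derives a key or a file name with one set of rules and looks it up with the other will disagree on paths like this one.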
So the fix is going to be in multiple steps:
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* This ensures that exceptions happening while multiprocessing and multithreading are not truncated and are reported as errors. Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
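A rough sketch of the general idea, not the actual ScanCode change: wrap each task so that a failure comes back as a full traceback string instead of propagating out of the pool. The names `call_safely`, `scan_one` and `scan_all` are hypothetical, and a thread pool stands in for the process pool to keep the example small.

```python
import traceback
from multiprocessing.pool import ThreadPool


def call_safely(func, arg):
    """Run func(arg) and return (arg, result, error) without raising."""
    try:
        return arg, func(arg), None
    except Exception:
        # Keep the complete traceback as text: a plain string can always
        # be sent back from a worker, and nothing gets truncated.
        return arg, None, traceback.format_exc()


def scan_one(path):
    """Hypothetical per-file task that raises KeyError for an unknown path."""
    results = {"a/b.txt": "ok"}
    return results[path]


def scan_all(paths):
    """Run scan_one on every path in a small pool, collecting errors."""
    pool = ThreadPool(2)
    try:
        jobs = [pool.apply_async(call_safely, (scan_one, p)) for p in paths]
        return [job.get() for job in jobs]
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    for path, result, error in scan_all(["a/b.txt", "a\\b.txt"]):
        print(path, result, "ERROR" if error else "")
```

With this shape, one bad file yields an error entry for that file while the other files are still processed.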
* This ensures that valid paths can still be processed, including a path with a backslash on Linux/POSIX.
* Also make sure that file info is always returned (possibly empty if there was an error fetching it).
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* This ensures that a file name can still be extracted from a valid path, even a path with a backslash on Linux/POSIX. Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
The PR #414 should fix things... but I need to review the tests on Windows and Mac first too.
So almost there with the Mac and Windows tests. I am using archives as test files so that the repo can still be cloned everywhere ... and updating extractcode to properly handle these cases and rename files with illegal file names on any given OS.
* File and directory names are now transformed to POSIX portable names.
* COM, PRN and other illegal Windows names are updated to be legal names.
* When scanning and not extracting, file_name is properly extracted by detecting a possible backslash in a file name.
* As a result, it is possible to process on Windows archives that contain illegal names. For instance, the repo at https://github.com/remy/nodemon contains such files and cannot be cloned on Windows. Yet a tarball of this repo is extracted properly by extractcode and can then be scanned correctly.
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
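For readers unfamiliar with the Windows naming rules, here is a minimal sketch of a portable-name transform. It is an illustration only, not the extractcode implementation: the function `portable_name` and its `replacement` parameter are hypothetical.

```python
import re

# Device names reserved by Windows, with or without an extension.
WINDOWS_RESERVED = {"CON", "PRN", "AUX", "NUL"}
WINDOWS_RESERVED |= {"COM%d" % i for i in range(1, 10)}
WINDOWS_RESERVED |= {"LPT%d" % i for i in range(1, 10)}

# Characters not allowed in Windows file names, plus control characters.
ILLEGAL_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def portable_name(name, replacement="_"):
    """Return a file name that should be legal on Windows, Mac and Linux."""
    name = ILLEGAL_CHARS.sub(replacement, name)
    # Windows does not keep trailing dots and spaces: drop them up front.
    name = name.rstrip(". ")
    if not name:
        return replacement
    # "com1.txt" is as reserved as "COM1": check the part before the first dot.
    stem = name.split(".", 1)[0]
    if stem.upper() in WINDOWS_RESERVED:
        name = replacement + name
    return name


if __name__ == "__main__":
    for raw in ["some\\file.js", "PRN", "com1.txt", "notes.txt"]:
        print(repr(raw), "->", repr(portable_name(raw)))
```

A real implementation also has to deal with collisions, since two different original names can map to the same portable name.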
* separate code from "new_name" for portable file name transforms
* update safe_path accordingly
* improve relative paths resolution for corner cases
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* unfortunately the behavior is not consistent on all OSes Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* ensure that scancode and extractcode tests are running verbose and are running first Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* Mac and Linux behavior is now the same thanks to the new path Unicode transliteration. Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
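One common source of Mac/Linux differences is Unicode normalization: the same visible name can be stored in composed or decomposed form. Whether that is exactly what caused the differences here is an assumption. The sketch below uses only the standard `unicodedata` module, and the helper `to_ascii` is hypothetical, not the function used in the commit.

```python
import unicodedata


def to_ascii(name):
    """Approximate an ASCII-only form of a file name (lossy)."""
    # NFKD splits accented characters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Dropping every non-ASCII code point removes the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


# The same visible name in composed (NFC) and decomposed (NFD) form.
composed = unicodedata.normalize("NFC", "caf\u00e9.txt")
decomposed = unicodedata.normalize("NFD", "caf\u00e9.txt")

if __name__ == "__main__":
    print(composed == decomposed)
    print(to_ascii(composed) == to_ascii(decomposed))
```

Characters with no ASCII decomposition are simply dropped by this approach, so a real transliteration needs a richer mapping.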
* this should help review the CI failures more easily and update the (different) expectations for each OS. Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* this is still an issue to solve with #16 though Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* This test cannot run on Windows because these illegal file names cannot be created there. Instead, other extractcode tests check that such names are handled properly at extraction time by transforming illegal names into legal names. Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* marked some tests as expected to fail for now. These are corner cases and seem to only pass correctly on Linux Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* The transliteration and file renaming take place correctly and no file is skipped. Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
* test a posix version of the paths for portable assertions Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
#413 no crash on weird file names
* This is a fairly significant change, in particular on the extractcode side.
* We can now properly handle files that could otherwise not be processed on some OSes, such as Windows, because they have illegal names for that OS.
All the planned fixes have been applied, e.g.:
Using ScanCode 1ec43b7 on Linux with Python 2.7 I get:
The command line used was
and the project scanned was remy/nodemon@2cd85b1.