Control what is extracted with the extractcode command #15
I'd like to work on this. As I understand it, this aims to expose more options to the user.
Considering that this issue has seemingly gone stale, I am not sure how useful my contribution here is. One problem I keep running into with extractcode is that it tries to extract sparse archives that are sometimes contained within source distributions, for example in the docker-ce source. These seemingly innocuous archives bloat to several gigabytes in size when extracted and regularly fill our scratchpad. My simple solution would be to add a CLI flag to extractcode that excludes files matching certain regex file patterns from extraction. An example might be: extractcode myArchiveWithSparseTestFiles.tar --exclude="*sparse*;pax*" where the exclusion patterns are separated by semicolons. I also assume that this issue is not unique to our use case, so if you agree that such an option provides additional value to users, I am more than willing to contribute it upstream. @pombredanne
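For illustration, here is a minimal sketch of how the proposed semicolon-separated exclusion could behave. It is not extractcode's code: the function names `parse_exclude_patterns` and `is_excluded` and the member paths are hypothetical, and it assumes the patterns are shell-style globs matched against the file name (the `*sparse*;pax*` example reads like globs rather than regular expressions).

```python
import fnmatch
import os


def parse_exclude_patterns(raw):
    """Split a semicolon-separated option value into a list of patterns."""
    return [p.strip() for p in raw.split(";") if p.strip()]


def is_excluded(path, patterns):
    """Return True if the file name of `path` matches any glob pattern."""
    name = os.path.basename(path)
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


# Hypothetical archive members, used only to show the filtering step.
patterns = parse_exclude_patterns("*sparse*;pax*")
members = [
    "src/main.go",
    "testdata/sparse-formats.tar",
    "testdata/pax-records.tar",
]
to_extract = [m for m in members if not is_excluded(m, patterns)]
```

Matching on the base name keeps the patterns short; matching on the full path instead would let users exclude whole directories, at the cost of more verbose patterns.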
@agschrei There is an open PR that deals with a similar issue: #1946. It was written by a prospective GSoC student and, AFAIK, is a working solution. There is some cleanup to do on the PR itself, and I have pinged the original author to see if he has any extra code changes. Otherwise I will clean it up and hopefully have it merged soon. We use
@agschrei If interested, you can test that branch here: https://github.com/JRavi2/scancode-toolkit/tree/add-ignore-flag You may want to try this solution on your end to see if any bugs show up in a real use case.
Thanks. Looking forward to your response!
extractcode supports selecting what is extracted, but this is not exposed as a command line option. There should be a way to control what is extracted, possibly with an expanded option --extract=<kind>, or by making extraction a separate command or a subcommand.
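As a rough sketch of what an `--extract=<kind>` option could look like, the snippet below uses Python's standard `argparse`. Everything here is an assumption made for illustration: the kind names in `KINDS`, the program name, and the helper functions are hypothetical and do not reflect extractcode's actual command line or internal kinds.

```python
import argparse

# Hypothetical kind names, for illustration only.
KINDS = ("regular", "package", "file_system", "docs")


def build_parser():
    """Build a parser with a repeatable --extract=<kind> option."""
    parser = argparse.ArgumentParser(prog="extract-sketch")
    parser.add_argument("input", help="archive or directory to extract")
    parser.add_argument(
        "--extract",
        metavar="<kind>",
        action="append",
        choices=KINDS,
        default=None,
        help="kind of archive to extract; may be repeated",
    )
    return parser


def selected_kinds(args):
    """Return the requested kinds, or every kind when none was given."""
    return tuple(args.extract) if args.extract else KINDS


args = build_parser().parse_args(["a.tar", "--extract", "docs", "--extract=package"])
kinds = selected_kinds(args)
```

Falling back to every kind when the option is absent keeps the existing default behaviour unchanged, so the option only narrows extraction when a user asks for it.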