File Upload: Clean up files left in /tmp after successful ingest #3818

Closed
kcondon opened this issue May 3, 2017 · 5 comments

@kcondon
Contributor

kcondon commented May 3, 2017

Ingest leaves files in /tmp. This is not normally a problem, but ideally we would clean up after ourselves so that we don't consume disk space, especially for large uploads of many files. Although /tmp cleaner jobs take care of this in most cases, cleaning up when we can would lighten their load.
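
One way to "clean up when we can" is to tie each temporary file's lifetime to the step that uses it. The sketch below is in Python for brevity and is not Dataverse code: `ingest_temp_file` is a hypothetical helper, and the `firstpass` prefix in the usage note is borrowed from the file names reported in this issue.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def ingest_temp_file(prefix, suffix, directory=None):
    """Yield a temp file path and delete the file afterwards, even on error.

    Hypothetical helper for illustration; not part of Dataverse.
    """
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=suffix, dir=directory)
    os.close(fd)  # callers reopen the path themselves
    try:
        yield path
    finally:
        # Runs on success and on exceptions, so nothing is left behind.
        if os.path.exists(path):
            os.remove(path)
```

A caller would wrap the ingest step, for example `with ingest_temp_file("firstpass", ".tab") as path: ...`, and the file is gone once the block exits, whether or not the step succeeded.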

For normal file ingest in v4.6.1, CSV and Excel ingest leave behind files of the form firstpass*.tab.

v4.6.2 adds support for Swift storage. Local files behave as in v4.6.1, and files stored in Swift leave the following:
- CSV and Excel ingest leave files of the form firstpass*.tab
- large images leave files of the form tempFileToRescale*.tmp
- other ingest files (.sav, .spss, .por) leave files of the form tempIngestSourceFile*.tmp
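
To check how much these leftovers add up to on a given server, a sysadmin could scan the temp directory for the three name patterns listed above. This is a minimal sketch in Python, not a Dataverse tool: `find_leftovers` and `LEFTOVER_PATTERNS` are hypothetical names, and it assumes the files sit directly in the directory being scanned.

```python
import fnmatch
import os

# Patterns taken from the list in this issue.
LEFTOVER_PATTERNS = (
    "firstpass*.tab",
    "tempFileToRescale*.tmp",
    "tempIngestSourceFile*.tmp",
)


def find_leftovers(temp_dir, patterns=LEFTOVER_PATTERNS):
    """Return sorted (path, size_in_bytes) pairs for regular files matching any pattern."""
    found = []
    with os.scandir(temp_dir) as entries:
        for entry in entries:
            # Skip directories and symlinks; only plain files are of interest.
            if not entry.is_file(follow_symlinks=False):
                continue
            if any(fnmatch.fnmatchcase(entry.name, p) for p in patterns):
                size = entry.stat(follow_symlinks=False).st_size
                found.append((entry.path, size))
    return sorted(found)
```

Calling `find_leftovers("/tmp")` returns the matching paths with their sizes, which can be summed to see whether the leftovers are worth worrying about before any cleanup is scheduled.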

Also see related ticket on documenting general temp file use and recommendations for sys admins, #2848

@oscardssmith
Contributor

As of #3767, CSV ingest will no longer do this.

@pdurbin pdurbin added the User Role: Sysadmin Installs, upgrades, and configures the system, connects via ssh label Jul 12, 2017
@mheppler
Contributor

@kcondon as of #5089, which was merged in 4.9.3, @qqmyers suggests he has improved how we clean up files left in /tmp.

Working on behalf of TDL, I've found one case for shapefile zips where a successful upload leaves temporary files in the defined temp directory, and several ways that user actions to delete or cancel during upload can cause temp files to remain. I've gone through the code and have changes to submit.

The only cases I'm aware of where 'persistent' temp files can still be created are 1) where network errors break communication with the browser and neither a save nor a cancel is ever received, and 2) some code that writes directly to subdirectories under /tmp (e.g. some of the R code), which is nominally cleaned up by the operating system.
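
In the first case the application never gets a signal to clean up, so an age-based sweep (similar in spirit to a /tmp cleaner job) is one way to catch the orphans. The following is a minimal Python sketch under stated assumptions: `remove_stale` is a hypothetical function, the 24-hour default is arbitrary, and neither is a Dataverse setting.

```python
import fnmatch
import os
import time


def remove_stale(temp_dir, patterns, max_age_seconds=24 * 3600, now=None):
    """Delete matching regular files not modified within max_age_seconds.

    Returns the sorted list of paths that were removed.
    """
    now = time.time() if now is None else now
    with os.scandir(temp_dir) as it:
        entries = list(it)  # snapshot first, then delete
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue
        if not any(fnmatch.fnmatchcase(entry.name, p) for p in patterns):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            try:
                os.remove(entry.path)
                removed.append(entry.path)
            except FileNotFoundError:
                pass  # another cleaner removed it first
    return sorted(removed)
```

The age threshold matters here: it should be comfortably longer than the slowest legitimate upload, otherwise the sweep could delete a file that an in-progress ingest still needs.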

@kcondon
Contributor Author

kcondon commented Jul 16, 2019

@mheppler .xlsx files still produce a firstpass* file in /tmp, and ingesting RData files produces a data-*.tab file in /tmp.

@haarli
Contributor

haarli commented Mar 28, 2024

tempFileToRescale*.tmp files are now deleted as of PR #9637.

firstpass*.tab and tempIngestSourceFile*.tmp files are still left behind.

@DS-INRAE DS-INRAE moved this to 🔍 Interest in Recherche Data Gouv Jul 10, 2024
@cmbz

cmbz commented Aug 20, 2024

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

@cmbz cmbz closed this as completed Aug 20, 2024
@github-project-automation github-project-automation bot moved this from 🔍 Interest to Done in Recherche Data Gouv Aug 20, 2024