Some excel files don't go through ingest process. #2264

Closed
scolapasta opened this issue Jun 15, 2015 · 13 comments
Labels
Feature: File Upload & Handling; Type: Bug (a defect); User Role: Depositor (creates datasets, uploads data, etc.)

Comments

@scolapasta
Contributor

When an Excel file is uploaded, the code doesn't even try to ingest it and treats it like any other file.

@scolapasta scolapasta added this to the 4.0.2 milestone Jun 15, 2015
@scolapasta
Contributor Author

@sbarbosadataverse will add some sample files to review.

@4tikhonov
Contributor

We ran into the same problem with our datasets. I think it has something to do with the sheet ID, which is probably assumed to be 0 by default but can be different in Excel files.
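
As a rough illustration of the sheet-ID concern raised here (not the Dataverse parser itself): in an .xlsx file, `xl/workbook.xml` lists each worksheet with a `name` and a `sheetId` attribute, and those IDs are not guaranteed to start at a fixed value or to be contiguous. The sketch below reads the sheet list in document order instead of assuming a particular ID. The function name `list_sheets` and the `SAMPLE` document are hypothetical, made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML elements in xl/workbook.xml.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def list_sheets(workbook_xml: str) -> list[tuple[str, int]]:
    """Return (name, sheetId) pairs in the order they appear in the workbook.

    Hypothetical helper: it only reads the sheet list and makes no
    assumption about which sheetId values are present.
    """
    root = ET.fromstring(workbook_xml)
    sheets = []
    for sheet in root.iter(f"{{{MAIN_NS}}}sheet"):
        sheets.append((sheet.get("name"), int(sheet.get("sheetId"))))
    return sheets


# Made-up workbook.xml fragment whose sheet IDs are neither 0 nor 1.
SAMPLE = """<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheets>
    <sheet name="Data" sheetId="4"/>
    <sheet name="Notes" sheetId="7"/>
  </sheets>
</workbook>"""

# Pick the first sheet by position rather than by a hard-coded ID.
first_name, first_id = list_sheets(SAMPLE)[0]
```

Selecting the first entry by position, as in the last line, avoids depending on any specific `sheetId` value.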

@landreev landreev assigned kcondon and unassigned landreev Jun 25, 2015
@landreev
Contributor

Please re-test; this should have been addressed by the same fix I committed earlier today.

@landreev
Contributor

@4tikhonov :
Now, this was something else. This issue was preventing XLSX files from being recognized as such - so we weren't even trying to parse/ingest them.
But yes, there may be issues related to sheet IDs in our Excel parser. I am investigating that separately.
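
For readers unfamiliar with the distinction, here is a minimal sketch of how a spreadsheet's container type can be recognized from its bytes. It is an assumption-laden illustration, not the detection logic Dataverse uses: an .xlsx file is a ZIP archive that contains `xl/workbook.xml`, while a legacy .xls file is stored in an OLE2 compound file, which starts with a fixed 8-byte signature (shared with other legacy Office formats such as .doc). The function name `sniff_excel_kind` and its return labels are hypothetical.

```python
import io
import zipfile

# Signature at the start of every OLE2 compound file (.xls, .doc, ...).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def sniff_excel_kind(data: bytes) -> str:
    """Classify raw file bytes as "xlsx", "ole2", or "unknown".

    Hypothetical helper: "ole2" only means the file *may* be a legacy
    .xls workbook, since other formats use the same container.
    """
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            # An OOXML spreadsheet keeps its workbook part at this path.
            if "xl/workbook.xml" in archive.namelist():
                return "xlsx"
    return "unknown"
```

A file that comes back as "unknown" would simply be stored like any other upload, which matches the behavior described at the top of this issue.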

@4tikhonov
Contributor

Is it possible to turn this feature off? In our system we already convert data files from all the different formats into a structured dataframe format, so ingest is of little use to us.

@mercecrosas
Member

@4tikhonov we are evaluating the best way of configuring this. We might still want to apply the ingest to pull metadata from the file and index that metadata for more comprehensive search. But we could configure this differently for each Dataverse software installation. This should be part of another issue, however.

@kcondon kcondon assigned sbarbosadataverse and unassigned kcondon Jul 10, 2015
@sbarbosadataverse

@kcondon
I tested two Excel files related to an earlier ticket; both failed in some way. I gave that ticket back to Leonid and am doing the same with this one.

@landreev
Contributor

@sbarbosadataverse
Sonia, could you please post the number of the earlier ticket, from which the files you tested came?
thanks!

@landreev I sent you the files from the tickets that failed and were supposed to be tested, quite a while back. The tickets in RT with your name on them probably still have some of those files.

@scolapasta scolapasta modified the milestones: 4.2, 4.1 Jul 24, 2015
@landreev landreev modified the milestones: 4.3, 4.2 Sep 28, 2015
@landreev
Contributor

I'm bumping this to 4.3; there are a lot of files to re-test before it can be closed. Also, more minor bugfixes may still be needed for this.
I'm working on this now, and I'm going to clean it up during the 4.3 cycle, for real (honest, seriously).

@scolapasta scolapasta removed this from the Not Assigned to a Release milestone Jan 28, 2016
@landreev landreev removed their assignment Jan 29, 2016
@pdurbin pdurbin added Type: Bug a defect and removed zTriaged labels Jun 29, 2017
@pdurbin pdurbin added the User Role: Depositor Creates datasets, uploads data, etc. label Jul 13, 2017
@mheppler
Contributor

With #2202 merged, I uploaded 13 .xls files and 2 .xlsx files to see which would be ingested. Only the 2 .xlsx files were ingested.

In a comment above @landreev suggested this issue was specific to the .xlsx files, so I will leave it to him to determine if this issue is done.

@pdurbin
Member

pdurbin commented Jun 13, 2019

@mheppler awesome that the two .xlsx files ingested successfully. If they're relatively small and safe to upload to a GitHub issue, please upload one or both to this one: Add tests for Excel/XSLX ingest #5896

@mheppler
Contributor

mheppler commented Jun 13, 2019

@pdurbin not sure where I got one of the files, and the other is available on production, but is restricted, so I do not believe it is appropriate to share or use in tests.

@DS-INRAE DS-INRAE moved this to ⚠️ Needed/Important in Recherche Data Gouv Jul 10, 2024
@cmbz

cmbz commented Aug 20, 2024

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

@cmbz cmbz closed this as completed Aug 20, 2024
@github-project-automation github-project-automation bot moved this from ⚠️ Needed/Important to Done in Recherche Data Gouv Aug 20, 2024

9 participants