[ENH] Add a script to generate csv files #46
First, convert the BSON dump to JSON using MongoDB's tool. Second, run this script on the JSON to generate the CSV tables.
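The second step can be sketched roughly as follows. This is a minimal illustration, not the script added in this PR: it assumes the JSON export holds one document per line, it uses only the standard library (`json`, `csv`) where the PR's script imports pandas, and the function name `json_lines_to_csv` and the `rating` column are hypothetical.

```python
import csv
import io
import json


def json_lines_to_csv(lines, fieldnames):
    """Convert an iterable of JSON-document strings into CSV text.

    Assumption: each non-empty line is one JSON object, and only the
    keys listed in `fieldnames` are written out.
    """
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        extrasaction="ignore",  # silently drop keys not in fieldnames
        lineterminator="\n",
    )
    writer.writeheader()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        writer.writerow(json.loads(line))
    return out.getvalue()


# Hypothetical sample records; real dumps will have more fields.
sample = [
    '{"md5sum": "abc", "rating": 1}',
    '{"md5sum": "def", "rating": 3, "extra": "ignored"}',
]
csv_text = json_lines_to_csv(sample, ["md5sum", "rating"])
print(csv_text)
```

In practice the lines would come from an open file handle on the converted JSON dump rather than from an in-memory list.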
data = defaultdict(emptylist, {})
for entry in dictlist:
    md5sum = entry['provenance']['md5sum']
Dumps from the rating collection don't have a provenance field; md5sum should exist at the top level of the object.
Yep, you need to use the flag --no-dedup for those. The flag could probably use a renaming.
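For illustration, the two record layouts discussed here could also be handled by a lookup that falls back to the top level. This is only a sketch of an alternative, not what the script in this PR does (it relies on --no-dedup instead); the helper name `get_md5sum` is hypothetical.

```python
def get_md5sum(entry):
    """Return the md5sum of a record.

    Assumption: records either nest it under 'provenance' or, as in
    dumps from the rating collection, keep it at the top level.
    """
    provenance = entry.get("provenance")
    if isinstance(provenance, dict) and "md5sum" in provenance:
        return provenance["md5sum"]
    # Fall back to the top-level field; raises KeyError if absent.
    return entry["md5sum"]


nested = {"provenance": {"md5sum": "abc"}}
flat = {"md5sum": "def"}
print(get_md5sum(nested), get_md5sum(flat))
```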
import json

import numpy as np
import pandas as pd
pandas and numpy aren't requirements for eve. Maybe add them to requirements.txt so the main docker container can run the tool? Otherwise, I'm not opposed to a separate requirements file for the tools directory, since it seems the script might be used outside the containers.
Yeah, I see this more as a place to have the script, not as a dependency.
You know the dependencies and how to call it, so I'm merging.