Create Harvest DB and Table Structure #4612
Comments
SQLAlchemy work in harvesting logic |
migration and db versioning links |
DB created in dev space, see keys with |
Just FYI, we have ssh turned off on prod at the entire space level. So if we need to use ssh to connect, that's probably not a good long term approach... However, knowing that we cannot connect outside of cloud.gov to the RDS services is helpful to know! |
I would think the long-term solution is a proxy app. |
I tried to connect with Sqlectron yesterday and got basically the same connection error. |
Developed a Flask application and set up a PostgreSQL Docker container for local execution, as detailed in branch create-harvest-db. Interfaces were also added to interact with the corresponding tables. For example, to insert test records, use the following endpoints: To-do: |
@Jin-Sun-tts in response to your question, so far all the tables inherit an |
Pushed the Flask app to cloud.gov development https://harvesting-logic.app.cloud.gov/ |
@Jin-Sun-tts chatting with @krishnasanaka about the Flask form she's working on, and I believe there is work to be done to update the harvest_source table to the latest schema (here).
@GSA/data-gov-team can you take a look at the updates and add comments if you feel something was lost in translation? thanks. https://docs.google.com/document/d/1XzfTrPxu-asJ_55GoeZ2UOJsie9CuCegStS28BAL_40/edit |
we ended up exploding the |
@btylerburton @rshewitt agree, that is also what @FuhuXia and I talked about the other day. And we may have the |
Do we require the primary key |
Removed AC/Sketch regarding load test, as that is dependent on #4617 and related to #4619. Copied old AC/Sketch below:
|
i figure |
Which |
@btylerburton I believe we're talking about the |
to add to harvest_source db table:
@krishnasanaka these are non-required text fields, copy/pasted over from CKAN |
I have drafted a pull request (PR) at GSA/datagov-harvester#36 containing all the changes related to this issue. These changes need to be transferred to a separate repository, https://github.com/GSA/datagov-harvest-orchestrator, to facilitate further Flask UI modifications. |
@krishnasanaka Please reference the changes mentioned above and ensure that they are incorporated into the new repository |
we should think about things we want to seed into the database as part of the initialization |
@jbrown-xentity @rshewitt @btylerburton @FuhuXia @robert-bryson Please let me know if it's necessary to create a new ticket for research and experimentation with Flask-SQLAlchemy. |
I don't have any knowledge or opinion on the subject, discovery would be warranted unless someone else has experience... |
I don't have a strong opinion. I haven't used that specific project. Sometimes those intersectional type projects are very handy. Sometimes they are more hassle than they're worth. Sometimes they aren't maintained or the maintainer goes MIA without explanation. |
This offers some good context. Overall, seems like global session management is the win, and locking us into that lib for better or worse is the price of that. https://stackoverflow.com/questions/14343740/flask-sqlalchemy-or-sqlalchemy |
Thank you everyone for the feedback. We will conduct further research and uncover additional insights within the Flask app ticket #4619. Closing this ticket, as the table structure is settled and defined in SQLAlchemy, the DB is initialized, and the pytest tests run successfully, with results posted and verifiable via interfaces. |
User Story
In order to track the state of the new Harvesting 2.0 pipeline, datagovteam wants to create a Harvesting DB to record the status of harvest sources, harvest jobs, and harvest errors.
The Harvest DB will be defined in code using SQLAlchemy, so as to be reproducible through environments and deployments.
NOTE: As Google Docs allows for more flexibility in table structure while modifications are ongoing, we will use the Harvesting Research Doc (here) to finalize table structure before adding an accompanying ERD to the datagov-harvesting-logic docs.
Acceptance Criteria
[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]
GIVEN the team has settled on the Table structure above
AND it has been defined in an SQLAlchemy file
THEN it should be able to be created programmatically in a Docker container by running a DB initialization script.
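A minimal sketch of what this criterion could look like in practice, assuming three tables named after the entities in the user story (harvest sources, harvest jobs, harvest errors). The column names, the `new_id` helper, and the `init_db` function are hypothetical and are not the schema from the Harvesting Research Doc. An in-memory SQLite URL stands in for the PostgreSQL container so the sketch runs anywhere.

```python
import uuid

from sqlalchemy import Column, ForeignKey, String, create_engine, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


def new_id() -> str:
    """Hypothetical helper: generate a string UUID for primary keys."""
    return str(uuid.uuid4())


# Illustrative columns only; the agreed table structure lives in the research doc.
class HarvestSource(Base):
    __tablename__ = "harvest_source"
    id = Column(String(36), primary_key=True, default=new_id)
    name = Column(String, nullable=False)
    url = Column(String, nullable=False)


class HarvestJob(Base):
    __tablename__ = "harvest_job"
    id = Column(String(36), primary_key=True, default=new_id)
    harvest_source_id = Column(
        String(36), ForeignKey("harvest_source.id"), nullable=False
    )
    status = Column(String, nullable=False, default="new")


class HarvestError(Base):
    __tablename__ = "harvest_error"
    id = Column(String(36), primary_key=True, default=new_id)
    harvest_job_id = Column(
        String(36), ForeignKey("harvest_job.id"), nullable=False
    )
    message = Column(String, nullable=False)


def init_db(db_url: str = "sqlite:///:memory:"):
    """Create every table registered on Base and return the engine.

    The default URL is an assumption for local runs; a PostgreSQL URL
    would be passed in when targeting the Docker container.
    """
    engine = create_engine(db_url)
    Base.metadata.create_all(engine)
    return engine


engine = init_db()
table_names = sorted(inspect(engine).get_table_names())
```

Because `create_all` only issues `CREATE TABLE` for tables that do not exist yet, a script like this can be re-run safely against an already initialized database.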
GIVEN that the DB has been initialized
WHEN tests are run locally in pytest
THEN the results should be posted to the DB and be verifiable by running a query to generate the resulting report
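The "post results, then query for a report" step could be sketched as follows. This is an illustration under assumptions: the table layout, the `record_job` and `job_report` helpers, and the sample statuses are hypothetical, and the standard-library `sqlite3` module stands in for PostgreSQL because the report itself is plain SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified tables; not the schema from the research doc.
conn.executescript(
    """
    CREATE TABLE harvest_job (
        id TEXT PRIMARY KEY,
        harvest_source_id TEXT NOT NULL,
        status TEXT NOT NULL
    );
    CREATE TABLE harvest_error (
        id TEXT PRIMARY KEY,
        harvest_job_id TEXT NOT NULL REFERENCES harvest_job(id),
        message TEXT NOT NULL
    );
    """
)


def record_job(conn, job_id, source_id, status, errors=()):
    """Post one job result and any errors it produced."""
    conn.execute(
        "INSERT INTO harvest_job VALUES (?, ?, ?)", (job_id, source_id, status)
    )
    conn.executemany(
        "INSERT INTO harvest_error VALUES (?, ?, ?)",
        [(f"{job_id}-err-{i}", job_id, msg) for i, msg in enumerate(errors)],
    )
    conn.commit()


def job_report(conn):
    """Return (job id, status, error count) for every job, ordered by id."""
    return conn.execute(
        """
        SELECT j.id, j.status, COUNT(e.id) AS error_count
        FROM harvest_job AS j
        LEFT JOIN harvest_error AS e ON e.harvest_job_id = j.id
        GROUP BY j.id, j.status
        ORDER BY j.id
        """
    ).fetchall()


record_job(conn, "job-1", "source-1", "complete")
record_job(conn, "job-2", "source-1", "error", errors=["invalid JSON", "timeout"])
report = job_report(conn)
```

The `LEFT JOIN` keeps jobs with no errors in the report with a count of zero, which is what makes the result verifiable for both passing and failing runs.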
Background
[Any helpful contextual notes or links to artifacts/evidence, if needed]
Security Considerations (required)
[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]
Sketch