The openFDA BigData Pipeline enables collection, processing, and real-time presentation of data on adverse drug events from the openFDA database.
The solution uses Apache Kafka as a message broker, MongoDB as a document store, and Spring Boot for the services, and it is Dockerized.
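The data flow described above can be pictured with a small, self-contained Python sketch. It is only an analogy: an in-memory `queue.Queue` stands in for the Kafka topic and a dictionary stands in for the MongoDB collection. The function names and the shape of the sample records are made up for illustration; the real services are separate Spring Boot applications.

```python
import json
import queue
import threading

SENTINEL = None  # marks the end of the stream in this sketch


def produce(events, topic):
    """Serialize each event to JSON and publish it (queue is a stand-in for Kafka)."""
    for event in events:
        topic.put(json.dumps(event))
    topic.put(SENTINEL)


def consume(topic, store):
    """Read messages until the sentinel and upsert them by report id (dict is a stand-in for MongoDB)."""
    while True:
        message = topic.get()
        if message is SENTINEL:
            break
        document = json.loads(message)
        store[document["safetyreportid"]] = document


def run_pipeline(events):
    """Run producer and consumer concurrently and return the resulting store."""
    topic = queue.Queue()
    store = {}
    consumer = threading.Thread(target=consume, args=(topic, store))
    consumer.start()
    produce(events, topic)
    consumer.join()
    return store
```

The point of the broker in the middle is that the producer never calls the consumer directly, so either side can be restarted or scaled without the other knowing.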
This repository contains the code for the openFDA BigData Pipeline solution:
- openfda-producer is a microservice built with Spring Boot and written in Java
- openfda-consumer is a microservice built with Spring Boot and written in Java
- openfda-live-dashboard is a web application built with Flask and Dash and written in Python
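To give a feel for the kind of data the pipeline collects, here is a minimal Python sketch that builds a query URL for the public openFDA drug adverse event endpoint and reduces one report to a few fields. It is not the code of openfda-producer, which is written in Java. The helper names `build_event_url`, `summarize_event`, and `fetch_events`, as well as the choice of fields, are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public openFDA endpoint for drug adverse event reports.
OPENFDA_EVENT_URL = "https://api.fda.gov/drug/event.json"


def build_event_url(search=None, limit=10, skip=0):
    """Build a query URL using the API's search, limit and skip parameters."""
    params = {"limit": limit, "skip": skip}
    if search:
        params["search"] = search
    return f"{OPENFDA_EVENT_URL}?{urlencode(params)}"


def summarize_event(event):
    """Reduce one adverse event report to a small flat record (fields chosen for illustration)."""
    patient = event.get("patient", {})
    return {
        "report_id": event.get("safetyreportid"),
        "received": event.get("receivedate"),
        "drugs": [d.get("medicinalproduct") for d in patient.get("drug", [])],
        "reactions": [r.get("reactionmeddrapt") for r in patient.get("reaction", [])],
    }


def fetch_events(url):
    """Fetch one page of reports; requires network access."""
    with urlopen(url, timeout=10) as response:
        return json.load(response).get("results", [])
```

A typical use would be `[summarize_event(e) for e in fetch_events(build_event_url(limit=5))]`, with each summarized record then published as one message.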
The project runs with the default configuration defined in each of the services and in pipeline.yml. For more details, refer directly to those configuration files.
If you intend to try running the project yourself, I have put together a pipeline.yml configuration that can help you get started.
Calling the following command
docker-compose -f pipeline.yml up
will:
- Start openfda-producer container
- Start zookeper container
- Start kafka container
- Start mongodb container
- Start openfda-consumer container
- Start openfda-live-dashboard container which will expose port 8050
- Start jupyter-notebook container which will expose port 8888
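After starting the stack, a quick way to confirm that the two exposed ports accept connections is a small check like the one below. It is a sketch that assumes the containers publish their ports on `localhost`; the helper name `port_open` is hypothetical, and the service-to-port mapping is taken from the list above.

```python
import socket

# Ports from the list above; the host is an assumption (ports published on the local machine).
HOST = "localhost"
SERVICES = {"openfda-live-dashboard": 8050, "jupyter-notebook": 8888}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "up" if port_open(HOST, port) else "not reachable"
        print(f"{name} (port {port}): {status}")
```

A successful TCP connection only shows that something is listening on the port; it does not prove the application behind it has finished starting.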
Once all your Docker containers are up and running, you can access the openfda-live-dashboard web dashboard via a browser under the following URL:
In addition, you can access the Jupyter Notebook (jupyter-notebook) via a browser under the following URL:
Bug reports and pull requests are welcome on GitHub at https://github.com/koziolk/openfda-bigdata-pipeline