This project hosts the code for the three components dealing with the FL workflow in Synthema:
- The FL restapi is the main entrypoint for the system. It allows users to manage tasks.
- The FL server is the coordinator of the FL cluster.
- The FL client is the entity deployed at the edge where data chunks live.
The components are structured as a monorepo to harmonise common utilities and to keep the versions of the FL restapi, FL server and FL client aligned.
- The folder common provides general utilities, datasets and models.
- The folder apps includes the various applications mentioned.
- The folder fl_client is dedicated to the component FL client of the architecture.
- The folder fl_server is dedicated to the component FL server of the architecture.
- The folder restapi is dedicated to the REST API to allow for human interaction with the system.
Pull requests are welcome. Please make sure to update tests as appropriate.
Extending the code requires the following tools to be available on your machine:
- Poetry
- A running MLflow server: check the mlflow repo for a local MLflow deployment on K8s, or run the Docker container directly with the command
docker run -p 5000:5000 ghcr.io/mlflow/mlflow
- A running instance of RabbitMQ: see the docker-compose.yml file.
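Before going further, it can help to confirm that both services accept connections. The following is a minimal sketch, not part of this repository: it assumes MLflow is published on localhost:5000 (as in the docker command above) and that RabbitMQ listens on its default AMQP port 5672. Adjust the hosts and ports to your own setup.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved hosts.
        return False


# Assumed local defaults: 5000 matches the docker command above,
# 5672 is RabbitMQ's default AMQP port.
SERVICES = {"mlflow": ("localhost", 5000), "rabbitmq": ("localhost", 5672)}

if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        status = "reachable" if is_port_open(host, port) else "not reachable"
        print(f"{name} at {host}:{port} is {status}")
```

An open port only shows that something is listening there; it does not prove that the service behind it is healthy.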
A few configuration steps are needed before you can start contributing.
The process is as follows:
- Clone the repository
- Copy the .env file from the templates folder with
cp templates/dot_env .env
and sync it with your IDE (VS Code does this automatically when the file is named ".env"). Note that the path appended to both PYTHONPATH and MYPYPATH must be the root directory of this project.
- Run the script generate_envs.sh to create all dedicated virtual environments. Note that generate_envs.sh runs
poetry install
which by default creates the virtual environment outside the project, under your home directory. If you want the virtual environment to live inside the project directory, configure Poetry accordingly (for example, by setting virtualenvs.in-project to true) before running the script.
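The PYTHONPATH and MYPYPATH requirement can be checked with a short script. This is a sketch under assumptions: it supposes the .env file uses plain KEY=VALUE lines with path entries separated by the platform path separator, which may not match the actual template. The helper names read_env_file and missing_root_entries are hypothetical and do not exist in this repository.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and comments."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_root_entries(env_values, project_root, keys=("PYTHONPATH", "MYPYPATH")):
    """Return the keys whose path list does not contain project_root."""
    root = Path(project_root).resolve()
    missing = []
    for key in keys:
        entries = [e for e in env_values.get(key, "").split(os.pathsep) if e]
        # Entries such as ${PYTHONPATH} are variable references, not paths.
        paths = [Path(e).resolve() for e in entries if not e.startswith("$")]
        if root not in paths:
            missing.append(key)
    return missing


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        # Run from the project root; an empty list means both variables are fine.
        print(missing_root_entries(read_env_file(env_path), Path.cwd()))
```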
The main entrypoint for testing purposes is the file tests.sh, which performs local tests. You can also use the script images_test.sh to run the tests on top of Docker.
To build the images, use the script images_build.sh as follows.
bash images_build.sh build <image_tag> <docker_registry>
There are two main options for deploying the apps: running them in Docker or running them in Kubernetes.
You can deploy the apps in docker as follows.
- Open the docker-compose.yml file and set the appropriate env variables for each component.
- Run
docker compose up
to start the FL components and their dependencies.
- Go to localhost:8000/docs in the browser, which serves the interactive documentation generated from the OpenAPI spec of the REST API.
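The same information can be read programmatically. The sketch below assumes the REST API keeps FastAPI's default schema location, /openapi.json, on localhost:8000; if the app overrides that path, the URL needs adjusting. The function names are illustrative only.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec):
    """Return sorted (METHOD, path) pairs found in an OpenAPI document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path))
    return sorted(operations)


def fetch_spec(base_url="http://localhost:8000", timeout=5):
    """Download the OpenAPI document (assumes FastAPI's default /openapi.json)."""
    with urlopen(f"{base_url}/openapi.json", timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    try:
        for method, path in list_operations(fetch_spec()):
            print(method, path)
    except (OSError, ValueError) as exc:
        print(f"Could not read the OpenAPI spec: {exc}")
```

Listing the operations this way is a quick check that the REST API container started correctly and exposes the expected task endpoints.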
You can deploy the apps in Kubernetes with Helm. To do that, create a custom values.yaml file following what is done in this file. Then use the following command.
helm install fl-components ./fl-chart -f <your_values.yaml>
To run an end-to-end example, the easiest option is to run the preloaded model, which targets a classification task on the iris dataset.
- Make sure that all apps and dependencies are deployed.
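- Go to common/fl_models/iris/fl_model.py and run it to upload a federated model into MLflow, then copy the model details (name and version) it returns. Bear in mind that you should first change the URL in that file to match your deployment.
- Open the REST API docs (localhost:8000/docs in the Docker deployment), which serve as the interface while we finalise the UI.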
- Run the FL task by using the model attributes (name and version) returned when the model was uploaded. Set the use case to iris as it is the demo use case.
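Before launching the task, it can be useful to confirm that the name and version returned by the upload step are actually registered. The sketch below queries the MLflow REST API endpoint for model versions; it assumes the tracking server is reachable at http://localhost:5000, and the model name used in the example is a placeholder, not the name this project registers.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def model_version_url(tracking_url, name, version):
    """Build the MLflow REST URL that returns one registered model version."""
    query = urlencode({"name": name, "version": str(version)})
    return f"{tracking_url.rstrip('/')}/api/2.0/mlflow/model-versions/get?{query}"


def fetch_model_version(tracking_url, name, version, timeout=5):
    """Return the model_version payload; raises an OSError subclass on HTTP errors."""
    url = model_version_url(tracking_url, name, version)
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["model_version"]


if __name__ == "__main__":
    # Placeholder values: replace them with the details printed by fl_model.py.
    try:
        info = fetch_model_version("http://localhost:5000", "my-iris-model", 1)
        print(info.get("name"), info.get("version"), info.get("status"))
    except (OSError, ValueError, KeyError) as exc:
        print(f"Model version not found or server unreachable: {exc}")
```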
This project is licensed under the MIT License. See the LICENSE file for more details.
This project extends and uses the following open-source software, whose licenses are compatible with the MIT License:
- FastAPI (MIT License)
- fastapi-cli (MIT License)
- Jinja (MIT License)
- email_validator (MIT License)
- pydantic (MIT License)
- python-multipart (MIT License)
- pytest (MIT License)
- Uvicorn (MIT License)
- starlette (MIT License)
- httpx (MIT License)
- setuptools (MIT License)
- sqlmodel (MIT License)
- pika (MIT License)
- Flower (Apache 2.0 License)
- MLFlow (Apache 2.0 License)
- boto3 (Apache 2.0 License)
- Numpy (BSD License)
- Pandas (BSD License)
- torch (BSD License)
- psycopg2-binary (LGPL License)
- typing-extensions (PSF License)
- Psutil (BSD License)