If you wish to contribute code to this pipeline engine, please let us know at web@geobon.org.
The recommended method is to set up an instance of BON in a Box somewhere you can easily play with the script files, using the local or remote setup described in the user documentation. You can create a branch or fork to save your work. Once complete, open a pull request to this repository. The pull request will be peer-reviewed before acceptance.
The code in this repository runs an engine, but the engine needs content! Here are the steps to start the server in development mode with the BON in a Box scripts and pipelines:
- Docker and Docker Compose must be installed.
- Clone this repo:
  ```
  git clone git@github.com:GEO-BON/bon-in-a-box-pipeline-engine.git pipeline-engine
  cd pipeline-engine
  ```
- Clone the BON in a Box repo (or any compatible repo of your choice) into the pipeline-repo folder:
  ```
  git clone git@github.com:GEO-BON/bon-in-a-box-pipelines.git pipeline-repo
  ```
- Create a runner.env file as per the user instructions, then return to the previous folder:
  ```
  cd ..
  ```
- Pull the pre-compiled images:
  ```
  ./dev-server.sh pull
  ```
For the global project, we recommend Visual Studio Code with the following extensions:
- Git Graph
- Markdown Preview Mermaid
- Mermaid Markdown Syntax Highlighting

For the script-server (Kotlin code), we recommend IntelliJ IDEA. Note that on Linux there will be an ownership conflict between the Gradle files generated by the development Docker container and those generated by the IDE. To solve this, make sure to stop the containers and run
```
sudo chown -R <yourinfo>:<yourinfo> .
```
before running the tests in IntelliJ.
- Build the remaining images:
  ```
  ./dev-server.sh build
  ```
- Start the development server:
  ```
  ./dev-server.sh up
  ```
- If there is a container name conflict, run
  ```
  ./dev-server.sh clean
  ```
The `up` command enables:
- OpenAPI editor at http://localhost/swagger/
- UI server: automatic React hot-swapping
- Script-server: Kotlin hot-swapping by launching `./script-server/hotswap.sh`
- NGINX: `http-proxy/conf.d/ngnix.conf` will be loaded
Once in a while, run `docker compose -f compose.yml -f compose.dev.yml pull` to get the latest base images.
The servers are versioned by the build date of the Docker image. The version can be checked in the version tab of the UI.
- Create a branch whose name ends with "staging" from the head of the main branch.
- Merge your changes to that branch. The Docker Hub GitHub Action triggers for the main branch and for any branch whose name ends with "staging". The branch name is appended to the tag of the Docker image.
- Caveat: this only builds the images whose paths were modified. For example, if only the `viewer` folder is modified, only the gateway will be rebuilt. However, the server will look for both images with the same prefix; in this case, `script-server-staging` might not exist, or might be outdated. It is possible to launch the build of the script-server manually to make sure it exists and is up to date:
  - On the GitHub website, navigate to the Actions tab
  - Open the desired action
  - Click on the arrow next to "Run workflow"
  - Select the desired staging branch
  - Run the workflow
  - Wait for completion
- It is now possible to test the staging prod servers by running `./server-up.sh <branchname>`. The launch script will look for this special tag on Docker Hub. For example, `./server-up.sh staging` will download and use both the "gateway-staging" and "script-server-staging" images.
- Send the above command to a few beta users.
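As a rough sketch of how the branch suffix maps onto image tags, the following function mimics the behaviour described above. The `geobon/bon-in-a-box` registry path and the function itself are illustrative assumptions, not code taken from `server-up.sh`:

```shell
# Hypothetical sketch: how a staging branch name could map to image tags.
# The registry path below is an assumption, not the real launch script's value.
tag_for() {
  image=$1   # e.g. "gateway" or "script-server"
  branch=$2  # e.g. "staging"; empty for a regular production launch
  if [ -n "$branch" ]; then
    echo "geobon/bon-in-a-box:${image}-${branch}"
  else
    echo "geobon/bon-in-a-box:${image}"
  fi
}

tag_for gateway staging        # gateway image for the staging branch
tag_for script-server staging  # script-server image for the staging branch
```

The point to remember is that both prefixes (gateway and script-server) are looked up with the same branch suffix, which is why a missing manually-triggered build can leave one of them stale.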
The changes are live as soon as they are merged to the main branch: the Docker images are built, pushed to GEO BON's Docker Hub, and the next time someone starts the server, the new images will be pulled.
Yes, we all know problems occur in production that do not happen in dev mode. So, in order to build and test production Docker images locally, do the following:
- In the pipeline-repo folder, delete the .server folder.
- Create a symbolic link from .server to the parent:
  ```
  ln -s ../ .server
  ```
- Build the server with
  ```
  .server/prod-server.sh command build
  ```
- Then run it with
  ```
  .server/prod-server.sh command up
  ```
  (`.server/prod-server.sh clean` might be needed if you get the usual name conflict error)
- Stop the process with Ctrl+C, unless you used the -d option in the previous command.

Warning: undo this by removing the symlink if you are going to use `./server-up.sh` for a regular launch of the production servers; otherwise it will check out files in your parent pipeline engine repo through the symlink.
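To illustrate the warning above, a small guard like the following could detect the leftover symlink before a regular launch. This check is hypothetical and not part of the real `server-up.sh`:

```shell
# Hypothetical safety check: warn if .server is still a symlink to the engine repo.
# Shown only to illustrate the caveat above; not part of the real server-up.sh.
check_server_dir() {
  repo=$1
  if [ -L "$repo/.server" ]; then
    echo "Warning: $repo/.server is a symlink; remove it before ./server-up.sh" >&2
    return 1
  fi
  return 0
}

# Simulate the dev-mode setup in a temporary folder to exercise the check.
workdir=$(mktemp -d)
mkdir "$workdir/pipeline-repo"
ln -s ../ "$workdir/pipeline-repo/.server"
check_server_dir "$workdir/pipeline-repo" || echo "symlink detected"
```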
```mermaid
stateDiagram-v2
    state "script-server" as script
    state "scripts (static)" as scripts
    state "output (static)" as output
    state "R runner" as r
    state "Julia runner" as julia
    [*] --> ngnix
    ngnix --> ui
    ngnix --> viewer
    ngnix --> script
    ngnix --> output
    script --> scripts
    script --> r
    script --> julia
```
- ui and viewer: React front ends. In production, these are served statically by the NGINX gateway.
- script-server: runs scripts and orchestrates pipelines
- R runner: Docker container dedicated to running R code, with the most relevant packages pre-installed
- Julia runner: Docker container dedicated to running Julia code
In addition to these services,
```mermaid
flowchart TD
    never[Never run] --> running[Running]
    running --> input[(- run folder\n- input.json)]
    running --> log[(log file)]
    running --> success{Success?}
    success --> |Yes| Done
    Done --> output[(output.json)]
    success --> |No| Failed
    Failed --> |Add error flag|output
```
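The run lifecycle in the diagram above can be sketched in shell as follows. This is a local illustration only; the real orchestration lives in the script-server (Kotlin), and the `run_step` function is a hypothetical stand-in:

```shell
# Illustrative sketch of the run lifecycle shown in the diagram above.
# run_step is hypothetical; real orchestration is in the script-server.
run_step() {
  dir=$1; cmd=$2
  mkdir -p "$dir"                 # Running: create the run folder
  echo '{}' > "$dir/input.json"   # ...and the input.json
  if sh -c "$cmd" > "$dir/log.txt" 2>&1; then
    echo '{"result":"ok"}' > "$dir/output.json"            # Done: write outputs
  else
    echo '{"error":"script failed"}' > "$dir/output.json"  # Failed: error flag in output
  fi
}

okdir="$(mktemp -d)/ok"
kodir="$(mktemp -d)/ko"
run_step "$okdir" "echo hello"  # succeeds: output.json holds the result
run_step "$kodir" "exit 1"      # fails: output.json holds the error flag
```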
The OpenApi specification file is used by the UI to launch runs and track them until completion.
```mermaid
sequenceDiagram
    ui->>script_server: script/list
    script_server-->>ui: json map of scripts -> names
    ui->>script_server: script/{path}/info
    script_server-->>ui: script info json
    ui->>script_server: script/run
    script_server->>script: launch
    script_server-->>ui: runId
    loop Until output.json file generated
        ui->>script_server: output/{runId}/logs.txt
        script_server-->>ui: logs text
        ui->>script_server: output/{runId}/output.json
    end
    script-->>script_server: output.json
    ui->>script_server: output/{runId}/output.json
    script_server-->>ui: script output json
```
```mermaid
sequenceDiagram
    ui->>script_server: pipeline/list
    script_server-->>ui: json map of pipeline -> names
    ui->>script_server: pipeline/{path}/info
    script_server-->>ui: pipeline info json
    ui->>script_server: pipeline/{path}/run
    script_server-->>ui: runId
    loop For each step
        script_server->>script: run
        Note right of script: see previous diagram
        script-->>script_server: output.json (script)
        ui->>script_server: pipeline/{runId}/outputs
        script_server-->>ui: pipelineOutput.json (pipeline)
    end
```
Every second, the UI polls for:
- pipelineOutput.json from the pipeline, to get the output folders of individual scripts. Polling stops when the pipeline stops.
- logs.txt of individual scripts, for real-time logging, only if the log section is open. Polling stops when the individual script completes, or when the log section is closed.
- output.json of individual scripts, to know when a script completes and to display its outputs. Polling stops when the script stops.
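The polling pattern for output.json can be sketched with a local file in place of the HTTP endpoint. This simulation only illustrates the loop; the real UI polls the script-server over HTTP once per second:

```shell
# Sketch of the UI's polling pattern, simulated locally instead of over HTTP.
outdir=$(mktemp -d)

# Simulate a script that completes after a short delay by writing output.json.
( sleep 1; echo '{"result": 42}' > "$outdir/output.json" ) &

# Poll until output.json appears, as the UI does for each running script.
while [ ! -f "$outdir/output.json" ]; do
  sleep 0.2
done
cat "$outdir/output.json"
```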
- Using http://localhost/swagger, edit the specification.
- Copy the result to `script-server/api/openapi.yaml`.
- Use `ui/BonInABoxScriptService/generate-client.sh` and `script-server/generate-server-openapitools.sh` to regenerate the client and the server.
- Merge carefully; not all generated code is to be kept.
- Implement the gaps.
Since runner-conda and runner-julia run in separate Docker containers, when the user stops the pipeline, the signal must travel from the script-server to the runner, and then to the running script. Docker does not allow this by default, which is why we save the PID in a file and use a separate exec command to kill the process.

The PID file is called `.pid` and is located in the output folder of the run. It is deleted when the script completes. For details, see ScriptRun.kt.
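A minimal sketch of this PID-file pattern, using a local background process in place of a runner container (the real implementation is in ScriptRun.kt and crosses the container boundary with a separate exec command):

```shell
# Sketch of the PID-file kill pattern; a local process stands in for the runner.
outdir=$(mktemp -d)

# The runner records the script's PID in the run's output folder.
sleep 30 &
pid=$!
echo "$pid" > "$outdir/.pid"

# On user cancellation, a separate command reads the PID and signals the
# process -- analogous to an exec into the runner container to run kill.
kill -TERM "$(cat "$outdir/.pid")"
wait "$pid" 2>/dev/null || true

# The .pid file is removed once the script is no longer running.
rm -f "$outdir/.pid"
```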