CI for Windows (#587)

* Rewrite 'Hello Federation' test
* Isolate 'Hello Federation' test
* Rewrite 'dockerization_test'
* Add copyright notice
* Add FeTS challenge test
* Move initialization of 'Hello federation' test in the main part
* Update FeTS Challenge workflow
* Fix FeTS challenge test
* Fix FeTS challenge test
* Check col2 existence in hello federation test
* Add pause
* Specify worker number explicitly
* Specify workers count
* Fix errors
* Include Windows in workflows
* Fix Kvasir interactive API test
* Fix imports
* Fix paths
* Fix import
* Replace wget
* Replace wget & unzip
* Bump tensorflow version
* Run linter and coverage tests on Ubuntu only
* Add comments
* Include saving logic in Hello Federation script
* Fix linter
* Add Dockerization workflow
* Switch docker engine for Windows
* Remove Windows in dockerization test
* Add test for double workspace export
* Fix workflow name
* Add graminize test
* Update Hello Federation script
* Use low-level Docker API
* Use threading in dockerization test
* Use pauses between agg & col start
* Exclude gramine test
* use shell
* Try interactive test on ubuntu
* Remove Windows in FeTS challenge test
* Add graminize test
* Move tensorflow interactive API test to Python
* Parametrize config files
* Remove config parameters
* Add removing utility
* Pin tensorflow to 2.8.3
* Remove Graminize test
* Update straggler handling test
* Remove Windows from Interactive TF test
* Reuse common test logic
* Fix imports & calls
* Use relative imports
* Debug sys path
* Execute module
* Run Python modules
* Remove unused strategy matrix
* Fix FeTS challenge test
* Resolve review comments
* Remove debug code

Signed-off-by: Ilya Trushkin <ilya.trushkin@intel.com>
securefederatedai · Jan 3, 2023 · 30e212f
1 parent 41ad51f
commit 30e212f
Showing 35 changed files with 859 additions and 783 deletions.
diff --git a/.github/workflows/dockerization.yml b/.github/workflows/dockerization.yml
@@ -0,0 +1,32 @@
+# This workflow will install Python dependencies, run tests and lint with a single version of Python
+# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
+
+name: Dockerization
+
+on:
+  push:
+    branches: [ develop ]
+  pull_request:
+    branches: [ develop ]
+
+permissions:
+  contents: read
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python 3.8
+      uses: actions/setup-python@v3
+      with:
+        python-version: "3.8"
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install .
+    - name: Dockerization test
+      run: |
+        python -m tests.github.dockerization_test
+    
diff --git a/.github/workflows/double_ws_export.yml b/.github/workflows/double_ws_export.yml
@@ -0,0 +1,32 @@
+# This workflow will install Python dependencies, run tests and lint with a single version of Python
+# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
+
+name: Double workspace export
+
+on:
+  push:
+    branches: [ develop ]
+  pull_request:
+    branches: [ develop ]
+
+permissions:
+  contents: read
+
+jobs:
+  build:
+    runs-on: 'ubuntu-latest'
+
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python 3.8
+      uses: actions/setup-python@v3
+      with:
+        python-version: "3.8"
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install .
+    - name: Double workspace export test
+      run: |
+        python -m tests.github.test_double_ws_export
+    
diff --git a/.github/workflows/fets-challenge.yml b/.github/workflows/fets-challenge.yml
@@ -48,5 +48,5 @@ jobs:
         head -n 8 testing/data/train_3d_rad_segmentation.csv > /home/runner/work/openfl/openfl/seg_test_train.csv
         cd /home/runner/work/openfl/openfl
         ls
-        bash tests/github/test_fets_challenge.sh fets_challenge_seg_test aggregator col1 col2 $(hostname --all-fqdns | awk '{print $1}') --rounds-to-train 1
+        python -m tests.github.test_fets_challenge --template fets_challenge_seg_test --fed_workspace aggregator --col1 col1 --col2 col2 --rounds-to-train 1
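The removed shell command resolved the aggregator host with `hostname --all-fqdns | awk '{print $1}'`, which exists only on Linux. The dockerization test in this commit calls `socket.getfqdn()` for the same purpose; whether the rewritten FeTS challenge test does the same is not shown in this diff. A minimal sketch of a portable lookup, where `local_fqdn` is a hypothetical helper name:

```python
import socket


def local_fqdn() -> str:
    """Return this machine's fully qualified domain name.

    Falls back to the plain hostname if the resolver returns an empty string.
    `local_fqdn` is a hypothetical helper, not part of OpenFL.
    """
    fqdn = socket.getfqdn()
    return fqdn or socket.gethostname()


if __name__ == '__main__':
    print(local_fqdn())
```

Unlike the shell pipeline, this returns a single name chosen by the resolver rather than the first entry of a list, so the two approaches may disagree on multi-homed hosts.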
         
diff --git a/.github/workflows/interactive-kvasir.yml b/.github/workflows/interactive-kvasir.yml
@@ -15,7 +15,10 @@ permissions:
 jobs:
   build:
 
-    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+       os: ['ubuntu-latest', 'windows-latest']
+    runs-on: ${{ matrix.os }}
 
     steps:
     - uses: actions/checkout@v3
@@ -30,8 +33,6 @@ jobs:
     - name: Interactive API - pytorch_kvasir_unet
       run: |
         python setup.py build_grpc
-        pip install torch==1.7.1
-        pip install torchvision==0.8.2
-        cd tests/github/interactive_api_director/experiments/pytorch_kvasir_unet
-        ./run.sh
-        pkill fx
+        pip install torch==1.8.1
+        pip install torchvision==0.9.1
+        python -m tests.github.interactive_api_director.experiments.pytorch_kvasir_unet.run
diff --git a/.github/workflows/interactive-tensorflow.yml b/.github/workflows/interactive-tensorflow.yml
@@ -14,8 +14,7 @@ permissions:
 
 jobs:
   build:
-
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-latest # Add Windows support after https://github.com/keras-team/keras/issues/16308 is resolved
 
     steps:
     - uses: actions/checkout@v3
@@ -31,6 +30,4 @@ jobs:
       run: |
         python setup.py build_grpc
         pip install tensorflow==2.8
-        cd tests/github/interactive_api_director/experiments/tensorflow_mnist
-        ./run.sh
-        pkill fx
+        python -m tests.github.interactive_api_director.experiments.tensorflow_mnist.run
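The removed steps launched `./run.sh` and then cleaned up with `pkill fx`, both of which are POSIX-specific. How the new `run` modules manage their child processes is not shown in this diff. The sketch below is one portable way to start a long-running process and stop it from Python; `run_and_stop` and its parameters are hypothetical names.

```python
import subprocess
import sys


def run_and_stop(command, run_for_seconds=1.0, stop_timeout=10.0):
    """Start `command`, let it run briefly, then stop it and return its exit code."""
    process = subprocess.Popen(command)
    try:
        # wait() returns early if the process exits on its own.
        process.wait(timeout=run_for_seconds)
    except subprocess.TimeoutExpired:
        # terminate() sends SIGTERM on POSIX and calls TerminateProcess on Windows.
        process.terminate()
        try:
            process.wait(timeout=stop_timeout)
        except subprocess.TimeoutExpired:
            # Last resort if the process ignores the termination request.
            process.kill()
            process.wait()
    return process.returncode


if __name__ == '__main__':
    # A sleeping interpreter stands in for a long-running service.
    sleeper = [sys.executable, '-c', 'import time; time.sleep(60)']
    print(run_and_stop(sleeper))
```

Keeping the `Popen` handle lets the test stop exactly the process it started, whereas `pkill fx` matches processes by name.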
diff --git a/.github/workflows/straggler-handling.yml b/.github/workflows/straggler-handling.yml
@@ -14,8 +14,10 @@ permissions:
 
 jobs:
   build:
-
-    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+       os: ['ubuntu-latest', 'windows-latest']
+    runs-on: ${{ matrix.os }}
 
     steps:
     - uses: actions/checkout@v3
@@ -29,4 +31,4 @@ jobs:
         pip install .
     - name: Test Straggler Handling Interface
       run: |
-        bash tests/github/test_hello_federation.sh torch_cnn_mnist_straggler_check aggregator col1 col2 $(hostname --all-fqdns | awk '{print $1}') --rounds-to-train 3
+        python -m tests.github.test_hello_federation --template torch_cnn_mnist_straggler_check --fed_workspace aggregator --col1 col1 --col2 col2  --rounds-to-train 3
diff --git a/.github/workflows/taskrunner.yml b/.github/workflows/taskrunner.yml
@@ -15,7 +15,10 @@ permissions:
 jobs:
   build:
 
-    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+       os: ['ubuntu-latest', 'windows-latest']
+    runs-on: ${{ matrix.os }}
 
     steps:
     - uses: actions/checkout@v3
@@ -29,4 +32,4 @@ jobs:
         pip install .
     - name: Test TaskRunner API
       run: |
-        bash tests/github/test_hello_federation.sh keras_cnn_mnist aggregator col1 col2 $(hostname --all-fqdns | awk '{print $1}') --rounds-to-train 3 --save-model output_model
+        python -m tests.github.test_hello_federation --template keras_cnn_mnist --fed_workspace aggregator --col1 col1 --col2 col2  --rounds-to-train 3 --save-model output_model
diff --git a/.github/workflows/taskrunner_python_3.10.yml b/.github/workflows/taskrunner_python_3.10.yml
@@ -15,7 +15,10 @@ permissions:
 jobs:
   build:
 
-    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+       os: ['ubuntu-latest', 'windows-latest']
+    runs-on: ${{ matrix.os }}
 
     steps:
     - uses: actions/checkout@v3
@@ -29,4 +32,4 @@ jobs:
         pip install .
     - name: Test TaskRunner API
       run: |
-        bash tests/github/test_hello_federation.sh keras_cnn_mnist aggregator col1 col2 $(hostname --all-fqdns | awk '{print $1}') --rounds-to-train 3
+        python -m tests.github.test_hello_federation --template keras_cnn_mnist --fed_workspace aggregator --col1 col1 --col2 col2 --rounds-to-train 3
diff --git a/.github/workflows/taskrunner_python_3.9.yml b/.github/workflows/taskrunner_python_3.9.yml
@@ -15,7 +15,10 @@ permissions:
 jobs:
   build:
 
-    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+       os: ['ubuntu-latest', 'windows-latest']
+    runs-on: ${{ matrix.os }}
 
     steps:
     - uses: actions/checkout@v3
@@ -29,4 +32,4 @@ jobs:
         pip install .
     - name: Test TaskRunner API
       run: |
-        bash tests/github/test_hello_federation.sh keras_cnn_mnist aggregator col1 col2 $(hostname --all-fqdns | awk '{print $1}') --rounds-to-train 3
+        python -m tests.github.test_hello_federation --template keras_cnn_mnist --fed_workspace aggregator --col1 col1 --col2 col2 --rounds-to-train 3
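The workflows above now call `python -m tests.github.test_hello_federation` with named flags instead of positional shell arguments. The script itself is not part of this excerpt, so the following is only a sketch of an `argparse` interface that would accept those command lines; the defaults, types, and the `build_parser` name are assumptions.

```python
from argparse import ArgumentParser


def build_parser() -> ArgumentParser:
    """Build a parser for the flags seen in the workflow commands (hypothetical)."""
    parser = ArgumentParser(description='Hello Federation test (sketch)')
    parser.add_argument('--template', default='keras_cnn_mnist')
    parser.add_argument('--fed_workspace', default='aggregator')
    parser.add_argument('--col1', default='col1')
    parser.add_argument('--col2', default='col2')
    # argparse exposes '--rounds-to-train' as args.rounds_to_train.
    parser.add_argument('--rounds-to-train', type=int, default=None)
    parser.add_argument('--save-model', default=None)
    return parser


if __name__ == '__main__':
    args = build_parser().parse_args([
        '--template', 'keras_cnn_mnist',
        '--fed_workspace', 'aggregator',
        '--col1', 'col1',
        '--col2', 'col2',
        '--rounds-to-train', '3',
        '--save-model', 'output_model',
    ])
    print(args.template, args.rounds_to_train, args.save_model)
```

Named flags make the invocation identical on Ubuntu and Windows runners, since no shell substitution is needed to build the argument list.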
diff --git a/openfl/interface/interactive_api/experiment.py b/openfl/interface/interactive_api/experiment.py
@@ -313,11 +313,12 @@ def _pack_the_workspace():
         from shutil import copytree
         from shutil import ignore_patterns
         from shutil import make_archive
-        from shutil import rmtree
         from os import getcwd
         from os import makedirs
         from os.path import basename
 
+        from openfl.utilities.utils import rmtree
+
         archive_type = 'zip'
         archive_name = basename(getcwd())
 

diff --git a/openfl/interface/workspace.py b/openfl/interface/workspace.py
@@ -394,22 +394,23 @@ def dockerize_(context, base_image, save):
         'BASE_IMAGE': base_image
     }
 
-    client = docker.from_env(timeout=3600)
+    cli = docker.APIClient()
     echo('Building the Docker image')
     try:
-        client.images.build(
+        for line in cli.build(
             path=str(workspace_path),
             tag=workspace_name,
             buildargs=build_args,
-            dockerfile=dockerfile_workspace
-        )
-    except docker.errors.BuildError as e:
-        for log in e.build_log:
-            msg = log.get('stream')
-            if msg:
-                echo(msg)
-        echo('Failed to build the image\n' + str(e) + '\n')
-        sys.exit(1)
+            dockerfile=dockerfile_workspace,
+            timeout=3600,
+            decode=True
+        ):
+            if 'stream' in line:
+                print(f'> {line["stream"]}', end='')
+            elif 'error' in line:
+                echo('Failed to build the Docker image:')
+                echo(line)
+                sys.exit(1)
     finally:
         os.remove(workspace_archive)
         os.remove(dockerfile_workspace)
@@ -419,6 +420,7 @@ def dockerize_(context, base_image, save):
     if save:
         workspace_image_tar = workspace_name + '_image.tar'
         echo('Saving the Docker image...')
+        client = docker.from_env(timeout=3600)
         image = client.images.get(f'{workspace_name}')
         resp = image.save(named=True)
         with open(workspace_image_tar, 'wb') as f:
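With `decode=True`, the low-level `APIClient.build` call yields one dictionary per build event, and the hunk above distinguishes events by the `stream` and `error` keys. The sketch below isolates that loop so it can be exercised without a Docker daemon; `report_build_log` and its `write` parameter are hypothetical, and the sample events are made up for illustration.

```python
def report_build_log(lines, write=print):
    """Echo decoded Docker build events.

    Returns False as soon as an event carries an 'error' key, True otherwise.
    `lines` is any iterable of dicts shaped like decoded build events.
    """
    for line in lines:
        if 'stream' in line:
            # Build output already ends with a newline, so suppress the extra one.
            write(f'> {line["stream"]}', end='')
        elif 'error' in line:
            write(f'Failed to build the Docker image: {line["error"]}')
            return False
    return True


if __name__ == '__main__':
    # Made-up events standing in for a real build log.
    sample_events = [
        {'stream': 'Step 1/2 : FROM openfl\n'},
        {'aux': {'ID': 'sha256:0000'}},
        {'error': 'example failure'},
    ]
    print(report_build_log(sample_events))
```

Streaming the log this way prints progress while the build runs, whereas the removed high-level call only exposed the log after a `BuildError` was raised.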

diff --git a/openfl/utilities/utils.py b/openfl/utilities/utils.py
@@ -13,6 +13,8 @@
 from typing import List
 from typing import Optional
 from typing import Tuple
+import stat
+import shutil
 
 import numpy as np
 from dynaconf import Dynaconf
@@ -258,3 +260,12 @@ def change_tags(tags, *, add_field=None, remove_field=None) -> Tuple[str, ...]:
 
     tags = tuple(sorted(tags))
     return tags
+
+
+def rmtree(path, ignore_errors=False):
+    def remove_readonly(func, path, _):
+        "Clear the readonly bit and reattempt the removal"
+        if os.name == 'nt':
+            os.chmod(path, stat.S_IWRITE)  # Windows cannot remove read-only files.
+        func(path)
+    return shutil.rmtree(path, ignore_errors=ignore_errors, onerror=remove_readonly)
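The helper above retries a failed removal after clearing the read-only bit. A self-contained sketch of the same technique follows; `force_rmtree` is a hypothetical name, and the choice between `onexc` and `onerror` is an addition here because `onerror` is deprecated from Python 3.12 onward.

```python
import os
import shutil
import stat
import sys
import tempfile


def force_rmtree(path):
    """Remove a directory tree, clearing the read-only bit when a removal fails."""
    def clear_readonly_and_retry(func, failed_path, _exc):
        # Make the entry writable, then repeat the operation that failed.
        os.chmod(failed_path, stat.S_IWRITE)
        func(failed_path)

    # Both hooks call the handler with (function, path, exception details).
    hook = 'onexc' if sys.version_info >= (3, 12) else 'onerror'
    shutil.rmtree(path, **{hook: clear_readonly_and_retry})


# Build a temporary directory containing a read-only file, then remove it.
workspace = tempfile.mkdtemp()
locked_file = os.path.join(workspace, 'locked.txt')
with open(locked_file, 'w') as handle:
    handle.write('read-only payload')
os.chmod(locked_file, stat.S_IREAD)

force_rmtree(workspace)
```

On POSIX systems the handler is usually never called, because deleting a file depends on the permissions of its parent directory rather than of the file itself.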
diff --git a/tests/github/dockerization_test.py b/tests/github/dockerization_test.py
@@ -0,0 +1,108 @@
+# Copyright (C) 2020-2021 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+from argparse import ArgumentParser
+import socket
+from subprocess import check_call
+import shutil
+import os
+from pathlib import Path
+import tarfile
+import time
+from concurrent.futures import ProcessPoolExecutor
+
+from tests.github.utils import start_aggregator_container
+from tests.github.utils import start_collaborator_container
+from tests.github.utils import create_signed_cert_for_collaborator
+
+
+if __name__ == '__main__':
+    # 1. Create the workspace
+    parser = ArgumentParser()
+    workspace_choice = []
+    with os.scandir('openfl-workspace') as iterator:
+        for entry in iterator:
+            if entry.name not in ['__init__.py', 'workspace', 'default']:
+                workspace_choice.append(entry.name)
+    parser.add_argument('--template', default='keras_cnn_mnist', choices=workspace_choice)
+    parser.add_argument('--fed_workspace', default='fed_work12345alpha81671')
+    parser.add_argument('--col', default='one123dragons')
+    parser.add_argument('--data_path', default='1')
+    parser.add_argument('--base_image_tag', default='openfl')
+    args = parser.parse_args()
+    base_image_tag = args.base_image_tag
+    fed_workspace = args.fed_workspace
+    col = args.col
+
+    # If the aggregator container runs on another machine,
+    # provide that machine's FQDN instead
+    fqdn = socket.getfqdn()
+    # Build base image
+    check_call([
+        'docker', 'build', '-t', base_image_tag, '-f', 'openfl-docker/Dockerfile.base', '.'
+    ])
+
+    # Create FL workspace
+    shutil.rmtree(fed_workspace, ignore_errors=True)
+    check_call([
+        'fx', 'workspace', 'create', '--prefix', fed_workspace, '--template', args.template
+    ])
+    os.chdir(fed_workspace)
+    fed_directory = Path().resolve()  # Get the absolute directory path for the workspace
+
+    # Initialize FL plan
+    check_call(['fx', 'plan', 'initialize', '-a', fqdn])
+
+    # 2. Build the workspace image and save it to a tarball
+
+    # This command builds an image tagged $FED_WORKSPACE
+    # and saves it to ${FED_WORKSPACE}_image.tar
+
+    check_call(['fx', 'workspace', 'dockerize', '--base_image', base_image_tag])
+
+    # We remove the base OpenFL image as well
+    # as built workspace image to simulate starting
+    # on another machine
+    workspace_image_name = fed_workspace
+    check_call(['docker', 'image', 'rm', '-f', base_image_tag, workspace_image_name])
+
+    # 3. Generate certificates for the aggregator and the collaborator
+
+    # Create certificate authority for the workspace
+    check_call(['fx', 'workspace', 'certify'])
+
+    # Prepare a tarball with the collab's private key, the signed cert,
+    # and data.yaml for collaborator container
+    # This step can be repeated for each collaborator
+    create_signed_cert_for_collaborator(args.col, args.data_path)
+
+    # Also perform certificate generation for the aggregator.
+    # Create aggregator certificate
+    check_call(['fx', 'aggregator', 'generate-cert-request', '--fqdn', fqdn])
+    # Sign aggregator certificate
+    # Remove '--silent' if you run this manually
+    check_call(['fx', 'aggregator', 'certify', '--fqdn', fqdn, '--silent'])
+
+    # Pack all files that the aggregator needs to start training
+    aggregator_required_files = 'cert_agg.tar'
+    with tarfile.open(aggregator_required_files, 'w') as f:
+        for d in ['plan', 'cert', 'save']:
+            f.add(d)
+            shutil.rmtree(d)
+
+    # 4. Load the image
+    image_tar = f'{fed_workspace}_image.tar'
+    check_call(['docker', 'load', '--input', image_tar])
+    time.sleep(5)
+    with ProcessPoolExecutor(max_workers=2) as executor:
+        executor.submit(start_aggregator_container, args=(
+            workspace_image_name,
+            aggregator_required_files
+        ))
+        time.sleep(5)
+        executor.submit(start_collaborator_container, args=(
+            workspace_image_name,
+            col
+        ))
+    # If the containers start but the collaborator fails to
+    # connect to the aggregator, the pipeline will loop forever
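A note on the executor calls above: `Executor.submit(fn, *args, **kwargs)` forwards its arguments to `fn` unchanged, so `submit(fn, args=(a, b))` calls `fn(args=(a, b))`. Whether the helpers in `tests.github.utils` accept an `args` keyword is not visible in this excerpt. In addition, an exception raised in a worker surfaces only when `Future.result()` is called. The sketch below shows both points with `ThreadPoolExecutor`, swapped in for `ProcessPoolExecutor` to keep the example independent of pickling; `start_container` and `run_all` are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def start_container(image, role):
    """Hypothetical stand-in for a helper that launches a container."""
    if not image:
        raise ValueError(f'no image given for {role}')
    return f'{role} started from {image}'


def run_all(jobs):
    """Submit (image, role) pairs and return their results in order.

    Re-raises the first exception raised inside a worker.
    """
    with ThreadPoolExecutor(max_workers=2) as executor:
        # Positional arguments after the callable are passed straight through.
        futures = [executor.submit(start_container, image, role) for image, role in jobs]
    # result() re-raises any exception that occurred in the worker.
    return [future.result() for future in futures]


if __name__ == '__main__':
    print(run_all([('workspace_image', 'aggregator'), ('workspace_image', 'collaborator')]))
```

Collecting the futures and calling `result()` turns a silent worker failure into a failing test run.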