Misc DFP Documentation & fixes #368

Merged · 91 commits · Sep 29, 2022

Changes from 78 commits

Commits
2778f55
Fix typing hint
dagardner-nv Sep 19, 2022
b45832d
document DataFrameInputSchema
dagardner-nv Sep 19, 2022
9e2da5e
wip
dagardner-nv Sep 19, 2022
bb82168
Rename variables and update comments to be generalized
dagardner-nv Sep 19, 2022
5814bd4
Override the H4 style for the DFP guide preserving the camel casing o…
dagardner-nv Sep 19, 2022
5b6455a
Revert "Override the H4 style for the DFP guide preserving the camel …
dagardner-nv Sep 19, 2022
285cbdc
Fill in missing constructor arguments for DFPTraining and update doc …
dagardner-nv Sep 19, 2022
63e0376
Fill in sections from documentation in jupyter notebooks
dagardner-nv Sep 19, 2022
808970d
wip
dagardner-nv Sep 19, 2022
363b5c5
Script was renamed
dagardner-nv Sep 20, 2022
03ac69a
only_new_batches was removed from the stage
dagardner-nv Sep 20, 2022
645431e
only_new_batches was removed from the stage
dagardner-nv Sep 20, 2022
3af40e5
wip
dagardner-nv Sep 20, 2022
fe6a5d3
Rename morpheus_training service to morpheus_pipeline since it runs b…
dagardner-nv Sep 20, 2022
4ac3656
wip
dagardner-nv Sep 20, 2022
735382a
Explain --train_users=none
dagardner-nv Sep 20, 2022
809facb
Fix spelling of MLflow
dagardner-nv Sep 20, 2022
71f550c
file://mlruns is no longer the default
dagardner-nv Sep 20, 2022
9b17645
Document cli flags for dfp scripts
dagardner-nv Sep 20, 2022
0b8848a
wip
dagardner-nv Sep 21, 2022
b19ce47
wip
dagardner-nv Sep 21, 2022
f178dce
Add a Defining a New Data Source section
dagardner-nv Sep 21, 2022
2a9c487
wip
dagardner-nv Sep 21, 2022
c52dea3
Restrict mlflow versions prior to 1.29.0 per Pete
dagardner-nv Sep 21, 2022
92192c6
Fix casing of header names, move schema section
dagardner-nv Sep 21, 2022
8c4572c
Fix spelling mistakes
dagardner-nv Sep 21, 2022
b4dc986
Troubleshooting ci
dagardner-nv Sep 21, 2022
332b4d6
Fix spelling mistakes
dagardner-nv Sep 21, 2022
786ef95
Spelling fixes
dagardner-nv Sep 21, 2022
241c258
Use a better url for duo
dagardner-nv Sep 21, 2022
2bd5724
Work-around for 404 error
dagardner-nv Sep 20, 2022
a4b2739
wip
dagardner-nv Sep 21, 2022
fd9da32
Revert "Troubleshooting ci"
dagardner-nv Sep 21, 2022
8f03d1c
Use 'docker-compose' rather than 'docker compose', restructure the sy…
dagardner-nv Sep 21, 2022
0efd559
Applying recomendations from @efajardo-nv
dagardner-nv Sep 21, 2022
95d23c4
Add Azure & Duo source stages to cli registry, update readme with fix…
dagardner-nv Sep 22, 2022
6f02e08
Ensure the build is using the same docker tag as the morpheus release…
dagardner-nv Sep 22, 2022
884ac8b
Merge branch 'branch-22.09' into david_docs_345_p2
dagardner-nv Sep 22, 2022
b7bb8f9
MORPHEUS_CONTAINER_VERSION needs to be exported
dagardner-nv Sep 22, 2022
e0e8c3e
Formatting suggestions from @efajardo-nv
dagardner-nv Sep 22, 2022
440b215
Remove only_new_batches as it no longer exists
dagardner-nv Sep 22, 2022
2f88bb2
Set runtime
dagardner-nv Sep 22, 2022
69b5982
Update file pattern for duo files
dagardner-nv Sep 22, 2022
9edaaff
Revert "Update file pattern for duo files"
dagardner-nv Sep 22, 2022
1bff5f3
Fix handling of enum command line flags
dagardner-nv Sep 22, 2022
edadd4f
fixed typos in digital_fingerpringing/production readme
mpenn Sep 23, 2022
7ff2bb5
updated digital_fingerprinting/production Dockerfile to include Jupyt…
mpenn Sep 23, 2022
7d329ab
updated docker-compose.yml. changed runtime: nvidia to the device str…
mpenn Sep 26, 2022
68e7a0c
Helm chart info for prod DFP
pdmack Sep 26, 2022
dd9ced6
updated README with localhost url to mlflow ui and updated broken def…
mpenn Sep 26, 2022
d893f0e
actually updated the readme this time with local host url for mlflow
mpenn Sep 26, 2022
a6bc016
Merge branch 'branch-22.09' into david_docs_345_p2
dagardner-nv Sep 27, 2022
d67e0e2
Remove filecache arg as it raises an error when used with s3
dagardner-nv Sep 27, 2022
fca142a
Merge pull request #1 from pdmack/pdmack_dfp-helm
dagardner-nv Sep 27, 2022
015697a
Start & end date wip
dagardner-nv Sep 27, 2022
c0b8e18
Merge branch 'david_docs_345_p2' of github.com:dagardner-nv/Morpheus …
dagardner-nv Sep 27, 2022
ef1496d
Merge branch 'david_docs_345_p2' into david_docs_345_p2
dagardner-nv Sep 27, 2022
b23fa64
Merge pull request #2 from mpenn/david_docs_345_p2
dagardner-nv Sep 27, 2022
47bdd65
Merge branch 'david_docs_345_p2' of github.com:dagardner-nv/Morpheus …
dagardner-nv Sep 27, 2022
767430e
Apply time window filtering if defined.
dagardner-nv Sep 27, 2022
20fb75d
Ensure we always have a tz aware date window
dagardner-nv Sep 27, 2022
3ecc9e2
Handle case where no file match the date window
dagardner-nv Sep 27, 2022
57e7ec8
Add start_time flag to azure pipeline
dagardner-nv Sep 27, 2022
a2b2697
Update help string for duration
dagardner-nv Sep 27, 2022
15831b5
Update docs for date filtering
dagardner-nv Sep 27, 2022
242742a
Merge branch 'branch-22.09' into david_docs_345_p2
dagardner-nv Sep 27, 2022
18d6148
Merge branch 'branch-22.09' into david_docs_345_p2
dagardner-nv Sep 28, 2022
7d9c8d7
Merge branch 'branch-22.09' into david_docs_345_p2
dagardner-nv Sep 28, 2022
b87f4a6
Merge branch 'david_docs_345_p2' of github.com:dagardner-nv/Morpheus …
dagardner-nv Sep 28, 2022
58f4206
Enable verbose output from pytest
dagardner-nv Sep 28, 2022
08983e3
Show std out during tests
dagardner-nv Sep 28, 2022
8ef2824
starter dfp readme updates
efajardo-nv Sep 29, 2022
99e512c
Merge pull request #3 from efajardo-nv/starter-dfp-readme-updates
dagardner-nv Sep 29, 2022
e63f2a5
Pin numba version
dagardner-nv Sep 29, 2022
32f4d5e
Merge branch 'david_docs_345_p2' of github.com:dagardner-nv/Morpheus …
dagardner-nv Sep 29, 2022
014fa61
Revert "Show std out during tests"
dagardner-nv Sep 29, 2022
db6cc38
Revert "Enable verbose output from pytest"
dagardner-nv Sep 29, 2022
4e2d35f
Remove redundant call
dagardner-nv Sep 29, 2022
93b34c5
Update diagrams
dagardner-nv Sep 29, 2022
5f35de3
Update diagrams
dagardner-nv Sep 29, 2022
dca8850
Update regect for both colon and underscore time separators
dagardner-nv Sep 29, 2022
ecc55bd
Document the --start_time flag
dagardner-nv Sep 29, 2022
63d924d
Formatting fix
dagardner-nv Sep 29, 2022
a9d1a84
Merge branch 'branch-22.09' into david_docs_345_p2
dagardner-nv Sep 29, 2022
c6368cb
Use ver 2022.8.2 of s3fs per Eli
dagardner-nv Sep 29, 2022
a95cc81
Add instructions for downloading and running against the example data…
dagardner-nv Sep 29, 2022
8d11605
Fix indenting of code examples
dagardner-nv Sep 29, 2022
c8fc64e
Update paths in notebooks
dagardner-nv Sep 29, 2022
511bcc6
Merge branch 'branch-22.09' into david_docs_345_p2
dagardner-nv Sep 29, 2022
4c10784
Merge branch 'branch-22.09' into david_docs_345_p2
dagardner-nv Sep 29, 2022
4f674a8
Merge branch 'david_docs_345_p2' of github.com:dagardner-nv/Morpheus …
dagardner-nv Sep 29, 2022
1 change: 1 addition & 0 deletions docker/conda/environments/cuda11.5_dev.yml
@@ -61,6 +61,7 @@ dependencies:
- networkx=2.8
- ninja=1.10
- nodejs=17.4.0
- numba==0.55
- numpydoc=1.4
- pandas=1.3
- pip
16 changes: 13 additions & 3 deletions docs/source/_static/omni-style.css
@@ -137,6 +137,16 @@ h4
text-transform: uppercase;
}

h3 code
{
text-transform: none;
}

h4 code
{
text-transform: none;
}

/* Paragraph Formatting */

p
@@ -218,7 +228,7 @@ html.writer-html5 .rst-content table.docutils th>p

/* cell text */
html.writer-html5 .rst-content table.docutils td>p,
html.writer-html5 .rst-content table.docutils th>p
html.writer-html5 .rst-content table.docutils th>p
{
font-size: var(--body-font-size);
line-height: var(--body-line-height);
@@ -230,7 +240,7 @@ html.writer-html5 .rst-content table.docutils th>p
.rst-content table.field-list td p:first-child,
.wy-table th p:first-child,
.rst-content table.docutils th p:first-child,
.rst-content table.field-list th p:first-child
.rst-content table.field-list th p:first-child
{
margin-top: 0px;
}
@@ -241,7 +251,7 @@ html.writer-html5 .rst-content table.docutils th>p
.rst-content table.field-list td p:last-child,
.wy-table th p:last-child,
.rst-content table.docutils th p:last-child,
.rst-content table.field-list th p:last-child
.rst-content table.field-list th p:last-child
{
margin-bottom: 0px;
}
410 changes: 333 additions & 77 deletions docs/source/developer_guide/guides/5_digital_fingerprinting.md

Large diffs are not rendered by default.

6 changes: 4 additions & 2 deletions examples/digital_fingerprinting/production/Dockerfile
@@ -58,8 +58,10 @@ FROM base as jupyter
RUN source activate morpheus \
&& mamba install -y -c conda-forge \
ipywidgets \
jupyterlab \
nb_conda_kernels
nb_conda_kernels \
&& pip install jupyter_contrib_nbextensions==0.5.1 \
&& jupyter contrib nbextension install --user \
&& pip install jupyterlab_nvdashboard==0.7.0

# Launch jupyter
CMD ["jupyter-lab", "--ip=0.0.0.0", "--no-browser", "--allow-root"]
127 changes: 122 additions & 5 deletions examples/digital_fingerprinting/production/README.md
@@ -1,17 +1,134 @@
# "Production" Digital Fingerprinting Pipeline

### Build the Morpheus container
This example is designed to show what a full-scale, production-ready DFP deployment in Morpheus would look like. It contains all of the necessary components (such as a model store) to allow multiple Morpheus pipelines to communicate at a scale that can handle the workload of an entire company.

This is necessary to get the latest changes needed for DFP
Key Differences:
* Multiple pipelines are specialized to perform either training or inference
* Requires setting up a model store to allow the training and inference pipelines to communicate
* Organized into a docker-compose deployment for easy startup
* Contains a Jupyter notebook service to ease development and debugging
* Can be deployed to Kubernetes using provided Helm charts
* Uses many customized stages to maximize performance.

## Build the Morpheus container
This is necessary to get the latest changes needed for DFP. From the root of the Morpheus repo:
```bash
./docker/build_container_release.sh
```

### Running locally via `docker-compose`

## Building and Running via `docker-compose`
### Build
```bash
cd examples/digital_fingerprinting/production
export MORPHEUS_CONTAINER_VERSION="$(git describe --tags --abbrev=0)-runtime"
docker-compose build
```

docker-compose up
### Running the services
#### Jupyter Server
From the `examples/digital_fingerprinting/production` directory, run:
```bash
docker-compose up jupyter
```

Once the build is complete and the service has started, you will be presented with a message that should look something like this:
```
jupyter | To access the server, open this file in a browser:
jupyter | file:///root/.local/share/jupyter/runtime/jpserver-7-open.html
jupyter | Or copy and paste one of these URLs:
jupyter | http://localhost:8888/lab?token=<token>
jupyter | or http://127.0.0.1:8888/lab?token=<token>
```

Copy and paste the URL into a web browser. There are four notebooks included with the DFP example:
* dfp_azure_training.ipynb - Training pipeline for Azure Active Directory data
* dfp_azure_inference.ipynb - Inference pipeline for Azure Active Directory data
* dfp_duo_training.ipynb - Training pipeline for Duo Authentication
* dfp_duo_inference.ipynb - Inference pipeline for Duo Authentication

> **Note:** The token in the URL is a one-time-use token; a new one is generated with each invocation.

#### Morpheus Pipeline
By default, the `morpheus_pipeline` service runs the training pipeline for Duo data. From the `examples/digital_fingerprinting/production` directory, run:
```bash
docker-compose up morpheus_pipeline
```

If you wish to run a different pipeline instead, from the `examples/digital_fingerprinting/production` directory run:
```bash
docker-compose run morpheus_pipeline bash
```

From the prompt within the `morpheus_pipeline` container, you can run either the `dfp_azure_pipeline.py` or the `dfp_duo_pipeline.py` pipeline script:
```bash
python dfp_azure_pipeline.py --help
python dfp_duo_pipeline.py --help
```

Both scripts are capable of running either a training or an inference pipeline for their respective data sources. The command-line options for both are the same:
| Flag | Type | Description |
| ---- | ---- | ----------- |
| `--train_users` | One of: `all`, `generic`, `individual`, `none` | Indicates whether to train per-user models or a single generic model for all users. Selecting `none` runs the inference pipeline. |
| `--skip_user` | TEXT | User IDs to skip. Mutually exclusive with `only_user` |
| `--only_user` | TEXT | Only users specified by this option will be included. Mutually exclusive with `skip_user` |
| `--duration` | TEXT | The duration to run starting from now [default: 60d] |
| `--cache_dir` | TEXT | The location to cache data such as S3 downloads and pre-processed data [env var: `DFP_CACHE_DIR`; default: `./.cache/dfp`] |
| `--log_level` | One of: `CRITICAL`, `FATAL`, `ERROR`, `WARN`, `WARNING`, `INFO`, `DEBUG` | Specify the logging level to use. [default: `WARNING`] |
| `--sample_rate_s` | INTEGER | Minimum time step, in milliseconds, between object logs. [env var: `DFP_SAMPLE_RATE_S`; default: 0] |
| `-f`, `--input_file` | TEXT | List of files to process. Can specify multiple arguments for multiple files. Also accepts glob (*) wildcards and schema prefixes such as `s3://`. For example, to make a local cache of an s3 bucket, use `filecache::s3://mybucket/*`. See [fsspec documentation](https://filesystem-spec.readthedocs.io/en/latest/api.html?highlight=open_files#fsspec.open_files) for list of possible options. |
| `--tracking_uri` | TEXT | The MLflow tracking URI to connect to the tracking backend. [default: `http://localhost:5000`] |
| `--help` | | Show this message and exit. |
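
As a concrete illustration, a training run against locally cached Duo data might be invoked as follows; the input path and flag values here are illustrative, not defaults:
```bash
python dfp_duo_pipeline.py \
    --train_users generic \
    --duration 60d \
    --log_level INFO \
    --input_file "./data/duo_logs/*.json" \
    --tracking_uri http://localhost:5000
```

Running the same script with `--train_users none` would instead run the inference pipeline against the same inputs.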


#### Optional MLflow Service
Starting either the `morpheus_pipeline` or the `jupyter` service will start the `mlflow` service in the background. For debugging purposes, it can be helpful to view the logs of the running MLflow service.

From the `examples/digital_fingerprinting/production` directory, run:
```bash
docker-compose up mlflow
```

By default, an MLflow dashboard will be available at:
```bash
http://localhost:5000
```

## Kubernetes deployment

The Morpheus project also maintains Helm charts and container images for Kubernetes deployment of Morpheus and MLflow (both for serving and for the Triton plugin). These are located in the NVIDIA GPU Cloud (NGC) [public catalog](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/morpheus/collections/morpheus_).

### MLflow Helm chart

MLflow for this production digital fingerprinting use case can be installed from NGC using the same instructions as the [MLflow Triton Plugin from the Morpheus Quick Start Guide](../../../docs/source/morpheus_quickstart_guide.md#install-morpheus-mlflow-triton-plugin). The chart and image can be used for both the Triton plugin and the MLflow server.

### Production DFP Helm chart

The deployment of the [Morpheus SDK Client](../../../docs/source/morpheus_quickstart_guide.md#install-morpheus-sdk-client) is also done in _almost_ the same way as specified in the Quick Start Guide; however, for this production DFP use case the command arguments are specified differently.

#### Notebooks

```
helm install --set ngc.apiKey="$API_KEY",sdk.args="cd /workspace/examples/digital_fingerprinting/production/morpheus && jupyter-lab --ip='*' --no-browser --allow-root --ServerApp.allow_origin='*'" <sdk-release-name> morpheus-sdk-client/
```

Make note of the Jupyter token by examining the logs of the SDK pod:
```
kubectl logs sdk-cli-<sdk-release-name>
```

You should see something similar to this:

```
Or copy and paste one of these URLs:
http://localhost:8888/lab?token=d16c904468fdf666c5030e18fb82f840e531178bf716e575
or http://127.0.0.1:8888/lab?token=d16c904468fdf666c5030e18fb82f840e531178bf716e575
```

Open your browser to the reachable address and NodePort exposed by the pod (default value of 30888), and use the generated token to log in to the notebook server.
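
If the NodePort is not reachable from your workstation, one workaround (assuming you have `kubectl` access to the cluster; the pod name follows the same pattern as in the log command above) is to forward the notebook port locally:

```
kubectl port-forward pod/sdk-cli-<sdk-release-name> 8888:8888
```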

#### Unattended

```
helm install --set ngc.apiKey="$API_KEY",sdk.args="cd /workspace/examples/digital_fingerprinting/production/morpheus && ./launch.sh --train_users=generic --duration=1d" <sdk-release-name> morpheus-sdk-client/
```

16 changes: 14 additions & 2 deletions examples/digital_fingerprinting/production/docker-compose.yml
@@ -41,6 +41,12 @@ services:
target: jupyter
args:
- MORPHEUS_CONTAINER_VERSION=${MORPHEUS_CONTAINER_VERSION:-v22.09.00-runtime}
deploy:
resources:
reservations:
devices:
- driver: nvidia
capabilities: [gpu]
image: dfp_morpheus_jupyter
container_name: jupyter
ports:
@@ -58,7 +64,7 @@
cap_add:
- sys_nice

morpheus_training:
morpheus_pipeline:
# restart: always
build:
context: ./
@@ -67,7 +73,13 @@
args:
- MORPHEUS_CONTAINER_VERSION=${MORPHEUS_CONTAINER_VERSION:-v22.09.00-runtime}
image: dfp_morpheus
container_name: morpheus_training
container_name: morpheus_pipeline
deploy:
resources:
reservations:
devices:
- driver: nvidia
capabilities: [gpu]
networks:
- frontend
- backend
@@ -24,7 +24,7 @@ RUN apt update && \
rm -rf /var/cache/apt/* /var/lib/apt/lists/*

# Install python packages
RUN pip install mlflow boto3 pymysql pyyaml
RUN pip install "mlflow<1.29.0" boto3 pymysql pyyaml

# We run on port 5000
EXPOSE 5000
@@ -14,6 +14,8 @@

import logging
import typing
from collections import namedtuple
from datetime import datetime

import fsspec
import pandas as pd
@@ -26,15 +28,25 @@

logger = logging.getLogger("morpheus.{}".format(__name__))

TimestampFileObj = namedtuple("TimestampFileObj", ["timestamp", "file_object"])


class DFPFileBatcherStage(SinglePortStage):

def __init__(self, c: Config, date_conversion_func, period="D", sampling_rate_s=0):
def __init__(self,
c: Config,
date_conversion_func,
period="D",
sampling_rate_s=0,
start_time: datetime = None,
end_time: datetime = None):
super().__init__(c)

self._date_conversion_func = date_conversion_func
self._sampling_rate_s = sampling_rate_s
self._period = period
self._start_time = start_time
self._end_time = end_time

@property
def name(self) -> str:
@@ -48,48 +60,70 @@ def accepted_types(self) -> typing.Tuple:

def on_data(self, file_objects: fsspec.core.OpenFiles):

file_object_list = file_objects
# Determine the date of the file, and apply the window filter if we have one
ts_and_files = []
for file_object in file_objects:
ts = self._date_conversion_func(file_object)

# Exclude any files outside the time window
if ((self._start_time is not None and ts < self._start_time)
or (self._end_time is not None and ts > self._end_time)):
continue

ts_and_files.append(TimestampFileObj(ts, file_object))

# sort the incoming data by date
ts_and_files.sort()

# Create a dataframe with the incoming metadata
if ((len(file_object_list) > 1) and (self._sampling_rate_s > 0)):
if ((len(ts_and_files) > 1) and (self._sampling_rate_s > 0)):
file_sampled_list = []

file_object_list.sort(key=lambda file_object: self._date_conversion_func(file_object))
ts_last = ts_and_files[0].timestamp

ts_last = self._date_conversion_func(file_object_list[0])
file_sampled_list.append(ts_and_files[0])

file_sampled_list.append(file_object_list[0])

for idx in range(1, len(file_object_list)):
ts = self._date_conversion_func(file_object_list[idx])
for idx in range(1, len(ts_and_files)):
ts = ts_and_files[idx].timestamp

if ((ts - ts_last).seconds >= self._sampling_rate_s):

file_sampled_list.append(file_object_list[idx])
file_sampled_list.append(ts_and_files[idx])
ts_last = ts
else:
file_object_list = file_sampled_list
ts_and_files = file_sampled_list

df = pd.DataFrame()

df["dfp_timestamp"] = [self._date_conversion_func(file_object) for file_object in file_object_list]
df["key"] = [file_object.full_name for file_object in file_object_list]
df["objects"] = file_object_list

# Now split by the batching settings
df_period = df["dfp_timestamp"].dt.to_period(self._period)
timestamps = []
full_names = []
file_objs = []
for (ts, file_object) in ts_and_files:
timestamps.append(ts)
full_names.append(file_object.full_name)
file_objs.append(file_object)

period_gb = df.groupby(df_period)
df["dfp_timestamp"] = timestamps
df["key"] = full_names
df["objects"] = file_objs

output_batches = []

n_groups = len(period_gb)
for group in period_gb.groups:
period_df = period_gb.get_group(group)
if len(df) > 0:
# Now split by the batching settings
df_period = df["dfp_timestamp"].dt.to_period(self._period)

period_gb = df.groupby(df_period)

n_groups = len(period_gb)
for group in period_gb.groups:
period_df = period_gb.get_group(group)

obj_list = fsspec.core.OpenFiles(period_df["objects"].to_list(), mode=file_objects.mode, fs=file_objects.fs)
obj_list = fsspec.core.OpenFiles(period_df["objects"].to_list(),
mode=file_objects.mode,
fs=file_objects.fs)

output_batches.append((obj_list, n_groups))
output_batches.append((obj_list, n_groups))

return output_batches

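To illustrate the new window arguments on `DFPFileBatcherStage`, here is a minimal usage sketch. The timestamp-parsing helper and its regex are assumptions for illustration (per the commit history, file names may use either colon or underscore time separators, and the stage expects a timezone-aware date window):

```python
import re
from datetime import datetime, timezone

# Illustrative helper: pull a timestamp such as "2022-08-01T12:00:00" or
# "2022-08-01T12_00_00" out of a file name and return it as a tz-aware datetime.
TS_RE = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}[:_]\d{2}[:_]\d{2})")

def date_from_filename(file_object) -> datetime:
    match = TS_RE.search(file_object.full_name)
    ts = datetime.strptime(match.group(1).replace("_", ":"), "%Y-%m-%dT%H:%M:%S")
    return ts.replace(tzinfo=timezone.utc)

# Batch matched files by calendar day, keeping only files whose timestamps fall
# inside the window; on_data() skips anything outside it. `config` is assumed
# to be a morpheus.config.Config instance, as passed to any Morpheus stage.
stage = DFPFileBatcherStage(config,
                            date_conversion_func=date_from_filename,
                            period="D",
                            start_time=datetime(2022, 8, 1, tzinfo=timezone.utc),
                            end_time=datetime(2022, 8, 31, tzinfo=timezone.utc))
```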
@@ -242,7 +242,6 @@ def convert_to_dataframe(self, s3_object_batch: typing.Tuple[fsspec.core.OpenFil
return output_df
except Exception:
logger.exception("Error while converting S3 buckets to DF.")
self._get_or_create_dataframe_from_s3_batch(s3_object_batch)
raise

def _build_single(self, builder: srf.Builder, input_stream: StreamPair) -> StreamPair:
@@ -69,7 +69,7 @@ def supports_cpp_node(self):

def _generate_frames_fsspec(self):

files: fsspec.core.OpenFiles = fsspec.open_files(self._filenames, filecache={'cache_storage': './.cache/s3tmp'})
files: fsspec.core.OpenFiles = fsspec.open_files(self._filenames)

if (len(files) == 0):
raise RuntimeError(f"No files matched input strings: '{self._filenames}'. "
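For context on the change above, a short sketch of how `fsspec.open_files` behaves without the removed `filecache` argument (the glob path is illustrative):

```python
import fsspec

# Globs and scheme prefixes (e.g. "s3://") are resolved by fsspec directly; the
# local-cache behavior previously requested via the filecache argument can still
# be requested in the URL itself, e.g. "filecache::s3://mybucket/*".
files = fsspec.open_files("./data/duo_logs/*.json")
print(f"Matched {len(files)} files")
for file_object in files:
    print(file_object.full_name)
```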