This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Does every trial need to load data? #3294

Closed
liouxy opened this issue Jan 12, 2021 · 3 comments

@liouxy

liouxy commented Jan 12, 2021

import nni  # NNI SDK, assumed installed

# This runs at the start of every trial, so the data is reloaded each time.
train_data, val_data, X_test, y_test = load_data()
default_params = {'min_data_in_leaf': 0, 'min_sum_hessian_in_leaf': 100}
received_params = nni.get_next_parameter()  # hyperparameters for this trial
default_params.update(received_params)
run(train_data, val_data, default_params, X_test, y_test)

I found that every trial reloads the data, which is time-consuming.
Is this normal, or is there a way to avoid reading the data on every trial?
Thank you very much!

@sharpe5

sharpe5 commented Jan 12, 2021

This is by design.

However, there is an experimental switch that avoids this; see https://nni.readthedocs.io/en/stable/Tutorial/ExperimentConfig.html and the reuse option. Note that if you use the Assessor to early-stop trials, every early-killed process will still trigger a reload of the data.

To speed things up, cache everything in Apache Parquet (.parquet) format, which loads very quickly compared to formats such as .csv.

Most GPUs are limited to somewhere around 10 GB of RAM. Loading enough data to fill that should take about 5 to 10 seconds from a fast SSD, and if training takes 20 minutes, that is a small overhead relative to the whole run. As long as loading costs only a few percent of the training time, this is less of a disadvantage than it first seems.

@liouxy
Author

liouxy commented Jan 13, 2021

Thanks for your response!

@kvartet
Contributor

kvartet commented Jun 10, 2021

We are discussing this feature and will support it in the future. Thanks again for raising the issue.
