Merge pull request #355 from E3SM-Project/non-block-testing
Non block testing
TonyB9000 authored Jan 8, 2025
2 parents 082afc7 + 4a2a8df commit 10a5b8f
Showing 12 changed files with 372 additions and 26 deletions.
2 changes: 1 addition & 1 deletion tests/base.py
Original file line number Diff line number Diff line change
@@ -325,7 +325,7 @@ def add_files(self, use_hpss, zstash_path, keep=False, cache=None):
expected_present = ["Transferring file to HPSS"]
else:
expected_present = ["put: HPSS is unavailable"]
expected_present += ["INFO: Creating new tar archive"]
expected_present += ["Creating new tar archive"]
# Make sure none of the old files or directories are moved.
expected_absent = ["ERROR", "file0", "file_empty", "empty_dir"]
self.check_strings(cmd, output + err, expected_present, expected_absent)
103 changes: 103 additions & 0 deletions tests3/README_TEST_BLOCKING
@@ -0,0 +1,103 @@

This document outlines the procedures conducted to test the zstash blocking
and non-blocking behavior.

Note: As it was intended to test blocking with regard to archive tar-creations
vs Globus transfers, it was convenient to have both source and destination be
the same Globus endpoint. Effectively, we are employing Globus merely to move
tar archive files from one directory to another on the same file system.

The core intent in implementing zstash blocking is to address a potential
"low-disk" condition, where tar-files created to archive source files could
add substantially to the disk load. To avoid disk exhaustion, in "blocking"
mode ("--non-blocking" is absent from the command line) tar-file creation
pauses to wait for the previous tar-file's Globus transfer to complete, so
that the local copy can be deleted before the next tar-file is created.
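The blocking policy described above can be sketched with a toy simulation.
This is illustrative only; names such as `transfer` and `archive` are
hypothetical stand-ins, not actual zstash internals:

```python
# Toy model of blocking vs non-blocking tar/transfer overlap.
# All names here are hypothetical, not actual zstash APIs.
from concurrent.futures import ThreadPoolExecutor
import time

def transfer(tarball):
    time.sleep(0.01)          # stand-in for a Globus transfer
    return tarball

def archive(tarballs, non_blocking=False):
    done = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        pending = []
        for tb in tarballs:
            if not non_blocking and pending:
                # Blocking: wait for the previous transfer so the local
                # tar copy can be deleted before the next tar is created.
                done.append(pending.pop(0).result())
            pending.append(pool.submit(transfer, tb))
        for fut in pending:
            done.append(fut.result())
    return done
```

In blocking mode at most one local tar-file awaits transfer at any time;
with non_blocking=True, tar creation races ahead of the running transfers,
using more intermediate disk space.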

I. File System Setup
====================

As one may want, or need, to re-conduct testing under varied conditions, the
test script:

test_zstash_blocking.sh

will establish the following directory structure in the operator's current
working directory:

[CWD]/src_data/

- contains files to be tar-archived. One can experiment
with different sizes of files to trigger behaviors.

[CWD]/src_data/zstash/

- default location of tarfiles produced. This directory is
created automatically by zstash unless "--cache" indicates
an alternate location.

[CWD]/dst_data/

- destination for Globus transfer of archives.

[CWD]/tmp_cache/

- [Optional] alternative location for tar-file generation.

Note: It may be convenient to create a "hold" directory to store files of
various sizes that can be easily produced by running the supplied scripts.

gen_data.sh
gen_data_runner.sh

The files to be used for a given test must be moved or copied to the src_data
directory before a test is initiated.

Note: It never hurts to run the supplied script:

reset_test.sh

before a test run. This will delete any archives in the src_data/zstash
cache and the receiving dst_data directory, and delete the src_data/zstash
directory itself if it exists. This ensures a clean restart for testing.
The raw data files placed into src_data are not affected.

II. Running the Test Script
===========================

The test script "test_zstash_blocking.sh" accepts two positional parameters:

test_zstash_blocking.sh (BLOCKING|NON_BLOCKING) [NEW_CREDS]

On an initial run, or whenever Globus complains of authentication failures,
add "NEW_CREDS" as the second parameter. This deletes your cached Globus
credentials and triggers prompts for you to paste login URLs into your
browser (generally one per endpoint), conduct a login sequence, and then
paste the returned key-value at the bash command prompt. After both keys
are accepted, you can re-run the test script without "NEW_CREDS" until the
credentials expire (usually 24 hours).

If "BLOCKING" is selected, zstash will run in default mode, waiting for
each tar file to complete transfer before generating another tar file.

If "NON_BLOCKING" is selected, the zstash flag "--non-blocking" is supplied
to the zstash command line, and tar files continue to be created in parallel
to running Globus transfers.

It is suggested that you run the test script with

test_zstash_blocking.sh (BLOCKING|NON_BLOCKING) > your_logfile 2>&1

so that your command prompt returns and you can monitor progress with

snapshot.sh

which will provide a view of both the tarfile cache and the destination
directory for delivered tar files. It is also suggested that you name your
logfile to reflect the date, and whether BLOCKING was used.


FINAL NOTE: In the zstash code, the tar file "MINSIZE" parameter is
interpreted as an integer multiple of 1 GB. During testing, this had been
changed to a multiple of 100K for rapid testing. It may be useful to expose
this as a command-line parameter for debugging purposes.
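If such a debug parameter were added, a minimal sketch might look like the
following (the flag name "--size-unit" and the defaults are hypothetical,
not existing zstash options):

```python
# Hypothetical sketch: expose the bytes-per-maxsize-unit as a debug flag,
# so tests can use 100 KB units instead of a hard-coded 1 GB.
import argparse

parser = argparse.ArgumentParser(prog="zstash-create-debug")
parser.add_argument("--maxsize", type=int, default=256,
                    help="target tar size, in units of --size-unit")
parser.add_argument("--size-unit", type=int, default=1024 ** 3,
                    help="bytes per maxsize unit (lower this for fast tests)")

# Simulate the rapid-testing configuration described above.
args = parser.parse_args(["--maxsize", "1", "--size-unit", "100000"])
maxsize_bytes = args.maxsize * args.size_unit
```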
11 changes: 11 additions & 0 deletions tests3/gen_data.sh
@@ -0,0 +1,11 @@
#!/bin/bash

if [[ $# -lt 2 ]]; then
echo "Usage: gen_data.sh <bytes> <outputfile>"
exit 1
fi

len=$1
out=$2

head -c "$len" </dev/urandom >"$out"
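A Python equivalent of the script above, as a sketch (`os.urandom` stands in
for `/dev/urandom`; the filename is just an example):

```python
# Write nbytes of random data to path, mirroring gen_data.sh.
import os

def gen_data(nbytes: int, path: str) -> None:
    with open(path, "wb") as f:
        f.write(os.urandom(nbytes))

gen_data(1_000_000, "small_01_1M")   # a 1 MB test file
```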
8 changes: 8 additions & 0 deletions tests3/gen_data_runner.sh
@@ -0,0 +1,8 @@
#!/bin/bash

i=1

while [[ $i -lt 12 ]]; do
./gen_data.sh 1000000 small_$(printf "%02d" "$i")_1M
i=$((i+1))
done
5 changes: 5 additions & 0 deletions tests3/reset_test.sh
@@ -0,0 +1,5 @@
#!/bin/bash

rm -rf src_data/zstash/
rm -f dst_data/*
rm -f tmp_cache/*
14 changes: 14 additions & 0 deletions tests3/snapshot.sh
@@ -0,0 +1,14 @@
#!/bin/bash

echo "dst_data:"
ls -l dst_data

echo ""
echo "src_data/zstash:"
ls -l src_data/zstash

echo ""
echo "tmp_cache:"
ls -l tmp_cache


73 changes: 73 additions & 0 deletions tests3/test_zstash_blocking.sh
@@ -0,0 +1,73 @@
#!/bin/bash

if [[ $# -lt 1 ]]; then
echo "Usage: test_zstash_blocking.sh (BLOCKING|NON_BLOCKING) [NEW_CREDS]"
echo " One of \"BLOCKING\" or \"NON_BLOCKING\" must be supplied as the"
echo " first parameter."
echo " Add \"NEW_CREDS\" if Globus credentials have expired."
echo " This will cause Globus to prompt for new credentials."
exit 1
fi

NON_BLOCKING=1

if [[ $1 == "BLOCKING" ]]; then
NON_BLOCKING=0
elif [[ $1 == "NON_BLOCKING" ]]; then
NON_BLOCKING=1
else
echo "ERROR: Must supply \"BLOCKING\" or \"NON_BLOCKING\" as 1st argument."
exit 1
fi

# remove old auth data, if exists, so that globus will prompt us
# for new auth credentials in case they have expired:
if [[ $# -gt 1 ]]; then
if [[ $2 == "NEW_CREDS" ]]; then
rm -f ~/.globus-native-apps.cfg
fi
fi


base_dir=$(pwd)
base_dir=$(realpath "$base_dir")


# See if we are running the zstash we THINK we are:
echo "CALLING zstash version"
zstash version
echo ""

# Selectable Endpoint UUIDs
ACME1_GCSv5_UUID=6edb802e-2083-47f7-8f1c-20950841e46a
LCRC_IMPROV_DTN_UUID=15288284-7006-4041-ba1a-6b52501e49f1
NERSC_HPSS_UUID=9cd89cfd-6d04-11e5-ba46-22000b92c6ec

# 12 piControl ocean monthly files, 49 GB
SRC_DATA=$base_dir/src_data
DST_DATA=$base_dir/dst_data

SRC_UUID=$LCRC_IMPROV_DTN_UUID
DST_UUID=$LCRC_IMPROV_DTN_UUID

# Optional
TMP_CACHE=$base_dir/tmp_cache

mkdir -p $SRC_DATA $DST_DATA $TMP_CACHE

# Make maxsize 1 GB. This will create a new tar after every 1 GB of data.
# (Since individual files are 4 GB, we will get 1 tarfile per datafile.)

if [[ $NON_BLOCKING -eq 1 ]]; then
echo "TEST: NON_BLOCKING:"
zstash create -v --hpss=globus://$DST_UUID/$DST_DATA --maxsize 1 --non-blocking $SRC_DATA
else
echo "TEST: BLOCKING:"
zstash create -v --hpss=globus://$DST_UUID/$DST_DATA --maxsize 1 $SRC_DATA
# zstash create -v --hpss=globus://$DST_UUID --maxsize 1 --non-blocking --cache $TMP_CACHE $SRC_DATA
fi

echo "Testing Completed"

exit 0

18 changes: 13 additions & 5 deletions zstash/create.py
@@ -19,6 +19,7 @@
get_files_to_archive,
run_command,
tars_table_exists,
ts_utc,
)
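The `ts_utc` helper added to this import list comes from zstash's utils
module. A plausible minimal sketch follows; the exact format string is an
assumption, and the real zstash implementation may differ:

```python
# Hedged sketch of a ts_utc()-style helper: a UTC timestamp string
# suitable for prefixing log lines.
from datetime import datetime, timezone

def ts_utc() -> str:
    return datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S_%f")
```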


@@ -37,7 +38,7 @@ def create():
raise TypeError("Invalid config.hpss={}".format(config.hpss))

# Start doing actual work
logger.debug("Running zstash create")
logger.debug(f"{ts_utc()}: Running zstash create")
logger.debug("Local path : {}".format(path))
logger.debug("HPSS path : {}".format(hpss))
logger.debug("Max size : {}".format(config.maxsize))
@@ -54,11 +55,13 @@ def create():
if hpss != "none":
url = urlparse(hpss)
if url.scheme == "globus":
# identify globus endpoints
logger.debug(f"{ts_utc()}:Calling globus_activate(hpss)")
globus_activate(hpss)
else:
# config.hpss is not "none", so we need to
# create target HPSS directory
logger.debug("Creating target HPSS directory")
logger.debug(f"{ts_utc()}: Creating target HPSS directory {hpss}")
mkdir_command: str = "hsi -q mkdir -p {}".format(hpss)
mkdir_error_str: str = "Could not create HPSS directory: {}".format(hpss)
run_command(mkdir_command, mkdir_error_str)
@@ -71,7 +74,7 @@ def create():
run_command(ls_command, ls_error_str)

# Create cache directory
logger.debug("Creating local cache directory")
logger.debug(f"{ts_utc()}: Creating local cache directory")
os.chdir(path)
try:
os.makedirs(cache)
@@ -84,11 +87,14 @@ def create():
# TODO: Verify that cache is empty

# Create and set up the database
logger.debug(f"{ts_utc()}: Calling create_database()")
failures: List[str] = create_database(cache, args)

# Transfer to HPSS. Always keep a local copy.
logger.debug(f"{ts_utc()}: calling hpss_put() for {get_db_filename(cache)}")
hpss_put(hpss, get_db_filename(cache), cache, keep=True)

logger.debug(f"{ts_utc()}: calling globus_finalize()")
globus_finalize(non_blocking=args.non_blocking)

if len(failures) > 0:
@@ -145,7 +151,7 @@ def setup_create() -> Tuple[str, argparse.Namespace]:
optional.add_argument(
"--non-blocking",
action="store_true",
help="do not wait for each Globus transfer until it completes.",
help="do not wait for each Globus transfer to complete before creating additional archive files. This option will use more intermediate disk-space, but can increase throughput.",
)
optional.add_argument(
"-v", "--verbose", action="store_true", help="increase output verbosity"
@@ -185,7 +191,7 @@ def setup_create() -> Tuple[str, argparse.Namespace]:

def create_database(cache: str, args: argparse.Namespace) -> List[str]:
# Create new database
logger.debug("Creating index database")
logger.debug(f"{ts_utc()}:Creating index database")
if os.path.exists(get_db_filename(cache)):
# Remove old database
os.remove(get_db_filename(cache))
@@ -254,6 +260,7 @@ def create_database(cache: str, args: argparse.Namespace) -> List[str]:
args.keep,
args.follow_symlinks,
skip_tars_md5=args.no_tars_md5,
non_blocking=args.non_blocking,
)
except FileNotFoundError:
raise Exception("Archive creation failed due to broken symlink.")
@@ -268,6 +275,7 @@ def create_database(cache: str, args: argparse.Namespace) -> List[str]:
args.keep,
args.follow_symlinks,
skip_tars_md5=args.no_tars_md5,
non_blocking=args.non_blocking,
)

# Close database