Added a remote testing program #3889

dmitry-ganyushin · 2023-11-01T18:50:47Z

A testing program to read 1D, 3D arrays and slices from 3D arrays. In the current state...

eisenhauer · 2023-11-01T19:55:27Z

source/utils/adios_iotest/adios_remote_read.cpp

+            {
+                if (!variables_in.empty() && var_name.first != variables_in[0])
+                    continue;
+                adios2::Variable<double> var = io.InquireVariable<double>(var_name.first);


InquireVariable will return NULL if the variable exists but isn't of type double. You should probably check for this possibility rather than segfaulting on the next line.

Actually, why do AvailableVariables if you're only matching a single name and requiring it to be a double? Just call InquireVariable with the name you want. No reason to have this loop here at all.

Yes, the read functions should be generic, but they currently support only one type. The loop was added mainly for future development, to match multiple variables from the input list.

OK, but I don't think you'd ever actually do it that way with multiple variables. Those would pretty naturally come in as a list that you iterate over, calling InqVar on each name, which would be vastly easier than iterating through all the variables and then seeing if each one was listed. Alternatively, if you actually supported types you'd call VariableType first to see if the variable existed and then a templated InqVar to get the actual handle. So, I don't think you'd ever use this loop structure...

In the expression
adios2::Variable<double> var = io.InquireVariable<double>(var_name.first);
var cannot be nullptr or NULL. Do we have a test for that? Maybe things got broken.
The compiler does not accept if (var != nullptr).

No, because var is not a pointer here so you can't compare it to null. But the Variable class does have a boolean operator, so you can do if (var) instead of if (var != nullptr). That's probably the test that you want (and it does appear in many, many places in our tests).
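
The boolean-conversion check described above can be sketched without ADIOS2 itself. The `Handle` class, `Inquire`, and `ReadOr` below are hypothetical stand-ins written for illustration, not ADIOS2 types or functions; they only show why a value-type handle is tested with `if (h)` rather than compared to `nullptr`.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a value-type handle such as adios2::Variable<T>.
// It wraps a pointer and converts to bool, so callers test it with `if (h)`.
class Handle
{
public:
    Handle() = default;
    explicit Handle(const double *data) : m_Data(data) {}

    // True only when the handle refers to something.
    explicit operator bool() const noexcept { return m_Data != nullptr; }

    double Value() const { return *m_Data; }

private:
    const double *m_Data = nullptr;
};

// Hypothetical lookup: returns an empty handle when the name is unknown.
Handle Inquire(const std::map<std::string, double> &table, const std::string &name)
{
    auto it = table.find(name);
    return it == table.end() ? Handle() : Handle(&it->second);
}

// Reads a value only if the handle is valid; otherwise returns the fallback.
double ReadOr(const std::map<std::string, double> &table, const std::string &name,
              double fallback)
{
    Handle h = Inquire(table, name);
    if (!h) // the handle is an object, so `h != nullptr` would not compile
    {
        return fallback;
    }
    return h.Value();
}
```

In the reader itself, the equivalent guard would sit directly after the InquireVariable call, before any use of the variable.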

eisenhauer

The adios_iotest directory may not be a good place for this. adios_iotest is a complex tool; will this fit in there in some way? Or was this placement another WIP? I guess one way that it might fit in is if adios_iotest could be used to generate the data upon which the read test might rely. I know this has been used with data generated elsewhere, but to actually be useful as some kind of standard performance test that is deployed in a variety of situations (which was the goal), it needs data, even if it's dummy data.

anagainaru

Is this a test for the CI or just a tool for people to use? If it's for manual performance testing, should we use TAU for performance measurements in addition to the time you are plotting for the entire test?

anagainaru · 2023-11-14T15:01:06Z

source/utils/adios_iotest/adios_remote_read.cpp

+        adios2::Engine reader = io.Open(filename, adios2::Mode::Read);
+
+        std::string out;
+        for (size_t step = 0; step < NSTEPS; step++)


I don't know if we want to fix the number of steps since you could deduce that from the file. Or is this not true for remote access?

Suggested change

for (size_t step = 0; step < NSTEPS; step++)

for (; reader.BeginStep() == adios2::StepStatus::OK; ++step)

{

Would this work?

The number of steps should be an input parameter and, if it is missing, all steps in the BP files should be read.
Unfortunately the C++ function that retrieves the number of steps (like adios2_steps) is missing. We can change it like that and quit the loop on the NSTEPS counter.

Well, it isn't missing, but that still doesn't mean you should use it. Engine::Steps() can only return useful information for file engines because streaming engines don't know when the stream will end until the writer closes it. (Generally I'd be surprised to find anything missing from the C++ interface that is present in others because most things are implemented in terms of C++.)

If the number of steps written is lower than your input variable, I think this code would crash, so I would still use what is returned by BeginStep in the for loop and break if you have reached the desired number of steps.
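
The loop shape suggested in this thread (stop when BeginStep stops returning OK, or when the requested number of steps is reached) might look like the sketch below. `StepStatus` and `FakeReader` are hypothetical stand-ins that only mimic the BeginStep/EndStep call pattern, and treating `maxSteps == 0` as "read everything" is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for adios2::StepStatus, reduced to the two states used here.
enum class StepStatus
{
    OK,
    EndOfStream
};

// Hypothetical reader that serves a fixed number of steps, then reports
// EndOfStream. It only mimics the BeginStep/EndStep call pattern.
class FakeReader
{
public:
    explicit FakeReader(std::size_t available) : m_Available(available) {}

    StepStatus BeginStep()
    {
        if (m_Next >= m_Available)
        {
            return StepStatus::EndOfStream;
        }
        ++m_Next;
        return StepStatus::OK;
    }

    void EndStep() {}

private:
    std::size_t m_Available;
    std::size_t m_Next = 0;
};

// Reads at most maxSteps steps; maxSteps == 0 means "read until the end".
// Returns the number of steps actually processed.
std::size_t ReadSteps(FakeReader &reader, std::size_t maxSteps)
{
    std::size_t step = 0;
    // Check the cap first so BeginStep is not called for a step we will skip.
    while ((maxSteps == 0 || step < maxSteps) && reader.BeginStep() == StepStatus::OK)
    {
        // ... per-step Get calls would go here ...
        reader.EndStep();
        ++step;
    }
    return step;
}
```

With this shape, a file holding fewer steps than requested ends the loop cleanly instead of reading past the end.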

anagainaru · 2023-11-14T15:43:54Z

source/utils/adios_iotest/adios_remote_read.cpp

+                        startX = (1 - ratio) / 2.0 * globalSizeX;
+                        countX = globalSizeX * ratio;
+
+                        startY = (1 - ratio) / 2.0 * globalSizeY;
+                        countY = globalSizeY * ratio;
+
+                        startZ = (1 - ratio) / 2.0 * globalSizeZ;
+                        countZ = globalSizeZ * ratio;


I am a bit confused by what the ratio is (it might be good to add a comment below where you introduce all the parameters for the testing). I was assuming you want to control the amount of data you request, so for a ratio of 0.5 the count is half in each direction. The start seems to be 1/4 of the entire direction. Is this just a random way of having a subselection, or does it represent something?

The ratio parameter is specified as an input parameter. The initial requirement was to implement test cases from the paper https://dl.acm.org/doi/10.1145/1996130.1996139 figure 3.

Gotcha, this makes sense. I would use the phrasing of the paper to describe each case, for this case the paper says: "An arbitrary area on an orthogonal plane"
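
As a worked example of the start/count formulas in the hunk above: for a global size of 100 and a ratio of 0.5, the start is (1 - 0.5) / 2 * 100 = 25 and the count is 100 * 0.5 = 50, so the selection is centered on the axis. The helper name `CenteredRange` and the explicit casts are assumptions for this sketch; only the two formulas come from the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Returns {start, count} for a sub-range centered along one axis that
// covers `ratio` of the axis (0 < ratio <= 1). Fractional results are
// truncated toward zero by the casts.
std::pair<std::size_t, std::size_t> CenteredRange(std::size_t globalSize, double ratio)
{
    const double size = static_cast<double>(globalSize);
    const auto start = static_cast<std::size_t>((1.0 - ratio) / 2.0 * size);
    const auto count = static_cast<std::size_t>(size * ratio);
    return {start, count};
}
```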

anagainaru · 2023-11-14T18:10:44Z

source/utils/adios_remote_read/adios_remote_read.cpp

+                vars_intersection = available_variables_str;
+            }
+
+            for (auto const &var_name : vars_intersection)


Why not just parse over the required_variables?

If the --variables parameter is not specified, I think the program should try to read all available variables.
Additionally, input parameters are not always correct.

If input parameters are incorrect, the inquire variable function will return a null variable and you check for this so you would skip this variable anyway. So I would ask for available variables only if the provided variables is empty (I would also add a comment to be clear how the function should behave for any valid input). There is no point to do the intersection, it just makes the code harder to read.
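
The behavior proposed here (use the requested names when given, fall back to all available variables otherwise, and let the inquire step reject bad names) can be sketched as follows. `NamesToRead` is a hypothetical helper, and `VariableMap` is a simplified stand-in for the name-to-parameters map that AvailableVariables returns.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for the name -> parameters map returned by
// adios2::IO::AvailableVariables().
using VariableMap = std::map<std::string, std::map<std::string, std::string>>;

// Hypothetical helper: use the requested names if given, else all available.
std::vector<std::string> NamesToRead(const std::vector<std::string> &requested,
                                     const VariableMap &available)
{
    if (!requested.empty())
    {
        // Unknown names are left in; the caller skips them when the
        // inquire step yields an invalid handle.
        return requested;
    }
    std::vector<std::string> names;
    names.reserve(available.size());
    for (const auto &entry : available)
    {
        names.push_back(entry.first);
    }
    return names;
}
```

No intersection is computed, which keeps the selection logic to a single branch.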

anagainaru · 2023-11-14T18:14:11Z

source/utils/adios_remote_read/adios_remote_read.cpp

+                        startX = (1 - ratio) / 2.0 * globalSizeX;
+                        countX = globalSizeX * ratio;
+
+                        startY = (1 - ratio) / 2.0 * globalSizeY;
+                        countY = globalSizeY * ratio;
+
+                        startZ = (1 - ratio) / 2.0 * globalSizeZ;
+                        countZ = globalSizeZ * ratio;


What exactly is this ratio? (Maybe you should add a comment to explain it below where you define the variables.)
It seems you want to control the amount you are reading (as a ratio of the total data along that axis). My concern is with the start. For a ratio of 0.5 you are bringing half the data starting from position globalSize/4.
Is this just a random start point, or does it represent something?

Yes, the ratio parameter corresponds to the amount of data that should be read. It is probably the shortest description of the task. Specifying starts and counts for every MPI rank is not really convenient for the user.

Yes, now that I looked at the paper it makes sense. My confusion was with the start point, but it's just an arbitrary choice.

anagainaru · 2023-11-14T18:15:00Z

source/utils/adios_remote_read/adios_remote_read.cpp

+    }
+}
+
+void read3DPlane(int nproc, int rank, const std::string &filename, const int nsteps, adios2::IO &io,


You might want to add a comment for each of these functions so that it's clear what you are testing. In this case it seems you are bringing the XZ plane (with X data divided across ranks) and only a fixed ratio of the Z data (using the weird start equation as the starting point).
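
For readers unfamiliar with dividing one axis across ranks, here is a minimal sketch of a common block decomposition. The thread does not show how the PR splits X, so `RankBlock` and its remainder handling are assumptions, not the PR's code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Returns {start, count} of the block owned by `rank` when `globalSize`
// elements are split across `nproc` ranks. The first (globalSize % nproc)
// ranks get one extra element, so the blocks cover the axis without gaps.
std::pair<std::size_t, std::size_t> RankBlock(std::size_t globalSize, std::size_t nproc,
                                              std::size_t rank)
{
    const std::size_t base = globalSize / nproc;
    const std::size_t extra = globalSize % nproc;
    const std::size_t count = base + (rank < extra ? 1 : 0);
    const std::size_t start = rank * base + (rank < extra ? rank : extra);
    return {start, count};
}
```

For 10 elements on 3 ranks this gives blocks starting at 0, 4, and 7 with counts 4, 3, and 3.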

anagainaru · 2023-11-14T18:15:13Z

source/utils/adios_remote_read/adios_remote_read.cpp

+
+    option longopts[] = {{"help", no_argument, NULL, 'h'},
+                         {"case", required_argument, NULL, 'c'},
+                         {"filename", required_argument, NULL, 'f'},


Do you create the data as part of the test or where do you expect to get the input file?

It can be actually used for reading data generated by adios_iotest program or any other data file, say generated by the S3D program.

Ok, at some point we should add documentation on how to use this and examples of how you can generate the data for readthedocs

anagainaru · 2023-11-14T18:19:23Z

The adios_iotest directory may not be a good place for this. adios_iotest is a complex tool; will this fit in there in some way? Or was this placement another WIP? I guess one way that it might fit in is if adios_iotest could be used to generate the data upon which the read test might rely. I know this has been used with data generated elsewhere, but to actually be useful as some kind of standard performance test that is deployed in a variety of situations (which was the goal), it needs data, even if it's dummy data.

+1 to Greg's question. I also think we need data for this test.
It is not clear to me how and by whom this will be run.

dmitry-ganyushin · 2023-11-15T15:14:23Z

Is this a test for the CI or just a tool for people to use? If it's for manual performance testing, should we use TAU for performance measurements in addition to the time you are plotting for the entire test?

It can be used for automatic testing and hand testing. We never discussed that TAU should be used. I think it brings one more build dependency, which is probably not necessary for making measurements in 3 places.

dmitry-ganyushin · 2023-11-15T15:41:26Z

The adios_iotest directory may not be a good place for this. adios_iotest is a complex tool; will this fit in there in some way? Or was this placement another WIP? I guess one way that it might fit in is if adios_iotest could be used to generate the data upon which the read test might rely. I know this has been used with data generated elsewhere, but to actually be useful as some kind of standard performance test that is deployed in a variety of situations (which was the goal), it needs data, even if it's dummy data.

+1 to Greg's question. I also think we need data for this test. It is not clear to me how and by whom this will be run.

I moved it to a separate adios_remote_read directory.

anagainaru · 2023-11-15T22:20:07Z

Is this a test for the CI or just a tool for people to use? If it's for manual performance testing, should we use TAU for performance measurements in addition to the time you are plotting for the entire test?

It can be used for automatic testing and hand testing. We never discussed that TAU should be used. I think it brings one more build dependency, which is probably not necessary for making measurements in 3 places.

TAU already exists in adios2, so it's not an extra dependency. We don't have to use it, but since this is a benchmarking tool, it would be nice to have it. We can maybe add it to potential future work.

dmitry-ganyushin · 2023-12-07T16:48:30Z

Can this be merged now?

eisenhauer

So, my comments:

In this form, it's not really adios_REMOTE_read. It's just a generic reading program (albeit one that only works if you have written type double variables).
"Ratio" is applied oddly. If ratio is .5 (implying you want to read half the data), you'll apply the .5 to each dimension. This means that for a 3D array you read 1/8 of the data.
I'm not at all clear on how ratio and direction interact, particularly in the nprocs>1 case. It looks like the countD values (where D is a direction) calculated as the result of applying the ratio are overwritten for that direction (at least for the 3D case) and instead each processor is just assigned a fraction of the data (1/nproc) in that direction.
There's no checking to see if the dimensionality requested actually applies to the variable.
I spent an extraordinary amount of time actually trying to get this to do something. If you don't specify --filename it silently doesn't do anything. If you specify a file that doesn't exist, it silently doesn't do anything. If you specify a variable that doesn't exist in the file, it silently doesn't do anything. If you specify a variable that isn't of type double, it silently doesn't do anything. If you specify a case it doesn't recognize, it silently doesn't do anything. If you specify a 3D read on a 1D variable, it gives you output, but it's not clear what it did, if anything. It wasn't immediately obvious to me when I first ran it that it would never produce timing output to the terminal, but would instead write it to a file whose name is based upon the case, but which you can't otherwise control.
That timing is essentially done only on reading all specified variables on an entire file seems like a missed opportunity. One could measure open time, how long individual reads took, impact of sync vs. defer, etc.
I'll reiterate an earlier comment. I'm still not at all clear what role this program would play in a generalized test suite that might exercise remote read functionality. (To be specific, I know that it could do most anything, but it is maybe one component of a test suite that might include data generation, a performance test plan, etc. Whether or not this is useful as a component, is missing features, etc. is hard to say without any idea of a bigger picture.)
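
The arithmetic behind the "1/8 of the data" remark: applying a per-axis ratio r to each of n axes selects r^n of the array, so r = 0.5 on a 3D array selects 0.5^3 = 0.125. Conversely, selecting half of a 3D array would need a per-axis ratio of 0.5^(1/3), roughly 0.794. The two helper names below are hypothetical and exist only to make that relation checkable.

```cpp
#include <cassert>
#include <cmath>

// Fraction of the whole array selected when the same per-axis ratio is
// applied to every one of `ndims` axes.
double FractionSelected(double perAxisRatio, int ndims)
{
    return std::pow(perAxisRatio, ndims);
}

// Per-axis ratio needed so that the selection covers `fraction` of the array.
double PerAxisRatio(double fraction, int ndims)
{
    return std::pow(fraction, 1.0 / ndims);
}
```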

Added a remote testing program

5a5c004

dmitry-ganyushin requested review from eisenhauer, pnorbert and anagainaru November 1, 2023 18:50

dmitry-ganyushin requested a review from vicentebolea as a code owner November 1, 2023 18:50

eisenhauer reviewed Nov 1, 2023

vicentebolea removed their request for review November 2, 2023 01:51

eisenhauer requested changes Nov 2, 2023

Issues addressed

f3bfe47

anagainaru reviewed Nov 14, 2023

Revision & Clean-up

34241e4

Added a remote testing program

f747b6f

Revison issues addressed Revision & Clean-up

dmitry-ganyushin force-pushed the remote_read_test branch from 34241e4 to f747b6f Compare December 3, 2023 03:20

dmitry-ganyushin added 13 commits December 4, 2023 17:53

Merge branch 'master' into remote_read_test

3642d8d

Merge remote-tracking branch 'origin/remote_read_test' into remote_re…

3993ae4

…ad_test * origin/remote_read_test:

fixed cmake file

b7dcc6f

Merge remote-tracking branch 'origin/remote_read_test' into remote_re…

d1b15ac

…ad_test

fixed cmake file

0ab8042

fixed gcc warning

26d94d1

fixed windows compiler problems

be7fd60

compiler warnings

1f846d7

formatting

c12e0d0

fixed windows warnings

67069d1

fixed windows warnings

19d3d97

windows warnings

be82911

dmitry-ganyushin added 5 commits December 6, 2023 14:32

windows warnings

885bcf8

formatting

6796b3c

types for windows

fc53cbc

fixed windows warnings

55f3966

added static cast for conversion

5883623

pnorbert approved these changes Dec 11, 2023

pnorbert requested a review from eisenhauer December 11, 2023 16:04

eisenhauer reviewed Dec 11, 2023

dmitry-ganyushin closed this May 29, 2024
