-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added a remote testing program #3889
Added a remote testing program #3889
Conversation
{ | ||
if (!variables_in.empty() && var_name.first != variables_in[0]) | ||
continue; | ||
adios2::Variable<double> var = io.InquireVariable<double>(var_name.first); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
InquireVariable will return NULL if the variable exists but isn't of type double. You should probably check for this possibility rather than segfaulting on the next line.
Actually, why do AvailableVariables if you're only matching a single name and requiring it to be a double? Just call InquireVariable with the name you want. No reason to have this loop here at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, read functions should be generic, but currently supports only one type. The loop was made mainly for future development, for matching multiple variables from the input list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, but I don't think you'd ever actually do it that way with multiple variables. Those would pretty naturally come in as a list that you iterated over and do InqVar on, which would be vastly easier than iterating through all the variables and then seeing if each one was listed. Alternatively, if you actually supported types you'd call VariableType first to see if it existed and then a templated InqVar to get the actual handle. So, I don't think you'd ever use this loop structure...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the expression
adios2::Variable<double> var = io.InquireVariable<double>(var_name.first);
var
cannot return nullptr or NULL. Do we have a test for that? Maybe things got broken.
The compiler does not understand if (var != nullptr)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because var is not a pointer here so you can't compare it to null. But the Variable class does have a boolean operator, so you can do if (var)
instead of if (var != nullptr)
. That's probably the test that you want (and it does appear in many, many places in our tests).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the adios_iotest directory may not a good place for this. adios_iotest is a complex tool, will this fit in there in some way? Or was this placement another WIP? I guess one way that it might fit in was if adios_iotest could be used to generate the data upon which the read test might rely. I know this has been used with data generated elsewhere, but to actually be useful as some kind of standard performance test that is deployed in a variety of situations (which was the goal), then it needs data, even if its dummy data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a testing for the CI or just a tool for people to use? If it's for manual performance testing, should we use TAU for performance measurements in addition to the time you are plotting for the entire test?
adios2::Engine reader = io.Open(filename, adios2::Mode::Read); | ||
|
||
std::string out; | ||
for (size_t step = 0; step < NSTEPS; step++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if we want to fix the number of steps since you could deduce that from the file. Or is this not true for remote access?
for (size_t step = 0; step < NSTEPS; step++) | |
for (; reader.BeginStep() == adios2::StepStatus::OK; ++step) | |
{ |
Would this work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The number of steps should be an input parameter, and, if missing, all steps in the BP files should be read.
Unfortunately the C++ function which retrieves the number of steps (like adios2_steps) is missing. We can change it like that and quite the loop on NSTEPS counter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it isn't missing, but that still doesn't mean you should use it. Engine::Steps() can only return useful information for file engines because streaming engines don't know when the stream will end until the writer closes it. (Generally I'd be surprised to find anything missing from the C++ interface that is present in others because most things are implemented in terms of C++.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the number of steps written is lower than your input variable I think this code would crash so I would still use what is returned by BeginStep in the for and break is you reached the desired number of steps.
startX = (1 - ratio) / 2.0 * globalSizeX; | ||
countX = globalSizeX * ratio; | ||
|
||
startY = (1 - ratio) / 2.0 * globalSizeY; | ||
countY = globalSizeY * ratio; | ||
|
||
startZ = (1 - ratio) / 2.0 * globalSizeZ; | ||
countZ = globalSizeZ * ratio; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit confused by what the ratio is (it might be good to add a comment below where you introduce all the parameters for the testing). I was assuming you want to control the amount of data you request so for a ratio of 0.5 the count is half in each direction. The start seems to be 1/4 of the entire direction. Is this just a random way of having a subselection or it represents something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ratio parameter is specified as an input parameter. The initial requirement was to implement test cases from the paper https://dl.acm.org/doi/10.1145/1996130.1996139 figure 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, this makes sense. I would use the phrasing of the paper to describe each case, for this case the paper says: "An arbitrary area on an orthogonal plane"
vars_intersection = available_variables_str; | ||
} | ||
|
||
for (auto const &var_name : vars_intersection) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just parse over the required_variables
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case that --variables parameter is not specified, It think the program should try to read all available variables.
Additionally, input parameters not always correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If input parameters are incorrect, the inquire variable function will return a null variable and you check for this so you would skip this variable anyway. So I would ask for available variables only if the provided variables is empty (I would also add a comment to be clear how the function should behave for any valid input). There is no point to do the intersection, it just makes the code harder to read.
startX = (1 - ratio) / 2.0 * globalSizeX; | ||
countX = globalSizeX * ratio; | ||
|
||
startY = (1 - ratio) / 2.0 * globalSizeY; | ||
countY = globalSizeY * ratio; | ||
|
||
startZ = (1 - ratio) / 2.0 * globalSizeZ; | ||
countZ = globalSizeZ * ratio; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exactly is this ratio
? (maybe you should add a comment to explain below where you define the variables)
It seems you want to control the amount you are reading (ratio of the total data on that axes). My concern is with the start. For ratio of 0.5 you are bringing half the data starting from position globalSize
/4
Is this just a random start point or it represents something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the ratio parameter corresponds to the amount of data that should be read. Probably it is the most short description of the task. Specifying starts and counts for every MPI rank is not really convenient for the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, now that I looked at the paper it makes sense. My confusion was with the start point, but it's just an arbitrary choice.
} | ||
} | ||
|
||
void read3DPlane(int nproc, int rank, const std::string &filename, const int nsteps, adios2::IO &io, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to add a comment for each of these functions so that it's clear what you are testing. In this case it seems you are bringing the XZ plane (with X data divided across ranks) and only a fixed ration of the Z data (using the weird start equation as starting point)
|
||
option longopts[] = {{"help", no_argument, NULL, 'h'}, | ||
{"case", required_argument, NULL, 'c'}, | ||
{"filename", required_argument, NULL, 'f'}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you create the data as part of the test or where do you expect to get the input file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be actually used for reading data generated by adios_iotest program or any other data file, say generated by the S3D program.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, at some point we should add documentation on how to use this and examples of how you can generate the data for readthedocs
+1 to Greg's question. I also think we need data for this test. |
It can be used for automatic testing and hand testing. We never discussed that TAU should be used. I think it brings one more build dependency but probably not necessary for making measurements in 3 places. |
I moved it to separated adios_remote_read directory |
TAU already exists in adios2 so it's not an extra dependency. We don't have to use it but since this is a benchmarking tool, it would be nice having it. We can maybe add it to potential future work. |
Revison issues addressed Revision & Clean-up
34241e4
to
f747b6f
Compare
* master: Enable Shell-Check for gh-actions scripts Enable Shell-Check for circle CI scripts Enable Shell-Check for tau contract scripts Enable Shell-Check for scorpio contract scripts Enable Shell-Check for lammps contract scripts Delete VTK code in examples
…ad_test * origin/remote_read_test:
Can this be merged now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, my comments:
- In this form, it's not really adios_REMOTE_read. it's just a generic reading program (albeit one that only works if you have written type double variables).
- "Ratio" is applied oddly. If ratio is .5 (implying you want to read half the data), you'll apply the .5 to each dimension. This means that for a 3D array you read 1/8 of the data.
- I'm not at all clear on how ratio and direction interact, particularly in the nprocs>1 case. It looks like the countD (where D is a direction) values calculated as the result of applying the ratio are overwritten for that Direction (at least for the 3d Case) and instead each processor is just assigned a fraction of the data (1/nproc) in that direction.
- there's no checking to see if the dimensionality requested actually applies to the variable.
- I spent an extraordinary amount of time actually trying to get this to do something. If you don't specify --filename it silently doesn't do anything. If you specify a file that doesn't exist, it silently doesn't do anything. If you specify a variable that doesn't exist in the file, it silently doesn't do anything. If you specify a variable that isn't of type double, it silently doesn't do anything. If you specify a case it doesn't recognize, it silently doesn't do anything. If you specify to do a 3D read on a 1D variable, it gives you output, but it's not clear what it did, if anything. It wasn't immediately obvious to me when I first ran it, that it would never produce timing output to the terminal, but would instead write it to a file whose name is based upon the case, but which you can't otherwise control.
- That timing is essentially done only on reading all specified variables on an entire file seems like a missed opportunty. One could measure open time, how long individual reads took, impact of sync vs. defer, etc.
- I'll reiterate an earlier comment. I'm still not at all clear what role this program would play in a generalized test suite that might exercise remote read functionality. (To be specific, I know that it could do most anything, but it is maybe one component of a test suite that might include data generation, a performance test plan, etc. Whether or not this is useful as a component, is missing features, etc. is hard to say without any idea of a bigger picture.
A testing program to read 1D, 3D arrays and slices from 3D arrays. In the current state...