Use triple dashes at the end of the build args. #86

Merged Jul 12, 2017 (5 commits)

Conversation

clalancette
Contributor

This ensures that the -- makes it onto the command-line,
and should fix jobs that sometimes get the arguments
embedded into the directory names.

I believe this will fix: ros2/build_farmer#33

@clalancette clalancette added the in progress Actively being worked on (Kanban column) label Jun 21, 2017
@mikaelarguedas
Member

👍 That's the first thing to try. Can you provide a job that proves that it fixes the problem?

@mikaelarguedas
Member

Sorry, just saw that this is still in progress. Hopefully this will fix #51 as well.

@clalancette
Contributor Author

Yeah, I should have linked to it earlier, forgot to. Here it is:

Build Status

Unfortunately, a single build doesn't tell us if it is fixed, but at least it is a start. I'm not entirely sure how to test this on all of the jobs on the build farm; suggestions?

@mikaelarguedas
Member

I think that deploying a test job on the farm with these settings and trying to pass it various parameters is a good way to test it. Note that the logic of https://github.com/ros2/ci/blob/master/job_templates/ci_job.xml.em#L279-L284 and its derivatives (for test args and for Windows) will need to be updated accordingly as well.

@clalancette
Contributor Author

@mikaelarguedas I'm testing with the code I just pushed here now. Thoughts about this approach of always doing it in the job?

@@ -77,7 +77,7 @@ def main(argv=None):
 'use_osrf_connext_debs_default': 'false',
 'use_fastrtps_default': 'true',
 'use_opensplice_default': 'false',
-'ament_build_args_default': '--parallel --cmake-args -DSECURITY=ON --',
+'ament_build_args_default': '--parallel --cmake-args -DSECURITY=ON',
Member

I think this should stay; otherwise, what happens to the next argument added? Will it be considered part of the cmake args?

Contributor Author

Oh, right, I see. I misunderstood the intent of that.

This actually points out a weakness in this approach, which I don't quite know how to resolve yet. If the user has a parameter like --cmake-args -DSECURITY=ON --, then the command-line ends up resolving to something like:

--cmake-args -DSECURITY=ON -- --- --ament-test-args

Which causes the run_ros2_batch.py command-line parser to explode. We could fix that in the parser, but it doesn't quite seem to be the correct thing to do; that -- --- is gibberish, and probably shouldn't be allowed. I'm not sure how to handle it yet.

Member

Which causes the run_ros2_batch.py command-line parser to explode.

The -- --- doesn't make sense, but --- -- would. So I'd say we need to fix the input, not the parsing in run_ros2_batch.py. But that's just my opinion without looking into the details very much.

In case you're wondering how this is handled in the build tool, see:

@clalancette
Contributor Author

The -- --- doesn't make sense, but --- -- would. So I'd say we need to fix the input, not the parsing in run_ros2_batch.py. But that's just my opinion without looking into the details very much.

I agree with your assessment that we need to fix the input, not the parsing (I'm working on that now). I'm just curious about why you think --- -- makes sense; what would that mean, exactly?

@wjwwood
Member

wjwwood commented Jun 21, 2017

--- would get converted to -- and be included in the captured arguments, please see:

http://osrf-pycommon.readthedocs.io/en/latest/cli_utils.html#osrf_pycommon.cli_utils.common.extract_argument_group
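
For readers who have not used that helper, here is a minimal sketch of the behavior described in this thread: a bare `--` ends the captured group and is dropped, and an isolated run of three or more dashes loses one dash. It is a simplified stand-in written for illustration, not the osrf_pycommon implementation, and the option names `--build-args` and `--other` are made up.

```python
import re


def extract_group(args, delimiter):
    """Split args into (remaining, captured) around a delimited group.

    Simplified model of the behavior discussed in the thread; this is
    not the osrf_pycommon code.
    """
    if delimiter not in args:
        return list(args), []
    start = args.index(delimiter)
    remaining = list(args[:start])
    captured = []
    rest = args[start + 1:]
    for i, token in enumerate(rest):
        if token == '--':
            # A bare double dash ends the group and is dropped.
            remaining.extend(rest[i + 1:])
            break
        if re.fullmatch(r'-{3,}', token):
            # An isolated run of three or more dashes loses one dash.
            token = token[1:]
        captured.append(token)
    return remaining, captured


# '--build-args' and '--other' are hypothetical option names.
argv = ['--build-args', '--cmake-args', '-DSECURITY=ON',
        '---', '--parallel', '--', '--other']
remaining, captured = extract_group(argv, '--build-args')
print(remaining)  # ['--other']
print(captured)   # ['--cmake-args', '-DSECURITY=ON', '--', '--parallel']
```

The escaped `---` survives as `--` inside the captured group, while the later bare `--` only marks where the group ends.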

@clalancette
Contributor Author

I just pushed what I'm currently working with. This code mostly works on Linux; I have not yet tested Windows. However, it does not work in the situation that @wjwwood described in his last post: if you pass --- -- at the end of the build args, you end up with -- -- in the final output, which again causes the parser to explode. This is because the args end with --, so the bash script thinks this is the end of the section and, trying to avoid -- ---, replaces the final -- with ---. We now have --- ---, which gets parsed down to -- -- by run_ros2_batch.py, and then causes ament to fail when trying to parse.

It's starting to feel like this is the wrong path; but that being said, I don't have a better idea on what I should do at the moment. Suggestions are welcome, otherwise I'll sleep on it and give it another try tomorrow.

@wjwwood
Member

wjwwood commented Jun 21, 2017

Why not just ---? (if --- --- works except for an extra -- that we don't want)

Maybe it would be helpful to list the steps it goes through (something like bash -> run_ros2_batch.py -> ament.py build ...) and what you might desire at each step or at least what you want the start and end states to be. Without digging into the details myself I don't know what we're trying to accomplish, but if I did I could probably say what needs to be done or if there is a conceptual issue. Also I bet by laying it out step by step you can decide what needs to be done as well.

Let me know if I can help.

@clalancette
Contributor Author

@wjwwood Thanks, that was a helpful way to go about it. There are actually a lot of steps that the whole thing goes through to get to the ament command-line. Just so I have it written down, here is a basic list:

  • The Jenkins job generates CI_* variables for the parameters (e.g. CI_AMENT_BUILD_ARGS). These variables are available as environment variables for the Jenkins build script that gets executed.
  • The Jenkins build script takes these CI_* variables, does some modifications to them, and collects them together into a single CI_ARGS environment variable.
  • That CI_ARGS environment variable is passed into the Docker container at runtime as an environment variable with the same name.
  • The ENTRYPOINT + CMD in the Dockerfile ends up executing /entry_point.sh python3 -u run_ros2_batch.py $CI_ARGS.
  • /entry_point.sh only looks to see if there is a connext command-line argument; if so, it installs additional packages. After that, it execs (as rosbuild) python3 -u run_ros2_batch.py $CI_ARGS.
  • run_ros2_batch.py parses all of the command-line arguments with the cli_utils module. It's at this point that double-dashes get dropped, and triple-dashes get converted into double-dashes. After parsing, run_ros2_batch.py ends up building the ament command-line using the parsed arguments.
  • Finally, ament is run with the command-line from above, and re-parses the command-line using the cli_utils module. Again, double-dashes get dropped and triple-dashes get converted into double-dashes.
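
The last two steps can be modeled in a few lines to see how dashes collapse at each parse. This is a rough sketch under stated assumptions: `extract_group` is a simplified stand-in for the cli_utils behavior described in the list (not the real code), and `--ament-build-args` is an assumed option name used only for illustration.

```python
import re


def extract_group(args, delimiter):
    """Simplified model of one parsing pass (not the real cli_utils code)."""
    if delimiter not in args:
        return list(args), []
    start = args.index(delimiter)
    remaining, captured = list(args[:start]), []
    rest = args[start + 1:]
    for i, token in enumerate(rest):
        if token == '--':  # bare double dash: end of group, dropped
            remaining.extend(rest[i + 1:])
            break
        # Isolated runs of three or more dashes lose one dash.
        captured.append(token[1:] if re.fullmatch(r'-{3,}', token) else token)
    return remaining, captured


def two_passes(ci_args):
    """Pass 1 models run_ros2_batch.py, pass 2 models ament."""
    _, ament_args = extract_group(ci_args, '--ament-build-args')
    ament_rest, cmake_args = extract_group(ament_args, '--cmake-args')
    return ament_args, ament_rest, cmake_args


# The user's "--" is escaped once as "---"; a bare "--" then closes the group.
good = ['--ament-build-args', '--parallel', '--cmake-args', '-DSECURITY=ON',
        '---', '--', '--ament-test-args']
# The failing case from the thread: "--- ---" reaches the first parser.
bad = ['--ament-build-args', '--cmake-args', '-DSECURITY=ON', '---', '---']

print(two_passes(good))
print(two_passes(bad))
```

In this model the `good` input hands ament `--parallel --cmake-args -DSECURITY=ON --`, so the second pass captures only `-DSECURITY=ON` as cmake args. The `bad` input reaches the second pass as `... -- --`, which leaves a stray `--` behind after the cmake group is closed.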

Fundamentally, it seems to me like the problem lies in the double-parsing of the command-line options between run_ros2_batch.py and ament. One way we could get around this is by not passing the ament build/test arguments on the run_ros2_batch.py command-line, and instead have entry_point.sh write them to a file. Then run_ros2_batch.py would read in this file and pass those options unmolested through to ament, and things would work better (the same basic idea would be used on OSX and Windows, though the details would be different). Thoughts on this approach?

@clalancette
Contributor Author

clalancette commented Jun 22, 2017

I did do an implementation of the above on the pass-args-in-files branch. It seems to work OK (I've only tested Linux so far), though it introduces a bit more coupling between the Dockerfile and the ros2_batch_job script. If we like that approach better, I'll make that one work for Windows and OSX, give up on this branch, and open a different PR over there.

@wjwwood
Member

wjwwood commented Jun 26, 2017

@clalancette I actually think that the double parsing of the arguments, while somewhat hard to read, is not untenable. The whole point of being able to escape the -- as --- is to allow something like this to occur. Will something like this not work:

  • run_ros2_batch.py --ament-args --cmake-args -DBUILD_TYPE=Debug --- --parallel -- --connext ...
  • which would make args.ament_args equal to --cmake-args -DBUILD_TYPE=Debug -- --parallel which could get passed to ament.py build as is

Which means that anytime a bare -- is encountered in the text field of Jenkins we would just escape that by replacing it with ---.

Again, maybe I'm missing some piece of the puzzle, but this seems like it should just work if we use the escaping mechanism correctly.

@clalancette
Contributor Author

It is possible to make it work with the escaping mechanism as you suggest, by replacing all -- on the command-line with ---. However, I'm not sure it's a general mechanism; what if the user puts --- in the Jenkins parameter? Do we then escape that with ----, etc.?

It just seems to me that having run_ros2_batch.py parse arguments that are not intended for it is a recipe for pain. By putting the ament Jenkins parameters into a file, we avoid all of that and give ament exactly the arguments the user intended.

@wjwwood
Member

wjwwood commented Jun 27, 2017

However, I'm not sure it's a general mechanism; what if the user puts --- in the Jenkins parameter? Do we then escape that with a ----, etc?

Yes, it's a relatively simple find/replace (maybe not with batch, but we could call out to Python if needed).

It just seems to me that having run_ros2_batch.py parse arguments that are not intended for it is a recipe for pain. By putting the ament Jenkins parameters into a file, we avoid all of that and give ament exactly the arguments the user intended.

I think the escaping of the terminating character, i.e. --, is the only thing that needs to be handled at the moment (I don't foresee any other issues with it right now). So I would personally use the mechanism that's already there to handle it. But if you really feel the file is necessary, that's fine too. You just need to make sure it is obvious what is being passed when the job is running, by doing things like echoing the file to the screen before running. The one downside of the file is that it doesn't have as much visibility as command line arguments.
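
As a rough illustration of the find/replace idea (not the code that was eventually merged), escaping could add one dash to every token that consists only of dashes, so that `--` becomes `---` and `---` becomes `----`. The function names below are hypothetical, and `unescape_dashes` only models what one parsing pass would do to tokens inside a group.

```python
import re


def escape_dashes(tokens):
    """Add one dash to every token made up only of two or more dashes."""
    return ['-' + t if re.fullmatch(r'-{2,}', t) else t for t in tokens]


def unescape_dashes(tokens):
    """Model of one parsing pass: runs of three or more dashes lose one."""
    return [t[1:] if re.fullmatch(r'-{3,}', t) else t for t in tokens]


# What a user might type into the Jenkins text field.
user_args = '--parallel --cmake-args -DSECURITY=ON -- ---'.split()
escaped = escape_dashes(user_args)
print(' '.join(escaped))  # --parallel --cmake-args -DSECURITY=ON --- ----
print(unescape_dashes(escaped) == user_args)  # True
```

Only isolated dash runs are touched; options such as `--parallel` or values such as `-DSECURITY=ON` pass through unchanged, so one escape followed by one parsing pass gives back what the user typed.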

@clalancette
Contributor Author

Yeah, I agree about the visibility part; I can definitely have it print out (though in some sense, it already is because it ends up on the ament command-line, which is printed out).

I actually don't feel that strongly about it. If you feel like there aren't further issues that we'll encounter down the road, then I can go with the simpler alternative of just escaping things. I'll update this PR to do that instead of the file, and see how things turn out.

@wjwwood
Member

wjwwood commented Jun 27, 2017

I actually don't feel that strongly about it. If you feel like there aren't further issues that we'll encounter down the road, then I can go with the simpler alternative of just escaping things. I'll update this PR to do that instead of the file, and see how things turn out.

Sure, as I said, it's up to you. The escaping seems like a smaller change to me, but on the other hand it might be less robust long term.

Let me know if I can help.

According to

https://social.technet.microsoft.com/Forums/scriptcenter/en-US/a009485a-046b-4dca-afd3-20555305e617/batch-variable-not-updating-as-expected?forum=ITCG

and empirical evidence, variable expansion is done at read time,
not execute time, by default in batch.  The way to make it work
at execute time is to setlocal enableDelayedExpansion (which we have),
and use ! for all variables (which we don't have).  Switch all
of these to ! so that we get expansion at execute time.  I can't
say I understand why this worked before; it should have always
had this problem.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
This ensures that the -- makes it onto the command-line,
and should fix jobs that sometimes get the arguments
embedded into the directory names.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
Regular for loops always split on space and equals signs, and
there is nothing you can do about it.  Switch to /f instead
and do the loop ourselves, which is more ugly but actually
works.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
@clalancette
Contributor Author

All right, I've gone back and done what @wjwwood suggested and did the escaping inline. It does seem like it is going to work, and I'm running some test builds now. I'll set this back to review once I'm happy with those tests (and I'll link to them).

@clalancette
Contributor Author

clalancette commented Jun 30, 2017

This comment is a collection of tests I've run to test out the parsing capabilities:

Linux, default ament build args: Build Status
Windows, default ament build args: Build Status
OSX, default ament build args: Build Status
Linux, no ament build args: Build Status
Windows, no ament build args: Build Status
OSX, params after --: Build Status
Windows, params after --: Build Status

@mikaelarguedas
Member

All these test cases were working with the current implementation. Is it possible to test configurations that break the current one (e.g. http://ci.ros2.org/job/ci_osx/1901/parameters/ from the original issue)?

@clalancette
Contributor Author

Yeah, I'll do those additional tests as well. I just wanted to make sure all of the combinations were going to work, particularly on Windows (since I did a lot of batch hackery there, and I've never really done that before).

@clalancette clalancette added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Jul 11, 2017
@clalancette
Contributor Author

I'm still running an additional test or two, but I'm pretty happy with this. So I think this is ready for review; I won't merge anything until the tests are complete.

Member

@wjwwood wjwwood left a comment

I had two questions, otherwise lgtm.

set "PATH=!PATH:"=!"
set "CI_ARGS=--force-ansi-color --workspace-path !WORKSPACE!"
if "!CI_BRANCH_TO_TEST!" NEQ "" (
set "CI_ARGS=!CI_ARGS! --test-branch !CI_BRANCH_TO_TEST!"
Member

Why did you need to replace all of the %...% with !...!? Is there a new double expansion?

Contributor Author

Ah, it's an interesting question. Windows batch files are kind of dumb; they expand any %...% variables at the time the statement is read, not at the time it is executed. Thus, constructs where you append arguments don't generally work like you expect. I saw this while I was testing on a local Windows instance. The workaround for this behavior is to specify setlocal enableDelayedExpansion, but that doesn't change the behavior of %...% variables; it just introduces the concept of delayed expansion to !...! variables. Thus, since we are indeed doing appending on some of the variables, I just switched them all to !...!. One of the main references I used for this is here: https://ss64.com/nt/delayedexpansion.html
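
To make the read-time versus execute-time distinction concrete, here is a toy Python model of a single parenthesized batch block that contains only `NAME=VALUE` assignments. It is a deliberately simplified sketch, not an emulation of cmd.exe: `expand` and `run_block` are hypothetical helpers, and the branch name `foo` is made up.

```python
import re


def expand(text, env, marker):
    """Replace marker-delimited names (%NAME% or !NAME!) with env values."""
    pattern = re.escape(marker) + r'(\w+)' + re.escape(marker)
    return re.sub(pattern, lambda m: env.get(m.group(1), ''), text)


def run_block(statements, env, marker):
    """Run NAME=VALUE assignments as if inside one parenthesized block."""
    env = dict(env)
    if marker == '%':
        # Read-time expansion: the whole block is expanded once, up front.
        statements = [expand(s, env, '%') for s in statements]
    for statement in statements:
        if marker == '!':
            # Delayed expansion: each statement is expanded just before it runs.
            statement = expand(statement, env, '!')
        name, value = statement.split('=', 1)
        env[name] = value
    return env


percent = run_block(['CI_ARGS=--force-ansi-color',
                     'CI_ARGS=%CI_ARGS% --test-branch foo'], {}, '%')
bang = run_block(['CI_ARGS=--force-ansi-color',
                  'CI_ARGS=!CI_ARGS! --test-branch foo'], {}, '!')
print(repr(percent['CI_ARGS']))  # ' --test-branch foo'
print(repr(bang['CI_ARGS']))     # '--force-ansi-color --test-branch foo'
```

In the `%` case the second assignment was expanded before the first one ran, so the appended value starts from an empty `CI_ARGS`. In the `!` case the append sees the value set by the previous statement, which is the behavior the appending code relies on.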

Member

I just switched them all to !...!.

I was a bit concerned about this, because you're changing their behavior just because. I was always trying to use !...! sparingly, defaulting to %...%, because that is what you want after all.

It might not break anything, and it might even be necessary if there is a new mechanism for expanding that requires them all to be, but just switching them for consistency is not right imo.

Contributor Author

As the code stands now, you are right in that it is not strictly necessary for the vast majority of them (CI_ARGS is the obvious one that needs to be !...!). My motivation for converting them all is that this whole %...% vs. !...! business isn't really intuitive, particularly in the Jenkins environment. If someone ever tries to make another change where they change the value of the variable, they'll have to re-discover the correct mechanism. By switching them all, following the pattern becomes easier, and the variables will more-or-less always do what you expect.

That being said, if you still think we should leave them as %...%, I can switch it back (except for the ones that need to be !...!). Thoughts?

Member

@wjwwood wjwwood Jul 11, 2017

I don't mind them being left as your patch is, but I just have no way of knowing (as a reviewer) if this will produce undesired behavior in some currently untested scenario. For all I know one or more of those should have been %...%. It's a bit like replacing all ++x with x++ because its function is confusing. I don't disagree that this mechanism in batch is confusing, but anyone who wants to touch any batch code needs to understand that difference when it is important.

As I said, in this case I'm fine to leave it, but in the future I'd avoid changing them unless there is a need to do so.

@@ -12,8 +12,6 @@ export ORIG_GID=$(echo $ORIGPASSWD | cut -f4 -d:)
 export UID=${UID:=$ORIG_UID}
 export GID=${GID:=$ORIG_GID}

-ARCH=`uname -i`
Member

Are you sure this is not needed anymore? Maybe it was used by dpkg or in one of the later scripts? I don't know if either is the case, but I was wondering if you knew specifically that it wasn't.

Contributor Author

Hm, you know, I'm not 100% certain. I should probably just leave it alone for now. I'll revert that part.

We don't know for sure it is not needed anymore, so just
leave it alone for now.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
@clalancette
Contributor Author

All right, I'm going to merge this, and then I'll have to deploy it (which I haven't done before). I've read the README.md about running the create_jenkins_job.py script; is that all I need to do to deploy? Are there any gotchas that I need to be aware of?

@clalancette clalancette merged commit 3210be0 into master Jul 12, 2017
@clalancette clalancette removed the in review Waiting for review (Kanban column) label Jul 12, 2017
@clalancette clalancette deleted the triple-dashes branch July 12, 2017 13:31
@wjwwood
Member

wjwwood commented Jul 12, 2017

All you have to do is run that script, but for it to work you have to set up your credentials for Jenkins locally. I think the instructions to do that are linked to from the readme.

@clalancette
Contributor Author

Thanks, deployed now, and running a test job to make sure everything is good.

mikaelarguedas referenced this pull request Aug 17, 2017
They get put literally into the result.

Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
mikaelarguedas referenced this pull request Aug 17, 2017
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
Development

Successfully merging this pull request may close these issues.

nightly_linux_coverage build failed