-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SubM operator, tests, e2e integration #151
Conversation
Something happened with CI: [submariner]$ PACKAGES=*.go operators pkg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial comments, I will keep reviewing tomorrow.
c8af7bd
to
3c00537
Compare
It was an issue with formatting. I fixed those issues in the latest changes. |
I am seeing other linting errors still, via the extra checks enabled in CI (I was just running e2e locally):
|
3c00537
to
c6957a7
Compare
@manosnoam @skitt - Can you guys try the latest here and report what you get? I'm currently hitting some this unexpected error, trying to track down the problem:
|
Same for me. |
Did you change the container’s security context recently? |
No, I was trying to keep things mostly stable as we finalize everything this week. I think it's something that changed under us when I rebased. If I make a new branch from 5cfe40d and then cherry-pick this commit on top (with a little |
Right, that’s very plausible. I’ll try bisecting. |
And for some unfathomable reason the Submariner pods now deploy correctly again... |
For me it fails earlier on ./gen_subm_operator.sh: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Partial review
operators/go/gen_subm_operator.sh
Outdated
cp $controller_file_src $controller_file_dst | ||
|
||
popd | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of this function is very similar to the other one (add_subm_engine_to_operator) and thus it would be nicer to DRY and have just one function with parameters.
It would also be very useful for future operators (admiral, coastguard, lighthouse, etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good idea. I raised a Jira to track this: https://jira.coreos.com/browse/HYCLD-251
The general theme of a lot of answers to other review comments applies here - "yes, good idea, I'm just trying to keep it as simple as possible for now". There are definitely opportunities to make things more DRY and easy to extend for future controllers.
10d0627
to
a3a2da9
Compare
Yeah, it's a strange one. I'm trying to bisect as well. So far I'm pretty sure I can consistently reproduce it when I base this commit on the latest from master, and that the same Operator code fully passes for me when it's based on 3b46ebd or b0fccd7. |
I think this was resolved on https://jira.coreos.com/browse/HYCLD-250. |
@skitt I'm pretty confident I've bisected it to be between 5306204 (working) and 4b7def5 (failing). I've verified both of those "bookends" of the biset twice and seen the same results. It is possible that 4b7def5 is failing for a different reason however (and so that end of the search could be wrong) - it's not the same error we reported above on latest:
I've also just completed a run with 28f1b1d that passed. |
Still not fully tested because I'm having issues with git.apache.org (dep from operator sdk..) but consider this for the scripts directory:
That way we can just generate with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More food for thought (will continue reviewing on Sun)
|
||
clusterCidr_cluster2=10.245.0.0/16 | ||
clusterCidr_cluster3=10.246.0.0/16 | ||
serviceCidr_cluster2=100.95.0.0/16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here..
|
||
# These all need to end up in pod container/environment vars | ||
sed -i "/spec:/a \ \ submariner_namespace: $subm_ns" $cr_file | ||
if [[ $context = cluster2 ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get it from the "hashtable"
kubectl get submariner $deployment_name --namespace=$subm_ns -o jsonpath='{.spec.submariner_debug}' | grep $subm_debug | ||
kubectl get submariner $deployment_name --namespace=$subm_ns -o jsonpath='{.spec.submariner_namespace}' | grep $subm_ns | ||
kubectl get submariner $deployment_name --namespace=$subm_ns -o jsonpath='{.spec.submariner_natenabled}' | grep $natEnabled | ||
if [[ $context = cluster2 ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get it from the "hashtable"
kubectl get pod $subm_engine_pod_name --namespace=$subm_ns -o jsonpath='{.spec.containers..command}' | grep submariner.sh | ||
kubectl get pod $subm_engine_pod_name --namespace=$subm_ns -o jsonpath='{.spec.containers..env}' | ||
kubectl get pod $subm_engine_pod_name --namespace=$subm_ns -o jsonpath='{.spec.containers..env}' | grep "name:SUBMARINER_NAMESPACE value:$subm_ns" | ||
if [[ $context = cluster2 ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get it from the "hashtable"
kubectl exec -it $subm_engine_pod_name --namespace=$subm_ns -- env | grep "SUBMARINER_DEBUG=$subm_debug" | ||
kubectl exec -it $subm_engine_pod_name --namespace=$subm_ns -- env | grep "BROKER_K8S_APISERVERTOKEN=$SUBMARINER_BROKER_TOKEN" | ||
kubectl exec -it $subm_engine_pod_name --namespace=$subm_ns -- env | grep "BROKER_K8S_REMOTENAMESPACE=$SUBMARINER_BROKER_NS" | ||
if [[ $context = cluster2 ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get it from the "hashtable"
91e7bf7
to
c05f538
Compare
@skitt I haven't found a commit more recent than the one I'm currently basing the PR on (5306204) that totally passses. I've seen everything pass with this base three times. I see the Read-only file system failure on submariner-pod on commits from the HEAD of master back to e40d95c (0919214 and 9aad038 for example). There are 11 commits between that working 5306204 and failing e40d95c, all part of a larger change by @sridhargaddam. I see different results testing them, but I imagine that's because I'm consuming part of a larger change. On 79a29bf (the commit right after the known-working base now) I see everything deploy but the connection tests fail. On 4b7def5 and abe5b01 (2 more of the 11 between the known working/current-failure commits) I see Does it consistently pass for you with this (5306204) as the base? Have you seen this same operator code pass on any more recent commit as the base? |
Yes @dfarrell07 , the VxLAN support patch is a big change and it depends on couple of patches in submariner-charts (as mentioned in the commit message 79a29bf) Can you please check if all the three dependent patches mentioned in the commit message (79a29bf) are in your build. |
I get the impression the operator currently doesn’t handle running in a namespace other than the target namespace ( |
Why doesn’t the operator include all the CRDs used in the end-to-end tests? Are they supposed to be deployed in some other way? |
c05f538
to
de54101
Compare
@skitt - I made some tweaks and I'm now able to deploy/pass all tests in any namespace (configured by the
|
@skitt - Sorry, I might not understand what you mean - what CRDs are we talking about? With latest on PR/151, I see these CRDs.
With the latest from master (0919214), I see these CRDs:
|
de54101
to
280892e
Compare
Sorry I wasn’t very clear @dfarrell07! When I run the operator generator, I get two CRDs in the |
popd | ||
} | ||
|
||
function create_subm_endpoints_crd() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a nifty bash thing, it's a basic programming principle.
Since you're already using functions, at least try to minimize the code duplication if it's easy, it lends to better readability and maintainability.
clusters_crd_file=deploy/crds/submariner_clusters_crd.yaml | ||
|
||
# TODO: Can/should we create this with Op-SDK? | ||
cat <<EOF > $clusters_crd_file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't make sense to agree to disagree actually, I explained why it makes much more sense to keep outside the script than embedded inside it
FYI I'm working on getting e2e to run |
@mangelajo I rebased this on master as we discussed. I verified (a number of times) that we consistently see only the new expected-failing unit tests I mentioned. |
Thanks Daniel, I also checked, and it makes sense, the "src-ip"
preservation fails because
the route-agent is not deployed everywhere. Working on that now.
Helm uses Daemonset for that, so I'll do the same :-)
---
irc: ajo / mangelajo
Miguel Angel Ajo Pelayo
+34 636 52 25 69
skype: ajoajoajo
…On Wed, Oct 2, 2019 at 6:06 AM Daniel Farrell ***@***.***> wrote:
@mangelajo <https://github.com/mangelajo> I rebased this on master as we
discussed. I verified (a number of times) that we consistently see only the
new expected-failing unit tests I mentioned.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#151?email_source=notifications&email_token=AAI7G4QU2MIJA5EDF4GP74DQMQM5FA5CNFSM4IXCCHCKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEADPDCQ#issuecomment-537325962>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAI7G4WQ6NZJKOSYOMG5P2LQMQM5FANCNFSM4IXCCHCA>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more comments
CI using the Operator is timing out while still correctly-running:
https://travis-ci.com/submariner-io/submariner/builds/130283937 |
In the latest CI run the unit tests at the end were running/passing when the job timed out. That likely means things would pass, since all the Operator verifications have already passed. https://travis-ci.com/submariner-io/submariner/builds/130431372 (have to scrollllll) |
It looks like the maximum Travis job time for public repos is 50m and that it can't be changed. https://docs.travis-ci.com/user/customizing-the-build#build-timeouts CI bat signal to signal @dimaunx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can merge this now, and eventually move this all to it's own repo, but it's good for an alpha release IMHO.
Signed-off-by: Daniel Farrell <dfarrell@redhat.com> Signed-off-by: Miguel Angel Ajo Pelayo <majopela@redhat.com> Signed-off-by: Stephen Kitt <skitt@redhat.com> Signed-off-by: Mike Kolesnik <mkolesni@redhat.com> Signed-off-by: Janki Chhatbar <jchhatba@redhat.com>
Signed-off-by: Daniel Farrell dfarrell@redhat.com