-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor infra creation to improve the overall infra setup time #1869
Refactor infra creation to improve the overall infra setup time #1869
Conversation
✅ Deploy Preview for kubernetes-sigs-cluster-api-ibmcloud ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
b724976
to
21ca9b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this PR, Overall approach looks good to me, Whats the time difference between old vs new approach?
|
||
powerVSCluster := clusterScope.IBMPowerVSCluster | ||
func (r *IBMPowerVSClusterReconciler) reconcilePowerVSResources(clusterScope *scope.PowerVSClusterScope, updatePowerVSCluster *updatePowerVSCluster, ch chan reconcileResult, wg *sync.WaitGroup) { | ||
defer wg.Done() | ||
// reconcile PowerVS service instance | ||
clusterScope.Info("Reconciling PowerVS service instance") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since PowerVS and VPC resource creations will happen concurrently, we can prefix log with some specific keys so we can trace the log of individual flow, something like
vpcLog := clusterScope.WithName("vpc")
vpcLog.Info("Creating vpc) , I hope it will work
} | ||
|
||
var networkReady, loadBalancerReady bool | ||
for _, cond := range clusterScope.IBMPowerVSCluster.Status.Conditions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By this time network should be ready right, When do you think it may not be ready also if its not ready we cannot even attach with TransitGateway right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we consider moving this check before calling ReconcileTransitGateway
to avoid failure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With transit gateway we are directly attaching the powervs with CRN right? It is not dependent on the network. And I m not requeuing when network and load balancer is not ready since it would block the TG creation even though it is not dependant for the TG. But before proceeding to retrieve the VPC LB details and set the overall Ready status as true, wanted to validate their status has reached Ready, that's why added the condition to validate both of these resource's Ready status.
Currently it's taking around 15 mins and with these changes it is taking around 8 to 9 mins. |
21ca9b0
to
f70454b
Compare
@dharaneeshvrd as per the discussion, lets get the early testing done with this PR to ensure it is working fine |
Any update on this? /cc @Karthik-K-N please take a look and give lgtm if no more comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last question, Otherwise overall LGTM
return | ||
} | ||
|
||
conditions.MarkFalse(update.cluster, conditionArgs[0].(capiv1beta1.ConditionType), conditionArgs[1].(string), conditionArgs[2].(capiv1beta1.ConditionSeverity), conditionArgs[3].(string), conditionArgs[4:]...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there any better way rather than using interface{} types for conditionArgs? also since we are indexing it, Should we add a check for verifying the min legth to avoid index out of range?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I pass it as a straight types, we would get a golint error for exceeding the max number of args which is 5. Also I would need to create a separate func for true and false condition. It would complicate things IMO.
Should we add a check for verifying the min length to avoid index out of range?
Again, need to add a check on many places where we are updating the condition and won't be able to take major action if it errored. Ex here I want to return the main error instead of the error I m getting from this, it does n't look good to me. Scope of this is within the reconcile func and only two possible flow. I think the current way is ok IMO.
f70454b
to
5b9d799
Compare
I have tested personally and Ashwin Hendre also able to test this and reported the improvement of 12% overall cluster setup time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few minor nits, otherwise lgtm
powerVSCluster.updateCondition(false, infrav1beta2.NetworkReadyCondition, infrav1beta2.NetworkReconciliationFailedReason, capiv1beta1.ConditionSeverityError, err.Error()) | ||
ch <- reconcileResult{reconcile.Result{}, err} | ||
return | ||
} else if requeue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we aren't doing any requeue
here, hence it will be good if we can rename or rewrite the logic to have meaningful name here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also please add why there is no requeue here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refactored the code, please take a look!
5b9d799
to
fe38345
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dharaneeshvrd, mkumatag The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #1837
Special notes for your reviewer:
With the changes, infra creation takes around ~8 mins.
Most time taking resource is DHCP server VM.
/area provider/ibmcloud
Release note: