Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add native memory circuit breaker. #689

Merged
merged 4 commits into from
Jan 11, 2023
Merged

Conversation

jngz-es
Copy link
Collaborator

@jngz-es jngz-es commented Jan 11, 2023

Add native memory circuit breaker.
Refactor all breakers from common to plugin.
Add dynamic setting for native memory circuit breaker.

Signed-off-by: Jing Zhang jngz@amazon.com

Description

Add native memory circuit breaker.
Refactor all breakers from common to plugin.
Add dynamic setting for native memory circuit breaker.

Issues Resolved

#648

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Refactor all breakers from common to plugin.
Add dynamic setting for native memory circuit breaker.

Signed-off-by: Jing Zhang <jngz@amazon.com>
@jngz-es jngz-es requested a review from a team January 11, 2023 00:48
@codecov-commenter
Copy link

codecov-commenter commented Jan 11, 2023

Codecov Report

Merging #689 (fc43c48) into 2.x (9420c91) will decrease coverage by 0.41%.
The diff coverage is 85.29%.

@@             Coverage Diff              @@
##                2.x     #689      +/-   ##
============================================
- Coverage     83.91%   83.50%   -0.42%     
- Complexity     1008     1021      +13     
============================================
  Files            93       99       +6     
  Lines          3662     3740      +78     
  Branches        343      344       +1     
============================================
+ Hits           3073     3123      +50     
- Misses          444      470      +26     
- Partials        145      147       +2     
Flag Coverage Δ
ml-commons 83.50% <85.29%> (-0.42%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ml/action/load/TransportLoadModelOnNodeAction.java 97.29% <ø> (ø)
.../org/opensearch/ml/breaker/DiskCircuitBreaker.java 36.36% <0.00%> (ø)
...rg/opensearch/ml/breaker/MemoryCircuitBreaker.java 100.00% <ø> (ø)
...opensearch/ml/breaker/ThresholdCircuitBreaker.java 100.00% <ø> (ø)
...n/java/org/opensearch/ml/model/MLModelManager.java 78.20% <ø> (ø)
...va/org/opensearch/ml/task/MLExecuteTaskRunner.java 83.33% <ø> (ø)
...va/org/opensearch/ml/task/MLPredictTaskRunner.java 82.14% <ø> (ø)
...main/java/org/opensearch/ml/task/MLTaskRunner.java 97.72% <ø> (ø)
...pensearch/ml/task/MLTrainAndPredictTaskRunner.java 78.26% <ø> (ø)
...a/org/opensearch/ml/task/MLTrainingTaskRunner.java 74.76% <ø> (ø)
... and 15 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@ylwu-amzn
Copy link
Collaborator

{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"persistent setting [plugins.ml_commons.native_mem_threshold], not recognized"}],"type":"illegal_argument_exception","reason":"persistent setting [plugins.ml_commons.native_mem_threshold], not recognized"},"status":400}

Sicheng added BWC test, which test from 2.4.0, in 2.4.0, we don't have the native memory threshold setting

@@ -53,4 +53,7 @@ private MLCommonsSettings() {}
Setting.Property.NodeScope,
Setting.Property.Dynamic
);

public static final Setting<Integer> ML_COMMONS_NATIVE_MEM_THRESHOLD = Setting
.intSetting("plugins.ml_commons.native_mem_threshold", 90, 0, 100, Setting.Property.NodeScope, Setting.Property.Dynamic);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mem seems not so clear. how about change to

Suggested change
.intSetting("plugins.ml_commons.native_mem_threshold", 90, 0, 100, Setting.Property.NodeScope, Setting.Property.Dynamic);
.intSetting("plugins.ml_commons.native_memory_threshold", 90, 0, 100, Setting.Property.NodeScope, Setting.Property.Dynamic);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, will change it to memory.

}

@Override
public boolean isOpen() {
try {
return AccessController.doPrivileged((PrivilegedExceptionAction<Boolean>) () -> {
return (new File(diskDir).getFreeSpace()/1024/1024/1024) < getThreshold(); // in GB
return (new File(diskDir).getFreeSpace() / 1024 / 1024 / 1024) < getThreshold(); // in GB
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: we can create constant for the disk threshold rather than recalculate for every request.

public static final long DEFAULT_FREE_DISK_SPACE_THRESHOLD = 5l * 1024 * 1024 * 1024;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's from original code not for this PR. Sure, I can add it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to add a contant only for 1024 * 1024 * 1024, as the original logic is using G as default unit.

for (CircuitBreaker breaker : breakers.values()) {
if (breaker.isOpen()) {
return breaker.getName();
return (ThresholdCircuitBreaker) breaker;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

breaker class is CircuitBreaker, this line enforce it to ThresholdCircuitBreaker, seems not so extensible. If we have new child class of CircuitBreaker, this line will throw error.

Why need to change the return type from String to ThresholdCircuitBreaker?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to get more information like the current threshold when breaker is triggered to help debug.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can add a warning log to help

Signed-off-by: Jing Zhang <jngz@amazon.com>
Signed-off-by: Jing Zhang <jngz@amazon.com>
@jngz-es jngz-es requested a review from ylwu-amzn January 11, 2023 01:51
}

@Override
public boolean isOpen() {
try {
return AccessController.doPrivileged((PrivilegedExceptionAction<Boolean>) () -> {
return (new File(diskDir).getFreeSpace()/1024/1024/1024) < getThreshold(); // in GB
return (new File(diskDir).getFreeSpace() / GB) < getThreshold(); // in GB
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just set threshold and compare directly without dividing GB for every request? Just like what you did for native memory breaker?

Signed-off-by: Jing Zhang <jngz@amazon.com>
@jngz-es jngz-es requested a review from ylwu-amzn January 11, 2023 02:29
expectedEx.expect(MLException.class);
expectedEx.expectMessage("Disk Circuit Breaker is open, please check your resources!");
expectedEx.expectMessage("Disk Circuit Breaker is open, threshold is 87. Please check your resources!");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use 87?

@jngz-es jngz-es merged commit d809dd2 into opensearch-project:2.x Jan 11, 2023
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 11, 2023
* Add native memory circuit breaker.
Refactor all breakers from common to plugin.
Add dynamic setting for native memory circuit breaker.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 1.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Spotless changes.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 2.

Signed-off-by: Jing Zhang <jngz@amazon.com>

Signed-off-by: Jing Zhang <jngz@amazon.com>
(cherry picked from commit d809dd2)
@ylwu-amzn ylwu-amzn mentioned this pull request Jan 11, 2023
5 tasks
ylwu-amzn pushed a commit that referenced this pull request Jan 11, 2023
* Add native memory circuit breaker.
Refactor all breakers from common to plugin.
Add dynamic setting for native memory circuit breaker.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 1.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Spotless changes.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 2.

Signed-off-by: Jing Zhang <jngz@amazon.com>

Signed-off-by: Jing Zhang <jngz@amazon.com>
(cherry picked from commit d809dd2)

Co-authored-by: Jing Zhang <jngz@amazon.com>
ylwu-amzn pushed a commit to ylwu-amzn/ml-commons that referenced this pull request Feb 17, 2023
* Add native memory circuit breaker.
Refactor all breakers from common to plugin.
Add dynamic setting for native memory circuit breaker.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 1.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Spotless changes.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 2.

Signed-off-by: Jing Zhang <jngz@amazon.com>

Signed-off-by: Jing Zhang <jngz@amazon.com>
ylwu-amzn pushed a commit to ylwu-amzn/ml-commons that referenced this pull request Mar 2, 2023
* Add native memory circuit breaker.
Refactor all breakers from common to plugin.
Add dynamic setting for native memory circuit breaker.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 1.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Spotless changes.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 2.

Signed-off-by: Jing Zhang <jngz@amazon.com>

Signed-off-by: Jing Zhang <jngz@amazon.com>
ylwu-amzn pushed a commit to ylwu-amzn/ml-commons that referenced this pull request Mar 2, 2023
* Add native memory circuit breaker.
Refactor all breakers from common to plugin.
Add dynamic setting for native memory circuit breaker.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 1.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Spotless changes.

Signed-off-by: Jing Zhang <jngz@amazon.com>

* Address the comments 2.

Signed-off-by: Jing Zhang <jngz@amazon.com>

Signed-off-by: Jing Zhang <jngz@amazon.com>
ylwu-amzn added a commit that referenced this pull request Mar 2, 2023
* Add native memory circuit breaker.
Refactor all breakers from common to plugin.
Add dynamic setting for native memory circuit breaker.



* Address the comments 1.



* Spotless changes.



* Address the comments 2.

Signed-off-by: Jing Zhang <jngz@amazon.com>
Co-authored-by: Jing Zhang <jngz@amazon.com>
@jngz-es jngz-es deleted the native branch May 23, 2023 20:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants