-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature branch] Add Neural Stats API #1208
base: feature/neural-stats-api
Are you sure you want to change the base?
[Feature branch] Add Neural Stats API #1208
Conversation
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Signed-off-by: Andy Qin <qinandy@amazon.com>
Shouldn't it be a full flat? Similar to cluster setting. https://opensearch.org/docs/latest/api-reference/cluster-api/cluster-settings/#path-parameters
|
It can go either way, but I think preserving some basic structure is helpful for navigating the high level categories. For example, cluster setting still divides the flat keys between The way I see it, flatten is an option to simplify navigating the nested fields, not a hard rule to flatten everything. Keeping some info grouped together still makes it easier to parse and navigate than if it was completely flat, e.g. stat metadata is easier to recursively parse since all the info is "close by". Example cluster settings call with flatten enabled:
Compare to non-flat
|
That make sense. We only need to flatten the key for easier search. Then, shouldn't it be like this?
Also wondering if this could be better.
|
There are two components: Stats API and Metric Management. For the Metric Management code, it might be better to have a separate package to make it more portable for use in other plugins. |
Yes, that makes more sense, good catch
I considered that as well, what I'm thinking is the "stat metadata" object replaces the raw value so the flat key name is the same whether or not metadata is included or not. Like the format will always be at It also keeps things organized when you parse recursively. For example. if a cluster manager is periodically calling the stats API and parsing the response without hardcoding stat names, if we keep metadata non-flat, for every stat you can foreach on the categories with flat_keys and parse each stat and all its metadata one by one in a single pass. Whereas if it was completely flat metadata, running a foreach would need additional logic to split, sort, and group the metadata keys by the stat name again. Also you would probably never be interested in the value of Just some thoughts. I don't see the usecase right now but I could still include an option to flatten the stat metadata as well, it's a pretty small change. |
Sure, although it'd take a little more intricate design to decouple some stuff like state stat calculation from plugin-specific details and refactoring to make it all extensible. Maybe that'd be a separate RFC? |
Please add the change in changelog.md |
Signed-off-by: Andy Qin <qinandy@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completed 1st round of review.
|
||
PipelineServiceUtil.instance().initialize(clusterService); | ||
NeuralSearchSettingsAccessor.instance().initialize(clusterService, environment.settings()); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra line
@@ -105,7 +123,11 @@ public Collection<Object> createComponents( | |||
NeuralSparseQueryBuilder.initialize(clientAccessor); | |||
HybridQueryExecutor.initialize(threadPool); | |||
normalizationProcessorWorkflow = new NormalizationProcessorWorkflow(new ScoreNormalizer(), new ScoreCombiner()); | |||
return List.of(clientAccessor); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove extra line
|
||
import java.util.Optional; | ||
|
||
public class RestActionUtils { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add javadoc of class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take reference from ProcessorUtils
import java.util.Optional; | ||
|
||
public class RestActionUtils { | ||
public static Optional<String[]> splitCommaSeparatedParam(RestRequest request, String paramName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add javadoc
return Optional.ofNullable(request.param(paramName)).map(s -> s.split(",")); | ||
} | ||
|
||
public static Optional<String> getStringParam(RestRequest request, String paramName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add javadoc
import org.opensearch.common.settings.Settings; | ||
import org.opensearch.neuralsearch.stats.events.EventStatsManager; | ||
|
||
public class NeuralSearchSettingsAccessor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add javadoc
.map(String::toLowerCase) | ||
.collect(Collectors.toSet()); | ||
|
||
private NeuralSearchSettingsAccessor settingsAccessor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this declaration to top
private boolean initialized; | ||
|
||
@Getter | ||
private volatile Boolean isStatsEnabled; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private volatile Boolean isStatsEnabled; | |
private volatile boolean isStatsEnabled; |
* Holds user requester input parameters for retrieving neural stats via rest handler API. | ||
* It allows filtering statistics by node IDs, event statistic types, and state statistic types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Holds user requester input parameters for retrieving neural stats via rest handler API. | |
* It allows filtering statistics by node IDs, event statistic types, and state statistic types. | |
* Entity class that holds input parameters for retrieving neural stats. | |
* It is responsible for filtering statistics by node IDs, event statistic types, and state statistic types. |
import java.util.Map; | ||
|
||
/** | ||
* Manager class for event counter stats, used to track monotonically increasing counts plus some possible metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Manager class for event counter stats, used to track monotonically increasing counts plus some possible metadata | |
* Manager class for event counter stats, used to track monotonically increasing counts plus some possible metadata |
Please rewrite to make it more reader friendly. It should provide high level information of this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I have two comments:
-
The most frequently called method (currently the
increment
method) should always operate in O(1) time complexity, regardless of the number of metrics defined in the future.Hash.get()
does not always guarantee O(1) performance when a large number of items are stored. -
We should avoid using the
public static instance()
pattern. This approach should only be considered as a last resort when no other alternatives exist, as it can make unit testing more difficult.
} | ||
|
||
// If no stats are specified, add all stats to retrieve all by default | ||
if (retrieveAllStats) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if stats is specified, if there is not matching stats, do we want to return all stats instead of empty?
* Return instance of the cluster context, must be initialized first for proper usage | ||
* @return instance of cluster context | ||
*/ | ||
public static synchronized NeuralSearchSettingsAccessor instance() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that this is a commonly used pattern, but can we avoid this approach as much as possible? Implementing a singleton this way can create long-term testing challenges. If one test modifies the settings, it affects all other tests.
It shouldn't be globally accessible through static method. Instead, it was created once and passed around by injection through constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can switch most of the singletons to injection. PipelineServiceUtil
, NeuralSearchSettingsAccessor
, and StateStatsManager
should be trivial.
For the EventStatsManager
, though, incrementing stats via static method call seems cleaner otherwise we'd have to add the instance to every single constructor of every object that wants to call it which is pretty invasive. I see OS core has an injector interface and uses @Inject for things like Transport Action, but it doesn't seem utilized in many other places, maybe for event stats its easier to keep the singleton instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. For EventStatsManager
, it might not be suitable to inject.
|
||
// Synchronized thread call shouldn't have performance hit, this will be called infrequently | ||
synchronized (this) { | ||
buckets = newBuckets; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think resetting the value might be a better approach than creating a new object since this object has been alive for a long time. As a result, it may have been moved to the old generation, making it less likely to be collected during young generation cleanup by JVM.
bucket.timestamp.set(0);
bucket.count.reset();
* @throws IOException if there's an error reading from the stream | ||
*/ | ||
public NeuralStatsInput(StreamInput input) throws IOException { | ||
nodeIds = input.readBoolean() ? new HashSet<>(input.readStringList()) : new HashSet<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
readOptionalStringList();
might be better in terms of readability.
Wonder if we can make nodeIds as List so that we can avoid this list to set conversion.
* @param set the EnumSet to write | ||
* @throws IOException If there's an error writing to the stream | ||
*/ | ||
private void writeOptionalEnumSet(StreamOutput out, EnumSet<?> set) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to have corresponding readOptionalEnumSet()
Better if we can add this method inside core :)
*/ | ||
public void inc(EventStatName eventStatName) { | ||
boolean enabled = settingsAccessor.getIsStatsEnabled(); | ||
if (enabled && stats.containsKey(eventStatName)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could become an issue when there are many eventStatName
values. While HashMap.get
is generally constant time, its performance may degrade as the number of items grows. Given that this method is called frequently, we may need to explore an alternative approach to access the instance directly for better efficiency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can have the EventStat be a member of the EventStatName enum and be instantiated in the enum constructor. That way we can access it directly without using a map at all.
If we have performance issues with hash map get, though, we do have other options, my understanding is the O(n) worst case happens when there are key object hash code collisions leading to putting objects in the same bucket and requiring a search on the bucket. There are ways we can mitigate this like implementing our hash code methods to ensure uniqueness, and Java has its own optimizations to address this like storing collisions in a tree rather than linked list.
I'll update again once I finish running performance benchmarks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Storing it on a tree is also not O(1). Also, even if the hashcode are distinct, I don't think they will all have their own bucket. The number of bucket can be smaller then the number of hashcode.
Yes. One use case that I can think of is that, when operator want to grep stats containing processor, this whole flat option can be handy.
|
|
||
@Override | ||
public List<Route> routes() { | ||
return ImmutableList.of( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this list can be a class constant, no need to construct it on every call as it's immutable object
splitCommaSeparatedParam(request, "nodeId"); | ||
splitCommaSeparatedParam(request, "stat"); | ||
request.paramAsBoolean(FLATTEN_PARAM, false); | ||
request.paramAsBoolean(INCLUDE_METADATA_PARAM, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why we need to process parameters and have those 4 lines if we are a going to return 403 or whatever response in case user did opted in?
*/ | ||
private NeuralStatsRequest getRequest(RestRequest request) { | ||
// parse the nodes the user wants to query | ||
String[] nodeIdsArr = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nodeIdsArr array can be null, add comment about how the null value will be handled: will that trigger an exception or we treat it as "all nodes" etc.
} else if (STATE_STAT_NAMES.contains(stat)) { | ||
retrieveAllStats = false; | ||
neuralStatsInput.getStateStatNames().add(StateStatName.from(stat)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a log statement in case of neither if nor else are not true
} | ||
|
||
public void initialize(ClusterService clusterService, Settings settings) { | ||
if (initialized) return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add curly braces for better readability
updateClusterSettings(); | ||
|
||
// Only enable stats for this IT to prevent collisions | ||
updateClusterSettings("plugins.neural_search.stats_enabled", true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should be able to get this name from that constant NEURAL_STATS_ENABLED
import java.util.Map; | ||
|
||
@Log4j2 | ||
public class RestNeuralStatsHandlerIT extends BaseNeuralSearchIT { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we also need a BWC test. It's fine if there is no 2.x version that supports stats, it's needed for future 3.x releases
Description
Implementing Neural Stats API framework design proposed in #1196. This initial PR sets up the foundation for the framework to track event and state stats throughout the neural search plugin and exposed their values via API.
See RFC for more details.
Initial implementation includes 3 stats:
Related Issues
Resolves #1196 #1104
Related: #1146
Check List
--signoff
.Example requests
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.