-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query by Indexed Component Value #17608
base: main
Are you sure you want to change the base?
Conversation
crates/bevy_ecs/src/index/mod.rs
Outdated
} | ||
|
||
/// Marker describing the requirements for a [`Component`] to be suitable for indexing. | ||
pub trait IndexComponent: Component<Mutability = Immutable> + Eq + Hash + Clone {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Hash is too strong. BTreeMaps or QuadTrees will work just fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both of those would require Ord
, which might be difficult to justify for spatial indexes. Perhaps we can offer a more generic version of this API that would allow users to choose which trait bounds and storage option makes the most sense for them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's my hope. Ideally we move the bounds off of here, and onto the lookup methods.
Marking as ready-for-review as I have the finalised MVP API I would like for this feature. As mentioned, it would be nice to allow |
/// } | ||
/// } | ||
/// ``` | ||
pub fn at(&mut self, value: &C) -> Query<'_, '_, D, (F, With<C>)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow-up: this implies the ability to do between
, which would be super sweet. Another argument for an Ord
backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely! Generalised storage backends could allow some really nice methods here for sure.
crates/bevy_ecs/src/index/mod.rs
Outdated
world.entity_mut(entity).remove_by_id(component_id); | ||
|
||
// On removal, we check if this was the last usage of this marker. | ||
// If so, we can recycle it for a different value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's pretty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's some pretty wild Rust in this file!
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
Co-Authored-By: Alice Cecile <alice.i.cecile@gmail.com>
/// } | ||
/// } | ||
/// ``` | ||
pub fn at(&mut self, value: &C) -> Query<'_, '_, D, (F, With<C>)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd call this where
personally but maybe at
has a more ecs-y flavor that I'm not familiar with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like at
personally because it's like we're getting the query at a particular value. But I'd be happy to change it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer at
to where
personally. I was otherwise thinking of for
, but I think at
might be better
Preliminary benchmarking shows the index speeds up iteration by a factor of two when 1-in-16 entities has the target value. However, it degrades mutation performance of that index component by a factor of ten in that same scenario. As expected, indexing is beneficial when you're searching for values at a much higher rate than updating them. |
Improves performance slightly
Improves mutation performance
With a little refactoring I was able to close the performance gap in mutation to a factor of 4 in the benchmark. Still costly, but an improvement nonetheless. One last thing I want to change is |
We do want to benchmark against (it's also the tool that I reach out for when I need to maintain an internal index in my own games) Or is the main benefit compared to using a hashmap that this makes the query code straightforward to update (we just need to update the access instead of making doing more involved changes into the list of archetypes to iterate, etc.)? I do like the elegance of the design, I guess the alternative would be for the I just wonder if this elegant ZST based design won't come to bite us later on, that's why I'd like to understand better the benefits vs the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know my review doesn't mean much on this, but I think the code looks really clean and I think the pitch for this implementation is really good.
} | ||
None => { | ||
// Need to clone the index value for later lookups | ||
let value = value.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't have to be in this PR, but I see a world where the value that is used here is instead of the result of some IndexableComponent::hash_value()
function, so that we can group components into 'buckets' of values.
That way we could avoid the issue of needing too many ZSTs components for components with float values, as we could for example map all the entities with values between 0.0-1.0 in the same bucket (and the query would then have to do an additional check via a .iter().filter()
in the QueryIndex
since there's no guarantee that the entities returned by the index have the exact value that we want).
I'm not super knowledgeable about indexes but basically I think we can re-use a bunch of tricks used by traditional DB indexes
//! Fragmenting requires unique [component IDs](ComponentId), and they are finite. | ||
//! For components with a small number of values in use at any one time, this is acceptable. | ||
//! But for components with a _large_ number of unique values (e.g., components containing floating point numbers), | ||
//! the pool of available component IDs will be quickly exhausted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That does seem dangerous, I wonder if we could return a warning after a certain number of ZSTs are used?
Also I guess the work to make the ECS work well with higher componentID values will become more important
/// | ||
/// // Indexing is opt-in through `World::add_index` | ||
/// world.add_index::<Player>(); | ||
/// # world.spawn(Player(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the doc is a bit lengthy, maybe having 0 Player(0)
, 1 Player(1)
, 2 Player(2)
s is enough to demonstrate how this works
// If there is a marker, restrict to it | ||
Some(state) => builder.with_id(state.component_id), | ||
// Otherwise, create a no-op query by including With<C> and Without<C> | ||
None => builder.without::<C>(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clever!
// Simply call `add_index` to start indexing a component. | ||
// Make sure to call this _before_ any components are spawned, otherwise | ||
// the index will miss those entities! | ||
.add_index::<TilePosition>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would be worthwhile to have some kind of configuration to let users define which entities they want indexed.
Maybe .add_index::<C>(IndexConfig)
with IndexConfig.automatic_indexation: bool
?
If it's set to false, then an entity would be indexed only if a Index<C>
component is added to an entity.
This lets us:
- ignore some entities that we don't want to get indexed (cheaper indexation). In your version they would still get indexed and the user would have to filter them out in their
QueryByIndex
- have an easy way to remove an entity from an index (by removing the
Index<C>
component)
For example I might only want to index the Transform
of only a few entities (player entities), but not of sprites, walls, etc.
Maybe both version can co-exist, because your version has ergonomic benefits (users don't have to add Index on every entity that needs to be indexed).
(Of course this doesn't have to be a blocker for this PR)
pub fn at(&mut self, value: &C) -> Query<'_, '_, D, (F, With<C>)> { | ||
self.state = { | ||
// SAFETY: Mutable references do not alias and will be dropped after this block | ||
let mut builder = unsafe { QueryBuilder::new(self.world.world_mut()) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this call to world_mut()
is unsound. It will conflict with uses in other params in the same system, or in other systems running in parallel. It might be impossible to use QueryBuilder
like this.
One approach that might work is to use the QueryState
you're already storing in the parameter state, and try to re-filter the matched_storage_ids
(and somehow put them back when you're done). That would also mean you only have to iterate archetypes/tables matching the overall filter, instead of having to check all of them as QueryBuilder
does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was concerned by this. I'm currently refactoring how this works to be more conservative with component IDs but once that is done I'd like to revisit how this works. Your suggestion might be the right way forward (for performance too probably!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, your plan is to preallocate 32 component ids to use as bits, and use those to encode a 32-bit index in the archetype, right? I think you could build something that stores a Vec<QueryState>
and does all of the archetype filtering in new_archetype
. That would give you no overhead at query execution time, and would let you get queries with &self
instead of &mut self
!
The idea would be to implement SystemParam::new_archetype()
manually by first doing the same filtering as any Query<D, (F, With<C>)>
, but then by calling archetype.contains
32 times to get the 32-bit index encoded in that archetype. Then you'd look up the QueryState
at that index in the Vec
, and tell only that one about the archetype. So each QueryState
in the Vec
would always have the full list of archetypes with the given index, and none with other indices!
You'd need a way to make new QueryState
s when you first see higher indices, but I think that's straightforward. Everything inside it is already Clone
except for D::State
and F::State
, and I think all actual WorldQuery
impls have State: Clone
, so you could just add a bound there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to try and do this now that I've got the combinatorial component ID indexing working. Seriously, thank you for explaining this!
Objective
Solution
Users can now call
app.add_index::<C>()
for an appropriate componentC
to have Bevy automatically track its value and provide a high-performance archetypical indexing solution. Users can then query by-value using theQueryByIndex
system parameter all entities with a specific indexed value. The value is provided within the body of a system, meaningQueryByIndex
can access any entity matching the provided compile-time query filters.This is achieved by adding
OnInsert
andOnReplace
observers and requiring that the index componentC
be immutable, ensuring those two hooks are able to capture all mutations. Since this is done with observers, users can index 3rd party components, required they are immutable and implement the traits required for the blanket implementation ofIndexComponent
.Within the observers, a private index resource is managed which tracks the component value against a runtime marker component. This runtime marker component is used to group entities by-value at the archetype level for the index component.
This grouping allows for maximum performance in iteration, since querying is simply filtering for a hidden component by-archetype.
To-Do / Follow-up
Testing
index
examplePerformance
I've added some
index
benchmarks to theecs
bench which test iteration speed (index_iter
) and mutation speed (index_update
). Running these benchmarks locally, I see a performance improvement by a factor of 2 when iterating using the index (where 1 in 16 are the target value) vs a naiveiter().filter(...)
. However there is a performance degradation by a factor of 4 in that same scenario.This confirms expectation: indexing is only a performance win in scenarios where a value is rarely updated but frequently searched for.
Release Notes
Users can now query by-value using indexes. To setup an index, first create an immutable component with
Clone
,Eq
, andHash
implementations:Next, request that an index be managed for this component:
Finally, you can query by-value using the new
QueryByIndex
system parameter:QueryByIndex
is just likeQuery
except it has a third generic argument,C
, which denotes the component you'd like to use as the index:Internally, the index will move entities with differing values into their own archetypes, allowing for fast iteration at the cost of some added overhead when modifying the indexed value. As such, indexes are best used where you have an infrequently changing component that you frequently need to access a subset of.