-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deletes for all indices when publishing draft #7949
Comments
This is actually how YesSql works, by design. Suggest you open an issue there if you want to suggest changes to that design. One thing you can do (and we're only just starting to make the changes to the indexes in OC to support this), to improve performance is to include the Refer #7859 for some recent discussion on this.
This bit needs to be pinpointed |
Thanks for your quick reply, Dean. Can you point me to something that sheds some light on why would this be by design? Deleting all indices when a given document is deleted I of course understand. I can also understand why one would implement this as a delete issued for all indices of the given document type though I think that can be optimized by checking whether that particular item actually has a given index, as I've explained above (possibly that can't be bulletproof though and it's better to just have all the deletes). However, I don't see why issuing these deletes one by one would make anything better. As apparent, these queries can add up and make all write operations slower by a lot; simply batching these up (meaning concatenating all |
Seb will describe the why, but when you look at how YesSql operates, it is intended to clear all indexes when saving. Otherwise it would have to make a query to know what is in these indexes, to do any comparison. Otherwise how would it know what might have changed?
I'm not sure you're actiually getting roundtrips on this. I think they're just executes. But opening an issue on YesSql is the way to go here. Now, what you could do since you have so many indexes (and a hint on the time it takes to process), and this would help me out with the evaluation of what indices are useful on these index tables. Try adding a non clustered index with the Something like
And let me know if that changes the performance at all? I will also comment that we find document saves slow as well, so whatever we can do to improve it is good. |
That's the "probing" I mentioned but as I said later, I can accept that being error-prone. However, I don't see why these need to be individual queries; Mini Profiler shows them as such too, and it's apparent from the query time requirements that the network roundtrip to the DB server's penalty is incurred every time. Without the roundtrip, the queries would be fine. The I've opened a YesSql issue: sebastienros/yessql#310 However, I'd keep this open as I don't think these queries should be issued twice for a content item save. Should I find the cause of why we sometimes see even more duplication for these deletions I'll post an update here. |
Related to #7859 and #7757 where we discussed about unclustered indexes, here some parts of the discussion
So it's related to the
Maybe i will start a PR to update these indexes. |
Just checking that you are sure of that @Piedone . I'm quite interested in the perf difference if they don't. Particularly as you have so many index tables, so it would be a great indicator of how much difference it would make. On the YesSql issue you said it was the primary key being checked. It's not, on an index table, there is an Id, which is the primary key, and the DocumentId which is not a primary key, and not indexed by default. So you would have had to make those indexes for them yourself.
|
You're right, it's not a primary key indeed, I mixed it up, so scratch my comments about indices. We'll try SQL indices. |
I tried out with tables in our system (~3000 rows at most) and I didn't measure any significant performance advantage with an SQL index. I tried this in Azure SQL so maybe any difference is masked by the variance of the service's performance. I guess around 3000 rows is too small for an index to really have an effect. At this scale, the whole operation's performance is dominated by the base latency of the individual queries (so again, batching them up would pretty much eliminate the issue). |
@deanmarcussen @Piedone I already started something locally 2 days ago, will open a PR soon @deanmarcussen i will do it step by step as we wil need to review carefully some interleaved migrations versions. |
@jtkech for infos. I did a lot of performance tests against YesSql while we figured this out. With 1,000 items in the Duration for deletes is a lot lower now due to all the batching improvements, but still worth doing I reckon. |
Cool !!!
Do you mean because of some recent changes on yesSql side?
Yes, anyway |
Yes all commands now batched into one sql execution (so no more round trips). |
Cool 👍 |
This might be more of a YesSql thing.
When you publish a content item's draft version then SQL
DELETE
s will be issued for the given document for allContentItem
indices, even if the item has nothing to do with it (likeLayerMetadataIndex
for a Page). You can see this e.g. when publishing an Article with a vanilla Blog recipe setup. Actually, if you change something on the item and then publish it then there will be aDELETE
twice for all indices.This is OK if you have a couple of index tables but this can add up quickly, especially with the mentioned duplication (in our custom app I've also seen the same index tables getting a
DELETE
not just twice but something three or even five times but I haven't yet pinpointed why). In our app we have about 60 indices thus this causes a significant performance penalty: Saving a Page (the same content type as in the Blog recipe) takes about half a second on my machine with a local DB, in production >5s.Unless this is a bug in how Orchard uses YesSql then probably a good approach would be to "probe" the document somehow before deletion to see how many index providers are contributing to it and only issuing a
DELETE
for those that are affected. Or, at least bundle all the queries into a single SQL request so the latency of the server roundtrip is not incurred multiple times (as the queries themselves are quite trivial, especially for tables where there's no matching row, I'd guess this would be faster than doing theDELETE
s one by one just for the affected indices.I guess there's a similar issue with direct content item deletions too.
Somewhat related: #5821
The text was updated successfully, but these errors were encountered: