Reimplement batching #316

Merged
sebastienros merged 12 commits into dev from sebros/batching
Dec 31, 2020
Conversation

sebastienros
Owner

No description provided.

@sebastienros
Owner Author

Fixes #310

@sebastienros
Owner Author

@Piedone @deanmarcussen This should fix batching.
Some commands can't be batched yet because they need a result from earlier queries. It might still be doable with some more thought, but I went for the easy wins first, which should fix all the issues Dean found.

@sebastienros
Owner Author

sebastienros commented Dec 30, 2020

Found a way to add more commands to the batches, so I will continue with this.

@deanmarcussen
Collaborator

This looks exactly like what I thought it would look like. Great. And Sqlite too ;)

Here are some numbers. The remote numbers are misleading, because my internet connection to Azure this morning is significantly slower than yesterday.

            Before batching
            
            local
            Indexes 1, Elapsed 00:00:00.3004236
            Indexes 100, Elapsed 00:00:00.2375774
            Indexes 200, Elapsed 00:00:00.3583902
            Indexes 500, Elapsed 00:00:00.7695818
            Indexes 1000, Elapsed 00:00:01.2836934

            remote
            Indexes 1, Elapsed 00:00:00.7207663
            Indexes 100, Elapsed 00:00:03.3552247
            Indexes 200, Elapsed 00:00:05.5547927
            Indexes 500, Elapsed 00:00:13.8364514
            Indexes 1000, Elapsed 00:00:27.2306443

            After Batching

            local
            Indexes 1, Elapsed 00:00:00.2207824
            Indexes 100, Elapsed 00:00:00.0910920
            Indexes 200, Elapsed 00:00:00.1632908
            Indexes 500, Elapsed 00:00:00.4007200
            Indexes 1000, Elapsed 00:00:00.4559752

            remote

            Indexes 1, Elapsed 00:00:03.6326000
            Indexes 100, Elapsed 00:00:04.9639312
            Indexes 200, Elapsed 00:00:09.8273422
            Indexes 500, Elapsed 00:00:16.1340951
            Indexes 1000, Elapsed 00:00:15.2008296

These are all for dummy indexes which never perform any writes. They just cause deletes.

What I can see in the profiler is that everything that is easy to batch is batched together.
The big batch, with mostly deletes and a couple of inserts, is now one long-running command instead of many short ones. From the look of it, the insert is probably still expensive, which pushes up the run time.

But the network is really slow this morning, so it's better to look at the local numbers, where the time has dropped by roughly a half to two thirds.
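For reference, the relative drop can be recomputed from the local timings listed above. This is only a small arithmetic sketch: the helper names `parse_elapsed` and `reduction` are made up for illustration, and the figures are copied verbatim from the two local tables.

```python
def parse_elapsed(text: str) -> float:
    """Convert an 'hh:mm:ss.fffffff' elapsed string to seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def reduction(before: str, after: str) -> float:
    """Fraction of the original time that was saved."""
    return 1.0 - parse_elapsed(after) / parse_elapsed(before)


# Local timings from the tables above: index count -> (before, after).
local = {
    1: ("00:00:00.3004236", "00:00:00.2207824"),
    100: ("00:00:00.2375774", "00:00:00.0910920"),
    200: ("00:00:00.3583902", "00:00:00.1632908"),
    500: ("00:00:00.7695818", "00:00:00.4007200"),
    1000: ("00:00:01.2836934", "00:00:00.4559752"),
}

reductions = {count: reduction(b, a) for count, (b, a) in local.items()}
for count, saved in reductions.items():
    print(f"Indexes {count}: {saved:.0%} less time")
```

The single-index run saves less than the others, which is expected since there is little to batch in that case.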

Awesome

sebastienros and others added 2 commits December 30, 2020 12:29
Co-authored-by: Zoltán Lehóczky <zoltan.lehoczky@lombiq.com>
@Piedone
Contributor

Piedone commented Dec 30, 2020

I tried testing this with an existing Orchard Core app (version somewhere post-RC2) but it seems to be hopelessly incompatible so I'll just eagerly await this getting into Orchard :).

@sebastienros
Owner Author

There were a few breaking changes around strongly typed things since OC rc2.

@sebastienros
Owner Author

PR is ready to merge. It now batches everything, including inserts and updates, for documents and indexes.
The only cases it can't batch are when a concurrency check is requested, or when a reduced index generates multiple entries. In those two cases only these statements are not batched, while the rest still is.
So in 99% of the usages, a whole transaction should span a single network round trip, at least if all statements fit in a single batch.
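To illustrate the idea (this is not the code from this PR), here is a minimal Python sketch using the standard `sqlite3` module. The `CommandBatch` class, its `max_statements` limit, and the simplified table are all assumptions made for the example: statements are queued, then sent together, and a new batch is started only when the size limit is reached.

```python
import sqlite3


class CommandBatch:
    """Hypothetical collector: queues statements and sends them together."""

    def __init__(self, connection, max_statements=500):
        self.connection = connection
        self.max_statements = max_statements  # assumed per-batch limit
        self.pending = []
        self.round_trips = 0  # how many times we actually hit the database

    def add(self, statement):
        # Normalize the trailing semicolon so statements can be joined safely.
        self.pending.append(statement.rstrip().rstrip(";"))
        if len(self.pending) >= self.max_statements:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        script = ";\n".join(self.pending) + ";"
        self.connection.executescript(script)  # one call for the whole batch
        self.round_trips += 1
        self.pending.clear()


conn = sqlite3.connect(":memory:")
# Simplified stand-in for an index table; the real schema has more columns.
conn.execute(
    "create table ContentItemIndex (Id integer primary key, DocumentId integer)"
)

batch = CommandBatch(conn, max_statements=3)
for document_id in range(1, 6):
    # Integer literals are inlined only to keep the sketch short.
    batch.add(f"insert into ContentItemIndex (DocumentId) values ({document_id})")
batch.flush()

row_count = conn.execute("select count(*) from ContentItemIndex").fetchone()[0]
print(batch.round_trips, row_count)
```

With an in-memory SQLite database there is no real network cost, so `round_trips` only stands in for what would be a network round trip against a remote server.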

I created another PR that will allow us to get rid of most unnecessary delete statements in OC:
#318

@Piedone
Contributor

Piedone commented Dec 30, 2020

Does it make sense to test this or #318 with OC?

@sebastienros
Owner Author

I am merging both and we'll get a package on myget to test.

@sebastienros sebastienros merged commit 9a8f24f into dev Dec 31, 2020
@sebastienros sebastienros deleted the sebros/batching branch December 31, 2020 00:01
@sebastienros
Owner Author

I did a local setup with the myget feed and it works fine. When I create a blog post, here is the only batch that is sent:

insert into [Document] ([Id], [Type], [Content], [Version]) values (19, 'OrchardCore.ContentManagement.ContentItem, OrchardCore.ContentManagement.Abstractions', '{"ContentItemId":"4cpw0fnmjb1kp07dmzxx8n8ecg","ContentItemVersionId":"4953p18bj3gyy5yy82f7mj7w4y","ContentType":"BlogPost","DisplayText":"The title","Latest":true,"Published":false,"ModifiedUtc":"2020-12-31T00:31:34.3346095Z","PublishedUtc":null,"CreatedUtc":"2020-12-31T00:31:34.3346095Z","Owner":"48v9vt5vxznr5z9m1df9zmvjm8","Author":"admin","TitlePart":{"Title":"The title"},"AutoroutePart":{"Path":"blog/the-title","SetHomepage":false,"Disabled":false,"RouteContainedItems":false,"Absolute":false},"BlogPost":{"Subtitle":{"Text":"Subtitle"},"Image":{"Anchors":[],"Paths":[],"MediaTexts":[]},"Tags":{"TagNames":["Space"],"TaxonomyContentItemId":"4ykev5wxfcny7tvsahz9y64mwe","TermContentItemIds":["4nv0z7r24r1vw3sfpq7t6xws59"]},"Category":{"TaxonomyContentItemId":"4tpy2wv97bkbf0zkx8tyd1bm4q","TermContentItemIds":["4bsstr09f29rp0sgy85n9f07wj"]}},"MarkdownBodyPart":{"Markdown":"Some text"},"ContainedPart":{"ListContentItemId":"491emynv0kavbzhy40xmqv1wds","Order":0}}', 1);
insert into [ContentItemIndex] ([ContentItemId], [ContentItemVersionId], [Published], [Latest], [ContentType], [ModifiedUtc], [PublishedUtc], [CreatedUtc], [Owner], [Author], [DisplayText]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', '4953p18bj3gyy5yy82f7mj7w4y', 0, 1, 'BlogPost', '2020-12-31T00:31:34', '', '2020-12-31T00:31:34', '48v9vt5vxznr5z9m1df9zmvjm8', 'admin', 'The title') ; select last_insert_rowid() [Id];
update [ContentItemIndex] set [DocumentId] = 19 where [Id] = (last_insert_rowid());
insert into [ContainedPartIndex] ([ListContentItemId], [Order]) values ('491emynv0kavbzhy40xmqv1wds', 0) ; select last_insert_rowid() [Id];
update [ContainedPartIndex] set [DocumentId] = 19 where [Id] = (last_insert_rowid());
insert into [AutoroutePartIndex] ([ContentItemId], [Path], [Published], [Latest], [ContainedContentItemId], [JsonPath]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', 'blog/the-title', 0, 1, '', '') ; select last_insert_rowid() [Id];
update [AutoroutePartIndex] set [DocumentId] = 19 where [Id] = (last_insert_rowid());

Then when I update the blog post:

insert into [Document] ([Id], [Type], [Content], [Version]) values (20, 'OrchardCore.ContentManagement.ContentItem, OrchardCore.ContentManagement.Abstractions', '{"ContentItemId":"4cpw0fnmjb1kp07dmzxx8n8ecg","ContentItemVersionId":"467hsb597fzp72myhe70ka57dg","ContentType":"BlogPost","DisplayText":"The title","Latest":true,"Published":false,"ModifiedUtc":"2020-12-31T00:36:08.2985588Z","PublishedUtc":"2020-12-31T00:31:34.3459251Z","CreatedUtc":"2020-12-31T00:31:34.3346095Z","Owner":"48v9vt5vxznr5z9m1df9zmvjm8","Author":"admin","TitlePart":{"Title":"The title"},"AutoroutePart":{"Path":"blog/the-title","SetHomepage":false,"Disabled":false,"RouteContainedItems":false,"Absolute":false},"BlogPost":{"Subtitle":{"Text":"Subtitle"},"Image":{"Anchors":[],"Paths":[],"MediaTexts":[]},"Tags":{"TagNames":["Space"],"TaxonomyContentItemId":"4ykev5wxfcny7tvsahz9y64mwe","TermContentItemIds":["4nv0z7r24r1vw3sfpq7t6xws59"]},"Category":{"TaxonomyContentItemId":"4tpy2wv97bkbf0zkx8tyd1bm4q","TermContentItemIds":["4bsstr09f29rp0sgy85n9f07wj"]}},"MarkdownBodyPart":{"Markdown":"Some text"},"ContainedPart":{"ListContentItemId":"491emynv0kavbzhy40xmqv1wds","Order":0}}', 1);
delete from [ContentItemIndex] where [DocumentId] = 19;
delete from [AliasPartIndex] where [DocumentId] = 19;
delete from [LayerMetadataIndex] where [DocumentId] = 19;
delete from [ContainedPartIndex] where [DocumentId] = 19;
delete from [AutoroutePartIndex] where [DocumentId] = 19;
delete from [TaxonomyIndex] where [DocumentId] = 19;
insert into [ContentItemIndex] ([ContentItemId], [ContentItemVersionId], [Published], [Latest], [ContentType], [ModifiedUtc], [PublishedUtc], [CreatedUtc], [Owner], [Author], [DisplayText]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', '4953p18bj3gyy5yy82f7mj7w4y', 1, 0, 'BlogPost', '2020-12-31T00:31:34', '2020-12-31T00:31:34', '2020-12-31T00:31:34', '48v9vt5vxznr5z9m1df9zmvjm8', 'admin', 'The title') ; select last_insert_rowid() [Id];
update [ContentItemIndex] set [DocumentId] = 19 where [Id] = (last_insert_rowid());
insert into [ContainedPartIndex] ([ListContentItemId], [Order]) values ('491emynv0kavbzhy40xmqv1wds', 0) ; select last_insert_rowid() [Id];
update [ContainedPartIndex] set [DocumentId] = 19 where [Id] = (last_insert_rowid());
insert into [AutoroutePartIndex] ([ContentItemId], [Path], [Published], [Latest], [ContainedContentItemId], [JsonPath]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', 'blog/the-title', 1, 0, '', '') ; select last_insert_rowid() [Id];
update [AutoroutePartIndex] set [DocumentId] = 19 where [Id] = (last_insert_rowid());
insert into [TaxonomyIndex] ([TaxonomyContentItemId], [ContentItemId], [ContentType], [ContentPart], [ContentField], [TermContentItemId]) values ('4ykev5wxfcny7tvsahz9y64mwe', '4cpw0fnmjb1kp07dmzxx8n8ecg', 'BlogPost', 'BlogPost', 'Tags', '4nv0z7r24r1vw3sfpq7t6xws59') ; select last_insert_rowid() [Id];
update [TaxonomyIndex] set [DocumentId] = 19 where [Id] = (last_insert_rowid());
insert into [TaxonomyIndex] ([TaxonomyContentItemId], [ContentItemId], [ContentType], [ContentPart], [ContentField], [TermContentItemId]) values ('4tpy2wv97bkbf0zkx8tyd1bm4q', '4cpw0fnmjb1kp07dmzxx8n8ecg', 'BlogPost', 'BlogPost', 'Category', '4bsstr09f29rp0sgy85n9f07wj') ; select last_insert_rowid() [Id];
update [TaxonomyIndex] set [DocumentId] = 19 where [Id] = (last_insert_rowid());
update [Document] set [Content] = '{"ContentItemId":"4cpw0fnmjb1kp07dmzxx8n8ecg","ContentItemVersionId":"4953p18bj3gyy5yy82f7mj7w4y","ContentType":"BlogPost","DisplayText":"The title","Latest":false,"Published":true,"ModifiedUtc":"2020-12-31T00:31:34.3346095Z","PublishedUtc":"2020-12-31T00:31:34.3459251Z","CreatedUtc":"2020-12-31T00:31:34.3346095Z","Owner":"48v9vt5vxznr5z9m1df9zmvjm8","Author":"admin","TitlePart":{"Title":"The title"},"AutoroutePart":{"Path":"blog/the-title","SetHomepage":false,"Disabled":false,"RouteContainedItems":false,"Absolute":false},"BlogPost":{"Subtitle":{"Text":"Subtitle"},"Image":{"Anchors":[],"Paths":[],"MediaTexts":[]},"Tags":{"TagNames":["Space"],"TaxonomyContentItemId":"4ykev5wxfcny7tvsahz9y64mwe","TermContentItemIds":["4nv0z7r24r1vw3sfpq7t6xws59"]},"Category":{"TaxonomyContentItemId":"4tpy2wv97bkbf0zkx8tyd1bm4q","TermContentItemIds":["4bsstr09f29rp0sgy85n9f07wj"]}},"MarkdownBodyPart":{"Markdown":"Some text"},"ContainedPart":{"ListContentItemId":"491emynv0kavbzhy40xmqv1wds","Order":0}}', [Version] = 1 where [Id] = 19;
insert into [ContentItemIndex] ([ContentItemId], [ContentItemVersionId], [Published], [Latest], [ContentType], [ModifiedUtc], [PublishedUtc], [CreatedUtc], [Owner], [Author], [DisplayText]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', '467hsb597fzp72myhe70ka57dg', 0, 1, 'BlogPost', '2020-12-31T00:36:08', '2020-12-31T00:31:34', '2020-12-31T00:31:34', '48v9vt5vxznr5z9m1df9zmvjm8', 'admin', 'The title') ; select last_insert_rowid() [Id];
update [ContentItemIndex] set [DocumentId] = 20 where [Id] = (last_insert_rowid());
insert into [ContainedPartIndex] ([ListContentItemId], [Order]) values ('491emynv0kavbzhy40xmqv1wds', 0) ; select last_insert_rowid() [Id];
update [ContainedPartIndex] set [DocumentId] = 20 where [Id] = (last_insert_rowid());
insert into [AutoroutePartIndex] ([ContentItemId], [Path], [Published], [Latest], [ContainedContentItemId], [JsonPath]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', 'blog/the-title', 0, 1, '', '') ; select last_insert_rowid() [Id];
update [AutoroutePartIndex] set [DocumentId] = 20 where [Id] = (last_insert_rowid());

Though I think there is something to optimize, as I see an update right after an insert on the same record.
I haven't applied the new Filter option anywhere yet.
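As a rough check of why the insert followed by an update on the same record looks redundant, the sketch below runs both shapes against in-memory SQLite and compares the resulting rows. The table definition is an assumption (only the columns used in the dump, with guessed types), and `make_db` is a helper invented for the example.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    # Assumed, simplified schema for illustration only.
    conn.execute(
        "create table ContainedPartIndex ("
        "Id integer primary key autoincrement, "
        "ListContentItemId text, [Order] integer, DocumentId integer)"
    )
    return conn


# Shape seen in the dump above: insert, then patch DocumentId on the new row.
two_step = make_db()
two_step.executescript(
    "insert into ContainedPartIndex (ListContentItemId, [Order]) "
    "values ('491emynv0kavbzhy40xmqv1wds', 0);"
    "update ContainedPartIndex set DocumentId = 19 "
    "where Id = (last_insert_rowid());"
)

# Merged shape: DocumentId is supplied directly in the insert.
one_step = make_db()
one_step.executescript(
    "insert into ContainedPartIndex (ListContentItemId, [Order], DocumentId) "
    "values ('491emynv0kavbzhy40xmqv1wds', 0, 19);"
)

query = "select ListContentItemId, [Order], DocumentId from ContainedPartIndex"
rows_two_step = two_step.execute(query).fetchall()
rows_one_step = one_step.execute(query).fetchall()
print(rows_two_step == rows_one_step)
```

Both variants leave the same row behind, so when the document id is already known at insert time the second statement can be dropped.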

@sebastienros
Owner Author

btw I use the mini profiler to see this. I update the front-end and it dumps all the queries that happened in the admin. I don't remember if we made an option to enable the mini profiler in the admin too.

@sebastienros
Owner Author

My change on the filters is not working, it's actually breaking things, so don't try to use it for now.

@sebastienros
Owner Author

After new optimizations to merge the insert+update on the indexes:

insert into [Document] ([Id], [Type], [Content], [Version]) values (23, 'OrchardCore.ContentManagement.ContentItem, OrchardCore.ContentManagement.Abstractions', '{"ContentItemId":"4cpw0fnmjb1kp07dmzxx8n8ecg","ContentItemVersionId":"4ypdrxm7xbndr0dvcpwaraa95g","ContentType":"BlogPost","DisplayText":"The title","Latest":true,"Published":false,"ModifiedUtc":"2020-12-31T05:55:56.8113646Z","PublishedUtc":"2020-12-31T01:22:37.7926461Z","CreatedUtc":"2020-12-31T00:31:34.3346095Z","Owner":"48v9vt5vxznr5z9m1df9zmvjm8","Author":"admin","TitlePart":{"Title":"The title"},"AutoroutePart":{"Path":"blog/the-title","SetHomepage":false,"Disabled":false,"RouteContainedItems":false,"Absolute":false},"BlogPost":{"Subtitle":{"Text":"Subtitle"},"Image":{"Anchors":[],"Paths":[],"MediaTexts":[]},"Tags":{"TagNames":["Space"],"TaxonomyContentItemId":"4ykev5wxfcny7tvsahz9y64mwe","TermContentItemIds":["4nv0z7r24r1vw3sfpq7t6xws59"]},"Category":{"TaxonomyContentItemId":"4tpy2wv97bkbf0zkx8tyd1bm4q","TermContentItemIds":["4bsstr09f29rp0sgy85n9f07wj"]}},"MarkdownBodyPart":{"Markdown":"Some text"},"ContainedPart":{"ListContentItemId":"491emynv0kavbzhy40xmqv1wds","Order":0}}', 1);
delete from [ContentItemIndex] where [DocumentId] = 22;
delete from [AliasPartIndex] where [DocumentId] = 22;
delete from [LayerMetadataIndex] where [DocumentId] = 22;
delete from [ContainedPartIndex] where [DocumentId] = 22;
delete from [AutoroutePartIndex] where [DocumentId] = 22;
delete from [TaxonomyIndex] where [DocumentId] = 22;
insert into [ContentItemIndex] ([ContentItemId], [ContentItemVersionId], [Published], [Latest], [ContentType], [ModifiedUtc], [PublishedUtc], [CreatedUtc], [Owner], [Author], [DisplayText], [DocumentId]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', '4q3271jp1705etwnf52c0nnbwz', 1, 0, 'BlogPost', '2020-12-31T01:22:37', '2020-12-31T01:22:37', '2020-12-31T00:31:34', '48v9vt5vxznr5z9m1df9zmvjm8', 'admin', 'The title', 22) ; select last_insert_rowid() [Id];
insert into [ContainedPartIndex] ([ListContentItemId], [Order], [DocumentId]) values ('491emynv0kavbzhy40xmqv1wds', 0, 22) ; select last_insert_rowid() [Id];
insert into [AutoroutePartIndex] ([ContentItemId], [Path], [Published], [Latest], [ContainedContentItemId], [JsonPath], [DocumentId]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', 'blog/the-title', 1, 0, '', '', 22) ; select last_insert_rowid() [Id];
insert into [TaxonomyIndex] ([TaxonomyContentItemId], [ContentItemId], [ContentType], [ContentPart], [ContentField], [TermContentItemId], [DocumentId]) values ('4ykev5wxfcny7tvsahz9y64mwe', '4cpw0fnmjb1kp07dmzxx8n8ecg', 'BlogPost', 'BlogPost', 'Tags', '4nv0z7r24r1vw3sfpq7t6xws59', 22) ; select last_insert_rowid() [Id];
insert into [TaxonomyIndex] ([TaxonomyContentItemId], [ContentItemId], [ContentType], [ContentPart], [ContentField], [TermContentItemId], [DocumentId]) values ('4tpy2wv97bkbf0zkx8tyd1bm4q', '4cpw0fnmjb1kp07dmzxx8n8ecg', 'BlogPost', 'BlogPost', 'Category', '4bsstr09f29rp0sgy85n9f07wj', 22) ; select last_insert_rowid() [Id];
update [Document] set [Content] = '{"ContentItemId":"4cpw0fnmjb1kp07dmzxx8n8ecg","ContentItemVersionId":"4q3271jp1705etwnf52c0nnbwz","ContentType":"BlogPost","DisplayText":"The title","Latest":false,"Published":true,"ModifiedUtc":"2020-12-31T01:22:37.6390174Z","PublishedUtc":"2020-12-31T01:22:37.7926461Z","CreatedUtc":"2020-12-31T00:31:34.3346095Z","Owner":"48v9vt5vxznr5z9m1df9zmvjm8","Author":"admin","TitlePart":{"Title":"The title"},"AutoroutePart":{"Path":"blog/the-title","SetHomepage":false,"Disabled":false,"RouteContainedItems":false,"Absolute":false},"BlogPost":{"Subtitle":{"Text":"Subtitle"},"Image":{"Anchors":[],"Paths":[],"MediaTexts":[]},"Tags":{"TagNames":["Space"],"TaxonomyContentItemId":"4ykev5wxfcny7tvsahz9y64mwe","TermContentItemIds":["4nv0z7r24r1vw3sfpq7t6xws59"]},"Category":{"TaxonomyContentItemId":"4tpy2wv97bkbf0zkx8tyd1bm4q","TermContentItemIds":["4bsstr09f29rp0sgy85n9f07wj"]}},"MarkdownBodyPart":{"Markdown":"Some text"},"ContainedPart":{"ListContentItemId":"491emynv0kavbzhy40xmqv1wds","Order":0}}', [Version] = 1 where [Id] = 22;
insert into [ContentItemIndex] ([ContentItemId], [ContentItemVersionId], [Published], [Latest], [ContentType], [ModifiedUtc], [PublishedUtc], [CreatedUtc], [Owner], [Author], [DisplayText], [DocumentId]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', '4ypdrxm7xbndr0dvcpwaraa95g', 0, 1, 'BlogPost', '2020-12-31T05:55:56', '2020-12-31T01:22:37', '2020-12-31T00:31:34', '48v9vt5vxznr5z9m1df9zmvjm8', 'admin', 'The title', 23) ; select last_insert_rowid() [Id];
insert into [ContainedPartIndex] ([ListContentItemId], [Order], [DocumentId]) values ('491emynv0kavbzhy40xmqv1wds', 0, 23) ; select last_insert_rowid() [Id];
insert into [AutoroutePartIndex] ([ContentItemId], [Path], [Published], [Latest], [ContainedContentItemId], [JsonPath], [DocumentId]) values ('4cpw0fnmjb1kp07dmzxx8n8ecg', 'blog/the-title', 0, 1, '', '', 23) ; select last_insert_rowid() [Id];

@sebastienros
Owner Author

sebastienros commented Dec 31, 2020

After adding the filters on the indexes:

delete from [ContentItemIndex] where [DocumentId] = 23;
delete from [ContainedPartIndex] where [DocumentId] = 23;
delete from [AutoroutePartIndex] where [DocumentId] = 23;
delete from [TaxonomyIndex] where [DocumentId] = 23;

The Layer and Alias deletes are gone since those parts are not on the blog post.

Example:

context.For<AliasPartIndex>()
    .When(c => c.Has<AliasPart>())
    .Map(contentItem =>
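
The effect of such a filter can be sketched outside of C# as well. The Python below is only an analogy, not the YesSql API: `IndexDescriptor`, `has_part` and `delete_statements` are hypothetical names, and which indexes carry a filter (and on which part) is assumed for the sake of the example. The point is that a delete is emitted only for indexes whose predicate matches the document being saved.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IndexDescriptor:
    table: str
    # Optional predicate; None means the index applies to every document.
    when: Optional[Callable[[dict], bool]] = None


def has_part(name: str) -> Callable[[dict], bool]:
    """Predicate that is true when the content item carries the named part."""
    return lambda content_item: name in content_item


# Assumed registrations, in the order the deletes appear in the dumps above.
descriptors = [
    IndexDescriptor("ContentItemIndex"),
    IndexDescriptor("AliasPartIndex", has_part("AliasPart")),
    IndexDescriptor("LayerMetadataIndex", has_part("LayerMetadata")),
    IndexDescriptor("ContainedPartIndex", has_part("ContainedPart")),
    IndexDescriptor("AutoroutePartIndex", has_part("AutoroutePart")),
    IndexDescriptor("TaxonomyIndex"),
]


def delete_statements(content_item: dict, document_id: int) -> list:
    """Emit deletes only for indexes that can apply to this content item."""
    return [
        f"delete from [{d.table}] where [DocumentId] = {document_id};"
        for d in descriptors
        if d.when is None or d.when(content_item)
    ]


# Parts present on the blog post from the dumps above (contents omitted).
blog_post = {
    "TitlePart": {},
    "AutoroutePart": {},
    "BlogPost": {},
    "MarkdownBodyPart": {},
    "ContainedPart": {},
}

statements = delete_statements(blog_post, 23)
print("\n".join(statements))
```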

@deanmarcussen
Collaborator

deanmarcussen commented Dec 31, 2020

Between 50% and 75% faster in OC under SQL Server 👍

Depends on the type of test

And that's without the When

> I don't remember if we made an option to enable the mini profiler in the admin too.

Yes @Piedone added an option for this.
