mkolumb/EFCore.Extensions.SaveOptimizer
EFCore.Extensions.SaveOptimizer

Save optimizer extension for EF Core.
It supports multiple EF Core providers and brings serious performance benefits in many scenarios with little effort.

Currently in BETA


| Package | NuGet |
| --- | --- |
| EFCore.Extensions.SaveOptimizer | NuGet SaveOptimizer |
| EFCore.Extensions.SaveOptimizer.Dapper | NuGet SaveOptimizer.Dapper |

Why another library for batch save?

Short version

Because I needed one

Long version

The idea came from one of my commercial projects.

We were working with CockroachDB (an excellent database for multi-region environments) and initially used Entity Framework Core 3.1. It worked fine, but in a multi-region configuration save performance was not so good.

The reason? As many people know, EF generates many single-row INSERT / UPDATE / DELETE statements instead of a smaller number of multi-row statements.

I considered a few solutions (e.g. EFCore.BulkExtensions, linq2db), but all of them had some disadvantages at the time. We needed something with support for CockroachDB and concurrency tokens. We also wanted to avoid rewriting business logic; more code changes than replacing the SaveChanges call could have been a big problem. As there was no good choice, I wrote something simple from scratch. For SQL statement generation I used the excellent SqlKata library. Now I have decided to rewrite the whole solution as something more efficient and better integrated into EF.

EFCore.BulkExtensions is a good choice for most cases, as it has a BulkSaveChanges method and currently supports a few different databases (SQL Server, PostgreSQL and SQLite). Unfortunately, it looks like the API they use for PostgreSQL is incompatible with CockroachDB.

The main difference is the approach - EFCore.BulkExtensions uses copy tools / statements under the hood, while SaveOptimizer uses batched INSERT / UPDATE / DELETE statements. You need to choose which is better in your case. Copy will likely be faster in most cases (it's good to measure this), but the SaveOptimizer approach supports more databases. SaveOptimizer is a simple library, so you should verify it against your requirements.

How to use

Just replace SaveChanges() / SaveChangesAsync() :)

Optimized

using EFCore.Extensions.SaveOptimizer;

await using var transaction = await context.Database.BeginTransactionAsync(IsolationLevel.Serializable, cancellationToken);
await context.AddAsync(entity);
await context.SaveChangesOptimizedAsync();
await transaction.CommitAsync(cancellationToken);

Optimized Dapper

using EFCore.Extensions.SaveOptimizer.Dapper;

await using var transaction = await context.Database.BeginTransactionAsync(IsolationLevel.Serializable, cancellationToken);
await context.AddAsync(entity);
await context.SaveChangesDapperOptimizedAsync();
await transaction.CommitAsync(cancellationToken);

How it works

When you execute SaveChangesOptimized, the following sequence usually happens:

  1. Get entries from ChangeTracker
  2. Build property changes dictionary for each entry
  3. Group changes as much as possible
  4. Generate SQL
  5. Execute
  6. Accept changes

Please note it does not work exactly like SaveChanges, so you should verify that it behaves as expected in your case.
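As an illustration of steps 3-4: assuming a hypothetical Items table, three tracked inserts touching the same columns can be grouped into one multi-row statement (the exact SQL, quoting and parameter naming depend on the provider):

```sql
-- What EF Core would normally send (three single-row statements):
INSERT INTO "Items" ("Id", "Name") VALUES (@p0, @p1);
INSERT INTO "Items" ("Id", "Name") VALUES (@p2, @p3);
INSERT INTO "Items" ("Id", "Name") VALUES (@p4, @p5);

-- What the batched approach generates instead (one multi-row statement):
INSERT INTO "Items" ("Id", "Name")
VALUES (@p0, @p1), (@p2, @p3), (@p4, @p5);
```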

Features

To do

  • Optimize data retrieval from Change Tracker
  • Add unchecked when possible
  • Low level performance optimizations
  • Contributing document
  • Validation before save
  • Build query builder tests from scratch

Limitations

Refresh data after save

The SaveOptimizer approach makes it almost impossible to refresh data after save; that is on your side. If you use an auto-increment primary key, the generated key will not be retrieved from the database. I recommend generating primary key values in code, not in the database. This makes it much easier to refresh data after save if necessary, as you can use these values in a query. Also, DatabaseValues for an entry will not be retrieved from the database when a DBConcurrencyException is thrown.

Basically - after save you should not use this context anymore, as it could be invalid; use a new context for the next operation. However, if you need to, you can experiment with AfterSaveBehavior.

If you start the transaction on your own, I recommend using AfterSaveBehavior.DoNothing in the configuration and, after a successful commit, calling the ProcessAfterSave method on the result with the parameter you want.

Relationships

This limitation is connected with the previous one. If you want to add an entity and some dependent entities within one transaction, you can't use an auto-increment primary key. This will only work with primary key values generated in your code.
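Client-generated keys make this scenario work; a minimal sketch, assuming hypothetical Order / OrderLine entities with Guid keys:

```csharp
// Keys are generated in code (Guid.NewGuid()), so the dependent row can
// reference the parent's key without a round trip to the database.
var order = new Order { Id = Guid.NewGuid(), Number = "A-1" };
var line = new OrderLine { Id = Guid.NewGuid(), OrderId = order.Id, Quantity = 2 };

context.Add(order);
context.Add(line);
await context.SaveChangesOptimizedAsync();
```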

Known issues

Oracle serializable transaction

It looks like serializable transactions produce many errors during execution, especially during inserts (e.g. ORA-08177 & ORA-06512). This is something to investigate; maybe it is a dockerized Oracle Express issue. I don't recommend using this library with Oracle in a production environment without thorough testing. Sometimes decreasing the batch size for inserts can help.

Firebird provider

This is not a SaveOptimizer issue; however, I experienced some problems with the Firebird provider. It looks like the model builder sometimes builds a different model than with other providers.

| Issue | Workaround |
| --- | --- |
| Precision lost for decimal column | Use HasColumnType("DECIMAL(PRECISION,SCALE)") |
| Auto increment column not created | Use HasAnnotation(FbAnnotationNames.ValueGenerationStrategy, FbValueGenerationStrategy.IdentityColumn) |
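Both workarounds can be applied in OnModelCreating; a sketch assuming a hypothetical Product entity (the using directives for the Fb* types depend on the Firebird provider version):

```csharp
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // Workaround for lost decimal precision: pin the column type explicitly.
    modelBuilder.Entity<Product>()
        .Property(p => p.Price)
        .HasColumnType("DECIMAL(18,2)");

    // Workaround for the missing auto-increment column: force an identity column.
    modelBuilder.Entity<Product>()
        .Property(p => p.Id)
        .HasAnnotation(FbAnnotationNames.ValueGenerationStrategy,
            FbValueGenerationStrategy.IdentityColumn);
}
```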

Q&A

  1. Why did you write a query builder instead of using SqlKata?
    • The reason is performance. I noticed a few issues and created a pull request. When I started working on this library, my pull request had been in review without a response for a few months. Initially I decided to create a .NET 6 fork, and the performance improvements were really great. Later I realized I don't need most of SqlKata's features, so it's better to write something optimized from scratch - this brings a performance & freedom boost. Currently there is a reference to SqlKata in the test project, just to verify my builder. In the future I plan to drop this dependency and write the builder tests from scratch.
  2. Why did you write a query executor?
    • I noticed a bug with ExecuteSqlRaw from RelationalExtensions. It looks like it cuts precision for decimals. So I created something lightweight using some EF Core features.
  3. What is the purpose of the EFCore.Extensions.SaveOptimizer.Dapper package?
    • The default execution uses the executor described in the previous question. Someone could prefer execution using Dapper. You can compare performance in your case; from my experience the results are usually similar.
  4. Which EF Core versions do you support?
    • I plan to support only the current release and the latest LTS version. As there is only one required dependency (Microsoft.EntityFrameworkCore.Relational), you should be able to quickly prepare a version for older EF if needed. Maybe some small changes in DataContextModelWrapper would be required.

Configuration

| Parameter | Description | Provider default value |
| --- | --- | --- |
| Batch size | Default batch size | All - 1000 |
| Insert batch size | When defined, overrides batch size for insert operations | Firebird - 500; Oracle - 50; Other - NULL |
| Update batch size | When defined, overrides batch size for update operations | All - NULL |
| Delete batch size | When defined, overrides batch size for delete operations | All - NULL |
| Parameters limit | Limits parameters per statement; when exceeded, batch size is decreased for the operation to prevent exceptions | SqlServer - 1024; Firebird - 2048; SqLite - 512; Postgres - 31768; Other - 15384 |
| Concurrency token behavior | When a concurrency token is defined for an entity, it is included in update / delete statements. When the flag is set to throw exception, an exception is thrown when statements affect fewer / more rows than expected | All - throw exception |
| Auto transaction enabled | If enabled, a transaction is started when no transaction is attached to the DbContext | All - true |
| After save behavior | Behavior after a successful save; possible values: ClearChanges, AcceptChanges, DetachSaved, MarkTemporaryAsPermanent, DoNothing | All - DoNothing |
| Auto transaction isolation level | Isolation level for the auto transaction | All - serializable |
| Builder configuration -> case type | Case type used when building statements; if normal, case is not changed to upper / lower | All - normal |
| Builder configuration -> optimize parameters | Optimizes parameter usage in statements; can sometimes lead to unexpected exceptions in the db | All - true |

Migration command

powershell -Command ".\add_migration.ps1 -name [NAME]"

or, from PowerShell:

$name = "[NAME]"
.\add_migration.ps1 -name $name

Benchmarks

Summary

SaveOptimizer is not always better than pure EF Core methods. EF Core contains tons of optimizations, so for small operations and simple workloads it is likely better.

My advice is to compare results in your real environment. Honestly - the best choice for pure performance is to leave EF Core out of write operations entirely and write the statements from scratch for your scenarios. But this library can bring serious performance benefits in many scenarios without much effort.

All benchmarks use the serializable isolation level and run databases within containers. Please note I used free versions (e.g. Oracle Express); results on commercial versions could be much better.

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1826 (21H2)
12th Gen Intel Core i9-12900HK, 1 CPU, 20 logical and 14 physical cores
.NET SDK=6.0.302
  [Host]     : .NET 6.0.7 (6.0.722.32202), X64 RyuJIT

EvaluateOverhead=True  OutlierMode=RemoveUpper
InvocationCount=1  IterationCount=20  LaunchCount=3
RunStrategy=ColdStart  UnrollFactor=1  WarmupCount=0

Running

powershell -File run_benchmarks.ps1

or, from PowerShell:

.\run_benchmarks.ps1

Results

Legend

  • Optimized - Using EFCore.Extensions.SaveOptimizer package
  • Optimized Dapper - Using EFCore.Extensions.SaveOptimizer.Dapper package
  • EF Core - Using pure EF Core

CockroachDB - single docker node

CockroachDB - single docker node INSERT

CockroachDB - single docker node UPDATE

CockroachDB - single docker node DELETE

CockroachDB - nine docker nodes

CockroachDB - nine docker nodes INSERT

CockroachDB - nine docker nodes UPDATE

CockroachDB - nine docker nodes DELETE

SQLite

SQLite INSERT

SQLite UPDATE

SQLite DELETE

SQL Server

SQL Server INSERT

SQL Server UPDATE

SQL Server DELETE

Oracle Express 21

Oracle Express 21 INSERT

Oracle Express 21 UPDATE

Oracle Express 21 DELETE

MySQL

MySQL INSERT

MySQL UPDATE

MySQL DELETE

MariaDB

MariaDB INSERT

MariaDB UPDATE

MariaDB DELETE

PostgreSQL

PostgreSQL INSERT

PostgreSQL UPDATE

PostgreSQL DELETE

Firebird 3

Firebird 3 INSERT

Firebird 3 UPDATE

Firebird 3 DELETE

Firebird 4

Firebird 4 INSERT

Firebird 4 UPDATE

Firebird 4 DELETE

Buy Me A Coffee

"Buy Me A Coffee"