Concurrent write causes huge file size #11
What does your table/index layout look like? Go into your log directory and dump the log files to see how many split records you are getting; my guess is that page splits are what is bloating the file. The easy fixes only apply to newer databases, since you can't change the density on an existing table/index, only on newly created ones. You can set the density to 80 or 90%, and you should also try playing around with the database page size (the default is 4k; up to 32k is now supported. XP only supported 8k; I forget when 32k page support was added -- Win7?). Also, if you have an auto-inc column, do you happen to read it via …? I also had trouble with a multi-threaded program causing lots of splits, and the above did help somewhat, but didn't fix it entirely. Maybe @michaelthorp has some more ideas for us? :)
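To make that concrete, assuming the application uses ManagedEsent (Microsoft.Isam.Esent.Interop): the page size is a process-wide parameter that must be chosen before any instance is initialized, while density is an argument of table/index creation and therefore only affects newly created B-trees. The names and values below are illustrative, not from the application in question.

```csharp
using Microsoft.Isam.Esent.Interop;

class TuningSketch
{
    public static void ConfigurePageSize()
    {
        // Process-wide; must be set before the first Instance is created/initialized,
        // and it only applies to databases created afterwards (default is 4k).
        SystemParameters.DatabasePageSize = 8 * 1024;
    }

    public static void CreateTableWithDensity(JET_SESID sesid, JET_DBID dbid, string name)
    {
        // The fifth argument is the B-tree density (20..100). A lower value leaves
        // free space on each page, which can reduce splits from concurrent inserts.
        JET_TABLEID tableid;
        Api.JetCreateTable(sesid, dbid, name, 16, 80, out tableid);
        Api.JetCloseTable(sesid, tableid);
    }
}
```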
Hi @machish, thanks for your reply, it helped me understand some of the ESENT configuration options. To answer your questions:
By default I have a different table for each byte array, with one row and one column.
Given the above configuration I basically have no indexes, except for the table name.
I don't. I also have a different configuration where I have a single table and one row for each item; in this case there are two columns: the item key and the byte array itself.
You are right. I repeated the test described in my previous comment and found a big difference in the Split2 records between the fully parallel and the locked scenario (4 vs. 2000+ records), but only in my default configuration (the one with many tables); in the other case (one table, many rows) I found the opposite: 36 Split2 records for the fully parallel run and 2000+ for the locked one. In my test I ran 200 iterations, writing 1000 items at each iteration. I obtained the smallest file size with the second configuration (one table, many rows) and the lock strategy (~40 MB), while in the many-tables case I got a file of ~200 MB (again with the lock). Running fully parallel I get files of 4+ GB in both cases. Reducing the density down to 50 didn't help, and increasing the page size resulted in bigger files. I ran my tests on Win10, and the application is not expected to run on Windows versions older than 8 / Server 2012.
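For reference, the "one table, many rows" layout described above could be created roughly as follows with ManagedEsent; the table, column, and index names, as well as the 80% density, are illustrative assumptions rather than the actual schema.

```csharp
using Microsoft.Isam.Esent.Interop;

class SingleTableSchema
{
    // sesid/dbid come from an already-initialized instance, session, and database.
    public static void Create(JET_SESID sesid, JET_DBID dbid)
    {
        using (var tx = new Transaction(sesid))
        {
            JET_TABLEID tableid;
            Api.JetCreateTable(sesid, dbid, "items", 16, 80, out tableid);   // 80% density

            JET_COLUMNID idColumn, dataColumn;
            Api.JetAddColumn(sesid, tableid, "id",
                new JET_COLUMNDEF { coltyp = JET_coltyp.Long, grbit = ColumndefGrbit.ColumnNotNULL },
                null, 0, out idColumn);
            Api.JetAddColumn(sesid, tableid, "data",
                new JET_COLUMNDEF { coltyp = JET_coltyp.LongBinary },
                null, 0, out dataColumn);

            // Primary (clustered) index on the item id; the key string is double-null terminated.
            const string key = "+id\0\0";
            Api.JetCreateIndex(sesid, tableid, "primary",
                CreateIndexGrbit.IndexPrimary, key, key.Length, 80);

            Api.JetCloseTable(sesid, tableid);
            tx.Commit(CommitTransactionGrbit.LazyFlush);
        }
    }
}
```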
@3Dmondo - how do you do the writes? Do you do an InsertCopyDeleteOriginal on the byte array when updating it, by any chance? We did have an issue (fixed in Windows 10 19H1, I believe) where, under certain conditions, an InsertCopyDeleteOriginal did not delete the original copy and it would get orphaned. You can check for this by taking one of your databases (one of the bloated ones in production) and running 'esentutl /g <database>'. Note: the database needs to be in a Clean Shutdown state; if you did a dirty term you'll need to run recovery first ('esentutl /r <log base name>'). After running esentutl /g there will be an INTEG.RAW file in that directory, which you can check for anything like "WARNING: orphaned LV".
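For anyone reading along, the two write styles being asked about differ only in the JET_prep value passed to the update; the ManagedEsent helpers below are an illustrative sketch, with placeholder method names, and assume the cursor is already positioned on the row to change.

```csharp
using Microsoft.Isam.Esent.Interop;

class UpdateStyles
{
    // Replaces the row in place; no copy of the original is left behind.
    public static void ReplaceInPlace(JET_SESID sesid, JET_TABLEID tableid,
                                      JET_COLUMNID dataColumn, byte[] newBytes)
    {
        using (var update = new Update(sesid, tableid, JET_prep.Replace))
        {
            Api.SetColumn(sesid, tableid, dataColumn, newBytes);
            update.Save();
        }
    }

    // Inserts a copy of the current row and deletes the original; this is the
    // path affected by the orphaned-LV bug mentioned above.
    public static void CopyAndDeleteOriginal(JET_SESID sesid, JET_TABLEID tableid,
                                             JET_COLUMNID dataColumn, byte[] newBytes)
    {
        using (var update = new Update(sesid, tableid, JET_prep.InsertCopyDeleteOriginal))
        {
            Api.SetColumn(sesid, tableid, dataColumn, newBytes);
            update.Save();
        }
    }
}
```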
Hello,
In my application I need concurrent, as-fast-as-possible access to byte arrays stored on disk.
I have a multi-threaded method where each thread performs read/write access to its own byte array, one array per thread. The method is executed in a loop: at each iteration I need to read the previous state and save the next one. The read is performed at the beginning of the method to initialize it, and the write at the end, to store the iteration's state. The length of each byte array may change from one iteration to the next.
To satisfy these requirements I use an ESENT database where each byte array is stored in a different table having one row and one column.
Everything works well on my development machine, with a relatively low degree of parallelism and a fast SSD, but, as often happens, the problem arose in the production environment: a virtualized machine with 12 cores and a possibly slower, though still SSD, disk.
The size of the .edb file on the production machine reached ~50 GB after a few iterations, while in the dev environment it never exceeded 15 GB.
I also reproduced the problem in my dev environment with a program that just writes byte arrays of random length, filled with random data, thus maximizing concurrent write access.
I temporarily solved the problem with a lock statement around the write operation, but I'm looking for a better solution (e.g. using a number of databases equal to the degree of parallelism; see the sketch at the end of this post).
The following is a snippet of my test application with full parallel and locked versions:
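A minimal ManagedEsent approximation of that test stands in for the snippet here: one table per item, each holding a single row with a LongBinary column, updated from a Parallel.For either fully in parallel or serialized by a lock. Table/column names, array sizes, and counts are illustrative assumptions, not the original code.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Isam.Esent.Interop;

class ConcurrentWriteTest
{
    const string DatabasePath = "test.edb";   // illustrative path
    const int Items = 1000;                   // number of byte arrays (= tables)
    const int Iterations = 100;

    static readonly bool SerializeWrites = false;   // false = full parallel, true = locked variant
    static readonly object WriteLock = new object();

    static void Main()
    {
        using (var instance = new Instance("bloat-test"))
        {
            instance.Parameters.CircularLog = true;
            instance.Init();
            CreateSchema(instance);

            for (int iteration = 0; iteration < Iterations; iteration++)
            {
                Parallel.For(0, Items, i =>
                {
                    // Each thread builds its own random payload of random length.
                    var rng = new Random(Guid.NewGuid().GetHashCode());
                    var data = new byte[rng.Next(1, 64 * 1024)];
                    rng.NextBytes(data);

                    if (SerializeWrites)
                    {
                        lock (WriteLock) { WriteItem(instance, i, data); }
                    }
                    else
                    {
                        WriteItem(instance, i, data);
                    }
                });
            }
        }
    }

    // One table per item, each with a single LongBinary column and one (initially empty) row.
    static void CreateSchema(Instance instance)
    {
        using (var session = new Session(instance))
        {
            JET_DBID dbid;
            Api.JetCreateDatabase(session, DatabasePath, null, out dbid, CreateDatabaseGrbit.OverwriteExisting);
            for (int i = 0; i < Items; i++)
            {
                using (var tx = new Transaction(session))
                {
                    JET_TABLEID tableid;
                    Api.JetCreateTable(session, dbid, "item" + i, 16, 100, out tableid);
                    JET_COLUMNID dataColumn;
                    Api.JetAddColumn(session, tableid, "data",
                        new JET_COLUMNDEF { coltyp = JET_coltyp.LongBinary }, null, 0, out dataColumn);
                    using (var insert = new Update(session, tableid, JET_prep.Insert))
                    {
                        insert.Save();
                    }
                    Api.JetCloseTable(session, tableid);
                    tx.Commit(CommitTransactionGrbit.LazyFlush);
                }
            }
        }
    }

    // Replaces the single row of the item's table with the new byte array.
    // ESE sessions are not thread safe, so each call opens its own Session.
    static void WriteItem(Instance instance, int item, byte[] data)
    {
        using (var session = new Session(instance))
        {
            JET_DBID dbid;
            Api.JetOpenDatabase(session, DatabasePath, null, out dbid, OpenDatabaseGrbit.None);
            using (var table = new Table(session, dbid, "item" + item, OpenTableGrbit.None))
            using (var tx = new Transaction(session))
            {
                JET_COLUMNID dataColumn = Api.GetTableColumnid(session, table, "data");
                Api.TryMoveFirst(session, table);
                using (var update = new Update(session, table, JET_prep.Replace))
                {
                    Api.SetColumn(session, table, dataColumn, data);
                    update.Save();
                }
                tx.Commit(CommitTransactionGrbit.LazyFlush);
            }
            Api.JetCloseDatabase(session, dbid, CloseDatabaseGrbit.None);
        }
    }
}
```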
With this example I obtained a .edb file of ~1.5 GB in the full-parallel case and ~300 MB in the locked one, both after 100 iterations.
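On the "one database per degree of parallelism" idea mentioned above: one way to sketch it, again assuming ManagedEsent and with placeholder directory and instance names, is to give every worker its own Instance with its own system/log/temp directories and its own .edb file, so concurrent writers never touch the same B-trees.

```csharp
using System.IO;
using Microsoft.Isam.Esent.Interop;

static class PerWorkerDatabases
{
    // Creates an isolated ESE instance (own logs, checkpoint, and temp database)
    // for one worker; the caller then creates/attaches its private .edb inside dir.
    public static Instance CreateWorkerInstance(int worker)
    {
        string dir = Path.Combine("workers", worker.ToString());
        Directory.CreateDirectory(dir);

        var instance = new Instance("worker-" + worker);
        instance.Parameters.SystemDirectory = dir;
        instance.Parameters.LogFileDirectory = dir;
        instance.Parameters.TempDirectory = dir;
        instance.Parameters.CircularLog = true;
        instance.Init();
        return instance;
    }
}
```

Whether this actually avoids the bloat would still need measuring; it mainly removes contention on shared B-trees at the cost of extra log and checkpoint files per worker.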