Multi-Row Insert #25
+1; also relates to issue #1
I've been planning to explore this subject, but I'm currently swamped. The proper place to start, IMO, is to write a benchmark proving that implementing this feature would make much difference at all, since the mechanics behind "postgresql-simple" and "hasql" are very different. Such a benchmark could, for instance, compare inserting a lot of rows using the standard "hasql" API versus "executeMany" of "postgresql-simple".
I've set up the benchmark you described at https://github.com/AndrewRademacher/sql-driver-race. It includes both the benchmark code and the results in the results.html file. This test indicates that postgresql-simple's executeMany is about 270 times faster at inserting large batches.
@AndrewRademacher Thank you. So this settles it: the thing needs to be implemented. However, I'm gonna remain swamped for the foreseeable future, so it's a subject for contribution.
Fair enough. I don't really have any experience with this sort of thing, but I'll look into it as well.
I've looked into this and it's a bit tricky. The good news is: this does the job! I've updated the benchmarks to include this workaround, and it's now faster than postgresql-simple (and significantly faster than
Inspired by the input from @cocreature (Cheers, Moritz!), I've come up with a solution.

First, a bit of insight into the problem. It's true that Postgres prevents us from passing arrays of composites as parameters to queries. The reason is its underlying uncomposable OID-based type-identification system: each type has to have a final unique OID, which basically removes anonymous composite types from the picture. If we can't have anonymous composite types, we can't have arrays of them either; hence our problem.

What we can do, though, is work around that by passing a product of arrays instead of an array of products. Starting from Postgres version 9.4, `unnest` accepts multiple arrays:

```sql
select * from unnest(array[1,2,3], array[true, false])
```

We can then combine that with our ability to use `insert into ... select`:

```sql
insert into "location" ("id", "x", "y")
select * from unnest ($1, $2, $3)
```

The final Hasql query for that statement can then look like this:

```haskell
insertMultipleLocations :: Query (Vector (UUID, Double, Double)) ()
insertMultipleLocations =
  statement sql encoder decoder True
  where
    sql =
      "insert into location (id, x, y) select * from unnest ($1, $2, $3)"
    encoder =
      contramap Vector.unzip3 $
      contrazip3 (vector Encoders.uuid) (vector Encoders.float8) (vector Encoders.float8)
      where
        vector value =
          Encoders.value (Encoders.array (Encoders.arrayDimension foldl' (Encoders.arrayValue value)))
    decoder =
      Decoders.unit
```

I must remind you that this solution is applicable only to Postgres versions 9.4 and later. For older versions you'll have to simulate the same with a more verbose workaround; the answers to this StackOverflow question should be of help.
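The core trick, independent of any database library, is the unzip from a list of rows into per-column lists. A minimal pure sketch (the `Row` type and `String` id are purely illustrative; the real code above uses `Vector` and `UUID`):

```haskell
-- A row of the "location" table: (id, x, y).
-- String stands in for UUID here just to keep the sketch self-contained.
type Row = (String, Double, Double)

-- Turn an array of products into a product of arrays, ready to be bound
-- as three separate array parameters ($1, $2, $3).
toColumns :: [Row] -> ([String], [Double], [Double])
toColumns = unzip3
```

For example, `toColumns [("a",1,2),("b",3,4)]` yields `(["a","b"],[1,3],[2,4])`: one list per column, which is exactly the shape the multi-array `unnest` call consumes.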
@nikita-volkov Would you be able to add this example to the documentation for Hasql.Encoder? It would be really valuable in the section on arrays.
@axman6 Can you make a PR?
@nikita-volkov Another worthwhile (?) alternative to encoding to a tuple with lists per column is to use a query like:
Especially when you have a record type with an aeson instance. Despite the flaws of record types, they are easier to handle in masses than long tuples. If you consider this a worthwhile alternative, please give me a heads-up and I'll PR a documentation addition.
It would, I think, be great to have a short version of this discussion in the documentation.
Care to PR?
For anyone else stumbling upon this issue:

```sql
insert into mytable (myfield)
(
  select
    (val->>'myfield') :: text as myfield
    -- ... add all the fields you need from the record
  from jsonb_array_elements($1 :: jsonb) j(val)
)
```

My insertion times went down from ~30s for 1000 rows to ~0.5s for 1000 rows.
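For reference, the parameter `$1` above is just a JSON array of objects. A minimal sketch of building it in Haskell, using hand-rolled string assembly with naive escaping purely for illustration (real code would use aeson's `encode` on a list of records):

```haskell
import Data.List (intercalate)

-- Render a list of single-field records as a JSON array like
-- [{"myfield":"a"},{"myfield":"b"}].
-- Naive: escapes only quotes and backslashes; aeson handles the rest.
toJsonArray :: [String] -> String
toJsonArray vals =
  "[" ++ intercalate "," (map obj vals) ++ "]"
  where
    obj v = "{\"myfield\":" ++ str v ++ "}"
    str v = "\"" ++ concatMap esc v ++ "\""
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc c    = [c]
```

The resulting string is bound as the single text parameter and cast with `:: jsonb` on the server side, so the whole batch travels as one value.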
The `executeMany` functionality in the postgresql-simple driver can be the difference between inserting a few hundred rows a second and inserting tens of thousands. Are there any plans to implement this functionality in hasql?