Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Add :prepare option for Postgres Adapter #372

Merged
merged 11 commits into from
Jan 4, 2022

Conversation

greg-rychlewski
Copy link
Member

@greg-rychlewski greg-rychlewski commented Jan 1, 2022

Because of the way Postgres generates plans for prepared statements, sometimes it is beneficial to disable them (i.e. use an unnamed prepared statement). See this Postgrex issue for more details.

You can currently use unnamed prepared statements, but they are set at the Repo config level and would apply to all queries. This means the user can't pick and choose which queries benefit from this feature.

This proposal is to allow the Postgres adapter to recognize a :prepare option and to overwrite the Ecto-generated name with empty string. This is similar to what happens in Postgrex when prepare: :unnamed is configured on the connection. See this excerpt from Postgrex.Protocol:

def handle_prepare(%Query{} = query, opts, %{queries: nil} = s) do
  # always use unnamed if no cache
  handle_prepare(%Query{query | name: ""}, opts, s)
end

This way you still get the benefits of the Ecto cache but without the potentially harmful effects of the Postgres cache. One potential pitfall is that once you issue the query you can't change the :prepare option because the prepare step will be skipped. I think that's ok but let me know what you guys think.

The change is restricted to the Postgres adapter because I don't think the other adapters handle empty string names the same way. I could be mistaken though, since I'm not as familiar with them. Postgres will treat this as a "one shot" name and allow it to be re-defined without explicitly telling it that you're done with the old definition.

Note that this option is meant as a convenience. You can currently issue an unnamed query using Ecto.Adapters.SQL.query but then you don't get all the nice Ecto conveniences.

If you guys like this proposal then I think the following things also need to be done:

  • Update the Ecto docs to say this option exists
  • Add integration tests in Ecto to make sure the queries work the same when that option is specified
  • Maybe add an integration test in this repo. If I could create one that inspects the results from Ecto.Adapters.Postgres.Connection.prepare_execute then I could inspect the name of the query.

@josevalim
Copy link
Member

My only concern with this option is the :prepare applying permanently for future operations. It feels perhaps we should make it part of the query cache somehow?

@greg-rychlewski
Copy link
Member Author

I'll take a closer look at the cache handling and come back with a different proposal. Thanks!

@greg-rychlewski
Copy link
Member Author

greg-rychlewski commented Jan 2, 2022

It looks like something like this can be done:

In the ecto repository, specifically Ecto.Query.Planner.finalize_cache, can add {:prepare, :named | :unnamed} to the cache key. This can be done either by:

  1. adding a new prepare argument to`Ecto.Query.Planner.query and passing it along to the various cache functions
  2. adding a new prepare field to the query struct and making sure it's in the query before it goes to the planner

Then in ecto_sql can can have the name creation logic (ecto_#{id} vs empty string). The combo of these 2 will allow for having both unnamed and named versions of the same query.

I think it's possible to ignore this option for everything that's not Postgres and default it to named. But I have to investigate a bit more to figure out the details. This might be worth it to avoid missing the cache when changing an option that the adapter will ignore.

@josevalim
Copy link
Member

Oh, I think we may be fine with the current approach, as long as we store if the query was prepared or not in the cache? If the query was not prepared and then it is later requested to be prepared, then do so and we update the cache? We can use the callback functions listed here: https://github.com/elixir-ecto/ecto/blob/master/lib/ecto/query/planner.ex#L129-L135

@greg-rychlewski
Copy link
Member Author

That makes sense. I'm trying to see if it's necessary to use the prepare_execute callback every time the user changes the prepare value.

It looks like if you use the execute callback, Postgrex will re-prepare if it has to. This would mean someone going back and forth several between named and unnamed could skip the prepare step for the named query.

I'm just not sure if somehow this code segment can get hit when we wouldn't want it to. If it hits this then potentially something that needs to be prepared wouldn't be

defp handle_execute_result(%{ref: ref} = query, params, opts, %{postgres: {_, ref}} = s) do
  # ref in lock so query is prepared
  status = new_status(opts)

  case query do
    %{name: ""} -> bind_execute_close(s, status, query, params)
    _ -> bind_execute(s, status, query, params)
  end
end

Do you think it's safer to always call prepare_execute because going back and forth like that should be a very rare occurrence? I can dig into the Postgrex code more if you think it's worth considering.

@greg-rychlewski
Copy link
Member Author

greg-rychlewski commented Jan 2, 2022

Actually nevermind it seems like the query ref changes every time the query is prepared. So you will lose the ref to the named query if you go from named -> unnamed -> named and it will need to re-prepare.

It seems like prepare_execute always needs to be called.

@greg-rychlewski
Copy link
Member Author

greg-rychlewski commented Jan 2, 2022

I will do the following changes, if you are alright with them.

Submit a PR to Ecto with the following changes:

  1. Will add prepare value to the cache.
  2. I'll try to see if I can find a way to ignore opts[:prepare] when the adapter is not Postgres and force it to be :named.

Then to this PR will add:

  1. If query was already cached, issue prepare_execute instead of execute if the opts[:prepare] is not the same as the cache value. Also update the cache with the new query + prepare value.

Or I can do the cache key change if it's preferable to avoid the re-preparing when flip flopping.

@josevalim
Copy link
Member

Can’t we handle it in ecto_sql exclusively?

@greg-rychlewski
Copy link
Member Author

Ah yes sorry I see what you mean now. I'll submit something soon.

@greg-rychlewski
Copy link
Member Author

greg-rychlewski commented Jan 3, 2022

@josevalim Thanks for your guidance. Was curious what you think about this approach.

Copy link
Member

@josevalim josevalim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this direction! I am just not sure why we have the :use_cache option. Couldn't we access the prepare option downstream instead? 🤔

@greg-rychlewski
Copy link
Member Author

Oh yeah if you prefer to leave it as prepare: :named | :unnamed then I can switch it back. I originally had it that way but my thinking was that prepare: :named | :unnamed is very Postgres specific, both the option name and the potential values. I was trying to convert it into a neutral term in case other adapters in the future need a similar capability to bypass the cache but use different option names/values.

@greg-rychlewski
Copy link
Member Author

greg-rychlewski commented Jan 3, 2022

I could also leave the option name as :prepare but convert its value to true/false because prepare is a pretty neutral name. But I think the idea of unnamed prepared statements if very Postgres specific, if I'm not mistaken.

edit: Though I guess having prepare = false and then calling prepare_execute kind of looks weird.

@v0idpwn
Copy link
Member

v0idpwn commented Jan 3, 2022

I didn't dive a lot into this issue, so I may be saying some non-sense here, but doesn't this do the same as passing cache_statement: "" as an option? (see: https://github.com/elixir-ecto/ecto_sql/pull/352/files)

@josevalim
Copy link
Member

I think using :prepare is fine. All adapters based on DBConnection have a prepare nomenclature.

cache_statement is different caching mechanism that compares the name with the statement and reuses it if the statement matches. The prepare query does not keep the original statement at all (and it is therefore superior).

@greg-rychlewski
Copy link
Member Author

greg-rychlewski commented Jan 3, 2022

@josevalim Sorry just to be clear, did you mean to use :prepare as a boolean for the downstream or to keep :named/:unnamed?

@josevalim
Copy link
Member

I think we can keep it as prepare: :named and prepare: :unnamed downstream as well. WDYT?

@greg-rychlewski
Copy link
Member Author

greg-rychlewski commented Jan 3, 2022

I think it's a good way to go. It works for everything existing now. Can change later if needed.

@greg-rychlewski
Copy link
Member Author

@josevalim Think it's good to go. Did the following things:

  • got rid of :use_cache
  • documented the new option
  • added integration test

@greg-rychlewski
Copy link
Member Author

greg-rychlewski commented Jan 3, 2022

I had one thought actually. I think it's possible custom adapters can specify their own :prepare option and it will conflict with this.

Like if someone creates their own Teradata adapter then they might want to call TeraRepo.all(Queryable, prepare: some_value_that_is_not_named_or_unnamed) so they can use the option in their Ecto.Adapters.Teradata.Connection module.

It probably makes more sense to add a prepare argument to Ecto.Adapters.SQL.execute(adapter_meta, query_meta, query, params, opts) instead of including it in opts.

@josevalim josevalim merged commit 2e44830 into elixir-ecto:master Jan 4, 2022
@josevalim
Copy link
Member

Beautiful! The solution you landed on is quite elegant! 💚 💙 💜 💛 ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants