-
Notifications
You must be signed in to change notification settings - Fork 623
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: Make bigquery backend not depend on impala backend (part 1) #2449
Conversation
should be sql_backend or just sql |
I guess you mean the name of the |
these are better; you already have a naming scheme going keep using it its not an entry point but ibis.backends.base_backend is very weird |
It's |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This is very useful. Eventually I'd even like to add a base_zetasql
once I get around to supporting Apache Beam SQL and/or Cloud Spanner, which all use the same syntax https://github.com/google/zetasql It's great to have some precedence for how to do this.
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
azp run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
small comment. any ci / testing changes that work now that didn't work before?
ibis/backends/spark/compiler.py
Outdated
@@ -24,6 +24,7 @@ | |||
import math | |||
|
|||
import ibis | |||
import ibis.backends.base_sql |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use like
import ibis.backends.base_sql as bsql
or similar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally find using the name of the module much easier to read. Currently we're using (sample from impala/compiler.py
):
import ibis.common.exceptions as com
import ibis.expr.analysis as L
import ibis.expr.datatypes as dt
import ibis.expr.operations as ops
import ibis.expr.types as ir
import ibis.sql.compiler as comp
And seems like we're making our life very difficult for no reason. I'd rather keep the original module name, I find the code much clearer than keep adding aliases for everything.
let's follow the current pattern it's not idiomatic to use the full import happy to entertain a change - but it should b comprehensive when all of the moving parts are finished (all of this moving of backends |
Why do you mean by that? My understanding is that what's Pythonic or idiomatic in Python is being explicit, simple and readable (I think those are the main takeaways from the zen of Python). If I compare: ir.CategoryValue with ibis.expr.types.CategoryValue I think the latter is explicit, simpler and more readable (just more verbose too). I think for anyone new to Ibis (and not so new, like myself) having to learn all those alias, which don't even seem to make sense, is useless and a waste of time. If We are moving a lot of stuff to So, -1 on aliasing the module. Please let me know if there is any other reason I'm missing for the alias to be preferred. |
my point is this is competely changing the current ibis style. you may not like it but changing ad-hoc on this is a big no-no. further I don't find realy long imports useful at all, that'st he reason they are aliased. |
if you want to clean all backends as a followon not averse, but please don't do this here. |
Ok, I'll do that. In any case, we don't have an alias for any backend. Other backends are using the form I'll do that, even if I don't think I should be wasting time writing suboptimal code, when I already wrote the clean up of those imports and made the code clearer in this PR, which is a reafactoring of that exact part, and IMHO the best place to fix it. As a side note, probably worth mentioning that I'm not happy with this policy of blocking PRs for very opinionated things. I see value in (discussing and) standardizing styles and best practices. But if every small detail in this projects needs to be based on your personal preference (and I do feel this way here and in #2451), I'll have to think if this is a good investment of my time working on Ibis. Cleaning the big mess in the Ibis CI and architecture is a lot of hard work. And if instead of having help, I've got impediments and micromanagement, I think I'll leave someone else to do it. |
@datapythonista your efforts are appreciated, but at the same time, you are introducing a huge amount of change in a short period of time. i am fine with doing things as followons, but my point in review is to have consistency in things. |
|
just don't use the fully qualified form itself in the code, that is much longer. |
thanks @datapythonista I suspect some backends that were not originally using impala, e.g. postgres likely could be refactored to use the new base_sql (and maybe push down some more things). I don't know if we have an issue for this, but could create one if not. |
You'll know better than me, but my understanding is that some backends depend on the custom compiler For now I'll create issues to finish this (move what it's still in Impala), and to move |
The bigquery backend is currently reusing a lot of functionality from the impala backend, which IMO doesn't make any sense.
The reason is that the impala backend compiler has many things that are generic. I guess it was the only backend when it was implemented, and when bigquery was implemented, instead of moving the generic functionality to a base module for all backends to import, these generic functionality was simply imported from impala.
Not sure why this hack hasn't created more problems until now, but we're facing some in #2435.
This PR creates a
base_backend.py
with the shared functionality. A follow up is needed to also move the common_operation_registry
and its dependencies. This PR doesn't change any code, just moves functions.CC: @jreback @tswast