-
Notifications
You must be signed in to change notification settings - Fork 156
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(stdlib/sql): add SAP HANA db support #3098
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3098 +/- ##
==========================================
+ Coverage 49.58% 49.60% +0.01%
==========================================
Files 341 342 +1
Lines 35727 35844 +117
==========================================
+ Hits 17716 17781 +65
- Misses 15543 15587 +44
- Partials 2468 2476 +8
Continue to review full report at Codecov.
|
570db57
to
1afba6a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, looks great as usual, I just had some questions.
stdlib/sql/hdb.go
Outdated
} else { // table in user default schema | ||
where = fmt.Sprintf("WHERE TABLE_NAME=UPPER('%s')", table) | ||
} | ||
return fmt.Sprintf(hdbDoIfTableNotExistsTemplate, where, query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little worried that someone could craft a table name that contains characters like '; ...
or something and do something nefarious. example
Do you have any thoughts on this? For some of the places where we use %s
I imagine that we could use parameters ?
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not get ? placeholder to work with the SQL DO script (SQL Error 1287 - identifier must be declared
), but I at least changed it to use declared variables for schema and table. I hope it is safer now...
The resulting query now looks like eg.
DO
BEGIN
DECLARE SCHEMA_NAME NVARCHAR(11) = 'bike_stores';
DECLARE TABLE_NAME NVARCHAR(5) = 'xcopy';
DECLARE X_EXISTS INT = 0;
SELECT COUNT(*) INTO X_EXISTS FROM TABLES WHERE SCHEMA_NAME=UPPER(:SCHEMA_NAME) AND TABLE_NAME=UPPER(:TABLE_NAME);
IF :X_EXISTS = 0
THEN
CREATE TABLE bike_stores.xcopy (NOTE NVARCHAR(5000),...);
END IF;
END;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is better but it still seems like the CREATE TABLE
statement allows for an injection.
If I were to do something like sql.to(table: "foo (i INTEGER); DROP TABLE bar; ...")
, then there's still a problem, if I'm understanding right.
What if you always quoted the identifiers in the CREATE TABLE statement:
CREATE TABLE "bike_stores"."xcopy" ("NOTE" NVARCHAR(5000),...);
And if any of the identifiers contain a "
, you can either escape it, or produce a user error. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improved, I hope :) Since by default object names are converted to uppercase by HDB unless quoted, and quotes are used to escape possibly malign SQL code here and not to mark the value as case-sensitive, uppercase is assumed and used for new output table name, which I think is conforming and safe.
Table created like bellow then allows a user to insert data both with any of the following statements:
INSERT INTO bike_stores.orders_copy_q2 ... (note, ...) VALUES (...)
INSERT INTO BIKE_STORES.ORDERS_COPY_Q2 ... (NOTE, ...) VALUES (...)
.
INSERT INTO "BIKE_STORES"."ORDERS_COPY_Q2" ... ("NOTE", ...) VALUES (...)
.
The if-not-exist-then-create now looks like this:
BEGIN
DECLARE SCHEMA_NAME NVARCHAR(11) = 'BIKE_STORES';
DECLARE TABLE_NAME NVARCHAR(14) = 'ORDERS_COPY_Q2';
DECLARE X_EXISTS INT = 0;
SELECT COUNT(*) INTO X_EXISTS FROM TABLES WHERE SCHEMA_NAME=ESCAPE_DOUBLE_QUOTES(:SCHEMA_NAME) AND TABLE_NAME=ESCAPE_DOUBLE_QUOTES(:TABLE_NAME);
IF :X_EXISTS = 0
THEN
CREATE TABLE "BIKE_STORES"."ORDERS_COPY_Q2" ("NOTE" NVARCHAR(5000),...);
END IF;
END;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much better. Thanks for the attention to this.
1afba6a
to
84248ba
Compare
7c45a00
to
26e8000
Compare
659cdc9
to
1c09486
Compare
This diff closes the opening for SQL injection via column identifiers in hdb, but highlights the inconsistency between the quote/escape strategy used for table names and column names. Table names were already being quoted/escaped, but were also being transformed to uppercase. It could be argued we should do the same transformation with column identifiers, but I would advocate _the other way_ (to not transform the case at all anywhere). The issue is automatically forcing identifiers to uppercase, then quoting them, means tables/columns created via DDL in the db engine effectively become unreachable from Flux when they: - happen to be quoted (preserving case) - happen to _not be uppercase_ By adding a perceived convenience with the transformation, we reduce control and possibly make valid identifiers impossible to reference. For the original discussion on the strategy, see: <#3098>
This diff closes the opening for SQL injection via column identifiers in hdb, but highlights the inconsistency between the quote/escape strategy used for table names and column names. Table names were already being quoted/escaped, but were also being transformed to uppercase. It could be argued we should do the same transformation with column identifiers, but I would advocate _the other way_ (to not transform the case at all anywhere). The issue is automatically forcing identifiers to uppercase, then quoting them, means tables/columns created via DDL in the db engine effectively become unreachable from Flux when they: - happen to be quoted (preserving case) - happen to _not be uppercase_ By adding a perceived convenience with the transformation, we reduce control and possibly make valid identifiers impossible to reference. For the original discussion on the strategy, see: <#3098>
This diff closes the opening for SQL injection via column identifiers in hdb, but highlights the inconsistency between the quote/escape strategy used for table names and column names. Table names were already being quoted/escaped, but were also being transformed to uppercase. It could be argued we should do the same transformation with column identifiers, but I would advocate _the other way_ (to not transform the case at all anywhere). The issue is automatically forcing identifiers to uppercase, then quoting them, means tables/columns created via DDL in the db engine effectively become unreachable from Flux when they: - happen to be quoted (preserving case) - happen to _not be uppercase_ By adding a perceived convenience with the transformation, we reduce control and possibly make valid identifiers impossible to reference. For the original discussion on the strategy, see: <#3098>
This diff closes the opening for SQL injection via column identifiers in hdb, but highlights the inconsistency between the quote/escape strategy used for table names and column names. Table names were already being quoted/escaped, but were also being transformed to uppercase. It could be argued we should do the same transformation with column identifiers, but I would advocate _the other way_ (to not transform the case at all anywhere). The issue is automatically forcing identifiers to uppercase, then quoting them, means tables/columns created via DDL in the db engine effectively become unreachable from Flux when they: - happen to be quoted (preserving case) - happen to _not be uppercase_ By adding a perceived convenience with the transformation, we reduce control and possibly make valid identifiers impossible to reference. For the original discussion on the strategy, see: <#3098>
* fix(stdlib/sql): quote/escape table and column identifiers * refactor: update hdb to use quoted column identifiers This diff closes the opening for SQL injection via column identifiers in hdb, but highlights the inconsistency between the quote/escape strategy used for table names and column names. Table names were already being quoted/escaped, but were also being transformed to uppercase. It could be argued we should do the same transformation with column identifiers, but I would advocate _the other way_ (to not transform the case at all anywhere). The issue is automatically forcing identifiers to uppercase, then quoting them, means tables/columns created via DDL in the db engine effectively become unreachable from Flux when they: - happen to be quoted (preserving case) - happen to _not be uppercase_ By adding a perceived convenience with the transformation, we reduce control and possibly make valid identifiers impossible to reference. For the original discussion on the strategy, see: <#3098> * fix: resolve issue with identifier casing in hdb The HDB code path will try to force identifiers to all be UPPER CASE. The proc for automatically creating the target table (when missing) only uppercased the table name in certain situations (when specified with a schema name stem). In the other case, the table name was incorrectly left as-is. HDB also handled quoting/escaping column names as a part of the "translate func" for this driver, so the initial work to handle this closer to the top of the `to` impl resulted in "stacked quoting" which caused problems in SQL generation. If we move all column escaping/quoting to happen during translation, we can unify things a bit. * refactor: quote idents closer to where they are templated The column translate func now escapes column names for DDL, quoteIdent for insert. Feels silly to do this in two places. If the translate func returned pairs of (escaped) column name + type, then we could use the translate func to generate identifiers for both cases. Currently it returns a string containing both together. * docs: add remark on ident quoting/escaping being tranlate func's job * refactor: include quote-reliant column name in DDL Expand the surface area of the integration tests to include a column name that has a space in it, requiring it to be quoted. This change is "double duty" in so much as it also introduces a nullable column into the mix. This complicates the tables we assert against since it forces the use of `union` and `debug.opaque` to be able to represent the table we expect to get back from the database. * test: update column translate tests to assert quoted DDL The column translate functions should now be returning quoted column identifiers. * fix: quote/escape string literals in SQL DDL Mitigate risk of SQL injection by escaping interior single quotes when generating string literals for SQL DDL. * fix: quote/escape another string literal, add comments * refactor: mark mysql and postgres quote helpers as private * chore: favor docker stop instead of docker rm for automatic volume cleanup * fix: string literals escape ' as '' * fix: identifiers escape " as "" * test: add sql injection attempt tests * fix: escape ` as ``, fixup hdb escaping * test: comment out vertica injection attempt, possible driver bug * chore: make fmt * refactor: tighten up test code with custom assertion helpers Many of the assertions in the `sql.to` tests contain duplicate blocks which were originally written in a very verbose style. Slim the tests down by pulling the repetion out into helper functions. * refactor: hoist "seed want" values up to top of acceptance tests * chore: make generate
This PR add SAP HANA db support to Flux.
It uses older
v0.14.1
Go driver (Mar 16, 2019), because it is the last version that works with Go 1.12.Done checklist