feat(stdlib/sql): add SAP HANA db support #3098

alespour · 2020-08-11T12:03:22Z

This PR adds SAP HANA db support to Flux.

It uses the older v0.14.1 Go driver (Mar 16, 2019), because it is the last version that works with Go 1.12.

Done checklist

Test cases written

codecov-commenter · 2020-08-11T12:21:14Z

Codecov Report

Merging #3098 into master will increase coverage by 0.01%.
The diff coverage is 54.54%.

@@            Coverage Diff             @@
##           master    #3098      +/-   ##
==========================================
+ Coverage   49.58%   49.60%   +0.01%     
==========================================
  Files         341      342       +1     
  Lines       35727    35844     +117     
==========================================
+ Hits        17716    17781      +65     
- Misses      15543    15587      +44     
- Partials     2468     2476       +8

Impacted Files	Coverage Δ
stdlib/sql/to.go	`45.41% <40.00%> (+0.12%)`	⬆️
stdlib/sql/source_validator.go	`75.00% <50.00%> (-1.93%)`	⬇️
stdlib/sql/hdb.go	`55.23% <55.23%> (ø)`
stdlib/sql/from.go	`33.65% <100.00%> (+1.30%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 57a7373...1c09486.

wolffcm

Overall, looks great as usual, I just had some questions.

stdlib/sql/hdb.go

wolffcm · 2020-09-18T22:41:40Z

+	} else { // table in user default schema
+		where = fmt.Sprintf("WHERE TABLE_NAME=UPPER('%s')", table)
+	}
+	return fmt.Sprintf(hdbDoIfTableNotExistsTemplate, where, query)


I'm a little worried that someone could craft a table name containing characters like '; ... and do something nefarious.

Do you have any thoughts on this? For some of the places where we use %s, I imagine we could use ? parameters instead.
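
To make the concern concrete, here is a minimal Go sketch of how a crafted table name can break out of the string literal when it is interpolated with %s. The helper name whereClause is hypothetical; it only mirrors the fmt.Sprintf call in the diff above and is not the actual function in hdb.go.

```go
package main

import "fmt"

// whereClause is a hypothetical helper that mirrors the fmt.Sprintf call
// from the diff; it is not the actual code in hdb.go.
func whereClause(table string) string {
	return fmt.Sprintf("WHERE TABLE_NAME=UPPER('%s')", table)
}

func main() {
	// A benign name stays inside the single-quoted literal.
	fmt.Println(whereClause("orders"))
	// A crafted name closes the literal early and appends its own SQL.
	fmt.Println(whereClause("x') OR 1=1 --"))
}
```

In the second call the single quote supplied by the caller ends the literal, so everything after it is read as SQL rather than as data.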

alespour · 2020-09-23

I could not get the ? placeholder to work with the SQL DO script (SQL Error 1287 - identifier must be declared), but I at least changed it to use declared variables for schema and table. I hope it is safer now.

The resulting query now looks like this, for example:

DO BEGIN
  DECLARE SCHEMA_NAME NVARCHAR(11) = 'bike_stores';
  DECLARE TABLE_NAME NVARCHAR(5) = 'xcopy';
  DECLARE X_EXISTS INT = 0;
  SELECT COUNT(*) INTO X_EXISTS FROM TABLES WHERE SCHEMA_NAME=UPPER(:SCHEMA_NAME) AND TABLE_NAME=UPPER(:TABLE_NAME);
  IF :X_EXISTS = 0 THEN
    CREATE TABLE bike_stores.xcopy (NOTE NVARCHAR(5000),...);
  END IF;
END;

wolffcm · 2020-10-01

I think this is better, but it still seems like the CREATE TABLE statement allows for an injection.

If I were to do something like sql.to(table: "foo (i INTEGER); DROP TABLE bar; ..."), then there's still a problem, if I'm understanding right.

What if you always quoted the identifiers in the CREATE TABLE statement:

CREATE TABLE "bike_stores"."xcopy" ("NOTE" NVARCHAR(5000),...);

And if any of the identifiers contain a ", you can either escape it, or produce a user error. Does that make sense?
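
The two options suggested here (escape the quote, or reject the name) could be sketched in Go roughly as follows. Both quoteIdent and strictIdent are hypothetical names used for illustration, not the helpers that ended up in the PR; doubling the double quote is the standard SQL way to embed one inside a quoted identifier.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent (hypothetical) wraps an identifier in double quotes and doubles
// any interior double quote so the value cannot end the identifier early.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// strictIdent (hypothetical) is the alternative: refuse names that contain
// a double quote and report a user error instead of escaping.
func strictIdent(name string) (string, error) {
	if strings.Contains(name, `"`) {
		return "", fmt.Errorf("invalid identifier %q: contains a double quote", name)
	}
	return `"` + name + `"`, nil
}

func main() {
	fmt.Println(quoteIdent("xcopy"))
	fmt.Println(quoteIdent(`foo" (i INTEGER); DROP TABLE bar; --`))
	if _, err := strictIdent(`foo"bar`); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

With escaping, the injected text stays part of one (odd-looking) identifier; with the strict variant, the call fails before any SQL is generated.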

alespour · 2020-10-06

Improved, I hope :) By default, HDB converts object names to uppercase unless they are quoted. Here the quotes serve to escape possibly malicious SQL code, not to mark the name as case-sensitive, so uppercase is assumed and used for the new output table name. I think this is conforming and safe.

A table created as shown below then allows a user to insert data with any of the following statements:

INSERT INTO bike_stores.orders_copy_q2 ... (note, ...) VALUES (...)
INSERT INTO BIKE_STORES.ORDERS_COPY_Q2 ... (NOTE, ...) VALUES (...)
INSERT INTO "BIKE_STORES"."ORDERS_COPY_Q2" ... ("NOTE", ...) VALUES (...)

The if-not-exists-then-create script now looks like this:

BEGIN
  DECLARE SCHEMA_NAME NVARCHAR(11) = 'BIKE_STORES';
  DECLARE TABLE_NAME NVARCHAR(14) = 'ORDERS_COPY_Q2';
  DECLARE X_EXISTS INT = 0;
  SELECT COUNT(*) INTO X_EXISTS FROM TABLES WHERE SCHEMA_NAME=ESCAPE_DOUBLE_QUOTES(:SCHEMA_NAME) AND TABLE_NAME=ESCAPE_DOUBLE_QUOTES(:TABLE_NAME);
  IF :X_EXISTS = 0 THEN
    CREATE TABLE "BIKE_STORES"."ORDERS_COPY_Q2" ("NOTE" NVARCHAR(5000),...);
  END IF;
END;
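
The uppercase-then-quote rule described in this comment can be sketched in Go as follows. qualifiedName is a hypothetical helper, not the code in hdb.go, and splitting the name on the first dot is a simplifying assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifiedName (hypothetical) upper-cases each part of an optional
// "schema.table" name, doubles interior double quotes, and quotes the part.
// Splitting on the first dot is an assumption made for this sketch.
func qualifiedName(table string) string {
	parts := strings.SplitN(table, ".", 2)
	for i, p := range parts {
		parts[i] = `"` + strings.ReplaceAll(strings.ToUpper(p), `"`, `""`) + `"`
	}
	return strings.Join(parts, ".")
}

func main() {
	fmt.Println(qualifiedName("bike_stores.orders_copy_q2"))
	fmt.Println(qualifiedName("xcopy"))
}
```

Because the quoted result is always uppercase, it names the same table that an unquoted reference resolves to, which is why all three INSERT forms listed above reach it.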

wolffcm · 2020-10-06

This is much better. Thanks for the attention to this.

* fix(stdlib/sql): quote/escape table and column identifiers

* refactor: update hdb to use quoted column identifiers

  This diff closes the opening for SQL injection via column identifiers in hdb, but highlights the inconsistency between the quote/escape strategy used for table names and column names.

  Table names were already being quoted/escaped, but were also being transformed to uppercase. It could be argued we should do the same transformation with column identifiers, but I would advocate _the other way_ (to not transform the case at all anywhere).

  The issue is that automatically forcing identifiers to uppercase, then quoting them, means tables/columns created via DDL in the db engine effectively become unreachable from Flux when they:

  - happen to be quoted (preserving case)
  - happen to _not be uppercase_

  By adding a perceived convenience with the transformation, we reduce control and possibly make valid identifiers impossible to reference.

  For the original discussion on the strategy, see: <#3098>

* fix: resolve issue with identifier casing in hdb

  The HDB code path will try to force identifiers to all be UPPER CASE. The proc for automatically creating the target table (when missing) only uppercased the table name in certain situations (when specified with a schema name stem). In the other case, the table name was incorrectly left as-is.

  HDB also handled quoting/escaping column names as a part of the "translate func" for this driver, so the initial work to handle this closer to the top of the `to` impl resulted in "stacked quoting" which caused problems in SQL generation.

  If we move all column escaping/quoting to happen during translation, we can unify things a bit.

* refactor: quote idents closer to where they are templated

  The column translate func now escapes column names for DDL, quoteIdent for insert. Feels silly to do this in two places.

  If the translate func returned pairs of (escaped) column name + type, then we could use the translate func to generate identifiers for both cases. Currently it returns a string containing both together.

* docs: add remark on ident quoting/escaping being translate func's job

* refactor: include quote-reliant column name in DDL

  Expand the surface area of the integration tests to include a column name that has a space in it, requiring it to be quoted.

  This change is "double duty" in so much as it also introduces a nullable column into the mix. This complicates the tables we assert against since it forces the use of `union` and `debug.opaque` to be able to represent the table we expect to get back from the database.

* test: update column translate tests to assert quoted DDL

  The column translate functions should now be returning quoted column identifiers.

* fix: quote/escape string literals in SQL DDL

  Mitigate risk of SQL injection by escaping interior single quotes when generating string literals for SQL DDL.

* fix: quote/escape another string literal, add comments

* refactor: mark mysql and postgres quote helpers as private

* chore: favor docker stop instead of docker rm for automatic volume cleanup

* fix: string literals escape ' as ''

* fix: identifiers escape " as ""

* test: add sql injection attempt tests

* fix: escape ` as ``, fixup hdb escaping

* test: comment out vertica injection attempt, possible driver bug

* chore: make fmt

* refactor: tighten up test code with custom assertion helpers

  Many of the assertions in the `sql.to` tests contain duplicate blocks which were originally written in a very verbose style. Slim the tests down by pulling the repetition out into helper functions.

* refactor: hoist "seed want" values up to top of acceptance tests

* chore: make generate
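
The idea floated in these commit messages, a translate step that returns (escaped) column name + type pairs so the same identifiers feed both the DDL and the INSERT, might look roughly like the Go sketch below. The column type and all function names are hypothetical and do not reflect the actual stdlib/sql code; the escaping rules are the ones named in the commit list (' as '' for string literals, " as "" for identifiers).

```go
package main

import (
	"fmt"
	"strings"
)

// column (hypothetical) pairs an already-quoted identifier with its SQL type.
type column struct {
	Ident string
	Type  string
}

// quoteIdent doubles interior double quotes and wraps the name in quotes.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// quoteLiteral doubles interior single quotes and wraps the value in quotes.
func quoteLiteral(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// createTable builds the DDL from the pairs.
func createTable(table string, cols []column) string {
	defs := make([]string, len(cols))
	for i, c := range cols {
		defs[i] = c.Ident + " " + c.Type
	}
	return fmt.Sprintf("CREATE TABLE %s (%s)", quoteIdent(table), strings.Join(defs, ", "))
}

// insertInto reuses the same quoted identifiers with ? placeholders.
func insertInto(table string, cols []column) string {
	names := make([]string, len(cols))
	marks := make([]string, len(cols))
	for i, c := range cols {
		names[i] = c.Ident
		marks[i] = "?"
	}
	return fmt.Sprintf("INSERT INTO %s (%s) VALUES (%s)",
		quoteIdent(table), strings.Join(names, ", "), strings.Join(marks, ", "))
}

func main() {
	cols := []column{
		{Ident: quoteIdent("note"), Type: "NVARCHAR(5000)"},
		{Ident: quoteIdent("order id"), Type: "INTEGER"},
	}
	fmt.Println(createTable("orders_copy", cols))
	fmt.Println(insertInto("orders_copy", cols))
	fmt.Println(quoteLiteral("O'Brien"))
}
```

Quoting once, when the pair is built, avoids the "stacked quoting" problem mentioned above, since neither statement builder quotes the column name a second time.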

alespour marked this pull request as ready for review August 17, 2020 16:02

alespour mentioned this pull request Aug 18, 2020

improve sql package test coverage bonitoo-io/flux#2

Open

nathanielc requested review from a team and wolffcm and removed request for a team September 9, 2020 16:29

alespour force-pushed the feat/sql-sap-hana branch 2 times, most recently from 570db57 to 1afba6a Compare September 16, 2020 11:57

wolffcm suggested changes Sep 18, 2020

alespour force-pushed the feat/sql-sap-hana branch from 1afba6a to 84248ba Compare September 23, 2020 15:58

alespour force-pushed the feat/sql-sap-hana branch from 7c45a00 to 26e8000 Compare October 6, 2020 12:45

alespour added 15 commits October 6, 2020 16:26

chore: add SAP HANA driver dependency

d591eb9

feat: add SAP HANA db support

87a1536

test: add HDB test

aa0f8d6

style: remove code comment

d5d7fbe

test: add HDB test

fbc88b5

test: add more tests

6544086

test: add HDB code test

0eb6d75

fix: try to prevent SQL injection

c0c4ffa

test: update to match template changes

6009e1e

style: template formatting

ab18c20

fix: return error if column is BINARY

6b6f8f9

fix: escape names with quotes to help prevent SQL injection

68fd9c1

fix: S1005 linter check error

3bd8ac6

refactor: table name uppercase in Go code

40a1409

style: go fmt

1c09486

alespour force-pushed the feat/sql-sap-hana branch from 659cdc9 to 1c09486 Compare October 6, 2020 14:26

wolffcm merged commit cef010b into influxdata:master Oct 6, 2020

onelson mentioned this pull request Dec 22, 2021

fix(stdlib/sql): quote db identifiers #4328

Merged

2 tasks
