Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/sem/tree, backupccl, *: sanitize URLs during Format #126970

Merged
merged 9 commits into from
Jul 17, 2024

Conversation

michae2
Copy link
Collaborator

@michae2 michae2 commented Jul 11, 2024

Prior to this change, all SQL statements containing URLs would be formatted with the full URL in cleartext, including any secrets such as keys or passwords. These secrets would sometimes show up in the slow query log or sql audit log. This PR adds new functions tree.(*FmtCtx).FormatURIand tree.(*FmtCtx).FormatURIs which are designed to sanitize URLs during formatting.

See individual commits for details.

Fixes: CRDB-39710, TREQ-284

Epic: None

Release note: None

Copy link

blathers-crl bot commented Jul 11, 2024

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link

blathers-crl bot commented Jul 11, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@michae2 michae2 force-pushed the formattest branch 3 times, most recently from cc92cef to 5cd37ed Compare July 12, 2024 07:46
@michae2 michae2 changed the title wip sql/sem/tree, backupccl, *: sanitize URLs during Format Jul 12, 2024
@michae2 michae2 added backport-23.1.x Flags PRs that need to be backported to 23.1 backport-23.2.x Flags PRs that need to be backported to 23.2. backport-24.1.x Flags PRs that need to be backported to 24.1. backport-24.2.x Flags PRs that need to be backported to 24.2 labels Jul 12, 2024
@michae2 michae2 marked this pull request as ready for review July 12, 2024 07:51
@michae2 michae2 requested review from a team as code owners July 12, 2024 07:51
@michae2 michae2 requested review from xinhaoz, msbutler, andyyang890 and rytaft and removed request for a team July 12, 2024 07:51
pkg/sql/sem/tree/format.go Outdated Show resolved Hide resolved
@rytaft rytaft requested a review from dt July 15, 2024 21:02
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! A few questions / nits.

Reviewed 35 of 35 files at r14, 15 of 15 files at r15, 3 of 3 files at r16, 4 of 4 files at r17, 7 of 7 files at r18, 8 of 8 files at r19, 3 of 3 files at r20, 4 of 4 files at r21, 3 of 3 files at r22, 35 of 35 files at r23, 14 of 14 files at r24, 2 of 2 files at r25, 3 of 3 files at r26, 6 of 6 files at r27, 7 of 7 files at r28, 2 of 2 files at r29, 3 of 3 files at r30, 2 of 2 files at r31, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andyyang890, @dt, @michae2, @msbutler, and @xinhaoz)


pkg/sql/sem/tree/alter_changefeed.go line 94 at r18 (raw file):

func (node *AlterChangefeedSetOptions) Format(ctx *FmtCtx) {
	ctx.WriteString(" SET ")
	// TODO(michae2): sanitize SET sink

Does this change need to happen as part of this PR?


pkg/sql/sem/tree/format.go line 486 at r31 (raw file):

func (ctx *FmtCtx) FormatURIs(uris []Expr) {
	if len(uris) > 1 {
		ctx.WriteString("(")

I noticed that this caused import logic to change with parentheses. Is this going to cause issues anywhere else? Any reason we need this special logic with parentheses now? And is this well tested?


pkg/sql/parser/testdata/import_export line 4 at r31 (raw file):

IMPORT TABLE foo FROM PGDUMPCREATE 'nodelocal://0/foo/bar' WITH temp = 'path/to/temp'
----
IMPORT TABLE foo FROM PGDUMPCREATE '*****' WITH OPTIONS (temp = 'path/to/temp') -- normalized!

should we be concerned about 'path/to/temp'?

@dt
Copy link
Member

dt commented Jul 15, 2024

temp = ...

temp hasn't been a valid option to IMPORT for 5+ years. The parser testdata is just showing that you can pass any pair of strings to these untyped string option = string value maps, and we don't do any validation of which keys you pass or what you pass for them until execution of the stmt (at which point you'd get ERROR: invalid option "temp". Fortunately we don't put many URIs in these unstructured option maps.

Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs, @dt and @rytaft! Any objection if I add a cluster setting to bring back the old behavior, default off?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andyyang890, @dt, @msbutler, @rytaft, and @xinhaoz)


pkg/sql/sem/tree/alter_changefeed.go line 94 at r18 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Does this change need to happen as part of this PR?

Ah, good catch. Yes. Done.


pkg/sql/sem/tree/format.go line 486 at r31 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I noticed that this caused import logic to change with parentheses. Is this going to cause issues anywhere else? Any reason we need this special logic with parentheses now? And is this well tested?

Great question. Every field using FormatURIs except Import.Files comes from string_or_placeholder_opt_list in the parser, which makes the parentheses optional for a single value, so I made that the default behavior. This matches the behavior of StringOrPlaceholderOptList.Format.

Import.Files is a special case: it comes from string_or_placeholder_list surrounded by parentheses.

This is tested by the cases in pkg/sql/parser/testdata/import_export which shouldn't have changed w.r.t. parentheses.

(The other weird one is the INCREMENTAL FROM clause of BACKUP which comes from string_or_placeholder_list without any parentheses, so I also had to add a special case for that.)

Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any objection if I add a cluster setting to bring back the old behavior, default off?

Actually, this is turning out to be very painful to plumb into all calls to AsString*. Forget I said this.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andyyang890, @dt, @msbutler, @rytaft, and @xinhaoz)

michae2 added 9 commits July 16, 2024 15:52
Add new functions `tree.(*FmtCtx).FormatURI` and
`tree.(*FmtCtx).FormatURIs` which can be used to safely format one or
more string literals containing URLs. During formatting the URLs are
hidden to prevent leaking sensitive information such as passwords and
keys.

There are some special cases for which we do *not* sanitize the URL:
- if the URL is a placeholder like $1
- if the URL is an empty string or an underscore
- if the URL has already been sanitized (e.g. while building clean AST
  nodes for job details) which is communicated by setting the
  `FmtShowFullURIs` flag
- if we're showing passwords, or already hiding constants

Future commits will call these functions when formatting SQL statements
that could contain a URL.

Epic: None

Release note: None
Use `tree.(*FmtCtx).FormatURI` in various backup-related statements.

Epic: None

Release note (security update): URLs in the following SQL statements are
now sanitized of any secrets before being written to unredacted logs:
- `BACKUP`
- `RESTORE`
- `CREATE SCHEDULE FOR BACKUP`
- `ALTER BACKUP SCHEDULE`
Use `tree.(*FmtCtx).FormatURI` in `ALTER BACKUP`.

Epic: None

Release note (security update): URLs in the following SQL statements are
now sanitized of any secrets before being written to unredacted logs:
- `ALTER BACKUP`
Use `tree.(*FmtCtx).FormatURI` in `SHOW BACKUP`.

Epic: None

Release note (security update): URLs in the following SQL statements are
now sanitized of any secrets before being written to unredacted logs:
- `SHOW BACKUP`
- `SHOW BACKUPS`
Use `tree.(*FmtCtx).FormatURI` in `CREATE CHANGEFEED`.

Epic: None

Release note (security update): URLs in the following SQL statements are
now sanitized of any secrets before being written to unredacted logs:
- `CREATE CHANGEFEED`
- `CREATE SCHEDULE FOR CHANGEFEED`
Use `tree.(*FmtCtx).FormatURI` in `IMPORT`.

Epic: None

Release note (security update): URLs in the following SQL statements are
now sanitized of any secrets before being written to unredacted logs:
- `IMPORT`
Use `tree.(*FmtCtx).FormatURI` in `EXPORT`.

Epic: None

Release note (security update): URLs in the following SQL statements are
now sanitized of any secrets before being written to unredacted logs:
- `EXPORT`
Use `tree.(*FmtCtx).FormatURI` in `CREATE EXTERNAL CONNECTION`.

Epic: None

Release note (security update): URLs in the following SQL statements are
now sanitized of any secrets before being written to unredacted logs:
- `CREATE EXTERNAL CONNECTION`
Use `tree.(*FmtCtx).FormatURI` in `COPY`.

Epic: None

Release note (security update): URLs in the following SQL statements are
now sanitized of any secrets before being written to unredacted logs:
- `COPY`
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 36 of 36 files at r37, 14 of 14 files at r38, 2 of 2 files at r39, 3 of 3 files at r40, 8 of 8 files at r41, 7 of 7 files at r42, 2 of 2 files at r43, 3 of 3 files at r44, 2 of 2 files at r45, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andyyang890, @dt, @msbutler, and @xinhaoz)


pkg/sql/sem/tree/format.go line 486 at r31 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Great question. Every field using FormatURIs except Import.Files comes from string_or_placeholder_opt_list in the parser, which makes the parentheses optional for a single value, so I made that the default behavior. This matches the behavior of StringOrPlaceholderOptList.Format.

Import.Files is a special case: it comes from string_or_placeholder_list surrounded by parentheses.

This is tested by the cases in pkg/sql/parser/testdata/import_export which shouldn't have changed w.r.t. parentheses.

(The other weird one is the INCREMENTAL FROM clause of BACKUP which comes from string_or_placeholder_list without any parentheses, so I also had to add a special case for that.)

Thanks for the explanation!

@michae2
Copy link
Collaborator Author

michae2 commented Jul 17, 2024

Thanks for the reviews!

bors r=dt,rytaft

@craig craig bot merged commit 743b721 into cockroachdb:master Jul 17, 2024
22 checks passed
Copy link

blathers-crl bot commented Jul 17, 2024

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 5bfbead to blathers/backport-release-23.1-126970: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.


error creating merge commit from 5bfbead to blathers/backport-release-23.2-126970: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.2.x failed. See errors above.


error creating merge commit from 5bfbead to blathers/backport-release-24.1-126970: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 24.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1 backport-23.2.x Flags PRs that need to be backported to 23.2. backport-24.1.x Flags PRs that need to be backported to 24.1. backport-24.2.x Flags PRs that need to be backported to 24.2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants