Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

False positive - Incomplete string escaping #18379

Open
kuhe opened this issue Dec 29, 2024 · 4 comments
Open

False positive - Incomplete string escaping #18379

kuhe opened this issue Dec 29, 2024 · 4 comments

Comments

@kuhe
Copy link

kuhe commented Dec 29, 2024

Description of the false positive

Incomplete string escaping or encoding
This does not escape backslash characters in the input.
part = `"${part.replace(/"/g, '\\"')}"`;

This is intentional,

actual string: abc\" -> "abc\\""

JS:

'abc\\"'  '"abc\\\\""'

This code is used in a client library to serialize caller input as-is from list to header value.

const input = ["a", "b", `\\"`];
const serialized = `a,b,"\\\\""`;

Code samples or links to source code

https://github.com/smithy-lang/smithy-typescript/blob/d8446cfa3bf6cbcf1187d1ad744ac5a296e442d7/packages/smithy-client/src/quote-header.ts#L8

@smowton
Copy link
Contributor

smowton commented Dec 30, 2024

So just to be clear, it's acceptable that user input " will be escaped to \", which would not terminate any "-delimited string it was pasted into, but user input \" will escape to \\", leaving an unescaped quote which will terminate such a "-delimited string?

@kuhe
Copy link
Author

kuhe commented Dec 31, 2024

That is not what is happening. In this format, for \\", or in JS \\\\", only the backslash before the double quote is an escape instruction for the receiving side. The first backslash is a normal character that is part of the data.

Here is a code example showing that this transform is reversible without loss of information.

import assert from "node:assert";

function serialize(part) {
  if (part.includes(",") || part.includes('"')) {
    part = `"${part.replace(/"/g, '\\"')}"`;
  }
  return part;
}

function deserialize(part) {
  if (part[0] === '"' && part[part.length - 1] === '"') {
    return part.slice(1, -1).replace(/\\"/g, '"');
  }
  return part;
}

const tests = [
  `abcd`,
  `ab","cd`,
  `ab\\",\\"cd`,
  `\t\"\n\\"\r\\\"\n\\\\"`,
  ``,
  `"`,
  `\"`,
  `\\"`,
  `\\\"`,
  `""""\"\\"\\\"\\\\"""""`,
  ` " " " "\"\\"\\\"\\\\" " " " "`,
  `"\"\\"\\\"\\\\"`,
  `",\","\\"\\\,"\\,",\,\,",\,",\,\\\,""""`,
];

for (const test of tests) {
  assert(test === deserialize(serialize(test)));
  process.stdout.write(".");
}
console.log("ok");

@smowton
Copy link
Contributor

smowton commented Dec 31, 2024

I see. I don't think I understand how this could work -- backslash-escaping the quote to begin with seems as if it is designed for a parser seeking an unescaped end-quote to terminate a string, and if \ doesn't itself get escaped then passing string "\" to deserialization would seem like a problem, since a parser wouldn't know if it was looking at the string \ or a long string beginning " and continuing with whatever input comes next. However I'm sure there's some relevant context I don't know about.

I think this scenario, where " can be backslash-escaped but \\ is not itself an escape sequence for a backslash, is so unusual that the query can't reasonably anticipate it. In that case I would recommend dismissing the code scanning alert per https://docs.github.com/en/enterprise-cloud@latest/code-security/code-scanning/managing-code-scanning-alerts/resolving-code-scanning-alerts -- is that option available to you?

@kuhe
Copy link
Author

kuhe commented Dec 31, 2024

"\" becomes "\"\\"" in this serialization, which is

" - start quote wrapper
\" - escaped quote
\ - lone backslash
\" - another escaped quote
" - end quote wrapper

\ (single backslash) is serialized with no change.

Putting that aside as the specifics of this format may not be the main issue, I understand you can't compromise the detection for this case. I also don't want to do any sort of trick in my code to avoid detection by codeql as a false-positive.

My team does not directly use codeql and this came up as my software's downstream users are getting the alert. I will refer them to this recommendation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants