Skip to content

Commit

Permalink
Shell injection hints3 (#599)
Browse files Browse the repository at this point in the history
* shell-injection.html: Improve hints further

Raw strings are recommended but not really *required* in this case,
so recommend them in hints, but don't *mandate* them in the answer.
Allow r'...' and r"..." (both are allowed in Python).

Be pickier about whitespace in a line but NOT in parentheses - in Python
a newline ENDS a statement if it's not bracketed.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>

* shell-injection.html: Be picky about Python indentation

Python is picky about indentation. We should be too, but
also provide clear hints if the indentation is wrong.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>

* shell-injection.html: Extend hints further

* Add more hints. In particular, help people trying to do
  the substitution.
* Allow newlines and CRs at the beginning of
  the text (since it's allowed in Python).

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>

* shell-injection.html: Extend a hint

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>

* shell-injection.html: Add a few more hints

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>

* shell-injection.html: minor improvements

Improve the text describing the PATTERN, and add
a hint about using an empty string as the second
parameter for re.sub.

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>

---------

Signed-off-by: David A. Wheeler <dwheeler@dwheeler.com>
  • Loading branch information
david-a-wheeler authored Aug 26, 2024
1 parent 6c937b1 commit 1b5842a
Showing 1 changed file with 191 additions and 52 deletions.
243 changes: 191 additions & 52 deletions docs/labs/shell-injection.html
Original file line number Diff line number Diff line change
Expand Up @@ -21,65 +21,142 @@
subprocess.run(["ls", "-l", clean_dir])
</script>

<!-- Full pattern of correct answer -->
<!-- Full pattern of correct answer.
In Python, newline and carriage return are whitespace but are *meaningful*
outside of (...). So we match specifically on space (\x20) instead.
This makes our patterns harder to read, unfortunately.
It's conventional to use raw strings in Python for regexes, so we allow
and encourage them, but we'll accept *not* using raw strings since they
don't add value in this situation.
-->
<script id="correct0" type="plain/text">
\s*
clean_dir = re . sub \( r'\[\^a-zA-Z0-9\]' , r?(''|"") , dir_to_list \)
[\n\r]*\x20\x20clean_dir\x20*=\x20*re\x20*\.\x20*sub\x20*\(
r?('\[\^a-zA-Z0-9\]'|"\[\^a-zA-Z0-9\]") ,
r?(''|"") , dir_to_list \)
\s*
</script>

<script id="correct1" type="plain/text">
\s*
subprocess . run \( \[ ('ls'|"ls") , ('-l'|"-l") , clean_dir \] \)
[\n\r]*\x20\x20subprocess\x20*\.\x20*run\x20*\(
\[ ('ls'|"ls") , ('-l'|"-l") , clean_dir \] \)
\s*
</script>

<script id="info" type="application/yaml">
---
hints:
- absent: |-
^[\n\r]*\x20\x20[^\x20]
text: >
Python is an indentation-sensitive language, so your indentation
must be consistent.
In this case, your line in the first section
must start with exactly 2 spaces followed by a non-space.
- absent: |-
^\x20\x20[^\x20]
index: 1
text: >
Python is an indentation-sensitive language, so your indentation
must be consistent.
In this case, your line in the second section
must start with exactly 2 spaces followed by
a non-space.
- absent: |-
re \. sub
text: >
Use re.sub(PATTERN, REPLACETEXT, dir_to_list) to
substitute anything that is
not an alphanumeric character (removing the rest).
examples:
-
- ' clean_dir = dir_to_list'
- ' subprocess.run(f"ls -l {dir_to_list}", shell=True)'
- absent: |-
clean_dir = re \. sub \(
present: |-
re \. sub
text: >
Use re.sub(OLDPATTERN, REPLACEPATTERN, dir_to_list) to
remove anything that is not an alphanumeric character (removing the rest).
You need to compute a new string using re.sub and assign the result
to `clean_dir`. Your first line should look like
`clean_dir = re.sub(PATTERN, REPLACETEXT, dir_to_list)`.
examples:
-
- ' clean_dir re.sub = dir_to_list'
- ' subprocess.run(f"ls -l {dir_to_list}", shell=True)'
- present: PATTERN
text: >
We use `PATTERN` as a placeholder (metavariable) for the
expression you should use. The answer won't actually say PATTERN
anywhere. Your first line should look like
`clean_dir = re.sub(PATTERN, REPLACETEXT, dir_to_list)`
but `PATTERN` is a string with the regex of the pattern of text you
want to replace, and `REPLACETEXT` is what you want to replace it with.
The PATTERN would probably look like `r'...'` where the `...`
is the regular expression matching what you want to eliminate.
examples:
-
- clean_dir = dir_to_list
- subprocess.run(f"ls -l {dir_to_list}", shell=True)
- " clean_dir = re.sub(PATTERN, PATTERN, dir_to_list)"
- ' subprocess.run(f"ls -l {dir_to_list}", shell=True)'
- absent: |-
re \. sub \( r
text: >
Python re.sub uses strings to indicate a regex pattern.
By convention these strings usually 'raw' strings, so they
have the form r'PATTERN'. So you should have something like
re.sub( r'...' ...).
By convention these strings are usually 'raw' strings, so they
have the form `r'PATTERN'`. We would recommend that you use raw strings,
in the pattern
`re.sub(r'...', ...)` even though raw strings don't make this
*specific* example easier.
- absent: |-
re \. sub \( r\'
re \. sub \( r['"]
text: >
Python re.sub uses strings to indicate a regex pattern.
By convention these strings usually 'raw' strings, so they
have the form r'PATTERN'. You have the "r" but not the following
single quote character (we will use single quotes here).
have the form `r'PATTERN'`. You have the "r" but not the following
single or double quote character.
- absent: |-
re \. sub \( r\'\[
re \. sub \( r?['"]\[
text: >
Use re.sub(r'[...]', ...) to
indicate that you want to replace every character matching a
certain pattern.
certain pattern. Note the square brackets in the regular expression.
Replace the `...` with the pattern of characters to be replaced.
examples:
-
- clean_dir = re.sub(r'', '', dir_to_list)
- subprocess.run(f"ls -l {dir_to_list}", shell=True)
- " clean_dir = re.sub(r'', '', dir_to_list)"
- ' subprocess.run(f"ls -l {dir_to_list}", shell=True)'
- absent: |-
re \. sub \( r\'\[\^
re \. sub \( r?['"]\[\^
text: >
Use re.sub(r'[^ALPHANUMERIC_PATTERN]', '', dir_to_list) to
indicate that you want to replace everything that is not
an alphanumeric character. Be sure to replace ALPHANUMERIC_PATTERN
an alphanumeric character. Notice the use of the caret symbol `^`;
we are replacing everything *not* matching a certain pattern.
It's okay to use a caret here, because we aren't validating input,
we are filtering (removing) all the input *not* permitted.
Be sure to replace ALPHANUMERIC_PATTERN
with a regular expression pattern that describes alphanumerics!
examples:
-
- clean_dir = re.sub(r'[a-zA-Z0-9]', '', dir_to_list)
- subprocess.run(f"ls -l {dir_to_list}", shell=True)
- " clean_dir = re.sub(r'[a-zA-Z0-9]', '', dir_to_list)"
- ' subprocess.run(f"ls -l {dir_to_list}", shell=True)'
- absent: |-
a-zA-Z0-9
text: >-
Use `a-zA-Z0-9` in your substitution pattern.
examples:
-
- " clean_dir = re.sub(r'[^]', '', dir_to_list)"
- ' subprocess.run(f"ls -l {dir_to_list}")'
- absent: |-
re \. sub \( r?('\[\^a-zA-Z0-9\]'|"\[\^a-zA-Z0-9\]") , r?(''|"")
text: >
The second parameter of `re.sub` should be an empty string, that is,
`''` or `""`. You are matching everything you don't want, and
replacing it with nothing at all.
examples:
-
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', NOWWHAT, dir_to_list)"
- ' subprocess.run(f"ls -l {dir_to_list}", shell=True)'
- absent: "subprocess.run"
index: 1
text: Use subprocess.run
Expand All @@ -90,8 +167,8 @@
Don't say `shell = True`; we don't want to unnecessarily run the shell.
examples:
-
- clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)
- subprocess.run(f"ls -l {dir_to_list}", shell=True)
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)"
- ' subprocess.run(f"ls -l {dir_to_list}", shell=True)'
- present: |-
f["']ls\s+-l
index: 1
Expand All @@ -101,59 +178,118 @@
string concatenation. In addition, you MUST avoid string concatenation
in this case. The shell uses spaces to separate arguments, but we're
trying to avoid using the shell when it's not needed.
You must instead provide the arguments as a list with
separate parameters, e.g.,
subprocess.run(["ls", "-l", clean_dir]) or similar.
You must instead provide the arguments as a list that contains
separate parameters. The parameters should be something like
"ls", "-l", clean_dir
examples:
-
- clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)
- subprocess.run(f"ls -l {dir_to_list}")
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)"
- ' subprocess.run(f"ls -l {dir_to_list}")'
- present: |-
["']ls\s+-l
index: 1
text: >-
The shell uses spaces to separate arguments, but we're
trying to avoid using the shell when it's not needed.
You must instead provide the arguments as a list with
separate parameters, e.g.,
subprocess.run(["ls", "-l", clean_dir]) or similar.
You must instead provide the arguments as a list that contains
separate parameters. The parameters should be something like
"ls", "-l", clean_dir
examples:
-
- clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)
- subprocess.run("ls -l {dir_to_list}")
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)"
- ' subprocess.run("ls -l {dir_to_list}")'
- present: |-
subprocess \. run \( [A-Za-z0-9"']
index: 1
text: >-
The `subprocess.run` function takes a LIST of parameters as its command.
This has the form `subprocess.run([P1, P2, ...])`. Use
subprocess.run(["ls", "-l", clean_dir]) or similar.
The `subprocess.run` function takes a LIST of parameters as its command,
not just comma-separated parameters.
This means you need something like
the form `subprocess.run([P1, P2, ...])`. Note the square brackets
inside the parentheses.
examples:
-
- clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)
- subprocess.run("ls", "-l", clean_dir)
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)"
- ' subprocess.run("ls", "-l", clean_dir)'
- present: |-
\{(dir_to_list|clean_dir)\}
index: 1
text: >-
You don't need {...} in your result. Use
subprocess.run(["ls", "-l", clean_dir]) or similar.
You don't need {...} in your result.
Each parameter should be free-standing, not an expression
concatenated within a formatted string.
examples:
-
- clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)
- subprocess.run(["ls", "-l", {dir_to_list}])
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)"
- ' subprocess.run(["ls", "-l", {dir_to_list}])'
- present: |-
dir_to_list\"
index: 1
text: >-
You have a double-quote after `dir_to_list`; you don't want that.
- present: |-
clean_dir\"
index: 1
text: >-
You have a double-quote after `clean_dir`; you don't want that.
- present: |-
dir_to_list
index: 1
text: >-
The parameter `dir_to_list` is what was provided, but you don't want
to use that. You want to use the cleaned value `clean_dir` instead. Use
to use that. You want to use the cleaned value `clean_dir` instead.
examples:
-
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)"
- ' subprocess.run(["ls", "-l", dir_to_list])'
- absent: |-
run \(.*\)
index: 1
text: >-
You need a pair of matching parentheses in the second section.
examples:
-
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', '', clean_dir)"
- ' subprocess.run(["ls", "-l", clean_dir]'
- absent: |-
run \( \[.*\]
index: 1
text: >-
You need a pair of matching square brackets in the second section.
examples:
-
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', '', clean_dir)"
- ' subprocess.run(["ls", "-l", clean_dir)'
- absent: |-
^ subprocess . run \( \[ ('ls'|"ls") , ('-l'|"-l") , clean_dir \] \) $
index: 1
text: >-
You are getting close. The `subprocess.run` line should look like
subprocess.run(["ls", "-l", clean_dir]) or similar.
examples:
-
- clean_dir = re.sub(r'[^a-zA-Z0-9]', '', dir_to_list)
- subprocess.run(["ls", "-l", dir_to_list])
definitions:
- " clean_dir = re.sub(r'[^a-zA-Z0-9]', '', clean_dir)"
- ' subprocess.run(["ls", "-l", clean_dir, foo])'
successes:
-
- " clean_dir = re . sub( '[^a-zA-Z0-9]' , '' , dir_to_list )"
- ' subprocess . run ( [ "ls" , "-l" , clean_dir ] )'
-
- "\r\n\n clean_dir = re . sub( r'[^a-zA-Z0-9]' , '' , dir_to_list )"
- ' subprocess . run ( [ "ls" , "-l" , clean_dir ] )'
failures:
# Newline in middle of assignment
-
- " clean_dir =\n re . sub( r'[^a-zA-Z0-9]' , '' , dir_to_list )"
- ' subprocess . run ( [ "ls" , "-l" , clean_dir ] )'
# Failure to indent
-
- "clean_dir = re . sub( r'[^a-zA-Z0-9]' , '' , dir_to_list )"
- 'subprocess . run ( [ "ls" , "-l" , clean_dir ] )'
# Indent too far
-
- " clean_dir = re . sub( r'[^a-zA-Z0-9]' , '' , dir_to_list )"
- ' subprocess . run ( [ "ls" , "-l" , clean_dir ] )'
</script>
</head>
<body>
Expand Down Expand Up @@ -241,10 +377,13 @@ <h2>Task Information</h2>
<p>
We can offer a few hints.
You'll need to modify a string to only contain alphanumerics.
Python's <tt>re.sub(OLDPATTERN, REPLACEPATTERN, ORIGINAL_VALUE)</tt>
returns the result of replacing <tt>OLDPATTERN</tt> with
<tt>REPLACEPATTERN</tt> in <tt>ORIGINAL_VALUE</tt>, where <tt>OLDPATTERN</tt>
is a regular expression. In Python regular expressions are usually expressed
Python's <tt>re.sub(PATTERN, REPLACETEXT, ORIGINAL_VALUE)</tt>
returns the result of substituting
<tt>PATTERN</tt> with
<tt>REPLACETEXT</tt> in <tt>ORIGINAL_VALUE</tt>, where <tt>PATTERN</tt>
is a regular expression written as a string.
By default <tt>re.sub</tt> substitutes all matches (that's what you want).
In Python regular expressions are usually expressed
using the "raw" string form, that is, using the <tt>r'...'</tt> syntax.
You need to stop invoking the shell in Python's
<tt>subprocess.run</tt>, including separating the arguments.
Expand Down

0 comments on commit 1b5842a

Please sign in to comment.