Support for multi-line commands. #293

Merged
merged 1 commit into from
May 13, 2017

Conversation

georgebrock
Collaborator

Line breaks are supported in similar places to sh(1). Line breaks can be escaped anywhere by ending a line with a \ character. Unescaped line breaks are also supported:

  • after logical operators (&& and ||),
  • within strings,
  • between commands wrapped in parentheses, and
  • between commands in subshells.

The Lexer has been expanded to insert a MISSING token in all situations where the input is known to be incomplete (i.e. when the input ends with an escape character, or the input ends in any of the places where an unescaped line break can be used).

After invoking the Lexer, the Interpreter checks the token stream for MISSING tokens. If it finds any, it requests another line of input from the current input strategy, appends it to the current input, and tries again.
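
As a rough illustration of that retry loop, here is a minimal sketch. It is not the code in this PR: `toy_tokens` and `read_complete_command` are hypothetical names, the "lexer" only counts parentheses and quotes, and the array of pending lines stands in for the input strategy.

```ruby
# Toy stand-in for the Lexer (hypothetical, not gitsh's real lexer).
# Returns token symbols, ending with :MISSING when the input is known
# to be incomplete.
def toy_tokens(input)
  depth = 0
  quote = nil
  input.each_char do |c|
    if quote
      quote = nil if c == quote          # closing quote ends the string
    elsif c == '"' || c == "'"
      quote = c                          # opening quote starts a string
    elsif c == '('
      depth += 1
    elsif c == ')'
      depth -= 1
    end
  end

  stripped = input.rstrip
  # Incomplete if a string or parenthesis is still open, or the input ends
  # with an escape character or a logical operator.
  incomplete = quote || depth > 0 || stripped.end_with?('\\', '&&', '||')

  tokens = []
  tokens << :WORD unless stripped.empty?
  tokens << :MISSING if incomplete
  tokens
end

# Keeps appending lines until the token stream has no :MISSING token.
# `more_lines` is an Array standing in for the input strategy.
def read_complete_command(first_line, more_lines)
  input = first_line
  while toy_tokens(input).include?(:MISSING)
    line = more_lines.shift
    break if line.nil?                   # ran out of input; caller decides what to do
    input = "#{input}\n#{line}"
  end
  input
end
```

For example, `read_complete_command('(:echo 1', [':echo 2)'])` keeps reading until the parenthesis is closed and then returns both lines joined by a newline.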

A new EOL token has been introduced to represent line breaks between commands. In the Parser it's treated exactly like the SEMICOLON token.

Lexical analysis of comments needed to be improved to allow for comments at the end of lines in a multi-line command. For example, the following input is valid:

(:echo 1 # comment
:echo 2)

It is semantically equivalent to:

(:echo 1; :echo 2)

To support this, the Lexer will now:

  • ignore whitespace before a comment's initial # character. This prevents extraneous SPACE tokens from being produced. A trailing SPACE token in a single-line command is fine, but it can cause problems in a multi-line command.

  • pop the :comment state without consuming the newline character at the end of a comment, allowing the default parsing rules to handle the newline, and produce an EOL token.
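
The two rules above can be sketched with a small `StringScanner`-based tokenizer. This is an illustration only, not the Lexer from this PR: `toy_comment_tokens` and the `LEFT_PAREN`, `RIGHT_PAREN`, and `WORD` token names are made up for the example, and it simplifies by treating every `#` as the start of a comment.

```ruby
require 'strscan'

# Toy tokenizer (hypothetical, not gitsh's lexer) showing why whitespace
# before '#' is skipped and why the newline after a comment is left alone.
def toy_comment_tokens(input)
  scanner = StringScanner.new(input)
  tokens = []
  until scanner.eos?
    if scanner.scan(/[ \t]*#[^\n]*/)
      # Comment plus any whitespace before it: emit nothing, and stop
      # before the newline so the next rule can turn it into :EOL.
    elsif scanner.scan(/\n/)
      tokens << :EOL
    elsif scanner.scan(/[ \t]+/)
      tokens << :SPACE
    elsif scanner.scan(/;/)
      tokens << :SEMICOLON
    elsif scanner.scan(/[()]/)
      tokens << (scanner.matched == '(' ? :LEFT_PAREN : :RIGHT_PAREN)
    else
      # Anything else is part of a word.
      scanner.scan(/[^ \t\n;()#]+/)
      tokens << :WORD
    end
  end
  tokens
end
```

With this sketch, `toy_comment_tokens("(:echo 1 # comment\n:echo 2)")` produces no `SPACE` token before the comment and an `EOL` token where the line ends, so the stream has the same shape as the one for `(:echo 1;:echo 2)` with `EOL` in place of `SEMICOLON`.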

@@ -48,6 +49,19 @@ def read_command
retry
end

def read_continuation
input = begin
line_editor.readline(CONTINUATION_PROMPT, true)
Collaborator Author

Passing true here results in each line of the command having its own entry in the history. GNU Bash combines multi-line commands into a single history entry. That's probably a worthwhile enhancement, but maybe it should be a separate PR.

Contributor

Could improve the code documentation with weird syntax like:

line_editor.readline(CONTINUATION_PROMPT, history_entry_per_line=true)

production(:subshell, 'SUBSHELL_START .program SUBSHELL_END') { |p| p }
production(:subshell) do
clause('SUBSHELL_START .program SUBSHELL_END') { |p| p }
end
Collaborator Author

This is an accidental change. I must have messed up a rebase somewhere.

@georgebrock georgebrock force-pushed the multi-line-commands branch from a5d9652 to 0e6255c Compare April 30, 2017 23:08
@georgebrock georgebrock force-pushed the multi-line-commands branch from 0e6255c to bab8382 Compare April 30, 2017 23:09
@georgebrock georgebrock changed the base branch from handle-bad-scripts to master April 30, 2017 23:16
@georgebrock georgebrock changed the base branch from master to handle-bad-scripts April 30, 2017 23:16
@georgebrock georgebrock changed the base branch from handle-bad-scripts to master April 30, 2017 23:16
Contributor

@sharplet sharplet left a comment

Looks good to me!

@@ -13,6 +13,7 @@ module Gitsh
module InputStrategies
class Interactive
BLANK_LINE_REGEX = /^\s*$/
CONTINUATION_PROMPT = '> '.freeze
Contributor

Can this make use of the PS2 environment variable? Is that standard?

Collaborator Author

We don't use PS1 for the main prompt (there's a git-config(1) variable called gitsh.prompt instead). If we were going to support customisation, I think we should follow that pattern instead.

Contributor

Works for me.

@georgebrock georgebrock force-pushed the multi-line-commands branch from bab8382 to 22362e9 Compare May 13, 2017 22:00
@georgebrock georgebrock merged commit 22362e9 into master May 13, 2017
@georgebrock georgebrock deleted the multi-line-commands branch May 13, 2017 22:07