In a shell, the user can run programs and also redirect input from a file and output to a file. Bash allows users to enter commands and give instructions to the operating system.
- Bash is the shell, or command language interpreter, for the GNU operating system.
At first sight, Bash appears to be a simple command/response system, where users enter commands and Bash returns the results after those commands are run. However, Bash is also a programming platform: users can write shell scripts that accept input and produce output using shell commands.
Bash displays a "prompt" for the user to enter a command. Typically, this prompt contains the user name, computer name, and working directory.
The user enters commands at this prompt. These commands perform system-related operations (file management, program execution, network operations, etc.).
Bash interprets commands entered by the user. This means checking if the command is a defined program, processing the parameters if necessary, and executing the command.
It passes the interpreted command to the operating system through system calls and the operating system executes the command. For example, the ls command, a file listing command, requests the operating system to list files.
After the command is executed, Bash checks whether the command is completed successfully. If an error occurs, it displays error messages. It also reports the exit status of the command (0 if successful, or a different value if failed).
It keeps a history of commands entered by the user and allows the user to access this history and rerun previous commands. This is done with the arrow keys or the history command.
Bash also allows users to create shell scripts that contain a sequence of commands. These scripts are used to automate specific tasks by stringing together sequential commands.
These basic steps of Bash provide a user-friendly command-line environment and come with a wide range of commands. Users can customize Bash and there is extensive documentation and community support.
Essentially, we break the process down into 4 steps:
- lexer: accepts the raw string input of the user and converts it to tokens.
- expander: accepts a list of tokens and replaces placeholders with their values.
- parser: accepts a list of tokens and converts them to commands.
- executor: accepts a list of commands and runs them.
When we wrote this project, we were inspired by state machines.
typedef void *(*t_lexer_state)(t_token **lexer_data, char *input, int *const i);
/* a function pointer that returns void* and accepts 3 parameters. */
typedef void *(*t_parser_state)(t_token **lexer_data, t_command *command);
/* a function pointer that returns void* and accepts 2 parameters. */
The basic building blocks of a state machine are states and transitions. A state is a situation of a system depending on previous inputs and causes a reaction on following inputs. One state is marked as the initial state; this is where the execution of the machine starts. A state transition defines for which input a state is changed from one to another. Depending on the state machine type, states and/or transitions produce outputs.
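The idea can be illustrated with a minimal sketch in which each state is a function that consumes one character and returns the next state. This is not the project's lexer: the names t_state, t_state_fn, in_space, in_word and count_words are hypothetical, and the next state is wrapped in a small struct instead of being returned as void * (as the typedefs above do) so that the example stays within ISO C.

```c
#include <stddef.h>

/* Hypothetical names for illustration only. */
struct s_state;
typedef struct s_state (*t_state_fn)(char c, int *words);
typedef struct s_state { t_state_fn next; } t_state;

static t_state in_space(char c, int *words);
static t_state in_word(char c, int *words);

/* Initial state: between words. A non-blank character starts a new word. */
static t_state in_space(char c, int *words)
{
    t_state s;

    if (c == ' ' || c == '\t')
        s.next = in_space;
    else
    {
        (*words)++;
        s.next = in_word;
    }
    return (s);
}

/* Inside a word: stay here until a blank is seen. */
static t_state in_word(char c, int *words)
{
    t_state s;

    (void)words;
    s.next = (c == ' ' || c == '\t') ? in_space : in_word;
    return (s);
}

/* Drives the machine: each transition is decided by the current state. */
int count_words(const char *input)
{
    t_state state = { in_space };
    int     words = 0;
    size_t  i;

    for (i = 0; input[i] != '\0'; i++)
        state = state.next(input[i], &words);
    return (words);
}
```

With this shape, adding a new state (for example "inside quotes") means adding one function rather than growing a central switch statement.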
This minishell project uses 10 different types of tokens. The crucial point is understanding definitions of redirections and words.
Words: Words are the minimal string parts of the user input. A word can be quoted or unquoted, and it can be contiguous with another word.
- echo has one word.
- ec"ho" has two words.
- e'c'h"o" has four words.

Redirections: Redirections are all unquoted >, >>, << and < characters. That's all.
typedef enum e_token_type {
DOUBLE_QUOTED_WORD, SINGLE_QUOTED_WORD, UNQUOTED_WORD, PIPE, OUTPUT_REDIRECTION, INPUT_REDIRECTION, HEREDOC_REDIRECTION, APPEND_REDIRECTION, DELIMITER, UNKNOWN
} t_token_type;
Note: The 'Unknown' token type is reserved for future purposes.
typedef struct s_token
{
char *value;
t_token_type type;
struct s_token *next;
} t_token;
Here are some example user inputs and lexer outputs.
Input: echo a
Tokens: UNQUOTED_WORD DELIMITER UNQUOTED_WORD
echo <spaces> a
Input: echo\ a
Tokens: UNQUOTED_WORD
echo a
Input: 'echo' "a"
Tokens: SINGLE_QUOTED_WORD DELIMITER DOUBLE_QUOTED_WORD
echo <space> a
Input: e'c'"ho" a
Tokens: UNQUOTED_WORD SINGLE_QUOTED_WORD DOUBLE_QUOTED_WORD DELIMITER UNQUOTED_WORD
e c ho <space> a
Input: echo < a
Tokens: UNQUOTED_WORD DELIMITER INPUT_REDIRECTION DELIMITER UNQUOTED_WORD
echo <space> < <space> a
Input: echo << a
Tokens: UNQUOTED_WORD DELIMITER HEREDOC_REDIRECTION DELIMITER UNQUOTED_WORD
echo <space> << <space> a
Input: echo > a
Tokens: UNQUOTED_WORD DELIMITER OUTPUT_REDIRECTION DELIMITER UNQUOTED_WORD
echo <space> > <space> a
Input: echo >> a
Tokens: UNQUOTED_WORD DELIMITER APPEND_REDIRECTION DELIMITER UNQUOTED_WORD
echo <space> >> <space> a
Input: ls | cat
Tokens: UNQUOTED_WORD DELIMITER PIPE DELIMITER UNQUOTED_WORD
ls <space> | <space> cat
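The classification shown in these examples can be sketched as follows. This is a simplified illustration, not the project's lexer: it only records token types (no values, no linked list), the helper names is_blank, is_meta, next_token and lex_types are hypothetical, and it assumes that an unterminated quote simply runs to the end of the input.

```c
#include <stddef.h>

typedef enum e_token_type {
    DOUBLE_QUOTED_WORD, SINGLE_QUOTED_WORD, UNQUOTED_WORD, PIPE,
    OUTPUT_REDIRECTION, INPUT_REDIRECTION, HEREDOC_REDIRECTION,
    APPEND_REDIRECTION, DELIMITER, UNKNOWN
} t_token_type;

static int is_blank(char c)
{
    return (c == ' ' || c == '\t');
}

/* Characters that end an unquoted word. */
static int is_meta(char c)
{
    return (c == '\0' || is_blank(c) || c == '|' || c == '<' || c == '>'
        || c == '\'' || c == '"');
}

/* Classifies the token starting at input[*i] and advances *i past it. */
static t_token_type next_token(const char *input, size_t *i)
{
    char c = input[*i];
    int  doubled;

    if (is_blank(c))
    {
        while (is_blank(input[*i]))
            (*i)++;
        return (DELIMITER);
    }
    if (c == '\'' || c == '"')
    {
        (*i)++;
        while (input[*i] != '\0' && input[*i] != c)
            (*i)++;
        if (input[*i] == c)
            (*i)++; /* consume the closing quote */
        return (c == '\'' ? SINGLE_QUOTED_WORD : DOUBLE_QUOTED_WORD);
    }
    if (c == '|')
    {
        (*i)++;
        return (PIPE);
    }
    if (c == '<' || c == '>')
    {
        doubled = (input[*i + 1] == c);
        *i += doubled ? 2 : 1;
        if (c == '<')
            return (doubled ? HEREDOC_REDIRECTION : INPUT_REDIRECTION);
        return (doubled ? APPEND_REDIRECTION : OUTPUT_REDIRECTION);
    }
    while (!is_meta(input[*i]))
    {
        /* A backslash keeps the next character inside the word. */
        if (input[*i] == '\\' && input[*i + 1] != '\0')
            (*i)++;
        (*i)++;
    }
    return (UNQUOTED_WORD);
}

/* Writes up to max token types into out and returns how many were written. */
size_t lex_types(const char *input, t_token_type *out, size_t max)
{
    size_t i = 0;
    size_t n = 0;

    while (input[i] != '\0' && n < max)
        out[n++] = next_token(input, &i);
    return (n);
}
```

Note that a run of several spaces produces a single DELIMITER token, and that quotes inside an unquoted run start a new token, which is why e'c'"ho" yields three word tokens.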
- Parses symbols and tokens from the lexer to understand the structure of commands.
- Creates the syntax tree. This tree represents the structure of the command in a hierarchical way.
- For example, it converts the command "ls -l" into a tree structure.
typedef struct s_redirection
{
char *redirected;
int redir_fd;
int flags;
} t_redirection;
typedef struct s_command
{
char **args;
int output;
int input;
int pid;
t_redirection *redirections;
struct s_command *next;
struct s_command *prev;
} t_command;
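One step the parser has to perform is joining contiguous word tokens into a single argument, so that e'c'"ho" becomes the argument echo. The sketch below shows that step only, under simplifying assumptions: it fills a fixed-size array instead of the args field of t_command, it stops at the first pipe or redirection, and the names build_args, is_word and MS_ARG_LEN are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef enum e_token_type {
    DOUBLE_QUOTED_WORD, SINGLE_QUOTED_WORD, UNQUOTED_WORD, PIPE,
    OUTPUT_REDIRECTION, INPUT_REDIRECTION, HEREDOC_REDIRECTION,
    APPEND_REDIRECTION, DELIMITER, UNKNOWN
} t_token_type;

typedef struct s_token
{
    char            *value;
    t_token_type    type;
    struct s_token  *next;
} t_token;

#define MS_ARG_LEN 256 /* hypothetical fixed argument size for the sketch */

static int is_word(t_token_type type)
{
    return (type == DOUBLE_QUOTED_WORD || type == SINGLE_QUOTED_WORD
        || type == UNQUOTED_WORD);
}

/* Joins contiguous word tokens into arguments; a DELIMITER ends the
 * current argument. Stops at the first pipe or redirection token.
 * Returns the number of arguments written. */
size_t build_args(const t_token *tok, char args[][MS_ARG_LEN], size_t max_args)
{
    size_t count = 0;
    int    in_arg = 0;

    for (; tok != NULL; tok = tok->next)
    {
        if (is_word(tok->type))
        {
            if (!in_arg)
            {
                if (count == max_args)
                    break ;
                args[count++][0] = '\0';
                in_arg = 1;
            }
            /* Append, truncating if the argument would overflow. */
            strncat(args[count - 1], tok->value,
                MS_ARG_LEN - 1 - strlen(args[count - 1]));
        }
        else if (tok->type == DELIMITER)
            in_arg = 0;
        else
            break ;
    }
    return (count);
}
```

A full parser would continue after the break, turning redirection tokens into t_redirection entries and starting a new t_command at each pipe.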
- Evaluates variables and replaces them with their values before running the command.
- For example, it replaces $HOME with the home directory.
- Also, the $? expression expands to the exit status of the last executed command.
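The expansion described above can be sketched for a single string. This is a hedged illustration rather than the project's expander: it ignores quoting rules, reads variables with getenv instead of the shell's own environment list, and the names expand_word and append are hypothetical. Variable names longer than the internal buffer are not handled.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Appends src to out without exceeding cap bytes; returns the new length. */
static size_t append(char *out, size_t len, size_t cap, const char *src)
{
    while (*src != '\0' && len + 1 < cap)
        out[len++] = *src++;
    out[len] = '\0';
    return (len);
}

/* Copies in to out, replacing $NAME with its value and $? with last_status.
 * Unset variables expand to nothing. Output is truncated to cap - 1 bytes. */
void expand_word(const char *in, int last_status, char *out, size_t cap)
{
    size_t len = 0;

    if (cap == 0)
        return ;
    out[0] = '\0';
    while (*in != '\0')
    {
        if (in[0] == '$' && in[1] == '?')
        {
            char num[16];

            snprintf(num, sizeof num, "%d", last_status);
            len = append(out, len, cap, num);
            in += 2;
        }
        else if (in[0] == '$'
            && (isalpha((unsigned char)in[1]) || in[1] == '_'))
        {
            char        name[256];
            size_t      n = 0;
            const char  *value;

            in++;
            while ((isalnum((unsigned char)*in) || *in == '_')
                && n + 1 < sizeof name)
                name[n++] = *in++;
            name[n] = '\0';
            value = getenv(name);
            if (value != NULL)
                len = append(out, len, cap, value);
        }
        else
        {
            char one[2] = { *in++, '\0' };

            len = append(out, len, cap, one);
        }
    }
}
```

In a real shell this step must also respect token types: a SINGLE_QUOTED_WORD is left untouched, while DOUBLE_QUOTED_WORD and UNQUOTED_WORD tokens are expanded.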
- Creates a new process using system calls such as fork and exec and executes the command specified in that process.
- Returns the result to the user.
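A minimal sketch of this fork/exec/wait cycle is shown below. It is an assumption-laden illustration, not the project's executor: run_command is a hypothetical name, and execvp is used so that the PATH lookup is done by libc, whereas a minishell may well resolve the path itself and call execve.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv in a child process and returns its exit status,
 * 128 + signal number if it was killed by a signal, or -1 on error. */
int run_command(char *const argv[])
{
    pid_t pid;
    int   status;

    pid = fork();
    if (pid < 0)
        return (-1);
    if (pid == 0)
    {
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed (e.g. command not found) */
    }
    if (waitpid(pid, &status, 0) < 0)
        return (-1);
    if (WIFEXITED(status))
        return (WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return (128 + WTERMSIG(status));
    return (-1);
}
```

The value returned here is what a later $? expansion would report.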
| Command | Description |
|---|---|
| cd | Changes the current directory to the first argument provided, which can be a relative or absolute path. With no argument, changes to HOME; cd - changes to OLDPWD. PWD and OLDPWD are set accordingly. |
| echo | Displays a line of text. Optional flag -n: do not output the trailing newline. |
| env | Displays the environment variables. |
| exit | Exits the shell with the status given as argument, or with the current status if none is specified. The argument must be numeric, otherwise it is an error. |
| export | With an argument, it needs a valid identifier followed by an optional = and value; it creates an environment variable or changes the value of an existing one. With no argument, it prints the environment variables in a format different from env's output. |
| pwd | Shows the current directory as an absolute path. |
| unset | With a valid identifier as argument, it unsets/deletes the environment variable. Otherwise it shows an error. |
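As an illustration of the numeric-argument rule for exit, the sketch below validates an argument and reduces it to the 0..255 range that a process exit status can carry. The function name parse_exit_status is hypothetical, and accepting surrounding whitespace is an assumption made to mirror bash's behaviour for inputs such as " -42".

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Returns 0 and stores the status (0..255) if arg is a valid number,
 * or -1 if arg is not numeric or does not fit in a long long. */
int parse_exit_status(const char *arg, int *status)
{
    char        *end;
    long long   value;

    errno = 0;
    value = strtoll(arg, &end, 10); /* skips leading whitespace itself */
    if (end == arg || errno == ERANGE)
        return (-1);
    while (isspace((unsigned char)*end))
        end++;
    if (*end != '\0')
        return (-1);
    /* Only the low 8 bits survive, so 256 becomes 0 and -42 becomes 214. */
    *status = (int)(unsigned char)value;
    return (0);
}
```

With this rule, exit 256 reports status 0, exit -42 reports 214, and exit 42a is rejected as non-numeric.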
- All ctrl-\ signals are ignored by minishell.
- ctrl-C always sets a global variable to true, which stops the current processing and returns to readline.
- During readline, ctrl-C needs some extra function calls so that we get a new line, because readline doesn't return.
- The heredoc also has a special handler for readline.
- ctrl-\ isn't handled inside heredocs, but it should be; this is an oversight on our part.
In Bash, a "heredoc" (here document) is a construct used to feed a block of text or a series of commands into a specific process. Heredoc allows you to directly write text within a file or a script and pass this text as input to a command or operation.
command << eof
text
eof
Here, << eof denotes the beginning of the heredoc, and it is terminated with the label eof. The text or commands in between are taken until the line containing the specified label, and provided as input to the specified operation.
- Heredocs are commonly used to keep long text blocks or command sequences readable and manageable inside scripts.
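Collecting the heredoc body amounts to reading lines until one matches the delimiter exactly. The sketch below shows that loop under stated assumptions: it reads from a FILE * with fgets rather than from readline, the name read_heredoc is hypothetical, lines are assumed to be shorter than the 1024-byte buffer, and reaching end of input before the delimiter simply ends the heredoc.

```c
#include <stdio.h>
#include <string.h>

/* Stores every line read from in into out until a line equal to delim.
 * Returns the number of bytes stored, or -1 if out is too small. */
long read_heredoc(FILE *in, const char *delim, char *out, size_t cap)
{
    char    line[1024];
    size_t  len = 0;
    size_t  n;
    size_t  body;

    if (cap == 0)
        return (-1);
    out[0] = '\0';
    while (fgets(line, sizeof line, in) != NULL)
    {
        n = strlen(line);
        /* Compare without the trailing newline, if there is one. */
        body = (n > 0 && line[n - 1] == '\n') ? n - 1 : n;
        if (body == strlen(delim) && strncmp(line, delim, body) == 0)
            break ;
        if (len + n + 1 > cap)
            return (-1);
        memcpy(out + len, line, n + 1);
        len += n;
    }
    return ((long)len);
}
```

A line that merely contains the delimiter (for example "more eof") is part of the body; only a line consisting of the delimiter alone terminates it.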
In Bash, redirections are used to redirect standard input and standard output, or standard error streams of a command. They are primarily used to perform operations such as writing the output of a command to a file, reading from a file, or redirecting the input/output from/to another command.
- Output Redirection (>): This operator redirects the output of a command to a specified file, overwriting the file if it already exists.
- Append Redirection (>>): This operator appends the output of a command to a specified file, preserving the existing content of the file.
- Input Redirection (<): This operator redirects the content of a file to the input of a command.
  - For example, in cat < file, file is opened by the shell and assigned a file descriptor. cat's input is replaced with the file, so cat reads the file.
- Pipe (|): Redirects the output of the command on the left to the input of the command on the right.
  - For example, in echo a | cat, echo's output is replaced with the pipe by the shell. cat's input is replaced with the same pipe, so one writes to it and the other reads from it.
Programs do not know where they read from or write to. By default, every program writes to its standard output (file descriptor 1) and reads from its standard input (file descriptor 0), but the shell changes the files those descriptors point to before the program starts.
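A minimal sketch of how a shell can swap a child's standard output for a file, covering both > and >>. It is an illustration under assumptions, not the project's code: run_redirected is a hypothetical name, the file mode 0644 is an arbitrary choice, and execvp stands in for whatever exec call the shell uses. A pipe works the same way, except that the descriptor passed to dup2 comes from pipe() instead of open().

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv with stdout redirected to path.
 * append == 0 behaves like >, append != 0 behaves like >>.
 * Returns the command's exit status, or -1 on error. */
int run_redirected(char *const argv[], const char *path, int append)
{
    int     flags;
    int     fd;
    int     status;
    pid_t   pid;

    flags = O_WRONLY | O_CREAT | (append ? O_APPEND : O_TRUNC);
    pid = fork();
    if (pid < 0)
        return (-1);
    if (pid == 0)
    {
        fd = open(path, flags, 0644);
        /* After dup2, descriptor 1 refers to the file. */
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0)
            _exit(1);
        if (fd != STDOUT_FILENO)
            close(fd);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return (-1);
    return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

The redirection is set up in the child, after fork and before exec, so the shell's own standard output is left untouched.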
cat | ls -l | wc -l
exit 21 42
exit | exit
exit -42
exit 42a
exit " -42"
cat file | cat << file
cat << file | cat << file
exit 256
echo facetint | cat << a << b << e
Bash's edge case tests. We never say that minishells must have these implementations.
ls | >> a < a > a << a cat
Should write heredoc input to file a.
export VAR=VAL
export VAR
env | grep VAR
The value of VAR should be VAL.
echo $0
Should print the name the shell was invoked with, i.e. argv[0] of your program (e.g. ./minishell).
env | grep SHLVL
This variable should be increased by 1 at init. If the inherited value is bigger than 999 bash sets it to 1 with a warning.
cd -
Should change to the previous directory (OLDPWD).
USER=abc
Sets the USER variable to abc. (Use the env and export commands for a surprise.)
'
Should wait for a closing ' character, like a heredoc.
export test="o '"
ech$test
Should print '.
ls missingfile > error.txt 2>&1
Should write ls: missingfile: No such file or directory to error.txt.
echo ~0
echo ~+
echo ~
The first two should print the current directory; echo ~ should print the home directory.
echo ~-
Should print the previous directory (OLDPWD).
cat < <(echo a)
Should print a.
export TEST_VAR=abc
echo ${TEST_VAR/a/b}
Should print bbc.
!-1
Should execute the last command.
export HISTSIZE=1000
Should set a limit to the history list.
echo $-
Shows bash's current option flags. We don't know how this should be implemented for a minishell.
echo !$
Should print the last argument of the last executed command.
echo !!
Should expand to the last executed command. Bash also prints the expanded command line before running it.
/bin/ech? hello
/bin/ech[lower] hello
/bin/ech*o hello
Should print hello.
echo "\\n"
echo \\n
echo "\n"
echo \n
Should print \n, \n, \n, and n. (None of them is a newline.)
echo abc # xyz klm
Should print abc.
- Clone the repository: git clone https://github.com/facetint/minishell.git
- Run cd minishell
- Run make
- Run the executable: ./minishell
- Install the Criterion framework.
- Run make test