*nix

The phrase *nix refers to the family of operating systems descended from or modeled on Unix (e.g. GNU/Linux, Mac OS X, BSD) that share a common philosophy for how a computer should work. Ideally these operating systems obey the POSIX standard, and most do at least partially.

Shell

A (for our purposes, text-based) user interface to an operating system. A shell allows you to run programs and see their output. Common shells include:

  • bash
  • tcsh
  • zsh
  • csh
  • fish

Terminal

A program (graphical or otherwise) that allows you to interact with a shell program. For example, on my Linux operating system, running gnome-terminal will open an instance of bash by default.

Getting started

To get started, you should open a terminal (which will launch your default shell, probably bash). Do this like you would any other program on your operating system of choice.

Be aware that Windows operating systems are not in the *nix family, and that the Windows console is not a *nix shell. If Windows is your main OS, I suggest either creating a virtual machine, dual booting, or at the very least installing a GNU/Linux-like environment (Cygwin, MSYS, etc.).

Running a program

To run a program in the shell, we simply enter the name of that program. For example, to run Python 3 we would write:

$ python3
Python 3.4.3+ (default, Oct 14 2015, 16:03:50)
[GCC 5.2.1 20151010] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

When we enter the name of a program to be run, the shell checks a list of directories for an executable matching that name. We can ask which program is being run with the which command.

$ which python3
/usr/bin/python3

This tells us the absolute path to the program we are running. This means we could instead run:

$ /usr/bin/python3
Python 3.4.3+ (default, Oct 14 2015, 16:03:50)
[GCC 5.2.1 20151010] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
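As a small aside, POSIX shells also offer the built-in command -v, which reports how the shell itself would resolve a name. The sketch below looks up sh only because it should exist on any *nix system; the variable name resolved is an arbitrary choice for this example, not something from the text above.

```shell
# Ask the shell how it would resolve the name "sh".
# command -v is a POSIX built-in; "resolved" is just an illustrative variable name.
resolved=$(command -v sh)
echo "sh resolves to: $resolved"

# Running the program by the path we were given skips the lookup by name.
"$resolved" -c 'echo "Hello from sh"'
```

On most systems this prints a path such as /bin/sh or /usr/bin/sh, but the exact location depends on your installation.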

The list of directories we search is found in the $PATH environment variable. We can list the contents of an environment variable with the echo program.

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:~/bin

We can see that $PATH contains a colon-delimited list of directories to search for a program to run. Note that the current directory, ., is not listed. If we created a program called myprogram in our home directory and tried to run it as just myprogram, we would get the following:

$ myprogram
myprogram: command not found

We would instead need to run

$ ~/myprogram
Hello!

Since we specified a path to the program, we bypass searching $PATH. Similarly, if we wanted to execute a program located in the current directory, we need to run

$ ./myprogram
Hello!

In general, . refers to the current directory, and .. refers to the parent directory.
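The lookup rules above can be tried out safely in a throwaway directory. This is a minimal sketch under a few assumptions: mktemp and chmod (neither covered in the text) are available, no other program named myprogram is already on your $PATH, and demo_dir is simply a placeholder name.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
# The names "demo_dir" and "myprogram" are placeholders for this sketch.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Create a tiny script called myprogram and mark it executable with chmod.
printf '#!/bin/sh\necho "Hello!"\n' > myprogram
chmod +x myprogram

# The bare name fails because $demo_dir is not listed in $PATH.
myprogram || echo "(the bare name was not found)"

# An explicit path, here relative to the current directory, skips the $PATH search.
./myprogram

# Prepending the directory to $PATH makes the bare name work in this shell session.
PATH="$demo_dir:$PATH"
myprogram
```

The change to $PATH lasts only for the current shell session; opening a new terminal restores the original value.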

Managing files

One of the most common tasks in the shell is to look around to find files, copy them, rename them, or delete them. We can see what is in our current directory with the ls program:

$ ls
Documents  Downloads  Music  Pictures

(This is a greatly abbreviated listing.) We can of course look in different directories:

$ ls Downloads
3300.tex  gurobi5.6.3_linux64.tar.gz

The cp program copies files, while the mv command moves or renames files. They both use the same syntax.

$ cp original-file copy-file
$ mv original-file new-file

The rm command deletes files.

$ rm new-file

Please use caution when running rm, especially with flags such as -r or -f, which cause rm to recursively delete sub-directories and to skip asking for confirmation, respectively.
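One low-risk way to get comfortable with these commands is to practice in a scratch directory. The sketch below assumes mktemp is available and uses scratch as an arbitrary variable name; the file names match the ones used above.

```shell
# Practice in a scratch directory created by mktemp, not among real files.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

echo "some text" > original-file   # create a file to experiment with
cp original-file copy-file         # now both files exist
mv original-file new-file          # original-file is gone; new-file has its contents
ls                                 # shows copy-file and new-file

rm new-file                        # deletes one file, with no undo
ls                                 # only copy-file remains

# Clean up: leave the directory, then remove it and everything inside it.
cd /
rm -r "$scratch"
```

When working on real files, rm -i asks for confirmation before each removal, which can be a useful habit while learning.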

Managing running processes

When you execute a program in a shell, it can run in one of two ways. If we run it as before, it runs as a foreground process, and we have to wait for it to finish. For example,

$ sleep 20

runs a program that simply waits 20 seconds, then terminates. Doing this prevents you from doing other things in the shell because the process is running in the foreground.

In contrast, it is possible to execute a program so that it runs as a background process. To do this, execute the program with a trailing &:

$ sleep 20 &
[1] 2938

The shell immediately prints a job number in brackets followed by a number identifying the process, called the PID, and then lets you do other things while the process runs in the background. Typing ps shows a selection of the processes currently running on the system.

$ ps
  PID TTY          TIME CMD
 2868 pts/2    00:00:01 bash
 2938 pts/2    00:00:00 sleep
 2942 pts/2    00:00:00 ps

We confirm that the PID for the process we just started is 2938. The ps command has a lot of functionality, which varies with the flags passed to it. Another way to see running processes is to execute top, which shows a continuously updated list of the processes using the most resources on your system.
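Background processes can also be tracked from a script. The following sketch relies on three standard shell features that are not covered above, so treat them as additions: the special variable $! (PID of the most recent background process), kill -0 (checks that a process exists without signaling it), and wait (blocks until a process finishes). The variable names pid and status are arbitrary.

```shell
# Start a short background process; the trailing & returns control immediately.
sleep 2 &

# $! holds the PID of the most recently started background process.
pid=$!
echo "started sleep with PID $pid"

# kill -0 sends no signal; it only checks that the process still exists.
if kill -0 "$pid" 2>/dev/null; then
    echo "still running"
fi

# wait blocks until that process finishes and returns its exit status.
wait "$pid"
status=$?
echo "sleep finished with exit status $status"
```

The PID printed will differ on every run, since the operating system assigns it.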

Input/output redirection

Each program has three ways to perform I/O: standard input, standard output, and standard error (abbreviated stdin, stdout, and stderr, respectively). Programs can read input from stdin and write output to either stdout or stderr. stdout is used for most output, while stderr is used to log errors. In a terminal, the text that shows up on your screen is from both stdout and stderr.

We can redirect stdout and stderr to files. To redirect stdout, we use >.

$ ls > list_of_files
$ cat list_of_files
Documents  Downloads  Music  Pictures

Using > will remove the old contents of the file before redirecting stdout to it. To append instead, use >>.

$ echo 'Hello' >> output
$ echo 'World!' >> output
$ cat output
Hello
World!
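The difference between the two operators is easiest to see side by side. This is a small sketch that assumes mktemp is available to create a scratch file; out and contents are arbitrary variable names.

```shell
out=$(mktemp)            # a scratch file; "out" is an arbitrary name

echo 'first'  > "$out"   # > empties the file, then writes
echo 'second' > "$out"   # the earlier contents are gone
cat "$out"               # the file now holds only the line "second"

echo 'third' >> "$out"   # >> appends to whatever is already there
cat "$out"               # the file now holds "second", then "third"

contents=$(cat "$out")   # keep a copy of the final contents
rm "$out"                # remove the scratch file
```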

To redirect stderr, we use 2>.

$ ls potato 2> err_message
$ cat err_message
ls: cannot access 'potato': No such file or directory

We can redirect both at the same time,

$ ls > file_list 2> err_msg

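or even redirect both to the same file. Order matters here: the 2>&1 must come after the > redirection, because the shell processes redirections from left to right.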

$ ls > list_of_files_and_errors 2>&1
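To see why the order matters, the sketch below runs a failing ls with the two redirections in both orders and compares what ends up in each file. The path /nonexistent-path-for-demo is a made-up name assumed not to exist, and the variable names are arbitrary.

```shell
log_a=$(mktemp)
log_b=$(mktemp)

# Usual order: stdout goes to the file first, then stderr is pointed at
# wherever stdout currently goes (the file). Both streams land in the file.
ls /nonexistent-path-for-demo > "$log_a" 2>&1 || true

# Reversed order: stderr is pointed at the current stdout (the terminal)
# before stdout is redirected, so the error message never reaches the file.
ls /nonexistent-path-for-demo 2>&1 > "$log_b" || true

captured_a=$(cat "$log_a")
captured_b=$(cat "$log_b")
echo "usual order captured:    $captured_a"
echo "reversed order captured: $captured_b"
rm "$log_a" "$log_b"
```

With the reversed order the error message still appears on the screen, but the file stays empty.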

We can also connect the stdout of one program to the stdin of the next with a pipe, written |. For example,

$ pwd | ls

is a contrived way to list the contents of the current directory: ls never reads its stdin, so the output of pwd is simply ignored. More useful would be something like

$ ls -1 | grep '\.py$' | grep -v '^test_' | wc -l

This lists all files in the current directory, one per line, then finds all lines that end in .py, then removes all lines that begin with test_, and then counts the number of lines remaining. This would be one way to count the number of Python files in the current directory that don’t begin with test_.
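A pipeline like this is easier to trust after trying it on a directory whose contents are known. The sketch below assumes mktemp and touch are available and invents a handful of file names; it writes the pattern as '\.py$' with the dot escaped, so that a name such as happy, which ends in py but not in .py, is not counted.

```shell
# Build a scratch directory with a known set of files (all names are made up).
work=$(mktemp -d)
cd "$work" || exit 1
touch main.py utils.py test_main.py notes.txt happy

# One name per line, keep names ending in ".py", drop names starting with
# "test_", then count the lines that are left.
count=$(ls -1 | grep '\.py$' | grep -v '^test_' | wc -l)
echo "non-test Python files: $count"
```

Here only main.py and utils.py survive both filters, so the count is 2.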

Another (more dangerous) command recursively removes all backup files made by Emacs under the current directory, which are marked by a trailing ~.

$ find . -name '*~' -print0 | xargs -0 rm
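Because the deletion cannot be undone, one cautious habit is to run find with -print alone first and check the list before piping anything to rm. The sketch below tries this on a scratch tree; the directory layout and file names are invented for illustration, and it assumes your find and xargs support -print0 and -0, as the GNU and BSD versions do.

```shell
# Scratch tree with two fake Emacs backups and one file that must survive.
tree=$(mktemp -d)
mkdir -p "$tree/sub"
touch "$tree/notes.txt" "$tree/notes.txt~" "$tree/sub/draft.tex~"

# Step 1 (dry run): only print what matches. Nothing is deleted yet.
find "$tree" -name '*~' -print

# Step 2: once the list looks right, run the real deletion.
find "$tree" -name '*~' -print0 | xargs -0 rm

# Confirm that no backups remain and the ordinary file is untouched.
remaining=$(find "$tree" -name '*~' -print)
ls "$tree"
```

The -print0 and -0 pair separates names with a null byte, so file names containing spaces or newlines are passed to rm intact.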

One more example of piping is reading output that is too long to fit on the screen. To do that, just pipe the output to less, as in

$ ls /usr/bin/ | less

Getting help

The man program displays the manual page for most programs (and C and C++ header files).

$ man git

shows a (perhaps not very useful) introduction to the git program. If you were trying to figure out how to use ls to print more information about files,

$ man ls

would teach you about the -1, -a, -l, and -h flags.

$ man ps

will give you all the possible ways of listing out current processes on your system.

$ man bash

will teach you that writing any non-trivial program in Bash is a nightmare.

$ man man

might technically be enough for you to learn everything on this page without reading it.