Process Management

In this lecture, we cover simple process management. Supplementary reading: Python How to Program, sections 18.3 and 18.4, pages 623-631.

Motivation

We do not want to write scripts for everything. Sometimes the performance of a script is not acceptable for our application; sometimes the functionality we need is already available in an existing program.

When performance is unacceptable, we can (re-)write the bottleneck algorithm in a compiled language and provide it to Python as either a Python module or a stand-alone program; when the functionality is already found in an existing program, the stand-alone program is the only choice.

Writing Python modules in languages other than Python is beyond the scope of this course, but those interested can read about it here:

In this lecture we will concentrate on communicating with external stand-alone programs.

A Simple Shell

As a running example we will consider a simple Unix shell.

A shell lets us type in and execute commands. In most cases, the commands are the names of programs--and executing the command means running the program--but in some cases the commands are for the shell itself.

A session with our shell could look like this (assuming the shell is implemented in the script "scripts/shell.py"):

[mailund@dhcp-11-23-11 processes]$ python shell.py
>>> echo "hello, world"
Running: echo "hello, world"
hello, world
>>> ls
Running: ls
process-communication.html  process-communication.html~  shell.py  shell.py~
>>> ls *.html
Running: ls *.html
process-communication.html
>>> quit
[mailund@dhcp-11-23-11 processes]$  

Here, I run the Python script and enter the simple shell. I then type in a few commands: "echo", "ls", "ls", and "quit". The first three commands are program names, and the programs are run when the command is executed, while the last command is a command for the shell itself that tells it to quit and return me to my outer Unix shell.

Executing commands

The primary task of the shell is to execute commands, whether shell-commands or external programs.

For calling external programs, we can use the function os.system(). This function takes a single argument, a string holding the command line to be executed (the program name followed by any arguments), and returns the exit status of the program. By convention, the exit status of a Unix program is zero for successful termination and non-zero for an error.
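As a minimal sketch of the return value described above, the following calls os.system() on the standard Unix commands "true" and "false". It is written for Python 3 (the print() function) and assumes a Unix-like system where both commands are available; the variable names are made up for the illustration.

```python
import os

# os.system() hands the whole string to the underlying shell and
# returns a status value; zero means the command succeeded.
ok_status = os.system("true")      # 'true' always succeeds
bad_status = os.system("false")    # 'false' always fails

print("true  ->", ok_status)
print("false ->", bad_status)
```

On Unix the non-zero value is an encoded wait status rather than the plain exit code, so it is safest to compare against zero rather than against a specific number.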

Using os.system() we can write our simple shell as follows:

import os
import sys

def execute(command):
    print 'Running:', command
    os.system(command)

def get_command():
    command = raw_input(">>> ")
    return command

if __name__ == '__main__':
    while 1:
        execute(get_command())  

Here, we simply loop forever, prompting for a command and executing it.

Running the program, we soon discover that we cannot easily quit the shell. We need to add the shell command "quit". We can do that in the execute() function:

def execute(command):
    if command == "quit":
        sys.exit(0)
    print 'Running:', command
    os.system(command)  

You can also use execute() to add a number of other shell commands. In my version, for instance, I have used it to add aliasing of commands, so I can dynamically rename commands.

Exercise PM.1: Extend the shell, such that it reports unsuccessful termination. Remember that a successful termination is indicated by os.system() returning 0.

Exercise PM.2: Extend the command syntax such that a sequence of ';'-separated commands can be read from the prompt. (In the current version this is already possible, but only because the underlying shell, at the other end of os.system(), handles it. In this exercise, you are supposed to handle it.)

Communicating with a Program

Say we want the shell to distinguish more clearly between the output of external programs and the input prompt, the commands, and the shell feedback. We want the output of external programs to be indented and displayed in a different colour than the other text.

Setting the colour of the text is fairly easy using ANSI terminal escape sequences. For instance, to set the text colour to dark red, write "<Esc>[31;2m" to the terminal (where <Esc> is the escape character; in emacs use "C-q ESC" to type it, or write "\033" inside a Python string). We can reset the output colour using "<Esc>[0m".

To print the output of external programs in dark red, we wrap the call to os.system() in a function, run_command():
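To make the escape sequences concrete, here is a small sketch that builds them from the octal escape "\033" instead of a literal <Esc> character. It is written for Python 3; the names ESC, DARK_RED, RESET and coloured() are hypothetical and only serve the illustration.

```python
import sys

ESC = "\033"                    # the escape character (octal 33)
DARK_RED = ESC + "[31;2m"       # the sequence quoted in the text
RESET = ESC + "[0m"             # back to the terminal's default colour

def coloured(text):
    """Wrap text in the dark-red and reset sequences (hypothetical helper)."""
    return DARK_RED + text + RESET

sys.stdout.write(coloured("this should appear in dark red") + "\n")
```

How the colour is rendered depends on the terminal; some terminals ignore the "2" (dim) attribute and show plain red.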

def run_command(command):
    print 'Running:', command

    # set output colour:
    sys.stdout.write("<Esc>[31;2m") ; sys.stdout.flush()

    os.system(command)

    # reset output colour
    sys.stdout.write("<Esc>[0m")  

(Here we need to flush the stdout file to make sure that the escape code is written to the terminal before the output of the program.)

A session with the shell now looks like this:

[mailund@dhcp-11-23-11 processes]$ python shell2.py
>>> ls
Running: ls
process-management.html   shell1-exit.py   shell1.py  shell2.py~  shell.py~
process-management.html~  shell1-exit.py~  shell2.py  shell.py
>>> alias ll=ls -l
alias: ll = ls -l
>>> ll
Running: ls -l
total 44
-rw-rw-r--    1 mailund  mailund      7091 Sep  3 16:21 process-management.html
-rw-rw-r--    1 mailund  mailund      5569 Sep  3 15:34 process-management.html~
-rw-rw-r--    1 mailund  mailund      1012 Sep  3 15:01 shell1-exit.py
-rw-rw-r--    1 mailund  mailund       942 Sep  3 15:01 shell1-exit.py~
-rw-rw-r--    1 mailund  mailund       942 Sep  3 14:59 shell1.py
-rw-rw-r--    1 mailund  mailund      1078 Sep  3 16:17 shell2.py
-rw-rw-r--    1 mailund  mailund       942 Sep  3 16:03 shell2.py~
-rw-rw-r--    1 mailund  mailund       942 Sep  3 12:50 shell.py
-rw-rw-r--    1 mailund  mailund      1022 Sep  3 11:54 shell.py~
>>> quit
[mailund@dhcp-11-23-11 processes]$  

Notice that only the output of the external calls is coloured red.

Changing the colour of the output was only half the job; we also need to indent the output. To do this, we need actual access to the output of the program; we can no longer just let the program write it to the terminal.

To get access to the output of the program, we can use os.popen() instead of os.system(). os.popen(), like os.system(), executes an external program, but it also provides a file-object that gives access to either the running program's stdin or its stdout.

os.popen() takes two arguments. The first argument is the command to run; the second determines whether the file-object returned is connected to the program's stdin (if the second argument is "w") or to the program's stdout (if the second argument is "r").

For our purposes, we need the output of the external program, so we are interested in its stdout. We therefore call the program using os.popen(<program>, "r"):
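The two modes can be sketched as follows. The example is written for Python 3, where os.popen() is still available, and assumes a Unix-like system with "echo" and "cat"; the scratch file and all variable names are made up for the illustration.

```python
import os
import tempfile

# Mode "r": read what the program writes to its stdout.
f = os.popen("echo hello", "r")
output = f.read()
f.close()

# Mode "w": write to the program's stdin. Here 'cat' copies its input
# to a scratch file (a made-up path for this sketch) so we can inspect it.
scratch = os.path.join(tempfile.mkdtemp(), "copy.txt")
g = os.popen("cat > " + scratch, "w")
g.write("sent to stdin\n")
g.close()                      # closing sends EOF and waits for cat

with open(scratch) as copy:
    copied = copy.read()
print(repr(output), repr(copied))
```

In "w" mode the program's own stdout still goes to the terminal, which is why this sketch lets the underlying shell redirect it to a file.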

def run_command(command):
    print 'Running:', command

    f = os.popen(command, "r")

    # set output colour:
    sys.stdout.write("<Esc>[31;2m") ; sys.stdout.flush()
    for l in f.xreadlines():
        print '  ', l,
    # reset output colour
    sys.stdout.write("<Esc>[0m")  

Here we read all the output of the program and write it out, line by line, prefixing each line with a little extra space. The output looks like this:

[mailund@dhcp-11-23-11 processes]$ python shell2.py
>>> ls
Running: ls
   process-management.html
   process-management.html~
   shell1-exit.py
   shell1-exit.py~
   shell1.py
   shell2.py
   shell2.py~
   shell.py
   shell.py~
>>> quit  

You can download my version of this shell here.

When executing a program using os.system() we immediately get the exit status. With os.popen() we instead get the stdin/stdout file. To get the exit status of the program, we need the value returned when closing this file object; a return value of None or 0 here means successful termination, otherwise the program terminated abnormally.
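A minimal sketch of reading the status from close(), assuming Python 3 on a Unix-like system; run_and_status() is a hypothetical helper and not part of the lecture's shell.

```python
import os

def run_and_status(command):
    """Run command, discard its output, and return what close() reports.
    (Hypothetical helper, not part of the lecture's shell.)"""
    f = os.popen(command, "r")
    f.read()               # drain stdout so the program can finish
    return f.close()       # None (or 0) on success, non-zero otherwise

print(run_and_status("true"))
print(run_and_status("false"))
```

As with os.system(), the non-zero value is an encoded status on Unix, so the sketch only distinguishes "None or 0" from everything else.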

EXERCISE PM.3: Extend the shell program to report abnormal termination in the new version based on os.popen().

In a shell, we are used to being able to redirect stdin/stdout from and to files, using < and >. For instance, cat < file feeds file to cat as input, while echo "foo" > file writes the output of echo ("foo") into file.

Our shell also supports this, but only through the underlying shell at the end of os.system() and os.popen(). Using os.popen() we can support it directly in our shell.

EXERCISE PM.4: Extend the shell to support the <, > redirection syntax. In this exercise, use the old shell, to avoid confusing the use of os.popen() for redirection with its use for colouring the output of commands. For now, support either < or > in a command; don't worry about both in the same command. We will return to that shortly.

As already hinted at, we have a problem with redirecting both the input and the output of a command. With os.popen() we can get access to either stdin or stdout, but if we want to support commands such as sort < f > f-sorted we need both.

To get a file-object for both stdin and stdout of a program we need the function os.popen2(). Calling os.popen2() we get two file objects:

    cin, cout = os.popen2(cmd) 

We can then write to the running program using the cin object, and read from the program using the cout object:

    for l in input.xreadlines():
        cin.write(l)
    cin.close()

    for l in cout.xreadlines():
        output.write(l)
    cout.close() 

Be careful to close cin before reading from cout here! Some programs (e.g. sort) will not produce any output until they have read all their input, and they will not know that they have received all their input until they receive EOF, which is sent when you close cin. We had a similar problem when we obtained cin from os.popen(), but there we only risked delaying the output from the program; here there is a very real risk of a deadlock.
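os.popen2() belongs to Python 2 and is not available in Python 3. As a hedged sketch of the same write, close, then read pattern, the following uses subprocess.Popen as a stand-in; filter_through() is a hypothetical helper, and the example assumes a Unix-like system with "sort".

```python
import subprocess

def filter_through(command, text):
    """Feed text to command's stdin and return what it writes to stdout.
    (Hypothetical helper; subprocess stands in for os.popen2().)"""
    proc = subprocess.Popen(command, shell=True,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    cin, cout = proc.stdin, proc.stdout
    cin.write(text)
    cin.close()            # send EOF first, as discussed above
    result = cout.read()   # only now collect the output
    cout.close()
    proc.wait()
    return result

print(filter_through("sort", "pear\napple\nfig\n"))
```

For large inputs, or for programs that produce output while still reading, writing everything before reading anything can itself block; Popen.communicate() takes care of interleaving the two in that case.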

To see what happens, try moving cin.close() to after the loop reading from cout, and execute the command:

>>> sort < shell2.py > shell2-sorted 

EXERCISE PM.5*: Extend the shell to support "pipelines", sequences of commands cmd1 | cmd2 | cmd3 | ... | cmdn where the output of command cmdi is redirected to the input of cmd(i+1). If it is too much trouble supporting both < and > redirection at the same time, remove this functionality first and write a shell that only supports pipelines--you can always add the file redirection again later.

Summary

We have learnt how to execute external programs using os.system(), os.popen(), or os.popen2(), and we have learnt how to communicate with an external program through stdin/stdout file objects.

It is possible to achieve more powerful communication patterns with other programs, in more elaborate ways, but the three techniques here will cover most, if not all, of your needs.

With the techniques we have learned here, we are now ready to attack this week's exercises, concerning a module wrapping the clustalw tool.

Time-stamp: "2003-11-04 08:49:03 mailund"