TBiB Q4/2006
Exercise: CGI Interface to Clustal W
In this exercise, we write a CGI-script interface to the Clustal W
module we wrote in the exercises in
week 14.
Motivation
The module wrapping Clustal W that we wrote earlier provides a nice interface between scripts and Clustal W, but not necessarily a nicer user interface. In this week's exercise we write a CGI-based user interface for our module. The interface will let the user upload a file containing the sequences to align, and then display the alignment.
The exercise this week is somewhat shorter than the exercises for the last two weeks. This is to let you catch up with the exercises, if you are behind, before taking on the second mandatory project, where you will need the code you wrote for this and the previous two weeks. Make good use of the extra time you have this week!
The Basic Interface
At the very minimum, a CGI interface to Clustal W should let the user upload sequences and be shown a multiple alignment. The easiest way of uploading sequences is through a text area, so:
EXERCISE CGI-CW.1: Write a CGI script that lets the user provide a list of sequences in a text-area input form, and then displays the multiple alignment of the sequences as calculated by Clustal W.
For displaying the multiple alignment, you can simply print it between <pre> and </pre> tags, for instance like this:
<pre>
foobar1 AGGTTGTATACTATC
foobar3 AGGTTGT--ACTATC
foobar2 AGGTTGTTTACTATC
foobar4 CGGTTGT--ACTATC
         ******  ******
</pre>
which is displayed like this:
foobar1 AGGTTGTATACTATC
foobar3 AGGTTGT--ACTATC
foobar2 AGGTTGTTTACTATC
foobar4 CGGTTGT--ACTATC
         ******  ******
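Sequence names and residues come from the user, so characters such as < and & should be escaped before the alignment is placed inside the <pre> block. The following is only a minimal sketch of that step, assuming Python 3 (where the escaping function is html.escape; in Python 2 the corresponding call is cgi.escape). The function name alignment_page and the default title are hypothetical, not part of the course module.

```python
import html


def alignment_page(alignment_text, title="Clustal W alignment"):
    """Return a complete HTML page showing alignment_text in a <pre> block."""
    # Escape both values so user-supplied text cannot inject markup.
    safe_title = html.escape(title)
    safe_alignment = html.escape(alignment_text)
    return (
        "<html>\n"
        "<head><title>%s</title></head>\n"
        "<body>\n"
        "<pre>\n%s\n</pre>\n"
        "</body>\n"
        "</html>" % (safe_title, safe_alignment)
    )


if __name__ == "__main__":
    # A CGI script would print the header, a blank line, then the page.
    print("Content-type: text/html")
    print()
    print(alignment_page("foobar1 AGGTTGTATACTATC\nfoobar3 AGGTTGT--ACTATC"))
```

Keeping the page construction in a function that returns a string, rather than printing directly, makes it easy to reuse the same code when the page is later written to a file by a background job.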
If the user has the sequences in a file, it should not be necessary to copy and paste the sequences from the file; the user should be able to provide the file directly.
EXERCISE CGI-CW.2: Extend the CGI-script so the user can provide the sequences from a file.
While We're Waiting...
Generating the multiple alignment can take some time, and while this goes on, the user will not know whether anything is happening. We should show a page saying that the alignment will arrive shortly.
What we want is to immediately return an HTML page saying that we are working on the problem, and then start a background process that generates the "real" HTML page. When the real page has been generated, we want to redirect the user to that page.
The redirecting can be done using the "Refresh" HTTP header. With it, you specify a number of seconds to wait, and then a URL to go to afterwards. Thus, we can show the temporary page using code like this:
print 'Refresh: 5; URL = url-to-redirect-to'
print "Content-type: text/html"
print
print """
<html>
<title>Just a sec...</title>
<body>
<h1>Generating the page, please wait</h1>
</body>
</html>"""
After printing the temporary page, we need to do the real work. We need to do this in the background, and after we close the stdout of the script (otherwise the temporary page will not be displayed until after the background job is done).
You start a job as a background job, whether another script or a "real" program, by suffixing the command with an ampersand (&). You close the stdout of the background process by redirecting it with >/dev/null 2>&1. Thus, to start a background process and terminate the CGI-script, you need code like this:
import os
os.system("(the real command) >/dev/null 2>&1 &")
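The same effect can be had without building a shell command string, which avoids quoting problems when file names are pasted into the command. What follows is a hedged sketch only, assuming Python 3 and its standard subprocess module; the function name start_background and its parameters are hypothetical. The child's stdout goes to the output file, while stdin and stderr are connected to the null device, so nothing keeps the CGI script's own stdout open.

```python
import subprocess


def start_background(argv, output_path):
    """Start argv without waiting for it, sending its stdout to output_path."""
    with open(output_path, "wb") as out:
        # The child inherits the open file; closing our copy afterwards is safe.
        return subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=out,
            stderr=subprocess.DEVNULL,
            close_fds=True,
            start_new_session=True,  # detach from the caller's session (POSIX)
        )
```

A caller would pass the command as a list, for instance start_background(["some-program", "input-file"], "/tmp/page.html") with placeholder names, and then simply exit; the returned object only needs to be kept if the caller wants to wait for the job.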
A script (waiting.py) that waits for a command to generate an HTML page can, in its complete form, look like this:
#!/usr/local/bin/python
import cgi
import os
import sys

form = cgi.FieldStorage()
if "realpage" in form:
    fname = form.getfirst("realpage")
    try:
        # if the page is ready...
        f = open(fname)
        # show the real page
        print "Content-type: text/html"
        print
        print f.read()
        f.close()
        # and remove temporary file
        os.unlink(fname)
        sys.exit(0) # all done
    except IOError:
        # if the page isn't ready yet
        pass
else:
    # realpage wasn't provided, assume first call so start process
    fname = os.tempnam("/tmp")
    # the shell creates the redirect target at once, so write to a ".part"
    # file and rename it when done; a half-written page is then never served
    os.system("(background-process > "+fname+".part && mv "+fname+".part "+fname+") >/dev/null 2>&1 &")

# print "waiting" page (first call, or page not ready yet)
print 'Refresh: 5; URL = http://domain/waiting.py?realpage='+str(fname)
print "Content-type: text/html"
print
print """
<html>
<title>Just a sec...</title>
<body>
<h1>Generating the page, please wait</h1>
</body>
</html>"""
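A shell redirection creates its target file as soon as the command starts, so a poller that only tests whether the file can be opened may pick up a page that is still being written. One common guard is to write the page under a temporary name and rename it into place once it is complete. The sketch below illustrates that idea from the side of the job that generates the page. It assumes Python 3 and uses only the standard os and tempfile modules; the helper names reserve_page_name, publish_page and read_if_ready are hypothetical and not part of the course code.

```python
import os
import tempfile


def reserve_page_name(directory=None):
    """Return a fresh path for the real page without creating the page itself."""
    # mkdtemp creates a private directory, so the path inside it is unused.
    workdir = tempfile.mkdtemp(prefix="cw-", dir=directory)
    return os.path.join(workdir, "page.html")


def publish_page(page_text, final_path):
    """Write page_text next to final_path, then rename it into place."""
    directory = os.path.dirname(final_path) or "."
    fd, partial = tempfile.mkstemp(dir=directory, suffix=".part")
    with os.fdopen(fd, "w") as out:
        out.write(page_text)
    # The rename happens in one step, so readers see no page or a whole page.
    os.replace(partial, final_path)


def read_if_ready(final_path):
    """Return the finished page, or None if it has not been published yet."""
    try:
        with open(final_path) as page:
            return page.read()
    except IOError:
        return None
```

With these helpers the polling side only ever calls read_if_ready, and a return value of None means "show the waiting page again".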
The waiting.py script checks whether the name of the real page was provided (which it will be if the background process has been started), and if so, checks whether the page can be read. If it can, the script reads the page, prints it to the web client, and removes it from the server file system. If not, it waits some more. If the real page has not got a name, we create one (using os.tempnam; see the on-line help), start the background process that will write its output, as an HTML page, under the new real page name, and tell the web client to wait for it.
EXERCISE CGI-CW.3: Modify your Clustal W script to use this interaction pattern.
Extensions to the Interface
Just for fun, we will extend the interface a bit. (The exercises in this section are not much more complicated than the exercises in the previous section, but since they are just for fun, they are all bonus exercises.)
Colouring the Alignment
EXERCISE CGI-CW.4: Above, we printed the generated alignment using plain text in <pre> and </pre> tags. Wouldn't it look better if we highlighted matches, mismatches, and gaps using colours? We could write the alignment shown above like this:
foobar1 AGGTTGTATACTATC
foobar3 AGGTTGT--ACTATC
foobar2 AGGTTGTTTACTATC
foobar4 CGGTTGT--ACTATC
which in HTML looks like this:
foobar1 <b style="color:blue">A</b><b style="color:green">GGTTGT</b><b style="color:red">AT</b><b style="color:green">ACTATC</b>
foobar3 <b style="color:blue">A</b><b style="color:green">GGTTGT</b><b style="color:red">--</b><b style="color:green">ACTATC</b>
foobar2 <b style="color:blue">A</b><b style="color:green">GGTTGT</b><b style="color:red">TT</b><b style="color:green">ACTATC</b>
foobar4 <b style="color:blue">C</b><b style="color:green">GGTTGT</b><b style="color:red">--</b><b style="color:blue">ACTATC</b>
EXERCISE CGI-CW.5*: Add this formatting of the output to your script.
Multiple Sequence Submissions
We have written the script such that the input is given as a list of sequences, either in a text area or in a file. We could also allow the user to provide sequences one at a time, and then keep track of the list ourselves until the user submits the alignment calculation. One way of doing this is to keep track of the state of the session in a temporary file on the server side and store a session id in a hidden form field in the generated HTML pages.
EXERCISE CGI-CW.6*: Enhance the script in this way.
Summary
We have written a CGI-script interface to our Clustal W module. This provides a better user interface for creating multiple alignments than the script module. We will now combine the NCBI searching module from last week's exercise with the Clustal W module and interface in the second mandatory project.