Chem 110L: Macromolecular Visualization Laboratory Exercise: Unix



Linux: Your work environment

The computers that you will use for this tutorial run either Ubuntu Linux or CentOS Linux as their operating system. Scientists often like to use Linux-based operating systems because of the versatility, power, and security this platform offers. If someone has written a scientific program, it most likely runs on Linux. If you want to write your own programs, for example to sort a spreadsheet of 70,000 molecules by their polarity (Excel versions earlier than 2007 are limited to 65,536 rows and would lose the last 4,464 molecules!), Linux comes with the necessary programming tools pre-installed. There is a large number of free, high-quality applications with effective user support. Ports or analogs of popular software, such as Office, Adobe Acrobat Reader, or multimedia players, work well under Linux. You can use your USB memory device as on any other computer. Linux is network-friendly (you can run graphical applications over the network), stable, and free.

The Unix Shell

This tutorial is not about learning Linux, especially because modern flavors of Linux come with intuitive graphical user interfaces for most tasks. However, you will need to know how to use the program called Terminal, or Unix Shell. You can think of the Unix Shell as a command interpreter. You can open a new Unix Shell by clicking on the black icon with the ">" sign in the top panel of the Desktop, or by selecting "Terminal" from the Accessories menu of the Applications Panel. The commands that you type into the Unix Shell are typically the names of the programs that you want to execute, optionally followed by command modifiers (called flags) and the names of files that you want to use with the program. There are four rules that you should remember about Unix Shells:

Programs on a Linux Computer: File System

There are hundreds or thousands of programs installed on a typical Linux-based computer. You do not need to know their names, except for a few. The first set of commands deals with navigating the file system. Linux has a file system, which contains files organized into folders, also called directories. You should know how to move around the file system and view the files in the current folder. The relevant commands are:

pwd
print working directory. Use this command to find out your current location. This is the first command to use when you do not see files that you are looking for.
cd
change directory. Typing this command without any arguments takes you to the course directory (/home/chem110L). If you are lost, you can always get back to your student directory by typing cd, hitting the Return key, and then typing cd PERM where PERM is your student PERM number. Another handy command is cd .. which takes you up one level in the directory tree.
ls
list files. Use this command to see the files in the current folder. If you want to see detailed information about the files, or just list the files in a column, use the -l (letter l, not number 1) flag as follows: ls -l
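
The three commands above can be tried safely in a throwaway folder. The following is only a sketch: the practice folder is created with mktemp, and the names structures, 1abc.pdb, and notes.txt are made up for illustration and are not part of the course directory.

```shell
# Create a temporary practice folder (hypothetical; not the course directory).
practice=$(mktemp -d)
mkdir "$practice/structures"
touch "$practice/structures/1abc.pdb" "$practice/notes.txt"

cd "$practice"        # move into the practice folder
pwd                   # print where you are now
ls                    # short listing: notes.txt and structures
ls -l                 # long listing, one entry per line with size and date

cd structures         # go down one level
ls -l                 # long listing of the file 1abc.pdb
cd ..                 # go back up one level
here=$(pwd)           # store the current location in a variable
echo "Back in: $here"
```

If a listing ever looks wrong, running pwd first usually shows that you are simply in a different folder than you expected.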

The file system can also be browsed quickly with a graphical File Browser, called Thunar on our Ubuntu Linux machines and Nautilus on the CentOS machines. You can launch a File Browser by clicking on the "Places" icon on the toolbar. You can copy, paste, delete, and rename files, or launch programs directly from the File Browser. Many file types are associated with programs capable of handling such files; for example, double-clicking on an image file opens it in an image viewer. When you right-click on a file name, a menu opens asking what you would like to do with this file.

Programs on a Linux Computer: Working with Files

The second set of commands starts programs that manipulate your files. As in other computer systems, Linux files can be either text files or binary files. Examples of text files are the HTML page you are reading, a PDB file containing coordinates of a macromolecule, or a sequence file containing the amino acid sequence of a peptide. Examples of binary files are PNG and JPEG images, the program firefox, and the compressed PDB files. It is important to make a distinction between text and binary files so that you use appropriate tools to open each type of file. Two general purpose programs for editing files are:

nedit
view/edit text files. On Linux systems, NEdit is the suggested editor. Bluefish, Kate, gedit, and jEdit are some other common programs that allow editing of text files. I recommend using bluefish if nedit fails for some reason. For more complex tasks, such as writing up lab reports, OpenOffice Writer can be used.
display
view/edit image files. Use this command to view or edit image files. It is a simple, yet powerful free image editor from the ImageMagick suite. You can access the menus from the side-bar that opens when you right-click on the image.
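
Before opening an unfamiliar file in a text editor, it can help to check whether it is a text file or a binary file. The sketch below is one possible way to do this, assuming the GNU grep that ships with most Linux systems; is_text is a hypothetical helper name, and the two sample files are made up for the demonstration.

```shell
# is_text is a hypothetical helper, not a standard command.
# grep -I treats a binary file as if it had no matching lines, so the
# search for "any character" succeeds only on a non-empty text file.
is_text() {
    grep -Iq . "$1"
}

workdir=$(mktemp -d)
printf 'MKTAYIAKQR\n' > "$workdir/peptide.seq"      # made-up amino acid sequence
printf 'PNG\000\001\002\n' > "$workdir/fake.bin"     # NUL bytes mark it as binary

for f in "$workdir/peptide.seq" "$workdir/fake.bin"; do
    if is_text "$f"; then
        echo "$(basename "$f"): text, safe to open in a text editor"
    else
        echo "$(basename "$f"): binary, use a program made for this format"
    fi
done
```

The check looks only at the bytes inside the file, so it does not depend on the file name or extension.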

Communication between Unix Programs

A powerful feature of the Unix shell is that you can combine several commands to accomplish a task. For example, imagine that you want to know how many structures there are in the Protein Data Bank today. You would need to connect to the data bank and get the file of interest (with wget), find the line where the number of structures is listed (with the grep command), look up today's date (with the date command), and print it all to the screen (using awk, for example). In Unix, you can pipe (the | symbol) results from one program to another! Try the following sequence of commands, all on one line, and be careful with spaces, semicolons, and quotation marks:
wget -qO - http://www.rcsb.org/pdb/home/home.do | grep 'Retrieve all structures' | awk 'BEGIN { FS = ">" } {printf("PDB has %s structures as of " ,substr($2,1,5));}'; date +%D
If it did not work (despite you typing everything correctly ☺ ), use your mouse to copy (drag over) and paste (go to the destination window, select it, and middle-click) that long line into the Unix shell. Make a mental note that your mouse is a powerful copy-paste tool in Linux. Also, Linux is Unicode aware, in case you'd like to write a second copy of your lab report in runic.
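
The same piping idea can be practiced without a network connection. The following is a minimal sketch under stated assumptions: demo.pdb is a made-up four-line file in a PDB-like layout, created in a temporary folder only for this exercise.

```shell
# Work in a temporary folder so that no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical PDB-like excerpt, written just for this demonstration.
printf '%s\n' \
    'HEADER    DEMO PEPTIDE' \
    'ATOM      1  N   GLY A   1' \
    'ATOM      2  CA  GLY A   1' \
    'HETATM    3  O   HOH A   2' > demo.pdb

# grep keeps only the lines that start with ATOM; the pipe hands them to awk,
# which counts them and prints a sentence; date then finishes the line.
grep '^ATOM' demo.pdb | awk 'END { printf("demo.pdb has %d ATOM records as of ", NR) }'; date +%D

# The result of a pipe can also be stored in a variable for later use.
atom_count=$(grep '^ATOM' demo.pdb | awk 'END { print NR }')
echo "Stored count: $atom_count"
```

Each program in the pipe does one small job, which makes it easy to test the stages one at a time: run the grep part alone first, then add the awk part.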

Finally, there are many programs in a typical Unix computer that perform various useful specialized tasks. For example, students at UC Santa Barbara will be using SYBYL for molecular modeling, PyMOL for effective visualization, GChemPaint for drawing chemical structures, and Mathematica for analyzing experimental data.



Tutorial by Dr. Kalju Kahn, Department of Chemistry and Biochemistry, UC Santa Barbara. ©2003-2010