Exercise 1: Introduction to Linux and GnuPlot


Exercises and points:

Each exercise consists of a number of tasks that must be completed. You do this by answering a number of questions. Each answer is scored independently, with a positive score for a correct answer and a negative score for a wrong one. To pass the exercise you must both reach a given minimum score and answer a minimum number of questions. After each answer you are shown the points you gained, your current score, the number of questions you have tried and the requirements for passing the exercise. This setup makes sure you work through the important parts of the material and acquire a minimum of knowledge. Should you end up with too few points at the end of the exercise, contact your exercise teacher who, after making sure you know what went wrong, can reset the score to give you another try.

The questions are marked in light blue and are all answered using the web page. Finally, to pass the course all exercises must be passed.


Introduction to Linux:

This exercise is intended to give you an introduction to the Linux environment, with an emphasis on the interactive command language. All graphical user interfaces (GUIs) provide drag-and-drop facilities, where the user can perform a limited number of simple operations on directories and files with a few mouse clicks. The GUI has the advantage that it makes it very easy to start using a computer. Eventually you may reach a point where you want to carry out more complicated actions and to execute commands faster and more flexibly. To do this you have to use the Linux command language. It takes an effort to learn it, but the time you spend doing this will be paid back many times in the future. Therefore, use your exercise time this week to go through this exercise in detail and get acquainted with the different commands and the way in which you can obtain further information about them. All of the following exercises will take advantage of the material presented here! As there is no book on Linux included in the course, we here provide a link to a Danish introduction to Linux written by Carsten Petersen.

Following the Linux section there is a general introduction to GnuPlot. GnuPlot will be used as the visualisation tool for most of the programming exercises in this course. Therefore the more you learn now (or refresh your memory), the more time you can spend on the real thing -- learning to write programs in Fortran.



Command tool:

To start the exercise you need to open a command window, often referred to as an Xterm window. Depending on the setup of your GUI there may be a number of ways to do this. The one that should work in all cases is to click on the "menu" on the panel bar (often in the lower left corner of the screen). This opens a menu list; click on the System Tools entry to open yet another pull-down menu. Here you choose the Terminal program. In some cases there is already a Terminal program shown on the panel bar, and you can then just click on that one instead -- if not, you can add it yourself. The window that opens is a command tool window that responds to typed commands. Move the cursor into the window to highlight the outer frame. This shows that the window is ready to take your instructions. In many default GUI settings you have to left-click inside the frame to activate the window. When/if you need another command tool, just repeat the same procedure.



File structure:

Let's start exploring the Linux file system. Traditionally it is visualised as an inverted tree structure with the root (/) at the top of the tree. Spreading from this are a number of branches - directories - containing files and sub directories. An example of this is shown in the image to the right.


To start exploring the system you need to know a few commands:
Notice that Linux distinguishes between lower-case and upper-case letters, making c different from C. cd and CD are therefore not the same command -- generally all Linux commands are written in lower case. Each of these commands takes a number of options. To learn more about each of the commands use the man facility. This is simply used by typing:

    $ man command

at the prompt in the command tool. Use the arrow keys to scroll up and down in the text - use q to quit.

You can even check the features of man using man man. Start by checking the three commands above, then answer the following question. To do this and get consistent answers you need to execute the command on lynx. If you are not working on lynx, use the following command from a command shell to log on to lynx:

   $ ssh your_usrid@lynx.astro.ku.dk

where you replace "your_usrid" with your login name. We will return to this command later.

How many files, directories and links are there at /bin ? Credits: 2/-1


Using ls with the appropriate options you can make a long list that tells you more about the individual files and directories. Try for instance ls -l. This gives you a long listing with additional information about the individual entries -- use the man page to find out what the information tells you!
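The first character of each line in a long listing gives the type of the entry: - for an ordinary file, d for a directory and l for a link. As a minimal sketch of how that column can be used for counting, the following works on a scratch directory rather than on a system directory. The file names are made up for the illustration, and the syntax assumes a Bourne-type shell such as bash (variables are set differently in tcsh). The | symbol is explained in the section on pipes.

```shell
# Build a small scratch directory with known content (hypothetical names).
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt"
mkdir "$demo/sub"
ln -s a.txt "$demo/link_to_a"

# ls -l prints one line per entry; the first character is the type.
# grep -c counts the lines that start with the given character.
n_files=$(ls -l "$demo" | grep -c '^-')
n_dirs=$(ls -l "$demo" | grep -c '^d')
n_links=$(ls -l "$demo" | grep -c '^l')
echo "files: $n_files  directories: $n_dirs  links: $n_links"

rm -r "$demo"
```

The "total" line that ls -l prints first starts with a t and is therefore not counted.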

How many files are there at /bin ? Credits: 2/-1

How many links are there at / ? Credits: 2/-1




User directory:

Go back to your own home directory by doing cd without any argument. If you are a new user on the system you will only have one directory here, namely the mail directory. Your home directory is where you are going to build your own directory hierarchy, which will contain all directories and files for the exercises in this course. To prepare for this, you need to make a directory to contain all the sub directories and files you are going to work on. In your home directory, create a new directory called Dat_F. To do this use the command mkdir:

   $ mkdir Dat_F

Go into Dat_F and make another new directory called Exercise_1 and go into this directory. We are now going to make some exercises on file handling here to make you confident with several of the other commands that were introduced at the lecture.

Create an empty file by typing:

  $ touch test_file

touch is a command that can either be used to create an empty file or if used on existing files change the time at which they were last updated. Confirm that the file is there using ls -l, which shows something like this:

-rw-r--r-- 1 kg users 0 Jun 28 15:51 test_file

The first field (-rw-r--r--) shows the permissions on the file. The leading - marks an ordinary file. The next three characters (rw-) show that the owner of the file, kg, is allowed to read and write the file. The following three (r--) show that users belonging to the same group are allowed to read the file. Finally, the last three (r--) imply that all other users on the system are also allowed to read the file. This is the default permission on files on this system. (Other system administrators may have chosen a different setup where only the user is allowed to read the file.) You can easily change these permissions. Assume you have a file or directory that you don't want anyone else to access. Then you have to change the permission for group and other. For the group this is done in the following way:

  $ chmod g-r test_file

-rw----r-- 1 kg users 0 Jun 28 15:51 test_file

As can be seen, the read permission has now been removed for group. The same can be done for user and other, using u or o instead of g in the expression above. They may also be combined to change more than one flag at a time. The third entry "-" after w stands for execute; when it is set (x), the file contains some form of program or script that can be executed by typing the file's name at the prompt in the command tool. You will find that compiled Fortran programs have this permission.
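A short sketch of combining the letters u, g and o with + and - is given here. It works on a scratch file with a made-up name, assumes a Bourne-type shell such as bash, and sets the umask explicitly so that the starting permissions match the ones shown above.

```shell
# Work on a scratch file so nothing important changes (hypothetical name).
demo=$(mktemp -d)
umask 022                      # assumed default: new files get -rw-r--r--
touch "$demo/perm_test"

# Helper: print only the 10-character permission field of ls -l.
show() { ls -l "$demo/perm_test" | cut -c1-10; }

before=$(show)                 # -rw-r--r--
chmod go-r "$demo/perm_test"   # remove read for group and other at once
private=$(show)                # -rw-------
chmod u+x "$demo/perm_test"    # add execute for the owner
exec_owner=$(show)             # -rwx------
echo "$before -> $private -> $exec_owner"

rm -r "$demo"
```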

Which options would you give to change the permissions for the file to -rwxrwxrwx, assuming the order of rwx is kept? Credits: 2/-1

(There is more than one way to do this. If you have tried it on a file and it works, but you get a negative reply here, I may have missed that combination in the list of answers. Let me know and I will include it.)



Disk usage:

As there are many users on the system, each of us is limited in the amount of disk space we are allowed to use. This restriction is enforced automatically on the home directory, making sure the disk is not filled up by a single user's unfortunate actions. There are a number of commands for checking both global and local disk usage.

  $ df

gives a listing of all the file systems associated with the computer you are logged on to. Various options can be used to make the output more readable.
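As a minimal sketch, df can also be restricted to the file system that holds a given directory, and a single number can be picked out of its output. The options -P and -k used here are the portable ones; awk is a small text-processing tool that is not covered in this exercise, and the variable syntax assumes a Bourne-type shell such as bash.

```shell
# -P asks for the portable one-line-per-file-system format, -k for 1024-byte
# blocks; "." restricts the listing to the file system holding this directory.
df -P -k .

# Column 4 of the second line is the free space in kilobytes.
avail_kb=$(df -P -k . | awk 'NR == 2 { print $4 }')
echo "free space here: $avail_kb kB"
```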

Which option do you need to use to show the result in Megabytes (unit of 1000)? Credits: 2/-1

How big is the disk /fys that is the home disk for fysik accounts ? (in Megabytes) Credits: 2/-1

There are three different definitions of a Megabyte. To clarify the differences take a look at this webpage.
To see how much disk space you are using use the command quota.

   $ quota

quota shows both your present disk usage and how much disk space you are allowed to use. Are you close to your limit? I guess not! There are two important limits, named quota and limit. The quota tells how much disk space you are allowed to use without getting a warning from the system. When you pass it, you will be told - either the next time you log in or the next time you open a new command tool. You are then given a finite time to remove enough data to bring you below the quota again. The limit is more severe. If you reach it, you can't continue working before you have removed enough data to bring you below the quota. One effect of this is that you can't log in to the machine using a graphical terminal, as this process writes information to your home directory. As a student you are not given a very high quota, so as we proceed with the course, you may find that you have to clean out data that can easily be reproduced.

Assume you are close to your quota limit, and you need to clean up by deleting files. If you have a large directory tree, the previous commands don't tell you where you use most disk space. This may be achieved using the du command. Using specific options with du it is possible to get information about the distribution of disk usage in your files/directories below the present working directory.

  $ du ?? *

Which option do you need to see the disk usage in MB per directory/file? Credits: 2/-1
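To find out where the space goes, the output of du can be sorted by size. The following is a sketch on a scratch tree with made-up names; it reports kilobytes (-k) and assumes a Bourne-type shell such as bash. The tools sort, tail and awk are not covered in this exercise.

```shell
# Scratch tree with one large and one small sub directory (made-up names).
demo=$(mktemp -d)
mkdir "$demo/big" "$demo/small"
dd if=/dev/urandom of="$demo/big/data" bs=1024 count=2048 2>/dev/null   # 2 MB
dd if=/dev/urandom of="$demo/small/data" bs=1024 count=4 2>/dev/null    # 4 kB

# -s gives one summary line per argument, -k reports kilobytes;
# sort -n orders numerically so the largest entry comes last.
largest=$(cd "$demo" && du -sk * | sort -n | tail -n 1 | awk '{ print $2 }')
echo "largest entry: $largest"

rm -r "$demo"
```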




Wild cards:

At the end of the previous section we used a * to represent all entries in the directory when executing the du command. In Linux there are a number of possibilities to use wild cards. These can be used to substitute either one or more letters in the argument to a command. This makes it easy to select a number of files/directories to act on.
Notice that the wild card can also pick up unwanted files, so be careful when using them in destructive actions!

To try it out, do:

  $ touch test{1,2,3,4,5,6} log
  $ ls t* test?

Notice how the "{1,2,3,4,5,6}" syntax is used to create several files with very similar names. Now remove these files using the command rm.

How would you remove the six different test files using the fewest possible number of characters? Credits: 2/-1
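One way to stay safe with wild cards in destructive commands is to preview the match with echo first, since the shell expands the pattern in the same way for every command. The sketch below uses made-up file names in a scratch directory and assumes a Bourne-type shell such as bash.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch demo1 demo2 demo3 demo10 log

# echo expands the pattern exactly as rm would, but deletes nothing.
preview=$(echo demo?)      # ? matches exactly one character
echo "would remove: $preview"

rm demo?                   # now remove the previewed files
remaining=$(echo *)        # * matches any number of characters
echo "left behind: $remaining"

cd - > /dev/null && rm -r "$demo"
```

Note that demo10 survives, because ? stands for exactly one character.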




Process handling:

It is useful to be able to see which processes run on the computer, and especially which processes you have running. There are different ways to do this; the most used is the command ps. Typing

  $ ps

simply gives a listing of your own processes started from this command tool, giving something like this:

  PID TTY          TIME CMD
21847 pts/6    00:00:00 tcsh
24483 pts/6    00:00:00 ps

The PID is the process identification number; such a number is given to each process that runs on the computer. TIME indicates how much CPU time the process has used. There are many options to ps; the most used is possibly

  $ ps -ef

which gives a long listing of all processes:

   UID        PID  PPID  C STIME TTY          TIME CMD
   root         1     0  0 Jun02 ?        00:00:01 init [5]
   root         2     1  0 Jun02 ?        00:00:00 [migration/0]
   root         3     1  0 Jun02 ?        00:00:00 [ksoftirqd/0]
   root         4     1  0 Jun02 ?        00:00:00 [migration/1]
   .....
   kg       21845 21843  0 12:53 ?        00:00:00 sshd: kg@pts/6
   kg       21847 21845  0 12:53 pts/6    00:00:00 -tcsh
   root     24044  2721  0 17:55 ?        00:00:00 sshd: romeo [priv]
   romeo    24046 24044  0 17:55 ?        00:00:00 sshd: romeo@pts/7
   romeo    24047 24046  0 17:55 pts/7    00:00:00 -tcsh
   kg       24544 21847  0 18:45 pts/6    00:00:00 ps -ef
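The PID and PPID columns can be explored with a process of your own. This is a minimal sketch, assuming a Bourne-type shell such as bash: it starts a harmless sleep in the background, looks it up by PID and stops it again.

```shell
# Start a harmless background process to have something to look at.
sleep 30 &
pid=$!                          # $! holds the PID of the last background job

# -p selects one process by PID, -o chooses the columns to print.
ps -p "$pid" -o pid,ppid,comm

# The parent PID (PPID) of the sleep is the shell that started it.
# The trailing = in "ppid=" suppresses the header line.
parent=$(ps -p "$pid" -o ppid= | tr -d ' ')
echo "sleep has PID $pid and parent $parent"

kill "$pid"                     # stop the background process again
```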

You can use ps with one option to show all your processes on the machine.

Which option do you need to see all your processes? Credits: 2/-1


A different way to see the running processes is by using the top command:

$ top

This gives a listing of processes, with the most CPU demanding at the top. By default the list is updated every few seconds (the exact interval depends on the version of top).

How do you select only your own processes in top? (You can do it both inside top and with an option.) Credits: 2/-1

To get out of top you have to type q inside the window.



Editors:

In the following exercises you are going to make much use of an editor for writing codes etc. It is therefore important that you get familiar with one right from the start. On all Linux systems there are a number of different editors available:
The first five are simple full screen editors that are straightforward to use. Most of the functionality is provided through menus at the top of the editor window. vim is an older Unix editor, which is more complicated to use as it requires knowledge of a number of shortcuts. But once these are known, it is able to do much more sophisticated operations than the simpler editors. emacs represents another standard editor that in the latest versions can do almost everything you may need to do. The problem, as with vim, is that it takes some effort to become acquainted with all its functionality.

If you don't have a favorite editor already, I would suggest you start with one of the simpler editors. Try them out and see which one suits you best. These editors are found on most Linux distributions today.

To test the functionality of the editors, download this file to the disk (use shift when clicking on the link) and do the changes suggested below to get a feeling of how the different editors are handling various common editor operations.

Which editor did you choose? Credits: 2/-1



Pipe and redirect:

Until now we have only been dealing with moving files around and changing the permissions on them. There are situations where you need to take the content of one file and feed it into a program, which then does something depending on the input. One example could be to simply count the number of lines and words in files. Another could be a program that writes data to the screen, which you for some reason want to store in a file for later analysis. There is a simple way to handle such situations using a small set of symbols that control where data come from and go to. First, the command:

  $ cat file

writes the content of file to the display. Assume we want to get the content from file into another file called tmp. This may be obtained by the following command:

 $ cat file > tmp

where > grabs the output from cat and puts it into tmp. A more general approach to this is to use the command cp.

What would you write to do the same with cp? Credits: 2/-1

Just to read the content of a file without opening it in an editor you should try more file and possibly also the related tail.

The program wc counts the number of lines and words in its input. To count the lines of file above you could do

  $ cat file | wc

The symbol | implies that the output from the first command is used as input for the following command.

Additionally we may use >>, which implies that the content of file is added at the end of the existing file, tmp:

   $ cat file >> tmp

Similarly < can be used to redirect input into a program, for instance as shown here,

   $ wc < file > tmp

This line counts the number of lines and words in file and puts the result into tmp.
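The symbols >, >>, < and | can be tried out together on a small example. The sketch below uses a scratch directory and a three-line file made up for the purpose, and assumes a Bourne-type shell such as bash for the variable syntax.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

printf 'one\ntwo\nthree\n' > file      # > creates (or overwrites) file
cat file > tmp                         # tmp now holds 3 lines
cat file >> tmp                        # >> appends: tmp now holds 6 lines

lines_file=$(wc -l < file | tr -d ' ')         # < feeds file to wc
lines_tmp=$(cat tmp | wc -l | tr -d ' ')       # | feeds cat's output to wc
echo "file: $lines_file lines, tmp: $lines_tmp lines"

cd - > /dev/null && rm -r "$demo"
```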


Working environment:

In your work directory there are a number of hidden files, meaning that they do not show up in a normal listing by ls. The names of these files all begin with a "." and can be seen using ls -a. Typing this command, you notice that all directories contain a "." and a ".." entry. These relate to the present directory (.) and the directory one above (..). So by using cd ., you stay where you are, while cd .. implies that you go one directory level up.

In your home directory there are many .name files. These are used by various programs to save temporary information. But there are also two files that influence what happens when you first log in and each time a new process is started. These are the .login and .cshrc files. If you want your environment to behave in a particular way, then this is where you define it. The only thing done there right now is to source the appropriate standard files that are active for all users, and maybe something particular from a previous course.



Connecting to a different computer:


You are now, most likely, logged in on one of the SUNray terminals at the RF building. But you may also have a user account on other machines (a few years back you would have had individual accounts at Physics, Geophysics and Astronomy) or need access to a different machine in the same network to do something special. There exist various protocols that allow you to log in from one machine to another. A general concern here is the security of the protocol that handles this connection. Today there is only one mechanism that allows you to do this in a secure way, namely ssh, which is short for secure shell. To log in to HCØ from a different machine (the one you have at home or lynx) you will have to use this command from a command shell:

   $ ssh -Y usrid@fys.ku.dk

where usrid is your username on the machine. You are prompted for your password before you are allowed onto the machine. After you have been accepted, you can work on this machine as if you were sitting at it. You can for instance start the mailtool and the window will appear on your screen. The latter is only possible when using the "-Y" (or "-X") option, as it allows X graphics to be passed between the two machines.

You may have data on one computer that you need in connection with other data on this computer or vice versa. There is a similar secure protocol, called scp, that handles file transfer:

   $ scp my_file usrid@fys.ku.dk:my_second_copy

which copies my_file from the file system on the machine where the scp command is executed to the home directory on the fys system, where it is stored under the name my_second_copy.


Windows

Most of the items discussed above do not work directly on a Windows machine. You can find programs that allow you to log in to other machines using ssh (putty is one), but you need additional software to export windows from the Linux machine to your Windows machine. Some of the available packages are commercial, while others are free. One simple way to get your Windows machine to act as a Linux machine is to install the free Cygwin package. This is a Linux interface based on a command window tool and it even has a simple X interface that allows you to export windows from other machines. A totally different approach is to use VNC, which allows you to have a windows session running on the remote machine, while showing the full graphical display in a window on your local computer.


Introduction to GnuPlot:

In many contexts of physics today, ample data are provided for further analysis, whether from experiments, observations or numerical simulations. In all cases the simplest way to represent the data is to visualise them. This may be done with a large group of tools. Many of these are expensive to acquire, but there are also software packages that are available free of charge. Most Linux distributions build extensively on GNU products and were previously distributed freely - except possibly for a fee for putting the packages together and providing them on a CD-ROM. Despite its name, gnuplot is not formally part of the GNU project, but it is likewise freely distributed and it is supposed to run on all platforms.

Later in the course you will be using gnuplot to check the results of your numerical calculations. If the solution is known, then a single look at a graph can show if the program is doing what it is supposed to do.


Links to gnuplot exercises and examples:

As gnuplot is free to use there are a number of sites that provide both exercises and advice for using the package. Here are a few links that can be used to get more information about how to use the software.



Exercise in using gnuplot:

To get a feeling for the general usage of gnuplot you should start the exercise by looking through the introduction to gnuplot here: http://www.duke.edu/~hpgavin/gnuplot.html. This will give an introduction for 2D line plotting and the usage of different output devices. Following this you should browse through http://t16web.lanl.gov/Kawano/gnuplot/plot3d-e.html to get acquainted with 3D plotting. Notice that they use the word 3D plotting for representing a 2D scalar dataset as a 3D surface. Gnuplot is not an appropriate tool to use for real 3D data sets. For such tasks you will have to find different visualisation tools which is something we will not discuss here. This is part of the Computer Physics course in Block 8.

To start gnuplot, you simply type gnuplot in a command tool window. To finish the session you type exit.


Getting data:

For some of the exercises below we need some additional data. These can be obtained by clicking here. This is a compressed tar file that contains the data files. Save the file in your Exercise_1 directory. To unpack the data do the following:

   $ cd ~/Dat_F/Exercise_1
   $ tar -zxvf gp_data.tgz
   $ rm gp_data.tgz
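Before unpacking an archive it can be useful to list what it contains, which is done by replacing x (extract) with t (list). The sketch below builds a tiny archive of its own in a scratch directory, so the file names are made up and say nothing about the content of gp_data.tgz. A Bourne-type shell such as bash is assumed.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Build a tiny archive to experiment with (hypothetical file names).
mkdir data
echo "1 2 3" > data/sample.dat
tar -czf demo.tgz data      # c = create, z = gzip, f = archive file name
rm -r data

# t = list the contents without unpacking anything.
listing=$(tar -tzf demo.tgz)
echo "$listing"

tar -xzf demo.tgz           # x = extract
content=$(cat data/sample.dat)

cd - > /dev/null && rm -r "$demo"
```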





2D plotting:

After having looked through the tutorials given above, you should be able to answer the following questions:

Which command sets a logarithmic scale on the y-axis for all following plots? Credits: 2/-1


A file contains 3 columns: col1, col2, col3. For all lines in the file you want to plot col1 as the x-coordinate and (col2+5*col3**2)*cos(col1) as the y-coordinate.

Complete the command: plot filename using ... (avoid spaces) to plot the expression given above Credits: 2/-1


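A black body emits radiation with a wavelength dependent intensity given by Planck's law:

   I(lambda,T) = 2*pi*h*c**2/lambda**5 * 1/(exp(h*c/(lambda*k*T))-1)

where

k= 1.3806e-23 ; Boltzmann's constant J/K
h= 6.6262e-34 ; Planck's constant J s
c= 2.99792e8 ; Speed of light m/s.

or

   I(lambda,T) = c1/lambda**5 * 1/(exp(c2/(lambda*T))-1)

where

c1= 3.741e34 ; 2*pi*h*c**2 for wavelengths in Ångstrøm (intensity in W/m**3)
c2= 1.439e8 ; h*c/k in Å K
when the wavelength is measured in Ångstrøm (1e-10 m).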
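Before plotting, the form with c1 and c2 can be checked numerically. The sketch below assumes the intensity is c1/lambda^5 divided by (exp(c2/(lambda*T)) - 1) with the wavelength in Ångstrøm, and scans for the wavelength of maximum intensity at T = 5000 K, a temperature chosen only for this illustration. According to Wien's displacement law the peak should lie near 2.9e7/T Ångstrøm, that is close to 5800 Å. It uses awk for the arithmetic and assumes a Bourne-type shell such as bash.

```shell
# Assumed form: I(lambda, T) = c1 / lambda^5 / (exp(c2 / (lambda*T)) - 1),
# with lambda in Angstrom and the constants c1 and c2 quoted above.
T=5000     # temperature in K, chosen only for this illustration

peak=$(awk -v T="$T" 'BEGIN {
    c1 = 3.741e34; c2 = 1.439e8
    best = 0; peak = 0
    # Scan 100 to 20000 Angstrom in steps of 10 and remember the maximum.
    for (lam = 100; lam <= 20000; lam += 10) {
        I = c1 / lam^5 / (exp(c2 / (lam * T)) - 1)
        if (I > best) { best = I; peak = lam }
    }
    print peak
}')
echo "T = $T K: intensity peaks near $peak Angstrom"
```

A peak far away from the Wien estimate would point to a mistyped constant or a wavelength given in the wrong unit.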

Plot the function with different values of T in the interval 3000 to 15000 K.

Add a title, labels, a grid etc. and replot.

At which wavelength is the intensity largest for T=10000K? (380 / 2800 / 8550) Credits: 2/-1


Try set contour or set contour both. Also try set cntrparam levels 20.

At which temperature in the given interval is the flux from the black body largest? (4000 / 6000 / 8000 / 10000) Credits: 2/-1


Add a title, labels etc. to the graph. When it is finished, use an editor to make a script containing all definitions and settings and run the script in gnuplot by typing:

   >load "filename"

where "filename" represents the name of the script.
Remember to use reset to remove previous definitions and settings.

For writing reports or posting images on the web, you will need to save the plots from gnuplot in an appropriate format: postscript for LaTeX files and png for the web. This may be done with

set terminal postscript eps
set output "fig1.eps"
replot
set output                   #   close fig1.eps so the file is complete
set terminal x11

or

set terminal png             #   See this link for options to png
set output "Planck.png"
replot
set output                   #   close Planck.png so the file is complete
set terminal x11

You will be able to view the eps file using kghostview by typing:

   $ kghostview fig1.eps &

and the png file using:

   $ gimp Planck.png

or simply load it into the web browser using the Open file menu under File in the top bar.


To make this and the coming images visible for the checks, you have to make a soft link between your public_html directory and the Dat_F directory. To do this you use the following commands:

   $ cd ~/public_html
   $ ln -s ../Dat_F Dat_F
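What the soft link does can be tried out safely in a scratch directory first. In this sketch the directory layout mimics the one above, but everything lives under a temporary directory and the png file is only a text placeholder. A Bourne-type shell such as bash is assumed.

```shell
start=$(pwd)
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p Dat_F/Exercise_1 public_html      # stand-ins for the real directories
echo "plot" > Dat_F/Exercise_1/Planck.png  # placeholder, not a real image

cd public_html || exit 1
ln -s ../Dat_F Dat_F                       # same command as above

# ls -l shows a link as "name -> target"; test -L checks for a link.
ls -l Dat_F
if test -L Dat_F; then is_link=yes; else is_link=no; fi
via_link=$(cat Dat_F/Exercise_1/Planck.png)    # file reached through the link

cd "$start" && rm -r "$demo"
```

The target ../Dat_F is interpreted relative to the directory that holds the link, which is why the link is created from inside public_html.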


If you don't have a public_html in your home directory, you have to make this directory first before making the soft link (mkdir public_html). Check that it works by opening this address in your browser, with "usrid" replaced by your user name:
http://www.fys.ku.dk/~"usrid"/Dat_F/Exercise_1/Planck.png

I have made the png plot (Call it Planck.png) Credits: 10/-1




3D plotting:

You may have datasets of the type f(x,y) that you need to plot. You can store the data in files in two ways: either as an ASCII file, which you can read with an editor, or in binary format, which is more efficient for the computer. The structure of the two file types is different. For the ASCII file the format is as follows:
   <x0> <y0> <z0,0>
   <x0> <y1> <z0,1>
   <x0> <y2> <z0,2>
   : : :
   <x0> <yM> <z0,M>

   <x1> <y0> <z1,0>
   <x1> <y1> <z1,1>
   : : :
   <x1> <yM> <z1,M>

   : : :

   <xN> <yM> <zN,M>
where N and M are the two dimensions of the data set.

This file is simply plotted as

   >splot "ascii.dat" ....

Where .... represents all the options needed to make the relevant plot.
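A file in the ASCII layout can be produced with a few lines of awk. The sketch below writes the made-up function f(x,y) = x*y on a small 3 x 4 grid to a temporary file, with a blank line after each block of constant x as in the layout above. The function, the grid size and the file are illustrative assumptions, and a Bourne-type shell such as bash is assumed.

```shell
# Write f(x,y) = x*y on a 3 x 4 grid (x = 0..2, y = 0..3) in the block
# layout described above. The function and the file are made up.
out=$(mktemp)
awk 'BEGIN {
    for (i = 0; i <= 2; i++) {
        for (j = 0; j <= 3; j++)
            print i, j, i * j          # one "x y z" line per grid point
        if (i < 2) print ""            # blank line between x-blocks
    }
}' > "$out"

cat "$out"
n_points=$(grep -c . "$out")           # non-empty lines = grid points
n_blank=$(grep -c '^$' "$out")         # separators between blocks
rm "$out"
```

In a real Fortran program the same layout is obtained by writing an empty line each time the outer loop index changes.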

For the binary file format the structure is different. Here the data have to be written in the following way:

<N+1> <y0> <y1> <y2> ... <yN>
<x0> <z0,0> <z0,1> <z0,2> ... <z0,N>
<x1> <z1,0> <z1,1> <z1,2> ... <z1,N>
: : : : ... :
Notice that <N+1> has to be written as a floating point number and not as an integer, as one might have assumed.
The data are plotted by including the keyword binary directly after the file name:

   >splot "binary.dat" binary ....

You can have a go at the data in the two data files, 3D.data and 3D_binary.data, in your exercise directory. These two files contain the same data but, as you can see, the ascii file is 10 times larger than the binary. Part of this is due to the repetition of the data in the ascii file as you can see above. We will return to the important issue of data formats later in the course.

I have made a png plot of the 3D dataset and called it 3D.png Credits: 2/-1




Final task....

If you are really finished with the exercise, then please remove all the files that you do not need in the following exercises. I am thinking here of all the files you played with earlier and especially the two data files needed for the 3D gnuplot exercise -- the data files can always be acquired again later if you need them.

See you next week!


$Id: index.php,v 1.17 2009/09/15 09:23:16 kg Exp $ kg@astro.ku.dk