Programming for Physicists: Exercise 6

This is the simplest IO format: we only have to specify the variables to be handled, and the program itself performs the IO accordingly. A few minor tasks follow.

Write a small program that defines three 1D arrays, say x, f1 and f2, of dimension 100.
Assign values to x going from 0 to pi, assign sin(x) to f1 and cos(x) to f2 (use single precision real arrays). Sin and cos are standard functions in Fortran that can be used in the following way: a=sin(x).
Write the three variables to a data file in a format that gnuplot can read (lines containing x(i), f1(i), f2(i)).
Check that you can plot the graphs (x,f1) and (x,f2) with gnuplot.
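
The steps above could be sketched as follows. This is only one possible layout, not the official solution: the file name columns.dat, the unit number and the way x is spaced are assumptions made for the illustration.

```fortran
program write_columns
  implicit none
  integer, parameter :: n = 100          ! array dimension from the exercise
  real, parameter :: pi = 3.1415927      ! single precision value of pi
  real :: x(n), f1(n), f2(n)
  integer :: i

  ! x runs from 0 to pi in n equally spaced points
  do i = 1, n
     x(i) = pi * real(i - 1) / real(n - 1)
  end do
  f1 = sin(x)    ! sin and cos act element-wise on whole arrays
  f2 = cos(x)

  ! One line per point: x(i), f1(i), f2(i)
  open(10, file='columns.dat')           ! hypothetical file name
  do i = 1, n
     write(10, *) x(i), f1(i), f2(i)
  end do
  close(10)
end program write_columns
```

With this layout the two curves can be drawn in gnuplot with: plot 'columns.dat' using 1:2 with lines, '' using 1:3 with lines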

In another case it may be an advantage to instead write the arrays one after the other in big blocks.

Make a copy of the previous program, this time saving the arrays one after the other in a different file.

Compare the data structure of the two files.
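
A minimal sketch of the block variant, assuming the same arrays as before and a hypothetical file name blocks.dat. Each write statement receives a whole array, and with the free format the compiler decides how many values go on each line.

```fortran
program write_blocks
  implicit none
  integer, parameter :: n = 100
  real, parameter :: pi = 3.1415927
  real :: x(n), f1(n), f2(n)
  integer :: i

  ! Fill x with an implied-do array constructor, then derive f1 and f2
  x  = (/ (pi * real(i - 1) / real(n - 1), i = 1, n) /)
  f1 = sin(x)
  f2 = cos(x)

  open(11, file='blocks.dat')   ! hypothetical file name
  write(11, *) x                ! all values of x first ...
  write(11, *) f1               ! ... then all of f1 ...
  write(11, *) f2               ! ... then all of f2
  close(11)
end program write_blocks
```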

Formatted IO makes it possible to write data in exactly the way we want. All it requires is an additional format statement that determines the layout of the output:
......
open(10,file='data.dat')
write(10,100) 'x=',x(i), f1(i),f2(i)
100 format(2x,a2,2x,f8.3,2x,e10.3,2x,g9.3)
......

In the format statement the different entries have different effects:

2x: gives two empty character spaces.
a2: reserves space for two characters.
f8.3: reserves space for a floating point number with three digits after the decimal point and a total width of eight characters, including the sign and the decimal point!
e10.3: writes the number in exponential form with three digits following the decimal point, in a field of ten characters.
g9.3: chooses the most appropriate of the two previous forms for the data, decimal or exponential.
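
To experiment with the descriptors without opening a file, the same format can be sent to standard output. The sketch below uses made-up sample values; only the format itself comes from the text above.

```fortran
program format_demo
  implicit none
  real :: x, f1, f2

  ! Arbitrary sample values, chosen only for illustration
  x  = 1.5
  f1 = 0.001234
  f2 = 123.4

  ! Same format as in the text, written to standard output (unit *)
  write(*, 100) 'x=', x, f1, f2
100 format(2x,a2,2x,f8.3,2x,e10.3,2x,g9.3)

  ! The format can also be given directly as a character string
  write(*, '(2x,a2,2x,f8.3,2x,e10.3,2x,g9.3)') 'x=', x, f1, f2
end program format_demo
```

Both write statements should produce the same line, which makes it easy to change one format and compare the result against the other.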

There are further formats to be used in connection with integer and complex variables; they are all described in the book.

Make a new program to handle formatted data IO. Use the few lines from above for writing the same data as in the previous program. Try to play with the numbers for the different formats and compare the output.

Can you get it to show strange characters in the data file? Yes / No. Credits: 2/-1

When does Fortran print strange characters in the data file? Credits: 2/-1

Notice that when reading this type of data from a file, the program always has to start from the top of the file to find the dataset you need. This is inconvenient when dealing with large datasets.

The ASCII data format has two disadvantages. The first was mentioned just above; the second is that ASCII files take up more disk space than needed. The ASCII format is therefore fine for small datasets, but as the amount of data increases it is much more convenient to use a binary format. Binary is the computer's internal representation of numbers, so no conversion to and from text is needed and it is much faster to read and write. To access an arbitrary line of data in an ASCII file, you have to read the file from the beginning all the way down to the entry you need. Binary data files can instead be accessed directly at a given record, but this comes with a different downside: you have to specify the size of the data blocks you read and write. The data must therefore have a minimum of uniformity.

In the ASCII case you could open a file and start reading from or writing to it simply by specifying the file name and a unit number. For binary files more information has to be provided.

open(10,file='test.dat',status='old',access='direct',form='unformatted',recl=1*nwords,err=100)
read(10,rec=1) a

status: refers to the "existence" of the file. The values used here are 'old', 'new' and 'unknown'.
access: refers to the way the file is accessed: 'direct' or 'sequential'. The binary files here are handled using direct access.
form: refers to the structure of the file. The binary file is always unformatted while the ASCII file is formatted.
recl: refers to the length of the data block (record) that is handled in the IO process. This is one of the places where compilers differ. nwords is how many numbers you are going to store in one record, and 1 is a scaling factor. The ifort compiler counts in "words", while other compilers count the bytes in a word multiplied by the number of words. In the latter case the scaling value has to be 4 for a single precision floating point number. gfortran uses 4!
err: gives a statement label to jump to in case there is an error when trying to open the file.
rec: defines the record that is to be read/written from the data file.
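
One way to sidestep the compiler-dependent scaling in recl is to let the compiler report the record length itself with inquire(iolength=...), a standard Fortran feature. The sketch below is an illustration under assumptions, not the prescribed solution: it uses status='replace' so that it runs even when test.dat does not exist yet, and the array contents are arbitrary.

```fortran
program direct_io
  implicit none
  integer, parameter :: nwords = 100
  real :: a(nwords), b(nwords)
  integer :: i, lrec

  do i = 1, nwords
     a(i) = real(i)
  end do

  ! Ask the compiler for the record length of a, in its own units
  inquire(iolength=lrec) a

  ! status='replace' creates the file (status='old' needs an existing file)
  open(10, file='test.dat', status='replace', access='direct', &
       form='unformatted', recl=lrec, err=100)
  write(10, rec=1) a      ! store the whole array as record 1
  read(10, rec=1) b       ! read it back into a second array
  close(10)

  print *, 'recl =', lrec, '  max difference =', maxval(abs(a - b))
  stop

100 print *, 'Could not open test.dat'
end program direct_io
```

Comparing the printed recl with nwords shows which scaling factor a given compiler uses.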

Write a small program that takes the same data as in the previous cases and writes it in binary format to a single file. As before, there are two ways to do this: either in the simple format that gnuplot reads, or as single blocks of data one after the other. You will have to do both here! Notice that nwords differs between the two cases!

How much smaller is the binary data file written with 3 large blocks than the same ASCII file? Credits: 2/-1

As the expression read(10,rec=1) shows, the keyword rec makes it possible to read or write a specific data block in the file without having to start from the top of the file each time. This is a significant advantage when you want to access a particular data set in a large file and already know which record contains it.
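
The effect of rec can be illustrated with a small sketch. The file name records.bin, the number of records and their contents are all invented for the example.

```fortran
program random_access
  implicit none
  integer, parameter :: nrec = 5, nwords = 3
  real :: buf(nwords)
  integer :: k, lrec

  inquire(iolength=lrec) buf            ! record length in compiler units
  open(12, file='records.bin', status='replace', access='direct', &
       form='unformatted', recl=lrec)

  ! Records can be written in any order; here from last to first
  do k = nrec, 1, -1
     buf = real(k) * (/ 1.0, 10.0, 100.0 /)
     write(12, rec=k) buf
  end do

  ! Fetch record 3 directly, without reading records 1 and 2 first
  read(12, rec=3) buf
  print *, buf
  close(12)
end program random_access
```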

To see that unformatted binary data can also be used for more complicated data sets, the next task is to write a program that converts the ASCII data in "catalog.txt" to binary format. The data represent information about a selection of stars. The numbers give the stars' positions in the sky, their distances, their change in position per year, a measure of their visual magnitude and an indication of their surface temperature, but none of that is important for this task. What is important is how much space needs to be allocated to each variable.
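
When a record mixes double and single precision values, the space it needs can be obtained from the compiler rather than counted by hand. The sketch below uses a hypothetical set of fields (three double precision and two single precision values); the actual number of columns in catalog.txt has to be taken from the file itself.

```fortran
program record_size
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: d1, d2, d3      ! hypothetical double precision fields
  real     :: s1, s2          ! hypothetical single precision fields
  integer  :: lrec

  ! The values are irrelevant here; only the types determine the length
  d1 = 0.0_dp; d2 = 0.0_dp; d3 = 0.0_dp
  s1 = 0.0;    s2 = 0.0

  ! Record length, in compiler units, of one record holding all five fields
  inquire(iolength=lrec) d1, d2, d3, s1, s2
  print *, 'record length in compiler units:', lrec
end program record_size
```

On a compiler that counts bytes, this would be expected to report 3*8 + 2*4 = 32.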

When writing the binary file, ignore the first two header lines, which give information about the data values.
Notice also that you do not need to save the index number, as it is related to the rec number.
Save the number of data sets in the first record as the only number.
Include err in open() and use end in read() to control program action when either an error occurs or the end of file is reached.
The first three numbers are in double precision.
Give the binary file the name catalog.bin.
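
The use of err in open() and end in read() can be sketched as follows. The example only counts the data lines in catalog.txt and makes no assumption about the columns; the unit number and the labels 800 and 900 are arbitrary choices.

```fortran
program count_entries
  implicit none
  character(len=256) :: line
  integer :: nstars

  nstars = 0

  ! Jump to label 900 if the file cannot be opened
  open(20, file='catalog.txt', status='old', err=900)

  ! Skip the two header lines
  read(20, '(a)', end=800) line
  read(20, '(a)', end=800) line

  ! Count data lines; end=800 leaves the loop at end of file
  do
     read(20, '(a)', end=800) line
     nstars = nstars + 1
  end do

800 close(20)
  print *, 'number of data lines:', nstars
  stop

900 print *, 'Could not open catalog.txt'
end program count_entries
```

Note that this simple count also includes any blank lines at the end of the file.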

After converting the data to binary format, make a program that can extract a single entry from the binary file and print it to standard output. Here you should use formatted output to give the output a consistent appearance.

What is the 5th number in the 219th star entry in catalog2.bin? Credits: 2/-1

For large data sets it can take quite some time to compute the Fourier transform. Just to show you that the FFT is much faster and gives the same result, you are going to replace the previous simple FT method with the FFT method given in Numerical Recipes. It will not take long, as everything is prepared in the Makefile. A few things have to be changed relative to your previous program: