Exercise 9: Singular Value Decomposition
Before you actually start working on this Exercise, it is essential that you have read Chapter 2 of Numerical Recipes (particularly the section Singular Value Decomposition), and that you understand the various concepts mentioned there (rank, range, orthogonality, etc.). That will make it much easier to understand what is going on when the various control parameters are changed in the experiments below.
Subsections:

  Preparation section                            15 min
  SVD experimentarium                            30 min
  Improving SVD                                  60 min
  Setting up the matrix system from scratch      25 min
  Home Work                                       2 hours
CVS
[about 1 minute]
To extract the exercise files for this week's exercise, do
cd ~/ComputerPhysics
cvs update -d
In case of problems, see the CVS update help page.
SVD preparatory section
[about 15 minutes]
Before you start playing with the SVD experimentarium, let's check that you picked up enough from the lecture and from Numerical Recipes to make the exercise meaningful.
The following is a short summary, combined with questions. If you have prepared well, the questions should give you no trouble. Conversely, if you find the summary too condensed to follow, you probably have not prepared enough.
The basic theorem on which the SVD method is based says that an arbitrary MxN (M rows, N columns) matrix A may be decomposed into a product of three matrices, denoted U, W, and Vt in the Lecture Notes and in Numerical Recipes:
A = U W Vt
The linear equation system
A x = b
may be interpreted as a mapping from an N-dimensional x-space to an M-dimensional b-space, where we assume that M is larger than N. Using the SVD decomposition of A into U W Vt, we may write A x as U ( W ( Vt x)), where the innermost operation is a generalized rotation in N-space, the next one is a scaling of each of the N components, and the last one is a linear combination of the N column vectors in U, each with M components.
Since, according to this formula, b is a linear combination of the column vectors of U, it can never lie outside the N-dimensional space spanned by these. But b has M (> N) components; hence it formally belongs to an M-dimensional space.
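The decomposition and this range argument can be checked numerically. A minimal sketch in Python/NumPy (used here as a stand-in for the exercise's IDL; the matrix sizes are illustrative):

```python
import numpy as np

# A tall M x N matrix (M > N), as in the overdetermined case discussed above
M, N = 6, 3
rng = np.random.default_rng(0)
A = rng.standard_normal((M, N))

# Thin SVD: U is M x N with orthonormal columns, w holds the N singular
# values, and Vt is the transpose of the N x N orthogonal matrix V
U, w, Vt = np.linalg.svd(A, full_matrices=False)

# A is recovered as U W Vt
assert np.allclose(A, U @ np.diag(w) @ Vt)

# For any x, b = A x is a linear combination of the N columns of U:
# projecting b onto those columns and back reproduces b exactly
x = rng.standard_normal(N)
b = A @ x
assert np.allclose(U @ (U.T @ b), b)
```

The last assertion is the statement above in code: b never leaves the N-dimensional subspace spanned by the columns of U, even though it has M components.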
Which of the following statements are true?
If M>N, an M-dimensional vector has more components than a vector expressed in terms of the N orthogonal column vectors of U. Hence there can be b vectors for which the equation system has no solution: no x will produce a b outside the range of A. However, the formal inversion of the equation,
x = A^-1 b = V (W^-1 (Ut b))
may still be applied, and indeed still produces a useful solution: the one for which the remainder (A x - b) is "smallest", in the sense that the sum of squares of its components is minimal. To see that this is indeed the case, consider the following:
The three successive steps indicated by the parentheses may be interpreted as finding the components of b along the N column vectors of U, then scaling the components, and finally rotating the answer back to the original coordinate system.
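That the formal inversion yields the least-squares solution can be verified directly; a NumPy sketch (again standing in for the IDL of the exercise, with illustrative sizes):

```python
import numpy as np

# Overdetermined system: more equations than unknowns, generally no exact solution
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)

U, w, Vt = np.linalg.svd(A, full_matrices=False)

# x = V (W^-1 (Ut b)): project b onto the columns of U, scale by 1/w,
# rotate back -- the three steps described above
x_svd = Vt.T @ ((U.T @ b) / w)

# Same answer as a direct least-squares solve
x_lsq, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(x_svd, x_lsq)
```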
Which of the following statements are true?
Each column vector in U is multiplied by a singular value wi when going from x to b. It may happen that some changes of x have practically no effect on b (think of b as a measurement and x as the "true" function: limited resolution in the measurement means that details of x can hardly be "seen" in b).
Thus, if b is obtained from measurements and is influenced by measurement noise, the components of the noise along those particular U column vectors get multiplied by very large coefficients, which leads to huge uncertainties in the answer.
At this point it should be easy to see that one is better off ignoring these components completely than blowing up their accidental values with large factors. Hence the rule: set 1/wi = 0 if wi is too small.
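The effect of this rule can be demonstrated on a small ill-conditioned system; a NumPy sketch (the sizes, singular values, and noise level are illustrative, not the exercise's values):

```python
import numpy as np

rng = np.random.default_rng(2)

# Build an ill-conditioned 5x5 matrix with one tiny singular value
U, _ = np.linalg.qr(rng.standard_normal((5, 5)))
V, _ = np.linalg.qr(rng.standard_normal((5, 5)))
w = np.array([1.0, 0.5, 0.2, 0.1, 1e-12])
A = U @ np.diag(w) @ V.T

# A "true" x with no component along the tiny-w direction
# (that component is practically invisible in b anyway)
x_true = V @ np.array([1.0, 1.0, 1.0, 1.0, 0.0])
b = A @ x_true + 1e-6 * rng.standard_normal(5)   # small measurement noise

# Naive inverse: noise along the small-w direction is scaled by 1/w ~ 1e12
x_naive = V @ ((U.T @ b) / w)

# The 1/wi = 0 rule: drop directions where wi is below a threshold
winv = np.where(w > 1e-6 * w.max(), 1.0 / w, 0.0)
x_trunc = V @ ((U.T @ b) * winv)

err_naive = np.linalg.norm(x_naive - x_true)
err_trunc = np.linalg.norm(x_trunc - x_true)
assert err_trunc < 1e-3          # truncated solution stays close to x_true
assert err_naive > err_trunc     # naive inverse is ruined by amplified noise
```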
In the experimentarium, synthetic measurements b are produced by first smearing the input profile x with a (slightly noisy) Gaussian profile, and then adding some noise to the "measurements".
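The construction of such synthetic measurements can be mimicked as follows; a NumPy sketch in which the profile shape, kernel width, and noise amplitudes are illustrative guesses, not the experimentarium's actual values:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
t = np.arange(n)

# "True" profile x: a boxcar with sharp corners
x = np.where((t > 15) & (t < 35), 1.0, 0.0)

# Smearing matrix: each row is a Gaussian centered on that sample,
# perturbed slightly so the kernel itself is "slightly noisy"
width = 3.0
A = np.exp(-0.5 * ((t[:, None] - t[None, :]) / width) ** 2)
A *= 1.0 + 0.01 * rng.standard_normal(A.shape)
A /= A.sum(axis=1, keepdims=True)    # normalize each row to unit sum

# Synthetic measurement: smeared profile plus measurement noise
b = A @ x + 0.01 * rng.standard_normal(n)
```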
SVD experimentarium
[about 30 minutes]
To run the SVD experimentarium:
Improving the SVD solution
[about 60 minutes]
The SVD solution is much better than the LU-decomposition solution (it is less sensitive to noise), but it shows some ringing near the sharp corners of the original function.
This ringing can be reduced by replacing the abrupt cut-off of small singular values (in solve.pro) with a filter that tapers off the amplitude smoothly instead of abruptly, with the sharpness controlled by a parameter p:
for j=0,n-1 do wp[j,j] = 1./(w[j]^data.p+(max(w)*data.eps/(w[j]+1e-30))^data.p)^(1./data.p)
which has a sharper drop-off for larger p. To try this out:
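For reference, the tapered weights of the IDL line above can be written in NumPy as follows (a sketch; w, eps, and p stand for the IDL w, data.eps, and data.p):

```python
import numpy as np

def tapered_winv(w, eps, p):
    """Smoothly tapered inverse singular values.

    Approaches 1/w for w >> max(w)*eps and rolls off towards zero for
    w << max(w)*eps; larger p gives a sharper drop-off.
    """
    w = np.asarray(w, dtype=float)
    return 1.0 / (w**p + (w.max() * eps / (w + 1e-30)) ** p) ** (1.0 / p)

# Large singular values are left essentially untouched (weight ~ 1/w),
# while tiny ones are suppressed instead of amplified
winv = tapered_winv(np.array([1.0, 0.1, 1e-8]), eps=1e-4, p=2)
```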
In order to pass the test below, your xsvd.pro needs to have:
Setting up the equations from scratch
[about 25 minutes]
In the previous part of this exercise you became familiar with SVD using a prebuilt setup. In this part you perform the basic calculations yourself, starting from a series of data files and information about how these relate to the final result.
The measured signal si is a sum over the source profile a, weighted by the reflectivity at each depth:

si = Σj ai-j rj

where rj are values for the Earth's reflectivity. This may be written as a linear system of equations:

A r = s

where s is given by the data in the file response.dat and the matrix Ai,j = ai-j may be constructed from the profile given in signal.dat.
The reflectivity as a function of depth, r, can now be determined using SVD:
IDL> journal,'svd.jou'
IDL> openr,1,'signal.dat' & Na=25 & a=fltarr(Na) & readf,1,a & close,1
From the formula above you can convince yourself that A becomes a "lower band diagonal" square matrix (Ai,j is non-zero only in the band i-Na+1 ≤ j ≤ i). Assign values to it:
IDL> aa=fltarr(Ns,Ns)                                     ; make the 2D array
IDL> for i=0,Ns-1 do for j=(i-Na+1)>0,i do aa[i,j]=a[i-j] ; set non-zero values
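The same band-matrix construction, sketched in NumPy for reference (small illustrative sizes; in the exercise a comes from signal.dat and Ns from response.dat):

```python
import numpy as np

Na, Ns = 4, 8                  # small illustrative sizes
a = np.arange(1.0, Na + 1)     # stand-in for the signal.dat profile

# A[i, j] = a[i-j] for 0 <= i-j < Na, zero elsewhere: lower band diagonal
A = np.zeros((Ns, Ns))
for i in range(Ns):
    for j in range(max(i - Na + 1, 0), i + 1):
        A[i, j] = a[i - j]
```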
IDL> ? svdc
The use of SVDC (as well as other routines that have been imported to IDL from the Numerical Recipes library) can be confusing, in that the matrix storage order is opposite to the normal mathematical one, unless the /COLUMN keyword is used. To follow standard math [row,column] matrix notation one should thus set the /COLUMN keyword. For a more detailed discussion of matrix storage order, search for "Columns, Rows" in IDLDE Help.
IDL> ww=fltarr(Ns,Ns) & for i=0,Ns-1 do if w[i] gt max(w)*threshold then ww[i,i]=1./w[i]
IDL> r1=v#ww#transpose(u)#s
IDL matrix multiplication with #-operators works just like in math; it takes rows from the matrix on the left (running over the 2nd matrix index), multiplies with columns from the matrix to the right (running over the first matrix index), and puts the result at the 'intersection' of the row and column in question. So, the line above corresponds exactly to what you can find in Numerical Recipes and in the Lecture Notes. (The ##-notation used in the IDL help file for SVDC works with the opposite matrix notation [column,row] that is the default when the /COLUMN keyword is NOT used.) For a more detailed discussion of IDL matrix operations search for "matrix operators" in IDLDE Help.
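The whole solve, mirrored in NumPy for reference (a sketch with a tiny made-up system, not the exercise's data files; threshold is the same relative cut-off as in the IDL line above):

```python
import numpy as np

def svd_solve(A, s, threshold=1e-6):
    """Solve A r = s via SVD, zeroing inverse singular values
    below threshold * max(w)."""
    U, w, Vt = np.linalg.svd(A, full_matrices=False)
    winv = np.where(w > threshold * w.max(), 1.0 / w, 0.0)
    # r = V (W^-1 (Ut s)), as in the IDL line r1 = v # ww # transpose(u) # s
    return Vt.T @ (winv * (U.T @ s))

# Tiny illustrative system: 2*r0 = 4, r0 + 3*r1 = 11  ->  r = (2, 3)
A = np.array([[2.0, 0.0],
              [1.0, 3.0]])
s = np.array([4.0, 11.0])
r = svd_solve(A, s)
```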
Compute the rms value of the difference between your derived solution and the one from the data file.
IDL> openr,1,'reflectivity.dat' & r2=fltarr(Ns) & readf,1,r2 & close,1
IDL> rms=sqrt(total((r1-r2)^2)/Ns)
IDL> print,'RMS error: ',rms
IDL> journal
Open the journal file in an editor (e.g. the IDLDE editor) and remove lines related to mistakes and failed attempts, so that the edited journal file can be used to re-execute the whole session. Try it out with
IDL> @svd.jou
In order to pass the test below, your svd.jou journal file must assign the RMS value to an IDL variable 'rms'.
Use any remaining time, for example, to play with the X-widget interface, either this week's interface or the one from last week.
Home Work
[about 2 hours]
Spend at least one hour this week reading the chapter about Linear Algebra in Numerical Recipes; understanding what goes on in this exercise is non-trivial! Details about what to read are given in the Lecture Notes.
If you did not have enough time to play with IDL last week, or during the exercise hours, then do it in your homework time. The X-widget interface this week is similar to last week's; in fact, if you would like to experiment with modifications of the X-widgets, you can pick either of the two exercises.
If you want to prepare for the next Lecture, then check out the chapter in Numerical Recipes about Ordinary Differential Equations.