Sparse Spikes Deconvolution with L1 Pursuit

This numerical tour explores the use of L1 regularization (basis pursuit denoising) to solve seismic sparse spikes deconvolution.

Contents

Sparse spikes deconvolution is one of the oldest inverse problems, that is a stilized version of recovery in seismic imaging. The ground is modeled as a 1D profile x, mostly zeros with a few spikes accounting for interfaces between layers in the ground. The observation is the convolution D*x of x against a wavelet filter, where D(:,1) is the basis wavelet function (D is the convolution operator.

The goal of sparse spike deconvolution is to recover an approximation of x given noisy measurement y=D*x+w. Since the convolution destroys many low and high frequencies, this requires some prior information to regularize the inverse problem. Since x is composed of a few spikes, the sparsity is a wonderful prior, that is exploited either by L1 minimization (basis pursuit) or by non-linear greedy processes (matching pursuits).

Installing toolboxes and setting up the path.

You need to download the following files: signal toolbox and general toolbox.

You need to unzip these toolboxes in your working directory, so that you have toolbox_signal and toolbox_general in your directory.

For Scilab user: you must replace the Matlab comment '%' by its Scilab counterpart '//'.

Recommandation: You should create a text file named for instance numericaltour.sce (in Scilab) or numericaltour.m (in Matlab) to write all the Scilab/Matlab command you want to execute. Then, simply run exec('numericaltour.sce'); (in Scilab) or numericaltour; (in Matlab) to run the commands.

Execute this line only if you are using Matlab.

getd = @(p)path(p,path); % scilab users must *not* execute this

Then you can add the toolboxes to the path.

getd('toolbox_signal/');
getd('toolbox_general/');

Seismic Wavelets and Seismic Imaging

We first build the dictionary D used for seismic imaging.

set_rand_seeds(123456,21);

First we load a seismic filter, which is a second derivative of a Gaussian.

% dimension of the signal
n = 1024;
% width of the filter
s = 13;
% second derivative of Gaussian
t = -n/2:n/2-1;
h = (1-t.^2/s^2).*exp( -(t.^2)/(2*s^2) );
h = h-mean(h);
% normalize it
h = h/norm(h);
% recenter the filter for periodic boundary conditions
h1 = fftshift(h);

We compute the filtering matrix. To stabilize the recovery, we sub-sample by a factor of 2 the filtering.

% sub-sampling (distance between wavelets)
sub = 2;
% number of atoms in the dictionary
p = n/sub;
% the dictionary, with periodic boundary conditions
[Y,X] = meshgrid(1:sub:n,1:n);
D = reshape( h1(mod(X-Y,n)+1), [n p]);

We generate a sparse signal to recover.

% spacing min and max between the spikes.
m = 5; M = 40;
k = floor( (p+M)*2/(M+m) )-1;
spc = linspace(M,m,k)';
% location of the spikes
sel = round( cumsum(spc) );
sel(sel>p) = [];
% randomization of the signs and values
x = zeros(p,1);
si = (-1).^(1:length(sel))'; si = si(randperm(length(si)));
% creating of the sparse spikes signal.
x(sel) = si;
x = x .* (1-rand(p,1)*.5);
% sparsity of the solution
M = sum(x~=0);

Now we perform the measurements.

% noise level
sigma = .06*max(h);
% noise
w = randn(n,1)*sigma;
% measures
y = D*x + w;

Display.

clf;
subplot(2,1,1);
plot_sparse_diracs(x);
title('Signal x');
subplot(2,1,2);
plot(y);
set_graphic_sizes([], 20);
axis('tight');
title('Noisy measurements y=D*x+w');

Homotopy Method for L1 Regularization

Instead of using a greedy matching pursuit to recover an approximation of x, we now turn to more global minimization method. L1 minimization is a convex regularization that approximates the L0 pseudo-norm regularization (that would be ideal to find highly sparse Diracs train). The minimization reads

x(lambda) = argmin_x 1/2*norm(y-D*x)^2 + lambda*norm(x,1)

The Homotopy method starts with a large lambda so that x(lambda)=0 and progressively decays the value of lambda so that the sparsity of x(lambda) only increases/decreases by 1 at each step.

The intial solution is 0.

xbp = zeros(p,1);

The solution xbp is update xbp = xbp + gamma*d where the direction of update is supported on the set of vectors that maximally correlate with the residual y-D*xbp

% compute the correlation with the residual
C = D'*(y-D*xbp);
lambda = max(abs(C));
% find the locations that maximally correlates
S = find( abs(abs(C/lambda)-1)<1e-9);
% compute the complementary set
I = ones(p,1); I(S)=0;
Sc = find(I);

The direction of descent d(S) on S is computed so that its image by D(:,S)*d(S) correlates as +1 or -1 with the atoms in S (same speed on all the coefficients). It is zero outside S.

d = zeros(p,1);
d(S) = (D(:,S)'*D(:,S)) \ sign( C(S) );

Now we compute the value gamma so that either

In situations 1) and 2), the number of non-zero coordinates of xbp (its sparsity) increases by 1. In situation 3), its sparsity stays constant because one coefficients appears and another deasapears.

Compute minimum gamma so that situation 1) is in force.

v = D(:,S)*d(S);
w = ( lambda-C(Sc) ) ./ ( 1 - D(:,Sc)'*v );
gamma1 = min(w(w>0));

Compute minimum gamma so that situation 2) is in force.

w = ( lambda+C(Sc) ) ./ ( 1 + D(:,Sc)'*v );
gamma2 = min(w(w>0));

Compute minimum gamma so that situation 3) is in force.

w = -xbp(S)./d(S);
gamma3 = min(w(w>0));

Compute minimum gamma so that 1), 2) or 3) is in force, and update the solution.

gamma = min([gamma1 gamma2 gamma3]);

We can display the residual correlation before/after the update. You see that a new maximally correlating atoms has appeared, and that the overall correlation has decayed a little.

clf;
subplot(2,1,1);
plot( abs(D'*(y-D*xbp)), '.-' ); axis tight;
title('Correlation before update');
subplot(2,1,2);
plot( abs(D'*(y-D*(xbp+gamma*d))), '.-' ); axis tight;
title('Correlation after update');

Update the solution.

xbp = xbp + gamma*d;

Exercice 1: (the solution is exo1.m) Implements the homotopy algorithm by applying several times (let's say up to 1.5*M times) the previous update rule. Keep track of the evolution of lambda and of the evolution of the sparsity.

exo1;

Display the solution computed by L1.

clf;
subplot(2,1,1);
plot_sparse_diracs(x);
title('Signal x');
subplot(2,1,2);
plot_sparse_diracs(xbp);
title('Recovered by L1 minimization');

The solution computed by L1 as a tendency to under-estimate the true values of the coefficients. This is bias inherent to L1 minimization. To remove this bias, one can perform a back projection that performs an L2 best fit on the support selected by L1.

% find the support
sel = find(xbp~=0);
% perform the fit
xproj = zeros(p,1);
xproj(sel) = D(:,sel) \ y;
% display
clf;
subplot(2,1,1);
plot_sparse_diracs(xbp);
title('Recovered by L1 minimization');
subplot(2,1,2);
plot_sparse_diracs(xproj);
title('Back-projection');

Exercice 2: (the solution is exo2.m) In the same code, add a tracking of the error norm(x-xproj), and return the solution that mimizes this error. Save the value of the optimal lambda in lambda_opt and the value xbp of the solution for this lambda for a future use.

exo2;

Display the best solution

err_bp = norm(x-xproj)/norm(x);
clf;
subplot(2,1,1);
plot_sparse_diracs(x);
title('Signal x');
subplot(2,1,2);
plot_sparse_diracs(xproj);
title( strcat(['L1 recovery, error = ' num2str(err_bp, 3)]));

Iterative Soft Thresholding for L1 Minimization

If the value of lambda is known, one can use an iterative algorithm to find the solution xbp of the L1 minimization.

We use the optimal value of lambda already computed. We also save the true solution computed by homotopy.

lambda = lambda_opt;
xbp_opt = xbp;

The gradient descent step size depends on the conditionning of the matrix.

tau = 2/norm(D).^2;

The iterative algorithm starts with the zero vector.

xbp = zeros(p,1);

The gradient step updates the value of the solution by decaying the value of norm(y-D*xbp)^2.

xbp = xbp + tau*D'*( y-D*xbp );

The thresholding step improves the sparsity of the solution.

xbp_prev = xbp;
xbp = perform_thresholding( xbp, tau*lambda, 'soft' );
% display
clf;
subplot(2,1,1);
plot(xbp_prev); axis('tight');
set_graphic_sizes([], 20);
title('Estimate before soft thresholding');
subplot(2,1,2);
plot(xbp); axis('tight');
set_graphic_sizes([], 20);
title('Estimate after soft thresholding');

Exercice 3: (the solution is exo3.m) Implement the iterative soft thresholding algorithm by applying many times these two steps. Keep track of the variational energy minimized by this algorithm. Keep also track of the distance to the solution.

exo3;