4

I am looking for a good numerical method to solve a linear system (of small/moderate size) with a matrix of the form $$ \begin{bmatrix} a_1 & b_1\\ & a_2 & \ddots \\ & & \ddots & b_{n-1}\\ b_n & & & a_n \end{bmatrix}. $$ What would you suggest?

Entries not displayed are zero. The $a_i$ and $b_i$ entries can be arbitrary, so I would not trust the stability of unpivoted Gaussian elimination.

Related question: solving system of linear equations with cyclic tridiagonal matrix.

Federico Poloni
  • Since you say you have a lot of these, would it be worth trying, for a given matrix $A$, to use unpivoted GE first, together with an estimate of its accuracy, and only if it fails or is too inaccurate compute a pivoted LU? This would take time $O(n+\alpha n^3)$, where $\alpha$ is the fraction of ill-behaved matrices, which, if small enough, makes it linear. – Kirill Oct 21 '16 at 18:27

2 Answers

2

The special structure of your matrix is easily exploited by a custom-made $QR$ factorization based on Givens rotations. This leads to a method of $O(n)$ complexity. Below is Octave/MATLAB code which solves $Mx=\mathrm{rhs}$ for cyclic bidiagonal matrices $M$ of size $n \ge 2$.

% Create a cyclic bidiagonal test matrix
a=[2 -3 -1 5 3]';
b=[7 2 -4 -1 6]';
rhs=[-1 -3 2 6 2]';
n=5;

% Full matrix solution
M=diag(a)+diag(b(1:n-1),1);  M(n,1)=b(n);
x_full_matrix=M\rhs;
res_nrm_full_matrix=norm(M*x_full_matrix-rhs)

% QR factorization using Givens rotations (see Golub and Van Loan), without
% forming the full matrix M. M is transformed to an upper triangular
% matrix R (which is also not formed as a full matrix).
% a and b are overwritten; d stores the fill-in entries of R.
c=zeros(n-1,1);
s=zeros(n-1,1);
d=zeros(n-2,1);

z=b(n);
for i=1:n-1,
  % Compute Givens rotation
  if z==0, c(i)=1; s(i)=0;
  else
    if abs(z)>abs(a(i)), tau=-a(i)/z; s(i)=1/sqrt(1+tau*tau); c(i)=s(i)*tau;
    else                 tau=-z/a(i); c(i)=1/sqrt(1+tau*tau); s(i)=c(i)*tau;
    end
  end
  % Apply Givens rotation
  a(i)=c(i)*a(i)-s(i)*z;
  a_n=a(n);
  if i==n-1, break; end
  z   =s(i)*b(i);
  b(i)=c(i)*b(i);
  a(n)=c(i)*a_n;
  d(i)=-s(i)*a_n;
end
a(n)  =s(n-1)*b(n-1)+c(n-1)*a_n;
b(n-1)=c(n-1)*b(n-1)-s(n-1)*a_n;

% Apply Givens rotations to rhs
x=rhs;
for i=1:n-1,
  x_i=x(i);
  x(i)=c(i)*x_i-s(i)*x(n);
  x(n)=s(i)*x_i+c(i)*x(n);
end

% Triangular solve
x(n)=a(n)\x(n);
x(n-1)=a(n-1)\(x(n-1)-b(n-1)*x(n));
for i=n-2:-1:1,
  x(i)=a(i)\(x(i)-d(i)*x(n)-b(i)*x(i+1));
end

% Check the answer
res_nrm_cycl_bidiag_QR=norm(M*x-rhs)
diff=norm(x_full_matrix-x)

Obviously, one should translate this Octave/MATLAB code into Fortran or C, etc., if computational efficiency is important. Even for small $n$, for example $n=100$, such an implementation of this method may be much faster than a general sparse $LU$ factorization or a dense LAPACK/BLAS $LU$ factorization.

wim
0

How small is "small/moderate"? N=10? N=10000? N=1000000? How many of these systems of equations do you have to solve (one? a few hundred? a few million?)

If N is small and the total number of these systems that you have to solve is small, then there's no point in doing anything more sophisticated than using a dense LU factorization à la LAPACK/BLAS.

If N is really big and the total number of these that you have to solve is small, then use a sparse LU factorization routine to do the job.

Only if N is really big and you have lots of these to solve would there be any point in trying to develop a custom solution to the problem.

Brian Borchers
  • $n=10$ or $n=100$ seems reasonable for our application, and I have to solve several of them (let's say $10000$) but each with a different matrix. It seems that one can do better than vanilla LU factorization; for instance, the matrix is lower Hessenberg, so one can compute its QR factorization in $O(n^2)$. And I was hoping some FFT-based trick could yield linearithmic time. – Federico Poloni Oct 21 '16 at 07:53
  • I disagree with this attitude "if it is not large scale it doesn't matter". There is a point in solving an easy small-scale problem in 0.05 seconds instead of 0.5 if it later ends up as the computational core of a longer task. – Federico Poloni Oct 21 '16 at 07:57