(octave.info)Creating Sparse Matrices in Oct-Files


A.1.6.2 Creating Sparse Matrices in Oct-Files
.............................................

There are two useful strategies for creating a sparse matrix.  The first
is to create three vectors representing the row index, column index, and
data values, and from these create the matrix.  The second alternative
is to create a sparse matrix with the appropriate amount of space, and
then fill in the values.  Both techniques have their advantages and
disadvantages.

   Below is an example of creating a small sparse matrix using the first
technique

     int nz, nr, nc;
     nz = 4, nr = 3, nc = 4;

     ColumnVector ridx (nz);
     ColumnVector cidx (nz);
     ColumnVector data (nz);

     ridx(0) = 1; cidx(0) = 1; data(0) = 1;
     ridx(1) = 2; cidx(1) = 2; data(1) = 2;
     ridx(2) = 2; cidx(2) = 4; data(2) = 3;
     ridx(3) = 3; cidx(3) = 4; data(3) = 4;
     SparseMatrix sm (data, ridx, cidx, nr, nc);

which creates the matrix given in section Note: Storage of Sparse
Matrices.  Note that the compressed matrix format is not used at the
time of the creation of the matrix itself, but is used internally.

   As discussed in the chapter on Sparse Matrices, the values of the
sparse matrix are stored in increasing column-major ordering.  Although
the data passed by the user need not respect this requirement,
pre-sorting the data will significantly speed up creation of the sparse
matrix.

   The disadvantage of this technique for creating a sparse matrix is
that there is a brief time when two copies of the data exist.  For
extremely memory constrained problems this may not be the best technique
for creating a sparse matrix.

   The alternative is to first create a sparse matrix with the desired
number of nonzero elements and then later fill those elements in.
Sample code:

     int nz, nr, nc;
     nz = 4, nr = 3, nc = 4;
     SparseMatrix sm (nr, nc, nz);
     sm(0,0) = 1; sm(0,1) = 2; sm(1,3) = 3; sm(2,3) = 4;

   This creates the same matrix as previously.  Again, although not
strictly necessary, it is significantly faster if the sparse matrix is
created and the elements are added in column-major ordering.  The reason
for this is that when elements are inserted at the end of the current
list of known elements then no element in the matrix needs to be moved
to allow the new element to be inserted; Only the column indices need to
be updated.

   There are a few further points to note about this method of creating
a sparse matrix.  First, it is possible to create a sparse matrix with
fewer elements than are actually inserted in the matrix.  Therefore,

     int nr, nc;
     nr = 3, nc = 4;
     SparseMatrix sm (nr, nc, 0);
     sm(0,0) = 1; sm(0,1) = 2; sm(1,3) = 3; sm(2,3) = 4;

is perfectly valid.  However, it is a very bad idea because as each new
element is added to the sparse matrix the matrix needs to request more
space and reallocate memory.  This is an expensive operation that will
significantly slow this means of creating a sparse matrix.  It is
possible to create a sparse matrix with excess storage, so having NZ
greater than 4 in this example is also valid.  The disadvantage is that
the matrix occupies more memory than strictly needed.

   Of course, it is not always possible to know the number of nonzero
elements prior to filling a matrix.  For this reason the additional
unused storage of a sparse matrix can be removed after its creation with
the ‘maybe_compress’ function.  In addition to deallocating unused
storage, ‘maybe_compress’ can also remove zero elements from the matrix.
The removal of zero elements from the matrix is controlled by setting
the argument of the ‘maybe_compress’ function to be ‘true’.  However,
the cost of removing the zeros is high because it implies re-sorting the
elements.  If possible, it is better for the user to avoid adding the
unnecessary zeros in the first place.  An example of the use of
‘maybe_compress’ is

     int nz, nr, nc;
     nz = 6, nr = 3, nc = 4;

     SparseMatrix sm1 (nr, nc, nz);
     sm1(0,0) = 1; sm1(0,1) = 2; sm1(1,3) = 3; sm1(2,3) = 4;
     sm1.maybe_compress ();   // No zero elements were added

     SparseMatrix sm2 (nr, nc, nz);
     sm2(0,0) = 1; sm2(0,1) = 2; sm(0,2) = 0; sm(1,2) = 0;
     sm1(1,3) = 3; sm1(2,3) = 4;
     sm2.maybe_compress (true);  // Zero elements were added

   The use of the ‘maybe_compress’ function should be avoided if
possible as it will slow the creation of the matrix.

   A third means of creating a sparse matrix is to work directly with
the data in compressed row format.  An example of this advanced
technique might be

     octave_value arg;
     ...
     int nz, nr, nc;
     nz = 6, nr = 3, nc = 4;   // Assume we know the max # nz
     SparseMatrix sm (nr, nc, nz);
     Matrix m = arg.matrix_value ();

     int ii = 0;
     sm.cidx (0) = 0;
     for (int j = 1; j < nc; j++)
       {
         for (int i = 0; i < nr; i++)
           {
             double tmp = m(i,j);
             if (tmp != 0.)
               {
                 sm.data(ii) = tmp;
                 sm.ridx(ii) = i;
                 ii++;
               }
           }
         sm.cidx(j+1) = ii;
      }
     sm.maybe_compress ();  // If don't know a priori the final # of nz.

which is probably the most efficient means of creating a sparse matrix.

   Finally, it may sometimes arise that the amount of storage initially
created is insufficient to completely store the sparse matrix.
Therefore, the method ‘change_capacity’ exists to reallocate the sparse
memory.  The above example would then be modified as

     octave_value arg;
     ...
     int nz, nr, nc;
     nz = 6, nr = 3, nc = 4;   // Guess the number of nz elements
     SparseMatrix sm (nr, nc, nz);
     Matrix m = arg.matrix_value ();

     int ii = 0;
     sm.cidx (0) = 0;
     for (int j = 1; j < nc; j++)
       {
         for (int i = 0; i < nr; i++)
           {
             double tmp = m(i,j);
             if (tmp != 0.)
               {
                 if (ii == nz)
                   {
                     nz += 2;   // Add 2 more elements
                     sm.change_capacity (nz);
                   }
                 sm.data(ii) = tmp;
                 sm.ridx(ii) = i;
                 ii++;
               }
           }
         sm.cidx(j+1) = ii;
      }
     sm.maybe_compress ();  // If don't know a priori the final # of nz.

   Note that both increasing and decreasing the number of nonzero
elements in a sparse matrix is expensive as it involves memory
reallocation.  Also because parts of the matrix, though not its
entirety, exist as old and new copies at the same time, additional
memory is needed.  Therefore, if possible avoid changing capacity.
automatically generated by info2www version 1.2.2.9