ScaLAPACK Fortran Framework Tutorial

In this tutorial, we will explore the ScaLAPACK Fortran framework: its history, its features, and several examples that demonstrate its functionality. ScaLAPACK is a parallel linear algebra library designed to solve large-scale scientific and engineering problems efficiently on distributed-memory systems. It provides a collection of high-performance parallel algorithms for dense linear algebra.

Introduction

ScaLAPACK stands for "Scalable Linear Algebra PACKage" and is a library written in Fortran. It builds on LAPACK and the Basic Linear Algebra Subprograms (BLAS) for local computation, and on the Basic Linear Algebra Communication Subprograms (BLACS), usually implemented on top of the Message Passing Interface (MPI), for communication. ScaLAPACK is designed to exploit distributed-memory parallel computers, such as clusters and supercomputers, to solve large-scale linear algebra problems efficiently.
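Before turning to the solvers, it helps to see how these layers fit together in practice. The following minimal sketch uses only the BLACS layer: it starts the runtime, arranges all processes in a 1 x nprocs grid, and reports each process's grid coordinates. The build command here is an assumption for a typical installation (e.g. mpif90 hello.f90 -lscalapack, launched with mpirun); the exact flags depend on your environment.

program blacs_hello
implicit none

integer :: myid, nprocs, ctxt, nprow, npcol, myrow, mycol

! Ask the BLACS runtime who we are and how many processes exist
call blacs_pinfo(myid, nprocs)

! Acquire the default system context and build a 1 x nprocs grid
call blacs_get(-1, 0, ctxt)
call blacs_gridinit(ctxt, 'R', 1, nprocs)
call blacs_gridinfo(ctxt, nprow, npcol, myrow, mycol)

print '(a,i0,a,i0,a,i0,a)', 'process ', myid, &
      ' is at grid position (', myrow, ',', mycol, ')'

call blacs_gridexit(ctxt)
call blacs_exit(0)

end program blacs_hello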

History

ScaLAPACK was initially developed at the University of Tennessee, the University of California, Berkeley, and the University of Colorado, with funding from the National Science Foundation (NSF), the Department of Energy (DOE), and other organizations. The first version of ScaLAPACK was released in the mid-1990s, and the library has been updated and improved since then.

Features

ScaLAPACK provides a wide range of routines for solving linear algebra problems efficiently in parallel. Some of its key features include:

1. Parallel Algorithms

ScaLAPACK implements parallel versions of many dense linear algebra algorithms, including matrix factorizations, eigenvalue problems, least squares problems, and singular value decompositions. These algorithms distribute both the data and the computation across the processes of a parallel machine.

Here is an example of using ScaLAPACK to solve a system of linear equations via LU factorization:

program lu_solver
implicit none

! ScaLAPACK has no Fortran module; its routines are external
! (compile with an MPI Fortran compiler and link with -lscalapack)
integer, external :: numroc
integer :: n, nb, info, i, j
integer :: desc_a(9), desc_b(9)
integer :: nprow, npcol, myrow, mycol, ctxt, myid, nprocs
integer :: loc_rows, loc_cols, lld
real(kind=8), allocatable :: a(:,:), b(:,:)
integer, allocatable :: ipiv(:)

! Set up a process grid that is as close to square as possible
call blacs_pinfo(myid, nprocs)
nprow = int(sqrt(dble(nprocs)))
do while (mod(nprocs, nprow) /= 0)
   nprow = nprow - 1
end do
npcol = nprocs / nprow
call blacs_get(-1, 0, ctxt)
call blacs_gridinit(ctxt, 'R', nprow, npcol)
call blacs_gridinfo(ctxt, nprow, npcol, myrow, mycol)

n  = 1000   ! global matrix order
nb = 64     ! block size of the 2D block-cyclic distribution

! Each process allocates only its local piece of the global arrays
loc_rows = numroc(n, nb, myrow, 0, nprow)
loc_cols = numroc(n, nb, mycol, 0, npcol)
lld = max(1, loc_rows)
allocate(a(lld, loc_cols))
allocate(b(lld, 1))
allocate(ipiv(loc_rows + nb))

call descinit(desc_a, n, n, nb, nb, 0, 0, ctxt, lld, info)
call descinit(desc_b, n, 1, nb, nb, 0, 0, ctxt, lld, info)

! Fill a with a diagonally dominant matrix and b with ones;
! pdelset stores each global entry on the process that owns it
do j = 1, n
   do i = 1, n
      if (i == j) then
         call pdelset(a, i, j, desc_a, dble(n))
      else
         call pdelset(a, i, j, desc_a, 1.0d0)
      end if
   end do
   call pdelset(b, j, 1, desc_b, 1.0d0)
end do

! Solve a*x = b; pdgesv performs the LU factorization with
! partial pivoting and the triangular solves in one call
call pdgesv(n, 1, a, 1, 1, desc_a, ipiv, b, 1, 1, desc_b, info)

deallocate(a, b, ipiv)
call blacs_gridexit(ctxt)
call blacs_exit(0)

end program lu_solver

In this example, we first set up the process grid using BLACS (Basic Linear Algebra Communication Subprograms) routines. Each process then uses numroc to determine the size of its local piece of the globally distributed arrays, allocates only that piece, and initializes the distributed matrix descriptors desc_a and desc_b with descinit. After filling a and b entry by entry with pdelset, a single call to pdgesv factorizes the distributed matrix with partial pivoting and solves the system of linear equations.
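One detail the example glosses over: every ScaLAPACK driver reports success or failure through its final info argument, and a robust program should test it. A minimal sketch of such a check, placed immediately after the pdgesv call above:

! info = 0  : success
! info < 0  : an argument (or a descriptor entry) was illegal
! info > 0  : U(info,info) is exactly zero, so a is singular
if (info /= 0) then
   if (myid == 0) print *, 'pdgesv failed, info =', info
   call blacs_abort(ctxt, info)
end if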

2. Distributed Data Structures

ScaLAPACK operates on distributed data structures: dense matrices and vectors laid out across the process grid in a two-dimensional block-cyclic fashion. Each process stores only its local portion of a global matrix, and ScaLAPACK routines operate directly on these distributed objects.
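Every distributed matrix is described by a nine-element integer descriptor, filled in by descinit, that records the global shape, the blocking factors, and the local leading dimension; the numroc helper computes how much of each global dimension lands on a given process. The following small sketch (using a simple 1 x nprocs grid) shows both:

program descriptor_demo
implicit none

integer, external :: numroc
integer, parameter :: n = 1000, nb = 64
integer :: ctxt, nprow, npcol, myrow, mycol, myid, nprocs
integer :: desc(9), info, loc_rows, loc_cols

call blacs_pinfo(myid, nprocs)
call blacs_get(-1, 0, ctxt)
call blacs_gridinit(ctxt, 'R', 1, nprocs)
call blacs_gridinfo(ctxt, nprow, npcol, myrow, mycol)

! numroc ("number of rows or columns") gives the local extent of a
! global dimension under block-cyclic distribution with block size nb
loc_rows = numroc(n, nb, myrow, 0, nprow)
loc_cols = numroc(n, nb, mycol, 0, npcol)

! descinit fills the nine descriptor fields:
!   desc(1) type (1 = dense)    desc(2) BLACS context
!   desc(3) global rows         desc(4) global columns
!   desc(5) row block size      desc(6) column block size
!   desc(7) first process row   desc(8) first process column
!   desc(9) local leading dimension
call descinit(desc, n, n, nb, nb, 0, 0, ctxt, max(1, loc_rows), info)

print '(a,i0,a,i0,a,i0,a)', 'process ', myid, ' owns a ', &
      loc_rows, ' x ', loc_cols, ' local block'

call blacs_gridexit(ctxt)
call blacs_exit(0)

end program descriptor_demo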

Here is an example of using distributed matrices in ScaLAPACK:

program distributed_matrices
implicit none

integer, external :: numroc
integer :: n, nb, info
integer :: desc_a(9), desc_b(9), desc_c(9)
integer :: nprow, npcol, myrow, mycol, ctxt, myid, nprocs
integer :: loc_rows, loc_cols, lld
real(kind=8), allocatable :: a(:,:), b(:,:), c(:,:)

! Set up a process grid that is as close to square as possible
call blacs_pinfo(myid, nprocs)
nprow = int(sqrt(dble(nprocs)))
do while (mod(nprocs, nprow) /= 0)
   nprow = nprow - 1
end do
npcol = nprocs / nprow
call blacs_get(-1, 0, ctxt)
call blacs_gridinit(ctxt, 'R', nprow, npcol)
call blacs_gridinfo(ctxt, nprow, npcol, myrow, mycol)

n  = 1000
nb = 64

! Allocate only the local pieces of the distributed matrices
loc_rows = numroc(n, nb, myrow, 0, nprow)
loc_cols = numroc(n, nb, mycol, 0, npcol)
lld = max(1, loc_rows)
allocate(a(lld, loc_cols))
allocate(b(lld, loc_cols))
allocate(c(lld, loc_cols))

call descinit(desc_a, n, n, nb, nb, 0, 0, ctxt, lld, info)
call descinit(desc_b, n, n, nb, nb, 0, 0, ctxt, lld, info)
call descinit(desc_c, n, n, nb, nb, 0, 0, ctxt, lld, info)

! Fill the local pieces of a and b with random values;
! c need not be initialized because beta = 0 below
call random_number(a)
call random_number(b)

! Compute c = 1.0*a*b + 0.0*c with the PBLAS routine pdgemm
call pdgemm('N', 'N', n, n, n, 1.0d0, a, 1, 1, desc_a, &
            b, 1, 1, desc_b, 0.0d0, c, 1, 1, desc_c)

deallocate(a, b, c)
call blacs_gridexit(ctxt)
call blacs_exit(0)

end program distributed_matrices

In this example, we again set up the BLACS process grid and allocate only the local pieces of the distributed matrices a, b, and c, whose descriptors desc_a, desc_b, and desc_c are initialized with descinit. The pdgemm routine, the parallel BLAS (PBLAS) counterpart of dgemm, then multiplies a and b and stores the result in c.
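To sanity-check the product without gathering it onto a single process, a norm of the distributed result can be computed in place. Here is a small sketch using the ScaLAPACK auxiliary function pdlange, placed after the pdgemm call; per its documentation, the trailing work argument is not referenced for the Frobenius norm, so a dummy array suffices:

block
   real(kind=8), external :: pdlange
   real(kind=8) :: cnorm, dummy(1)
   ! Frobenius norm of the distributed matrix c, computed in place
   cnorm = pdlange('F', n, n, c, 1, 1, desc_c, dummy)
   if (myid == 0) print *, 'Frobenius norm of c =', cnorm
end block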

3. Scalability

ScaLAPACK is designed to be highly scalable, allowing efficient use of large-scale parallel computing resources. It can handle dense problems far too large for the memory of any single node and, for such problems, provides substantial speedup over sequential algorithms.

4. High Performance

ScaLAPACK inherits the performance of tuned BLAS implementations for local computation and relies on MPI (through BLACS) for communication. Its algorithms are blocked and tuned for a range of parallel computing architectures.

Examples

Now let's look at two more examples that demonstrate the use of ScaLAPACK for common linear algebra tasks.

Example 1: Matrix-Vector Multiplication

program matrix_vector_multiplication
implicit none

integer, external :: numroc
integer :: n, nb, info
integer :: desc_a(9), desc_x(9), desc_y(9)
integer :: nprow, npcol, myrow, mycol, ctxt, myid, nprocs
integer :: loc_rows, loc_cols, lld
real(kind=8), allocatable :: a(:,:), x(:,:), y(:,:)

! Set up a process grid that is as close to square as possible
call blacs_pinfo(myid, nprocs)
nprow = int(sqrt(dble(nprocs)))
do while (mod(nprocs, nprow) /= 0)
   nprow = nprow - 1
end do
npcol = nprocs / nprow
call blacs_get(-1, 0, ctxt)
call blacs_gridinit(ctxt, 'R', nprow, npcol)
call blacs_gridinfo(ctxt, nprow, npcol, myrow, mycol)

n  = 1000
nb = 64

! Allocate only the local pieces of the distributed arrays
loc_rows = numroc(n, nb, myrow, 0, nprow)
loc_cols = numroc(n, nb, mycol, 0, npcol)
lld = max(1, loc_rows)
allocate(a(lld, loc_cols))
allocate(x(lld, 1))
allocate(y(lld, 1))

call descinit(desc_a, n, n, nb, nb, 0, 0, ctxt, lld, info)
call descinit(desc_x, n, 1, nb, nb, 0, 0, ctxt, lld, info)
call descinit(desc_y, n, 1, nb, nb, 0, 0, ctxt, lld, info)

! Fill the local pieces of a and x with random values
call random_number(a)
call random_number(x)

! Compute y = 1.0*a*x + 0.0*y; the trailing 1 after each vector's
! descriptor is its global increment (1 for a column vector)
call pdgemv('N', n, n, 1.0d0, a, 1, 1, desc_a, x, 1, 1, desc_x, 1, &
            0.0d0, y, 1, 1, desc_y, 1)

deallocate(a, x, y)
call blacs_gridexit(ctxt)
call blacs_exit(0)

end program matrix_vector_multiplication

In this example, we perform a distributed matrix-vector multiplication. After the usual grid setup, we allocate the local pieces of a, x, and y and initialize their descriptors with descinit. We then call pdgemv, the PBLAS counterpart of dgemv; note that each distributed vector is described by five arguments: the local array, its global row and column offsets, its descriptor, and its global increment (1 for a column vector).
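Individual entries of a distributed vector can be read back with the TOOLS routine pdelget, the read counterpart of the pdelset routine used in the LU example. A small sketch, placed after the pdgemv call; scope 'A' broadcasts the entry to every process in the grid:

block
   real(kind=8) :: y1
   ! Fetch the global entry y(1) from whichever process owns it
   call pdelget('A', ' ', y1, y, 1, 1, desc_y)
   if (myid == 0) print *, 'y(1) =', y1
end block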

Example 2: Eigenvalue Problem

program eigenvalue_problem
implicit none

integer, external :: numroc
integer :: n, nb, info, i, j
integer :: desc_a(9), desc_z(9)
integer :: nprow, npcol, myrow, mycol, ctxt, myid, nprocs
integer :: loc_rows, loc_cols, lld
integer :: lwork, liwork
real(kind=8), allocatable :: a(:,:), z(:,:)
real(kind=8), allocatable :: eigenvalues(:)
real(kind=8), allocatable :: work(:)
integer, allocatable :: iwork(:)

! Set up a process grid that is as close to square as possible
call blacs_pinfo(myid, nprocs)
nprow = int(sqrt(dble(nprocs)))
do while (mod(nprocs, nprow) /= 0)
   nprow = nprow - 1
end do
npcol = nprocs / nprow
call blacs_get(-1, 0, ctxt)
call blacs_gridinit(ctxt, 'R', nprow, npcol)
call blacs_gridinfo(ctxt, nprow, npcol, myrow, mycol)

n  = 1000
nb = 64

! Allocate only the local pieces of the distributed matrices
loc_rows = numroc(n, nb, myrow, 0, nprow)
loc_cols = numroc(n, nb, mycol, 0, npcol)
lld = max(1, loc_rows)
allocate(a(lld, loc_cols))
allocate(z(lld, loc_cols))
allocate(eigenvalues(n))

call descinit(desc_a, n, n, nb, nb, 0, 0, ctxt, lld, info)
call descinit(desc_z, n, n, nb, nb, 0, 0, ctxt, lld, info)

! Fill a with the symmetric tridiagonal matrix tridiag(-1, 2, -1)
do j = 1, n
   do i = 1, n
      if (i == j) then
         call pdelset(a, i, j, desc_a, 2.0d0)
      else if (abs(i - j) == 1) then
         call pdelset(a, i, j, desc_a, -1.0d0)
      else
         call pdelset(a, i, j, desc_a, 0.0d0)
      end if
   end do
end do

! Workspace query: with lwork = liwork = -1, pdsyevd returns the
! required workspace sizes in work(1) and iwork(1)
allocate(work(1), iwork(1))
call pdsyevd('V', 'U', n, a, 1, 1, desc_a, eigenvalues, &
             z, 1, 1, desc_z, work, -1, iwork, -1, info)
lwork  = int(work(1))
liwork = iwork(1)
deallocate(work, iwork)
allocate(work(lwork), iwork(liwork))

! Compute all eigenvalues and eigenvectors (divide and conquer)
call pdsyevd('V', 'U', n, a, 1, 1, desc_a, eigenvalues, &
             z, 1, 1, desc_z, work, lwork, iwork, liwork, info)

deallocate(a, z, eigenvalues, work, iwork)
call blacs_gridexit(ctxt)
call blacs_exit(0)

end program eigenvalue_problem

In this example, we solve a symmetric eigenvalue problem. After the grid setup, we fill a with the symmetric tridiagonal matrix tridiag(-1, 2, -1) using pdelset. Because the divide-and-conquer driver pdsyevd needs real and integer workspaces whose sizes depend on the problem and the grid, we first call it with lwork = liwork = -1; this workspace query returns the required sizes in work(1) and iwork(1). After allocating the workspaces, the second call to pdsyevd computes all eigenvalues (returned in ascending order in eigenvalues, replicated on every process) and the corresponding eigenvectors in the distributed matrix z.
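Because the test matrix tridiag(-1, 2, -1) has a known spectrum, the result can be verified directly: its eigenvalues are 2 - 2*cos(k*pi/(n+1)) for k = 1, ..., n, which are already in ascending order, matching the order in which pdsyevd returns them. A minimal check, placed before the deallocate statement in the program above:

if (myid == 0) then
   block
      real(kind=8) :: pi, max_err
      integer :: k
      pi = 4.0d0 * atan(1.0d0)
      max_err = 0.0d0
      do k = 1, n
         max_err = max(max_err, abs(eigenvalues(k) &
                   - (2.0d0 - 2.0d0 * cos(k * pi / dble(n + 1)))))
      end do
      print *, 'max deviation from analytic eigenvalues:', max_err
   end block
end if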

Conclusion

In this tutorial, we explored the ScaLAPACK Fortran framework, its history, and its features, and walked through several examples demonstrating its functionality. ScaLAPACK is a powerful library for solving large-scale dense linear algebra problems efficiently on parallel computers: it offers parallel algorithms, distributed data structures, scalability, and high performance. By leveraging ScaLAPACK, researchers and engineers can tackle complex scientific and engineering problems more effectively.

To learn more about ScaLAPACK, visit the official website: https://www.netlib.org/scalapack/

Note: The code snippets provided in this tutorial are for illustrative purposes only and may require additional modifications and dependencies to run successfully in your specific environment.