Introduction
Sparse matrix-vector multiplication (SpMV) is a BLAS operation frequently used in scientific computing. To show that SpeedIT is among the fastest implementations of this routine, we tested SpMV on 23 randomly chosen matrices from the University of Florida Sparse Matrix Collection. Their properties are listed in Tab. 1. Tab. 2 and Tab. 3 give the SpMV times in single and double precision, and Figs. 1-8 present the results graphically. Because performance depends strongly on matrix size, we divided the matrices into two groups: small and large. The tests were performed on an NVIDIA Tesla C2050 GPU.
SpeedIT supports two storage formats: CSR (Compressed Sparse Row) and a proprietary CMR format. The user can easily select either one.
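For readers unfamiliar with CSR, the following is a minimal CPU reference sketch of SpMV over that format. It is not SpeedIT code and says nothing about the proprietary CMR format. The function name spmv_csr and the array names row_ptr, col_idx and values are assumptions chosen for illustration.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Return y = A * x for a matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i + 1] delimits the nonzeros of row i,
    col_idx[k] is the column of the k-th nonzero, values[k] its value.
    (Illustrative names, not a SpeedIT API.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Walk only the stored nonzeros of row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Small 3x3 example:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)  # [7.0, 4.0, 18.0]
```

The work is proportional to the number of nonzeros (NNZ), which is why NNZ is the natural measure of matrix size in SpMV benchmarks.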
Fig. 1. Time of SpMV in single precision for small matrices, averaged over 1000 runs.
Fig. 2. Time of SpMV in single precision for large matrices, averaged over 1000 runs.
Fig. 3. Time of SpMV in double precision for small matrices, averaged over 1000 runs.
Fig. 4. Time of SpMV in double precision for large matrices, averaged over 1000 runs.
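The averaging over 1000 runs mentioned in the captions above can be sketched as follows. This is a generic host-side timing loop, not the harness used for these measurements. The helper name average_time and the warmup parameter are assumptions. When timing GPU kernels, which are launched asynchronously, the timed function would also need to wait for the device to finish before the clock is read.

```python
import time


def average_time(fn, runs=1000, warmup=10):
    """Return the mean wall-clock time of fn() in seconds over `runs` calls.

    `warmup` untimed calls are made first (an assumption of this sketch,
    not something the report states) so one-off setup costs are excluded.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / runs


# Usage with a stand-in workload; a real test would call the SpMV routine.
calls = []
avg = average_time(lambda: calls.append(1), runs=1000, warmup=10)
print(f"average time per run: {avg:.3e} s")
```

Timing the whole loop once and dividing by the run count avoids adding timer overhead to every individual run.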
Fig. 5. Speed-up of SpeedIT CMR vs. CUSPARSE and CUSP in single precision.
Fig. 6. Speed-up of SpeedIT CSR vs. CUSPARSE and CUSP in single precision.
Fig. 7. Speed-up of SpeedIT CMR vs. CUSPARSE and CUSP in double precision.
Fig. 8. Speed-up of SpeedIT CSR vs. CUSPARSE and CUSP in double precision.
Conclusions
- The highest speed-up of SpMV in SpeedIT CMR is about 2x over CUSPARSE and more than 4x over CUSP.
- The highest speed-up of SpMV in SpeedIT CSR over CUSPARSE and CUSP is about 1.4x.
- SpeedIT performs better for large matrices (more than 100 000 nonzeros, NNZ), and the CMR format is more efficient than CSR.
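The speed-up figures quoted above are presumably ratios of average SpMV times. A minimal sketch of that arithmetic, together with the small/large grouping by NNZ, is given below. The function names are assumptions, the definition speed-up = baseline time / SpeedIT time is the conventional one rather than something the report states, and the times in the example are placeholders, not measured values.

```python
LARGE_NNZ_THRESHOLD = 100_000  # "large" means more than 100 000 nonzeros


def speedup(baseline_time, speedit_time):
    """Conventional speed-up: how many times faster than the baseline."""
    if speedit_time <= 0:
        raise ValueError("speedit_time must be positive")
    return baseline_time / speedit_time


def size_group(nnz):
    """Classify a matrix as 'small' or 'large' by its nonzero count."""
    return "large" if nnz > LARGE_NNZ_THRESHOLD else "small"


# Placeholder times in milliseconds (hypothetical, for illustration only).
baseline_ms = 0.50
speedit_ms = 0.25

ratio = speedup(baseline_ms, speedit_ms)
print(f"speed-up: {ratio:.1f}x")                   # speed-up: 2.0x
print(size_group(250_000), size_group(100_000))    # large small
```

A ratio above 1 means SpeedIT is faster than the baseline; a ratio below 1 means it is slower.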


