SpeedIT Tools: Beyond Acceleration

July 18, 2010

Official Release

Filed under: Uncategorized — Tags: — admin @ 12:07 am

SpeedIT V1.0 has been officially released. See speedit.vratis.com for details.

May 21, 2010


Filed under: Uncategorized — Tags: — admin @ 6:32 pm

We are currently preparing the official release. Stay tuned: the website should be available in a week or two. In addition, we have started implementing other, more sophisticated solvers.

Thank you for your interest and for helping us with the testing.

March 23, 2010

SpeedIT eXtreme – Reference Manual

Filed under: Uncategorized — Tags: — admin @ 12:50 pm

SpeedIT eXtreme – Programmer’s Guide

Filed under: Uncategorized — admin @ 12:47 pm

March 4, 2010

Release 0.9

Filed under: Uncategorized — admin @ 6:24 am

SpeedIT Toolkit 0.9 has been released. The library has been deployed internally, tested and validated in a real scenario (blood flow in an aortic bifurcation, with data from the IT’IS Foundation, Switzerland).

If you are interested in our development, send us an email at info <at> vratis.com

February 11, 2010

Testing solvers

Filed under: Results — Tags: , — admin @ 9:07 am

We are currently testing our iterative solvers (BiCGSTAB and CG), with and without preconditioners. All tests use the same benchmark set of matrices: for a given matrix we measure the performance (GFLOPS and speed-up) as a function of the number of iterations needed to reach a stable solution. The results will be compared with a standard CPU implementation of these solvers, for example the one in the MKL library.
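
To make the "iterations needed" measurement concrete, here is a minimal pure-Python sketch of the conjugate gradient method that returns the iteration count along with the solution. It is not the SpeedIT implementation: the function names, the relative-residual stopping rule and the optional Jacobi (diagonal) preconditioner are assumptions chosen for illustration, since the post does not say which preconditioners or convergence criteria are used.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """y = A @ x for a matrix A stored in CSR format."""
    return [
        sum(values[k] * x[col_idx[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(row_ptr, col_idx, values, b,
                       inv_diag=None, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite CSR matrix A.

    inv_diag is an optional list of 1 / A[i][i] values; when given, it is
    applied as a Jacobi preconditioner (an assumption made for this sketch).
    Returns (x, iterations).
    """
    def precondition(r):
        if inv_diag is None:
            return list(r)
        return [d * ri for d, ri in zip(inv_diag, r)]

    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the start x = 0
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0                  # zero right-hand side: x = 0 is exact
    z = precondition(r)
    p = list(z)
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        Ap = csr_spmv(row_ptr, col_idx, values, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        # Stop once the relative residual norm drops below tol.
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        z = precondition(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small SPD test system [[4, 1], [1, 3]] x = [1, 2]; exact x = [1/11, 7/11].
row_ptr = [0, 2, 4]
col_idx = [0, 1, 0, 1]
values = [4.0, 1.0, 1.0, 3.0]
b = [1.0, 2.0]

x_plain, iters_plain = conjugate_gradient(row_ptr, col_idx, values, b)
x_jacobi, iters_jacobi = conjugate_gradient(
    row_ptr, col_idx, values, b, inv_diag=[1 / 4.0, 1 / 3.0])
print(x_plain, iters_plain)
print(x_jacobi, iters_jacobi)
```

Each iteration performs one SpMV, so the returned count can be combined with a per-iteration timing to plot performance against iterations in the way the tests describe.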

January 4, 2010

Preliminary tests

Filed under: Uncategorized — Tags: , — admin @ 9:48 pm

Since around 90% of the computational time is spent on sparse matrix-vector multiplication (SpMV), we focused on testing this operation first. The attached charts present our results for 23 matrices that differ in size, number of nonzeros (NNZ) and structure. As you can see, the performance depends strongly on the matrix structure. This is why we decided to have two separate kernels for two types of matrices: sparse and denser ones. Please also note that, because of the memory transfers and the PCIe bottleneck, it is not worth using our solvers for only a few iterations.
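
For readers unfamiliar with the operation being benchmarked, the following is a minimal pure-Python sketch of SpMV on a CSR matrix, together with a throughput estimate. It only illustrates the arithmetic; it is not our GPU kernel. The function names are hypothetical, and the GFLOPS figure assumes the usual convention of two floating-point operations (one multiply, one add) per stored nonzero.

```python
import time


def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def aggregate_gflops(row_ptr, col_idx, values, x, runs=10):
    """Total flops over total time for `runs` SpMV calls, in GFLOPS."""
    nnz = len(values)
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        csr_spmv(row_ptr, col_idx, values, x)
        total_seconds += time.perf_counter() - start
    if total_seconds == 0.0:
        return float("inf")  # timer too coarse for such a small matrix
    # Assumed convention: 2 flops (multiply + add) per stored nonzero.
    return 2.0 * nnz * runs / total_seconds / 1e9


# 3x3 example matrix:
#   [[4, 0, 1],
#    [0, 3, 0],
#    [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)  # [7.0, 6.0, 17.0]
print(aggregate_gflops(row_ptr, col_idx, values, x))
```

The inner loop reads `x` at positions given by `col_idx`, so the memory access pattern follows the matrix structure. This is one plausible reason why performance varies so much between matrices.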

Performance of SpMV Multiplication in Double Precision

Performance of SpMV Multiplication in Single Precision

Speed-up GPU vs. CPU


  1. Peak performance was calculated as the mean of 10 runs under the same experimental conditions.
  2. Benchmark matrices were collected from the University of Florida Sparse Matrix Collection in CSR format.
  3. Not all of the matrices could be loaded into GPU memory because of its limited size.
  4. CPU denotes an SpMV operation from the Intel Math Kernel Library.
  5. GPU denotes our SpMV kernel.
  6. CPU machine: AMD Athlon(tm) 64 X2 Processor 3800+ running at 2010.373 MHz with 3 GB DDR 400 MHz (dual channel, bandwidth 6.4 GB/s) and an nForce 4 SLI chipset.
  7. GPU machine: NVIDIA GeForce GTX 295 (480 SP) with 1792 MB GDDR3 (896 bits) at 999 MHz and 223.8 GB/s bandwidth, on PCI-Express 2.0.
  8. Bandwidth for ONE device measured with the bandwidthTest utility from the CUDA SDK:
      device to device: 93 GB/s
      host to device pageable memory: 1090 MB/s
      host to device non-pageable memory: 1591 MB/s
  9. System: Ubuntu 9.10 64-bit, NVIDIA driver version 190.42, CUDA version 2.3
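
The host-to-device bandwidths in note 8 give a rough idea of why a few iterations are not enough to pay for the PCIe transfer. The sketch below estimates the one-off cost of copying a CSR matrix to the GPU and the number of iterations needed to recover it. Everything except the two bandwidth figures is an assumption made for illustration: 1 MB is taken as 10^6 bytes, indices are 4-byte integers, values are 8-byte doubles, the matrix dimensions and per-iteration times are made up, and the vectors and the copy back to the host are ignored.

```python
import math

# Host-to-device bandwidths from note 8, assuming 1 MB = 10**6 bytes.
PAGEABLE_BYTES_PER_S = 1090e6
PINNED_BYTES_PER_S = 1591e6


def csr_bytes(n_rows, nnz, value_bytes=8, index_bytes=4):
    """Bytes occupied by the three CSR arrays: values, col_idx and row_ptr."""
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes


def transfer_seconds(n_bytes, bytes_per_second):
    """Idealised copy time: payload divided by sustained bandwidth."""
    return n_bytes / bytes_per_second


def break_even_iterations(transfer_s, cpu_iter_s, gpu_iter_s):
    """Smallest iteration count n with transfer + n * gpu <= n * cpu.

    Returns None when the GPU iteration is not faster than the CPU one.
    """
    saving = cpu_iter_s - gpu_iter_s
    if saving <= 0:
        return None
    return math.ceil(transfer_s / saving)


# Hypothetical matrix: one million rows, ten million nonzeros, doubles.
size = csr_bytes(n_rows=1_000_000, nnz=10_000_000)
t_pageable = transfer_seconds(size, PAGEABLE_BYTES_PER_S)
t_pinned = transfer_seconds(size, PINNED_BYTES_PER_S)
print(f"{size / 1e6:.1f} MB, pageable {t_pageable:.3f} s, pinned {t_pinned:.3f} s")

# Hypothetical per-iteration times (not measurements): 10 ms CPU, 1 ms GPU.
print(break_even_iterations(t_pinned, cpu_iter_s=0.010, gpu_iter_s=0.001))
```

With real timings for a given matrix substituted for the hypothetical ones, the same arithmetic indicates how many solver iterations are needed before offloading pays off.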

Benchmark Matrices from the University of Florida Sparse Matrix Collection

December 10, 2009

SpeedIT Tools

The SpeedIT Tools library provides a set of accelerated solvers for sparse linear systems of equations. A speed-up of more than an order of magnitude is achieved with a single, reasonably priced NVIDIA Graphics Processing Unit (GPU) that supports CUDA, combined with proprietary advanced optimisation techniques. The library can be used in a wide range of domains where problems have an underlying 2D or 3D geometry, such as computational fluid dynamics, electromagnetics, thermodynamics, materials, acoustics, computer vision and graphics, robotics, semiconductor devices and structural engineering. It can also be used for problems without a defined geometry, such as quantum chemistry, statistics, power networks and other graphs, and chemical process simulation. All computations are performed in single or double floating-point precision. Two linear system solvers and two preconditioners are supplied.
