OpenFOAM® simulations usually take a significant amount of time, which raises the cost of prototyping. GPU technology could potentially overcome this problem; however, due to the limited memory on a GPU card, realistic simulations are usually not feasible on a single device. To address this we have implemented a multi-GPU version of SpeedIT, in which we accelerate the Preconditioned Conjugate Gradient (PCG) solver used to calculate the pressure. Indeed, it is not uncommon that solving the pressure equation takes more time than solving the momentum equations.
Here you will find the results of our tests performed for various OpenFOAM® simulations run with solvers such as icoFoam and simpleFoam. The test cases were either created by us or provided by our partners Engys and IconCFD.
SpeedIT implements SpMV on multi-GPU systems. The SpeedIT Plugin was used to call SpeedIT from OpenFOAM®. The following tests were performed in both multi-CPU and multi-GPU environments. Since the accuracy of SpeedIT was already validated (see our previous reports), we ran each test for a fixed number of time steps.
- Cavity3D with varying number of cells, icoFoam, PCG on both CPU and GPU.
- AhmedBody, 2.2M cells, simpleFoam (the test was prepared by Engys).
- Cabin, 1.5M cells, simpleFoam (the test was prepared by IconCFD).
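The core kernel SpeedIT accelerates is sparse matrix–vector multiplication (SpMV) over the matrices OpenFOAM® assembles. As a purely illustrative sketch (SpeedIT runs this kernel on the GPU; this CPU-side NumPy version only shows the CSR arithmetic, and the matrix values are made up for the example):

```python
import numpy as np

def spmv_csr(vals, cols, row_ptr, x):
    """y = A @ x for a matrix stored in CSR format.
    Illustrative CPU sketch of the kernel SpeedIT runs on the GPU."""
    n = len(row_ptr) - 1
    y = np.zeros(n)
    for i in range(n):
        # non-zeros of row i live in vals[start:end]
        start, end = row_ptr[i], row_ptr[i + 1]
        y[i] = np.dot(vals[start:end], x[cols[start:end]])
    return y

# Small 3x3 example: A = [[4, 1, 0], [1, 3, 0], [0, 0, 2]]
vals = np.array([4.0, 1.0, 1.0, 3.0, 2.0])
cols = np.array([0, 1, 0, 1, 2])
row_ptr = np.array([0, 2, 4, 5])
x = np.array([1.0, 2.0, 3.0])
print(spmv_csr(vals, cols, row_ptr, x))  # [6. 7. 6.]
```

In the multi-GPU setting the rows of the matrix are partitioned across devices, so each GPU performs an SpMV of this form on its own block of rows.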
The following graphs present the performance of all three cases on multi-CPU and multi-GPU systems. The first column covers the cavity3D case solved with icoFoam, while the second covers the AhmedBody and Cabin tests solved with simpleFoam. Figs. 1-2 present the acceleration as the ratio of n CPUs vs. n GPUs, where n varied from 1 to 16; they show what happens when calculations are moved from a standard cluster to a cluster equipped with GPU cards. Figs. 3-4 present the ratio of 1 CPU vs. n GPUs, i.e. the scenario in which the calculations are ported from a PC to a multi-GPU system.
Finally, Figs. 5-6 depict two comparisons: n CPUs vs. 1 CPU (cluster) and n GPUs vs. 1 GPU. From these scaling factors one can conclude whether it is reasonable to add more CPUs or GPUs to the system.
To validate the results we calculated the precision, defined as the difference between the p, U and phi fields computed on the CPU and on the GPU, taken over all cells in the geometry.
For example, for cavity3D with 32M cells, 5000 time steps, and n GPUs vs. n CPUs where n = 2, 4, 12, 16, 18, the precision was 1e-4, 1e-7 and 1e-14 for p, U and phi, respectively.
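Such a per-field precision can be computed as the largest per-cell deviation between the two runs. A minimal sketch, assuming the CPU and GPU results for one field are available as flat arrays of cell values (the sample numbers below are invented for illustration):

```python
import numpy as np

def field_precision(cpu_field, gpu_field):
    """Maximum absolute per-cell difference between a field computed
    on the CPU and the same field computed on the GPU."""
    cpu = np.asarray(cpu_field, dtype=float)
    gpu = np.asarray(gpu_field, dtype=float)
    return float(np.max(np.abs(cpu - gpu)))

# Hypothetical pressure values for three cells
p_cpu = np.array([1.00000, 2.00000, 3.00000])
p_gpu = np.array([1.00005, 1.99998, 3.00002])
print(field_precision(p_cpu, p_gpu))  # largest deviation, here ~5e-5
```

The same comparison, applied cell by cell to p, U and phi, yields the figures quoted above.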
The results lead to the following conclusions:
- The bigger the case, the better the acceleration on the GPU. The scaling of the 1M and 8M cases in Figs. 1-2 indirectly confirms this hypothesis.
- It is reasonable to accelerate OpenFOAM® simulations with GPU cards. Even a standard PC equipped with one or more GPU cards should perform better.
- A cluster equipped with additional GPU cards should also provide acceleration, especially for larger test cases (see Fig. 3 and the 8M case).
The pressure equation was solved with PCG using a diagonal preconditioner. Numerous publications and tests show that GAMG converges faster and is the solver usually chosen by OpenFOAM® users. In the future, we plan to replace the diagonal preconditioner with alternative ones that converge faster.
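For reference, PCG with a diagonal (Jacobi) preconditioner can be sketched as follows. This is a generic dense NumPy illustration of the algorithm, not SpeedIT's GPU implementation, and the test system is invented for the example:

```python
import numpy as np

def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Preconditioned Conjugate Gradient with a diagonal (Jacobi)
    preconditioner M = diag(A), for a symmetric positive-definite A."""
    x = np.zeros_like(b)
    r = b - A @ x                    # initial residual
    Minv = 1.0 / np.diag(A)          # diagonal preconditioner, applied cheaply
    z = Minv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p                   # the SpMV step dominates the cost
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = Minv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Small SPD test system
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(pcg_jacobi(A, b))  # close to the exact solution [1/11, 7/11]
```

The diagonal preconditioner is trivially parallel (one division per cell), which makes it attractive on GPUs even though stronger preconditioners or GAMG need fewer iterations.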
We would kindly like to acknowledge Engys Ltd. and IconCFD Ltd. for providing us with the test cases. The implementation of the multi-GPU version of SpeedIT was partly funded by the Green Transfer Programme of the City of Wroclaw, financed by the European Social Fund. This research was supported in part by the PL-Grid Infrastructure.
Contact: sales (at) vratis.com and info (at) vratis.com
- This offering is not approved or endorsed by OpenCFD Limited, the producer of the OpenFOAM software and owner of the OPENFOAM® and OpenCFD® trade marks (see the Disclaimer).
- The views and statements expressed in this blog are those of Vratis Ltd. and are not necessarily the views of, or endorsed by, the 3rd parties named in this activity.
- OPENFOAM® is a registered trade mark of OpenCFD Limited, the producer of the OpenFOAM software.