A Feasibility Study on Porting the Community Land Model onto Accelerators Using OpenACC

International Journal of Advanced Computer Science and Applications (IJACSA), 5(12), 2014. As environmental models (such as Accelerated Climate Model for Energy (ACME), Parallel Reactive Flow and Transport Model (PFLOTRAN), Arctic Terrestrial Simulator (ATS), etc.) become more and more complicated, we...


Bibliographic Details
Published in: International Journal of Advanced Computer Science and Applications
Main Authors: D. Wang, W. Wu, F. Winkler, W. Ding, O. Hernandez
Format: Text
Language: English
Published: The Science and Information (SAI) Organization 2014
Online Access: https://doi.org/10.14569/IJACSA.2014.051203
Description
Summary: International Journal of Advanced Computer Science and Applications (IJACSA), 5(12), 2014. As environmental models (such as the Accelerated Climate Model for Energy (ACME), the Parallel Reactive Flow and Transport Model (PFLOTRAN), and the Arctic Terrestrial Simulator (ATS)) become more and more complicated, we face enormous challenges in porting those applications onto hybrid computing architectures. OpenACC is emerging as a very promising technology; we have therefore conducted a feasibility analysis on porting the Community Land Model (CLM), a terrestrial ecosystem model within the Community Earth System Model (CESM). Specifically, we used an automatic function testing platform to extract a small computing kernel out of CLM, then applied this kernel within the actual CLM dataflow procedure and investigated the data parallelization strategy and the benefit of the data movement provided by the current implementation of OpenACC. Even though it is a non-intensive kernel, on a single 16-core computing node the OpenACC implementation (timed on the actual computation using one GPU) is 2.3 times faster than the OpenMP implementation using a single OpenMP thread, but 2.8 times slower than the OpenMP implementation using 16 threads. On multiple nodes, the MPI_OpenACC implementation demonstrated very good scalability on up to 128 GPUs on 128 computing nodes. This study also provides useful information for examining the potential benefits of the “deep copy” capability and the “routine” feature of the OpenACC standard. We believe that our experience with the environmental model CLM can benefit many other scientific research programs that are interested in using OpenACC to port their large-scale scientific code onto high-end computers with hybrid computing architectures. http://thesai.org/Downloads/Volume5No12/Paper_3-A_Feasibility_Study_on_Porting_the_Community.pdf