An implementation of a numerical advection equation solver on modern graphics cards using compute unified device architecture

Thesis (M.S.) University of Alaska Fairbanks, 2010 "In the past decade, the Graphics Processing Unit (GPU) is reported to have become a powerful general-purpose computation platform for various application areas. The Arctic Region Supercomputing Center (ARSC) intends to assess the capability of...

Bibliographic Details
Main Author: Dang, Wei
Format: Thesis
Language: English
Published: 2010
Subjects: Graphics processing units; Computer graphics
Online Access: http://hdl.handle.net/11122/12771
id ftunivalaska:oai:scholarworks.alaska.edu:11122/12771
record_format openpolar
spelling ftunivalaska:oai:scholarworks.alaska.edu:11122/12771 2023-05-15T15:16:37+02:00 An implementation of a numerical advection equation solver on modern graphics cards using compute unified device architecture Dang, Wei 2010-12 http://hdl.handle.net/11122/12771 en_US eng http://hdl.handle.net/11122/12771 Department of Electrical and Computer Engineering Graphics processing units Computer graphics Thesis ms 2010 ftunivalaska 2023-02-23T21:37:59Z Thesis Arctic Alaska University of Alaska: ScholarWorks@UA Arctic Fairbanks
institution Open Polar
collection University of Alaska: ScholarWorks@UA
op_collection_id ftunivalaska
language English
topic Graphics processing units
Computer graphics
description Thesis (M.S.) University of Alaska Fairbanks, 2010
"In the past decade, the Graphics Processing Unit (GPU) is reported to have become a powerful general-purpose computation platform for various application areas. The Arctic Region Supercomputing Center (ARSC) intends to assess the capability of this emerging computing tool so that they may enlist it as component of supercomputing systems, but at a lower cost. This thesis reports on parallelization, on both GPU and CPU, of a numerical algorithm named the Total Variation Diminishing (TVD) scheme, which is used in the Eulerian Polar Parallel Ionospheric Model (EPPIM) developed at UAF's Geophysical Institute (GI) and ARSC. The GPU (single NVIDIA Tesla® C2050) and CPU (dual Intel Xeon x5560) implementations were parallelized using the Compute Unified Device Architecture (CUDA) language and OpenMP with the C language respectively. A speedup of up to 175x was observed when comparing the CUDA/GPU implementation to the non-parallelized CPU version, and of almost 40x when comparing to the parallelized CPU version. Results also demonstrated an average floating-point-operation rate of 107 GFLOPs, 351 times more than that the CPU version can offer. However, there is still space for improvement as only one tenth of the peak theoretical performance of the C2050 was achieved"--Leaf iii.
1. Introduction -- 1.1. Motivation -- 1.2. Similar work -- 1.3. Contribution -- 1.4. Thesis outline -- 2. Background -- 2.1. Evolution of GPU computing -- 2.2. Compute Unified Device Architecture -- 2.2.1. Hardware architecture -- 2.2.2. Software architecture -- 2.2.3. Terminology -- 2.2.4. Compilation workflow -- 2.2.5. CUDA memory model -- 2.2.6. Programming methodology -- 2.2.7. Performance considerations for scientific computing -- 2.3. Mathematical background -- 2.3.1. Continuity equation -- 2.3.2. Numerical schemes -- 2.3.3. The corner transport upwind scheme -- 2.3.4. The Lax-Wendroff scheme -- 2.3.5. The TVD scheme -- 3. Algorithms -- 3.1. Introduction -- 3.2. The serial algorithm -- ...
format Thesis
author Dang, Wei
title An implementation of a numerical advection equation solver on modern graphics cards using compute unified device architecture
publishDate 2010
url http://hdl.handle.net/11122/12771
geographic Arctic
Fairbanks
genre Arctic
Alaska
op_relation http://hdl.handle.net/11122/12771
Department of Electrical and Computer Engineering
_version_ 1766346926716878848