CENTER FOR HIGH PERFORMANCE COMPUTING
GPUs for Scientific Computing
Wim R.M. Cardoen
Center for High Performance Computing
wim.cardoen@utah.edu
Fall 2011
Overview
•  Why GPUs?
•  Architecture
•  CUDA
•  Basic example(s)
•  Shared Memory
•  Libraries
•  CUDA Fortran
•  Alternatives to CUDA

12/20/11
Why GPUs?
M2090 (Fermi Architecture):
Peak performance: 665 GFLOP/s (DP) and 1331 GFLOP/s (SP)
Memory bandwidth: 177 GB/s (ECC off)
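The bandwidth figure matters as much as the FLOP rate. A rough machine-balance estimate can be made from the two numbers above. This is only an illustrative sketch: the function name `flops_per_double_loaded` is hypothetical, and it assumes every operand is an 8-byte double streamed from global memory with no cache reuse.

```python
# Rough machine balance for the M2090, using the peak figures quoted above.
PEAK_DP_GFLOPS = 665.0   # double-precision peak, GFLOP/s
BANDWIDTH_GBS = 177.0    # global-memory bandwidth, GB/s (ECC off)
BYTES_PER_DOUBLE = 8     # assumption: all operands are 8-byte doubles


def flops_per_double_loaded(peak_gflops, bandwidth_gbs,
                            bytes_per_word=BYTES_PER_DOUBLE):
    """Peak floating-point operations available in the time needed to
    stream one word from global memory."""
    gwords_per_second = bandwidth_gbs / bytes_per_word
    return peak_gflops / gwords_per_second


balance = flops_per_double_loaded(PEAK_DP_GFLOPS, BANDWIDTH_GBS)
print(f"about {balance:.0f} DP flops per double loaded")
```

Under these assumptions the ratio comes out at roughly 30, so a kernel that performs far fewer operations per value loaded is likely to be limited by memory bandwidth rather than by arithmetic throughput.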
Architecture
•  CPU/Multi-GPU System HP-SL390

Source: K. Spafford, J. S. Meredith, and J. S. Vetter, "Quantifying NUMA and Contention Effects in
Multi-GPU Systems", Fourth Workshop on General-Purpose Computation on Graphics Processors
(GPGPU), 2011.
•  M2090:
o  SIMT (cf. SIMD)
o  16 SMPs (Streaming Multiprocessors)
o  Each SMP: 32 cores, i.e. 16 × 32 = 512 cores in total
o  All 16 SMPs share a 768 kB L2 cache (new in Fermi)
o  Constant memory: 64 kB
o  Global memory: 6 GB (GDDR5)
o  GPU clock speed: 1.3 GHz
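The core count and clock speed listed above can be used to cross-check the quoted peak rates. The sketch below is a back-of-the-envelope calculation, not a vendor formula taken from the slides. It assumes each core completes one fused multiply-add (counted as 2 floating-point operations) per clock in single precision, and that double precision runs at half that rate. The function name `peak_gflops` is hypothetical.

```python
# Back-of-the-envelope peak rate for the M2090 from the figures above.
NUM_SMPS = 16
CORES_PER_SMP = 32
CLOCK_GHZ = 1.3
FLOPS_PER_CORE_PER_CLOCK = 2   # assumption: one fused multiply-add per clock
DP_TO_SP_RATIO = 0.5           # assumption: DP runs at half the SP rate


def peak_gflops(num_smps, cores_per_smp, clock_ghz,
                flops_per_clock=FLOPS_PER_CORE_PER_CLOCK):
    """Theoretical peak in GFLOP/s: cores x clock (GHz) x flops per clock."""
    return num_smps * cores_per_smp * clock_ghz * flops_per_clock


sp_peak = peak_gflops(NUM_SMPS, CORES_PER_SMP, CLOCK_GHZ)
dp_peak = sp_peak * DP_TO_SP_RATIO
print(f"SP peak: {sp_peak:.1f} GFLOP/s")
print(f"DP peak: {dp_peak:.1f} GFLOP/s")
```

Under these assumptions the estimates land within 1 GFLOP/s of the 1331 GFLOP/s (SP) and 665 GFLOP/s (DP) quoted for the card.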
Fermi architecture block diagram (figure; shows the L2 cache shared by the SMPs)
Source: T. R. Halfhill. White Paper “Looking Beyond Graphics”
•  SMP:
o  Each SMP: 32 cores & 4 SFUs (special function units)
o  Each core: FP/INT unit
o  L1 cache (new in Fermi)
o  Each SMP: can manage up to 48 resident warps (1536 threads)
o  Warp size: 32 threads
o  Shared memory (per block): 48 kB
o  #Registers (per block): 32768
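The per-SMP limits above jointly determine how many thread blocks can be resident on one SMP at a time. The following is a simplified sketch of that calculation, not NVIDIA's occupancy calculator. It assumes the Fermi limit of 48 resident warps per SMP and treats the 32768 registers and 48 kB of shared memory as SMP-wide pools divided among resident blocks. It also assumes a cap of 8 resident blocks per SMP, which is not stated on the slide, and it ignores register and shared-memory allocation granularity. The function names are hypothetical.

```python
import math

WARP_SIZE = 32
MAX_WARPS_PER_SMP = 48          # assumption: Fermi limit (48 warps = 1536 threads)
MAX_BLOCKS_PER_SMP = 8          # assumption: not stated on the slide
REGISTERS_PER_SMP = 32768       # treated here as an SMP-wide pool
SHARED_MEM_PER_SMP = 48 * 1024  # bytes, assuming the 48 kB shared-memory setting


def resident_blocks(threads_per_block, regs_per_thread, shared_bytes_per_block):
    """Number of blocks that fit on one SMP under the limits above."""
    warps_per_block = math.ceil(threads_per_block / WARP_SIZE)
    regs_per_block = regs_per_thread * warps_per_block * WARP_SIZE
    limits = [
        MAX_BLOCKS_PER_SMP,
        MAX_WARPS_PER_SMP // warps_per_block,
        REGISTERS_PER_SMP // regs_per_block,
    ]
    if shared_bytes_per_block > 0:
        limits.append(SHARED_MEM_PER_SMP // shared_bytes_per_block)
    return min(limits)


def occupancy(threads_per_block, regs_per_thread, shared_bytes_per_block):
    """Fraction of the SMP's warp slots occupied by resident blocks."""
    blocks = resident_blocks(threads_per_block, regs_per_thread,
                             shared_bytes_per_block)
    warps_per_block = math.ceil(threads_per_block / WARP_SIZE)
    return blocks * warps_per_block / MAX_WARPS_PER_SMP


# Example: 256 threads per block, 20 registers per thread, 4 kB shared memory.
print(resident_blocks(256, 20, 4096), occupancy(256, 20, 4096))
# Same block size, but 32 registers per thread and no shared memory.
print(resident_blocks(256, 32, 0), occupancy(256, 32, 0))
```

In the second example the register pool becomes the binding limit: only 4 blocks (32 warps) fit, so a third of the warp slots stay empty.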
•  SMP block diagram:
•  Multithreading in Fermi Arch.:

Source: T. R. Halfhill. White Paper “Looking Beyond Graphics”