Lecture 2: Sets and Functions - MA101: Calculus (Semester 1)
42 pages
English

Description

  • lecture
MA101: Calculus (Semester 1), Lecture 2: Sets and Functions. Tuesday, 20 September 2011.
  • examples ma101
  • short review of sets
  • idea of a function
  • numbers system
  • ma101
  • subset of real numbers
  • subset of the real numbers
  • numbers
  • function

Information

Number of reads: 18
Language: English
File size: 2 MB

Excerpt

CENTER FOR HIGH PERFORMANCE COMPUTING
GPUs for Scientific Computing
Wim R.M. Cardoen
Center for High Performance Computing
wim.cardoen@utah.edu
Fall 2011
Overview
•  Why GPUs?
•  Architecture
•  CUDA
•  Basic example(s)
•  Shared Memory
•  Libraries
•  CUDA Fortran
•  Alternatives to CUDA

12/20/11
Why GPUs?
M2090 (Fermi architecture):
Peak performance: 665 GFLOP/s (DP) and 1331 GFLOP/s (SP)
Memory bandwidth: 177 GB/s (ECC off)
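As a rough illustration of what these figures imply, the sketch below divides the quoted peak rate by the quoted memory bandwidth. This estimates how many floating-point operations a kernel would need per value streamed from global memory to keep the arithmetic units busy. The names (`PEAK_GFLOPS`, `flops_per_value`) are illustrative, and the result is a back-of-the-envelope bound, not a measurement.

```python
# Peak figures quoted on the slide for the M2090.
PEAK_GFLOPS = {"DP": 665.0, "SP": 1331.0}  # GFLOP/s
BANDWIDTH_GBS = 177.0                      # GB/s, ECC off
BYTES_PER_VALUE = {"DP": 8, "SP": 4}       # size of one double / one float


def flops_per_value(precision):
    """Flops the GPU could execute in the time needed to stream one value."""
    flops_per_byte = PEAK_GFLOPS[precision] / BANDWIDTH_GBS
    return flops_per_byte * BYTES_PER_VALUE[precision]


for precision in ("DP", "SP"):
    print(f"{precision}: {flops_per_value(precision):.1f} flops per value loaded")
```

Both precisions come out near 30 flops per value loaded. This suggests that a kernel doing much less arithmetic per load is likely to be limited by memory bandwidth rather than by the cores.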
Architecture
•  CPU/Multi-GPU System HP-SL390

Source: K. Spafford, J. S. Meredith and J. S. Vetter. "Quantifying NUMA and Contention Effects in Multi-GPU Systems", Fourth Workshop on General-Purpose Computation on Graphics Processors (GPGPU), 2011
•  M2090:
o  SIMT (cf. SIMD)
o  16 SMPs (Streaming Multiprocessors)
o  32 cores per SMP => 512 cores in total
o  All 16 SMPs share a 768 kB L2 cache (new)
o  Constant memory: 64 kB
o  Global memory: 6 GB (GDDR5)
o  GPU clock speed: 1.3 GHz
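The core count and clock speed listed above are enough to reproduce the peak rates quoted earlier for the M2090. The sketch below is a minimal worked calculation under two assumptions that are not stated on the slide: each core retires one fused multiply-add (2 flops) per cycle in single precision, and double precision runs at half the single-precision rate. The variable names are illustrative.

```python
SMPS = 16                # streaming multiprocessors (from the slide)
CORES_PER_SMP = 32       # cores per SMP (from the slide)
CLOCK_GHZ = 1.3          # GPU clock speed (from the slide)
FLOPS_PER_CYCLE = 2      # assumption: one fused multiply-add per core per cycle
DP_TO_SP_RATIO = 0.5     # assumption: DP runs at half the SP rate

cores = SMPS * CORES_PER_SMP
peak_sp_gflops = cores * CLOCK_GHZ * FLOPS_PER_CYCLE
peak_dp_gflops = peak_sp_gflops * DP_TO_SP_RATIO

print(f"cores: {cores}")
print(f"peak SP: {peak_sp_gflops:.1f} GFLOP/s")
print(f"peak DP: {peak_dp_gflops:.1f} GFLOP/s")
```

Under these assumptions the results are about 1331.2 GFLOP/s (SP) and 665.6 GFLOP/s (DP), consistent with the quoted 1331 and 665 once the decimals are dropped.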
[Figure: Fermi architecture block diagram, with the L2 cache labeled]
Source: T. R. Halfhill. White Paper “Looking Beyond Graphics”
•  SMP:
o  32 cores and 4 SFUs (special function units)
o  Each core: one FP unit and one INT unit
o  L1 cache (new)
o  Each SMP can manage 48 warps (1536 threads)
o  Warp size: 32 threads
o  Shared memory (per block): 48 kB
o  Registers (per block): 32768
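The limits listed above interact when choosing a block size. The sketch below works through the arithmetic under two assumptions: the figure of 48 counts warps (as in NVIDIA's documentation for Fermi), and the 32768 registers form a pool shared by all threads resident on one SMP. The helper `warps_per_block` and the variable names are hypothetical, introduced only for illustration.

```python
WARP_SIZE = 32                 # threads per warp (from the slide)
MAX_WARPS_PER_SMP = 48         # assumption: the slide's "48" counts warps
REGISTERS = 32768              # 32-bit registers (from the slide)
SHARED_MEM_BYTES = 48 * 1024   # 48 kB of shared memory (from the slide)


def warps_per_block(threads_per_block):
    """Warps needed for one block; a partly filled warp still counts as one."""
    return -(-threads_per_block // WARP_SIZE)  # ceiling division


max_resident_threads = WARP_SIZE * MAX_WARPS_PER_SMP
regs_per_thread_at_full_load = REGISTERS // max_resident_threads
doubles_in_shared_mem = SHARED_MEM_BYTES // 8

print("threads per SMP at 48 warps:", max_resident_threads)
print("registers per thread at that load:", regs_per_thread_at_full_load)
print("doubles that fit in shared memory:", doubles_in_shared_mem)
print("warps for a 100-thread block:", warps_per_block(100))
```

Two practical points follow from the numbers. A kernel that needs more than about 21 registers per thread cannot keep all 48 warps resident. A block size that is not a multiple of 32 leaves lanes idle in its last warp, so a 100-thread block occupies 4 warps.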
•  SMP block diagram:
•  Multithreading in the Fermi architecture:

Source: T. R. Halfhill. White Paper “Looking Beyond Graphics”