This is a series of 4 talks about GPGPUs, intended for the practical engineer.
Note: There are installation instructions for the hands-on part at the bottom of this page.
Also, please note that this page is being updated as we run through the series, so check back every now and then if you're interested in these lectures.
General-purpose GPU (GPGPU) programming has become a hot topic in the last few years, with uses ranging from academic studies to commercial software products. As an example, three of the world's top 10 supercomputers (June 2011 list) contain GPUs. This series of lectures focuses on OpenCL, the open standard for parallel programming of heterogeneous systems.
The third and fourth meetings will include hands-on experience, under Ofer's guidance. For these meetings, each participant will have to bring a laptop with:
Lecture slides for part 1 in PDF format
Lecture slides for part 2 in PDF format
Lecture slides for part 3 in PDF format
Hands-on code to work on (zip format)
Evolution and Graphics, by Eric Demers, pdf
Lecture slides for part 4 in PDF format
1. GPGPU introduction
The first lecture is an introduction to GPU architecture and GPGPU programming. It covers the differences between GPU and CPU architectures, and how these differences impose restrictions on programming GPUs. We will also touch on the memory aspects of the GPU architecture and of the overall system (CPU & GPU).
2. OpenCL overview
From the Khronos website: "OpenCL is the first open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers and handheld/embedded devices." This lecture will provide an overview of OpenCL, covering the API programming aspects (such as OpenCL objects, contexts, queues, events, etc.) as well as the language enhancements (such as vectors, images, samplers, built-in functions, etc.).
3. OpenCL Do's and Don'ts
This lecture is a practical guide to programming in OpenCL, built around a guided hands-on session of writing OpenCL applications and kernels. Starting from basic examples and moving on to more complex scenarios, we will provide tips for writing code that produces correct results, along with some performance tips.
4. OpenCL Optimization & Profiling
This lecture focuses on the performance aspects of OpenCL. We will provide hands-on experience in improving the performance of OpenCL kernels by optimizing a specific example. In addition, we will show ways to profile the kernel, including working with profiling tools such as the AMD kernel profiler and gDebugger. Note that some of the material presented in this lecture will work only on AMD GPUs.
Installation: I went through the installation step by step to verify that it works. It was tested on openSUSE 12.1. Required installs:
For those of you who don't like automatic installation scripts (with a reboot at the end), you may install the AMD APP SDK manually, as mentioned in item #13 of the FAQ at the end of the README.
Note that this all boils down to copying the OpenCL directory into your /etc folder and installing a few libraries. I chose to copy the library files into my own /usr/lib64 and run ldconfig, and that was it (for installation item #3).