2010 Symposium on Application Accelerators in High-Performance Computing (SAAHPC'10)

All plenary events will be in Room 406 (Auditorium) at the UT Conference Center in Knoxville, Tennessee, unless otherwise noted.

Monday, July 12

1:00 p.m. - 5:00 p.m.
Hybrid-Core Computing: Extending a Commodity Instruction Set with Application-Specific Logic
Please register separately for this Convey tutorial at http://www.conveycomputer.com/2010/saahpc.html
Room 413 C
1:00 p.m. - 5:00 p.m.
An Introduction to OpenCL for HPC
Please register separately for this AMD tutorial at http://developer.amd.com/community/events/pages/SAAHPC-10.aspx
Room 413 B

Tuesday, July 13

8:00 a.m.
Continental breakfast
8:30 a.m.
Opening Session
Chair: Gregory Peterson, University of Tennessee, Knoxville
Keynote: Toward Exascale Computational Science with Graphics Processors
Jeffrey Vetter
Leader of the Future Technologies Group and the Experimental Computing Laboratory at Oak Ridge National Laboratory
Welcome to NICS
Bruce Loftis, University of Tennessee, Knoxville
10:00 a.m.
10:30 a.m.
Design Space Exploration
Chair: Anton Kulchitsky, Arctic Region Supercomputing Center
Interactive Supercomputing Enabled by Cell Processor Accelerators
Paul Woodward, Jagan Jayaraj and Pei-Hung Lin
An Experimental Study on Performance Portability of OpenCL Kernels
Sean Rul, Hans Vandierendonck, Joris D'Haene and Koen De Bosschere
Design-Space Optimization for Automatic Acceleration of Streaming Applications
Shobana Padmanabhan, Yixin Chen and Roger D. Chamberlain
12:15 p.m.
Lunch and Exhibits
Room 404/Dining Room
1:30 p.m.
Reconfigurable Computing
Chair: Dan Poznanovic, Cray
Productively Scaling I/O Bound Streaming Applications with a Cluster of FPGAs
Andrew Schmidt, Siddhartha Datta, Ashwin Mendon and Ron Sass
Reconfigurable Supercomputing with Scalable Systolic Arrays and In-Stream Control for Wavefront Genomics Processing
Carlo Pascoe, Abhijeet Lawande, Herman Lam, Alan George, Yijun Sun, William Farmerie and Martin Herbordt
Crossing Timezones in the TimeTrial Performance Monitor
Joseph Lancaster and Roger Chamberlain
3:00 p.m.
3:30 p.m.
Technology Update
Chair: Craig Steffen, National Center for Supercomputing Applications
Accelerating HPC
Nash Palaniswamy, Intel
Heterogeneous Computing to Fusion
Norm Rubin, AMD
Design Philosophies for Memory-Centric Instruction Set Architectures
John Leidel, Convey Computer
5:30 p.m.
Reception and Poster Session

Using Graphics Processors to Accelerate Synthetic Aperture Imaging via Backpropagation
Dan Campbell and Daniel Cook
Static Memory Access Pattern Analysis on a Massively Parallel GPU
Byunghyun Jang, Dana Schaa, Perhaad Mistry and David Kaeli
A GPU-Based Flood Simulation Framework
Siddharth Shankar, Alfred Kalyanapu, Charles Hansen and Steven Burian
Reducing Preprocessing Overhead Times in a Reconfigurable Accelerator of Finite Difference Applications
Hiroshi Kataoka, Hiroaki Honda, Farhad Mehdipour, Koji Inoue and Kazuaki Murakami
GPU Accelerated Stochastic Simulation
David Jenkins and Gregory Peterson
Enhancing the Simulation of P Systems for the SAT Problem on GPUs
José María Cecilia, José Manuel García, G.D. Guerrero, Miguel Angel Martínez del Amor, Mario de Jesús Pérez-Jiménez and Manuel Ujaldon
GpuC: Data Parallel Language Extension to CUDA
Zeki Bozkus
Automatically Tuned Dense Linear Algebra for Multicore+GPU
Xing Fu, Xue Li and Gregory Peterson
Takagi Factorization on GPU using CUDA
Gagandeep S. Sachdev, Vishay Vanjani and Mary W. Hall
Accelerating Algorithms on GPUs in SCIRun: the Conjugate Gradient Case Study
Devon Yablonski, Miriam Leeser and Dana Brooks
A Strategy for Automatically Generating High Performance CUDA Code for a GPU Accelerator from a Specialized Fortran Code Expression
Pei-Hung Lin, Jagan Jayaraj and Paul Woodward
Using Queuing Theory to Model Streaming Applications
Rahav Dor, Joseph Lancaster, Mark Franklin, Jeremy Buhler and Roger Chamberlain
Dynamically Scheduled Cholesky Factorization on Multicore Architectures with GPU Accelerators
Emmanuel Agullo, Cedric Augonnet, Jack Dongarra, Hatem Ltaief, Raymond Namyst, Jean Roman, Samuel Thibault and Stanimire Tomov
Simulations of Large Membrane Regions using GPU-enabled Computations - Preliminary Results
Narayan Ganesan, Sandeep Patel and Michela Taufer
Medium-Grained Functions Mapping using Modern GPUs
Jiří Filipovič and Jan Fousek
GPU Accelerated Particle System for Triangulated Surface Meshes
Brad Peterson, Manasi Datar, Mary Hall and Ross Whitaker
Evaluating One-Sided Programming Models for GPU Cluster Computations
Jeff Hammond and A. Eugene Deprince III
OpenCL Evaluation for Numerical Linear Algebra Library Development
Peng Du, Piotr Luszczek and Jack Dongarra
Performance Comparison of Cholesky Decomposition on GPUs and FPGAs
Depeng Yang and Gregory Peterson
Two Nallatech FPGA Control Abstraction APIs: One Using A HomeBrew API and Later Using The OpenCL Interface
Craig Steffen
GPU Acceleration of Near-Minimal Logic Minimization
Ibrahim Savran and Jason D. Bakos

Wednesday, July 14

8:00 a.m.
Continental breakfast
8:30 a.m.
Keynote Session
Chair: Volodymyr Kindratenko, National Center for Supercomputing Applications
Keynote: Cyberinfrastructure for the 21st Century (CF21)
Robert Pennington
Program manager at the Office of Cyberinfrastructure, National Science Foundation
Arctic Region Supercomputer Center Work Related to GPGPUs and IBM Cell
Anton Kulchitsky, ARSC
Is Now the Time for Reconfigurable Computing Standards?
Thomas Steinke, Zuse-Institut Berlin
10:00 a.m.
10:30 a.m.
Applications in Chemistry and Physics
Chair: Michela Taufer, University of Delaware
Generation of Kernels to Calculate Electron Repulsion Integrals of High Angular Momentum Functions on GPUs – Preliminary Results
Alex Titov, Volodymyr Kindratenko, Ivan Ufimtsev and Todd Martinez
Fully Accelerating Quantum Monte Carlo Simulations of Real Materials on GPU Clusters
Kenneth Esler, Jeongnim Kim and David Ceperley
Accelerating Quantum Chromodynamics Calculations with GPUs
Guochun Shi, Steven Gottlieb, Aaron Torok and Volodymyr Kindratenko
Room 404/Dining Room
1:00 p.m.
Numerical Linear Algebra
Chair: Eric Stahlberg, OpenFPGA
Power-Aware Performance of Mixed Precision Linear Solvers for FPGAs and GPGPUs
JunKyu Lee, Junqing Sun, Gregory Peterson, Robert Harrison and Robert Hinde
Accelerating Double Precision Floating-Point Hessenberg Reduction on FPGA and Multicore Architectures
Miaoqing Huang, Lingyuan Wang and Tarek El-Ghazawi
Building Zero Latency Matrix Vector Multiplication Engines using FPGAs
Craig Petrie, Nallatech
3:00 p.m.
3:30 p.m.
Image and Signal Processing
Chair: Thomas Steinke, Zuse-Institut Berlin
Tetrahedral Interpolation for Deformable Image Registration on GPUs
Cedomir Segulja, David Han, Tarek Abdelrahman, Kristy Brock and Joanne Moseley
Accelerating Image Feature Comparisons using CUDA on Commodity Hardware
Seth Warn, Amy Apon, Jackson Cothren, Wesley Emeneker and John Gauch
Using GPU VSIPL & CUDA to Accelerate RF Clutter Simulation
Dan Campbell, Mark McCans, Mike Davis and Mike Brinkmann
5:15 p.m.
Calhoun's Restaurant and Microbrewery
400 Neyland Dr., Knoxville

Thursday, July 15

8:00 a.m.
Continental breakfast
8:30 a.m.
Applications on GPUs
Chair: Joseph Lancaster, Washington University in St. Louis
Efficiency Considerations of Cauchy Reed-Solomon Implementations on Accelerator and Multi-Core Platforms
Thomas Steinke, Kathrin Peter and Sebastian Borchert
Faster File Matching using GPGPUs
Deephan Venkatesh Mohan and John Cavazos
GPU Accelerated Scalable Parallel Random Number Generators
Shuang Gao and Gregory D. Peterson
10:00 a.m.
10:30 a.m.
Applications on GPUs
Chair: Volodymyr Kindratenko, National Center for Supercomputing Applications
A Generic Approach for Developing Highly Scalable Particle-Mesh Codes for GPUs
Wolfgang Hoenig, Felix Schmitt, Rene Widera, Heiko Burau, Guido Juckeland, Matthias S. Mueller and Michael Bussmann
High Performance Relevance Vector Machine on GPUs
Depeng Yang and Gregory Peterson

SAAHPC'10 concludes

Available in Room 404/Dining Room
1:30 p.m. - 5:00 p.m.
OpenFPGA Meeting
Room 401
1:30 p.m. - 5:00 p.m.
UT College of Engineering Tutorial:
Accelerating Linear Algebra on Heterogeneous Architectures of Multicore and GPUs using MAGMA and the DPLASMA and StarPU Schedulers
Stanimire Tomov, George Bosilca and Cédric Augonnet
Learn how to develop numerical software for heterogeneous multicore and GPU architectures through a hybridization methodology built on:
  • Representing algorithms as collections of tasks and data dependencies, and
  • Properly scheduling the tasks' execution over the available multicore and GPU hardware components.
Examples will be drawn from the Matrix Algebra on GPU and Multicore Architectures (MAGMA) project, which aims to develop a new generation of linear algebra libraries that extend sequential LAPACK-style algorithms to highly parallel, heterogeneous GPU and multicore architectures. In addition to stand-alone hybrid algorithms, MAGMA provides hybrid kernels that serve as building blocks in tile and "communication-avoiding" algorithms, which must be scheduled efficiently. You will learn how to use dynamic schedulers to express these new algorithms easily while extracting high performance from heterogeneous multicore and GPU systems. In particular, we will consider the DPLASMA and StarPU schedulers. DPLASMA is related to the Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) project but extends its operation to the distributed-memory regime, while StarPU is a runtime system specialized in scheduling tasks onto accelerator-based platforms.
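The two ideas above (tasks with data dependencies, then scheduling onto available hardware) can be illustrated with a small sketch. This is plain Python and does not use the MAGMA, DPLASMA, or StarPU APIs; the function names, the worker labels, and the round-based scheduling policy are all assumptions made for illustration. It lists the tasks of a tiled Cholesky factorization, derives dependencies from the tiles each task reads and writes, and assigns ready tasks to workers.

```python
from collections import defaultdict


def cholesky_tasks(nt):
    """List (name, reads, writes) for a tiled Cholesky on an nt x nt tile grid.

    Hypothetical helper: tiles are (row, col) pairs; a written tile is
    treated as read-modify-write.
    """
    tasks = []
    for k in range(nt):
        tasks.append((f"POTRF({k})", [], [(k, k)]))
        for i in range(k + 1, nt):
            tasks.append((f"TRSM({i},{k})", [(k, k)], [(i, k)]))
        for i in range(k + 1, nt):
            tasks.append((f"SYRK({i},{k})", [(i, k)], [(i, i)]))
            for j in range(k + 1, i):
                tasks.append((f"GEMM({i},{j},{k})", [(i, k), (j, k)], [(i, j)]))
    return tasks


def build_dag(tasks):
    """Derive task dependencies from data accesses, in sequential program order."""
    last_writer = {}
    readers = defaultdict(list)
    deps = {name: set() for name, _, _ in tasks}
    for name, reads, writes in tasks:
        # Read-after-write and write-after-write: wait for the last writer.
        for tile in reads + writes:
            if tile in last_writer:
                deps[name].add(last_writer[tile])
        # Write-after-read: wait for readers since the last write.
        for tile in writes:
            deps[name].update(readers[tile])
            readers[tile] = []
            last_writer[tile] = name
        for tile in reads:
            readers[tile].append(name)
    return deps


def schedule(deps, workers):
    """Assign ready tasks to workers in rounds (a deliberately naive policy)."""
    remaining = {t: set(d) for t, d in deps.items()}
    done, rounds = set(), []
    while remaining:
        ready = sorted(t for t, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        batch = ready[:len(workers)]
        rounds.append(list(zip(workers, batch)))
        for t in batch:
            done.add(t)
            del remaining[t]
    return rounds


if __name__ == "__main__":
    dag = build_dag(cholesky_tasks(3))
    # Worker labels are placeholders, not real devices.
    for step, assignment in enumerate(schedule(dag, ["cpu0", "cpu1", "gpu0"])):
        print(step, assignment)
```

A real runtime would also weigh task cost, data locality, and transfer time between host and accelerator memory when choosing a worker; the round-based policy here ignores all of that and only respects the dependencies.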
Room 413 C