|
Technical Program Schedule |
|
Saturday, February 10th, 2007 |
All Day Events |
Workshop RIDMS-2: Second Workshop on Real-Time and Interactive Digital Media Supercomputing
|
Sunday, February 11th, 2007 |
All Day Events |
Workshop INTERACT-11: Eleventh Workshop on Interaction between Compilers and Computer Architectures
|
Workshop CAECW-10: Tenth Workshop on Computer Architecture Evaluation using Commercial Workloads
|
Morning Events |
Workshop RIDMS-2: Second Workshop on Real-Time and Interactive Digital Media Supercomputing
|
Workshop CMP-MSI: First Workshop on Chip Multiprocessor Memory Systems and Interconnects
|
Tutorial: Practical Cache Performance Modeling for Computer Architects
Y. Solihin (NCSU), T. Puzak, and P. Emma (IBM Research)
|
Afternoon Events |
Workshop CARD: First Workshop on Computer Architecture Research Directions
|
Tutorial: Microprocessor Memory Array Circuits for Architects
|
6:00PM - 8:00PM |
HPCA Conference Reception |
Monday, February 12th, 2007 |
7:30AM - 8:30AM |
Breakfast |
8:30AM - 8:50AM |
Welcome Message |
8:50AM - 10:00AM |
Keynote I |
Interconnect-Centric Computing
Bill Dally (Willard R. and Inez Kerr Bell Professor of Engineering and Chairman,
Department of Computer Science, Stanford University)
Abstract: As we enter the many-core era, the interconnection networks of a
computer system, rather than the processor or memory modules, will dominate
its performance. Several recent developments in interconnection
network architecture including global adaptive routing, high-radix
routers, and technology-matched topologies offer large improvements in
the performance and efficiency of this critical component. The
implementation of a portion of several interconnection networks on
multi-core chips also raises new opportunities and challenges for network
design. This talk explores the role of interconnection networks in
modern computer systems, recent developments in network architecture
and design, and the challenges of on-chip interconnection networks.
Examples will be drawn from several systems including the Cray BlackWidow.
|
10:00AM - 10:30AM |
Break |
10:30AM - 12:00PM |
Session 1: Multiprocessor Architectures |
An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors
H. Dybdahl (Norwegian University of Science and Technology) and P. Stenström (Chalmers)
Evaluating MapReduce for Multicore and Multiprocessor Systems
C. Ranger, R. Raghuraman, A. Penmetsa, G. Bradski and C. Kozyrakis (Stanford University)
Extending Multicore Architectures to Exploit Hybrid Parallelism in Single-Thread Applications
H. Zhong, S. Lieberman, and S. Mahlke (University of Michigan)
|
12:00PM - 1:30PM |
Lunch |
1:30PM - 3:00PM |
Session 2: Industry |
Implications of Device Timing Variability on Full Chip Timing
M. Annavaram, E. Grochowski, and P. Reed (Intel)
Optical Interconnect Opportunities for Future Server Memory Systems
Y. Katayama and A. Okazaki (IBM)
|
3:00PM - 3:30PM |
Break |
3:30PM - 5:00PM |
Session 3: Prefetching |
Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers
S. Srinath (Microsoft and The University of Texas at Austin), O. Mutlu (Microsoft Research), H. Kim, and Y. Patt (University of Texas at Austin)
Improving Branch Prediction and Predicated Execution in Out-of-Order Processors
E. Quiñones, J.M. Parcerisa (Universitat Politècnica de Catalunya) and A. González (Intel and Universitat Politècnica de Catalunya)
Accelerating and Adapting Precomputation Threads for Efficient Prefetching
W. Zhang, B. Calder, and D. Tullsen (University of California, San Diego)
|
6:00PM - 7:30PM |
TCC Business Meeting |
Tuesday, February 13th, 2007 |
7:30AM - 8:30AM |
Breakfast |
8:30AM - 9:30AM |
Keynote II |
Petascale Computing Research Challenges – A Manycore Perspective
Steve Pawlowski (Senior Fellow and Chief Technology Officer of the Digital Enterprise Group, Intel Corporation)
Abstract: Future High Performance Computing will undoubtedly reach Petascale and beyond. Today’s HPC is tomorrow’s Personal Computing. What are the evolving processor architectures towards Multi-core and Many-core for the best performance per watt; memory bandwidth solutions to feed the ever more powerful processors; intra-chip interconnect options for optimal bandwidth vs. power? With Moore’s Law continuing to prove its viability, shrinking transistor geometries mean that improving reliability is even more challenging. Intel Senior Fellow and Chief Technology Officer of Intel’s Digital Enterprise Group, Steve Pawlowski, will provide his technology vision, insight and research challenges to achieve the vision of Petascale computing and beyond.
|
9:30AM - 10:00AM |
Break |
10:00AM - 12:00PM |
Session 4: Memory Systems I (Parallel Session) |
A Scalable, Non-blocking Approach to Transactional Memory
H. Chafi, J. Casper, B. Carlstrom, A. McDonald, C. Cao Minh, W. Baek, C. Kozyrakis, and K. Olukotun (Stanford University)
Fully-Buffered DIMM Memory Architectures: Understanding Mechanisms, Overheads and Scaling
B. Ganesh, B. Jacob (University of Maryland College Park), D. Wang (Metaram), and A. Jaleel (Intel)
HARD: Hardware-Assisted Lockset-Based Race Detection
P. Zhou, R. Teodorescu, and Y. Zhou (University of Illinois at Urbana-Champaign)
Colorama: Architectural Support for Data-Centric Synchronization
L. Ceze, Pablo Montesinos (University of Illinois at Urbana-Champaign), C. von Praun (IBM T J Watson) and J. Torrellas (University of Illinois at Urbana-Champaign)
|
|
Session 5: Error Detection and Fault-Tolerance (Parallel Session) |
Error Detection Via Online Checking of Cache Coherence with Token Coherence Signatures
A. Meixner and D. Sorin (Duke University)
A Low Overhead Fault Tolerant Coherence Protocol for CMP Architectures
R. Fernández-Pascual, J. García, M. Acacio, and J. Duato (Universidad de Murcia and Universidad Politécnica de Valencia)
Perturbation-Based Fault Screening
P. Racunas (Intel), K. Constantinides (University of Michigan), S. Manne, and S. Mukherjee (Intel)
Application-Level Correctness and its Impact on Fault Tolerance
X. Li and D. Yeung (University of Maryland at College Park)
|
12:00PM - 1:30PM |
Lunch |
1:30PM - 3:00PM |
Session 6: Thermal Modeling and SIMD (Parallel Session) |
Thermal Herding: Microarchitecture Techniques for Controlling HotSpots in High-Performance 3D-Integrated Processors
K. Puttaswamy and G. Loh (Georgia Institute of Technology)
Modeling and Managing Thermal Profiles of Rack-Mounted Servers with ThermoStat
J. Choi, Y. Kim, A. Sivasubramaniam, J. Srebric, Q. Wang (Pennsylvania State University), and J. Lee (KAIST)
Liquid SIMD: Abstracting SIMD Hardware Using Lightweight Dynamic Mapping
N. Clark, A. Hormati, S. Mahlke (University of Michigan), S. Yehia, and K. Flautner (ARM)
|
|
Session 7: Chip Multiprocessors, Simultaneous Multi-threading, and Caches (Parallel Session) |
Interactions Between Compression and Prefetching in Chip Multiprocessors
A. Alameldeen (Intel) and D. Wood (University of Wisconsin-Madison)
A Memory-Level Parallelism Aware Fetch Policy for SMT Processors
S. Eyerman and L. Eeckhout (Ghent University)
Line Distillation: Increasing Cache Capacity By Filtering Unused Words in Cache Lines
M. Qureshi, M. Suleman, and Y. Patt (University of Texas at Austin)
|
3:00PM - 3:30PM |
Break |
3:30PM - 5:00PM |
Panel |
Researching Novel Systems: To Instantiate, Emulate, Simulate, or Analyticate?
Moderator: Doug Burger (University of Texas at Austin)
Panel Members:
Joel Emer (Intel)
Phil Emma (IBM)
Steve Keckler (University of Texas at Austin)
Yale Patt (University of Texas at Austin)
Dave Patterson (University of California, Berkeley)
Description: The computer architecture research community has a rich menu of
methodological options, which includes building full system prototypes, measuring in simulation,
emulating on FPGAs, or constructing sophisticated analytic models. However, building custom
systems has become enormously expensive, especially given the current funding climate.
Simulations have become enormously complex as well, often including full operating systems.
Analytic models have become less popular as system complexity has grown. Finally, some argue
that FPGA emulation of hardware is the right approach for the future, while others opine
that it is the worst of all worlds. This panel will debate these various points of view, which are
of great interest to the funding sponsors of our community.
|
6:00PM - 10:30PM |
Banquet |
Wednesday, February 14th, 2007 |
7:00AM - 8:00AM |
Breakfast |
8:00AM - 10:00AM |
Session 8: Memory Systems II |
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
L. Yen, J. Bobba, M. Marty, K. Moore, H. Volos, M. Hill, M. Swift, and D. Wood (University of Wisconsin-Madison)
MemTracker: Efficient and Programmable Support for Memory Access Monitoring and Debugging
G. Venkataramani, B. Roemer (Georgia Institute of Technology), Y. Solihin (North Carolina State University) and M. Prvulovic (Georgia Institute of Technology)
A Burst Scheduling Access Reordering Mechanism
J. Shao and B. Davis (Michigan Technological University)
Exploiting Postdominance for Speculative Parallelization
M. Agarwal, K. Malik, K. Woley, S. Stone, and M. Frank (University of Illinois at Urbana-Champaign)
|
10:00AM - 10:30AM |
Break |
10:30AM - 12:30PM |
Session 9: Virtual Machines, Caches and Modeling |
Concurrent Direct Network Access for Virtual Machine Monitors
P. Willmann, J. Shafer, D. Carr (Rice University), A. Menon (EPFL), S. Rixner, A. Cox (Rice University), and W. Zwaenepoel (EPFL)
A Domain-Specific On-Chip Network Design for Large Scale Cache Systems
Y. Jin, E. Kim (Texas A&M University), and K. Yum (University of Texas, San Antonio)
An Adaptive Cache Coherence Protocol Optimized for Producer-Consumer Sharing
L. Cheng, J. Carter (University of Utah), and D. Dai (SGI)
Illustrative Design Space Studies with Microarchitectural Regression Models
B. Lee and D. Brooks (Harvard University)
|
12:30PM |
Conference Program Ends |
|