earticle

논문검색

Yet another Hybrid Strategy for Auto-tuning SpMV on GPUs

초록

영어

Sparse matrix-vector multiplication (SpMV) is a key linear algebra algorithm and is widely used in many application domains. Besides multi-core architecture, there is also extensive research focusing on accelerating SpMV on many-core Graphics Processing Units (GPUs). SpMV computations have many indirect and irregular memory accesses, and load imbalance could occur while mapping computations onto single-instruction, multiple-data (SIMD) GPUs. SpMV is highly memory bandwidth-bound, though GPUs have massive computational resources, the performance of SpMV on GPUs is still unsatisfying. In this paper, we present a new hybrid strategy for auto-tuning SpMV on GPUs. Our strategy combines the advantages of row-major storage and column-major storage. Like many other strategies, we reordered a given sparse matrix according to row lengths in decreasing order. In order to be more adaptive and efficient, we proposed a new hybrid Blocked CSR and JDS (BCJ) format based on original CSR and JDS. BCJ splits a sparse matrix into a denser part and a sparser part after reordering and uses different kernels to process the corresponding part. And we proposed corresponding auto-tuning framework to help transforming matrix and launching kernels according to the sparsity characteristics of the matrix. A CUDA implementation of BCJ outperforms the original formats significantly on a broad range of unstructured sparse matrices.

목차

Abstract
 1. Introduction
 2. Background
  2.1. GPUs and CUDA Programming Model
  2.2. Formats for Sparse Matrix
 3. Hybrid auto-tuning Framework
  3.1. Blocked and Padded CSR (B-CSR)
  3.2. Blocked and Padded JDS (B-JDS)
  3.3. Parameters to be Modeled
 4. Experimental Results
 5. Related Work
 6. Conclusions
 References

저자정보

  • Zhaohui Wang University of Science and Technology of China
  • Xiaojie Qiu University of Science and Technology of China
  • Aimin Zhang University of Science and Technology of China
  • Yichao Cheng University of Science and Technology of China
  • Yi Peng University of Science and Technology of China
  • Sun Sun University of Science and Technology of China

참고문헌

자료제공 : 네이버학술정보

    함께 이용한 논문

      ※ 원문제공기관과의 협약기간이 종료되어 열람이 제한될 수 있습니다.

      0개의 논문이 장바구니에 담겼습니다.