CuPBoP: CUDA for Parallelized and Broad-range Processors

Introduction

CuPBoP is a framework that supports executing unmodified CUDA source code on non-NVIDIA devices. Currently, CuPBoP supports several CPU backends, including x86, AArch64, and RISC-V. Support for the Vortex backend is a work in progress.

Install

Prerequisites

Linux
LLVM 14.0.1
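
Before building, it can help to confirm which LLVM release is on the PATH. The following is a minimal sketch, not part of CuPBoP: the helper name `is_llvm_14` is hypothetical, and accepting any 14.x release (rather than exactly 14.0.1) is an assumption.

```shell
# Hypothetical helper: succeed only if the version string starts with "14.".
# The prerequisites list names LLVM 14.0.1; accepting any 14.x is an assumption.
is_llvm_14() {
    case "$1" in
        14.*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Only query llvm-config when it is actually available on PATH.
if command -v llvm-config >/dev/null 2>&1; then
    found=$(llvm-config --version)
    if is_llvm_14 "$found"; then
        echo "LLVM $found found"
    else
        echo "warning: expected LLVM 14.x, found $found" >&2
    fi
else
    echo "warning: llvm-config not found on PATH" >&2
fi
```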

Installation

Clone from GitHub

git clone https://github.com/drcut/CuPBoP
cd CuPBoP
export CuPBoP_PATH=`pwd`
export LD_LIBRARY_PATH=$CuPBoP_PATH/build/runtime:$CuPBoP_PATH/build/runtime/threadPool:$LD_LIBRARY_PATH
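
If a later step fails to find the runtime libraries, one thing to check is whether the two directories really ended up in `LD_LIBRARY_PATH`. A minimal sketch, assuming a POSIX shell; the helper name `path_contains` is hypothetical and not part of CuPBoP.

```shell
# Hypothetical helper: succeed if the colon-separated list in $1
# contains the exact entry $2.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, `path_contains "$LD_LIBRARY_PATH" "$CuPBoP_PATH/build/runtime" || echo "runtime directory missing"` prints a message when the entry is absent.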

Because CuPBoP relies on CUDA structures, the CUDA header files must be downloaded

wget https://www.dropbox.com/s/r18io0zu3idke5p/cuda-header.tar.gz?dl=1
tar -xzf 'cuda-header.tar.gz?dl=1'
cp -r include/* runtime/threadPool/include/

Other CUDA files are also required for compiling CUDA source code to LLVM IR

wget https://www.dropbox.com/s/4pckqsjnl920gpn/cuda-10.1.tar.gz?dl=1
tar -xzf 'cuda-10.1.tar.gz?dl=1'

Build CuPBoP

mkdir build && cd build
cmake .. -DLLVM_CONFIG_PATH=`which llvm-config` # need path to llvm-config
make
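
After `make` finishes, checking that the two translator binaries invoked later in this README exist can catch a failed build early. A minimal sketch: the function name `check_cupbop_build` is hypothetical, and the paths are the ones used by the commands in this README.

```shell
# Hypothetical helper: report which expected translator binaries are missing
# under the CuPBoP checkout given as $1. Returns non-zero if any is missing.
check_cupbop_build() {
    root="$1"
    missing=0
    for f in \
        "$root/build/compilation/kernelTranslator" \
        "$root/build/compilation/hostTranslator"
    do
        if [ ! -x "$f" ]; then
            echo "missing or not executable: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A typical call would be `check_cupbop_build "$CuPBoP_PATH" && echo "translators built"`.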

Run the HIST application in the Hetero-mark benchmark

# Clone Hetero-mark benchmark
git clone https://github.com/drcut/SC_evaluate
cd SC_evaluate/Hetero-cox/src/hist
# Compile CUDA source code to LLVM IR
# This may report errors because the CUDA libraries are absent; they can be ignored
clang++ -std=c++11 cuda/hist_cuda_benchmark.cu \
    -I../.. --cuda-path=$CuPBoP_PATH/cuda-10.1 \
    --cuda-gpu-arch=sm_50 -L$CuPBoP_PATH/cuda-10.1/lib64 \
    -lcudart_static -ldl -lrt -pthread -save-temps -v  || true
# Translate host/kernel LLVM IR to formats that are suitable for CPU
$CuPBoP_PATH/build/compilation/kernelTranslator \
   hist_cuda_benchmark-cuda-nvptx64-nvidia-cuda-sm_50.bc kernel.bc
$CuPBoP_PATH/build/compilation/hostTranslator \
   hist_cuda_benchmark-host-x86_64-unknown-linux-gnu.bc host.bc
# generate object files
llc --relocation-model=pic --filetype=obj  kernel.bc
llc --relocation-model=pic --filetype=obj  host.bc
# generate CPU executable file
g++ -o hist -fPIC -no-pie \
-I$CuPBoP_PATH/runtime/threadPool/include \
-L$CuPBoP_PATH/build/runtime  \
-L$CuPBoP_PATH/build/runtime/threadPool \
cuda/main.cc host.o kernel.o *.cc  ../common/benchmark/*.cc \
../common/command_line_option/*.cc  ../common/time_measurement/*.cc \
-I../.. -lpthread -lc -lx86Runtime -lthreadPool
# execute and verify
./hist -q -v
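
When adapting the steps above to another benchmark, the two `.bc` file names passed to the translators change with the source file name, GPU architecture, and host triple. The sketch below derives them from the pattern visible in the HIST commands. The helper names `kernel_bc_name` and `host_bc_name` are hypothetical, and the naming pattern is inferred from this README only; other clang versions may name the `-save-temps` outputs differently.

```shell
# Hypothetical helpers: derive the bitcode names that `clang++ -save-temps`
# leaves behind, following the two file names used in the HIST example.

kernel_bc_name() {
    # $1: CUDA source file, $2: GPU arch (for example sm_50)
    stem=$(basename "$1" .cu)
    echo "${stem}-cuda-nvptx64-nvidia-cuda-$2.bc"
}

host_bc_name() {
    # $1: CUDA source file, $2: host triple (for example x86_64-unknown-linux-gnu)
    stem=$(basename "$1" .cu)
    echo "${stem}-host-$2.bc"
}
```

For the HIST example, `kernel_bc_name cuda/hist_cuda_benchmark.cu sm_50` yields the kernel bitcode name passed to `kernelTranslator`, and `host_bc_name cuda/hist_cuda_benchmark.cu x86_64-unknown-linux-gnu` yields the one passed to `hostTranslator`.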

How to contribute?

Contributions of any kind are welcome. Please refer to CONTRIBUTING.md for more details.

"COX: Exposing CUDA Warp-Level Functions to CPUs" ACM Transactions on Architecture and Code Optimization link
"CuPBoP: CUDA for Parallelized and Broad-range Processors" arxiv preprint link

Contributors

Ruobing Han
Jun Chen
Bhanu Garg
Xule Zhou
John Lu
Chihyo Ahn
Haotian Sheng
Blaise Tine
Hyesoon Kim