CuPBoP: Cuda for Parallelized and Broad-range Processors

Introduction

CuPBoP is a framework that supports executing unmodified CUDA source code on non-NVIDIA devices. Currently, CuPBoP supports several CPU backends, including x86, AArch64, and RISC-V. Support for Vortex (a RISC-V GPU) is a work in progress.

Install

Prerequisites

Linux system
LLVM 14.0.1
CUDA Toolkit

Although CuPBoP does not require an NVIDIA GPU, it needs the CUDA Toolkit to compile source programs to NVVM/LLVM IR. The CUDA Toolkit can be installed on machines without NVIDIA GPUs; see https://developer.nvidia.com/cuda-downloads for instructions.
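As a quick sanity check before building, the sketch below reports whether the tools used later in this README are on your PATH. The helper name `check_tool` is hypothetical and not part of CuPBoP; it only tests for presence and does not verify versions.

```shell
# Hypothetical helper (not part of CuPBoP): report whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Tools invoked in the build and example steps of this README.
for tool in llvm-config clang++ llc g++ cmake make; do
    check_tool "$tool" || true
done
```

If `llvm-config` is found, `llvm-config --version` prints the installed LLVM version, which you can compare against the version listed above.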

Installation

Clone from GitHub

git clone --recursive https://github.com/drcut/CuPBoP
cd CuPBoP
export CuPBoP_PATH=`pwd`
export LD_LIBRARY_PATH=$CuPBoP_PATH/build/runtime:$CuPBoP_PATH/build/runtime/threadPool:$LD_LIBRARY_PATH
export CUDA_PATH=/usr/local/cuda-11.7 # set to your own location

Build CuPBoP

mkdir build && cd build
# Set -DDEBUG=ON for debugging
cmake .. \
   -DLLVM_CONFIG_PATH=`which llvm-config` \
   -DCUDA_PATH=$CUDA_PATH
make
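After `make` finishes, you may want to confirm that the two translator binaries used in the example below were produced. This is a minimal sketch: the helper name `check_build` is hypothetical, and the paths are taken from the commands in the vector addition example.

```shell
# Hypothetical helper (not part of CuPBoP): verify that the translator
# binaries exist under a CuPBoP checkout. Takes the checkout root as $1.
check_build() {
    root="$1"
    status=0
    for bin in kernelTranslator hostTranslator; do
        if [ -x "$root/build/compilation/$bin" ]; then
            echo "ok: $bin"
        else
            echo "missing: $bin"
            status=1
        fi
    done
    return "$status"
}
```

Call it as `check_build "$CuPBoP_PATH"`; a non-zero exit status means at least one binary is missing.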

Run Vector Addition example

cd examples/vecadd
# Compile CUDA source code (both host and kernel) to bitcode files
clang++ -std=c++11 vecadd.cu \
      -I../.. --cuda-path=$CUDA_PATH \
      --cuda-gpu-arch=sm_50 -L$CUDA_PATH/lib64 \
      -lcudart_static -ldl -lrt -pthread -save-temps -v  || true
# Apply compilation transformations on the kernel bitcode file
$CuPBoP_PATH/build/compilation/kernelTranslator \
      vecadd-cuda-nvptx64-nvidia-cuda-sm_50.bc kernel.bc
# Apply compilation transformations on the host bitcode file
$CuPBoP_PATH/build/compilation/hostTranslator \
      vecadd-host-x86_64-unknown-linux-gnu.bc host.bc
# Generate object files
llc --relocation-model=pic --filetype=obj  kernel.bc
llc --relocation-model=pic --filetype=obj  host.bc
# Link with runtime libraries and generate the executable file
g++ -o vecadd -fPIC -no-pie \
      -I$CuPBoP_PATH/runtime/threadPool/include \
      -L$CuPBoP_PATH/build/runtime  \
      -L$CuPBoP_PATH/build/runtime/threadPool \
      host.o kernel.o \
      -I../.. -lc -lx86Runtime -lthreadPool -lpthread
# Execute
./vecadd
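The bitcode file names passed to the translators above come from `-save-temps` and are built from the source file name and the GPU architecture. To adapt the steps to another source file, the sketch below derives those names. The helpers `kernel_bc` and `host_bc` are hypothetical, and the naming pattern is assumed to generalize from the vecadd commands; the host triple is assumed to be x86_64-unknown-linux-gnu, as in the example.

```shell
# Hypothetical helpers (not part of CuPBoP): derive the intermediate bitcode
# file names, assuming the pattern seen in the vecadd example.

# $1 = source name without extension, $2 = GPU arch (e.g. sm_50)
kernel_bc() {
    echo "$1-cuda-nvptx64-nvidia-cuda-$2.bc"
}

# $1 = source name without extension; host triple assumed to be x86_64 Linux
host_bc() {
    echo "$1-host-x86_64-unknown-linux-gnu.bc"
}

kernel_bc vecadd sm_50   # prints vecadd-cuda-nvptx64-nvidia-cuda-sm_50.bc
host_bc vecadd           # prints vecadd-host-x86_64-unknown-linux-gnu.bc
```

On a different host architecture the triple in the host file name will differ, so check the files that `-save-temps` actually produced.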

How to contribute?

Contributions of any kind are welcome. Please refer to CONTRIBUTING.md for more details.

If you refer to CuPBoP in your projects, please cite the related papers:

Contributors

Ruobing Han
Jun Chen
Bhanu Garg
Xule Zhou
John Lu
Chihyo Ahn
Haotian Sheng
Blaise Tine
Hyesoon Kim