Ruobing Han
|
9e40c806fb
|
Merge branch 'PPoPP' of github.com:drcut/CuPBoP into PPoPP
|
2022-09-23 13:08:40 -04:00 |
Ruobing Han
|
18b577b008
|
add CMake test
|
2022-09-23 13:08:28 -04:00 |
Ruobing Han
|
397cb9443f
|
Create LICENSE
|
2022-09-23 09:28:00 -04:00 |
Ruobing Han
|
9093f802b0
|
Delete LICENSE
remove the temporary license
|
2022-09-23 09:26:32 -04:00 |
Ruobing Han
|
63c9cc566c
|
avoid unnecessary extend arrays
|
2022-09-23 09:15:10 -04:00 |
Ruobing Han
|
c6442c8b23
|
modify name of extend arrays
|
2022-09-23 09:11:45 -04:00 |
Ruobing Han
|
ed51e5af91
|
apply divergence analysis for replicating local variables
|
2022-09-22 16:24:16 -04:00 |
Ruobing Han
|
e99205aa8b
|
apply divergence analysis for replicating local variables
|
2022-09-22 16:15:38 -04:00 |
Han Ruobing
|
8da1ecc5fd
|
update README
|
2022-09-22 14:53:32 -04:00 |
Han Ruobing
|
124e7fa0ae
|
remove useless initialization
|
2022-09-22 14:44:42 -04:00 |
Ruobing Han
|
dabe03409e
|
update CI
|
2022-09-22 11:43:03 -04:00 |
Ruobing Han
|
c0c3490e23
|
update CI
|
2022-09-22 11:32:03 -04:00 |
Ruobing Han
|
a8643e6981
|
update CI
|
2022-09-22 11:25:33 -04:00 |
Ruobing Han
|
c2222f2e39
|
update CMake to use official CUDA toolkit
|
2022-09-22 11:20:50 -04:00 |
Ruobing Han
|
f712c30b09
|
implement multistream APIs for CPU backend
|
2022-09-19 10:41:40 -04:00 |
Ruobing Han
|
ca089c4274
|
fix CI
|
2022-09-16 09:09:57 -04:00 |
Ruobing Han
|
5b40786ae3
|
fix CI
|
2022-09-15 21:14:29 -04:00 |
Ruobing Han
|
961a931f10
|
fix CI
|
2022-09-15 21:04:51 -04:00 |
Ruobing Han
|
e5f020d997
|
add static/dynamic shared memory example
|
2022-09-15 20:51:53 -04:00 |
Ruobing Han
|
3d22cc1f36
|
fix bug for dynamic shared memory
|
2022-09-15 20:38:48 -04:00 |
Ruobing Han
|
ba2c49abdd
|
add static shared memory example
|
2022-09-15 18:53:13 -04:00 |
Ruobing Han
|
3875e179b4
|
update runtime and threadPool with debug tools
|
2022-09-15 18:43:14 -04:00 |
Ruobing Han
|
f2a4f7fe64
|
update HostTranslator with debug tools
|
2022-09-15 18:19:13 -04:00 |
Ruobing Han
|
bb3724c486
|
update compilation with DEBUG mode
|
2022-09-15 12:33:28 -04:00 |
Ruobing Han
|
9152feb24f
|
remove useless examples
|
2022-09-15 11:31:58 -04:00 |
Ruobing Han
|
49adfd026c
|
add vecadd example and update README.md
|
2022-09-15 11:15:21 -04:00 |
Ruobing Han
|
91e94ad3a6
|
fix segfault when cudaSetDevice is not called
|
2022-09-15 11:10:44 -04:00 |
Ruobing Han
|
ef77421142
|
add back O3 optimization in kernelTranslator
|
2022-09-07 20:17:34 -04:00 |
Ruobing Han
|
9cbbad3c4b
|
update CI/CD
|
2022-09-07 19:50:49 -04:00 |
Ruobing Han
|
8df75daf25
|
update commands in CI/CD
|
2022-09-07 19:42:59 -04:00 |
Ruobing Han
|
e0a361f47a
|
fix coding style issue
|
2022-09-07 19:38:21 -04:00 |
Ruobing Han
|
7572e0df27
|
change CMakeLists to include lock-free queue
|
2022-09-07 19:32:22 -04:00 |
Ruobing Han
|
f67d2849a4
|
add external party for lock-free queue
|
2022-09-07 19:23:51 -04:00 |
Ruobing Han
|
e0db88fb49
|
edit README.md
|
2022-09-07 19:21:14 -04:00 |
Ruobing Han
|
cf12d604eb
|
support CloverLeaf on LLVM14
|
2022-07-13 18:39:59 -04:00 |
Ruobing Han
|
8fddb647bd
|
remove performance optimization in kernelTranslator
|
2022-06-25 15:22:50 -04:00 |
Ruobing Han
|
fc1ed8d224
|
remove optnone metadata
|
2022-06-25 14:44:50 -04:00 |
Ruobing Han
|
57367c8348
|
fix bug in sync
|
2022-06-20 23:57:51 -04:00 |
Ruobing Han
|
c1045d8140
|
fix bug in CI
|
2022-06-20 23:03:01 -04:00 |
Ruobing Han
|
2618bd21a7
|
integrate lock-free queue into CI
|
2022-06-20 22:53:19 -04:00 |
Ruobing Han
|
db585083bb
|
use lock-free queue
|
2022-06-20 22:51:12 -04:00 |
Ruobing Han
|
cbf4cd90d8
|
[WIP] use lock-free queue
|
2022-06-20 19:01:28 -04:00 |
RobinHan
|
7d29a409f6
|
fix bug for inserting sync after kernelLaunch
|
2022-06-18 13:39:26 -04:00 |
RobinHan
|
4791dfc9c9
|
fix bug for hostCompilation, change function name
|
2022-06-18 13:02:19 -04:00 |
RobinHan
|
b189526edb
|
update CI to LLVM14
|
2022-06-17 23:46:45 -04:00 |
RobinHan
|
d7668ccd86
|
[WIP] migrate to LLVM14
|
2022-06-17 23:43:22 -04:00 |
RobinHan
|
bcdcccecc9
|
update README
|
2022-06-17 16:38:25 -04:00 |
Han Ruobing
|
f6ef5436de
|
reorganize the code structure
|
2022-06-07 12:53:32 -07:00 |
Ruobing Han
|
d17128640f
|
Merge pull request #11 from jchen706/master
fixed dwt2d workflow error
|
2022-06-07 19:12:40 +00:00 |
jchen706
|
d22722909a
|
fix dwt2d cuda version and input in SC
|
2022-05-25 00:10:52 -04:00 |