Colin | ac61c4d925 | set use local dataset. | 2024-02-24 13:44:22 +08:00 |
Colin | 087366c59b | init gpt train without download. | 2024-02-24 13:40:39 +08:00 |
Colin | b992ae99fa | Train on wiki data. | 2024-02-24 12:06:30 +08:00 |
Colin | 7d16743184 | enable pretrain. | 2024-02-22 15:03:32 +08:00 |
周以晴 | b655153ec7 | Merge pull request #2 from Yiqing-Zhou/fix-custom-models: [fix] fix generate with custom models does not go to custom_models | 2023-05-28 22:58:51 +08:00 |
yiqing-zhou | 9f8f9ecc89 | [fix] fix generate with custom models does not go to custom_models | 2023-05-28 22:57:51 +08:00 |
周以晴 | e8d543558c | Merge pull request #1 from Yiqing-Zhou/custom-model-configs: [feature] custom model configs | 2023-05-28 21:48:24 +08:00 |
yiqing-zhou | fcb93e52c4 | [feature] custom model configs | 2023-05-28 21:39:51 +08:00 |
yiqing-zhou | b76d333f39 | [code] formatter-caused changes | 2023-05-28 20:02:56 +08:00 |
周以晴 | 10a88a5012 | Update README.md | 2023-05-14 23:08:44 +08:00 |
Yiqing-Zhou | 30df20402d | [code] update .vscode launch.json | 2023-05-14 22:55:01 +08:00 |
Yiqing-Zhou | 6827898339 | . | 2023-05-14 22:53:28 +08:00 |
Yiqing-Zhou | 216bc4643c | [feature] custom_models | 2023-05-14 22:23:16 +08:00 |
Yiqing-Zhou | 5e6b747baf | [fix] add patch to fix DeepSpeedStrategy offload 'zero_force_ds_cpu_optimizer' issue | 2023-05-09 23:00:28 +08:00 |
Yiqing-Zhou | 8a5e2043bb | [optimize] map_location='cpu' for load_from_checkpoint | 2023-05-09 00:37:52 +08:00 |
Yiqing-Zhou | 3f92bbbaa2 | [feature] new args learning_rate max_epochs | 2023-05-09 00:35:28 +08:00 |
周以晴 | dc2941f790 | Create LICENSE | 2023-05-08 00:26:38 +08:00 |
Yiqing-Zhou | 70ff2acaf0 | [fix] add patch to fix FSDPStrategy checkpoint issue | 2023-05-07 16:51:57 +08:00 |
Yiqing-Zhou | 5392a845f7 | [feature] export model checkpoint from pl.LightningModule | 2023-05-07 13:18:58 +08:00 |
Yiqing-Zhou | 09507449f7 | [code] refactor | 2023-05-07 13:01:02 +08:00 |
Yiqing-Zhou | 939be31c10 | [feature] new arg use_tril_attention_mask | 2023-05-06 21:06:18 +08:00 |
Yiqing-Zhou | 0324eb4103 | [code] update requirements | 2023-05-06 21:05:53 +08:00 |
Yiqing-Zhou | 9b7f9b9d60 | [feature] persistent_workers; new args accumulate_grad_batches strategy | 2023-05-06 19:39:24 +08:00 |
Yiqing-Zhou | 45fa065530 | Initial Commit | 2023-05-04 21:52:25 +08:00 |