Commit Graph

24 Commits

Author SHA1 Message Date
Colin ac61c4d925 set use local dataset. 2024-02-24 13:44:22 +08:00
Colin 087366c59b init gpt train without download. 2024-02-24 13:40:39 +08:00
Colin b992ae99fa Train on wiki data. 2024-02-24 12:06:30 +08:00
Colin 7d16743184 enable pretrain. 2024-02-22 15:03:32 +08:00
周以晴 b655153ec7 Merge pull request #2 from Yiqing-Zhou/fix-custom-models: [fix] fix genarate with custom models does not go to custom_models 2023-05-28 22:58:51 +08:00
yiqing-zhou 9f8f9ecc89 [fix] fix genarate with custom models does not go to custom_models 2023-05-28 22:57:51 +08:00
周以晴 e8d543558c Merge pull request #1 from Yiqing-Zhou/custom-model-configs: [feature] custom model configs 2023-05-28 21:48:24 +08:00
yiqing-zhou fcb93e52c4 [feature] custom model configs 2023-05-28 21:39:51 +08:00
yiqing-zhou b76d333f39 [code] formatter-caused changes 2023-05-28 20:02:56 +08:00
周以晴 10a88a5012 Update README.md 2023-05-14 23:08:44 +08:00
Yiqing-Zhou 30df20402d [code] update .vscode launch.json 2023-05-14 22:55:01 +08:00
Yiqing-Zhou 6827898339 . 2023-05-14 22:53:28 +08:00
Yiqing-Zhou 216bc4643c [feature] custom_models 2023-05-14 22:23:16 +08:00
Yiqing-Zhou 5e6b747baf [fix] add patch to fix DeepSpeedStrategy offload 'zero_force_ds_cpu_optimizer' issue 2023-05-09 23:00:28 +08:00
Yiqing-Zhou 8a5e2043bb [optimize] map_location='cpu' for load_from_checkpoint 2023-05-09 00:37:52 +08:00
Yiqing-Zhou 3f92bbbaa2 [feature] new args learning_rate max_epochs 2023-05-09 00:35:28 +08:00
周以晴 dc2941f790 Create LICENSE 2023-05-08 00:26:38 +08:00
Yiqing-Zhou 70ff2acaf0 [fix] add patch to fix FSDPStrategy checkpoint issue 2023-05-07 16:51:57 +08:00
Yiqing-Zhou 5392a845f7 [feature] export model checkpoint from pl.LightningModule 2023-05-07 13:18:58 +08:00
Yiqing-Zhou 09507449f7 [code] refactor 2023-05-07 13:01:02 +08:00
Yiqing-Zhou 939be31c10 [feature] new arg use_tril_attention_mask 2023-05-06 21:06:18 +08:00
Yiqing-Zhou 0324eb4103 [code] update requirements 2023-05-06 21:05:53 +08:00
Yiqing-Zhou 9b7f9b9d60 [feature] persistent_workers; new args accumulate_grad_batches strategy 2023-05-06 19:39:24 +08:00
Yiqing-Zhou 45fa065530 Initial Commit 2023-05-04 21:52:25 +08:00