Commit Graph

  • 122cbd9ff8 (main) Use local tokenizer. Colin 2024-02-24 14:14:12 +0800
  • ac61c4d925 set use local dataset. Colin 2024-02-24 13:44:22 +0800
  • 087366c59b init gpt train without download. Colin 2024-02-24 13:40:39 +0800
  • b992ae99fa Train on wiki data. Colin 2024-02-24 12:06:30 +0800
  • 7d16743184 enable pretrain. Colin 2024-02-22 15:03:32 +0800
  • b655153ec7 Merge pull request #2 from Yiqing-Zhou/fix-custom-models 周以晴 2023-05-28 22:58:51 +0800
  • 9f8f9ecc89 [fix] fix genarate with custom models does not go to custom_models yiqing-zhou 2023-05-28 22:51:42 +0800
  • e8d543558c Merge pull request #1 from Yiqing-Zhou/custom-model-configs 周以晴 2023-05-28 21:48:24 +0800
  • fcb93e52c4 [feature] custom model configs yiqing-zhou 2023-05-28 21:33:46 +0800
  • b76d333f39 [code] formatter-caused changes yiqing-zhou 2023-05-28 20:02:56 +0800
  • 10a88a5012 Update README.md 周以晴 2023-05-14 23:08:44 +0800
  • 30df20402d [code] update .vscode launch.json Yiqing-Zhou 2023-05-14 22:55:01 +0800
  • 6827898339 . Yiqing-Zhou 2023-05-14 22:53:28 +0800
  • 216bc4643c [feature] custom_models Yiqing-Zhou 2023-05-14 22:23:16 +0800
  • 5e6b747baf [fix] add patch to fix DeepSpeedStrategy offload 'zero_force_ds_cpu_optimizer' issue Yiqing-Zhou 2023-05-09 23:00:28 +0800
  • 8a5e2043bb [optimize] map_location='cpu' for load_from_checkpoint Yiqing-Zhou 2023-05-09 00:37:52 +0800
  • 3f92bbbaa2 [feature] new args learning_rate max_epochs Yiqing-Zhou 2023-05-09 00:02:29 +0800
  • dc2941f790 Create LICENSE 周以晴 2023-05-08 00:26:38 +0800
  • 70ff2acaf0 [fix] add patch to fix FSDPStrategy checkpoint issue Yiqing-Zhou 2023-05-07 16:51:57 +0800
  • 5392a845f7 [feature] export model checkpoint from pl.LightningModule Yiqing-Zhou 2023-05-07 13:18:47 +0800
  • 09507449f7 [code] refactor Yiqing-Zhou 2023-05-07 13:01:02 +0800
  • 939be31c10 [feature] new arg use_tril_attention_mask Yiqing-Zhou 2023-05-06 21:06:18 +0800
  • 0324eb4103 [code] update requirements Yiqing-Zhou 2023-05-06 21:05:53 +0800
  • 9b7f9b9d60 [feature] persistent_workers; new args accumulate_grad_batches strategy Yiqing-Zhou 2023-05-06 19:39:24 +0800
  • 45fa065530 Initial Commit Yiqing-Zhou 2023-05-04 21:52:25 +0800