Commit Graph

112 Commits

Author SHA1 Message Date
Colin d6c78ecd68 Update meaning dataset define. 2024-03-26 11:32:02 +08:00
Colin e29c0b9a41 Add python pip required define. 2024-03-25 20:41:41 +08:00
Colin d10e7a8396 Refine train.py for train. 2024-03-25 19:53:11 +08:00
Colin 4c7fdbe817 Add GPU stress test. 2024-03-25 17:30:41 +08:00
Colin c7391b090e Delete unused files. 2024-03-20 23:05:05 +08:00
Colin c4f7ef2813 Update special dataset. 2024-03-20 23:04:29 +08:00
Colin 01e5f86e94 Add inference. 2024-03-20 22:27:28 +08:00
Colin b248d1d890 Fix model bug. 2024-03-20 22:23:52 +08:00
Colin 72718e6b72 Add Batch dataloader support. 2024-03-18 11:43:41 +08:00
Colin 9feaafcb7a Apply meaning data train. 2024-03-15 11:16:42 +08:00
Colin 0ae63298b2 use custom vocab_size. 2024-03-14 13:28:40 +08:00
Colin 05f17b1221 Refine model config and init. 2024-03-14 11:40:26 +08:00
Colin 8330cbb036 Add meaning dataset. 2024-03-13 19:41:02 +08:00
Colin c094afb0f9 Add tensorboard event out. 2024-03-09 16:55:03 +08:00
Colin f1394d5974 Refine code. 2024-03-08 20:46:42 +08:00
Colin 601c7f6510 Retest wit. 2024-03-07 16:30:37 +08:00
Colin a70d12d04d Rename train file. 2024-03-05 22:09:58 +08:00
Colin 9ef3e92b23 Try model train. 2024-03-05 22:09:28 +08:00
Colin 11fc8f1d39 Refine label used. 2024-03-05 22:08:37 +08:00
Colin fdc8c657b3 Add accuracy in loss. 2024-03-05 19:30:15 +08:00
Colin cf726a5b9f Add loss and logger code. 2024-03-05 15:54:03 +08:00
Colin 9e8e92ae25 Update trainer to custom data. 2024-03-04 21:41:46 +08:00
Colin 1622bf3054 add mnbvc dataset. 2024-03-03 23:35:40 +08:00
Colin 8120be66a6 separate train and val dataset. 2024-02-26 23:59:00 +08:00
Colin d1906629ab Enable wit train on custom dataset and loss down. 2024-02-26 22:44:26 +08:00
Colin 1ef3e419cb Add custom dataset support. 2024-02-26 22:44:26 +08:00
Colin e5f97af291 Add wit train support. 2024-02-26 22:44:26 +08:00
Colin fc071dce70 Remove no use tiktoken. 2024-02-21 21:11:15 +08:00
Colin fe13f12327 Add wit. 2024-02-06 14:08:45 +08:00
Colin 6366b52fef Add research sile result. 2024-02-04 23:48:51 +08:00
Colin 9d5d590b09 Add dataset and wit. 2024-02-04 23:48:24 +08:00
Colin b7c27af6c8 Add research_token to dump token relationship in attention layer0. 2024-01-29 00:12:08 +08:00
Colin 185278f3a9 Update research_attention dump without sum. 2024-01-28 17:55:08 +08:00
Colin 3f296ccdb2 Update research. 2024-01-26 20:35:25 +08:00
Colin bba27e3444 Refine prepareInput. 2024-01-25 18:05:08 +08:00
Colin 19491d1f4a Refine model of qwen. 2024-01-24 21:26:19 +08:00
Colin 11af10e710 Refine research_attention and forward model. 2024-01-23 13:13:21 +08:00
Colin 1811b9611a Refine research_attention. 2024-01-22 20:57:27 +08:00
Colin 5dbac40925 Refine. 2024-01-21 22:43:16 +08:00
Colin 17a2df2e6f Update show and q@k dump. 2024-01-21 20:50:36 +08:00
Colin ae6ea67bbe Refine qwen/research_attention.py. 2024-01-21 17:54:05 +08:00
Colin dab1c94bc6 Refine qwen to module formatter. 2024-01-21 16:47:54 +08:00
Colin 9d28280cb1 Refine model of qwen and add runner. 2024-01-21 12:45:56 +08:00
Colin 7c047f0b32 Refine model of qwen. 2024-01-21 02:33:55 +08:00
Colin 40ae899515 Refine model of qwen. 2024-01-20 23:01:09 +08:00
Colin 4d493014ba Refine model of qwen. 2024-01-20 20:20:18 +08:00
Colin 12dcbec718 PreTrainedModel to nn.Module 2024-01-20 20:06:59 +08:00
Colin 0458e7303c Remove attention_mask 2024-01-20 18:08:20 +08:00
Colin cd50c10e8c Move readme to charglm. 2024-01-20 00:11:12 +08:00
Colin e7ba788982 Delete docs. 2024-01-20 00:10:27 +08:00