Commit Graph

105 Commits

Author SHA1 Message Date
Colin b248d1d890 Fix model bug. 2024-03-20 22:23:52 +08:00
Colin 72718e6b72 Add Batch dataloader support. 2024-03-18 11:43:41 +08:00
Colin 9feaafcb7a Apply meaning data train. 2024-03-15 11:16:42 +08:00
Colin 0ae63298b2 Use custom vocab_size. 2024-03-14 13:28:40 +08:00
Colin 05f17b1221 Refine model config and init. 2024-03-14 11:40:26 +08:00
Colin 8330cbb036 Add meaning dataset. 2024-03-13 19:41:02 +08:00
Colin c094afb0f9 Add tensorboard event out. 2024-03-09 16:55:03 +08:00
Colin f1394d5974 Refine code. 2024-03-08 20:46:42 +08:00
Colin 601c7f6510 Retest wit. 2024-03-07 16:30:37 +08:00
Colin a70d12d04d Rename train file. 2024-03-05 22:09:58 +08:00
Colin 9ef3e92b23 Try model train. 2024-03-05 22:09:28 +08:00
Colin 11fc8f1d39 Refine label used. 2024-03-05 22:08:37 +08:00
Colin fdc8c657b3 Add accuracy in loss. 2024-03-05 19:30:15 +08:00
Colin cf726a5b9f Add loss and logger code. 2024-03-05 15:54:03 +08:00
Colin 9e8e92ae25 Update trainer to custom data. 2024-03-04 21:41:46 +08:00
Colin 1622bf3054 Add mnbvc dataset. 2024-03-03 23:35:40 +08:00
Colin 8120be66a6 Separate train and val datasets. 2024-02-26 23:59:00 +08:00
Colin d1906629ab Enable wit training on custom dataset; loss goes down. 2024-02-26 22:44:26 +08:00
Colin 1ef3e419cb Add custom dataset support. 2024-02-26 22:44:26 +08:00
Colin e5f97af291 Add wit train support. 2024-02-26 22:44:26 +08:00
Colin fc071dce70 Remove unused tiktoken. 2024-02-21 21:11:15 +08:00
Colin fe13f12327 Add wit. 2024-02-06 14:08:45 +08:00
Colin 6366b52fef Add research sile result. 2024-02-04 23:48:51 +08:00
Colin 9d5d590b09 Add dataset and wit. 2024-02-04 23:48:24 +08:00
Colin b7c27af6c8 Add research_token to dump token relationship in attention layer0. 2024-01-29 00:12:08 +08:00
Colin 185278f3a9 Update research_attention dump without sum. 2024-01-28 17:55:08 +08:00
Colin 3f296ccdb2 Update research. 2024-01-26 20:35:25 +08:00
Colin bba27e3444 Refine prepareInput. 2024-01-25 18:05:08 +08:00
Colin 19491d1f4a Refine model of qwen. 2024-01-24 21:26:19 +08:00
Colin 11af10e710 Refine research_attention and forward model. 2024-01-23 13:13:21 +08:00
Colin 1811b9611a Refine research_attention. 2024-01-22 20:57:27 +08:00
Colin 5dbac40925 Refine. 2024-01-21 22:43:16 +08:00
Colin 17a2df2e6f Update show and q@k dump. 2024-01-21 20:50:36 +08:00
Colin ae6ea67bbe Refine qwen/research_attention.py. 2024-01-21 17:54:05 +08:00
Colin dab1c94bc6 Refine qwen to module formatter. 2024-01-21 16:47:54 +08:00
Colin 9d28280cb1 Refine model of qwen and add runner. 2024-01-21 12:45:56 +08:00
Colin 7c047f0b32 Refine model of qwen. 2024-01-21 02:33:55 +08:00
Colin 40ae899515 Refine model of qwen. 2024-01-20 23:01:09 +08:00
Colin 4d493014ba Refine model of qwen. 2024-01-20 20:20:18 +08:00
Colin 12dcbec718 PreTrainedModel to nn.Module 2024-01-20 20:06:59 +08:00
Colin 0458e7303c Remove attention_mask 2024-01-20 18:08:20 +08:00
Colin cd50c10e8c Move readme to chatglm. 2024-01-20 00:11:12 +08:00
Colin e7ba788982 Delete docs. 2024-01-20 00:10:27 +08:00
colin 69154a4777 Delete doc/主观意识生成对话.md 2024-01-19 18:22:50 +08:00
colin fd0b0c63ba Delete chatglm/graph.md 2024-01-19 18:22:39 +08:00
Colin f96bcc799c Refine model of qwen for long sequence in eval. 2024-01-19 14:54:48 +08:00
Colin 45c2f532ff Add mem_tracker in tools. 2024-01-19 14:52:28 +08:00
Colin 3233616aac Delete kv cache of qwen. 2024-01-18 20:23:21 +08:00
Colin 0a78627e48 Add doc 2024-01-17 22:56:30 +08:00
Colin 90fbc2642e Refine modeling and demo. 2024-01-14 17:21:14 +08:00