Compare commits


283 Commits
0.2.2 ... 0.3.2

Author SHA1 Message Date
zyxucp
e1fd288875 fix style changes 2024-04-22 23:37:20 +08:00
zyxucp
91eae9cfa8 fix chat history storage 2024-04-22 23:31:08 +08:00
zyxucp
b0059942d3 Merge pull request #79 from AIDotNet/feature_chat
Feature chat
2024-04-22 23:17:58 +08:00
zyxucp
a716982878 add chat history 2024-04-22 23:17:16 +08:00
zyxucp
3d4e48f9f5 fix errors 2024-04-22 22:26:45 +08:00
zyxucp
1f212d3156 update semantic kernel to 1.8.0 2024-04-22 22:19:46 +08:00
zyxucp
7d91ef6ba1 Update LICENSE 2024-04-22 22:14:44 +08:00
zyxucp
2a450b00de add store chat history in the chats table when a user is signed in, and in localstorage for anonymous access 2024-04-22 22:03:34 +08:00
zyxucp
3a97068248 Update README.md 2024-04-21 12:12:46 +08:00
zeyu xu
1d9d95899a update antsk logo 2024-04-21 11:17:51 +08:00
zyxucp
7ae8e52b57 Merge pull request #77 from AIDotNet/feature_bge
update kernelMemory nuget version
2024-04-21 11:01:06 +08:00
zeyu xu
f5c195a1d0 update kernelMemory nuget version 2024-04-21 11:00:36 +08:00
zyxucp
78a6b662d3 Update docker-compose.simple.yml 2024-04-20 23:30:27 +08:00
zyxucp
5f814eb76c Update docker-compose.yml 2024-04-20 23:30:07 +08:00
zyxucp
d9e5ebb464 Update README.md 2024-04-20 23:29:48 +08:00
zyxucp
bce0e9183c Update README.md 2024-04-20 23:29:26 +08:00
zyxucp
c40a7bcf22 Merge pull request #76 from AIDotNet/feature_bge
Feature bge
2024-04-20 23:19:10 +08:00
zeyu xu
97a7d447ab add rerank kms 2024-04-20 23:18:07 +08:00
zeyu xu
f803b9538b fix adjust directory layout 2024-04-20 21:17:27 +08:00
zeyu xu
1ac34c1702 add rerank support for apps 2024-04-20 21:09:34 +08:00
zeyu xu
e07b480da1 add bgemodel 2024-04-20 21:02:44 +08:00
zeyu xu
9036af57e3 rename 2024-04-20 20:56:55 +08:00
zeyu xu
93288f9b5c add bgererank model download 2024-04-20 20:56:00 +08:00
zyxucp
f40dd8b013 Merge pull request #75 from AIDotNet/feature_menu
add style handling for over-long text on the model management page
2024-04-20 10:42:23 +08:00
zeyu xu
c6b83d0695 add style handling for over-long text on the model management page 2024-04-20 10:41:48 +08:00
zyxucp
592c850198 Merge pull request #74 from AIDotNet/feature_menu
add split model management into a standalone menu
2024-04-20 10:31:43 +08:00
zeyu xu
4a3930ac7b add split model management into a standalone menu 2024-04-20 10:31:19 +08:00
zyxucp
c05ba0af3e Update README.md 2024-04-19 23:20:44 +08:00
zyxucp
630ee51df6 Update docker-compose.simple.yml 2024-04-19 23:20:26 +08:00
zyxucp
d0e75e26c3 Update docker-compose.yml 2024-04-19 23:20:04 +08:00
zyxucp
62c36c3072 Merge pull request #73 from AIDotNet/feature_deldimensions
add DelDimensions
2024-04-19 23:09:33 +08:00
zeyu xu
baef309064 add DelDimensions 2024-04-19 23:08:50 +08:00
zeyu xu
d717cbad9c update nuget sqlsugar 2024-04-19 22:12:02 +08:00
zeyu xu
5ef0624605 fix WithLogCallback log output 2024-04-19 22:04:08 +08:00
zyxucp
98f0f9fe84 Update README.md 2024-04-18 22:15:30 +08:00
zeyu xu
28a23271e9 fix rename files 2024-04-18 22:05:23 +08:00
zeyu xu
f1ba0bdf10 add validation before model deletion 2024-04-18 21:49:09 +08:00
zeyu xu
0d5513f374 add Directory.Build.props 2024-04-18 21:23:25 +08:00
zyxucp
4812cc308c Update README.md 2024-04-17 22:59:46 +08:00
zyxucp
584f7faded add environment variables 2024-04-17 18:25:58 +08:00
zyxucp
08dcef2d8b Update docker-compose.simple.yml 2024-04-16 21:56:58 +08:00
zyxucp
68218733a2 Update docker-compose.yml 2024-04-16 21:56:40 +08:00
zyxucp
eb64cbf3d4 update upgrade km nuget version 2024-04-16 17:01:32 +08:00
zyxucp
f0e8a55522 Merge pull request #72 from AIDotNet/feature_qa1
fix use service injection for text chunking
2024-04-16 13:56:39 +08:00
zyxucp
5ec5a0bde4 fix use service injection for text chunking 2024-04-16 13:54:24 +08:00
zyxucp
1cc56dd553 Merge pull request #71 from AIDotNet/feature_qa
Feature qa
2024-04-15 23:27:02 +08:00
zeyu xu
64e949a88b add qa chunking 2024-04-15 23:26:23 +08:00
zeyu xu
a2390a7c97 add qa question answering 2024-04-15 23:21:11 +08:00
zeyu xu
559661bb6c add qa parameters 2024-04-15 21:30:06 +08:00
zyxucp
79326de263 Merge pull request #70 from IntptrMax/Add-stable-diffusion-reference-for-Windows
Add stable diffusion reference for windows
2024-04-15 10:37:03 +08:00
IntptrMax
3815891b28 remove unused stable-diffusion.dll 2024-04-15 08:51:17 +08:00
IntptrMax
42d474382a Merge branch 'AIDotNet:main' into Add-stable-diffusion-reference-for-Windows 2024-04-15 08:36:29 +08:00
IntptrMax
fe691f2d44 1.Update some Stable Diffusion code 2024-04-15 08:34:16 +08:00
IntptrMax
3ee41a8ab1 1. Add references for Stable Diffusion for windows.
2. Update load lib code
2024-04-15 08:22:12 +08:00
zeyu xu
7ca41dff8a add docker file py 2024-04-13 10:56:47 +08:00
zeyu xu
ba2e86993e add dll 2024-04-13 10:55:42 +08:00
zyxucp
13878046a2 ignore files 2024-04-12 12:13:43 +08:00
zeyu xu
49ff8bf54f fix remove empty references 2024-04-11 23:35:53 +08:00
zeyu xu
e9cc5a3993 add loading 2024-04-11 22:32:00 +08:00
zyxucp
b213964b63 Merge pull request #67 from IntptrMax/UpdateStableDiffusion
Update Stable Diffusion
2024-04-11 21:53:44 +08:00
IntptrMax
bfbed44270 Update Stable Diffusion 2024-04-11 15:11:17 +08:00
zyxucp
9b07d88392 Update Dockerfile-py 2024-04-10 23:31:20 +08:00
zyxucp
3f8ed109f9 fix nuget 2024-04-10 23:23:18 +08:00
zyxucp
3f969627a4 fix ocr runtime 2024-04-10 22:51:37 +08:00
zyxucp
d92970819a Merge pull request #65 from AIDotNet/feature_ocr
add runtime
2024-04-10 22:39:28 +08:00
zyxucp
23e756fa9b add runtime 2024-04-10 22:37:32 +08:00
zyxucp
5f58126fbf Merge pull request #64 from AIDotNet/feature_ocr
Feature ocr
2024-04-10 22:25:36 +08:00
zyxucp
dcfd0ffb8f add ocr 2024-04-10 22:23:59 +08:00
zyxucp
17221d056c add img 2024-04-10 21:57:45 +08:00
zyxucp
4a9dcfada4 fix change default key 2024-04-10 21:54:26 +08:00
zyxucp
bb6c2bb020 fix upgrade LLamaSharp 2024-04-09 12:32:39 +08:00
zyxucp
a8760a34de fix wrong address issue 2024-04-09 12:20:15 +08:00
zeyu xu
1e432a5782 fix remove pynet 2024-04-08 14:50:34 +08:00
zeyu xu
cb861ef2bb fix OCR 2024-04-08 12:38:14 +08:00
zeyu xu
7cee8fd87a add OCR and document query limit 2024-04-08 12:10:32 +08:00
zeyu xu
8ce0e5d348 add ocr 2024-04-07 22:31:57 +08:00
zyxucp
90bce7c89f Merge branch 'main' of https://github.com/AIDotNet/AntSK 2024-04-07 15:59:03 +08:00
zyxucp
b840d0bcce fix add gpu avx 2024-04-07 15:58:26 +08:00
zyxucp
bfa6d28289 Update README.md 2024-04-07 14:40:47 +08:00
zeyu xu
f6e6ca9747 Merge branch 'main' of github.com:AIDotNet/AntSK 2024-04-07 11:44:15 +08:00
zeyu xu
75f8d39648 fix change types 2024-04-07 11:44:05 +08:00
zyxucp
9a939eba5a fix change async to sync 2024-04-07 11:03:18 +08:00
zyxucp
4e93efe821 fix change async to sync 2024-04-07 10:49:35 +08:00
zyxucp
8bdbee80a0 fix change types 2024-04-07 10:37:16 +08:00
zyxucp
6bdf5dcc03 fix PG column error 2024-04-07 10:05:15 +08:00
zeyu xu
0bf0a9d78a fix chat style 2024-04-06 11:50:15 +08:00
zeyu xu
38e9fea601 Merge branch 'main' of github.com:AIDotNet/AntSK 2024-04-06 11:47:59 +08:00
zeyu xu
d2366b3b46 update Semantic Kernel and fix kmsdetaillist style 2024-04-06 11:47:49 +08:00
zyxucp
3aff93083a Update README.md 2024-04-06 11:04:58 +08:00
zeyu xu
eb998199db fix remove unneeded Controller 2024-04-06 00:03:53 +08:00
zeyu xu
1dd794af1b fix vector plugin installation 2024-04-06 00:03:17 +08:00
zeyu xu
08c9923e7e update docker-compose.yml 2024-04-05 23:58:45 +08:00
zyxucp
06b109ca87 Merge pull request #62 from AIDotNet/feature_kms
Feature kms
2024-04-05 22:27:21 +08:00
zeyu xu
9b039335c7 fix style changes 2024-04-05 22:26:37 +08:00
zeyu xu
041378e5fd add document search testing 2024-04-05 22:16:44 +08:00
zeyu xu
6dc5ae10e3 add search test layout 2024-04-05 21:19:41 +08:00
zeyu xu
5807f4c283 fix slice detail styles 2024-04-05 21:09:38 +08:00
zyxucp
8ef4445908 Merge pull request #61 from AIDotNet/feature_kms
Feature kms
2024-04-05 20:23:49 +08:00
zeyu xu
8a0609e970 add excel import 2024-04-05 20:23:03 +08:00
zeyu xu
9f33b5009b add excel import 2024-04-05 19:50:58 +08:00
zeyu xu
50e66db8a1 Merge branch 'feature_kms' of github.com:AIDotNet/AntSK into feature_kms 2024-04-05 19:18:26 +08:00
zeyu xu
c3e83b569a fix upgrade nuget 2024-04-05 19:18:20 +08:00
zyxucp
85d1c5ea7e add npoi 2024-04-05 19:17:49 +08:00
zeyu xu
ec1d126a02 add excel import 2024-04-05 19:13:30 +08:00
zyxucp
e857695e70 Update docker-compose.simple.yml 2024-04-05 18:45:56 +08:00
zyxucp
fa9b2051fe Update docker-compose.yml 2024-04-05 18:45:41 +08:00
zyxucp
d450efcffe Merge pull request #60 from AIDotNet/feature_kms
fix adjust the prompt length limit
2024-04-05 15:51:22 +08:00
zeyu xu
2a6c84c200 fix adjust the prompt length limit 2024-04-05 15:50:46 +08:00
zyxucp
138a952ace Merge pull request #59 from AIDotNet/feature_kms
Feature kms
2024-04-05 15:41:14 +08:00
zeyu xu
eb6528ecd2 add restructure messages to reduce localstore usage 2024-04-05 15:39:56 +08:00
zeyu xu
2c30bbfa09 fix detail tweaks 2024-04-05 15:29:39 +08:00
zeyu xu
c5a78c2135 add modeldownchange 2024-04-05 15:12:29 +08:00
zeyu xu
f03362ee41 fix change dropdown to Trigger.Click 2024-04-05 15:04:44 +08:00
zeyu xu
fad3167d97 add kms settings 2024-04-05 15:00:37 +08:00
zeyu xu
ad949681dd add change 2024-04-05 14:41:58 +08:00
zeyu xu
27999d76b0 fix knowledge base functions 2024-04-05 14:26:35 +08:00
zeyu xu
83278352d6 add kms configuration 2024-04-05 14:12:25 +08:00
zeyu xu
fcc56f5fef fix bge embedding chunking failure 2024-04-04 00:37:08 +08:00
zyxucp
4ebe2ecc32 fix adjust initialization and add a completion flag 2024-04-02 13:53:32 +08:00
zeyu xu
e684cba527 Merge branch 'main' of github.com:AIDotNet/AntSK 2024-04-02 13:34:38 +08:00
zeyu xu
888dc19ee0 fix bgeembedding 2024-04-02 13:34:24 +08:00
zyxucp
731aea702f fix adjust prompts 2024-04-02 11:17:53 +08:00
zyxucp
09e22bc76a Update README.md 2024-04-02 00:07:08 +08:00
zyxucp
74406d88a0 Merge pull request #58 from AIDotNet/feature_StableDiffusion
fix convert to static class
2024-04-01 23:57:12 +08:00
zeyu xu
e5f9d97560 fix convert to static class 2024-04-01 23:56:44 +08:00
zyxucp
59e768aaea Merge pull request #57 from AIDotNet/feature_StableDiffusion
Feature stable diffusion
2024-04-01 23:39:22 +08:00
zeyu xu
6a7cb24a5b add sd 2024-04-01 23:08:53 +08:00
zeyu xu
1db40d534c add apptype 2024-04-01 22:14:18 +08:00
zeyu xu
11d6e30f7e add sd function 2024-04-01 22:03:00 +08:00
zeyu xu
9d5214aaae add sdmodel 2024-04-01 21:57:18 +08:00
zeyu xu
010b906271 add sd 2024-04-01 21:35:51 +08:00
zeyu xu
16bf944edf add sd 2024-04-01 21:31:15 +08:00
zeyu xu
5bae5a099a margin 2024-04-01 21:01:29 +08:00
zyxucp
f771ea9521 Merge branch 'main' of https://github.com/AIDotNet/AntSK 2024-04-01 13:54:53 +08:00
zyxucp
994efbf37c update nuget 2024-04-01 13:54:20 +08:00
zyxucp
938cd86c88 Update README.md 2024-03-31 13:24:21 +08:00
zeyu xu
1339cbadbc fix menukey 2024-03-31 13:07:30 +08:00
zeyu xu
bd0ad570ad add usage documentation 2024-03-31 13:07:08 +08:00
zeyu xu
234e649a7e fix optimize some content 2024-03-31 12:38:17 +08:00
zyxucp
c431dbc842 Update README.md 2024-03-31 00:28:16 +08:00
zyxucp
76283060d9 Update docker-compose.simple.yml 2024-03-30 23:28:52 +08:00
zyxucp
75ba506db4 Update docker-compose.yml 2024-03-30 23:28:33 +08:00
zyxucp
e086ca60df Merge pull request #56 from AIDotNet/feature_chatview
add chathistory to localstorage
2024-03-30 22:40:06 +08:00
zeyu xu
04acaa9b12 add chathistory to localstorage 2024-03-30 22:39:26 +08:00
zyxucp
7a824bf18c Merge pull request #55 from AIDotNet/feature_chatview
fix hide the file-upload chat option when no embedding model is configured
2024-03-30 22:26:55 +08:00
zeyu xu
769de2e526 fix hide the file-upload chat option when no embedding model is configured 2024-03-30 22:23:01 +08:00
zyxucp
c9950609c9 Merge pull request #54 from AIDotNet/feature_chatview
fix remove unneeded elements
2024-03-30 21:58:08 +08:00
zeyu xu
ccee6cfea5 fix remove unneeded elements 2024-03-30 21:57:38 +08:00
zyxucp
f626c618be Merge pull request #53 from AIDotNet/feature_chatview
Feature chatview
2024-03-30 21:53:01 +08:00
zeyu xu
3b601a9e3d fix adjust styles 2024-03-30 21:48:29 +08:00
zeyu xu
3d5f63d595 add chatview 2024-03-30 21:45:37 +08:00
zeyu xu
6933f2f495 add openchat file 2024-03-30 20:39:02 +08:00
zeyu xu
79c7e8626a add chatview 2024-03-30 20:28:40 +08:00
zyxucp
4a017d311c Update README.md 2024-03-30 20:03:57 +08:00
zeyu xu
0c8ad5fe8d add loading 2024-03-30 19:50:29 +08:00
zeyu xu
68ce0db011 fix style changes 2024-03-30 17:35:40 +08:00
zeyu xu
c36de1a1e9 add option controls 2024-03-30 17:25:58 +08:00
zeyu xu
44ef759abd fix adjust controls 2024-03-30 14:47:29 +08:00
longdream
0c3d9844be Merge pull request #52 from longdream/main
add bge embedding model; bge runs on the CPU
2024-03-29 21:51:35 +08:00
longdream
854c62a4ca merge 2024-03-29 21:50:17 +08:00
longdream
5ed4fd5299 Merge branch 'main' of https://github.com/longdream/AntSK 2024-03-29 20:00:53 +08:00
longdream
af5ec43571 modify the settings UI 2024-03-29 20:00:49 +08:00
zyxucp
24d685879e Update README.md 2024-03-29 18:43:10 +08:00
zyxucp
e801a2ec46 Update README.md 2024-03-29 18:42:18 +08:00
junlong
d7b56d1590 Merge branch 'main' of https://github.com/longdream/AntSK 2024-03-29 15:34:08 +08:00
longdream
b925f8890b change token length 2024-03-28 23:06:21 +08:00
longdream
5d80ee994a resolve thread-contention issue 2024-03-28 19:04:11 +08:00
zeyu xu
da8f955ca2 Merge branch 'main' of github.com:AIDotNet/AntSK 2024-03-28 18:34:59 +08:00
zeyu xu
2e04582c5e fix prompt 2024-03-28 18:34:46 +08:00
zyxucp
e69994f727 Merge pull request #50 from ElderJames/fix/sparkdesk-func-call
fix function calling result for sparkdesk
2024-03-28 11:14:52 +08:00
James Yeung
d8dc26127d fix function calling result for sparkdesk 2024-03-28 10:49:50 +08:00
longdream
f73bd2dfda add/remove embedding 2024-03-27 22:53:45 +08:00
zyxucp
9f08b60348 Update README.md 2024-03-27 20:25:36 +08:00
zeyu xu
75c2f36b30 update nuget 2024-03-27 19:27:59 +08:00
zyxucp
39c02a6064 Update README.en.md 2024-03-27 19:20:21 +08:00
zyxucp
52c119befd Update README.md 2024-03-27 19:19:12 +08:00
zyxucp
23903ded3f Update README.en.md 2024-03-27 19:11:46 +08:00
zyxucp
4799fbac72 Update README.md 2024-03-27 19:11:02 +08:00
zyxucp
16a7d55271 Merge pull request #49 from AIDotNet/feature_vctordb
add AzureAISearch
2024-03-27 19:10:27 +08:00
zeyu xu
be0bafcc50 add AzureAISearch 2024-03-27 19:09:42 +08:00
zyxucp
defc51a074 Update README.md 2024-03-27 18:46:51 +08:00
zyxucp
09709c210d Merge pull request #48 from AIDotNet/feature_vctordb
add RedisMemoryDb
2024-03-27 18:44:59 +08:00
zeyu xu
8ebb2f54eb add RedisMemoryDb 2024-03-27 18:44:26 +08:00
zyxucp
b831aab115 Update README.md 2024-03-27 17:49:27 +08:00
zyxucp
8e322162cc Update README.md 2024-03-27 17:48:40 +08:00
zyxucp
6a8a6509b8 Merge pull request #47 from AIDotNet/feature_vctordb
add QdrantMemoryDb
2024-03-27 17:42:12 +08:00
zeyu xu
707dff09f8 add QdrantMemoryDb 2024-03-27 17:41:15 +08:00
zyxucp
17c8fca40f Merge pull request #45 from duyanming/main
text corrections
2024-03-27 15:49:29 +08:00
zeyu xu
415f9757e9 Merge branch 'main' of github.com:AIDotNet/AntSK 2024-03-27 15:40:57 +08:00
zeyu xu
27394f0699 fix correct the bearer example 2024-03-27 15:40:47 +08:00
duyanming
8a9ca40bb6 text corrections 2024-03-27 08:09:23 +08:00
longdream
f340ee1088 embedding encapsulation 2024-03-26 23:14:55 +08:00
zyxucp
080eb5765e Update README.md 2024-03-26 21:24:26 +08:00
zeyu xu
36c8ff184a Merge branch 'main' of github.com:AIDotNet/AntSK 2024-03-26 21:22:26 +08:00
zeyu xu
0486f67b50 add gzh 2024-03-26 21:21:57 +08:00
zyxucp
aa7e8d545c Update README.md 2024-03-26 21:07:02 +08:00
zyxucp
59f6a899a6 Update README.en.md 2024-03-26 21:06:28 +08:00
zyxucp
0fc98d42aa Update README.md 2024-03-26 21:04:21 +08:00
longdream
edad2644aa remove unnecessary py files 2024-03-26 20:48:49 +08:00
longdream
8a56a0393a Merge branch 'main' of https://github.com/longdream/AntSK 2024-03-26 20:48:07 +08:00
zyxucp
f4cbf9a40a Update README.md 2024-03-26 00:03:18 +08:00
zyxucp
fb5b92f499 Update README.en.md 2024-03-26 00:02:59 +08:00
zeyu xu
c286258f2b del remove the early LLamaSharp http version 2024-03-25 23:04:17 +08:00
zeyu xu
4416651589 Merge branch 'main' of github.com:AIDotNet/AntSK 2024-03-25 22:29:16 +08:00
zeyu xu
48a33e8977 fix formatting changes 2024-03-25 22:29:04 +08:00
junlong
bd5ca06d8f test 2024-03-25 16:55:41 +08:00
junlong
e0985ecec3 Merge branch 'main' of https://github.com/longdream/AntSK 2024-03-25 16:48:21 +08:00
junlong
e56b74d4af remove all files except chat 2024-03-25 16:48:11 +08:00
zyxucp
c417098c2c Merge pull request #44 from ElderJames/fix/sparkdesk-func-issue
fix: sparkdesk function call definition conversion
2024-03-25 14:15:03 +08:00
zyxucp
93527215a7 Merge branch 'main' of https://github.com/AIDotNet/AntSK 2024-03-25 13:43:57 +08:00
James Yeung
0cf3945693 fix: sparkdesk function call definition conversion 2024-03-25 13:16:10 +08:00
zyxucp
ced2a9b2e2 fix adjust llamafactory load order 2024-03-25 12:03:09 +08:00
zyxucp
987b231c4d Update README.md 2024-03-25 11:16:26 +08:00
zyxucp
7a541c1da1 Update README.md 2024-03-25 11:14:43 +08:00
zyxucp
74e323158d fix variable naming conventions 2024-03-24 23:46:14 +08:00
zyxucp
563a7409f6 Merge pull request #42 from AIDotNet/feature_chathistory
Feature chathistory
2024-03-24 23:44:20 +08:00
zyxucp
b13b93e04e Merge branch 'feature_chathistory' of https://github.com/AIDotNet/AntSK into feature_chathistory 2024-03-24 23:43:34 +08:00
zyxucp
44568c8d65 fix OpenAIService chat history 2024-03-24 23:43:07 +08:00
zyxucp
fb277dff80 Merge pull request #41 from AIDotNet/feature_chathistory
Feature chathistory
2024-03-24 23:12:45 +08:00
zeyu xu
efae890650 fix adjust kms prompts 2024-03-24 23:08:32 +08:00
zyxucp
a146f6059e fix adjust chat history sessions 2024-03-24 22:51:20 +08:00
zyxucp
3c67096cd8 Update README.md 2024-03-24 19:56:13 +08:00
zeyu xu
a993a60f95 Merge branch 'main' of github.com:AIDotNet/AntSK 2024-03-24 13:47:51 +08:00
zeyu xu
d3fdc77600 add documentation 2024-03-24 13:47:38 +08:00
zyxucp
b62c56e36f Update README.md 2024-03-24 13:05:18 +08:00
zeyu xu
7d72911239 fix set the default gpu layer count to 20 2024-03-24 12:23:27 +08:00
zeyu xu
9e24d7cc67 add Authors 2024-03-24 12:07:31 +08:00
zeyu xu
9baa24b496 Merge branch 'main' of github.com:AIDotNet/AntSK 2024-03-24 12:06:02 +08:00
zeyu xu
da826525f7 add docker file for the py environment 2024-03-24 12:05:50 +08:00
zyxucp
62dfab41fd Update README.en.md 2024-03-24 12:03:08 +08:00
zeyu xu
04fc811c2c fix home page styles and github link styles 2024-03-24 12:02:05 +08:00
zeyu xu
8638ecbe29 fix make descriptions consistent 2024-03-24 10:55:30 +08:00
zeyu xu
6f1f93fbaf update bump AntDesign.ProLayout, SemanticKernel, and KernelMemory versions 2024-03-24 10:52:44 +08:00
zyxucp
dc38d83f89 Update README.md 2024-03-23 23:14:20 +08:00
zeyu xu
fd780780c5 update docker-compose.yml 2024-03-23 22:39:51 +08:00
zyxucp
6fd918f33b Update README.md 2024-03-23 21:48:17 +08:00
zyxucp
8fcfa8974b Update README.md 2024-03-23 16:49:15 +08:00
zyxucp
7e23c32c6c Update README.md 2024-03-23 16:48:24 +08:00
zyxucp
7fdfceeea5 Update docker-compose.yml 2024-03-23 13:01:03 +08:00
zyxucp
bf0e62634f Update README.md 2024-03-23 12:44:53 +08:00
zyxucp
469ef9aab2 Merge pull request #38 from AIDotNet/feature_llamafactory
Feature llamafactory
2024-03-23 12:34:11 +08:00
zeyu xu
f7df26030d add environment installation 2024-03-23 12:30:52 +08:00
zeyu xu
f61cbe9780 add track whether llamafactory has started 2024-03-23 12:20:56 +08:00
zeyu xu
56b62cff2a fix adjust prompt text 2024-03-23 11:55:50 +08:00
zeyu xu
0aec21cf03 fix log output style width 2024-03-23 11:40:30 +08:00
zeyu xu
ff4f6be5fc add log output 2024-03-22 22:57:38 +08:00
longdream
964a5022c8 modify output 2024-03-22 21:44:51 +08:00
longdream
849b18f677 Merge branch 'AIDotNet:main' into main 2024-03-22 19:36:20 +08:00
zeyu xu
b8c6a6a626 add validation 2024-03-21 23:59:23 +08:00
zeyu xu
57fc9a9b7e fix make the start button unavailable after launch 2024-03-21 23:58:31 +08:00
zeyu xu
068c126a23 add llamafactory dropdown list 2024-03-21 23:53:50 +08:00
zeyu xu
0ed1662c7b add llamafactory model 2024-03-21 23:20:05 +08:00
zeyu xu
a09377814f Merge branch 'main' into feature_llamafactory 2024-03-21 22:31:32 +08:00
zeyu xu
3cc952bb2a fix adapt to the SK config.json structure change after upgrade 2024-03-21 22:24:45 +08:00
zeyu xu
63c968742b Merge branch 'main' into feature_llamafactory 2024-03-21 22:15:48 +08:00
zeyu xu
a4d6f2a6fd fix null-pointer bug when sharing a session 2024-03-21 22:14:49 +08:00
zeyu xu
d6de64853d fix filelist encapsulation 2024-03-21 21:51:09 +08:00
junlong
344128e49d Merge branch 'main' of https://github.com/longdream/AntSK 2024-03-21 19:38:03 +08:00
junlong
56fc9dd517 test 2024-03-21 19:37:56 +08:00
zyxucp
c7c1911eb1 fix restrict upload-button clicks during a session 2024-03-21 19:02:48 +08:00
zyxucp
dec8b5bef7 fix style changes 2024-03-21 18:12:15 +08:00
zyxucp
db7271b519 Merge branch 'feature_llamafactory' of https://github.com/AIDotNet/AntSK into feature_llamafactory 2024-03-21 14:28:37 +08:00
zyxucp
fcbee1f64f margin 2024-03-21 14:15:23 +08:00
zyxucp
2d9443a0a1 Merge pull request #36 from jeffersyuan1976/main
IconPicker component
2024-03-21 13:53:43 +08:00
zeyu xu
6e3dd00d6f fix remove duplicate files 2024-03-21 13:46:45 +08:00
Jeffers
7a0656cd81 IconPicker component 2024-03-21 13:23:22 +08:00
zyxucp
3b89d9e974 fix tidy up startup injection functions 2024-03-21 12:38:08 +08:00
zyxucp
cd174308cf fix codefirst injection mode 2024-03-21 12:20:05 +08:00
zyxucp
aacef47626 Update README.md 2024-03-20 23:27:45 +08:00
zyxucp
fcef01a41f Update README.md 2024-03-20 23:27:19 +08:00
zyxucp
7d19b694fa Update README.md 2024-03-20 23:26:44 +08:00
zyxucp
966a31b156 Update README.md 2024-03-20 23:25:28 +08:00
zeyu xu
7538393742 add requirements 2024-03-20 23:09:52 +08:00
zeyu xu
3658188be2 fix move llamafactory into its own class library and add directory output 2024-03-20 22:26:23 +08:00
zyxucp
fcd9fb9079 Update docker-compose.simple.yml 2024-03-20 21:38:59 +08:00
zyxucp
d1168e16d6 Update docker-compose.yml 2024-03-20 21:38:44 +08:00
zyxucp
25a6f00dd2 Merge pull request #35 from longdream/main
llama factory initialization
2024-03-20 17:41:33 +08:00
longdream
13c474f084 Merge branch 'feature_llamafactory' into main 2024-03-20 14:55:28 +08:00
zyxucp
8d78270007 fix bug where file import failed 2024-03-20 13:47:57 +08:00
junlong
a94b59c156 llama factory initialization 2024-03-20 11:19:57 +08:00
zyxucp
ae6d61ee6d Merge branch 'main' of https://github.com/AIDotNet/AntSK 2024-03-20 09:52:13 +08:00
zyxucp
74a7c94619 fix config file directory hierarchy 2024-03-20 09:52:02 +08:00
zeyu xu
bdd34ac786 update docker file 2024-03-19 22:37:28 +08:00
237 changed files with 18262 additions and 1887 deletions

4
.gitignore vendored

@@ -324,10 +324,6 @@ ASALocalRun/
# MSBuild Binary and Structured Log
*.binlog
# NVidia Nsight GPU debugger configuration file
*.nvuser
*.dll
*.pdb
# MFractors (Xamarin productivity tool) working folder
.mfractor/
**/bin/


@@ -22,4 +22,5 @@ WORKDIR /app
COPY --from=build /app/publish .
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
RUN echo 'Asia/Shanghai' >/etc/timezone
RUN apt update && apt install -y libpugixml-dev libtbb-dev
ENTRYPOINT ["dotnet", "AntSK.dll"]

29
Dockerfile-py Normal file

@@ -0,0 +1,29 @@
# 1. Define the Python image to use for getting pip
FROM pytorch/pytorch AS python-base
# 2. Define the .NET SDK image to build your application
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
COPY ["src/AntSK/AntSK.csproj", "AntSK/"]
RUN dotnet restore "AntSK/AntSK.csproj"
COPY src/ .
WORKDIR "/src/AntSK"
RUN dotnet build "AntSK.csproj" -c Release -o /app/build
RUN dotnet publish "AntSK.csproj" -c Release -o /app/publish
# 3. Define the final image that will contain both .NET runtime and Python
FROM mcr.microsoft.com/dotnet/aspnet:8.0 AS final
# Copy the Python/pip installation from the official Python image
COPY --from=python-base /usr/local /usr/local
COPY --from=python-base /opt/conda/ /opt/conda/
WORKDIR /app
COPY --from=build /app/publish .
# Make sure the app and Python directories are in PATH
ENV PATH="/app:/opt/conda/bin:/usr/local/bin:${PATH}"
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
RUN echo 'Asia/Shanghai' >/etc/timezone
RUN pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
RUN apt update && apt install -y libpugixml-dev libtbb-dev
ENTRYPOINT ["dotnet", "AntSK.dll"]


@@ -186,7 +186,7 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Copyright [2024] [许泽宇]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.


@@ -1,152 +1,89 @@
[简体中文](./README.md) | English
# AntSK
## AI Knowledge Base/Intelligent Agent built on .Net8+AntBlazor+SemanticKernel
## An AI knowledge base/intelligent agent built with .Net 8+AntBlazor+SemanticKernel
## ⭐Core Features
- **Semantic Kernel**: Utilizes advanced natural language processing technology to accurately understand, process, and respond to complex semantic queries, providing users with precise information retrieval and recommendation services.
- **Kernel Memory**: Capable of continuous learning and storing knowledge points, AntSK has long-term memory function, accumulates experience, and provides a more personalized interaction experience.
## Core Features
- **Knowledge Base**: Import knowledge base through documents (Word, PDF, Excel, Txt, Markdown, Json, PPT) and perform knowledge base Q&A.
- **GPT Generation**: This platform supports creating personalized GPT models, enabling users to build their own GPT models.
- **API Interface Publishing**: Exposes internal functions in the form of APIs, enabling developers to integrate AntSK into other applications and enhance application intelligence.
- **Semantic Kernel**: Utilizes advanced natural language processing technologies to accurately understand, process, and respond to complex semantic queries, providing users with precise information retrieval and recommendation services.
- **API Plugin System**: Open API plugin system that allows third-party developers or service providers to easily integrate their services into AntSK, continuously enhancing application functionality.
- **.Net Plugin System**: Open dll plugin system that allows third-party developers or service providers to easily integrate their business functions by generating dll in standard format code, continuously enhancing application functionality.
- **Kernel Memory**: Capable of continuous learning and knowledge storage, AntSK has a long-term memory function, accumulating experience to offer more personalized interaction experiences.
- **Online Search**: AntSK, real-time access to the latest information, ensuring users receive the most timely and relevant data.
- **Model Management**: Adapts and manages integration of different models from different manufacturers, including gguf types supported by **llama.cpp** and models offline running supported by **llamafactory**.
- **Knowledge base**: Import knowledge into the database through documents (Word, PDF, Excel, Txt, Markdown, Json, PPT) and manage knowledge base documents.
- **GPTs Generation**: The platform supports the creation of personalized GPT models; try building your own GPT model.
- **API Interface Release**: Internal functions are provided as APIs for developers to integrate AntSK into other applications, enhancing application intelligence.
- **API Plugin System**: An open API plugin system allows third-party developers or service providers to easily integrate their services into AntSK, continuously enhancing application functions.
- **.Net Plugin System**: An open dll plugin system allows third-party developers or service providers to integrate their business functions into AntSK by generating dlls with the standard format codes, continuously enhancing application functions.
- **Internet Search**: AntSK can retrieve the latest information in real-time, ensuring that the information users receive is always timely and relevant.
- **Model management**: Adapts and manages different models from various manufacturers. It also supports offline running of models in 'gguf' format supported by llama.cpp.
- **National Information Creation**: AntSK supports domestic models and databases, and can operate under information creation conditions.
## Application scenarios
- **Domestic Innovation**: AntSK supports domestic models and databases and can run under domestic innovation conditions.
- **Model Fine-Tuning**: Planned based on llamafactory for model fine-tuning.
## ⛪Application Scenarios
AntSK is suitable for various business scenarios, such as:
- Corporate knowledge management systems
- Automated customer service and chatbots
- Enterprise Search Engine
- Enterprise knowledge management system
- Automatic customer service and chatbots
- Enterprise search engine
- Personalized recommendation system
- Intelligent writing assistance
- Education and online learning platforms
- Other interesting AI Apps
- Intelligent assisted writing
- Education and online learning platform
- Other interesting AI applications
## Function example
First, you need to create a knowledge base
![Knowledge base](https://github.com/xuzeyu91/AntSK/blob/main/images/%E7%9F%A5%E8%AF%86%E5%BA%93.png)
In the knowledge base, you can use documents or urls to import
![Knowledge base details](https://github.com/xuzeyu91/AntSK/blob/main/images/%E7%9F%A5%E8%AF%86%E5%BA%93%E8%AF%A6%E6%83%85.png)
Click View to view the document slicing of the knowledge base
![Document Slice](https://github.com/xuzeyu91/AntSK/blob/main/images/%E6%96%87%E6%A1%A3%E5%88%87%E7%89%87.png)
Then we need to create applications, which can create dialog applications and knowledge bases.
![Application](https://github.com/xuzeyu91/AntSK/blob/main/images/%E5%BA%94%E7%94%A8.png)
The application of knowledge base needs to select the existing knowledge base, which can be multiple
![Application Configuration](https://github.com/xuzeyu91/AntSK/blob/main/images/%E5%BA%94%E7%94%A8%E9%85%8D%E7%BD%AE.png)
Then you can ask questions about the knowledge base documents in the dialogue
![Q&A](https://github.com/xuzeyu91/AntSK/blob/main/images/%E9%97%AE%E7%AD%94.png)
In addition, we can also create dialogue applications, and configure prompt word templates in corresponding applications
![Conversation application](https://github.com/xuzeyu91/AntSK/blob/main/images/%E7%AE%80%E5%8D%95%E5%AF%B9%E8%AF%9D.png)
Let's see the effect
![Conversation effect](https://github.com/xuzeyu91/AntSK/blob/main/images/%E5%AF%B9%E8%AF%9D%E6%95%88%E6%9E%9C.png)
## How do I get started?
Here I am using Postgres as a data and vector store, because the Semantic Kernel and Kernel Memory both support it, though you can switch to others.
The model by default supports local models in 'gguf' format from openai, azure openai, and llama. If you need to use other models, you can integrate them using the one-api.
The login configuration in the configuration file is the default account and password.
The following configuration files are needed:
## Using Docker Compose
An appsettings.json for the pg version and a simplified version (Sqlite+disk) docker-compose.simple.yml are provided.
Download docker-compose.yml from the project root directory, then place the configuration file appsettings.json in the same directory,
The pg image has already been prepared. You can modify the default account and password in the docker-compose.yml, and then your appsettings.json database connection needs to be consistent.
Then you can enter the directory and execute
## ✏Function Examples
### Online Demo
```
docker compose up -d
https://antsk.ai-dotnet.com/
```
to start AntSK.
How to mount local models and model download directories in docker
```
# Non-host version, does not use local proxy
Default account: test
Default password: test
Due to the low configuration of the cloud server, the local model cannot be run, so the system settings permissions have been closed. You can simply view the interface. If you want to use the local model, please download and use it on your own.
```
### Other Function Examples
[Video Demonstration](https://www.bilibili.com/video/BV1zH4y1h7Y9/)
## ❓How to get started?
Here I am using Postgres as the data and vector storage because Semantic Kernel and Kernel Memory support it, but you can also use other options.
The model by default supports the local model of openai, azure openai, and llama. If you need to use other models, you can integrate them using one-api.
The Login configuration in the configuration file is the default login account and password.
The following configuration file needs to be configured
## 1⃣Using docker-compose
Provided the pg version **appsettings.json** and simplified version (Sqlite+disk) **docker-compose.simple.yml**
Download **docker-compose.yml** from the project root directory and place the configuration file **appsettings.json** in the same directory.
The pg image has already been prepared. You can modify the default username and password in docker-compose.yml, and then the database connection in your **appsettings.json** needs to be consistent.
Then you can execute the following command in the directory to start AntSK
```
docker-compose up -d
```
## 2⃣How to mount local models and model download directory in docker
```
# Non-host version, do not use local proxy
version: '3.8'
services:
antsk:
container_name: antsk
image: registry.cn-hangzhou.aliyuncs.com/xuzeyu91/antsk:v0.2.1
ports:
image: registry.cn-hangzhou.aliyuncs.com/AIDotNet/antsk:v0.1.5
ports:
- 5000:5000
networks:
- antsk
@@ -156,48 +93,30 @@ services:
environment:
- ASPNETCORE_URLS=http://*:5000
volumes:
- ./appsettings.json:/app/appsettings.json # local configuration file must be placed in the same directory
- ./appsettings.json:/app/appsettings.json # Local configuration file needs to be placed in the same directory
- D://model:/app/model
networks:
antsk:
```
Using this as an example, the meaning is to mount the local folder D://model from Windows into the container /app/model. If so, your appsettings.json model directory should be configured as
Taking this as an example, it means mounting the local D://model folder of Windows into the container /app/model. If so, the model address in your appsettings.json should be configured as
```
model/xxx.gguf
```
Some meanings of the configuration file
Solving the missing style issue:
Execute under AntSK/src/AntSK:
## 3⃣Some meanings of configuration file
```
dotnet clean
dotnet build
dotnet publish "AntSK.csproj"
```
Then go to AntSK/src/AntSK/bin/Release/net8.0/publish
```
dotnet AntSK.dll
```
```
{
"DBConnection": {
"DbType": "Sqlite",
"DbType": "Sqlite",
"ConnectionStrings": "Data Source=AntSK.db;"
},
"KernelMemory": {
"VectorDb": "Disk",
"VectorDb": "Disk",
"ConnectionString": "Host=;Port=;Database=antsk;Username=;Password=",
"TableNamePrefix": "km-"
},
"LLamaSharp": {
"RunType": "GPU",
"Chat": "D:\\Code\\AI\\AntBlazor\\model\\qwen1_5-1_8b-chat-q8_0.gguf",
"Embedding": "D:\\Code\\AI\\AntBlazor\\model\\qwen1_5-1_8b-chat-q8_0.gguf",
"RunType": "GPU",
"FileDirectory": "D:\\Code\\AI\\AntBlazor\\model\\"
},
"Login": {
@@ -210,44 +129,86 @@ dotnet AntSK.dll
}
}
}
```
```
//Supports multiple databases, including SqlSugar, MySql, SqlServer, Sqlite, Oracle, PostgreSQL, Dm, Kdbndp, Oscar, MySqlConnector, Access, OpenGaussian, QuestDB, HG, ClickHouse, GBase, Odbc, OceanBaseForOracle, TDengine, GaussDB, OceanBase, Tidb, Vastbase, PolarDB, Custom
DBConnection DbType
//Connection string, corresponding strings need to be used according to different DB types
DBConnection ConnectionStrings
//The type of vector storage supports Postgres Disk Memory, where Postgres requires the configuration of ConnectionString
KernelMemory VectorDb
//The running mode used by the local model is GPU or CPU. If using an online API, you can freely use either one
LLamaSharp RunType
//The model path of the local session model should pay attention to distinguishing between Linux and Windows drive letters
LLamaSharp Chat
//The model path of the local vector model should pay attention to distinguishing between Linux and Windows drive letters
LLamaSharp Embedding
//Default administrator account password
// Supports various databases, you can check SqlSugar, MySql, SqlServer, Sqlite, Oracle, PostgreSQL, Dm, Kdbndp, Oscar, MySqlConnector, Access, OpenGauss, QuestDB, HG, ClickHouse, GBase, Odbc, OceanBaseForOracle, TDengine, GaussDB, OceanBase, Tidb, Vastbase, PolarDB, Custom
DBConnection.DbType
// Connection string, need to use the corresponding string according to the different DB types
DBConnection.ConnectionStrings
//The type of vector storage, supporting Postgres, Disk, Memory, Qdrant, Redis, AzureAISearch
//Postgres and Redis require ConnectionString configuration
//The ConnectionString of Qdrant and AzureAISearch uses Endpoint | APIKey
KernelMemory.VectorDb
//Local model execution options: GPU and CPU. When using the online API, any option can be used.
LLamaSharp.RunType
//Local model path, used for quick selection of models under llama, as well as saving downloaded models.
LLamaSharp.FileDirectory
//Default admin account password
Login
//The number of threads for importing asynchronous processing can be higher when using online APIs. Local models suggest 1, otherwise memory overflow and crash may occur
BackgroundTaskBroker ImportKMSTask WorkerCount
//Import asynchronous processing thread count. A higher count can be used for online API, but for local models, 1 is recommended to avoid memory overflow issues.
BackgroundTaskBroker.ImportKMSTask.WorkerCount
```
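As a concrete illustration of the Endpoint | APIKey convention described above, a KernelMemory section pointing at a Qdrant instance might look like the following sketch (the endpoint and key are placeholders):
```
"KernelMemory": {
    "VectorDb": "Qdrant",
    "ConnectionString": "http://localhost:6333|your-qdrant-api-key",
    "TableNamePrefix": "km-"
}
```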
## ⚠Fixing Style Issues:
Run the following in AntSK/src/AntSK:
```
dotnet clean
dotnet build
dotnet publish "AntSK.csproj"
```
Then navigate to AntSK/src/AntSK/bin/Release/net8.0/publish and run:
```
dotnet AntSK.dll
```
The styles should now be applied after starting.
To learn more or start using **AntSK**, you can follow my public account and join the exchange group.
I'm using CodeFirst mode for the database, so as long as the database connection is properly configured, the table structure will be created automatically.
## ✔Using llamafactory
```
1. First, ensure that Python and pip are installed in your environment. This step is not necessary if using an image, such as version v0.2.3.2, which already includes the complete Python environment.
2. Go to the model add page and select llamafactory.
3. Click "Initialize" to check whether the 'pip install' environment setup is complete.
4. Choose a model that you like.
5. Click "Start" to begin downloading the model from the tower. This may involve a somewhat lengthy wait.
6. After the model has finished downloading, enter http://localhost:8000/ in the request address. The default port is 8000.
7. Click "Save" and start chatting.
8. Many people ask about the difference between LLamaSharp and llamafactory. In fact, LLamaSharp is a .NET implementation of llama.cpp, but only supports local gguf models, while llamafactory supports a wider variety of models and uses Python implementation. The main difference lies here. Additionally, llamafactory has the ability to fine-tune models, which is an area we will focus on integrating in the future.
```
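Once a model is running, a quick way to verify the endpoint is a direct request; the sketch below assumes llamafactory exposes an OpenAI-compatible chat completions route on port 8000 (the model name is a placeholder):
```
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen", "messages": [{"role": "user", "content": "Hello"}]}'
```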
## 🤝 Contributing
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](https://github.com/AIDotNet/AntSK/pulls)
If you would like to contribute, feel free to create a [Pull Request](https://github.com/AIDotNet/AntSK/pulls), or give us [Bug Report](https://github.com/AIDotNet/AntSK/issues/new).
## 💕 Contributors
## Contact me
This project exists thanks to all the people who contribute.
If you have any questions or suggestions, please follow my public account through the following ways, and send a message to me. We also have an exchange group, which can send messages such as joining the group, and then I will bring you into the exchange group
<a href="https://github.com/AIDotNet/AntSK/graphs/contributors">
<img src="https://contrib.rocks/image?repo=AIDotNet/AntSK&max=1000&columns=15&anon=1" />
</a>
![Official account](https://github.com/xuzeyu91/Avalonia-Assistant/blob/main/img/gzh.jpg)
## 🚨 Code of Conduct
This project has adopted the code of conduct defined by the Contributor Covenant to clarify expected behavior in our community.
For more information see the [.NET Foundation Code of Conduct](https://dotnetfoundation.org/code-of-conduct).
To learn more or get started with **AntSK**, follow my official WeChat account and join the discussion group.
## ☎Contact Me
If you have any questions or suggestions, please contact me through my official WeChat account. We also have a discussion group where you can send a message to join, and then I will add you to the group.
![Official WeChat Account](https://github.com/AIDotNet/Avalonia-Assistant/blob/main/img/gzh.jpg)
---
We appreciate your interest in **AntSK** and look forward to working with you to create an intelligent future!
We appreciate your interest in **AntSK** and look forward to collaborating with you to create an intelligent future!

133
README.md

@@ -1,14 +1,16 @@
中文|[English](https://github.com/xuzeyu91/AntSK/blob/main/README.en.md)
中文|[English](https://github.com/AIDotNet/AntSK/blob/main/README.en.md)
# AntSK
## 基于.Net8+AntBlazor+SemanticKernel 打造的AI知识库/智能体
## 使用.Net8+Blazor+SemanticKernel 打造的AI知识库/智能体
## 核心功能
## 核心功能
- **语义内核 (Semantic Kernel)**:采用领先的自然语言处理技术,准确理解、处理和响应复杂的语义查询,为用户提供精确的信息检索和推荐服务。
- **内存内核 (Kernel Memory)**:具备持续学习和存储知识点的能力,AntSK 拥有长期记忆功能,累积经验,提供更个性化的交互体验。
- **知识库**:通过文档(Word、PDF、Excel、Txt、Markdown、Json、PPT)等形式导入知识库,可以进行知识库文档
- **知识库**:通过文档(Word、PDF、Excel、Txt、Markdown、Json、PPT)等形式导入知识库,可以进行知识库问答,支持本地bge-embedding 向量模型 以及bge-rerank 重排模型
- **文生图**:集成**StableDiffusion** 本地模型,可以进行文生图。
- **GPTs 生成**:此平台支持创建个性化的GPT模型,尝试构建您自己的GPT模型。
@@ -20,11 +22,14 @@
- **联网搜索**:AntSK实时获取最新信息,确保用户接收到的资料总是最及时、最相关的。
- **模型管理**:适配和管理集成不同厂商的不同模型。并且支持llama.cpp所支持的gguf类型的模型离线运行
- **模型管理**:适配和管理集成不同厂商的不同模型。并且支持**llama.cpp**所支持的gguf类型,以及**llamafactory**所支持的模型离线运行
- **国产信创**:AntSK支持国产模型和国产数据库,可以在信创条件下运行
## 应用场景
- **模型微调**:(规划中)基于llamafactory进行模型微调
## ⛪应用场景
AntSK 适用于多种业务场景,例如:
- 企业级知识管理系统
@@ -35,47 +40,39 @@ AntSK 适用于多种业务场景,例如:
- 教育与在线学习平台
- 其他有意思的AI App
## 功能示例
## ✏️功能示例
### 在线演示
```
https://antsk.ai-dotnet.com/
```
```
默认账号:test
默认密码:test
由于云服务器配置较低,无法运行本地模型,所以把系统设置权限关闭了,大家看看界面即可,要使用本地模型,请下载自行使用
请勿在演示站点上传敏感信息
```
### 其他功能示例
[视频示例](https://www.bilibili.com/video/BV1zH4y1h7Y9/)
首先需要创建知识库
![知识库](https://github.com/xuzeyu91/AntSK/blob/main/images/%E7%9F%A5%E8%AF%86%E5%BA%93.png)
[在线文档http://antsk.cn](http://antsk.cn)
在知识库里可以使用文档或者url进行导入
![知识库详情](https://github.com/xuzeyu91/AntSK/blob/main/images/%E7%9F%A5%E8%AF%86%E5%BA%93%E8%AF%A6%E6%83%85.png)
点击查看可以查看知识库的文档切片情况
![文档切片](https://github.com/xuzeyu91/AntSK/blob/main/images/%E6%96%87%E6%A1%A3%E5%88%87%E7%89%87.png)
然后我们需要创建应用,可以创建对话应用和知识库。
![应用](https://github.com/xuzeyu91/AntSK/blob/main/images/%E5%BA%94%E7%94%A8.png)
知识库应用需要选择已有的知识库,可以选多个
![应用配置](https://github.com/xuzeyu91/AntSK/blob/main/images/%E5%BA%94%E7%94%A8%E9%85%8D%E7%BD%AE.png)
然后再对话中可以对知识库的文档进行提问
![问答](https://github.com/xuzeyu91/AntSK/blob/main/images/%E9%97%AE%E7%AD%94.png)
另外我们也可以创建对话应用,可以在对应应用中配置提示词模板
![对话应用](https://github.com/xuzeyu91/AntSK/blob/main/images/%E7%AE%80%E5%8D%95%E5%AF%B9%E8%AF%9D.png)
下面来看看效果吧
![对话效果](https://github.com/xuzeyu91/AntSK/blob/main/images/%E5%AF%B9%E8%AF%9D%E6%95%88%E6%9E%9C.png)
## 如何开始?
## ❓如何开始?
在这里我使用的是Postgres 作为数据存储和向量存储因为Semantic Kernel和Kernel Memory都支持他当然你也可以换成其他的。
模型默认支持openai、azure openai 和llama支持的gguf本地模型,如果需要使用其他模型可以使用one-api进行集成。
模型默认支持openai、azure openai、讯飞星火、阿里云积、 和llama支持的gguf本地模型 以及llamafactory的本地模型,如果需要使用其他模型可以使用one-api进行集成。
配置文件中的Login配置是默认的登陆账号和密码
配置文件中的Login配置是默认的登录账号和密码
需要配置如下的配置文件
## 使用docker-compose
## 1使用docker-compose
提供了pg版本 **appsettings.json** 和 简化版本Sqlite+disk **docker-compose.simple.yml**
提供了pg版本 **appsettings.json** 和 简化版本(**Sqlite+disk**) **docker-compose.simple.yml**
从项目根目录下载**docker-compose.yml**,然后把配置文件**appsettings.json**和它放在统一目录,
@@ -87,14 +84,14 @@ docker-compose up -d
```
来启动AntSK
## 如何在docker中挂载本地模型和模型下载的目录
## 2如何在docker中挂载本地模型和模型下载的目录
```
# 非 host 版本, 不使用本机代理
version: '3.8'
services:
antsk:
container_name: antsk
image: registry.cn-hangzhou.aliyuncs.com/xuzeyu91/antsk:v0.1.5
image: registry.cn-hangzhou.aliyuncs.com/AIDotNet/antsk:v0.3.1
ports:
- 5000:5000
networks:
@@ -107,6 +104,7 @@ services:
volumes:
- ./appsettings.json:/app/appsettings.json # 本地配置文件 需要放在同级目录
- D://model:/app/model
- D://model:/root/.cache/modelscope/hub/AI-ModelScope #使用Llamafactory时需要挂载 否则初始化的环境重启后会丢失
networks:
antsk:
```
@@ -115,7 +113,7 @@ networks:
model/xxx.gguf
```
## 配置文件的一些含义
## 3配置文件的一些含义
```
{
"DBConnection": {
@@ -129,8 +127,6 @@ model/xxx.gguf
},
"LLamaSharp": {
"RunType": "GPU",
"Chat": "D:\\Code\\AI\\AntBlazor\\model\\qwen1_5-1_8b-chat-q8_0.gguf",
"Embedding": "D:\\Code\\AI\\AntBlazor\\model\\qwen1_5-1_8b-chat-q8_0.gguf",
"FileDirectory": "D:\\Code\\AI\\AntBlazor\\model\\"
},
"Login": {
@@ -150,15 +146,14 @@ DBConnection.DbType
//连接字符串需要根据不同DB类型用对应的字符串
DBConnection.ConnectionStrings
//向量存储的类型,支持 Postgres Disk Memory 其中Postgres需要配置 ConnectionString
//向量存储的类型,支持 PostgresDiskMemory、Qdrant、Redis、AzureAISearch
//Postgres、Redis需要配置 ConnectionString
//Qdrant 和AzureAISearch 的 ConnectionString 使用 Endpoint|APIKey
KernelMemory.VectorDb
//本地模型使用的运行方式 GPU CPU ,如果用在线API 这个随意使用一个即可
LLamaSharp.RunType
//本地会话模型的模型路径 注意区分linux和windows盘符不同
LLamaSharp.Chat
//本地向量模型的模型路径 注意区分linux和windows盘符不同
LLamaSharp.Embedding
//本地模型路径用于在选择llama时可以快速选择目录下的模型以及保存下载的模型
LLamaSharp.FileDirectory
@@ -168,7 +163,7 @@ Login
BackgroundTaskBroker.ImportKMSTask.WorkerCount
```
## 找不到样式问题解决:
## ⚠️找不到样式问题解决:
AntSK/src/AntSK下执行:
```
dotnet clean
@@ -183,15 +178,49 @@ dotnet AntSK.dll
DB我使用的是CodeFirst模式,只要配置好数据库链接,表结构是自动创建的
## ✔使用llamafactory
```
1、首先需要确保你的环境已经安装了python和pip,如果使用镜像(例如p0.2.4版本已经包含了python全套环境)则无需此步骤
2、进入模型添加页面选择llamafactory
3、点击初始化可以检查pip install 环境是否完成
4、选择一个喜欢的模型
5、点击启动,这会开始从魔塔下载模型,你可能需要有一个较为漫长的等待
6、等待模型下载完毕后在请求地址输入 http://localhost:8000/ 这里默认是使用8000端口
7、点击保存然后就可以开始聊天了
8、很多人会问 LLamaSharp与llamafactory有什么区别其实这两者LLamaSharp是llama.cpp的 dotnet实现但是只支持本地gguf模型 而llamafactory 支持的模型种类更多但使用的是python的实现其主要差异在这里另外llamafactory具有模型微调的能力这也是我们下一步需要重点集成的部分。
```
## 🤝 贡献
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](https://github.com/AIDotNet/AntSK/pulls)

如果你想贡献,可以创建一个[拉取请求](https://github.com/AIDotNet/AntSK/pulls), 或给我们[错误报告](https://github.com/AIDotNet/AntSK/issues/new).


## 💕 贡献者
这个项目的存在要感谢所有的贡献者。

<a href="https://github.com/AIDotNet/AntSK/graphs/contributors">
<img src="https://contrib.rocks/image?repo=AIDotNet/AntSK&max=1000&columns=15&anon=1" />
</a>

## 🚨 行为准则
该项目采用了贡献者公约定义的行为准则,以阐明我们社区的预期行为。有关更多信息,请参见 .NET Foundation 行为准则。 [.NET Foundation Code of Conduct](https://dotnetfoundation.org/code-of-conduct).
想了解更多信息或开始使用 **AntSK**,可以关注我的公众号以及加入交流群。
## 联系我
如有任何问题或建议,请通过以下方式关注我的公众号,发消息与我联系,我们也有交流群,可以发送进群等消息,然后我会拉你进交流群
![公众号](https://github.com/xuzeyu91/Avalonia-Assistant/blob/main/img/gzh.jpg)
## ☎️联系我
如有任何问题或建议,请通过以下方式关注我的公众号《许泽宇的技术分享》,发消息与我联系,我们也有AIDotnet交流群,可以发送进群等消息,然后我会拉你进交流群
![公众号](https://github.com/AIDotNet/AntSK/blob/main/images/gzh.jpg)
---
我们对您在**AntSK**的兴趣表示感谢,并期待与您携手共创智能化的未来!
## 🌟 Star History
<a href="https://github.com/AIDotNet/AntSK/stargazers" target="_blank" style="display: block" align="center">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=AIDotNet/AntSK&type=Date&theme=dark" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=AIDotNet/AntSK&type=Date" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=AIDotNet/AntSK&type=Date" />
</picture>
</a>


@@ -3,7 +3,9 @@ version: '3.8'
services:
antsk:
container_name: antsk
image: registry.cn-hangzhou.aliyuncs.com/xuzeyu91/antsk:v0.2.1
image: registry.cn-hangzhou.aliyuncs.com/xuzeyu91/antsk:v0.3.1
# 如果需要pytorch环境需要使用下面这个镜像镜像比较大
# image: registry.cn-hangzhou.aliyuncs.com/xuzeyu91/antsk:p0.3.1
ports:
- 5000:5000
networks:


@@ -18,7 +18,9 @@ services:
- ./pg/data:/var/lib/postgresql/data
antsk:
container_name: antsk
image: registry.cn-hangzhou.aliyuncs.com/xuzeyu91/antsk:v0.2.1
image: registry.cn-hangzhou.aliyuncs.com/xuzeyu91/antsk:v0.3.1
# 如果需要pytorch环境需要使用下面这个镜像镜像比较大
# image: registry.cn-hangzhou.aliyuncs.com/xuzeyu91/antsk:p0.3.1
ports:
- 5000:5000
networks:


@@ -0,0 +1,14 @@
{
"position": 3,
"label": "部署",
"collapsible": true,
"collapsed": false,
"className": "red",
"link": {
"type": "generated-index",
"title": "使用案例"
},
"customProps": {
"description": "提供快速使用AntSK的一些案例"
}
}

56
docs/deploy/settings.md Normal file

@@ -0,0 +1,56 @@
---
sidebar_position: 2
---
# 配置文件的一些含义
```
{
"DBConnection": {
"DbType": "Sqlite",
"ConnectionStrings": "Data Source=AntSK.db;"
},
"KernelMemory": {
"VectorDb": "Disk",
"ConnectionString": "Host=;Port=;Database=antsk;Username=;Password=",
"TableNamePrefix": "km-"
},
"LLamaSharp": {
"RunType": "GPU",
"Chat": "D:\\Code\\AI\\AntBlazor\\model\\qwen1_5-1_8b-chat-q8_0.gguf",
"Embedding": "D:\\Code\\AI\\AntBlazor\\model\\qwen1_5-1_8b-chat-q8_0.gguf",
"FileDirectory": "D:\\Code\\AI\\AntBlazor\\model\\"
},
"Login": {
"User": "admin",
"Password": "xuzeyu"
},
"BackgroundTaskBroker": {
"ImportKMSTask": {
"WorkerCount": 1
}
}
}
```
```
//支持多种数据库,具体可以查看SqlSugar、MySql、SqlServer、Sqlite、Oracle、PostgreSQL、Dm、Kdbndp、Oscar、MySqlConnector、Access、OpenGauss、QuestDB、HG、ClickHouse、GBase、Odbc、OceanBaseForOracle、TDengine、GaussDB、OceanBase、Tidb、Vastbase、PolarDB、Custom
DBConnection.DbType
//连接字符串需要根据不同DB类型用对应的字符串
DBConnection.ConnectionStrings
//向量存储的类型,支持 Postgres Disk Memory 其中Postgres需要配置 ConnectionString
KernelMemory.VectorDb
//本地模型使用的运行方式 GPU CPU ,如果用在线API 这个随意使用一个即可
LLamaSharp.RunType
//本地会话模型的模型路径 注意区分linux和windows盘符不同
LLamaSharp.Chat
//本地向量模型的模型路径 注意区分linux和windows盘符不同
LLamaSharp.Embedding
//本地模型路径用于在选择llama时可以快速选择目录下的模型以及保存下载的模型
LLamaSharp.FileDirectory
//默认管理员账号密码
Login
//导入异步处理的线程数,使用在线API可以高一点,本地模型建议1,否则容易内存溢出崩掉
BackgroundTaskBroker.ImportKMSTask.WorkerCount
```

57
docs/deploy/start.md Normal file

@@ -0,0 +1,57 @@
---
sidebar_position: 1
---
# 如何开始?
在这里我使用的是Postgres 作为数据存储和向量存储因为Semantic Kernel和Kernel Memory都支持他当然你也可以换成其他的。
模型默认支持openai、azure openai 和llama支持的gguf本地模型,如果需要使用其他模型可以使用one-api进行集成。
配置文件中的Login配置是默认的登陆账号和密码
需要配置如下的配置文件
## 使用docker-compose
提供了pg版本 **appsettings.json** 和 简化版本Sqlite+disk **docker-compose.simple.yml**
从项目根目录下载**docker-compose.yml**,然后把配置文件**appsettings.json**和它放在统一目录,
这里已经把pg的镜像做好了。在docker-compose.yml中可以修改默认账号密码,然后你的**appsettings.json**的数据库连接需要保持一致。
然后你可以进入到目录后执行
```
docker-compose up -d
```
来启动AntSK
## 如何在docker中挂载本地模型和模型下载的目录
```
# 非 host 版本, 不使用本机代理
version: '3.8'
services:
antsk:
container_name: antsk
image: registry.cn-hangzhou.aliyuncs.com/AIDotNet/antsk:v0.1.5
ports:
- 5000:5000
networks:
- antsk
depends_on:
- antskpg
restart: always
environment:
- ASPNETCORE_URLS=http://*:5000
volumes:
- ./appsettings.json:/app/appsettings.json # 本地配置文件 需要放在同级目录
- D://model:/app/model
networks:
antsk:
```
以这个为示例,意思是把windows本地D://model的文件夹挂载进容器内/app/model,如果是这样,你的appsettings.json中的模型地址应该配置为
```
model/xxx.gguf
```
DB我使用的是CodeFirst模式,只要配置好数据库链接,表结构是自动创建的

16
docs/deploy/style.md Normal file

@@ -0,0 +1,16 @@
---
sidebar_position: 3
---
# 找不到样式问题解决
AntSK/src/AntSK下执行:
```
dotnet clean
dotnet build
dotnet publish "AntSK.csproj"
```
再去AntSK/src/AntSK/bin/Release/net8.0/publish下
```
dotnet AntSK.dll
```
然后启动就有样式了


@@ -0,0 +1,14 @@
{
"position": 2,
"label": "快速开发",
"collapsible": true,
"collapsed": false,
"className": "red",
"link": {
"type": "generated-index",
"title": "快速开发"
},
"customProps": {
"description": "快速基于项目二次开发!"
}
}


@@ -0,0 +1,14 @@
{
"position": 2,
"label": "介绍",
"collapsible": true,
"collapsed": false,
"className": "red",
"link": {
"type": "generated-index",
"title": "使用案例"
},
"customProps": {
"description": "提供快速使用AntSK的一些案例"
}
}

[8 binary image files added (documentation screenshots): 101, 54, 53, 202, 47, 48, 55, 170 KiB]

70
docs/introduce/readme.md Normal file

@@ -0,0 +1,70 @@
---
sidebar_position: 1
---
# AntSK功能介绍
## 基于.Net8+AntBlazor+SemanticKernel 打造的AI知识库/智能体
## 核心功能
- **语义内核 (Semantic Kernel)**:采用领先的自然语言处理技术,准确理解、处理和响应复杂的语义查询,为用户提供精确的信息检索和推荐服务。
- **内存内核 (Kernel Memory)**:具备持续学习和存储知识点的能力,AntSK 拥有长期记忆功能,累积经验,提供更个性化的交互体验。
- **知识库**:通过文档(Word、PDF、Excel、Txt、Markdown、Json、PPT)等形式导入知识库,可以进行知识库问答。
- **GPTs 生成**:此平台支持创建个性化的GPT模型,尝试构建您自己的GPT模型。
- **API接口发布**:将内部功能以API的形式对外提供,便于开发者将AntSK 集成进其他应用,增强应用智慧。
- **API插件系统**:开放式API插件系统,允许第三方开发者或服务商轻松将其服务集成到AntSK,不断增强应用功能。
- **.Net插件系统**:开放式dll插件系统,允许第三方开发者或服务商轻松将其业务功能通过标准格式的代码生成dll后集成到AntSK,不断增强应用功能。
- **联网搜索**:AntSK实时获取最新信息,确保用户接收到的资料总是最及时、最相关的。
- **模型管理**:适配和管理集成不同厂商的不同模型。并且支持**llama.cpp**所支持的gguf类型,以及**llamafactory**所支持的模型离线运行
- **国产信创**:AntSK支持国产模型和国产数据库,可以在信创条件下运行
- **模型微调**:(规划中)基于llamafactory进行模型微调
## 应用场景
AntSK 适用于多种业务场景,例如:
- 企业级知识管理系统
- 自动客服与聊天机器人
- 企业级搜索引擎
- 个性化推荐系统
- 智能辅助写作
- 教育与在线学习平台
- 其他有意思的AI App
## 功能示例
[视频示例](https://www.bilibili.com/video/BV1zH4y1h7Y9/)
首先需要创建知识库
![知识库](./img/知识库.png)
在知识库里可以使用文档或者url进行导入
![知识库详情](./img/知识库详情.png)
点击查看可以查看知识库的文档切片情况
![文档切片](./img/文档切片.png)
然后我们需要创建应用,可以创建对话应用和知识库。
![应用](./img/应用.png)
知识库应用需要选择已有的知识库,可以选多个
![应用配置](./img/应用配置.png)
然后再对话中可以对知识库的文档进行提问
![问答](./img/问答.png)
另外我们也可以创建对话应用,可以在对应应用中配置提示词模板
![对话应用](./img/简单对话.png)
下面来看看效果吧
![对话效果](./img/对话效果.png)

BIN
images/gzh.jpg Normal file

[binary image added, 180 KiB]


@@ -0,0 +1,53 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net8.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
<DocumentationFile>AntSK.Domain.xml</DocumentationFile>
<NoWarn>CA1050,CA1707,CA2007,VSTHRD111,CS1591,RCS1110,CA5394,SKEXP0001,SKEXP0002,SKEXP0003,SKEXP0004,SKEXP0010,SKEXP0011,,SKEXP0012,SKEXP0020,SKEXP0021,SKEXP0022,SKEXP0023,SKEXP0024,SKEXP0025,SKEXP0026,SKEXP0027,SKEXP0028,SKEXP0029,SKEXP0030,SKEXP0031,SKEXP0032,SKEXP0040,SKEXP0041,SKEXP0042,SKEXP0050,SKEXP0051,SKEXP0052,SKEXP0053,SKEXP0054,SKEXP0055,SKEXP0060,SKEXP0061,SKEXP0101,SKEXP0102</NoWarn>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="AntDesign.Charts" Version="0.5.1" />
<PackageReference Include="AntDesign.ProLayout" Version="0.18.2" />
<PackageReference Include="BlazorComponents.Terminal" Version="0.6.0" />
<PackageReference Include="Blazored.LocalStorage" Version="4.5.0" />
<PackageReference Include="pythonnet" Version="3.0.3" />
<PackageReference Include="Swashbuckle.AspNetCore" Version="6.5.0" />
<PackageReference Include="AutoMapper" Version="8.1.0" />
<PackageReference Include="BCrypt.Net-Next" Version="4.0.3" />
<PackageReference Include="Markdig" Version="0.37.0" />
<PackageReference Include="Newtonsoft.Json" Version="13.0.3" />
<PackageReference Include="SqlSugarCore" Version="5.1.4.151" />
<PackageReference Include="System.Data.SQLite.Core" Version="1.0.118" />
<PackageReference Include="RestSharp" Version="110.2.0" />
<PackageReference Include="NPOI" Version="2.7.0" />
<PackageReference Include="Microsoft.SemanticKernel" Version="1.7.1" />
<PackageReference Include="Microsoft.SemanticKernel.Core" Version="1.7.1" />
<PackageReference Include="Microsoft.SemanticKernel.Plugins.Core" Version="1.7.1-alpha" />
<PackageReference Include="Microsoft.KernelMemory.Core" Version="0.36.240415.2" />
<PackageReference Include="Microsoft.KernelMemory.MemoryDb.Postgres" Version="0.36.240415.2" />
<PackageReference Include="Microsoft.KernelMemory.MemoryDb.Qdrant" Version="0.36.240415.2" />
<PackageReference Include="Microsoft.KernelMemory.MemoryDb.Redis" Version="0.36.240415.2" />
<PackageReference Include="Microsoft.KernelMemory.MemoryDb.AzureAISearch" Version="0.36.240415.2" />
<PackageReference Include="LLamaSharp" Version="0.11.2" />
<PackageReference Include="LLamaSharp.Backend.Cpu" Version="0.11.2" />
<PackageReference Include="LLamaSharp.Backend.Cuda12" Version="0.11.2" />
<PackageReference Include="LLamaSharp.kernel-memory" Version="0.11.2" />
<PackageReference Include="LLamaSharp.semantic-kernel" Version="0.11.2" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\AntSK.LLamaFactory\AntSK.LLamaFactory.csproj" />
<ProjectReference Include="..\AntSk.LLM\AntSK.LLM.csproj" />
<ProjectReference Include="..\AntSK.OCR\AntSK.OCR.csproj" />
<ProjectReference Include="..\MiddleWare\AntSK.BackgroundTask\AntSK.BackgroundTask.csproj" />
</ItemGroup>
</Project>


@@ -9,32 +9,44 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="AntDesign.Charts" Version="0.5.1" />
<PackageReference Include="AntDesign.ProLayout" Version="0.18.0" />
<PackageReference Include="AntDesign.ProLayout" Version="0.18.2" />
<PackageReference Include="BlazorComponents.Terminal" Version="0.6.0" />
<PackageReference Include="Blazored.LocalStorage" Version="4.5.0" />
<PackageReference Include="pythonnet" Version="3.0.3" />
<PackageReference Include="Swashbuckle.AspNetCore" Version="6.5.0" />
<PackageReference Include="AutoMapper" Version="8.1.0" />
<PackageReference Include="BCrypt.Net-Next" Version="4.0.3" />
<PackageReference Include="Markdig" Version="0.36.2" />
<PackageReference Include="Markdig" Version="0.37.0" />
<PackageReference Include="Newtonsoft.Json" Version="13.0.3" />
<PackageReference Include="SqlSugarCore" Version="5.1.4.145" />
<PackageReference Include="SqlSugarCore" Version="5.1.4.152" />
<PackageReference Include="System.Data.SQLite.Core" Version="1.0.118" />
<PackageReference Include="RestSharp" Version="110.2.0" />
<PackageReference Include="NPOI" Version="2.7.0" />
<PackageReference Include="Microsoft.SemanticKernel" Version="1.6.2" />
<PackageReference Include="Microsoft.SemanticKernel.Core" Version="1.6.2" />
<PackageReference Include="Microsoft.SemanticKernel.Plugins.Core" Version="1.6.2-alpha" />
<PackageReference Include="Microsoft.KernelMemory.Core" Version="0.34.240313.1" />
<PackageReference Include="Microsoft.KernelMemory.MemoryDb.Postgres" Version="0.34.240313.1" />
<PackageReference Include="Microsoft.SemanticKernel" Version="1.8.0" />
<PackageReference Include="Microsoft.SemanticKernel.Core" Version="1.8.0" />
<PackageReference Include="Microsoft.SemanticKernel.Plugins.Core" Version="1.8.0-alpha" />
<PackageReference Include="Microsoft.KernelMemory.Core" Version="$(KMVersion)" />
<PackageReference Include="Microsoft.KernelMemory.MemoryDb.Postgres" Version="$(KMVersion)" />
<PackageReference Include="Microsoft.KernelMemory.MemoryDb.Qdrant" Version="$(KMVersion)" />
<PackageReference Include="Microsoft.KernelMemory.MemoryDb.Redis" Version="$(KMVersion)" />
<PackageReference Include="Microsoft.KernelMemory.MemoryDb.AzureAISearch" Version="$(KMVersion)" />
<PackageReference Include="LLamaSharp" Version="0.10.0" />
<PackageReference Include="LLamaSharp.Backend.Cpu" Version="0.10.0" />
<PackageReference Include="LLamaSharp.Backend.Cuda12" Version="0.10.0" />
<PackageReference Include="LLamaSharp.kernel-memory" Version="0.10.0" />
<PackageReference Include="LLamaSharp.semantic-kernel" Version="0.10.0" />
<PackageReference Include="LLamaSharp" Version="$(LLamaSharpVersion)" />
<PackageReference Include="LLamaSharp.Backend.Cpu" Version="$(LLamaSharpVersion)" />
<PackageReference Include="LLamaSharp.Backend.Cuda12" Version="$(LLamaSharpVersion)" />
<PackageReference Include="LLamaSharp.kernel-memory" Version="$(LLamaSharpVersion)" />
<PackageReference Include="LLamaSharp.semantic-kernel" Version="$(LLamaSharpVersion)" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\AntSK.LLamaFactory\AntSK.LLamaFactory.csproj" />
<ProjectReference Include="..\AntSk.LLM\AntSK.LLM.csproj" />
<ProjectReference Include="..\AntSK.OCR\AntSK.OCR.csproj" />
<ProjectReference Include="..\MiddleWare\AntSK.BackgroundTask\AntSK.BackgroundTask.csproj" />
</ItemGroup>

View File

@@ -17,6 +17,27 @@
<param name="assemblies">程序集集合</param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.Common.DependencyInjection.InitExtensions.CodeFirst(Microsoft.AspNetCore.Builder.WebApplication)">
<summary>
使用codefirst创建数据库表
</summary>
<param name="services"></param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.Common.DependencyInjection.InitExtensions.LoadFun(Microsoft.AspNetCore.Builder.WebApplication)">
<summary>
加载数据库的插件
</summary>
<param name="services"></param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.Common.DependencyInjection.InitExtensions.AddAntSKSwagger(Microsoft.Extensions.DependencyInjection.IServiceCollection)">
<summary>
swagger 初始化
</summary>
<param name="serviceCollection"></param>
<returns></returns>
</member>
<member name="F:AntSK.Domain.Common.DependencyInjection.ServiceLifetime.Scoped">
<summary>
Scoped
@@ -48,6 +69,84 @@
<param name="value"></param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.ExeclHelper.ExcelToDataTable(System.String,System.Boolean)">
<summary>
将excel导入到datatable
</summary>
<param name="filePath">excel路径</param>
<param name="isColumnName">第一行是否是列名</param>
<returns>返回datatable</returns>
</member>
<member name="M:AntSK.Domain.ExeclHelper.ExcelToDataTable(System.IO.Stream,System.Boolean)">
<summary>
将excel导入到datatable
</summary>
<param name="stream"></param>
<param name="isColumnName">第一行是否是列名</param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.ExeclHelper.ExcelToList``1(System.IO.Stream)">
<summary>
excel转list
</summary>
<typeparam name="TResult"></typeparam>
<param name="stream"></param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.ExeclHelper.ExcelToList``1(System.IO.Stream,System.String)">
<summary>
excel转list-根据sheetName得到List
</summary>
<typeparam name="TResult"></typeparam>
<param name="stream"></param>
<param name="sheetName"></param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.ExeclHelper.ListToExcel``1(``0[],System.String)">
<summary>
List导出excel 二进制流
</summary>
<typeparam name="T">实体</typeparam>
<param name="data">List</param>
<param name="sheetName">sheetname 可不填默认Sheet0</param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.ExeclHelper.DataTableToExcel(System.Data.DataTable,System.String,System.String)">
<summary>
Dt导出excel 二进制流
</summary>
<param name="dt">datatable</param>
<param name="strFile">strFile</param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.ExeclHelper.ListWriteExcel``1(``0[],System.String,System.String)">
<summary>
List写入excel
</summary>
<typeparam name="T"></typeparam>
<param name="data"></param>
<param name="strFile">路径</param>
<param name="sheetName"></param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.ExeclHelper.DataTableWriteExcel(System.Data.DataTable,System.String,System.String)">
<summary>
dt写入excel
</summary>
<param name="dt">datatable</param>
<param name="strFile">路径</param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.ExeclHelper.SetCellDropdownList(NPOI.SS.UserModel.IWorkbook,NPOI.SS.UserModel.ISheet,System.Collections.Generic.List{System.String},System.String,System.Int32,System.Int32,System.Int32)">
<summary>
设置单元格下拉框(除去标题行)
</summary>
<param name="workbook"></param>
<param name="sheet"></param>
<param name="ddlList"></param>
<param name="firstcol"></param>
<param name="lastcol"></param>
</member>
<member name="T:AntSK.Domain.Domain.Model.Enum.AIType">
<summary>
AI type
@@ -58,11 +157,6 @@
Model type
</summary>
</member>
<member name="P:AntSK.Domain.Domain.Model.MessageInfo.IsSend">
<summary>
true when sending, false when receiving
</summary>
</member>
<member name="P:AntSK.Domain.Domain.Model.PageList`1.PageIndex">
<summary>
Current page, starting from 1
@@ -78,12 +172,34 @@
Total count
</summary>
</member>
<member name="M:AntSK.Domain.Domain.Other.Bge.BegRerankConfig.LoadModel(System.String,System.String)">
<summary>
模型写死
</summary>
</member>
<member name="M:AntSK.Domain.Domain.Other.Bge.BgeEmbeddingConfig.LoadModel(System.String,System.String)">
<summary>
模型写死
</summary>
</member>
<member name="P:AntSK.Domain.Domain.Other.KMExcelHandler.StepName">
<inheritdoc />
</member>
<member name="M:AntSK.Domain.Domain.Other.KMExcelHandler.InvokeAsync(Microsoft.KernelMemory.Pipeline.DataPipeline,System.Threading.CancellationToken)">
<inheritdoc />
</member>
<member name="F:AntSK.Domain.Domain.Other.LLamaConfig.dicLLamaWeights">
<summary>
Local cache to avoid loading the model repeatedly
</summary>
</member>
<member name="M:AntSK.Domain.Domain.Service.ChatService.SendChatByAppAsync(AntSK.Domain.Repositories.Apps,System.String,System.String)">
<member name="P:AntSK.Domain.Domain.Other.QAHandler.StepName">
<inheritdoc />
</member>
<member name="M:AntSK.Domain.Domain.Other.QAHandler.InvokeAsync(Microsoft.KernelMemory.Pipeline.DataPipeline,System.Threading.CancellationToken)">
<inheritdoc />
</member>
<member name="M:AntSK.Domain.Domain.Service.ChatService.SendChatByAppAsync(AntSK.Domain.Repositories.Apps,System.String,Microsoft.SemanticKernel.ChatCompletion.ChatHistory)">
<summary>
Send a message
</summary>
@@ -266,6 +382,56 @@
API access key
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Apps.Relevance">
<summary>
相似度
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Apps.MaxAskPromptSize">
<summary>
提问最大token数
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Apps.MaxMatchesCount">
<summary>
向量匹配数
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Apps.AnswerTokens">
<summary>
回答最大token数
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Chats.UserName">
<summary>
用户名
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Chats.AppId">
<summary>
应用ID
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Chats.Context">
<summary>
消息内容
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Chats.IsSend">
<summary>
发送是true 接收是false
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Chats.CreateTime">
<summary>
创建事件
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Chats.FileName">
<summary>
文件名
</summary>
</member>
<member name="P:AntSK.Domain.Repositories.Funs.Path">
<summary>
API description
@@ -750,6 +916,14 @@
<param name="parameters"></param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.Utils.ConvertUtils.ComparisonIgnoreCase(System.String,System.String)">
<summary>
Case-insensitive string comparison
</summary>
<param name="s"></param>
<param name="value"></param>
<returns></returns>
</member>
<member name="M:AntSK.Domain.Utils.RepoFiles.SamplePluginsPath">
<summary>
Scan the local folders from the repo, looking for "samples/plugins" folder.

View File

@@ -0,0 +1,166 @@
using AntSK.Domain.Domain.Model.Constant;
using AntSK.Domain.Domain.Service;
using AntSK.Domain.Repositories;
using DocumentFormat.OpenXml.Office2016.Drawing.ChartDrawing;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.OpenApi.Models;
using SqlSugar;
using Swashbuckle.AspNetCore.SwaggerGen;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Text;
using System.Threading.Tasks;
namespace AntSK.Domain.Common.DependencyInjection
{
public static class InitExtensions
{
/// <summary>
/// Create database tables using code-first
/// </summary>
/// <param name="services"></param>
/// <returns></returns>
public static WebApplication CodeFirst(this WebApplication app)
{
using (var scope = app.Services.CreateScope())
{
// Resolve the repository service
var _repository = scope.ServiceProvider.GetRequiredService<IApps_Repositories>();
// Create the database if it does not exist
_repository.GetDB().DbMaintenance.CreateDatabase();
// Get all assemblies in the current AppDomain
var assemblies = AppDomain.CurrentDomain.GetAssemblies();
// Scan every assembly for classes marked with [SugarTable]
foreach (var assembly in assemblies)
{
// Types in this assembly that carry the SugarTable attribute
var entityTypes = assembly.GetTypes()
.Where(type => TypeIsEntity(type));
// Initialize a database table for each type found
foreach (var type in entityTypes)
{
_repository.GetDB().CodeFirst.InitTables(type);
}
}
// Install the pgvector extension (run synchronously so startup waits for it)
_repository.GetDB().Ado.ExecuteCommand($"CREATE EXTENSION IF NOT EXISTS vector;");
}
return app;
}
public static WebApplication InitDbData(this WebApplication app)
{
using (var scope = app.Services.CreateScope())
{
// Seed the dictionary table
var _dic_Repository = scope.ServiceProvider.GetRequiredService<IDics_Repositories>();
var llamafactoryStart = _dic_Repository.GetFirst(p => p.Type == LLamaFactoryConstantcs.LLamaFactorDic && p.Key == LLamaFactoryConstantcs.IsStartKey);
if (llamafactoryStart == null)
{
llamafactoryStart = new Dics();
llamafactoryStart.Id = Guid.NewGuid().ToString();
llamafactoryStart.Type = LLamaFactoryConstantcs.LLamaFactorDic;
llamafactoryStart.Key = LLamaFactoryConstantcs.IsStartKey;
llamafactoryStart.Value = "false";
_dic_Repository.Insert(llamafactoryStart);
}
}
return app;
}
/// <summary>
/// Load function plugins from the database
/// </summary>
/// <param name="services"></param>
/// <returns></returns>
public static WebApplication LoadFun(this WebApplication app)
{
try
{
using (var scope = app.Services.CreateScope())
{
// Load every registered function plugin by its path
var funRep = scope.ServiceProvider.GetRequiredService<IFuns_Repositories>();
var functionService = scope.ServiceProvider.GetRequiredService<FunctionService>();
var funs = funRep.GetList();
foreach (var fun in funs)
{
functionService.FuncLoad(fun.Path);
}
}
}
catch (Exception ex)
{
Console.WriteLine(ex.Message + " ---- " + ex.StackTrace);
}
return app;
}
private static bool TypeIsEntity(Type type)
{
// Check whether the type carries the SugarTable attribute
return type.GetCustomAttributes(typeof(SugarTable), inherit: false).Length > 0;
}
/// <summary>
/// Swagger initialization
/// </summary>
/// <param name="serviceCollection"></param>
/// <returns></returns>
public static IServiceCollection AddAntSKSwagger(this IServiceCollection serviceCollection)
{
serviceCollection.AddSwaggerGen(c =>
{
c.SwaggerDoc("v1", new() { Title = "AntSK.Api", Version = "v1" });
// Include the API layer's XML comments; true also pulls in controller comments
var xmlFile = $"{Assembly.GetExecutingAssembly().GetName().Name}.xml";
var xmlPath = Path.Combine(AppContext.BaseDirectory, xmlFile);
c.IncludeXmlComments(xmlPath, true);
// Include the Domain layer's XML comments; true also pulls in controller comments
var xmlFile1 = $"{Assembly.GetExecutingAssembly().GetName().Name.Replace("Api", "Domain")}.xml";
var xmlPath1 = Path.Combine(AppContext.BaseDirectory, xmlFile1);
c.IncludeXmlComments(xmlPath1, true);
c.DocInclusionPredicate((docName, apiDes) =>
{
if (!apiDes.TryGetMethodInfo(out MethodInfo method))
return false;
var version = method.DeclaringType.GetCustomAttributes(true).OfType<ApiExplorerSettingsAttribute>().Select(m => m.GroupName);
if (docName == "v1" && !version.Any())
return true;
var actionVersion = method.GetCustomAttributes(true).OfType<ApiExplorerSettingsAttribute>().Select(m => m.GroupName);
if (actionVersion.Any())
return actionVersion.Any(v => v == docName);
return version.Any(v => v == docName);
});
c.AddSecurityDefinition("Bearer", new OpenApiSecurityScheme()
{
Description = "Directly enter bearer {token} in the box below (note that there is a space between bearer and token)",
Name = "Authorization",
In = ParameterLocation.Header,
Type = SecuritySchemeType.ApiKey,
});
c.AddSecurityRequirement(new OpenApiSecurityRequirement
{
{
new OpenApiSecurityScheme
{
Reference = new OpenApiReference()
{
Id = "Bearer",
Type = ReferenceType.SecurityScheme
}
}, Array.Empty<string>()
}
});
});
return serviceCollection;
}
}
}
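Taken together, these extensions form the application's startup pipeline. A minimal sketch of how they might be chained in Program.cs follows; the host wiring itself is an assumption, only the extension methods come from this file.

// Hypothetical Program.cs wiring; only the extension methods are from this file.
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddAntSKSwagger(); // Swagger with XML comments and Bearer auth
var app = builder.Build();
app.CodeFirst()   // create tables for every [SugarTable] entity, install pgvector
   .InitDbData()  // seed the llamafactory "isstart" dictionary entry
   .LoadFun();    // load function plugins recorded in the Funs table
app.Run();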

View File

@@ -0,0 +1,21 @@
using LLamaSharp.KernelMemory;
using Microsoft.KernelMemory.AI;
using Microsoft.KernelMemory;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace AntSK.Domain.Common.Embedding
{
public static class BuilderBgeExtensions
{
public static IKernelMemoryBuilder WithBgeTextEmbeddingGeneration(this IKernelMemoryBuilder builder, HuggingfaceTextEmbeddingGenerator textEmbeddingGenerator)
{
builder.AddSingleton((ITextEmbeddingGenerator)textEmbeddingGenerator);
builder.AddIngestionEmbeddingGenerator(textEmbeddingGenerator);
return builder;
}
}
}

View File

@@ -0,0 +1,56 @@
using LLama.Common;
using LLama;
using LLamaSharp.KernelMemory;
using Microsoft.KernelMemory.AI;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using AntSK.Domain.Domain.Other.Bge;
namespace AntSK.Domain.Common.Embedding
{
public class HuggingfaceTextEmbeddingGenerator : ITextEmbeddingGenerator, ITextTokenizer, IDisposable
{
public int MaxTokens => 1024;
public int MaxTokenTotal => 1024;
private readonly dynamic _embedder;
public HuggingfaceTextEmbeddingGenerator(string pyDllPath,string modelName)
{
_embedder = BgeEmbeddingConfig.LoadModel(pyDllPath, modelName);
}
public void Dispose()
{
BgeEmbeddingConfig.Dispose();
}
//public async Task<IList<ReadOnlyMemory<float>>> GenerateEmbeddingAsync(IList<string> data, CancellationToken cancellationToken = default)
//{
// IList<ReadOnlyMemory<float>> results = new List<ReadOnlyMemory<float>>();
// foreach (var d in data)
// {
// var embeddings = await EmbeddingConfig.GetEmbedding(d);
// results.Add(new ReadOnlyMemory<float>(embeddings));
// }
// return results;
//}
public async Task<Microsoft.KernelMemory.Embedding> GenerateEmbeddingAsync(string text, CancellationToken cancellationToken = default)
{
var embeddings = await BgeEmbeddingConfig.GetEmbedding(text);
return new Microsoft.KernelMemory.Embedding(embeddings);
}
public int CountTokens(string text)
{
return BgeEmbeddingConfig.TokenCount(text);
}
}
}
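A minimal sketch of how this generator and the WithBgeTextEmbeddingGeneration extension above might be combined. The Python DLL path is a placeholder, and the model id is the one referenced elsewhere in this changeset.

// Sketch only: wire the BGE embedder into a serverless Kernel Memory instance.
var embedder = new HuggingfaceTextEmbeddingGenerator(
    pyDllPath: "/usr/lib/libpython3.11.so",        // placeholder CPython runtime path
    modelName: "AI-ModelScope/bge-large-zh-v1.5"); // ModelScope model id
var memory = new KernelMemoryBuilder()
    .WithBgeTextEmbeddingGeneration(embedder)      // extension defined above
    .Build<MemoryServerless>();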

View File

@@ -0,0 +1,822 @@
using NPOI.HSSF.UserModel;
using NPOI.SS.UserModel;
using NPOI.SS.Util;
using NPOI.XSSF.Streaming;
using NPOI.XSSF.UserModel;
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Threading.Tasks;
namespace AntSK.Domain
{
public class ExeclHelper
{
/// <summary>
/// Import an Excel file into a DataTable
/// </summary>
/// <param name="filePath">Excel file path</param>
/// <param name="isColumnName">whether the first row holds column names</param>
/// <returns>the resulting DataTable</returns>
public static DataTable ExcelToDataTable(string filePath, bool isColumnName)
{
DataTable dataTable = null;
FileStream fs = null;
DataColumn column = null;
DataRow dataRow = null;
IWorkbook workbook = null;
ISheet sheet = null;
IRow row = null;
ICell cell = null;
int startRow = 0;
try
{
using (fs = File.OpenRead(filePath))
{
// Excel 2007 format (.xlsx)
if (filePath.Contains(".xlsx"))
workbook = new XSSFWorkbook(fs);
// Excel 2003 format (.xls)
else if (filePath.Contains(".xls"))
workbook = new HSSFWorkbook(fs);
if (workbook != null)
{
sheet = workbook.GetSheetAt(0);// read the first sheet; you could also loop over every sheet
dataTable = new DataTable();
if (sheet != null)
{
int rowCount = sheet.LastRowNum;// total number of rows
if (rowCount > 0)
{
IRow firstRow = sheet.GetRow(0);// first row
int cellCount = firstRow.LastCellNum;// number of columns
// Build the DataTable columns
if (isColumnName)
{
startRow = 1;// if the first row holds column names, start reading from the second row
for (int i = firstRow.FirstCellNum; i < cellCount; ++i)
{
cell = firstRow.GetCell(i);
if (cell != null)
{
if (cell.StringCellValue != null)
{
column = new DataColumn(cell.StringCellValue);
dataTable.Columns.Add(column);
}
}
}
}
else
{
for (int i = firstRow.FirstCellNum; i < cellCount; ++i)
{
column = new DataColumn("column" + (i + 1));
dataTable.Columns.Add(column);
}
}
// Fill the rows
for (int i = startRow; i <= rowCount; ++i)
{
row = sheet.GetRow(i);
if (row == null) continue;
dataRow = dataTable.NewRow();
for (int j = row.FirstCellNum; j < cellCount; ++j)
{
cell = row.GetCell(j);
if (cell == null)
{
dataRow[j] = "";
}
else
{
//CellType(Unknown = -1,Numeric = 0,String = 1,Formula = 2,Blank = 3,Boolean = 4,Error = 5,)
switch (cell.CellType)
{
case CellType.Blank:
dataRow[j] = "";
break;
case CellType.Numeric:
short format = cell.CellStyle.DataFormat;
// Handle date formats such as 2015.12.5, 2015/12/5, 2015-12-5
if (format == 14 || format == 31 || format == 57 || format == 58)
dataRow[j] = cell.DateCellValue;
else
dataRow[j] = cell.NumericCellValue;
break;
case CellType.String:
dataRow[j] = cell.StringCellValue;
break;
}
}
}
dataTable.Rows.Add(dataRow);
}
}
}
}
}
return dataTable;
}
catch (Exception)
{
if (fs != null)
{
fs.Close();
}
return null;
}
}
/// <summary>
/// Import an Excel stream into a DataTable
/// </summary>
/// <param name="stream">input stream</param>
/// <param name="isColumnName">whether the first row holds column names</param>
/// <returns></returns>
public static DataTable ExcelToDataTable(Stream stream, bool isColumnName)
{
DataTable dataTable = null;
DataColumn column = null;
DataRow dataRow = null;
IWorkbook workbook = new XSSFWorkbook(stream);
ISheet sheet = null;
IRow row = null;
ICell cell = null;
int startRow = 0;
try
{
if (workbook != null)
{
sheet = workbook.GetSheetAt(0);// read the first sheet; you could also loop over every sheet
dataTable = new DataTable();
if (sheet != null)
{
int rowCount = sheet.LastRowNum;// total number of rows
if (rowCount > 0)
{
IRow firstRow = sheet.GetRow(0);// first row
int cellCount = firstRow.LastCellNum;// number of columns
// Build the DataTable columns
if (isColumnName)
{
startRow = 1;// if the first row holds column names, start reading from the second row
for (int i = firstRow.FirstCellNum; i < cellCount; ++i)
{
cell = firstRow.GetCell(i);
if (cell != null)
{
if (cell.StringCellValue != null)
{
column = new DataColumn(cell.StringCellValue);
dataTable.Columns.Add(column);
}
}
}
}
else
{
for (int i = firstRow.FirstCellNum; i < cellCount; ++i)
{
column = new DataColumn("column" + (i + 1));
dataTable.Columns.Add(column);
}
}
// Fill the rows
for (int i = startRow; i <= rowCount; ++i)
{
row = sheet.GetRow(i);
if (row == null) continue;
dataRow = dataTable.NewRow();
for (int j = row.FirstCellNum; j < cellCount; ++j)
{
cell = row.GetCell(j);
if (cell == null)
{
dataRow[j] = "";
}
else
{
//CellType(Unknown = -1,Numeric = 0,String = 1,Formula = 2,Blank = 3,Boolean = 4,Error = 5,)
switch (cell.CellType)
{
case CellType.Blank:
dataRow[j] = "";
break;
case CellType.Numeric:
short format = cell.CellStyle.DataFormat;
// Handle date formats such as 2015.12.5, 2015/12/5, 2015-12-5
if (format == 14 || format == 31 || format == 57 || format == 58)
dataRow[j] = cell.DateCellValue;
else
dataRow[j] = cell.NumericCellValue;
break;
case CellType.String:
dataRow[j] = cell.StringCellValue;
break;
}
}
}
dataTable.Rows.Add(dataRow);
}
}
}
}
return dataTable;
}
catch (Exception)
{
throw;
}
}
/// <summary>
/// Convert an Excel stream to a List
/// </summary>
/// <typeparam name="TResult"></typeparam>
/// <param name="stream"></param>
/// <returns></returns>
public static IEnumerable<TResult> ExcelToList<TResult>(Stream stream) where TResult : new()
{
var propertyInfos = typeof(TResult).GetProperties(BindingFlags.Public | BindingFlags.Instance).Where(p => p.CustomAttributes.Count() > 0)
.OrderBy(p => p.GetCustomAttribute<ExeclPropertyAttribute>().Order).ToArray();
List<TResult> list = new List<TResult>();
IWorkbook workbook = new XSSFWorkbook(stream);
ISheet sheet = null;
IRow row = null;
ICell cell = null;
int startRow = 1;
try
{
if (workbook != null)
{
sheet = workbook.GetSheetAt(0);// read the first sheet; you could also loop over every sheet
if (sheet != null)
{
int rowCount = sheet.LastRowNum;// total number of rows
if (rowCount > 0)
{
IRow firstRow = sheet.GetRow(0);// first row
int cellCount = firstRow.LastCellNum;// number of columns
// Fill the rows
for (int i = startRow; i <= rowCount; ++i)
{
row = sheet.GetRow(i);
if (row == null) continue;
bool emptyRow = true;// whether the row is empty
TResult dataModel = new TResult();
for (int j = row.FirstCellNum; j < cellCount; ++j)
{
var execlPropertyAttribute = propertyInfos[j].GetCustomAttribute<ExeclPropertyAttribute>();
cell = row.GetCell(j);
if (cell == null)
{
propertyInfos[j].SetValue(dataModel, "");
}
else
{
switch (cell.CellType)
{
case CellType.Blank:
propertyInfos[j].SetValue(dataModel, "");
break;
case CellType.Numeric:
short format = cell.CellStyle.DataFormat;
// Handle date formats such as 2015.12.5, 2015/12/5, 2015-12-5
if (format == 14 || format == 31 || format == 57 || format == 58)
propertyInfos[j].SetValue(dataModel, cell.DateCellValue);
else
{
if (execlPropertyAttribute.CellType == CellType.String)
{
propertyInfos[j].SetValue(dataModel, cell.NumericCellValue.ToString());
}
else
{
propertyInfos[j].SetValue(dataModel, cell.NumericCellValue);
}
}
break;
case CellType.String:
propertyInfos[j].SetValue(dataModel, cell.StringCellValue);
break;
}
}
if (cell != null && !string.IsNullOrEmpty(cell.ToString().Trim()))
{
emptyRow = false;
}
}
// Add non-empty rows to the result list
if (!emptyRow)
{
list.Add(dataModel);
}
}
}
}
}
return list;
}
catch (Exception)
{
throw;
}
}
public static IEnumerable<TResult> ExcelToListFileName<TResult>(Stream stream, string fileName) where TResult : new()
{
var propertyInfos = typeof(TResult).GetProperties(BindingFlags.Public | BindingFlags.Instance).Where(p => p.CustomAttributes.Count() > 0)
.OrderBy(p => p.GetCustomAttribute<ExeclPropertyAttribute>().Order).ToArray();
List<TResult> list = new List<TResult>();
IWorkbook workbook = null;
if (fileName.Contains(".xlsx"))
workbook = new XSSFWorkbook(stream);
// Excel 2003 format (.xls)
else if (fileName.Contains(".xls"))
workbook = new HSSFWorkbook(stream);
ISheet sheet = null;
IRow row = null;
ICell cell = null;
int startRow = 1;
try
{
if (workbook != null)
{
sheet = workbook.GetSheetAt(0);// read the first sheet; you could also loop over every sheet
if (sheet != null)
{
int rowCount = sheet.LastRowNum;// total number of rows
if (rowCount > 0)
{
IRow firstRow = sheet.GetRow(0);// first row
int cellCount = firstRow.LastCellNum;// number of columns
// Fill the rows
for (int i = startRow; i <= rowCount; ++i)
{
row = sheet.GetRow(i);
if (row == null) continue;
bool emptyRow = true;// whether the row is empty
TResult dataModel = new TResult();
for (int j = row.FirstCellNum; j < cellCount; ++j)
{
var execlPropertyAttribute = propertyInfos[j].GetCustomAttribute<ExeclPropertyAttribute>();
cell = row.GetCell(j);
if (cell == null)
{
propertyInfos[j].SetValue(dataModel, "");
}
else
{
switch (cell.CellType)
{
case CellType.Blank:
propertyInfos[j].SetValue(dataModel, "");
break;
case CellType.Numeric:
short format = cell.CellStyle.DataFormat;
// Handle date formats such as 2015.12.5, 2015/12/5, 2015-12-5
if (format == 14 || format == 31 || format == 57 || format == 58)
propertyInfos[j].SetValue(dataModel, cell.DateCellValue);
else
{
if (execlPropertyAttribute.CellType == CellType.String)
{
propertyInfos[j].SetValue(dataModel, cell.NumericCellValue.ToString());
}
else
{
propertyInfos[j].SetValue(dataModel, cell.NumericCellValue);
}
}
break;
case CellType.String:
propertyInfos[j].SetValue(dataModel, cell.StringCellValue);
break;
}
}
if (cell != null && !string.IsNullOrEmpty(cell.ToString().Trim()))
{
emptyRow = false;
}
}
// Add non-empty rows to the result list
if (!emptyRow)
{
list.Add(dataModel);
}
}
}
}
}
return list;
}
catch (Exception)
{
throw;
}
}
/// <summary>
/// Convert an Excel stream to a List, reading the sheet named sheetName
/// </summary>
/// <typeparam name="TResult"></typeparam>
/// <param name="stream"></param>
/// <param name="sheetName"></param>
/// <returns></returns>
public static IEnumerable<TResult> ExcelToList<TResult>(Stream stream, string sheetName) where TResult : new()
{
var propertyInfos = typeof(TResult).GetProperties(BindingFlags.Public | BindingFlags.Instance)
.OrderBy(p => p.GetCustomAttribute<ExeclPropertyAttribute>().Order).ToArray();
List<TResult> list = new List<TResult>();
IWorkbook workbook = new XSSFWorkbook(stream);
ISheet sheet = null;
IRow row = null;
ICell cell = null;
int startRow = 1;
try
{
if (workbook != null)
{
sheet = workbook.GetSheet(sheetName);// read the sheet that matches sheetName
if (sheet != null)
{
int rowCount = sheet.LastRowNum;// total number of rows
if (rowCount > 0)
{
IRow firstRow = sheet.GetRow(0);// first row
int cellCount = firstRow.LastCellNum;// number of columns
// Fill the rows
for (int i = startRow; i <= rowCount; ++i)
{
row = sheet.GetRow(i);
if (row == null) continue;
bool emptyRow = true;// whether the row is empty
TResult dataModel = new TResult();
for (int j = row.FirstCellNum; j < cellCount; ++j)
{
var execlPropertyAttribute = propertyInfos[j].GetCustomAttribute<ExeclPropertyAttribute>();
cell = row.GetCell(j);
if (cell == null)
{
propertyInfos[j].SetValue(dataModel, "");
}
else
{
switch (cell.CellType)
{
case CellType.Blank:
propertyInfos[j].SetValue(dataModel, "");
break;
case CellType.Numeric:
short format = cell.CellStyle.DataFormat;
// Handle date formats such as 2015.12.5, 2015/12/5, 2015-12-5
if (format == 14 || format == 31 || format == 57 || format == 58)
propertyInfos[j].SetValue(dataModel, cell.DateCellValue);
else
{
if (execlPropertyAttribute.CellType == CellType.String)
{
propertyInfos[j].SetValue(dataModel, cell.NumericCellValue.ToString());
}
else
{
propertyInfos[j].SetValue(dataModel, cell.NumericCellValue);
}
}
break;
case CellType.String:
propertyInfos[j].SetValue(dataModel, cell.StringCellValue);
break;
}
}
if (cell != null && !string.IsNullOrEmpty(cell.ToString().Trim()))
{
emptyRow = false;
}
}
// Add non-empty rows to the result list
if (!emptyRow)
{
list.Add(dataModel);
}
}
}
}
}
return list;
}
catch (Exception ex)
{
throw;
}
}
/// <summary>
/// Export a List to Excel as a binary stream
/// </summary>
/// <typeparam name="T">entity type</typeparam>
/// <param name="data">List</param>
/// <param name="sheetName">sheet name, defaults to Sheet0 when omitted</param>
/// <returns></returns>
public static byte[] ListToExcel<T>(T[] data, string sheetName = "Sheet0")
{
IWorkbook workbook = null;
IRow row = null;
ISheet sheet = null;
ICell cell = null;
var propertyInfos = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance)
.OrderBy(p => p.GetCustomAttribute<ExeclPropertyAttribute>().Order).ToArray();
workbook = new XSSFWorkbook();
sheet = workbook.CreateSheet(sheetName);// create a sheet with the given name (Sheet0 by default)
int rowCount = data.Count();// number of rows
int columnCount = propertyInfos.Length;// number of columns
// Write the header row
row = sheet.CreateRow(0);// the first Excel row holds the column headers
for (int c = 0; c < columnCount; c++)
{
cell = row.CreateCell(c);
cell.SetCellValue(propertyInfos[c].GetCustomAttribute<ExeclPropertyAttribute>().DisplayName);
}
// Fill each cell, row by row
for (int i = 0; i < rowCount; i++)
{
row = sheet.CreateRow(i + 1);
for (int j = 0; j < columnCount; j++)
{
cell = row.CreateCell(j);// data is written starting from the second Excel row
cell.SetCellValue(propertyInfos[j].GetValue(data[i])?.ToString());
}
}
using (MemoryStream memoryStream = new MemoryStream())
{
workbook.Write(memoryStream);// write the workbook into the stream
return memoryStream.ToArray();
}
}
/// <summary>
/// Export a DataTable to Excel as a binary stream
/// </summary>
/// <param name="dt">datatable</param>
/// <param name="strFile">strFile</param>
/// <returns></returns>
public static byte[] DataTableToExcel(DataTable dt, string strFile, string sheetName = "Sheet0")
{
bool result = false;
IWorkbook workbook = null;
FileStream fs = null;
IRow row = null;
ISheet sheet = null;
ICell cell = null;
if (dt != null && dt.Rows.Count > 0)
{
workbook = new XSSFWorkbook();
sheet = workbook.CreateSheet(sheetName);// create a sheet with the given name (Sheet0 by default)
int rowCount = dt.Rows.Count;// number of rows
int columnCount = dt.Columns.Count;// number of columns
// Write the header row
row = sheet.CreateRow(0);// the first Excel row holds the column headers
for (int c = 0; c < columnCount; c++)
{
cell = row.CreateCell(c);
cell.SetCellValue(dt.Columns[c].ColumnName);
}
// Fill each cell, row by row
for (int i = 0; i < rowCount; i++)
{
row = sheet.CreateRow(i + 1);
for (int j = 0; j < columnCount; j++)
{
cell = row.CreateCell(j);// data is written starting from the second Excel row
cell.SetCellValue(dt.Rows[i][j].ToString());
}
}
using (MemoryStream memoryStream = new MemoryStream())
{
workbook.Write(memoryStream);// write the workbook into the stream
return memoryStream.ToArray();
}
}
else
{
return new byte[0];
}
}
/// <summary>
/// Write a List to an Excel file
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="data"></param>
/// <param name="strFile">file path</param>
/// <param name="sheetName"></param>
/// <returns></returns>
public static bool ListWriteExcel<T>(T[] data, string strFile, string sheetName = "Sheet0")
{
bool result = false;
IWorkbook workbook = null;
FileStream fs = null;
IRow row = null;
ISheet sheet = null;
ICell cell = null;
try
{
var propertyInfos = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance)
.OrderBy(p => p.GetCustomAttribute<ExeclPropertyAttribute>().Order).ToArray();
workbook = new XSSFWorkbook();
sheet = workbook.CreateSheet(sheetName);// create a sheet with the given name (Sheet0 by default)
int rowCount = data.Count();// number of rows
int columnCount = propertyInfos.Length;// number of columns
// Write the header row
row = sheet.CreateRow(0);// the first Excel row holds the column headers
for (int c = 0; c < columnCount; c++)
{
cell = row.CreateCell(c);
cell.SetCellValue(propertyInfos[c].GetCustomAttribute<ExeclPropertyAttribute>().DisplayName);
}
// Fill each cell, row by row
for (int i = 0; i < rowCount; i++)
{
row = sheet.CreateRow(i + 1);
for (int j = 0; j < columnCount; j++)
{
cell = row.CreateCell(j);// data is written starting from the second Excel row
cell.SetCellValue(propertyInfos[j].GetValue(data[i])?.ToString());
}
}
using (fs = File.OpenWrite(strFile))
{
workbook.Write(fs);// write the workbook to the opened file
result = true;
}
return result;
}
catch (Exception ex)
{
if (fs != null)
{
fs.Close();
}
return false;
}
}
/// <summary>
/// Write a DataTable to an Excel file
/// </summary>
/// <param name="dt">DataTable</param>
/// <param name="strFile">file path</param>
/// <returns></returns>
public static bool DataTableWriteExcel(DataTable dt, string strFile, string sheetName = "Sheet0")
{
bool result = false;
IWorkbook workbook = null;
FileStream fs = null;
IRow row = null;
ISheet sheet = null;
ICell cell = null;
try
{
if (dt != null && dt.Rows.Count > 0)
{
workbook = new XSSFWorkbook();
sheet = workbook.CreateSheet(sheetName);// create a sheet with the given name (Sheet0 by default)
int rowCount = dt.Rows.Count;// number of rows
int columnCount = dt.Columns.Count;// number of columns
// Write the header row
row = sheet.CreateRow(0);// the first Excel row holds the column headers
for (int c = 0; c < columnCount; c++)
{
cell = row.CreateCell(c);
cell.SetCellValue(dt.Columns[c].ColumnName);
}
// Fill each cell, row by row
for (int i = 0; i < rowCount; i++)
{
row = sheet.CreateRow(i + 1);
for (int j = 0; j < columnCount; j++)
{
cell = row.CreateCell(j);// data is written starting from the second Excel row
cell.SetCellValue(dt.Rows[i][j].ToString());
}
}
using (fs = File.OpenWrite(strFile))
{
workbook.Write(fs);// write the workbook to the opened file
result = true;
}
}
return result;
}
catch (Exception ex)
{
if (fs != null)
{
fs.Close();
}
return false;
}
}
/// <summary>
/// Set a dropdown list on cells (excluding the header row)
/// </summary>
/// <param name="workbook"></param>
/// <param name="sheet"></param>
/// <param name="ddlList"></param>
/// <param name="firstcol"></param>
/// <param name="lastcol"></param>
public static void SetCellDropdownList(IWorkbook workbook, ISheet sheet, List<string> ddlList, string sheetname, int sheetIndex, int firstcol, int lastcol)
{
#region ExcelHSSFWorkbook
//ISheet sheet2 = workbook.CreateSheet(sheetname);
//// hide the sheet backing the dropdown values
//workbook.SetSheetHidden(sheetIndex, 1);
//int rowIndex = 0;
//foreach (var item in ddlList)
//{
// IRow vrow = sheet2.CreateRow(rowIndex);
// vrow.CreateCell(0).SetCellValue(item);
// rowIndex++;
//}
//// named range that provides the dropdown items:
//var rangeName = sheetname + "Range";
//IName range = workbook.CreateName();
//range.RefersToFormula = sheetname + "!$A$1:$A$" + rowIndex;
//range.NameName = rangeName;
//CellRangeAddressList regions = new CellRangeAddressList(1, 65535, firstcol, lastcol);
//DVConstraint constraint = DVConstraint.CreateFormulaListConstraint(rangeName);
//HSSFDataValidation dataValidate = new HSSFDataValidation(regions, constraint);
//dataValidate.CreateErrorBox("Invalid input", "Please enter or select a value from the dropdown list.");
//dataValidate.ShowPromptBox = true;
//sheet.AddValidationData(dataValidate);
#endregion
// Newer Excel formats (XSSFWorkbook): configure the dropdown
XSSFSheet sheetDDL = (XSSFSheet)workbook.CreateSheet(sheetname);
workbook.SetSheetHidden(sheetIndex, 1); // hide the sheet holding the dropdown data
String[] datas = ddlList.ToArray(); // dropdown data source
XSSFDataValidationHelper dvHelper = new XSSFDataValidationHelper(sheetDDL);
XSSFDataValidationConstraint dvConstraint = (XSSFDataValidationConstraint)dvHelper.CreateExplicitListConstraint(datas);
CellRangeAddressList addressList = new CellRangeAddressList(1, 65535, firstcol, lastcol); // columns that receive the dropdown
XSSFDataValidation validation = (XSSFDataValidation)dvHelper.CreateValidation(dvConstraint, addressList);
validation.SuppressDropDownArrow = true;
validation.ShowErrorBox = true;
validation.ShowPromptBox = true;
sheet.AddValidationData(validation);
}
}
}
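One usage sketch of the dropdown helper above; the sheet names and column indexes are illustrative only.

// Illustrative only: add a yes/no dropdown to column C of an export sheet.
IWorkbook wb = new XSSFWorkbook();
ISheet data = wb.CreateSheet("Data");
ExeclHelper.SetCellDropdownList(wb, data,
    new List<string> { "yes", "no" },
    "ddl_hidden", // hidden sheet created to back the dropdown
    1,            // index of that hidden sheet
    2, 2);        // first and last column (zero-based) receiving the dropdown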

View File

@@ -0,0 +1,28 @@
using NPOI.SS.UserModel;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
namespace AntSK.Domain
{
public class ExeclPropertyAttribute : Attribute
{
public ExeclPropertyAttribute()
{
}
public ExeclPropertyAttribute(string displayName, int order, CellType cellType = CellType.String)
{
DisplayName = displayName;
Order = order;
CellType = cellType;
}
public string DisplayName { get; set; }
public int Order { get; set; }
public CellType CellType { get; set; }
}
}
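The attribute drives ExeclHelper's property-to-column mapping: Order fixes the column position and DisplayName becomes the header on export. A small sketch follows; the model class and file name are hypothetical.

// Hypothetical row model; Order maps properties to columns, DisplayName to headers.
public class QaRow
{
    [ExeclProperty("Question", 0)]
    public string Question { get; set; }
    [ExeclProperty("Answer", 1)]
    public string Answer { get; set; }
}
// Import: row 1 is treated as the header, data starts at row 2.
using var stream = File.OpenRead("qa.xlsx"); // hypothetical path
var rows = ExeclHelper.ExcelToList<QaRow>(stream).ToArray();
// Export: returns the workbook as a byte[].
byte[] bytes = ExeclHelper.ListToExcel(rows);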

View File

@@ -0,0 +1,71 @@
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace AntSK.Domain.Common.LLamaFactory
{
public class ProcessWrapper
{
private Process process;
public static bool isProcessComplete = false;
public void StartProcess(string arguments, string workingDirectory)
{
process = new Process
{
StartInfo = new ProcessStartInfo
{
FileName = "python",
Arguments = arguments,
UseShellExecute = false,
RedirectStandardOutput = true,
RedirectStandardError = true,
CreateNoWindow = true,
WorkingDirectory = workingDirectory
}
};
// Start the Process instance held by this wrapper so that
// WaitForProcessExit/KillProcess operate on the same process.
process.Start();
using (StreamReader reader = process.StandardOutput)
{
string result = reader.ReadToEnd();
if (result != null && result.Contains(":8000"))
{
// the python server reported its listening port
isProcessComplete = true;
}
Console.WriteLine(result);
}
process.WaitForExit();
}
public string WaitForProcessExit()
{
process.WaitForExit();
return process.StandardOutput.ReadToEnd();
}
public void KillProcess()
{
try
{
if (!process.HasExited)
{
process.Kill();
}
}
catch (InvalidOperationException)
{
// Process already exited.
}
}
}
}

View File

@@ -0,0 +1,8 @@
<Project>
<!-- See https://aka.ms/dotnet/msbuild/customize for more details on customizing your build -->
<PropertyGroup>
<KMVersion>0.37.240420.2</KMVersion>
<LLamaSharpVersion>0.11.2</LLamaSharpVersion>
</PropertyGroup>
</Project>

View File

@@ -1,8 +1,11 @@
using AntSK.Domain.Domain.Model.Dto;
using AntSK.Domain.Domain.Model;
using AntSK.Domain.Domain.Model.Dto;
using AntSK.Domain.Repositories;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
@@ -11,8 +14,10 @@ namespace AntSK.Domain.Domain.Interface
{
public interface IChatService
{
IAsyncEnumerable<StreamingKernelContent> SendChatByAppAsync(Apps app, string questions, string history);
IAsyncEnumerable<StreamingKernelContent> SendChatByAppAsync(Apps app, string questions, ChatHistory history);
IAsyncEnumerable<StreamingKernelContent> SendKmsByAppAsync(Apps app, string questions, string history, string filePath, List<RelevantSource> relevantSources = null);
IAsyncEnumerable<StreamingKernelContent> SendKmsByAppAsync(Apps app, string questions, ChatHistory history, string filePath, List<RelevantSource> relevantSources = null);
Task<string> SendImgByAppAsync(Apps app, string questions);
Task<ChatHistory> GetChatHistory(List<Chats> MessageList);
}
}

View File

@@ -7,13 +7,13 @@ namespace AntSK.Domain.Domain.Interface
{
public interface IKMService
{
MemoryServerless GetMemory(Apps app);
MemoryServerless GetMemoryByApp(Apps app);
MemoryServerless GetMemoryByKMS(string kmsID, SearchClientConfig searchClientConfig = null);
MemoryServerless GetMemoryByKMS(string kmsID);
Task<List<KMFile>> GetDocumentByFileID(string kmsId, string fileId);
Task<List<RelevantSource>> GetRelevantSourceList(string kmsIdListStr, string msg);
Task<List<RelevantSource>> GetRelevantSourceList(Apps app, string msg);
List<UploadFileItem> FileList { get; }

View File

@@ -6,6 +6,8 @@ namespace AntSK.Domain.Domain.Interface
public interface IKernelService
{
Kernel GetKernelByApp(Apps app);
Kernel GetKernelByAIModelID(string modelid);
void ImportFunctionsByApp(Apps app, Kernel _kernel);
Task<string> HistorySummarize(Kernel _kernel, string questions, string history);
}

View File

@@ -0,0 +1,21 @@
using AntSK.LLamaFactory.Model;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using static AntSK.Domain.Domain.Service.LLamaFactoryService;
namespace AntSK.Domain.Domain.Interface
{
public interface ILLamaFactoryService
{
public event LogMessageHandler LogMessageReceived;
Task PipInstall();
Task StartLLamaFactory(string modelName, string templateName);
void KillProcess();
List<LLamaModel> GetLLamaFactoryModels();
}
}

View File

@@ -11,5 +11,24 @@ namespace AntSK.Domain.Domain.Model.Constant
public const string KmsIdTag = "kmsid";
public const string KmsIndex = "kms";
public const string KmsSearchNull = "No relevant content was found in the knowledge base";
public const string KmsPrompt = @"Use the content wrapped in <data></data> tags as your knowledge:
<data>
{{$doc}}
</data>
--------------------------
Answer requirements:
- If you are unsure of the answer, say so
- Avoid mentioning that the knowledge comes from <data></data>
- Keep the answer consistent with what is described in <data></data>
- Format the answer with Markdown syntax
- If the Markdown contains images, render them normally
--------------------------
Chat history: {{ConversationSummaryPlugin.SummarizeConversation $history}}
--------------------------
User question: {{$input}}";
public const string KMExcelSplit = "*&antsk_excel&*";
}
}
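The KmsPrompt template is rendered through Semantic Kernel's prompt-template syntax: {{$doc}}, {{$history}} and {{$input}} are bound from KernelArguments, and the ConversationSummaryPlugin call runs at render time. A minimal sketch, assuming the plugin is already imported into the kernel and that retrievedChunks, historyText and question are caller-supplied values:

// Minimal sketch; assumes ConversationSummaryPlugin is imported into the kernel.
var func = kernel.CreateFunctionFromPrompt(KmsConstantcs.KmsPrompt);
var args = new KernelArguments
{
    ["doc"] = retrievedChunks,  // text injected between the <data> tags
    ["history"] = historyText,  // summarized by ConversationSummaryPlugin
    ["input"] = question
};
await foreach (var token in kernel.InvokeStreamingAsync(func, args))
{
    Console.Write(token);
}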

View File

@@ -0,0 +1,14 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace AntSK.Domain.Domain.Model.Constant
{
public class LLamaFactoryConstantcs
{
public const string LLamaFactorDic = "llamafactory";
public const string IsStartKey = "isstart";
}
}

View File

@@ -8,6 +8,8 @@ namespace AntSK.Domain.Domain.Model.Dto
public string Text { get; set; }
public float Relevance { get; set; }
public double RerankScore { get; set; }
public override string ToString()
{
return $"[file:{SourceName};Relevance:{(Relevance * 100):F2}%]:{Text}";

View File

@@ -22,8 +22,17 @@ namespace AntSK.Domain.Domain.Model.Enum
[Display(Name = "灵积大模型")]
DashScope = 5,
[Display(Name = "LLamaFactory")]
LLamaFactory = 6,
[Display(Name = "Bge Embedding")]
BgeEmbedding = 7,
[Display(Name = "Bge Rerank")]
BgeRerank = 8,
[Display(Name = "StableDiffusion")]
StableDiffusion = 9,
[Display(Name = "模拟输出")]
Mock = 100,
}
/// <summary>
@@ -33,5 +42,7 @@ namespace AntSK.Domain.Domain.Model.Enum
{
Chat = 1,
Embedding = 2,
Image=3,
Rerank=4
}
}

View File

@@ -9,6 +9,7 @@ namespace AntSK.Domain.Domain.Model.Enum
public enum AppType
{
chat = 1,
kms = 2
kms = 2,
img=3
}
}

View File

@@ -0,0 +1,17 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace AntSK.Domain.Domain.Model.Excel
{
public class KMSExcelModel
{
[ExeclProperty("问题",0)]
public string Question { get; set; }
[ExeclProperty("答案", 1)]
public string Answer { get; set; }
}
}

View File

@@ -17,11 +17,14 @@ namespace AntSK.Domain.Domain.Model
public string FilePath { get; set; } = "";
public string FileName { get; set; } = "";
public bool IsQA { get; set; } = false;
}
public class ImportKMSTaskReq : ImportKMSTaskDTO
{
public bool IsQA { get; set; }=false;
public KmsDetails KmsDetail { get; set; } = new KmsDetails();
}
@@ -29,6 +32,13 @@ namespace AntSK.Domain.Domain.Model
{
File = 1,
Url = 2,
Text = 3
Text = 3,
Excel=4
}
public class QAModel
{
public string ChatModelId { get; set; }
public string Context { get; set; }
}
}

View File

@@ -1,20 +0,0 @@
namespace AntSK.Domain.Domain.Model
{
public class MessageInfo
{
public string ID { get; set; } = "";
public string Context { get; set; } = "";
public string HtmlAnswers { get; set; } = "";
/// <summary>
/// true when sending, false when receiving
/// </summary>
public bool IsSend { get; set; } = false;
public DateTime CreateTime { get; set; }
public string? FilePath { get; set; }
public string? FileName { get; set; }
}
}

View File

@@ -18,7 +18,7 @@ namespace AntSK.Domain.Domain.Model.hfmirror
public string Author { get; set; }
public HfAuthorData AuthorData { get; set; }
public int Downloads { get; set; }
public bool Gated { get; set; }
public object Gated { get; set; }
public string Id { get; set; }
public DateTime LastModified { get; set; }
public int Likes { get; set; }

View File

@@ -0,0 +1,81 @@
using Newtonsoft.Json;
using Python.Runtime;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using static Python.Runtime.Py;
namespace AntSK.Domain.Domain.Other.Bge
{
public static class BegRerankConfig
{
public static dynamic model { get; set; }
static object lockobj = new object();
/// <summary>
/// The model is fixed (hard-coded) and cached after the first load
/// </summary>
public static dynamic LoadModel(string pythondllPath, string modelName)
{
lock (lockobj)
{
if (model == null)
{
if (string.IsNullOrEmpty(Runtime.PythonDLL))
{
Runtime.PythonDLL = pythondllPath;
}
PythonEngine.Initialize();
try
{
using (GIL())// acquire Python's Global Interpreter Lock
{
dynamic modelscope = Py.Import("modelscope");
dynamic flagEmbedding = Py.Import("FlagEmbedding");
dynamic model_dir = modelscope.snapshot_download(modelName, revision: "master");
dynamic flagReranker = flagEmbedding.FlagReranker(model_dir, use_fp16: true);
model = flagReranker;
return model;
}
}
catch (Exception)
{
throw;// rethrow, preserving the original stack trace
}
}
else
{
return model;
}
}
}
public static double Rerank(List<string> list)
{
using (GIL())
{
try
{
PyList pyList = new PyList();
foreach (string item in list)
{
pyList.Append(item.ToPython()); // convert the C# string to a Python object and add it to the PyList
}
PyObject result = model.compute_score(pyList, normalize: true);
return result.As<double>();
}
catch (Exception)
{
throw;// rethrow, preserving the original stack trace
}
}
}
}
}
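Mirroring how ChatService uses this class later in the changeset: each (question, passage) pair is scored and the list is reordered by the normalized score. The DLL path and model id below are placeholders.

// Sketch: score each (question, passage) pair, then keep the best matches.
BegRerankConfig.LoadModel(
    "/usr/lib/libpython3.11.so",         // placeholder Python DLL path
    "AI-ModelScope/bge-reranker-large"); // placeholder ModelScope model id
foreach (var item in relevantSourceList)
{
    // compute_score with normalize: true yields a relevance score in [0, 1]
    item.RerankScore = BegRerankConfig.Rerank(new List<string> { question, item.Text });
}
var top = relevantSourceList.OrderByDescending(p => p.RerankScore).Take(maxMatchesCount).ToList();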

View File

@@ -0,0 +1,97 @@
using Microsoft.KernelMemory.AI.OpenAI.GPT3;
using Python.Runtime;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using static Python.Runtime.Py;
namespace AntSK.Domain.Domain.Other.Bge
{
public static class BgeEmbeddingConfig
{
public static dynamic model { get; set; }
static object lockobj = new object();
/// <summary>
/// The model is fixed (hard-coded) and cached after the first load
/// </summary>
public static dynamic LoadModel(string pythondllPath, string modelName)
{
lock (lockobj)
{
if (model == null)
{
//Runtime.PythonDLL = @"D:\Programs\Python\Python311\python311.dll";
if (string.IsNullOrEmpty(Runtime.PythonDLL))
{
Runtime.PythonDLL = pythondllPath;
}
PythonEngine.Initialize();
PythonEngine.BeginAllowThreads();
try
{
using (GIL())// acquire Python's Global Interpreter Lock
{
dynamic modelscope = Import("modelscope");
//dynamic model_dir = modelscope.snapshot_download("AI-ModelScope/bge-large-zh-v1.5", revision: "master");
dynamic model_dir = modelscope.snapshot_download(modelName, revision: "master");
dynamic HuggingFaceBgeEmbeddingstemp = Import("langchain_community.embeddings.huggingface");
dynamic HuggingFaceBgeEmbeddings = HuggingFaceBgeEmbeddingstemp.HuggingFaceBgeEmbeddings;
string model_name = model_dir;
dynamic model_kwargs = new PyDict();
model_kwargs["device"] = new PyString("cpu");
dynamic hugginmodel = HuggingFaceBgeEmbeddings(
model_name: model_dir,
model_kwargs: model_kwargs
);
model = hugginmodel;
return hugginmodel;
}
}
catch (Exception)
{
throw;// rethrow, preserving the original stack trace
}
}
else
return model;
}
}
public static Task<float[]> GetEmbedding(string queryStr)
{
using (GIL())
{
PyObject queryResult = model.embed_query(queryStr);
var floatList = queryResult.As<float[]>();
return Task.FromResult(floatList);
}
}
public static int TokenCount(string queryStr)
{
//using (Py.GIL())
//{
// PyObject queryResult = model.client.tokenize(queryStr);
// // use Python's built-in len() to get the token count
// PyObject lenFunc = Py.Import("builtins").GetAttr("len");
// PyObject length = lenFunc.Invoke(queryResult["input_ids"]);
// int len = length.As<int>(); // convert the PyObject to a C# int
// return len;
//}
var tokenCount1 = GPT3Tokenizer.Encode(queryStr).Count;
return tokenCount1;
}
public static void Dispose()
{
Console.WriteLine("python dispose");
}
}
}
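Usage follows the same load-once pattern; a sketch with a placeholder DLL path. The bge-large family typically produces 1024-dimensional vectors.

// Sketch: load the embedding model once, then embed a query string.
BgeEmbeddingConfig.LoadModel(
    "/usr/lib/libpython3.11.so",        // placeholder Python DLL path
    "AI-ModelScope/bge-large-zh-v1.5"); // model id referenced in the comments above
float[] vector = await BgeEmbeddingConfig.GetEmbedding("What is AntSK?");
Console.WriteLine(vector.Length); // typically 1024 for bge-large models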

View File

@@ -0,0 +1,156 @@
using AntSK.Domain.Domain.Model.Constant;
using Microsoft.Extensions.Logging;
using Microsoft.KernelMemory.AI.OpenAI;
using Microsoft.KernelMemory.Configuration;
using Microsoft.KernelMemory.DataFormats.Text;
using Microsoft.KernelMemory.Diagnostics;
using Microsoft.KernelMemory.Extensions;
using Microsoft.KernelMemory.Pipeline;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace AntSK.Domain.Domain.Other
{
public class KMExcelHandler: IPipelineStepHandler
{
private readonly TextPartitioningOptions _options;
private readonly IPipelineOrchestrator _orchestrator;
private readonly ILogger<KMExcelHandler> _log;
private readonly TextChunker.TokenCounter _tokenCounter;
public KMExcelHandler(
string stepName,
IPipelineOrchestrator orchestrator,
TextPartitioningOptions? options = null,
ILogger<KMExcelHandler>? log = null)
{
this.StepName = stepName;
this._orchestrator = orchestrator;
this._options = options ?? new TextPartitioningOptions();
this._options.Validate();
this._log = log ?? DefaultLogger<KMExcelHandler>.Instance;
this._tokenCounter = DefaultGPTTokenizer.StaticCountTokens;
}
/// <inheritdoc />
public string StepName { get; }
/// <inheritdoc />
public async Task<(bool success, DataPipeline updatedPipeline)> InvokeAsync(
DataPipeline pipeline, CancellationToken cancellationToken = default)
{
this._log.LogDebug("Partitioning text, pipeline '{0}/{1}'", pipeline.Index, pipeline.DocumentId);
if (pipeline.Files.Count == 0)
{
this._log.LogWarning("Pipeline '{0}/{1}': there are no files to process, moving to next pipeline step.", pipeline.Index, pipeline.DocumentId);
return (true, pipeline);
}
foreach (DataPipeline.FileDetails uploadedFile in pipeline.Files)
{
// Track new files being generated (cannot edit originalFile.GeneratedFiles while looping it)
Dictionary<string, DataPipeline.GeneratedFileDetails> newFiles = new();
foreach (KeyValuePair<string, DataPipeline.GeneratedFileDetails> generatedFile in uploadedFile.GeneratedFiles)
{
var file = generatedFile.Value;
if (file.AlreadyProcessedBy(this))
{
this._log.LogTrace("File {0} already processed by this handler", file.Name);
continue;
}
// Partition only the original text
if (file.ArtifactType != DataPipeline.ArtifactTypes.ExtractedText)
{
this._log.LogTrace("Skipping file {0} (not original text)", file.Name);
continue;
}
// Use a different partitioning strategy depending on the file type
List<string> partitions;
List<string> sentences;
BinaryData partitionContent = await this._orchestrator.ReadFileAsync(pipeline, file.Name, cancellationToken).ConfigureAwait(false);
// Skip empty partitions. Also: partitionContent.ToString() throws an exception if there are no bytes.
if (partitionContent.ToArray().Length == 0) { continue; }
switch (file.MimeType)
{
case MimeTypes.PlainText:
{
this._log.LogDebug("Partitioning text file {0}", file.Name);
string content = partitionContent.ToString();
var excelList = content.Split(KmsConstantcs.KMExcelSplit, StringSplitOptions.RemoveEmptyEntries).ToList();
sentences = excelList;
partitions = excelList;
break;
}
case MimeTypes.MarkDown:
{
this._log.LogDebug("Partitioning text file {0}", file.Name);
string content = partitionContent.ToString();
var excelList = content.Split(KmsConstantcs.KMExcelSplit, StringSplitOptions.RemoveEmptyEntries).ToList();
sentences = excelList;
partitions = excelList;
break;
}
default:
this._log.LogWarning("File {0} cannot be partitioned, type '{1}' not supported", file.Name, file.MimeType);
// Don't partition other files
continue;
}
if (partitions.Count == 0) { continue; }
this._log.LogDebug("Saving {0} file partitions", partitions.Count);
for (int partitionNumber = 0; partitionNumber < partitions.Count; partitionNumber++)
{
// TODO: turn partitions in objects with more details, e.g. page number
string text = partitions[partitionNumber];
int sectionNumber = 0; // TODO: use this to store the page number (if any)
BinaryData textData = new(text);
int tokenCount = this._tokenCounter(text);
this._log.LogDebug("Partition size: {0} tokens", tokenCount);
var destFile = uploadedFile.GetPartitionFileName(partitionNumber);
await this._orchestrator.WriteFileAsync(pipeline, destFile, textData, cancellationToken).ConfigureAwait(false);
var destFileDetails = new DataPipeline.GeneratedFileDetails
{
Id = Guid.NewGuid().ToString("N"),
ParentId = uploadedFile.Id,
Name = destFile,
Size = text.Length,
MimeType = MimeTypes.PlainText,
ArtifactType = DataPipeline.ArtifactTypes.TextPartition,
PartitionNumber = partitionNumber,
SectionNumber = sectionNumber,
Tags = pipeline.Tags,
ContentSHA256 = textData.CalculateSHA256(),
};
newFiles.Add(destFile, destFileDetails);
destFileDetails.MarkProcessedBy(this);
}
file.MarkProcessedBy(this);
}
// Add new files to pipeline status
foreach (var file in newFiles)
{
uploadedFile.GeneratedFiles.Add(file.Key, file.Value);
}
}
return (true, pipeline);
}
}
}

View File

@@ -31,7 +31,7 @@ namespace AntSK.Domain.Domain.Other
{
ContextSize = lsConfig?.ContextSize ?? 2048,
Seed = lsConfig?.Seed ?? 0,
GpuLayerCount = lsConfig?.GpuLayerCount ?? 10,
GpuLayerCount = lsConfig?.GpuLayerCount ?? 20,
EmbeddingMode = true
};
var weights = LLamaWeights.LoadFromFile(parameters);

View File

@@ -0,0 +1,173 @@
using AntSK.Domain.Domain.Interface;
using AntSK.Domain.Domain.Model;
using AntSK.Domain.Utils;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;
using Microsoft.KernelMemory.AI.OpenAI;
using Microsoft.KernelMemory.Configuration;
using Microsoft.KernelMemory.DataFormats.Text;
using Microsoft.KernelMemory.Diagnostics;
using Microsoft.KernelMemory.Extensions;
using Microsoft.KernelMemory.Pipeline;
using Microsoft.SemanticKernel;
using Newtonsoft.Json;
using RestSharp;
using System.Security.Policy;
using System.Text;
using System.Text.RegularExpressions;
namespace AntSK.Domain.Domain.Other
{
public class QAHandler : IPipelineStepHandler
{
private readonly TextPartitioningOptions _options;
private readonly IPipelineOrchestrator _orchestrator;
private readonly ILogger<QAHandler> _log;
private readonly TextChunker.TokenCounter _tokenCounter;
private readonly IKernelService _kernelService;
public QAHandler(
string stepName,
IPipelineOrchestrator orchestrator,
IKernelService kernelService,
TextPartitioningOptions? options = null,
ILogger<QAHandler>? log = null
)
{
this.StepName = stepName;
this._orchestrator = orchestrator;
this._options = options ?? new TextPartitioningOptions();
this._options.Validate();
this._log = log ?? DefaultLogger<QAHandler>.Instance;
this._tokenCounter = DefaultGPTTokenizer.StaticCountTokens;
this._kernelService = kernelService;
}
/// <inheritdoc />
public string StepName { get; }
/// <inheritdoc />
public async Task<(bool success, DataPipeline updatedPipeline)> InvokeAsync(
DataPipeline pipeline, CancellationToken cancellationToken = default)
{
this._log.LogDebug("Partitioning text, pipeline '{0}/{1}'", pipeline.Index, pipeline.DocumentId);
if (pipeline.Files.Count == 0)
{
this._log.LogWarning("Pipeline '{0}/{1}': there are no files to process, moving to next pipeline step.", pipeline.Index, pipeline.DocumentId);
return (true, pipeline);
}
foreach (DataPipeline.FileDetails uploadedFile in pipeline.Files)
{
// Track new files being generated (cannot edit originalFile.GeneratedFiles while looping it)
Dictionary<string, DataPipeline.GeneratedFileDetails> newFiles = new();
foreach (KeyValuePair<string, DataPipeline.GeneratedFileDetails> generatedFile in uploadedFile.GeneratedFiles)
{
var file = generatedFile.Value;
if (file.AlreadyProcessedBy(this))
{
this._log.LogTrace("File {0} already processed by this handler", file.Name);
continue;
}
// Partition only the original text
if (file.ArtifactType != DataPipeline.ArtifactTypes.ExtractedText)
{
this._log.LogTrace("Skipping file {0} (not original text)", file.Name);
continue;
}
// Use a different partitioning strategy depending on the file type
List<string> partitions;
List<string> sentences;
BinaryData partitionContent = await this._orchestrator.ReadFileAsync(pipeline, file.Name, cancellationToken).ConfigureAwait(false);
// Skip empty partitions. Also: partitionContent.ToString() throws an exception if there are no bytes.
if (partitionContent.ToArray().Length == 0) { continue; }
switch (file.MimeType)
{
case MimeTypes.PlainText:
case MimeTypes.MarkDown:
{
this._log.LogDebug("Partitioning text file {0}", file.Name);
string content = partitionContent.ToString();
var kernel = _kernelService.GetKernelByAIModelID(StepName);
var lines = TextChunker.SplitPlainTextLines(content, 299);
var paragraphs = TextChunker.SplitPlainTextParagraphs(lines, 3000);
KernelFunction jsonFun = kernel.Plugins.GetFunction("KMSPlugin", "QA");
List<string> qaList = new List<string>();
foreach (var para in paragraphs)
{
var qaresult = await kernel.InvokeAsync(function: jsonFun, new KernelArguments() { ["input"] = para });
var qaListStr = qaresult.GetValue<string>().ConvertToString();
string pattern = @"Q\d+:.*?A\d+:.*?(?=(Q\d+:|$))";
RegexOptions options = RegexOptions.Singleline;
foreach (Match match in Regex.Matches(qaListStr, pattern, options))
{
qaList.Add(match.Value.Trim()); // Trim removes possible leading/trailing whitespace
}
}
sentences = qaList;
partitions = qaList;
break;
}
default:
this._log.LogWarning("File {0} cannot be partitioned, type '{1}' not supported", file.Name, file.MimeType);
// Don't partition other files
continue;
}
if (partitions.Count == 0) { continue; }
this._log.LogDebug("Saving {0} file partitions", partitions.Count);
for (int partitionNumber = 0; partitionNumber < partitions.Count; partitionNumber++)
{
// TODO: turn partitions in objects with more details, e.g. page number
string text = partitions[partitionNumber];
int sectionNumber = 0; // TODO: use this to store the page number (if any)
BinaryData textData = new(text);
int tokenCount = this._tokenCounter(text);
this._log.LogDebug("Partition size: {0} tokens", tokenCount);
var destFile = uploadedFile.GetPartitionFileName(partitionNumber);
await this._orchestrator.WriteFileAsync(pipeline, destFile, textData, cancellationToken).ConfigureAwait(false);
var destFileDetails = new DataPipeline.GeneratedFileDetails
{
Id = Guid.NewGuid().ToString("N"),
ParentId = uploadedFile.Id,
Name = destFile,
Size = text.Length,
MimeType = MimeTypes.PlainText,
ArtifactType = DataPipeline.ArtifactTypes.TextPartition,
PartitionNumber = partitionNumber,
SectionNumber = sectionNumber,
Tags = pipeline.Tags,
ContentSHA256 = textData.CalculateSHA256(),
};
newFiles.Add(destFile, destFileDetails);
destFileDetails.MarkProcessedBy(this);
}
file.MarkProcessedBy(this);
}
// Add new files to pipeline status
foreach (var file in newFiles)
{
uploadedFile.GeneratedFiles.Add(file.Key, file.Value);
}
}
return (true, pipeline);
}
}
}
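The regex in InvokeAsync assumes the model answers in a numbered Q/A layout; RegexOptions.Singleline lets '.' cross newlines, so each match spans one full question/answer pair. A worked example on a hypothetical model response:

// Hypothetical model output in the Q1:/A1: layout the handler expects.
string qaListStr = "Q1: What is AntSK?\nA1: A Semantic Kernel based app.\n" +
                   "Q2: Which vector store?\nA2: PostgreSQL with pgvector.";
string pattern = @"Q\d+:.*?A\d+:.*?(?=(Q\d+:|$))";
foreach (Match m in Regex.Matches(qaListStr, pattern, RegexOptions.Singleline))
{
    Console.WriteLine(m.Value.Trim()); // one complete Q/A pair per match
}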

View File

@@ -1,17 +1,22 @@
using AntSK.Domain.Common.DependencyInjection;
using AntSK.Domain.Domain.Interface;
using AntSK.Domain.Repositories;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using Microsoft.SemanticKernel;
using System.Text;
using AntSK.Domain.Utils;
using AntSK.Domain.Domain.Model.Dto;
using AntSK.Domain.Domain.Model;
using AntSK.Domain.Domain.Model.Constant;
using DocumentFormat.OpenXml.Drawing;
using System.Reflection.Metadata;
using Microsoft.KernelMemory;
using System.Collections.Generic;
using AntSK.Domain.Domain.Model.Dto;
using AntSK.Domain.Domain.Other.Bge;
using AntSK.Domain.Repositories;
using AntSK.Domain.Utils;
using AntSK.LLM.StableDiffusion;
using Markdig;
using Microsoft.KernelMemory;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using System.Diagnostics;
using System.Drawing;
using System.Runtime.InteropServices;
using System.Text;
using System.Text.RegularExpressions;
using ChatHistory = Microsoft.SemanticKernel.ChatCompletion.ChatHistory;
namespace AntSK.Domain.Domain.Service
{
@@ -19,7 +24,8 @@ namespace AntSK.Domain.Domain.Service
public class ChatService(
IKernelService _kernelService,
IKMService _kMService,
IKmsDetails_Repositories _kmsDetails_Repositories
IKmsDetails_Repositories _kmsDetails_Repositories,
IAIModels_Repositories _aIModels_Repositories
) : IChatService
{
/// <summary>
@@ -29,13 +35,31 @@ namespace AntSK.Domain.Domain.Service
/// <param name="questions"></param>
/// <param name="history"></param>
/// <returns></returns>
public async IAsyncEnumerable<StreamingKernelContent> SendChatByAppAsync(Apps app, string questions, string history)
public async IAsyncEnumerable<StreamingKernelContent> SendChatByAppAsync(Apps app, string questions, ChatHistory history)
{
if (string.IsNullOrEmpty(app.Prompt) || !app.Prompt.Contains("{{$input}}"))
{
//If the template is empty or lacks {{$input}}, fall back to a default prompt
app.Prompt = app.Prompt.ConvertToString() + "{{$input}}";
}
KernelArguments args = new KernelArguments();
if (history.Count > 10)
{
app.Prompt = @"${{ConversationSummaryPlugin.SummarizeConversation $history}}" + app.Prompt;
args = new() {
{ "history", string.Join("\n", history.Select(x => x.Role + ": " + x.Content)) },
{ "input", questions }
};
}
else
{
args = new()
{
{ "input", $"{string.Join("\n", history.Select(x => x.Role + ": " + x.Content))}{Environment.NewLine} user:{questions}" }
};
}
var _kernel = _kernelService.GetKernelByApp(app);
var temperature = app.Temperature / 100;//stored as 0~100, scaled down to 0~1
OpenAIPromptExecutionSettings settings = new() { Temperature = temperature };
@@ -45,20 +69,22 @@ namespace AntSK.Domain.Domain.Service
settings.ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions;
}
var func = _kernel.CreateFunctionFromPrompt(app.Prompt, settings);
var chatResult = _kernel.InvokeStreamingAsync(function: func, arguments: new KernelArguments() { ["input"] = $"{history}{Environment.NewLine} user:{questions}" });
var chatResult = _kernel.InvokeStreamingAsync(function: func,
arguments: args);
await foreach (var content in chatResult)
{
yield return content;
}
}
public async IAsyncEnumerable<StreamingKernelContent> SendKmsByAppAsync(Apps app, string questions, string history, string filePath, List<RelevantSource> relevantSources = null)
public async IAsyncEnumerable<StreamingKernelContent> SendKmsByAppAsync(Apps app, string questions, ChatHistory history, string filePath, List<RelevantSource> relevantSources = null)
{
var relevantSourceList = await _kMService.GetRelevantSourceList(app.KmsIdList, questions);
relevantSources?.Clear();
var relevantSourceList = await _kMService.GetRelevantSourceList(app, questions);
var _kernel = _kernelService.GetKernelByApp(app);
if (!string.IsNullOrWhiteSpace(filePath))
{
var memory = _kMService.GetMemory(app);
var memory = _kMService.GetMemoryByApp(app);
var fileId = Guid.NewGuid().ToString();
var result = await memory.ImportDocumentAsync(new Microsoft.KernelMemory.Document(fileId).AddFile(filePath)
.AddTag(KmsConstantcs.KmsIdTag, app.Id)
@@ -75,22 +101,87 @@ namespace AntSK.Domain.Domain.Service
})));
}
var dataMsg = new StringBuilder();
if (relevantSourceList.Any())
{
relevantSources?.AddRange(relevantSourceList);
foreach (var item in relevantSources)
if (!string.IsNullOrEmpty(app.RerankModelID))
{
dataMsg.AppendLine(item.ToString());
var rerankModel=_aIModels_Repositories.GetById(app.RerankModelID);
BegRerankConfig.LoadModel(rerankModel.EndPoint, rerankModel.ModelName);
//perform rerank
foreach (var item in relevantSourceList)
{
List<string> rerank = new List<string>();
rerank.Add(questions);
rerank.Add(item.Text);
item.RerankScore = BegRerankConfig.Rerank(rerank);
}
relevantSourceList = relevantSourceList.OrderByDescending(p => p.RerankScore).Take(app.MaxMatchesCount).ToList();
}
bool isSearch = false;
foreach (var item in relevantSourceList)
{
if (!string.IsNullOrEmpty(app.RerankModelID))
{
//compare against the reranked similarity score
if (item.RerankScore >= app.Relevance / 100)
{
dataMsg.AppendLine(item.ToString());
isSearch = true;
}
}
else
{
//compare against the similarity score
if (item.Relevance >= app.Relevance / 100)
{
dataMsg.AppendLine(item.ToString());
isSearch = true;
}
}
}
KernelFunction jsonFun = _kernel.Plugins.GetFunction("KMSPlugin", "Ask");
var chatResult = _kernel.InvokeStreamingAsync(function: jsonFun,
arguments: new KernelArguments() { ["doc"] = dataMsg, ["history"] = history, ["questions"] = questions });
await foreach (var content in chatResult)
//prepare markdown rendering
relevantSources?.AddRange(relevantSourceList);
Dictionary<string, string> fileDic = new Dictionary<string, string>();
foreach (var item in relevantSourceList)
{
yield return content;
if (fileDic.ContainsKey(item.SourceName))
{
item.SourceName = fileDic[item.SourceName];
}
else
{
string fileName = _kmsDetails_Repositories.GetFirst(p => p.FileGuidName == item.SourceName).FileName;
fileDic.Add(item.SourceName, fileName);
item.SourceName = fileName;
}
item.Text = Markdown.ToHtml(item.Text);
}
if (isSearch)
{
//KernelFunction jsonFun = _kernel.Plugins.GetFunction("KMSPlugin", "Ask1");
var temperature = app.Temperature / 100;//stored as 0~100, scaled down to 0~1
OpenAIPromptExecutionSettings settings = new() { Temperature = temperature };
var func = _kernel.CreateFunctionFromPrompt(app.Prompt, settings);
var chatResult = _kernel.InvokeStreamingAsync(function: func,
arguments: new KernelArguments() { ["doc"] = dataMsg.ToString(), ["history"] = string.Join("\n", history.Select(x => x.Role + ": " + x.Content)), ["input"] = questions });
await foreach (var content in chatResult)
{
yield return content;
}
}
else
{
yield return new StreamingTextContent(KmsConstantcs.KmsSearchNull);
}
}
else
@@ -98,5 +189,133 @@ namespace AntSK.Domain.Domain.Service
yield return new StreamingTextContent(KmsConstantcs.KmsSearchNull);
}
}
public async Task<string> SendImgByAppAsync(Apps app, string questions)
{
var imageModel = _aIModels_Repositories.GetFirst(p => p.Id == app.ImageModelID);
KernelArguments args = new() {
{ "input", questions }
};
var _kernel = _kernelService.GetKernelByApp(app);
var temperature = app.Temperature / 100; //stored as 0~100, scaled down to 0~1
OpenAIPromptExecutionSettings settings = new() { Temperature = temperature };
var func = _kernel.CreateFunctionFromPrompt("Translate this into English:{{$input}}", settings);
var chatResult = await _kernel.InvokeAsync(function: func, arguments: args);
if (chatResult.IsNotNull())
{
//Can load the stable-diffusion library in different environments
//SDHelper.LoadLibrary()
string versionString = string.Empty;
string extensionString = string.Empty;
if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
{
extensionString = ".dll";
}
else if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
{
extensionString = ".so";
}
else
{
throw new InvalidOperationException("OS Platform no support");
}
ProcessStartInfo startInfo = new ProcessStartInfo("nvcc", "--version");
startInfo.RedirectStandardOutput = true;
startInfo.UseShellExecute = false;
startInfo.CreateNoWindow = true;
using (Process process = Process.Start(startInfo))
{
if (process != null)
{
string result = process.StandardOutput.ReadToEnd();
Regex regex = new Regex(@"release (\d+).[\d]");
Match match = regex.Match(result);
if (match.Success)
{
switch (match.Groups[1].Value.ToString())
{
case "11":
versionString = "Cuda11";
break;
case "12":
versionString = "Cuda12";
break;
default:
versionString = "CPU";
break;
}
}
}
else
{
throw new Exception("nvcc get an error");
}
}
string libraryPath = System.IO.Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "StableDiffusion", "Backend", versionString, "stable-diffusion" + extensionString);
NativeLibrary.TryLoad(libraryPath, out _);
string prompt = chatResult.GetValue<string>();
if (!SDHelper.IsInitialized)
{
Structs.ModelParams modelParams = new Structs.ModelParams
{
ModelPath = imageModel.ModelName,
RngType = Structs.RngType.CUDA_RNG,
//VaePath = vaePath,
//KeepVaeOnCpu = keepVaeOnCpu,
//set false for a better image; set true to use less VRAM
VaeTiling = false,
//LoraModelDir = loraModelDir,
};
bool result = SDHelper.Initialize(modelParams);
}
Structs.TextToImageParams textToImageParams = new Structs.TextToImageParams
{
Prompt = prompt,
NegativePrompt = "bad quality, wrong image, worst quality",
SampleMethod = (Structs.SampleMethod)Enum.Parse(typeof(Structs.SampleMethod), "EULER_A"),
//the base image size in SD1.5 is 512x512
Width = 512,
Height = 512,
NormalizeInput = true,
ClipSkip = -1,
CfgScale = 7,
SampleSteps = 20,
Seed = -1,
};
Bitmap[] outputImages = SDHelper.TextToImage(textToImageParams);
var base64 = ImageUtils.BitmapToBase64(outputImages[0]);
return base64;
}
else
{
return "";
}
}
public async Task<ChatHistory> GetChatHistory(List<Chats> MessageList)
{
ChatHistory history = new ChatHistory();
if (MessageList.Count > 1)
{
foreach (var item in MessageList)
{
if (item.IsSend)
{
history.AddUserMessage(item.Context);
}
else
{
history.AddAssistantMessage(item.Context);
}
}
}
return history;
}
}
}
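The rerank branch above loads a BGE reranker, scores every hit against the question, keeps the top MaxMatchesCount, and only then applies the 0-100 relevance threshold. A condensed sketch of the same flow, using the BegRerankConfig helper and RelevantSource fields shown in the diff:

// Mirrors the rerank branch: score each hit against the question,
// keep the best MaxMatchesCount, then apply the 0-100 relevance threshold.
BegRerankConfig.LoadModel(rerankModel.EndPoint, rerankModel.ModelName);
foreach (var item in relevantSourceList)
{
    item.RerankScore = BegRerankConfig.Rerank(new List<string> { questions, item.Text });
}
var kept = relevantSourceList
    .OrderByDescending(p => p.RerankScore)
    .Take(app.MaxMatchesCount)
    .Where(p => p.RerankScore >= app.Relevance / 100) // Relevance stored as 0-100
    .ToList();

Collapsing the two loops into one LINQ pipeline makes the ordering explicit: truncation happens before thresholding, exactly as in the service code.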

View File

@@ -2,8 +2,12 @@
using AntSK.Domain.Domain.Interface;
using AntSK.Domain.Domain.Model;
using AntSK.Domain.Domain.Model.Constant;
using AntSK.Domain.Domain.Model.Excel;
using AntSK.Domain.Domain.Other;
using AntSK.Domain.Repositories;
using Microsoft.KernelMemory;
using Microsoft.KernelMemory.Handlers;
using System.Text;
namespace AntSK.Domain.Domain.Service
{
@@ -20,18 +24,40 @@ namespace AntSK.Domain.Domain.Service
try
{
var km = _kmss_Repositories.GetFirst(p => p.Id == req.KmsId);
var _memory = _kMService.GetMemoryByKMS(km.Id);
string fileid = req.KmsDetail.Id;
List<string> step = new List<string>();
if (req.IsQA)
{
_memory.Orchestrator.AddHandler<TextExtractionHandler>("extract_text");
_memory.Orchestrator.AddHandler<QAHandler>(km.ChatModelID);
_memory.Orchestrator.AddHandler<GenerateEmbeddingsHandler>("generate_embeddings");
_memory.Orchestrator.AddHandler<SaveRecordsHandler>("save_memory_records");
step.Add("extract_text");
step.Add(km.ChatModelID);
step.Add("generate_embeddings");
step.Add("save_memory_records");
}
switch (req.ImportType)
{
case ImportType.File:
//import file
{
var importResult = _memory.ImportDocumentAsync(new Document(fileid)
.AddFile(req.FilePath)
.AddTag(KmsConstantcs.KmsIdTag, req.KmsId)
, index: KmsConstantcs.KmsIndex).Result;
//import file
if (req.IsQA)
{
var importResult = _memory.ImportDocumentAsync(new Document(fileid)
.AddFile(req.FilePath)
.AddTag(KmsConstantcs.KmsIdTag, req.KmsId)
,index: KmsConstantcs.KmsIndex ,steps: step.ToArray()).Result;
}
else
{
var importResult = _memory.ImportDocumentAsync(new Document(fileid)
.AddFile(req.FilePath)
.AddTag(KmsConstantcs.KmsIdTag, req.KmsId)
, index: KmsConstantcs.KmsIndex).Result;
}
//query document count
var docTextList = _kMService.GetDocumentByFileID(km.Id, fileid).Result;
string fileGuidName = Path.GetFileName(req.FilePath);
@@ -44,8 +70,16 @@ namespace AntSK.Domain.Domain.Service
case ImportType.Url:
{
//import from URL
var importResult = _memory.ImportWebPageAsync(req.Url, fileid, new TagCollection() { { KmsConstantcs.KmsIdTag, req.KmsId } }
, index: KmsConstantcs.KmsIndex).Result;
if (req.IsQA)
{
var importResult = _memory.ImportWebPageAsync(req.Url, fileid, new TagCollection() { { KmsConstantcs.KmsIdTag, req.KmsId } }
, index: KmsConstantcs.KmsIndex, steps: step.ToArray()).Result;
}
else
{
var importResult = _memory.ImportWebPageAsync(req.Url, fileid, new TagCollection() { { KmsConstantcs.KmsIdTag, req.KmsId } }
, index: KmsConstantcs.KmsIndex).Result;
}
//query document count
var docTextList = _kMService.GetDocumentByFileID(km.Id, fileid).Result;
req.KmsDetail.Url = req.Url;
@@ -55,8 +89,16 @@ namespace AntSK.Domain.Domain.Service
case ImportType.Text:
//import text
{
var importResult = _memory.ImportTextAsync(req.Text, fileid, new TagCollection() { { KmsConstantcs.KmsIdTag, req.KmsId } }
, index: KmsConstantcs.KmsIndex).Result;
if (req.IsQA)
{
var importResult = _memory.ImportTextAsync(req.Text, fileid, new TagCollection() { { KmsConstantcs.KmsIdTag, req.KmsId } }
, index: KmsConstantcs.KmsIndex, steps: step.ToArray()).Result;
}
else
{
var importResult = _memory.ImportTextAsync(req.Text, fileid, new TagCollection() { { KmsConstantcs.KmsIdTag, req.KmsId } }
, index: KmsConstantcs.KmsIndex).Result;
}
//query document count
var docTextList = _kMService.GetDocumentByFileID(km.Id, fileid).Result;
req.KmsDetail.Url = req.Url;
@@ -64,6 +106,36 @@ namespace AntSK.Domain.Domain.Service
}
break;
case ImportType.Excel:
using (var fs = File.OpenRead(req.FilePath))
{
var excelList= ExeclHelper.ExcelToList<KMSExcelModel>(fs);
_memory.Orchestrator.AddHandler<TextExtractionHandler>("extract_text");
_memory.Orchestrator.AddHandler<KMExcelHandler>("antsk_excel_split");
_memory.Orchestrator.AddHandler<GenerateEmbeddingsHandler>("generate_embeddings");
_memory.Orchestrator.AddHandler<SaveRecordsHandler>("save_memory_records");
StringBuilder text = new StringBuilder();
foreach (var item in excelList)
{
text.AppendLine(@$"Question:{item.Question}{Environment.NewLine}Answer:{item.Answer}{KmsConstantcs.KMExcelSplit}");
}
var importResult = _memory.ImportTextAsync(text.ToString(), fileid, new TagCollection() { { KmsConstantcs.KmsIdTag, req.KmsId } }
, index: KmsConstantcs.KmsIndex,
steps: new[]
{
"extract_text",
"antsk_excel_split",
"generate_embeddings",
"save_memory_records"
}
).Result;
req.KmsDetail.FileName = req.FileName;
string fileGuidName = Path.GetFileName(req.FilePath);
req.KmsDetail.FileGuidName = fileGuidName;
req.KmsDetail.DataCount = excelList.Count();
}
break;
}
req.KmsDetail.Status = Model.Enum.ImportKmsStatus.Success;
_kmsDetails_Repositories.Update(req.KmsDetail);
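The IsQA branch above swaps in a custom pipeline: the standard extract_text step is followed by the QA handler (registered under the chat model's ID) instead of the default partitioning step. A condensed sketch of the same idea, with names taken from the diff (memory is a built MemoryServerless):

// Register the handlers once, then pass their step names per import.
memory.Orchestrator.AddHandler<TextExtractionHandler>("extract_text");
memory.Orchestrator.AddHandler<QAHandler>(km.ChatModelID); // QA split via the LLM
memory.Orchestrator.AddHandler<GenerateEmbeddingsHandler>("generate_embeddings");
memory.Orchestrator.AddHandler<SaveRecordsHandler>("save_memory_records");

var steps = new[] { "extract_text", km.ChatModelID, "generate_embeddings", "save_memory_records" };
await memory.ImportDocumentAsync(
    new Document(fileid).AddFile(req.FilePath).AddTag(KmsConstantcs.KmsIdTag, req.KmsId),
    index: KmsConstantcs.KmsIndex,
    steps: steps);

The sketch awaits the import instead of blocking on .Result as the service does; in async contexts that avoids potential deadlocks.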

View File

@@ -1,5 +1,6 @@
using AntDesign;
using AntSK.Domain.Common.DependencyInjection;
using AntSK.Domain.Common.Embedding;
using AntSK.Domain.Domain.Interface;
using AntSK.Domain.Domain.Model.Constant;
using AntSK.Domain.Domain.Model.Dto;
@@ -7,6 +8,7 @@ using AntSK.Domain.Domain.Other;
using AntSK.Domain.Options;
using AntSK.Domain.Repositories;
using AntSK.Domain.Utils;
using AntSK.OCR;
using DocumentFormat.OpenXml.Drawing.Diagrams;
using LLama;
using LLamaSharp.KernelMemory;
@@ -15,6 +17,7 @@ using Microsoft.AspNetCore.Components;
using Microsoft.Extensions.Configuration;
using Microsoft.KernelMemory;
using Microsoft.KernelMemory.Configuration;
using Microsoft.KernelMemory.DataFormats;
using Microsoft.KernelMemory.FileSystem.DevTools;
using Microsoft.KernelMemory.MemoryStorage;
using Microsoft.KernelMemory.MemoryStorage.DevTools;
@@ -26,7 +29,8 @@ namespace AntSK.Domain.Domain.Service
public class KMService(
IKmss_Repositories _kmss_Repositories,
IAIModels_Repositories _aIModels_Repositories,
IMessageService? _message
IMessageService? _message,
IKernelService _kernelService
) : IKMService
{
private MemoryServerless _memory;
@@ -35,20 +39,36 @@ namespace AntSK.Domain.Domain.Service
public List<UploadFileItem> FileList => _fileList;
public MemoryServerless GetMemory(Apps app)
public MemoryServerless GetMemoryByApp(Apps app)
{
var chatModel = _aIModels_Repositories.GetFirst(p => p.Id == app.ChatModelID);
var embedModel = _aIModels_Repositories.GetFirst(p => p.Id == app.EmbeddingModelID);
var chatHttpClient = OpenAIHttpClientHandlerUtil.GetHttpClient(chatModel.EndPoint);
var embeddingHttpClient = OpenAIHttpClientHandlerUtil.GetHttpClient(embedModel.EndPoint);
var searchClientConfig = new SearchClientConfig
SearchClientConfig searchClientConfig;
if (string.IsNullOrEmpty(app.RerankModelID))
{
MaxAskPromptSize = 2048,
MaxMatchesCount = 3,
AnswerTokens = 1000,
EmptyAnswer = KmsConstantcs.KmsSearchNull
};
//no rerank: take the query match count directly
searchClientConfig = new SearchClientConfig
{
MaxAskPromptSize = app.MaxAskPromptSize,
MaxMatchesCount = app.MaxMatchesCount,
AnswerTokens = app.AnswerTokens,
EmptyAnswer = KmsConstantcs.KmsSearchNull
};
}
else
{
//rerank: take the rerank candidate count
searchClientConfig = new SearchClientConfig
{
MaxAskPromptSize = app.MaxAskPromptSize,
MaxMatchesCount = app.RerankCount,
AnswerTokens = app.AnswerTokens,
EmptyAnswer = KmsConstantcs.KmsSearchNull
};
}
var memoryBuild = new KernelMemoryBuilder()
.WithSearchClientConfig(searchClientConfig)
@@ -70,7 +90,7 @@ namespace AntSK.Domain.Domain.Service
return _memory;
}
public MemoryServerless GetMemoryByKMS(string kmsID, SearchClientConfig searchClientConfig = null)
public MemoryServerless GetMemoryByKMS(string kmsID)
{
//if (_memory.IsNull())
{
@@ -84,33 +104,35 @@ namespace AntSK.Domain.Domain.Service
var embeddingHttpClient = OpenAIHttpClientHandlerUtil.GetHttpClient(embedModel.EndPoint);
//search configuration
if (searchClientConfig.IsNull())
{
searchClientConfig = new SearchClientConfig
{
MaxAskPromptSize = 2048,
MaxMatchesCount = 3,
AnswerTokens = 1000,
EmptyAnswer = KmsConstantcs.KmsSearchNull
};
}
//if (searchClientConfig.IsNull())
//{
// searchClientConfig = new SearchClientConfig
// {
// MaxAskPromptSize = 2048,
// MaxMatchesCount = 3,
// AnswerTokens = 1000,
// EmptyAnswer = KmsConstantcs.KmsSearchNull
// };
//}
var memoryBuild = new KernelMemoryBuilder()
.WithSearchClientConfig(searchClientConfig)
//.WithSearchClientConfig(searchClientConfig)
.WithCustomTextPartitioningOptions(new TextPartitioningOptions
{
MaxTokensPerLine = kms.MaxTokensPerLine,
MaxTokensPerParagraph = kms.MaxTokensPerParagraph,
OverlappingTokens = kms.OverlappingTokens
});
//load OCR
WithOcr(memoryBuild, kms);
//load chat model
WithTextGenerationByAIType(memoryBuild, chatModel, chatHttpClient);
//load embedding model
WithTextEmbeddingGenerationByAIType(memoryBuild, embedModel, embeddingHttpClient);
//load vector store
WithMemoryDbByVectorDB(memoryBuild);
_memory = memoryBuild.Build<MemoryServerless>();
_memory = memoryBuild.AddSingleton<IKernelService>(_kernelService).Build<MemoryServerless>();
return _memory;
}
//else {
@@ -118,6 +140,14 @@ namespace AntSK.Domain.Domain.Service
//}
}
private static void WithOcr(IKernelMemoryBuilder memoryBuild, Kmss kms)
{
if (kms.IsOCR == 1)
{
memoryBuild.WithCustomImageOcr(new AntSKOcrEngine());
}
}
private void WithTextEmbeddingGenerationByAIType(IKernelMemoryBuilder memory, AIModels embedModel,
HttpClient embeddingHttpClient)
{
@@ -147,6 +177,11 @@ namespace AntSK.Domain.Domain.Service
var embedder = new LLamaEmbedder(weights, parameters);
memory.WithLLamaSharpTextEmbeddingGeneration(new LLamaSharpTextEmbeddingGenerator(embedder));
break;
case Model.Enum.AIType.BgeEmbedding:
string pyDll = embedModel.EndPoint;
string bgeEmbeddingModelName = embedModel.ModelName;
memory.WithBgeTextEmbeddingGeneration(new HuggingfaceTextEmbeddingGenerator(pyDll,bgeEmbeddingModelName));
break;
case Model.Enum.AIType.DashScope:
memory.WithDashScopeDefaults(embedModel.ModelKey);
break;
@@ -183,6 +218,14 @@ namespace AntSK.Domain.Domain.Service
var executor = new StatelessExecutor(weights, parameters);
memory.WithLLamaSharpTextGeneration(new LlamaSharpTextGenerator(weights, context, executor));
break;
case Model.Enum.AIType.LLamaFactory:
memory.WithOpenAITextGeneration(new OpenAIConfig()
{
APIKey = "NotNull",
TextModel = chatModel.ModelName
}, null, chatHttpClient);
break;
case Model.Enum.AIType.DashScope:
memory.WithDashScopeTextGeneration(new Cnblogs.KernelMemory.AI.DashScope.DashScopeConfig
{
@@ -220,6 +263,20 @@ namespace AntSK.Domain.Domain.Service
StorageType = FileSystemTypes.Volatile
});
break;
case "Qdrant":
var qdrantConfig = ConnectionString.Split("|");
memory.WithQdrantMemoryDb(qdrantConfig[0],qdrantConfig[1]);
break;
case "Redis":
memory.WithRedisMemoryDb(new RedisConfig()
{
ConnectionString = ConnectionString,
});
break;
case "AzureAISearch":
var aisearchConfig = ConnectionString.Split("|");
memory.WithAzureAISearchMemoryDb(aisearchConfig[0], aisearchConfig[1]);
break;
}
}
@@ -234,7 +291,7 @@ namespace AntSK.Domain.Domain.Service
{
foreach (var memoryDb in memoryDbs)
{
var items = await memoryDb.GetListAsync(memoryIndex.Name, new List<MemoryFilter>() { new MemoryFilter().ByDocument(fileId) }, 100, true).ToListAsync();
var items = await memoryDb.GetListAsync(memoryIndex.Name, new List<MemoryFilter>() { new MemoryFilter().ByDocument(fileId) }, 1000, true).ToListAsync();
docTextList.AddRange(items.Select(item => new KMFile()
{
DocumentId = item.GetDocumentId(),
@@ -249,15 +306,15 @@ namespace AntSK.Domain.Domain.Service
return docTextList;
}
public async Task<List<RelevantSource>> GetRelevantSourceList(string kmsIdListStr, string msg)
public async Task<List<RelevantSource>> GetRelevantSourceList(Apps app ,string msg)
{
var result = new List<RelevantSource>();
if (string.IsNullOrWhiteSpace(kmsIdListStr))
if (string.IsNullOrWhiteSpace(app.KmsIdList))
return result;
var kmsIdList = kmsIdListStr.Split(",");
var kmsIdList = app.KmsIdList.Split(",");
if (!kmsIdList.Any()) return result;
var memory = GetMemoryByKMS(kmsIdList.FirstOrDefault()!);
var memory = GetMemoryByApp(app);
var filters = kmsIdList.Select(kmsId => new MemoryFilter().ByTag(KmsConstantcs.KmsIdTag, kmsId)).ToList();
@@ -269,7 +326,7 @@ namespace AntSK.Domain.Domain.Service
result.AddRange(item.Partitions.Select(part => new RelevantSource()
{
SourceName = item.SourceName,
Text = Markdown.ToHtml(part.Text),
Text = part.Text,
Relevance = part.Relevance
}));
}
@@ -291,7 +348,10 @@ namespace AntSK.Domain.Domain.Service
"application/pdf",
"application/json",
"text/x-markdown",
"text/markdown"
"text/markdown",
"image/jpeg",
"image/png",
"image/tiff"
};
string[] exceptExts = [".md", ".pdf"];
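The new vector-store cases parse a pipe-separated connection string: judging by the Split("|") calls above, Qdrant and AzureAISearch expect endpoint|apiKey, while Redis takes a plain connection string. A small sketch of that inferred convention (the value shown is hypothetical):

// Inferred convention: "endpoint|apiKey" for Qdrant and AzureAISearch.
string ConnectionString = "https://qdrant.example.com:6333|my-api-key"; // hypothetical value
var qdrantConfig = ConnectionString.Split("|");
string endpoint = qdrantConfig[0]; // service URL
string apiKey = qdrantConfig[1];   // credential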

View File

@@ -15,8 +15,11 @@ using System;
using ServiceLifetime = AntSK.Domain.Common.DependencyInjection.ServiceLifetime;
using AntSK.LLM.Mock;
using AntSK.Domain.Domain.Model.Enum;
using AntSK.LLM.LLamaFactory;
using System.Reflection;
using DocumentFormat.OpenXml.Drawing;
using Microsoft.KernelMemory;
using OpenCvSharp.ML;
namespace AntSK.Domain.Domain.Service
{
@@ -56,7 +59,7 @@ namespace AntSK.Domain.Domain.Service
var chatHttpClient = OpenAIHttpClientHandlerUtil.GetHttpClient(chatModel.EndPoint);
var builder = Kernel.CreateBuilder();
WithTextGenerationByAIType(builder, app, chatModel, chatHttpClient);
WithTextGenerationByAIType(builder, chatModel, chatHttpClient);
_kernel = builder.Build();
RegisterPluginsWithKernel(_kernel);
@@ -68,7 +71,18 @@ namespace AntSK.Domain.Domain.Service
//}
}
private void WithTextGenerationByAIType(IKernelBuilder builder, Apps app, AIModels chatModel, HttpClient chatHttpClient)
public Kernel GetKernelByAIModelID(string modelid)
{
var chatModel = _aIModels_Repositories.GetById(modelid);
var chatHttpClient = OpenAIHttpClientHandlerUtil.GetHttpClient(chatModel.EndPoint);
var builder = Kernel.CreateBuilder();
WithTextGenerationByAIType(builder, chatModel, chatHttpClient);
_kernel = builder.Build();
RegisterPluginsWithKernel(_kernel);
return _kernel;
}
private void WithTextGenerationByAIType(IKernelBuilder builder,AIModels chatModel, HttpClient chatHttpClient)
{
switch (chatModel.AIType)
{
@@ -95,7 +109,7 @@ namespace AntSK.Domain.Domain.Service
case Model.Enum.AIType.SparkDesk:
var options = new SparkDeskOptions { AppId = chatModel.EndPoint, ApiSecret = chatModel.ModelKey, ApiKey = chatModel.ModelName, ModelVersion = Sdcb.SparkDesk.ModelVersion.V3_5 };
builder.Services.AddKeyedSingleton<ITextGenerationService>("spark-desk", new SparkDeskTextCompletion(options, app.Id));
builder.Services.AddKeyedSingleton<ITextGenerationService>("spark-desk", new SparkDeskTextCompletion(options, chatModel.Id));
break;
case Model.Enum.AIType.DashScope:
@@ -105,6 +119,13 @@ namespace AntSK.Domain.Domain.Service
case Model.Enum.AIType.Mock:
builder.Services.AddKeyedSingleton<ITextGenerationService>("mock", new MockTextCompletion());
break;
case Model.Enum.AIType.LLamaFactory:
builder.AddOpenAIChatCompletion(
modelId: chatModel.ModelName,
apiKey: "123",
httpClient: chatHttpClient
);
break;
}
}
@@ -154,7 +175,7 @@ namespace AntSK.Domain.Domain.Service
new KernelParameterMetadata("jsonbody"){
Name="json参数字符串",
ParameterType=typeof(string),
Description=$"需要根据背景文档:{Environment.NewLine}{api.InputPrompt} {Environment.NewLine}提取出对应的json格式字符串参考如下格式:{Environment.NewLine}{api.Query}"
Description=$"背景文档:{Environment.NewLine}{api.InputPrompt} {Environment.NewLine}提取出对应的json格式字符串参考如下格式:{Environment.NewLine}{api.Query}"
}
};
functions.Add(_kernel.CreateFunctionFromMethod((string jsonbody) =>
@@ -193,7 +214,7 @@ namespace AntSK.Domain.Domain.Service
new KernelParameterMetadata("jsonbody"){
Name="json参数字符串",
ParameterType=typeof(string),
Description=$"需要根据背景文档:{Environment.NewLine}{api.InputPrompt} {Environment.NewLine}提取出对应的json格式字符串参考如下格式:{Environment.NewLine}{api.JsonBody}"
Description=$"背景文档:{Environment.NewLine}{api.InputPrompt} {Environment.NewLine}提取出对应的json格式字符串参考如下格式:{Environment.NewLine}{api.JsonBody}"
}
};
functions.Add(_kernel.CreateFunctionFromMethod((string jsonBody) =>
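GetKernelByAIModelID builds a kernel straight from a stored model record; the QA import pipeline exploits this by registering QAHandler under the chat model's ID, so the handler can recover its model from nothing but its own step name. A short usage sketch (the model ID value is hypothetical):

// Resolve a kernel from a model record by ID (ID shown is hypothetical).
Kernel kernel = _kernelService.GetKernelByAIModelID("a1b2c3d4");
KernelFunction qa = kernel.Plugins.GetFunction("KMSPlugin", "QA");
var result = await kernel.InvokeAsync(qa, new KernelArguments { ["input"] = "some paragraph" });
Console.WriteLine(result.GetValue<string>());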

View File

@@ -0,0 +1,170 @@
using AntSK.Domain.Common.DependencyInjection;
using AntSK.Domain.Domain.Interface;
using AntSK.Domain.Domain.Model.Dto;
using AntSK.Domain.Options;
using AntSK.LLamaFactory.Model;
using Microsoft.AspNetCore.Mvc.ModelBinding;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Tracing;
using System.Linq;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
namespace AntSK.Domain.Domain.Service
{
[ServiceDescription(typeof(ILLamaFactoryService), ServiceLifetime.Singleton)]
public class LLamaFactoryService : ILLamaFactoryService
{
private Process process;
public static bool isProcessComplete = false;
private readonly object _syncLock = new object();
private List<LLamaModel> modelList = new List<LLamaModel>();
public LLamaFactoryService() { }
public delegate Task LogMessageHandler(string message);
public event LogMessageHandler LogMessageReceived;
protected virtual async Task OnLogMessageReceived(string message)
{
LogMessageReceived?.Invoke(message);
}
public async Task PipInstall()
{
var cmdTask = Task.Factory.StartNew(() =>
{
var isProcessComplete = false;
process = new Process
{
StartInfo = new ProcessStartInfo
{
FileName = "pip",
Arguments = "install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple",
UseShellExecute = false,
RedirectStandardOutput = true,
RedirectStandardError = true,
WorkingDirectory = AppDomain.CurrentDomain.BaseDirectory,
}
};
process.OutputDataReceived += (sender, eventArgs) =>
{
Console.WriteLine($"{eventArgs.Data}");
OnLogMessageReceived(eventArgs.Data);
};
process.ErrorDataReceived += (sender, eventArgs) =>
{
Console.WriteLine($"{eventArgs.Data}");
OnLogMessageReceived(eventArgs.Data);
};
process.Start();
process.BeginOutputReadLine();
process.BeginErrorReadLine();
process.WaitForExit();
OnLogMessageReceived("--------------------完成--------------------");
}, TaskCreationOptions.LongRunning);
await cmdTask;
}
public async Task StartLLamaFactory(string modelName, string templateName)
{
var cmdTask = Task.Factory.StartNew(() =>
{
var isProcessComplete = false;
process = new Process
{
StartInfo = new ProcessStartInfo
{
FileName = "python",
Arguments = "api_demo.py --model_name_or_path " + modelName + " --template " + templateName + " ",
UseShellExecute = false,
RedirectStandardOutput = true,
RedirectStandardError=true,
WorkingDirectory = Path.Combine(Path.GetDirectoryName(System.Reflection.Assembly.GetEntryAssembly().Location), "llamafactory"),
}
};
process.StartInfo.Environment["CUDA_VISIBLE_DEVICES"] = Environment.GetEnvironmentVariable("CUDA_VISIBLE_DEVICES") ?? "0";
process.StartInfo.Environment["API_PORT"] = "8000";
process.StartInfo.EnvironmentVariables["USE_MODELSCOPE_HUB"] = Environment.GetEnvironmentVariable("USE_MODELSCOPE_HUB") ?? "1";
process.OutputDataReceived += (sender, eventArgs) =>
{
Console.WriteLine($"{eventArgs.Data}");
OnLogMessageReceived(eventArgs.Data);
};
process.ErrorDataReceived += (sender, eventArgs) =>
{
Console.WriteLine($"{eventArgs.Data}");
OnLogMessageReceived(eventArgs.Data);
};
process.Start();
process.BeginOutputReadLine();
process.BeginErrorReadLine();
process.WaitForExit();
OnLogMessageReceived("--------------------完成--------------------");
}, TaskCreationOptions.LongRunning);
await cmdTask;
}
private void Process_OutputDataReceived(object sender, DataReceivedEventArgs e)
{
throw new NotImplementedException();
}
public string WaitForProcessExit()
{
process.WaitForExit();
return process.StandardOutput.ReadToEnd();
}
public void KillProcess()
{
try
{
Process[] processes = Process.GetProcesses();
foreach (Process process1 in processes)
{
if (process1.ProcessName.ToLower() == "python")
{
process1.Kill();
System.Console.WriteLine("kill python");
}
}
}
catch (InvalidOperationException ex)
{
// Process already exited.
}
}
public List<LLamaModel> GetLLamaFactoryModels()
{
if (modelList.Count==0)
{
string jsonString = File.ReadAllText(Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "modelList.json"));
// Deserialize the JSON string into the corresponding C# objects
var Models = JsonConvert.DeserializeObject<List<LLamaFactoryModel>>(jsonString);
foreach (var model in Models)
{
foreach (var m in model.Models)
{
modelList.Add(new LLamaModel() { Name=m.Key, ModelScope=m.Value.MODELSCOPE });
}
}
}
return modelList;
}
}
}
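LLamaFactoryService surfaces the Python process's stdout/stderr through the LogMessageReceived event, so callers can stream install and startup logs to the UI. A minimal subscription sketch (model and template names are hypothetical):

var factory = new LLamaFactoryService();
factory.LogMessageReceived += message =>
{
    Console.WriteLine(message); // or push to the UI, e.g. via a Blazor component
    return Task.CompletedTask;
};
await factory.PipInstall(); // pip install -r requirements.txt
await factory.StartLLamaFactory("Qwen/Qwen1.5-7B-Chat", "qwen"); // hypothetical model/template names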

View File

@@ -3,10 +3,6 @@
public class LLamaSharpOption
{
public static string RunType { get; set; }
public static string Chat { get; set; }
public static string Embedding { get; set; }
public static string FileDirectory { get; set; } = Directory.GetCurrentDirectory();
}
}

View File

@@ -44,6 +44,10 @@ namespace AntSK.Domain.Repositories
/// </summary>
public string? EmbeddingModelID { get; set; }
public string? RerankModelID { get; set; }
public string? ImageModelID { get; set; }
/// <summary>
/// Temperature
/// </summary>
@@ -53,6 +57,7 @@ namespace AntSK.Domain.Repositories
/// <summary>
/// Prompt
/// </summary>
[SugarColumn(ColumnDataType = "varchar(2000)")]
public string? Prompt { get; set; }
/// <summary>
@@ -76,5 +81,31 @@ namespace AntSK.Domain.Repositories
/// API access key
/// </summary>
public string? SecretKey { get; set; }
/// <summary>
/// Relevance (similarity threshold)
/// </summary>
[SugarColumn(DefaultValue = "70")]
public double Relevance { get; set; } = 70f;
/// <summary>
/// Max prompt tokens
/// </summary>
[SugarColumn(DefaultValue = "2048")]
public int MaxAskPromptSize { get; set; } = 2048;
/// <summary>
/// Vector match count
/// </summary>
[SugarColumn(DefaultValue = "3")]
public int MaxMatchesCount { get; set; } = 3;
[SugarColumn(DefaultValue = "20")]
public int RerankCount { get; set; } = 20;
/// <summary>
/// Max answer tokens
/// </summary>
[SugarColumn(DefaultValue = "2048")]
public int AnswerTokens { get; set; } = 2048;
}
}

View File

@@ -0,0 +1,41 @@
using AntSK.Domain.Domain.Model.Enum;
using SqlSugar;
using System.ComponentModel.DataAnnotations;
namespace AntSK.Domain.Repositories
{
[SugarTable("Chats")]
public partial class Chats
{
[SugarColumn(IsPrimaryKey = true)]
public string Id { get; set; }
/// <summary>
/// User name
/// </summary>
public string UserName { get; set; }
/// <summary>
/// Application ID
/// </summary>
public string AppId { get; set; }
/// <summary>
/// Message content
/// </summary>
[SugarColumn(ColumnDataType = "varchar(4000)")]
public string Context { get; set; } = "";
/// <summary>
/// true = sent, false = received
/// </summary>
public bool IsSend { get; set; } = false;
/// <summary>
/// Creation time
/// </summary>
public DateTime CreateTime { get; set; }
/// <summary>
/// File name
/// </summary>
public string? FileName { get; set; }
}
}

View File

@@ -0,0 +1,11 @@

using AntSK.Domain.Common.DependencyInjection;
using AntSK.Domain.Repositories.Base;
namespace AntSK.Domain.Repositories
{
[ServiceDescription(typeof(IChats_Repositories), ServiceLifetime.Scoped)]
public class Chats_Repositories : Repository<Chats>, IChats_Repositories
{
}
}

View File

@@ -0,0 +1,8 @@
using AntSK.Domain.Repositories.Base;
namespace AntSK.Domain.Repositories
{
public interface IChats_Repositories : IRepository<Chats>
{
}
}

View File

@@ -55,6 +55,7 @@ namespace AntSK.Domain.Repositories
[SugarColumn(DefaultValue = "49")]
public int OverlappingTokens { get; set; } = 49;
[SugarColumn(DefaultValue = "0")]
public int IsOCR { get; set; } = 0;
}
}

View File

@@ -0,0 +1,16 @@
using AntSK.Domain.Domain.Model.Enum;
using SqlSugar;
using System.ComponentModel.DataAnnotations;
namespace AntSK.Domain.Repositories
{
[SugarTable("Dics")]
public partial class Dics
{
[SugarColumn(IsPrimaryKey = true)]
public string Id { get; set; }
public string Type { get; set; }
public string Key { get; set; }
public string Value { get; set; }
}
}

View File

@@ -0,0 +1,11 @@

using AntSK.Domain.Common.DependencyInjection;
using AntSK.Domain.Repositories.Base;
namespace AntSK.Domain.Repositories
{
[ServiceDescription(typeof(IDics_Repositories), ServiceLifetime.Scoped)]
public class Dics_Repositories : Repository<Dics>, IDics_Repositories
{
}
}

View File

@@ -0,0 +1,8 @@
using AntSK.Domain.Repositories.Base;
namespace AntSK.Domain.Repositories
{
public interface IDics_Repositories : IRepository<Dics>
{
}
}

View File

@@ -250,5 +250,16 @@ namespace AntSK.Domain.Utils
return nameValueCollection.ToString();
}
/// <summary>
/// Case-insensitive comparison
/// </summary>
/// <param name="s"></param>
/// <param name="value"></param>
/// <returns></returns>
public static bool ComparisonIgnoreCase(this string s, string value)
{
return s.Equals(value, StringComparison.OrdinalIgnoreCase);
}
}
}
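A quick usage note for the new extension (it is ordinal, case-insensitive equality, not substring matching):

bool same = "Qdrant".ComparisonIgnoreCase("qdrant");  // true
bool diff = "Redis".ComparisonIgnoreCase("Postgres"); // false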

View File

@@ -0,0 +1,39 @@
using System;
using System.Collections.Generic;
using System.Drawing.Imaging;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace AntSK.Domain.Utils
{
public class ImageUtils
{
public static string BitmapToBase64(Bitmap bitmap)
{
using (MemoryStream memoryStream = new MemoryStream())
{
// Save as JPEG; PNG, GIF, etc. would also work
bitmap.Save(memoryStream, ImageFormat.Jpeg);
// Get the byte array from the memory stream
byte[] imageBytes = memoryStream.ToArray();
// Convert the bytes to a Base64 string
string base64String = Convert.ToBase64String(imageBytes);
return base64String;
}
}
public static List<string> BitmapListToBase64(Bitmap[] bitmaps)
{
List<string> base64Strings = new List<string>();
foreach (Bitmap bitmap in bitmaps)
{
base64Strings.Add(BitmapToBase64(bitmap));
}
return base64Strings;
}
}
}
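The Base64 string returned by SendImgByAppAsync can be dropped straight into an img tag. A small sketch (assuming JPEG, per the ImageFormat.Jpeg call above):

string base64 = ImageUtils.BitmapToBase64(bitmap); // 'bitmap' from SDHelper.TextToImage
string imgSrc = $"data:image/jpeg;base64,{base64}"; // usable as an <img src> value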

View File

@@ -0,0 +1,21 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net8.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<Content Include="llamafactory\**">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</Content>
</ItemGroup>
<ItemGroup>
<None Update="modelList.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Update="requirements.txt">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
</ItemGroup>
</Project>

View File

@@ -0,0 +1,27 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace AntSK.LLamaFactory.Model
{
public class ModelInfo
{
public string DEFAULT { get; set; }
public string MODELSCOPE { get; set; }
}
public class LLamaFactoryModel
{
public Dictionary<string, ModelInfo> Models { get; set; }
public string Template { get; set; }
}
public class LLamaModel
{
public string Name { get; set; }
public string ModelScope { get; set; }
}
}

View File

@@ -0,0 +1,16 @@
import os
import uvicorn
from llmtuner import ChatModel, create_app
def main():
chat_model = ChatModel()
app = create_app(chat_model)
print("Visit http://localhost:{}/docs for API document.".format(os.environ.get("API_PORT", 8000)))
uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("API_PORT", 8000)), workers=1)
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,11 @@
# Level: api, webui > chat, eval, train > data, model > extras, hparams
from .api import create_app
from .chat import ChatModel
from .eval import Evaluator
from .train import export_model, run_exp
from .webui import create_ui, create_web_demo
__version__ = "0.5.3"
__all__ = ["create_app", "ChatModel", "Evaluator", "export_model", "run_exp", "create_ui", "create_web_demo"]

View File

@@ -0,0 +1,4 @@
from .app import create_app
__all__ = ["create_app"]

View File

@@ -0,0 +1,224 @@
import json
import os
from contextlib import asynccontextmanager
from typing import Any, Dict, Sequence
from pydantic import BaseModel
from ..chat import ChatModel
from ..data import Role as DataRole
from ..extras.misc import torch_gc
from ..extras.packages import is_fastapi_availble, is_starlette_available, is_uvicorn_available
from .protocol import (
ChatCompletionMessage,
ChatCompletionRequest,
ChatCompletionResponse,
ChatCompletionResponseChoice,
ChatCompletionResponseStreamChoice,
ChatCompletionResponseUsage,
ChatCompletionStreamResponse,
Finish,
Function,
FunctionCall,
ModelCard,
ModelList,
Role,
ScoreEvaluationRequest,
ScoreEvaluationResponse,
)
if is_fastapi_availble():
from fastapi import FastAPI, HTTPException, status
from fastapi.middleware.cors import CORSMiddleware
if is_starlette_available():
from sse_starlette import EventSourceResponse
if is_uvicorn_available():
import uvicorn
@asynccontextmanager
async def lifespan(app: "FastAPI"): # collects GPU memory
yield
torch_gc()
def dictify(data: "BaseModel") -> Dict[str, Any]:
try: # pydantic v2
return data.model_dump(exclude_unset=True)
except AttributeError: # pydantic v1
return data.dict(exclude_unset=True)
def jsonify(data: "BaseModel") -> str:
try: # pydantic v2
return json.dumps(data.model_dump(exclude_unset=True), ensure_ascii=False)
except AttributeError: # pydantic v1
return data.json(exclude_unset=True, ensure_ascii=False)
def create_app(chat_model: "ChatModel") -> "FastAPI":
app = FastAPI(lifespan=lifespan)
app.add_middleware(
CORSMiddleware,
allow_origins=["*"],
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
role_mapping = {
Role.USER: DataRole.USER.value,
Role.ASSISTANT: DataRole.ASSISTANT.value,
Role.SYSTEM: DataRole.SYSTEM.value,
Role.FUNCTION: DataRole.FUNCTION.value,
Role.TOOL: DataRole.OBSERVATION.value,
}
@app.get("/v1/models", response_model=ModelList)
async def list_models():
model_card = ModelCard(id="gpt-3.5-turbo")
return ModelList(data=[model_card])
@app.post("/v1/chat/completions", response_model=ChatCompletionResponse, status_code=status.HTTP_200_OK)
async def create_chat_completion(request: ChatCompletionRequest):
if not chat_model.engine.can_generate:
raise HTTPException(status_code=status.HTTP_405_METHOD_NOT_ALLOWED, detail="Not allowed")
if len(request.messages) == 0:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid length")
if request.messages[0].role == Role.SYSTEM:
system = request.messages.pop(0).content
else:
system = ""
if len(request.messages) % 2 == 0:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Only supports u/a/u/a/u...")
input_messages = []
for i, message in enumerate(request.messages):
if i % 2 == 0 and message.role not in [Role.USER, Role.TOOL]:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid role")
elif i % 2 == 1 and message.role not in [Role.ASSISTANT, Role.FUNCTION]:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid role")
input_messages.append({"role": role_mapping[message.role], "content": message.content})
tool_list = request.tools
if isinstance(tool_list, list) and len(tool_list):
try:
tools = json.dumps([tool["function"] for tool in tool_list], ensure_ascii=False)
except Exception:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid tools")
else:
tools = ""
if request.stream:
if tools:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Cannot stream function calls.")
generate = stream_chat_completion(input_messages, system, tools, request)
return EventSourceResponse(generate, media_type="text/event-stream")
responses = await chat_model.achat(
input_messages,
system,
tools,
do_sample=request.do_sample,
temperature=request.temperature,
top_p=request.top_p,
max_new_tokens=request.max_tokens,
num_return_sequences=request.n,
)
prompt_length, response_length = 0, 0
choices = []
for i, response in enumerate(responses):
if tools:
result = chat_model.engine.template.format_tools.extract(response.response_text)
else:
result = response.response_text
if isinstance(result, tuple):
name, arguments = result
function = Function(name=name, arguments=arguments)
response_message = ChatCompletionMessage(
role=Role.ASSISTANT, tool_calls=[FunctionCall(function=function)]
)
finish_reason = Finish.TOOL
else:
response_message = ChatCompletionMessage(role=Role.ASSISTANT, content=result)
finish_reason = Finish.STOP if response.finish_reason == "stop" else Finish.LENGTH
choices.append(
ChatCompletionResponseChoice(index=i, message=response_message, finish_reason=finish_reason)
)
prompt_length = response.prompt_length
response_length += response.response_length
usage = ChatCompletionResponseUsage(
prompt_tokens=prompt_length,
completion_tokens=response_length,
total_tokens=prompt_length + response_length,
)
return ChatCompletionResponse(model=request.model, choices=choices, usage=usage)
async def stream_chat_completion(
messages: Sequence[Dict[str, str]], system: str, tools: str, request: ChatCompletionRequest
):
choice_data = ChatCompletionResponseStreamChoice(
index=0, delta=ChatCompletionMessage(role=Role.ASSISTANT, content=""), finish_reason=None
)
chunk = ChatCompletionStreamResponse(model=request.model, choices=[choice_data])
yield jsonify(chunk)
async for new_token in chat_model.astream_chat(
messages,
system,
tools,
do_sample=request.do_sample,
temperature=request.temperature,
top_p=request.top_p,
max_new_tokens=request.max_tokens,
):
if len(new_token) == 0:
continue
choice_data = ChatCompletionResponseStreamChoice(
index=0, delta=ChatCompletionMessage(content=new_token), finish_reason=None
)
chunk = ChatCompletionStreamResponse(model=request.model, choices=[choice_data])
yield jsonify(chunk)
choice_data = ChatCompletionResponseStreamChoice(
index=0, delta=ChatCompletionMessage(), finish_reason=Finish.STOP
)
chunk = ChatCompletionStreamResponse(model=request.model, choices=[choice_data])
yield jsonify(chunk)
yield "[DONE]"
@app.post("/v1/score/evaluation", response_model=ScoreEvaluationResponse, status_code=status.HTTP_200_OK)
async def create_score_evaluation(request: ScoreEvaluationRequest):
if chat_model.engine.can_generate:
raise HTTPException(status_code=status.HTTP_405_METHOD_NOT_ALLOWED, detail="Not allowed")
if len(request.messages) == 0:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail="Invalid request")
scores = await chat_model.aget_scores(request.messages, max_length=request.max_length)
return ScoreEvaluationResponse(model=request.model, scores=scores)
return app
if __name__ == "__main__":
chat_model = ChatModel()
app = create_app(chat_model)
uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("API_PORT", 8000)), workers=1)
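api_demo.py boots an OpenAI-compatible server on API_PORT (8000 by default), which is why the C# side can register it through AddOpenAIChatCompletion with a placeholder API key. A hedged raw-HTTP sketch of the same endpoint from C# (assuming the server is listening on localhost:8000):

using System.Net.Http;
using System.Text;

// Assumes the LLamaFactory api_demo.py server is running locally on port 8000.
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:8000") };
var body = "{\"model\":\"gpt-3.5-turbo\",\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}]}";
var response = await http.PostAsync("/v1/chat/completions",
    new StringContent(body, Encoding.UTF8, "application/json"));
Console.WriteLine(await response.Content.ReadAsStringAsync());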

View File

@@ -0,0 +1,116 @@
import time
from enum import Enum, unique
from typing import List, Optional
from pydantic import BaseModel, Field
from typing_extensions import Literal
@unique
class Role(str, Enum):
USER = "user"
ASSISTANT = "assistant"
SYSTEM = "system"
FUNCTION = "function"
TOOL = "tool"
@unique
class Finish(str, Enum):
STOP = "stop"
LENGTH = "length"
TOOL = "tool_calls"
class ModelCard(BaseModel):
id: str
object: Literal["model"] = "model"
created: int = Field(default_factory=lambda: int(time.time()))
owned_by: Literal["owner"] = "owner"
class ModelList(BaseModel):
object: Literal["list"] = "list"
data: List[ModelCard] = []
class Function(BaseModel):
name: str
arguments: str
class FunctionCall(BaseModel):
id: Literal["call_default"] = "call_default"
type: Literal["function"] = "function"
function: Function
class ChatMessage(BaseModel):
role: Role
content: str
class ChatCompletionMessage(BaseModel):
role: Optional[Role] = None
content: Optional[str] = None
tool_calls: Optional[List[FunctionCall]] = None
class ChatCompletionRequest(BaseModel):
model: str
messages: List[ChatMessage]
tools: list = []
do_sample: bool = True
temperature: Optional[float] = None
top_p: Optional[float] = None
n: int = 1
max_tokens: Optional[int] = None
stream: bool = False
class ChatCompletionResponseChoice(BaseModel):
index: int
message: ChatCompletionMessage
finish_reason: Finish
class ChatCompletionResponseStreamChoice(BaseModel):
index: int
delta: ChatCompletionMessage
finish_reason: Optional[Finish] = None
class ChatCompletionResponseUsage(BaseModel):
prompt_tokens: int
completion_tokens: int
total_tokens: int
class ChatCompletionResponse(BaseModel):
id: Literal["chatcmpl-default"] = "chatcmpl-default"
object: Literal["chat.completion"] = "chat.completion"
created: int = Field(default_factory=lambda: int(time.time()))
model: str
choices: List[ChatCompletionResponseChoice]
usage: ChatCompletionResponseUsage
class ChatCompletionStreamResponse(BaseModel):
id: Literal["chatcmpl-default"] = "chatcmpl-default"
object: Literal["chat.completion.chunk"] = "chat.completion.chunk"
created: int = Field(default_factory=lambda: int(time.time()))
model: str
choices: List[ChatCompletionResponseStreamChoice]
class ScoreEvaluationRequest(BaseModel):
model: str
messages: List[str]
max_length: Optional[int] = None
class ScoreEvaluationResponse(BaseModel):
id: Literal["scoreeval-default"] = "scoreeval-default"
object: Literal["score.evaluation"] = "score.evaluation"
model: str
scores: List[float]

View File

@@ -0,0 +1,5 @@
from .base_engine import BaseEngine
from .chat_model import ChatModel
__all__ = ["BaseEngine", "ChatModel"]

View File

@@ -0,0 +1,69 @@
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any, AsyncGenerator, Dict, List, Literal, Optional, Sequence, Union
if TYPE_CHECKING:
from transformers import PreTrainedModel, PreTrainedTokenizer
from ..data import Template
from ..extras.packages import is_vllm_available
from ..hparams import DataArguments, FinetuningArguments, GeneratingArguments, ModelArguments
if is_vllm_available():
from vllm import AsyncLLMEngine
@dataclass
class Response:
response_text: str
response_length: int
prompt_length: int
finish_reason: Literal["stop", "length"]
class BaseEngine(ABC):
model: Union["PreTrainedModel", "AsyncLLMEngine"]
tokenizer: "PreTrainedTokenizer"
can_generate: bool
template: "Template"
generating_args: Dict[str, Any]
@abstractmethod
def __init__(
self,
model_args: "ModelArguments",
data_args: "DataArguments",
finetuning_args: "FinetuningArguments",
generating_args: "GeneratingArguments",
) -> None: ...
@abstractmethod
async def start(
self,
) -> None: ...
@abstractmethod
async def chat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> List["Response"]: ...
@abstractmethod
async def stream_chat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> AsyncGenerator[str, None]: ...
@abstractmethod
async def get_scores(
self,
batch_input: List[str],
**input_kwargs,
) -> List[float]: ...

View File

@@ -0,0 +1,91 @@
import asyncio
from threading import Thread
from typing import TYPE_CHECKING, Any, AsyncGenerator, Dict, Generator, List, Optional, Sequence
from ..hparams import get_infer_args
from .hf_engine import HuggingfaceEngine
from .vllm_engine import VllmEngine
if TYPE_CHECKING:
from .base_engine import BaseEngine, Response
def _start_background_loop(loop: asyncio.AbstractEventLoop) -> None:
asyncio.set_event_loop(loop)
loop.run_forever()
class ChatModel:
def __init__(self, args: Optional[Dict[str, Any]] = None) -> None:
model_args, data_args, finetuning_args, generating_args = get_infer_args(args)
if model_args.infer_backend == "huggingface":
self.engine: "BaseEngine" = HuggingfaceEngine(model_args, data_args, finetuning_args, generating_args)
elif model_args.infer_backend == "vllm":
self.engine: "BaseEngine" = VllmEngine(model_args, data_args, finetuning_args, generating_args)
else:
raise NotImplementedError("Unknown backend: {}".format(model_args.infer_backend))
self._loop = asyncio.new_event_loop()
self._thread = Thread(target=_start_background_loop, args=(self._loop,), daemon=True)
self._thread.start()
asyncio.run_coroutine_threadsafe(self.engine.start(), self._loop)
def chat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> List["Response"]:
task = asyncio.run_coroutine_threadsafe(self.achat(messages, system, tools, **input_kwargs), self._loop)
return task.result()
async def achat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> List["Response"]:
return await self.engine.chat(messages, system, tools, **input_kwargs)
def stream_chat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> Generator[str, None, None]:
generator = self.astream_chat(messages, system, tools, **input_kwargs)
while True:
try:
task = asyncio.run_coroutine_threadsafe(generator.__anext__(), self._loop)
yield task.result()
except StopAsyncIteration:
break
async def astream_chat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> AsyncGenerator[str, None]:
async for new_token in self.engine.stream_chat(messages, system, tools, **input_kwargs):
yield new_token
def get_scores(
self,
batch_input: List[str],
**input_kwargs,
) -> List[float]:
task = asyncio.run_coroutine_threadsafe(self.aget_scores(batch_input, **input_kwargs), self._loop)
return task.result()
async def aget_scores(
self,
batch_input: List[str],
**input_kwargs,
) -> List[float]:
return await self.engine.get_scores(batch_input, **input_kwargs)

View File

@@ -0,0 +1,263 @@
import asyncio
import concurrent.futures
import os
from threading import Thread
from typing import TYPE_CHECKING, Any, AsyncGenerator, Callable, Dict, List, Optional, Sequence, Tuple
import torch
from transformers import GenerationConfig, TextIteratorStreamer
from ..data import get_template_and_fix_tokenizer
from ..extras.misc import get_logits_processor
from ..model import load_model_and_tokenizer
from .base_engine import BaseEngine, Response
if TYPE_CHECKING:
from transformers import PreTrainedModel, PreTrainedTokenizer
from trl import PreTrainedModelWrapper
from ..data import Template
from ..hparams import DataArguments, FinetuningArguments, GeneratingArguments, ModelArguments
class HuggingfaceEngine(BaseEngine):
def __init__(
self,
model_args: "ModelArguments",
data_args: "DataArguments",
finetuning_args: "FinetuningArguments",
generating_args: "GeneratingArguments",
) -> None:
self.can_generate = finetuning_args.stage == "sft"
self.model, self.tokenizer = load_model_and_tokenizer(
model_args, finetuning_args, is_trainable=False, add_valuehead=(not self.can_generate)
)
self.tokenizer.padding_side = "left" if self.can_generate else "right"
self.template = get_template_and_fix_tokenizer(self.tokenizer, data_args.template)
self.generating_args = generating_args.to_dict()
@staticmethod
def _process_args(
model: "PreTrainedModel",
tokenizer: "PreTrainedTokenizer",
template: "Template",
generating_args: Dict[str, Any],
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
input_kwargs: Optional[Dict[str, Any]] = {},
) -> Tuple[Dict[str, Any], int]:
paired_messages = messages + [{"role": "assistant", "content": ""}]
prompt_ids, _ = template.encode_oneturn(
tokenizer=tokenizer, messages=paired_messages, system=system, tools=tools
)
prompt_length = len(prompt_ids)
inputs = torch.tensor([prompt_ids], device=model.device)
do_sample = input_kwargs.pop("do_sample", None)
temperature = input_kwargs.pop("temperature", None)
top_p = input_kwargs.pop("top_p", None)
top_k = input_kwargs.pop("top_k", None)
num_return_sequences = input_kwargs.pop("num_return_sequences", None)
repetition_penalty = input_kwargs.pop("repetition_penalty", None)
max_length = input_kwargs.pop("max_length", None)
max_new_tokens = input_kwargs.pop("max_new_tokens", None)
generating_args.update(
dict(
do_sample=do_sample if do_sample is not None else generating_args["do_sample"],
temperature=temperature or generating_args["temperature"],
top_p=top_p or generating_args["top_p"],
top_k=top_k or generating_args["top_k"],
num_return_sequences=num_return_sequences or 1,
repetition_penalty=repetition_penalty or generating_args["repetition_penalty"],
eos_token_id=[tokenizer.eos_token_id] + tokenizer.additional_special_tokens_ids,
pad_token_id=tokenizer.pad_token_id,
)
)
if isinstance(num_return_sequences, int) and num_return_sequences > 1:
generating_args["do_sample"] = True
if max_length:
generating_args.pop("max_new_tokens", None)
generating_args["max_length"] = max_length
if max_new_tokens:
generating_args.pop("max_length", None)
generating_args["max_new_tokens"] = max_new_tokens
gen_kwargs = dict(
inputs=inputs,
generation_config=GenerationConfig(**generating_args),
logits_processor=get_logits_processor(),
)
return gen_kwargs, prompt_length
@staticmethod
@torch.inference_mode()
def _chat(
model: "PreTrainedModel",
tokenizer: "PreTrainedTokenizer",
template: "Template",
generating_args: Dict[str, Any],
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
input_kwargs: Optional[Dict[str, Any]] = {},
) -> List["Response"]:
gen_kwargs, prompt_length = HuggingfaceEngine._process_args(
model, tokenizer, template, generating_args, messages, system, tools, input_kwargs
)
generate_output = model.generate(**gen_kwargs)
response_ids = generate_output[:, prompt_length:]
response = tokenizer.batch_decode(response_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True)
results = []
for i in range(len(response)):
eos_index = (response_ids[i] == tokenizer.eos_token_id).nonzero()
response_length = (eos_index[0].item() + 1) if len(eos_index) else len(response_ids[i])
results.append(
Response(
response_text=response[i],
response_length=response_length,
prompt_length=prompt_length,
finish_reason="stop" if len(eos_index) else "length",
)
)
return results
@staticmethod
@torch.inference_mode()
def _stream_chat(
model: "PreTrainedModel",
tokenizer: "PreTrainedTokenizer",
template: "Template",
generating_args: Dict[str, Any],
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
input_kwargs: Optional[Dict[str, Any]] = {},
) -> Callable[[], str]:
gen_kwargs, _ = HuggingfaceEngine._process_args(
model, tokenizer, template, generating_args, messages, system, tools, input_kwargs
)
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
gen_kwargs["streamer"] = streamer
thread = Thread(target=model.generate, kwargs=gen_kwargs, daemon=True)
thread.start()
def stream():
try:
return streamer.__next__()
except StopIteration:
raise StopAsyncIteration()
return stream
@staticmethod
@torch.inference_mode()
def _get_scores(
model: "PreTrainedModelWrapper",
tokenizer: "PreTrainedTokenizer",
batch_input: List[str],
input_kwargs: Optional[Dict[str, Any]] = {},
) -> List[float]:
max_length = input_kwargs.pop("max_length", None)
device = getattr(model.pretrained_model, "device", "cuda")
inputs = tokenizer(
batch_input,
padding=True,
truncation=True,
max_length=max_length or getattr(model.config, "max_position_embeddings", 1024),
return_tensors="pt",
add_special_tokens=True,
).to(device)
input_ids: torch.Tensor = inputs["input_ids"]
_, _, values = model(**inputs, output_hidden_states=True, return_dict=True)
if getattr(model.config, "model_type", None) == "chatglm":
values = torch.transpose(values, 0, 1)
scores = []
for i in range(input_ids.size(0)):
end_indexes = (input_ids[i] != tokenizer.pad_token_id).nonzero()
end_index = end_indexes[-1].item() if len(end_indexes) else 0
scores.append(values[i, end_index].nan_to_num().item())
return scores
async def start(self) -> None:
self._semaphore = asyncio.Semaphore(int(os.environ.get("MAX_CONCURRENT", 1)))
async def chat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> List["Response"]:
if not self.can_generate:
raise ValueError("The current model does not support `chat`.")
loop = asyncio.get_running_loop()
input_args = (
self.model,
self.tokenizer,
self.template,
self.generating_args,
messages,
system,
tools,
input_kwargs,
)
async with self._semaphore:
with concurrent.futures.ThreadPoolExecutor() as pool:
return await loop.run_in_executor(pool, self._chat, *input_args)
async def stream_chat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> AsyncGenerator[str, None]:
if not self.can_generate:
raise ValueError("The current model does not support `stream_chat`.")
loop = asyncio.get_running_loop()
input_args = (
self.model,
self.tokenizer,
self.template,
self.generating_args,
messages,
system,
tools,
input_kwargs,
)
async with self._semaphore:
with concurrent.futures.ThreadPoolExecutor() as pool:
stream = self._stream_chat(*input_args)
while True:
try:
yield await loop.run_in_executor(pool, stream)
except StopAsyncIteration:
break
async def get_scores(
self,
batch_input: List[str],
**input_kwargs,
) -> List[float]:
if self.can_generate:
raise ValueError("Cannot get scores using an auto-regressive model.")
loop = asyncio.get_running_loop()
input_args = (self.model, self.tokenizer, batch_input, input_kwargs)
async with self._semaphore:
with concurrent.futures.ThreadPoolExecutor() as pool:
return await loop.run_in_executor(pool, self._get_scores, *input_args)
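
A minimal usage sketch for the engine above (not part of the diff; engine construction from the hparams objects is assumed, and `demo` is a hypothetical helper):

import asyncio

async def demo(engine: "HuggingfaceEngine") -> None:
    await engine.start()  # creates the semaphore bounding concurrent generations
    messages = [{"role": "user", "content": "Hello!"}]
    responses = await engine.chat(messages, temperature=0.7)  # blocking _chat runs in a thread pool
    print(responses[0].response_text)
    async for delta in engine.stream_chat(messages):  # yields incremental text chunks
        print(delta, end="", flush=True)

# asyncio.run(demo(engine)) once an engine instance has been constructed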


@@ -0,0 +1,149 @@
import uuid
from typing import TYPE_CHECKING, AsyncGenerator, AsyncIterator, Dict, List, Optional, Sequence
from transformers.utils.versions import require_version
from ..data import get_template_and_fix_tokenizer
from ..extras.misc import get_device_count
from ..extras.packages import is_vllm_available
from ..model import load_tokenizer
from .base_engine import BaseEngine, Response
if is_vllm_available():
from vllm import AsyncEngineArgs, AsyncLLMEngine, RequestOutput, SamplingParams
if TYPE_CHECKING:
from ..hparams import DataArguments, FinetuningArguments, GeneratingArguments, ModelArguments
class VllmEngine(BaseEngine):
def __init__(
self,
model_args: "ModelArguments",
data_args: "DataArguments",
finetuning_args: "FinetuningArguments",
generating_args: "GeneratingArguments",
) -> None:
require_version("vllm>=0.3.3", "To fix: pip install vllm>=0.3.3")
self.can_generate = finetuning_args.stage == "sft"
engine_args = AsyncEngineArgs(
model=model_args.model_name_or_path,
trust_remote_code=True,
max_model_len=model_args.vllm_maxlen,
tensor_parallel_size=get_device_count() or 1,
gpu_memory_utilization=model_args.vllm_gpu_util,
disable_log_stats=True,
disable_log_requests=True,
enforce_eager=model_args.vllm_enforce_eager,
)
self.model = AsyncLLMEngine.from_engine_args(engine_args)
self.tokenizer = load_tokenizer(model_args)
self.tokenizer.padding_side = "left"
self.template = get_template_and_fix_tokenizer(self.tokenizer, data_args.template)
self.generating_args = generating_args.to_dict()
async def _generate(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> AsyncIterator["RequestOutput"]:
request_id = "chatcmpl-{}".format(uuid.uuid4().hex)
paired_messages = messages + [{"role": "assistant", "content": ""}]
prompt_ids, _ = self.template.encode_oneturn(
tokenizer=self.tokenizer, messages=paired_messages, system=system, tools=tools
)
prompt_length = len(prompt_ids)
temperature = input_kwargs.pop("temperature", None)
top_p = input_kwargs.pop("top_p", None)
top_k = input_kwargs.pop("top_k", None)
num_return_sequences = input_kwargs.pop("num_return_sequences", None)
repetition_penalty = input_kwargs.pop("repetition_penalty", None)
max_length = input_kwargs.pop("max_length", None)
max_new_tokens = input_kwargs.pop("max_new_tokens", None)
generating_args = self.generating_args.copy()
generating_args.update(
dict(
temperature=temperature or generating_args["temperature"],
top_p=top_p or generating_args["top_p"],
top_k=top_k or generating_args["top_k"],
num_return_sequences=num_return_sequences or 1,
repetition_penalty=repetition_penalty or generating_args["repetition_penalty"],
)
)
if max_length:
generating_args["max_new_tokens"] = max_length - prompt_length
if max_new_tokens:
generating_args["max_new_tokens"] = max_new_tokens
sampling_params = SamplingParams(
n=generating_args["num_return_sequences"],
repetition_penalty=generating_args["repetition_penalty"],
temperature=generating_args["temperature"],
top_p=generating_args["top_p"],
top_k=generating_args["top_k"],
use_beam_search=generating_args["num_beams"] > 1,
length_penalty=generating_args["length_penalty"],
stop_token_ids=[self.tokenizer.eos_token_id] + self.tokenizer.additional_special_tokens_ids,
max_tokens=generating_args["max_new_tokens"],
skip_special_tokens=True,
)
result_generator = self.model.generate(
prompt=None, sampling_params=sampling_params, request_id=request_id, prompt_token_ids=prompt_ids
)
return result_generator
async def start(self) -> None:
pass
async def chat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> List["Response"]:
final_output = None
generator = await self._generate(messages, system, tools, **input_kwargs)
async for request_output in generator:
final_output = request_output
results = []
for output in final_output.outputs:
results.append(
Response(
response_text=output.text,
response_length=len(output.token_ids),
prompt_length=len(final_output.prompt_token_ids),
finish_reason=output.finish_reason,
)
)
return results
async def stream_chat(
self,
messages: Sequence[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
**input_kwargs,
) -> AsyncGenerator[str, None]:
generated_text = ""
generator = await self._generate(messages, system, tools, **input_kwargs)
async for result in generator:
delta_text = result.outputs[0].text[len(generated_text) :]
generated_text = result.outputs[0].text
yield delta_text
async def get_scores(
self,
batch_input: List[str],
**input_kwargs,
) -> List[float]:
raise NotImplementedError("vLLM engine does not support get_scores.")
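
For comparison with the Hugging Face engine, a hypothetical streaming call against VllmEngine; each iteration yields only the newly generated delta, and get_scores deliberately raises NotImplementedError:

async def stream_demo(engine: "VllmEngine") -> None:
    messages = [{"role": "user", "content": "Write a haiku about spring."}]
    async for delta in engine.stream_chat(messages, max_new_tokens=64):
        print(delta, end="", flush=True)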


@@ -0,0 +1,6 @@
from .loader import get_dataset
from .template import Template, get_template_and_fix_tokenizer, templates
from .utils import Role, split_dataset
__all__ = ["get_dataset", "Template", "get_template_and_fix_tokenizer", "templates", "Role", "split_dataset"]


@@ -0,0 +1,133 @@
from functools import partial
from typing import TYPE_CHECKING, Any, Dict, List, Union
from datasets import Features
from .utils import Role
if TYPE_CHECKING:
from datasets import Dataset, IterableDataset
from ..hparams import DataArguments
from .parser import DatasetAttr
def convert_alpaca(examples: Dict[str, List[Any]], dataset_attr: "DatasetAttr") -> Dict[str, List[Any]]:
outputs = {"prompt": [], "response": [], "system": [], "tools": []}
for i in range(len(examples[dataset_attr.prompt])):
prompt = []
if dataset_attr.history and isinstance(examples[dataset_attr.history][i], list):
for old_prompt, old_response in examples[dataset_attr.history][i]:
prompt.append({"role": Role.USER.value, "content": old_prompt})
prompt.append({"role": Role.ASSISTANT.value, "content": old_response})
content = []
if dataset_attr.prompt and examples[dataset_attr.prompt][i]:
content.append(examples[dataset_attr.prompt][i])
if dataset_attr.query and examples[dataset_attr.query][i]:
content.append(examples[dataset_attr.query][i])
prompt.append({"role": Role.USER.value, "content": "\n".join(content)})
if dataset_attr.response and isinstance(examples[dataset_attr.response][i], list):
response = [
{"role": Role.ASSISTANT.value, "content": content} for content in examples[dataset_attr.response][i]
]
elif dataset_attr.response and isinstance(examples[dataset_attr.response][i], str):
response = [{"role": Role.ASSISTANT.value, "content": examples[dataset_attr.response][i]}]
else:
response = []
outputs["prompt"].append(prompt)
outputs["response"].append(response)
outputs["system"].append(examples[dataset_attr.system][i] if dataset_attr.system else "")
outputs["tools"].append("")
return outputs
def convert_sharegpt(examples: Dict[str, List[Any]], dataset_attr: "DatasetAttr") -> Dict[str, List[Any]]:
outputs = {"prompt": [], "response": [], "system": [], "tools": []}
tag_mapping = {
dataset_attr.user_tag: Role.USER.value,
dataset_attr.assistant_tag: Role.ASSISTANT.value,
dataset_attr.observation_tag: Role.OBSERVATION.value,
dataset_attr.function_tag: Role.FUNCTION.value,
dataset_attr.system_tag: Role.SYSTEM.value,
}
odd_tags = (dataset_attr.user_tag, dataset_attr.observation_tag)
even_tags = (dataset_attr.assistant_tag, dataset_attr.function_tag)
accept_tags = (odd_tags, even_tags)
for i, messages in enumerate(examples[dataset_attr.messages]):
if dataset_attr.system_tag and messages[0][dataset_attr.role_tag] == dataset_attr.system_tag:
system = messages[0][dataset_attr.content_tag]
messages = messages[1:]
else:
system = examples[dataset_attr.system][i] if dataset_attr.system else ""
messages = messages[: len(messages) // 2 * 2] # should be multiples of 2
if len(messages) == 0:
continue
aligned_messages = []
for turn_idx, message in enumerate(messages):
if message[dataset_attr.role_tag] not in accept_tags[turn_idx % 2]:
raise ValueError("Invalid role tag in {}.".format(messages))
aligned_messages.append(
{"role": tag_mapping[message[dataset_attr.role_tag]], "content": message[dataset_attr.content_tag]}
)
outputs["prompt"].append(aligned_messages[:-1])
outputs["response"].append(aligned_messages[-1:])
outputs["system"].append(system)
outputs["tools"].append(examples[dataset_attr.tools][i] if dataset_attr.tools else "")
return outputs
def align_dataset(
dataset: Union["Dataset", "IterableDataset"], dataset_attr: "DatasetAttr", data_args: "DataArguments"
) -> Union["Dataset", "IterableDataset"]:
r"""
Aligned dataset:
prompt: [{"role": "user", "content": "..."}] * (2T - 1)
response: [{"role": "assistant", "content": "..."}] * N (N > 1 for ranking dataset)
system: "..."
tools: "..."
"""
if dataset_attr.formatting == "alpaca":
convert_func = partial(convert_alpaca, dataset_attr=dataset_attr)
else:
convert_func = partial(convert_sharegpt, dataset_attr=dataset_attr)
column_names = list(next(iter(dataset)).keys())
features = Features.from_dict(
{
"prompt": [
{"role": {"dtype": "string", "_type": "Value"}, "content": {"dtype": "string", "_type": "Value"}}
],
"response": [
{"role": {"dtype": "string", "_type": "Value"}, "content": {"dtype": "string", "_type": "Value"}}
],
"system": {"dtype": "string", "_type": "Value"},
"tools": {"dtype": "string", "_type": "Value"},
}
)
kwargs = {}
if not data_args.streaming:
kwargs = dict(
num_proc=data_args.preprocessing_num_workers,
load_from_cache_file=(not data_args.overwrite_cache),
desc="Converting format of dataset",
)
return dataset.map(
convert_func,
batched=True,
remove_columns=column_names,
features=features,
**kwargs,
)
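
To make the aligned schema concrete, here is a hypothetical alpaca-format batch and the structure convert_alpaca produces for it (column names follow the DatasetAttr defaults defined in the parser further below):

examples = {
    "instruction": ["Translate to French"],
    "input": ["Good morning"],
    "output": ["Bonjour"],
}
# convert_alpaca(examples, dataset_attr) yields:
# {
#     "prompt":   [[{"role": "user", "content": "Translate to French\nGood morning"}]],
#     "response": [[{"role": "assistant", "content": "Bonjour"}]],
#     "system":   [""],
#     "tools":    [""],
# }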


@@ -0,0 +1,187 @@
import json
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List, Literal, Optional, Sequence, Set, Tuple, Union
SLOTS = Sequence[Union[str, Set[str], Dict[str, str]]]
JSON_FORMAT_PROMPT = (
""", in a JSON format representing the kwargs (e.g. ```{"input": "hello world", "num_beams": 5}```)"""
)
TOOL_SYSTEM_PROMPT = (
"You have access to the following tools:\n{tool_text}"
"Use the following format if using a tool:\n"
"```\n"
"Action: tool name (one of [{tool_names}]).\n"
"Action Input: the input to the tool{format_prompt}.\n"
"```\n"
)
def default_tool_formatter(tools: List[Dict[str, Any]]) -> str:
tool_text = ""
tool_names = []
for tool in tools:
param_text = ""
for name, param in tool["parameters"]["properties"].items():
required = ", required" if name in tool["parameters"].get("required", []) else ""
enum = ", should be one of [{}]".format(", ".join(param["enum"])) if param.get("enum", None) else ""
items = (
", where each item should be {}".format(param["items"].get("type", "")) if param.get("items") else ""
)
param_text += " - {name} ({type}{required}): {desc}{enum}{items}\n".format(
name=name,
type=param.get("type", ""),
required=required,
desc=param.get("description", ""),
enum=enum,
items=items,
)
tool_text += "> Tool Name: {name}\nTool Description: {desc}\nTool Args:\n{args}\n".format(
name=tool["name"], desc=tool.get("description", ""), args=param_text
)
tool_names.append(tool["name"])
return TOOL_SYSTEM_PROMPT.format(
tool_text=tool_text, tool_names=", ".join(tool_names), format_prompt=JSON_FORMAT_PROMPT
)
def default_tool_extractor(content: str) -> Union[str, Tuple[str, str]]:
regex = re.compile(r"Action:\s*([a-zA-Z0-9_]+).*?Action Input:\s*(.*)", re.DOTALL)
action_match = re.search(regex, content)
if not action_match:
return content
tool_name = action_match.group(1).strip()
tool_input = action_match.group(2).strip().strip('"').strip("```")
try:
arguments = json.loads(tool_input)
except json.JSONDecodeError:
return content
return tool_name, json.dumps(arguments, ensure_ascii=False)
@dataclass
class Formatter(ABC):
slots: SLOTS = field(default_factory=list)
tool_format: Optional[Literal["default"]] = None
@abstractmethod
def apply(self, **kwargs) -> SLOTS: ...
def extract(self, content: str) -> Union[str, Tuple[str, str]]:
raise NotImplementedError
@dataclass
class EmptyFormatter(Formatter):
def __post_init__(self):
has_placeholder = False
for slot in filter(lambda s: isinstance(s, str), self.slots):
if re.search(r"\{\{[a-zA-Z_][a-zA-Z0-9_]*\}\}", slot):
has_placeholder = True
if has_placeholder:
raise ValueError("Empty formatter should not contain any placeholder.")
def apply(self, **kwargs) -> SLOTS:
return self.slots
@dataclass
class StringFormatter(Formatter):
def __post_init__(self):
has_placeholder = False
for slot in filter(lambda s: isinstance(s, str), self.slots):
if re.search(r"\{\{[a-zA-Z_][a-zA-Z0-9_]*\}\}", slot):
has_placeholder = True
if not has_placeholder:
raise ValueError("A placeholder is required in the string formatter.")
def apply(self, **kwargs) -> SLOTS:
elements = []
for slot in self.slots:
if isinstance(slot, str):
for name, value in kwargs.items():
if not isinstance(value, str):
raise RuntimeError("Expected a string, got {}".format(value))
slot = slot.replace("{{" + name + "}}", value, 1)
elements.append(slot)
elif isinstance(slot, (dict, set)):
elements.append(slot)
else:
raise RuntimeError("Input must be string, set[str] or dict[str, str], got {}".format(type(slot)))
return elements
@dataclass
class FunctionFormatter(Formatter):
def __post_init__(self):
has_name, has_args = False, False
for slot in filter(lambda s: isinstance(s, str), self.slots):
if "{{name}}" in slot:
has_name = True
if "{{arguments}}" in slot:
has_args = True
if not has_name or not has_args:
raise ValueError("Name and arguments placeholders are required in the function formatter.")
def apply(self, **kwargs) -> SLOTS:
content = kwargs.pop("content")
try:
function = json.loads(content)
name = function["name"]
arguments = json.dumps(function["arguments"], ensure_ascii=False)
except Exception:
name, arguments = "", ""
elements = []
for slot in self.slots:
if isinstance(slot, str):
slot = slot.replace("{{name}}", name).replace("{{arguments}}", arguments)
elements.append(slot)
elif isinstance(slot, (dict, set)):
elements.append(slot)
else:
raise RuntimeError("Input must be string, set[str] or dict[str, str], got {}".format(type(slot)))
return elements
@dataclass
class ToolFormatter(Formatter):
def __post_init__(self):
if self.tool_format is None:
raise ValueError("Tool format was not found.")
def apply(self, **kwargs) -> SLOTS:
content = kwargs.pop("content")
try:
tools = json.loads(content)
if not len(tools):
return [""]
if self.tool_format == "default":
return [default_tool_formatter(tools)]
else:
raise NotImplementedError
except Exception:
return [""]
def extract(self, content: str) -> Union[str, Tuple[str, str]]:
if self.tool_format == "default":
return default_tool_extractor(content)
else:
raise NotImplementedError
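
A short sketch of the formatters in action (illustrative values only):

fmt = StringFormatter(slots=["Human: {{content}}\nAssistant:"])
print(fmt.apply(content="Hi there"))  # ['Human: Hi there\nAssistant:']

# default_tool_extractor parses the ReAct-style output back into a (name, arguments) pair:
text = 'Action: get_weather\nAction Input: {"city": "Beijing"}'
print(default_tool_extractor(text))  # ('get_weather', '{"city": "Beijing"}')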


@@ -0,0 +1,170 @@
import inspect
import os
from typing import TYPE_CHECKING, Literal, Union
from datasets import load_dataset, load_from_disk
from ..extras.constants import FILEEXT2TYPE
from ..extras.logging import get_logger
from .aligner import align_dataset
from .parser import get_dataset_list
from .preprocess import get_preprocess_and_print_func
from .template import get_template_and_fix_tokenizer
from .utils import checksum, merge_dataset
if TYPE_CHECKING:
from datasets import Dataset, IterableDataset
from transformers import Seq2SeqTrainingArguments
from transformers.tokenization_utils import PreTrainedTokenizer
from ..hparams import DataArguments, ModelArguments
from .parser import DatasetAttr
logger = get_logger(__name__)
def load_single_dataset(
dataset_attr: "DatasetAttr",
model_args: "ModelArguments",
data_args: "DataArguments",
) -> Union["Dataset", "IterableDataset"]:
logger.info("Loading dataset {}...".format(dataset_attr))
data_path, data_name, data_dir, data_files = None, None, None, None
if dataset_attr.load_from in ["hf_hub", "ms_hub"]:
data_path = dataset_attr.dataset_name
data_name = dataset_attr.subset
data_dir = dataset_attr.folder
elif dataset_attr.load_from == "script":
data_path = os.path.join(data_args.dataset_dir, dataset_attr.dataset_name)
data_name = dataset_attr.subset
data_dir = dataset_attr.folder
elif dataset_attr.load_from == "file":
data_files = []
local_path = os.path.join(data_args.dataset_dir, dataset_attr.dataset_name)
if os.path.isdir(local_path): # is directory
for file_name in os.listdir(local_path):
data_files.append(os.path.join(local_path, file_name))
if data_path is None:
data_path = FILEEXT2TYPE.get(file_name.split(".")[-1], None)
elif data_path != FILEEXT2TYPE.get(file_name.split(".")[-1], None):
raise ValueError("File types should be identical.")
elif os.path.isfile(local_path): # is file
data_files.append(local_path)
data_path = FILEEXT2TYPE.get(local_path.split(".")[-1], None)
else:
raise ValueError("File not found.")
if data_path is None:
raise ValueError("File extension must be txt, csv, json or jsonl.")
checksum(data_files, dataset_attr.file_sha1)
else:
raise NotImplementedError
if dataset_attr.load_from == "ms_hub":
try:
from modelscope import MsDataset
from modelscope.utils.config_ds import MS_DATASETS_CACHE
cache_dir = model_args.cache_dir or MS_DATASETS_CACHE
dataset = MsDataset.load(
dataset_name=data_path,
subset_name=data_name,
data_dir=data_dir,
data_files=data_files,
split=data_args.split,
cache_dir=cache_dir,
token=model_args.ms_hub_token,
use_streaming=(data_args.streaming and (dataset_attr.load_from != "file")),
).to_hf_dataset()
except ImportError:
raise ImportError("Please install modelscope via `pip install modelscope -U`")
else:
if "trust_remote_code" in inspect.signature(load_dataset).parameters: # for datasets==2.16.0
kwargs = {"trust_remote_code": True}
else:
kwargs = {}
dataset = load_dataset(
path=data_path,
name=data_name,
data_dir=data_dir,
data_files=data_files,
split=data_args.split,
cache_dir=model_args.cache_dir,
token=model_args.hf_hub_token,
streaming=(data_args.streaming and (dataset_attr.load_from != "file")),
**kwargs,
)
if data_args.streaming and (dataset_attr.load_from == "file"): # faster than specifying streaming=True
dataset = dataset.to_iterable_dataset() # TODO: add num shards parameter
    if data_args.max_samples is not None and not data_args.streaming:  # truncate dataset
        num_samples = min(data_args.max_samples, len(dataset))  # len()/select() require a map-style dataset
        dataset = dataset.select(range(num_samples))
return align_dataset(dataset, dataset_attr, data_args)
def get_dataset(
tokenizer: "PreTrainedTokenizer",
model_args: "ModelArguments",
data_args: "DataArguments",
training_args: "Seq2SeqTrainingArguments",
stage: Literal["pt", "sft", "rm", "ppo"],
# split: Optional[str] = "train", # TODO: add split
) -> Union["Dataset", "IterableDataset"]:
template = get_template_and_fix_tokenizer(tokenizer, data_args.template)
if data_args.train_on_prompt and template.efficient_eos:
raise ValueError("Current template does not support `train_on_prompt`.")
# Load from cache
if data_args.cache_path is not None:
if os.path.exists(data_args.cache_path):
logger.warning("Loading dataset from disk will ignore other data arguments.")
dataset = load_from_disk(data_args.cache_path)
if data_args.streaming:
dataset = dataset.to_iterable_dataset()
return dataset
if data_args.streaming:
raise ValueError("Turn off `streaming` when saving dataset to disk.")
with training_args.main_process_first(desc="load dataset"):
all_datasets = []
for dataset_attr in get_dataset_list(data_args):
all_datasets.append(load_single_dataset(dataset_attr, model_args, data_args))
dataset = merge_dataset(all_datasets, data_args, training_args)
with training_args.main_process_first(desc="pre-process dataset"):
preprocess_func, print_function = get_preprocess_and_print_func(
tokenizer, template, data_args, training_args, stage
)
column_names = list(next(iter(dataset)).keys())
kwargs = {}
if not data_args.streaming:
kwargs = dict(
num_proc=data_args.preprocessing_num_workers,
load_from_cache_file=(not data_args.overwrite_cache),
desc="Running tokenizer on dataset",
)
dataset = dataset.map(preprocess_func, batched=True, remove_columns=column_names, **kwargs)
if data_args.cache_path is not None and not os.path.exists(data_args.cache_path):
if training_args.should_save:
dataset.save_to_disk(data_args.cache_path)
logger.info("Dataset cache saved at {}.".format(data_args.cache_path))
if training_args.should_log:
try:
print_function(next(iter(dataset)))
except StopIteration:
raise RuntimeError("Cannot find valid samples, check `data/README.md` for the data format.")
return dataset
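
A hypothetical call site for get_dataset inside a training script; the *_args objects are assumed to come from the hparams parsers:

tokenizer = load_tokenizer(model_args)  # from ..model, as in the engines above
dataset = get_dataset(tokenizer, model_args, data_args, training_args, stage="sft")
# With data_args.cache_path set, the processed dataset is saved to disk once and
# reloaded directly (ignoring the other data arguments) on subsequent runs.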


@@ -0,0 +1,119 @@
import json
import os
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any, Dict, List, Literal, Optional
from ..extras.constants import DATA_CONFIG
from ..extras.misc import use_modelscope
if TYPE_CHECKING:
from ..hparams import DataArguments
@dataclass
class DatasetAttr:
r"""
Dataset attributes.
"""
""" basic configs """
load_from: Literal["hf_hub", "ms_hub", "script", "file"]
dataset_name: str
""" extra configs """
file_sha1: Optional[str] = None
subset: Optional[str] = None
folder: Optional[str] = None
ranking: bool = False
formatting: Literal["alpaca", "sharegpt"] = "alpaca"
""" columns """
system: Optional[str] = None
""" columns for the alpaca format """
prompt: Optional[str] = "instruction"
query: Optional[str] = "input"
response: Optional[str] = "output"
history: Optional[str] = None
""" columns for the sharegpt format """
messages: Optional[str] = "conversations"
tools: Optional[str] = None
""" tags for the sharegpt format """
role_tag: Optional[str] = "from"
content_tag: Optional[str] = "value"
user_tag: Optional[str] = "human"
assistant_tag: Optional[str] = "gpt"
observation_tag: Optional[str] = "observation"
function_tag: Optional[str] = "function_call"
system_tag: Optional[str] = "system"
def __repr__(self) -> str:
return self.dataset_name
def set_attr(self, key: str, obj: Dict[str, Any], default: Optional[Any] = None) -> None:
setattr(self, key, obj.get(key, default))
def get_dataset_list(data_args: "DataArguments") -> List["DatasetAttr"]:
dataset_names = [ds.strip() for ds in data_args.dataset.split(",")] if data_args.dataset is not None else []
try:
with open(os.path.join(data_args.dataset_dir, DATA_CONFIG), "r") as f:
dataset_info = json.load(f)
except Exception as err:
if data_args.dataset is not None:
raise ValueError(
"Cannot open {} due to {}.".format(os.path.join(data_args.dataset_dir, DATA_CONFIG), str(err))
)
dataset_info = None
if data_args.interleave_probs is not None:
data_args.interleave_probs = [float(prob.strip()) for prob in data_args.interleave_probs.split(",")]
dataset_list: List[DatasetAttr] = []
for name in dataset_names:
if name not in dataset_info:
raise ValueError("Undefined dataset {} in {}.".format(name, DATA_CONFIG))
has_hf_url = "hf_hub_url" in dataset_info[name]
has_ms_url = "ms_hub_url" in dataset_info[name]
if has_hf_url or has_ms_url:
if (use_modelscope() and has_ms_url) or (not has_hf_url):
dataset_attr = DatasetAttr("ms_hub", dataset_name=dataset_info[name]["ms_hub_url"])
else:
dataset_attr = DatasetAttr("hf_hub", dataset_name=dataset_info[name]["hf_hub_url"])
elif "script_url" in dataset_info[name]:
dataset_attr = DatasetAttr("script", dataset_name=dataset_info[name]["script_url"])
else:
dataset_attr = DatasetAttr("file", dataset_name=dataset_info[name]["file_name"])
dataset_attr.set_attr("file_sha1", dataset_info[name])
dataset_attr.set_attr("subset", dataset_info[name])
dataset_attr.set_attr("folder", dataset_info[name])
dataset_attr.set_attr("ranking", dataset_info[name], default=False)
dataset_attr.set_attr("formatting", dataset_info[name], default="alpaca")
if "columns" in dataset_info[name]:
column_names = ["system"]
if dataset_attr.formatting == "alpaca":
column_names.extend(["prompt", "query", "response", "history"])
else:
column_names.extend(["messages", "tools"])
for column_name in column_names:
dataset_attr.set_attr(column_name, dataset_info[name]["columns"])
if dataset_attr.formatting == "sharegpt" and "tags" in dataset_info[name]:
tag_names = (
"role_tag",
"content_tag",
"user_tag",
"assistant_tag",
"observation_tag",
"function_tag",
"system_tag",
)
for tag in tag_names:
dataset_attr.set_attr(tag, dataset_info[name]["tags"])
dataset_list.append(dataset_attr)
return dataset_list
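
For illustration, a hypothetical dataset_info.json entry (shown as the parsed dict) that exercises the sharegpt branch above; file_name routes it through the "file" loader, and the tags block remaps the role markers:

dataset_info = {
    "my_sharegpt_data": {
        "file_name": "my_sharegpt_data.json",
        "formatting": "sharegpt",
        "columns": {"messages": "conversations", "system": "system"},
        "tags": {"role_tag": "from", "content_tag": "value",
                 "user_tag": "human", "assistant_tag": "gpt"},
    }
}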


@@ -0,0 +1,276 @@
from functools import partial
from itertools import chain
from typing import TYPE_CHECKING, Any, Callable, Dict, List, Literal, Tuple
from ..extras.constants import IGNORE_INDEX
from ..extras.logging import get_logger
from .utils import Role
if TYPE_CHECKING:
from transformers import Seq2SeqTrainingArguments
from transformers.tokenization_utils import PreTrainedTokenizer
from ..hparams import DataArguments
from .template import Template
logger = get_logger(__name__)
def preprocess_pretrain_dataset(
examples: Dict[str, List[Any]], tokenizer: "PreTrainedTokenizer", data_args: "DataArguments"
) -> Dict[str, List[List[int]]]:
# build grouped texts with format `X1 X2 X3 ...` if packing is enabled
text_examples = [messages[0]["content"] + tokenizer.eos_token for messages in examples["prompt"]]
if not data_args.packing:
return tokenizer(text_examples, add_special_tokens=False, max_length=data_args.cutoff_len)
tokenized_examples = tokenizer(text_examples, add_special_tokens=False)
concatenated_examples = {k: list(chain(*tokenized_examples[k])) for k in tokenized_examples.keys()}
total_length = len(concatenated_examples[list(concatenated_examples.keys())[0]])
block_size = data_args.cutoff_len
# we drop the small remainder, and if the total_length < block_size, we exclude this batch
total_length = (total_length // block_size) * block_size
# split by chunks of cutoff_len
result = {
k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
for k, t in concatenated_examples.items()
}
if data_args.template == "gemma":
for i in range(len(result["input_ids"])):
result["input_ids"][i][0] = tokenizer.bos_token_id
return result
def preprocess_supervised_dataset(
examples: Dict[str, List[Any]],
tokenizer: "PreTrainedTokenizer",
template: "Template",
data_args: "DataArguments",
) -> Dict[str, List[List[int]]]:
# build inputs with format `<bos> X Y <eos>` and labels with format `<ignore> ... <ignore> Y <eos>`
# for multiturn examples, we only mask the prompt part in each prompt-response pair.
model_inputs = {"input_ids": [], "attention_mask": [], "labels": []}
for i in range(len(examples["prompt"])):
if len(examples["prompt"][i]) % 2 != 1 or len(examples["response"][i]) != 1:
continue
messages = examples["prompt"][i] + examples["response"][i]
input_ids, labels = [], []
for turn_idx, (source_ids, target_ids) in enumerate(
template.encode_multiturn(
tokenizer,
messages,
examples["system"][i],
examples["tools"][i],
data_args.cutoff_len,
data_args.reserved_label_len,
)
):
if data_args.train_on_prompt:
source_mask = source_ids
elif turn_idx != 0 and template.efficient_eos:
source_mask = [tokenizer.eos_token_id] + [IGNORE_INDEX] * (len(source_ids) - 1)
else:
source_mask = [IGNORE_INDEX] * len(source_ids)
input_ids += source_ids + target_ids
labels += source_mask + target_ids
if template.efficient_eos:
input_ids += [tokenizer.eos_token_id]
labels += [tokenizer.eos_token_id]
model_inputs["input_ids"].append(input_ids)
model_inputs["attention_mask"].append([1] * len(input_ids))
model_inputs["labels"].append(labels)
return model_inputs
def preprocess_packed_supervised_dataset(
examples: Dict[str, List[Any]],
tokenizer: "PreTrainedTokenizer",
template: "Template",
data_args: "DataArguments",
) -> Dict[str, List[List[int]]]:
# build inputs with format `<bos> X1 Y1 <eos> <bos> X2 Y2 <eos>`
# and labels with format `<ignore> ... <ignore> Y1 <eos> <ignore> ... <ignore> Y2 <eos>`
model_inputs = {"input_ids": [], "attention_mask": [], "labels": []}
input_ids, labels = [], []
for i in range(len(examples["prompt"])):
if len(examples["prompt"][i]) % 2 != 1 or len(examples["response"][i]) != 1:
continue
messages = examples["prompt"][i] + examples["response"][i]
for source_ids, target_ids in template.encode_multiturn(
tokenizer, messages, examples["system"][i], examples["tools"][i]
):
if data_args.train_on_prompt:
source_mask = source_ids
elif len(input_ids) != 0 and template.efficient_eos:
source_mask = [tokenizer.eos_token_id] + [IGNORE_INDEX] * (len(source_ids) - 1)
else:
source_mask = [IGNORE_INDEX] * len(source_ids)
input_ids += source_ids + target_ids
labels += source_mask + target_ids
if template.efficient_eos:
input_ids += [tokenizer.eos_token_id]
labels += [tokenizer.eos_token_id]
total_length = len(input_ids)
block_size = data_args.cutoff_len
# we drop the small remainder, and if the total_length < block_size, we exclude this batch
total_length = (total_length // block_size) * block_size
# split by chunks of cutoff_len
for i in range(0, total_length, block_size):
if not all(label == IGNORE_INDEX for label in labels[i : i + block_size]):
model_inputs["input_ids"].append(input_ids[i : i + block_size])
model_inputs["attention_mask"].append([1] * block_size)
model_inputs["labels"].append(labels[i : i + block_size])
return model_inputs
def preprocess_unsupervised_dataset(
examples: Dict[str, List[Any]],
tokenizer: "PreTrainedTokenizer",
template: "Template",
data_args: "DataArguments",
) -> Dict[str, List[List[int]]]:
# build inputs with format `<bos> X` and labels with format `Y <eos>`
model_inputs = {"input_ids": [], "attention_mask": [], "labels": []}
for i in range(len(examples["prompt"])):
if len(examples["prompt"][i]) % 2 != 1:
continue
if len(examples["response"][i]) == 1:
messages = examples["prompt"][i] + examples["response"][i]
else:
messages = examples["prompt"][i] + [{"role": Role.ASSISTANT.value, "content": ""}]
input_ids, labels = template.encode_oneturn(
tokenizer,
messages,
examples["system"][i],
examples["tools"][i],
data_args.cutoff_len,
data_args.reserved_label_len,
)
if template.efficient_eos:
labels += [tokenizer.eos_token_id]
model_inputs["input_ids"].append(input_ids)
model_inputs["attention_mask"].append([1] * len(input_ids))
model_inputs["labels"].append(labels)
return model_inputs
def preprocess_pairwise_dataset(
examples: Dict[str, List[Any]],
tokenizer: "PreTrainedTokenizer",
template: "Template",
data_args: "DataArguments",
) -> Dict[str, List[List[int]]]:
# build input pairs with format `<bos> X`, `Y1 <eos>` and `Y2 <eos>`
model_inputs = {"prompt_ids": [], "chosen_ids": [], "rejected_ids": []}
for i in range(len(examples["prompt"])):
if len(examples["prompt"][i]) % 2 != 1 or len(examples["response"][i]) < 2:
continue
chosen_messages = examples["prompt"][i] + [examples["response"][i][0]]
rejected_messages = examples["prompt"][i] + [examples["response"][i][1]]
prompt_ids, chosen_ids = template.encode_oneturn(
tokenizer,
chosen_messages,
examples["system"][i],
examples["tools"][i],
data_args.cutoff_len,
data_args.reserved_label_len,
)
_, rejected_ids = template.encode_oneturn(
tokenizer,
rejected_messages,
examples["system"][i],
examples["tools"][i],
data_args.cutoff_len,
data_args.reserved_label_len,
)
if template.efficient_eos:
chosen_ids += [tokenizer.eos_token_id]
rejected_ids += [tokenizer.eos_token_id]
model_inputs["prompt_ids"].append(prompt_ids)
model_inputs["chosen_ids"].append(chosen_ids)
model_inputs["rejected_ids"].append(rejected_ids)
return model_inputs
def print_supervised_dataset_example(example: Dict[str, List[int]], tokenizer: "PreTrainedTokenizer") -> None:
print("input_ids:\n{}".format(example["input_ids"]))
print("inputs:\n{}".format(tokenizer.decode(example["input_ids"], skip_special_tokens=False)))
print("label_ids:\n{}".format(example["labels"]))
print(
"labels:\n{}".format(
tokenizer.decode(list(filter(lambda x: x != IGNORE_INDEX, example["labels"])), skip_special_tokens=False)
)
)
def print_pairwise_dataset_example(example: Dict[str, List[int]], tokenizer: "PreTrainedTokenizer") -> None:
print("prompt_ids:\n{}".format(example["prompt_ids"]))
print("prompt:\n{}".format(tokenizer.decode(example["prompt_ids"], skip_special_tokens=False)))
print("chosen_ids:\n{}".format(example["chosen_ids"]))
print("chosen:\n{}".format(tokenizer.decode(example["chosen_ids"], skip_special_tokens=False)))
print("rejected_ids:\n{}".format(example["rejected_ids"]))
print("rejected:\n{}".format(tokenizer.decode(example["rejected_ids"], skip_special_tokens=False)))
def print_unsupervised_dataset_example(example: Dict[str, List[int]], tokenizer: "PreTrainedTokenizer") -> None:
print("input_ids:\n{}".format(example["input_ids"]))
print("inputs:\n{}".format(tokenizer.decode(example["input_ids"], skip_special_tokens=False)))
def get_preprocess_and_print_func(
tokenizer: "PreTrainedTokenizer",
template: "Template",
data_args: "DataArguments",
training_args: "Seq2SeqTrainingArguments",
stage: Literal["pt", "sft", "rm", "ppo"],
) -> Tuple[Callable, Callable]:
if stage == "pt":
preprocess_func = partial(preprocess_pretrain_dataset, tokenizer=tokenizer, data_args=data_args)
print_function = partial(print_unsupervised_dataset_example, tokenizer=tokenizer)
elif stage == "sft" and not training_args.predict_with_generate:
if data_args.packing:
preprocess_func = partial(
preprocess_packed_supervised_dataset, tokenizer=tokenizer, template=template, data_args=data_args
)
else:
preprocess_func = partial(
preprocess_supervised_dataset, tokenizer=tokenizer, template=template, data_args=data_args
)
print_function = partial(print_supervised_dataset_example, tokenizer=tokenizer)
elif stage == "rm":
preprocess_func = partial(
preprocess_pairwise_dataset, tokenizer=tokenizer, template=template, data_args=data_args
)
print_function = partial(print_pairwise_dataset_example, tokenizer=tokenizer)
else:
preprocess_func = partial(
preprocess_unsupervised_dataset, tokenizer=tokenizer, template=template, data_args=data_args
)
print_function = partial(print_unsupervised_dataset_example, tokenizer=tokenizer)
return preprocess_func, print_function
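
To make the supervised masking concrete, a worked example with hypothetical token ids for a single turn (train_on_prompt off, efficient_eos off):

# source_ids = [1, 10, 11]   (templated prompt)
# target_ids = [20, 21, 2]   (response)
# input_ids      -> [1, 10, 11, 20, 21, 2]
# labels         -> [IGNORE_INDEX, IGNORE_INDEX, IGNORE_INDEX, 20, 21, 2]
# attention_mask -> [1, 1, 1, 1, 1, 1]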


@@ -0,0 +1,773 @@
from dataclasses import dataclass
from typing import TYPE_CHECKING, Dict, List, Optional, Sequence, Tuple, Union
from ..extras.logging import get_logger
from .formatter import EmptyFormatter, FunctionFormatter, StringFormatter, ToolFormatter
from .utils import Role, infer_max_len
if TYPE_CHECKING:
from transformers import PreTrainedTokenizer
from .formatter import SLOTS, Formatter
logger = get_logger(__name__)
@dataclass
class Template:
format_user: "Formatter"
format_assistant: "Formatter"
format_system: "Formatter"
format_function: "Formatter"
format_observation: "Formatter"
format_tools: "Formatter"
format_separator: "Formatter"
default_system: str
stop_words: List[str]
efficient_eos: bool
replace_eos: bool
force_system: bool
def encode_oneturn(
self,
tokenizer: "PreTrainedTokenizer",
messages: List[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
cutoff_len: int = 1_000_000,
reserved_label_len: int = 1,
) -> Tuple[List[int], List[int]]:
r"""
Returns a single pair of token ids representing prompt and response respectively.
"""
encoded_pairs = self._encode(tokenizer, messages, system, tools, cutoff_len, reserved_label_len)
prompt_ids = []
for query_ids, resp_ids in encoded_pairs[:-1]:
prompt_ids += query_ids + resp_ids
prompt_ids = prompt_ids + encoded_pairs[-1][0]
answer_ids = encoded_pairs[-1][1]
return prompt_ids, answer_ids
def encode_multiturn(
self,
tokenizer: "PreTrainedTokenizer",
messages: List[Dict[str, str]],
system: Optional[str] = None,
tools: Optional[str] = None,
cutoff_len: int = 1_000_000,
reserved_label_len: int = 1,
) -> Sequence[Tuple[List[int], List[int]]]:
r"""
Returns multiple pairs of token ids representing prompts and responses respectively.
"""
return self._encode(tokenizer, messages, system, tools, cutoff_len, reserved_label_len)
def _encode(
self,
tokenizer: "PreTrainedTokenizer",
messages: List[Dict[str, str]],
system: str,
tools: str,
cutoff_len: int,
reserved_label_len: int,
) -> Sequence[Tuple[List[int], List[int]]]:
r"""
Encodes formatted inputs to pairs of token ids.
        Turn 0: system + query => resp
        Turn t: sep + query    => resp
"""
system = system or self.default_system
encoded_messages = []
for i, message in enumerate(messages):
elements = []
if i == 0 and (system or tools or self.force_system):
tool_text = self.format_tools.apply(content=tools)[0] if tools else ""
elements += self.format_system.apply(content=(system + tool_text))
elif i > 0 and i % 2 == 0:
elements += self.format_separator.apply()
if message["role"] == Role.USER.value:
elements += self.format_user.apply(content=message["content"], idx=str(i // 2))
elif message["role"] == Role.ASSISTANT.value:
elements += self.format_assistant.apply(content=message["content"])
elif message["role"] == Role.OBSERVATION.value:
elements += self.format_observation.apply(content=message["content"])
elif message["role"] == Role.FUNCTION.value:
elements += self.format_function.apply(content=message["content"])
else:
raise NotImplementedError("Unexpected role: {}".format(message["role"]))
encoded_messages.append(self._convert_elements_to_ids(tokenizer, elements))
return self._make_pairs(encoded_messages, cutoff_len, reserved_label_len)
def _convert_elements_to_ids(
self, tokenizer: "PreTrainedTokenizer", elements: List[Union[str, Dict[str, str]]]
) -> List[int]:
r"""
Converts elements to token ids.
"""
token_ids = []
for elem in elements:
if isinstance(elem, str):
if len(elem) != 0:
token_ids += tokenizer.encode(elem, add_special_tokens=False)
elif isinstance(elem, dict):
token_ids += [tokenizer.convert_tokens_to_ids(elem.get("token"))]
elif isinstance(elem, set):
if "bos_token" in elem and tokenizer.bos_token_id is not None:
token_ids += [tokenizer.bos_token_id]
elif "eos_token" in elem and tokenizer.eos_token_id is not None:
token_ids += [tokenizer.eos_token_id]
else:
raise ValueError("Input must be string, set[str] or dict[str, str], got {}".format(type(elem)))
return token_ids
def _make_pairs(
self,
encoded_messages: Sequence[List[int]],
cutoff_len: int,
reserved_label_len: int,
) -> Sequence[Tuple[List[int], List[int]]]:
encoded_pairs = []
total_length = 0
for i in range(0, len(encoded_messages), 2):
if total_length >= cutoff_len:
break
max_source_len, max_target_len = infer_max_len(
source_len=len(encoded_messages[i]),
target_len=len(encoded_messages[i + 1]),
max_len=(cutoff_len - total_length),
reserved_label_len=reserved_label_len,
)
source_ids = encoded_messages[i][:max_source_len]
target_ids = encoded_messages[i + 1][:max_target_len]
total_length += len(source_ids) + len(target_ids)
encoded_pairs.append((source_ids, target_ids))
return encoded_pairs
@dataclass
class Llama2Template(Template):
def _encode(
self,
tokenizer: "PreTrainedTokenizer",
messages: List[Dict[str, str]],
system: str,
tools: str,
cutoff_len: int,
reserved_label_len: int,
) -> Sequence[Tuple[List[int], List[int]]]:
r"""
Encodes formatted inputs to pairs of token ids.
        Turn 0: system + query => resp
        Turn t: sep + query    => resp
"""
system = system or self.default_system
encoded_messages = []
for i, message in enumerate(messages):
elements = []
system_text = ""
if i == 0 and (system or tools or self.force_system):
tool_text = self.format_tools.apply(content=tools)[0] if tools else ""
system_text = self.format_system.apply(content=(system + tool_text))[0]
elif i > 0 and i % 2 == 0:
elements += self.format_separator.apply()
if message["role"] == Role.USER.value:
elements += self.format_user.apply(content=system_text + message["content"])
elif message["role"] == Role.ASSISTANT.value:
elements += self.format_assistant.apply(content=message["content"])
elif message["role"] == Role.OBSERVATION.value:
elements += self.format_observation.apply(content=message["content"])
elif message["role"] == Role.FUNCTION.value:
elements += self.format_function.apply(content=message["content"])
else:
raise NotImplementedError("Unexpected role: {}".format(message["role"]))
encoded_messages.append(self._convert_elements_to_ids(tokenizer, elements))
return self._make_pairs(encoded_messages, cutoff_len, reserved_label_len)
templates: Dict[str, Template] = {}
def _register_template(
name: str,
format_user: Optional["Formatter"] = None,
format_assistant: Optional["Formatter"] = None,
format_system: Optional["Formatter"] = None,
format_function: Optional["Formatter"] = None,
format_observation: Optional["Formatter"] = None,
format_tools: Optional["Formatter"] = None,
format_separator: Optional["Formatter"] = None,
default_system: str = "",
stop_words: List[str] = [],
efficient_eos: bool = False,
replace_eos: bool = False,
force_system: bool = False,
) -> None:
r"""
Registers a chat template.
To add the following chat template:
```
[HUMAN]:
user prompt here
[AI]:
model response here
[HUMAN]:
user prompt here
[AI]:
model response here
```
The corresponding code should be:
```
_register_template(
name="custom",
format_user=StringFormatter(slots=["[HUMAN]:\n{{content}}\n[AI]:\n"]),
format_separator=EmptyFormatter(slots=["\n\n"]),
efficient_eos=True,
)
```
"""
eos_slots = [] if efficient_eos else [{"eos_token"}]
template_class = Llama2Template if name.startswith("llama2") else Template
default_user_formatter = StringFormatter(slots=["{{content}}"])
default_assistant_formatter = StringFormatter(slots=["{{content}}"] + eos_slots)
default_function_formatter = FunctionFormatter(slots=["Action: {{name}}\nAction Input: {{arguments}}"] + eos_slots)
default_tool_formatter = ToolFormatter(tool_format="default")
default_separator_formatter = EmptyFormatter()
templates[name] = template_class(
format_user=format_user or default_user_formatter,
format_assistant=format_assistant or default_assistant_formatter,
format_system=format_system or default_user_formatter,
format_function=format_function or default_function_formatter,
format_observation=format_observation or format_user or default_user_formatter,
format_tools=format_tools or default_tool_formatter,
format_separator=format_separator or default_separator_formatter,
default_system=default_system,
stop_words=stop_words,
efficient_eos=efficient_eos,
replace_eos=replace_eos,
force_system=force_system,
)
def _add_or_replace_eos_token(tokenizer: "PreTrainedTokenizer", eos_token: str) -> None:
is_added = tokenizer.eos_token_id is None
num_added_tokens = tokenizer.add_special_tokens({"eos_token": eos_token})
if is_added:
logger.info("Add eos token: {}".format(tokenizer.eos_token))
else:
logger.info("Replace eos token: {}".format(tokenizer.eos_token))
if num_added_tokens > 0:
logger.warning("New tokens have been added, make sure `resize_vocab` is True.")
def _jinja_escape(content: str) -> str:
return content.replace("\n", r"\n").replace("'", r"\'")
def _convert_slots_to_jinja(slots: "SLOTS", tokenizer: "PreTrainedTokenizer", placeholder: str = "content") -> str:
slot_items = []
for slot in slots:
if isinstance(slot, str):
slot_pieces = slot.split("{{content}}")
if slot_pieces[0]:
slot_items.append("'" + _jinja_escape(slot_pieces[0]) + "'")
if len(slot_pieces) > 1:
slot_items.append(placeholder)
if slot_pieces[1]:
slot_items.append("'" + _jinja_escape(slot_pieces[1]) + "'")
elif isinstance(slot, set):
if "bos_token" in slot:
slot_items.append("'" + tokenizer.bos_token + "'")
elif "eos_token" in slot: # do not use {{ eos_token }} since it may be replaced
slot_items.append("'" + tokenizer.eos_token + "'")
elif isinstance(slot, dict):
raise ValueError("Dict is not supported.")
return " + ".join(slot_items)
def _get_jinja_template(template: "Template", tokenizer: "PreTrainedTokenizer") -> str:
jinja_template = ""
if template.default_system:
jinja_template += "{% set system_message = '" + _jinja_escape(template.default_system) + "' %}"
jinja_template += (
"{% if messages[0]['role'] == 'system' %}" "{% set system_message = messages[0]['content'] %}" "{% endif %}"
)
system_message = _convert_slots_to_jinja(template.format_system.apply(), tokenizer, placeholder="system_message")
if isinstance(template, Llama2Template):
pass
elif template.force_system:
jinja_template += "{{ " + system_message + " }}"
else:
jinja_template += "{% if system_message is defined %}{{ " + system_message + " }}{% endif %}"
jinja_template += "{% for message in messages %}"
jinja_template += "{% set content = message['content'] %}"
if isinstance(template, Llama2Template):
jinja_template += "{% if loop.index0 == 0 and system_message is defined %}"
jinja_template += "{% set content = " + system_message + " + message['content'] %}"
jinja_template += "{% endif %}"
jinja_template += "{% if message['role'] == 'user' %}"
user_message = _convert_slots_to_jinja(template.format_user.apply(), tokenizer)
jinja_template += "{{ " + user_message + " }}"
jinja_template += "{% elif message['role'] == 'assistant' %}"
assistant_message = _convert_slots_to_jinja(
template.format_assistant.apply() + template.format_separator.apply(), tokenizer
)
jinja_template += "{{ " + assistant_message + " }}"
jinja_template += "{% endif %}"
jinja_template += "{% endfor %}"
return jinja_template
def get_template_and_fix_tokenizer(
tokenizer: "PreTrainedTokenizer",
name: Optional[str] = None,
) -> Template:
if name is None:
template = templates["vanilla"] # placeholder
else:
template = templates.get(name, None)
if template is None:
raise ValueError("Template {} does not exist.".format(name))
stop_words = template.stop_words
if template.replace_eos:
if not stop_words:
raise ValueError("Stop words are required to replace the EOS token.")
_add_or_replace_eos_token(tokenizer, eos_token=stop_words[0])
stop_words = stop_words[1:]
if tokenizer.eos_token_id is None:
_add_or_replace_eos_token(tokenizer, eos_token="<|endoftext|>")
if tokenizer.pad_token_id is None:
tokenizer.pad_token = tokenizer.eos_token
logger.info("Add pad token: {}".format(tokenizer.pad_token))
if stop_words:
num_added_tokens = tokenizer.add_special_tokens(
dict(additional_special_tokens=stop_words), replace_additional_special_tokens=False
)
logger.info("Add {} to stop words.".format(",".join(stop_words)))
if num_added_tokens > 0:
logger.warning("New tokens have been added, make sure `resize_vocab` is True.")
try:
tokenizer.chat_template = _get_jinja_template(template, tokenizer)
except ValueError:
logger.info("Cannot add this chat template to tokenizer.")
return template
_register_template(
name="alpaca",
format_user=StringFormatter(slots=["### Instruction:\n{{content}}\n\n### Response:\n"]),
format_separator=EmptyFormatter(slots=["\n\n"]),
default_system=(
"Below is an instruction that describes a task. " "Write a response that appropriately completes the request."
),
)
_register_template(
name="aquila",
format_user=StringFormatter(slots=["Human: {{content}}###Assistant:"]),
format_separator=EmptyFormatter(slots=["###"]),
default_system=(
"A chat between a curious human and an artificial intelligence assistant. "
"The assistant gives helpful, detailed, and polite answers to the human's questions."
),
stop_words=["</s>"],
efficient_eos=True,
)
_register_template(
name="atom",
format_user=StringFormatter(
slots=[{"bos_token"}, "Human: {{content}}\n", {"eos_token"}, {"bos_token"}, "Assistant:"]
),
format_assistant=StringFormatter(slots=["{{content}}\n", {"eos_token"}]),
)
_register_template(
name="baichuan",
format_user=StringFormatter(slots=["<reserved_102>{{content}}<reserved_103>"]),
efficient_eos=True,
)
_register_template(
name="baichuan2",
format_user=StringFormatter(slots=["<reserved_106>{{content}}<reserved_107>"]),
efficient_eos=True,
)
_register_template(
name="belle",
format_user=StringFormatter(slots=["Human: {{content}}\n\nBelle: "]),
format_system=StringFormatter(slots=[{"bos_token"}, "{{content}}"]),
format_separator=EmptyFormatter(slots=["\n\n"]),
force_system=True,
)
_register_template(
name="bluelm",
format_user=StringFormatter(slots=[{"token": "[|Human|]:"}, "{{content}}", {"token": "[|AI|]:"}]),
)
_register_template(
name="chatglm2",
format_user=StringFormatter(slots=["[Round {{idx}}]\n\n问:{{content}}\n\n答:"]),
format_system=StringFormatter(slots=[{"token": "[gMASK]"}, {"token": "sop"}, "{{content}}"]),
format_separator=EmptyFormatter(slots=["\n\n"]),
efficient_eos=True,
force_system=True,
)
_register_template(
name="chatglm3",
format_user=StringFormatter(slots=[{"token": "<|user|>"}, "\n", "{{content}}", {"token": "<|assistant|>"}]),
format_assistant=StringFormatter(slots=["\n", "{{content}}"]),
format_system=StringFormatter(slots=[{"token": "[gMASK]"}, {"token": "sop"}, "{{content}}"]),
format_function=FunctionFormatter(slots=["{{name}}\n{{arguments}}"]),
format_observation=StringFormatter(
slots=[{"token": "<|observation|>"}, "\n", "{{content}}", {"token": "<|assistant|>"}]
),
stop_words=["<|user|>", "<|observation|>"],
efficient_eos=True,
force_system=True,
)
_register_template(
name="chatglm3_system",
format_user=StringFormatter(slots=[{"token": "<|user|>"}, "\n", "{{content}}", {"token": "<|assistant|>"}]),
format_assistant=StringFormatter(slots=["\n", "{{content}}"]),
format_system=StringFormatter(
slots=[{"token": "[gMASK]"}, {"token": "sop"}, {"token": "<|system|>"}, "\n", "{{content}}"]
),
format_function=FunctionFormatter(slots=["{{name}}\n{{arguments}}"]),
format_observation=StringFormatter(
slots=[{"token": "<|observation|>"}, "\n", "{{content}}", {"token": "<|assistant|>"}]
),
default_system=(
"You are ChatGLM3, a large language model trained by Zhipu.AI. "
"Follow the user's instructions carefully. Respond using markdown."
),
stop_words=["<|user|>", "<|observation|>"],
efficient_eos=True,
)
_register_template(
name="chatml",
format_user=StringFormatter(slots=["<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n"]),
format_system=StringFormatter(slots=["<|im_start|>system\n{{content}}<|im_end|>\n"]),
format_separator=EmptyFormatter(slots=["\n"]),
stop_words=["<|im_end|>", "<|im_start|>"],
replace_eos=True,
)
_register_template(
name="chatml_de",
format_user=StringFormatter(slots=["<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n"]),
format_system=StringFormatter(slots=["<|im_start|>system\n{{content}}<|im_end|>\n"]),
format_separator=EmptyFormatter(slots=["\n"]),
default_system="Du bist ein freundlicher und hilfsbereiter KI-Assistent.",
stop_words=["<|im_end|>", "<|im_start|>"],
replace_eos=True,
)
_register_template(
name="codegeex2",
format_system=StringFormatter(slots=[{"token": "[gMASK]"}, {"token": "sop"}, "{{content}}"]),
force_system=True,
)
_register_template(
name="cpm",
format_user=StringFormatter(slots=["<用户>{{content}}<AI>"]),
format_system=StringFormatter(slots=[{"bos_token"}, "{{content}}"]),
force_system=True,
)
_register_template(
name="deepseek",
format_user=StringFormatter(slots=["User: {{content}}\n\nAssistant:"]),
format_system=StringFormatter(slots=[{"bos_token"}, "{{content}}"]),
force_system=True,
)
_register_template(
name="deepseekcoder",
format_user=StringFormatter(slots=["### Instruction:\n{{content}}\n### Response:"]),
format_assistant=StringFormatter(slots=["\n", "{{content}}"]),
format_separator=EmptyFormatter(slots=["\n<|EOT|>\n"]),
default_system=(
"You are an AI programming assistant, utilizing the Deepseek Coder model, "
"developed by Deepseek Company, and you only answer questions related to computer science. "
"For politically sensitive questions, security and privacy issues, "
"and other non-computer science questions, you will refuse to answer\n"
),
stop_words=["<|EOT|>"],
efficient_eos=True,
)
_register_template(
name="default",
format_user=StringFormatter(slots=["Human: {{content}}\nAssistant: "]),
format_system=StringFormatter(slots=["{{content}}\n"]),
format_separator=EmptyFormatter(slots=["\n"]),
)
_register_template(
name="falcon",
format_user=StringFormatter(slots=["User: {{content}}\nFalcon:"]),
format_separator=EmptyFormatter(slots=["\n"]),
efficient_eos=True,
)
_register_template(
name="gemma",
format_user=StringFormatter(slots=["<start_of_turn>user\n{{content}}<end_of_turn>\n<start_of_turn>model\n"]),
format_system=StringFormatter(slots=[{"bos_token"}, "{{content}}"]),
format_separator=EmptyFormatter(slots=["<end_of_turn>\n"]),
efficient_eos=True,
force_system=True,
)
_register_template(
name="intern",
format_user=StringFormatter(slots=["<|User|>:{{content}}", {"token": "<eoh>"}, "\n<|Bot|>:"]),
format_separator=EmptyFormatter(slots=[{"token": "<eoa>"}, "\n"]),
stop_words=["<eoa>"],
efficient_eos=True,
)
_register_template(
name="intern2",
format_user=StringFormatter(slots=["<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n"]),
format_system=StringFormatter(slots=[{"bos_token"}, "<|im_start|>system\n{{content}}<|im_end|>\n"]),
format_separator=EmptyFormatter(slots=["\n"]),
default_system=(
"You are an AI assistant whose name is InternLM (书生·浦语).\n"
"- InternLM (书生·浦语) is a conversational language model that is developed "
"by Shanghai AI Laboratory (上海人工智能实验室). It is designed to be helpful, honest, and harmless.\n"
"- InternLM (书生·浦语) can understand and communicate fluently in the language chosen "
"by the user such as English and 中文."
),
stop_words=["<|im_end|>"],
efficient_eos=True, # internlm2 tokenizer cannot set eos_token_id
)
_register_template(
name="llama2",
format_user=StringFormatter(slots=[{"bos_token"}, "[INST] {{content}} [/INST]"]),
format_system=StringFormatter(slots=["<<SYS>>\n{{content}}\n<</SYS>>\n\n"]),
default_system=(
"You are a helpful, respectful and honest assistant. "
"Always answer as helpfully as possible, while being safe. "
"Your answers should not include any harmful, unethical, "
"racist, sexist, toxic, dangerous, or illegal content. "
"Please ensure that your responses are socially unbiased and positive in nature.\n\n"
"If a question does not make any sense, or is not factually coherent, "
"explain why instead of answering something not correct. "
"If you don't know the answer to a question, please don't share false information."
),
)
_register_template(
name="llama2_zh",
format_user=StringFormatter(slots=[{"bos_token"}, "[INST] {{content}} [/INST]"]),
format_system=StringFormatter(slots=["<<SYS>>\n{{content}}\n<</SYS>>\n\n"]),
default_system="You are a helpful assistant. 你是一个乐于助人的助手。",
)
_register_template(
name="mistral",
format_user=StringFormatter(slots=["[INST] {{content}} [/INST]"]),
format_system=StringFormatter(slots=[{"bos_token"}, "{{content}}"]),
force_system=True,
)
_register_template(
name="olmo",
format_user=StringFormatter(slots=["<|user|>\n{{content}}<|assistant|>"]),
format_assistant=StringFormatter(slots=["{{content}}", {"eos_token"}]),
format_system=StringFormatter(slots=[{"eos_token"}, "{{content}}"]),
force_system=True,
)
_register_template(
name="openchat",
format_user=StringFormatter(slots=["GPT4 Correct User: {{content}}", {"eos_token"}, "GPT4 Correct Assistant:"]),
format_assistant=StringFormatter(slots=["{{content}}", {"eos_token"}]),
format_system=StringFormatter(slots=[{"bos_token"}, "{{content}}"]),
force_system=True,
)
_register_template(
name="orion",
format_user=StringFormatter(slots=["Human: {{content}}\n\nAssistant: ", {"eos_token"}]),
format_system=StringFormatter(slots=[{"bos_token"}, "{{content}}"]),
force_system=True,
)
_register_template(
name="qwen",
format_user=StringFormatter(slots=["<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n"]),
format_system=StringFormatter(slots=["<|im_start|>system\n{{content}}<|im_end|>\n"]),
format_separator=EmptyFormatter(slots=["\n"]),
default_system="You are a helpful assistant.",
stop_words=["<|im_end|>"],
replace_eos=True,
)
_register_template(
name="solar",
format_user=StringFormatter(slots=["### User:\n{{content}}\n\n### Assistant:\n"]),
format_system=StringFormatter(slots=["### System:\n{{content}}\n\n"]),
efficient_eos=True,
)
_register_template(
name="starchat",
format_user=StringFormatter(slots=["<|user|>\n{{content}}<|end|>\n<|assistant|>"]),
format_system=StringFormatter(slots=["<|system|>\n{{content}}<|end|>\n"]),
format_separator=EmptyFormatter(slots=["\n"]),
stop_words=["<|end|>"],
replace_eos=True,
force_system=True,
)
_register_template(
name="vanilla",
)
_register_template(
name="vicuna",
format_user=StringFormatter(slots=["USER: {{content}} ASSISTANT:"]),
default_system=(
"A chat between a curious user and an artificial intelligence assistant. "
"The assistant gives helpful, detailed, and polite answers to the user's questions."
),
)
_register_template(
name="xuanyuan",
format_user=StringFormatter(slots=["Human: {{content}} Assistant:"]),
default_system=(
"以下是用户和人工智能助手之间的对话。用户以Human开头人工智能助手以Assistant开头"
"会对人类提出的问题给出有帮助、高质量、详细和礼貌的回答,并且总是拒绝参与与不道德、"
"不安全、有争议、政治敏感等相关的话题、问题和指示。\n"
),
)
_register_template(
name="xverse",
format_user=StringFormatter(slots=["Human: {{content}}\n\nAssistant: "]),
)
_register_template(
name="yayi",
format_user=StringFormatter(slots=[{"token": "<|Human|>"}, ":\n{{content}}\n\n", {"token": "<|YaYi|>"}, ":"]),
format_system=StringFormatter(slots=[{"token": "<|System|>"}, ":\n{{content}}\n\n"]),
format_separator=EmptyFormatter(slots=["\n\n"]),
default_system=(
"You are a helpful, respectful and honest assistant named YaYi "
"developed by Beijing Wenge Technology Co.,Ltd. "
"Always answer as helpfully as possible, while being safe. "
"Your answers should not include any harmful, unethical, "
"racist, sexist, toxic, dangerous, or illegal content. "
"Please ensure that your responses are socially unbiased and positive in nature.\n\n"
"If a question does not make any sense, or is not factually coherent, "
"explain why instead of answering something not correct. "
"If you don't know the answer to a question, please don't share false information."
),
stop_words=["<|End|>"],
)
_register_template(
name="yi",
format_user=StringFormatter(slots=["<|im_start|>user\n{{content}}<|im_end|>\n<|im_start|>assistant\n"]),
format_separator=EmptyFormatter(slots=["\n"]),
stop_words=["<|im_end|>"],
replace_eos=True,
)
_register_template(
name="yuan",
format_user=StringFormatter(slots=["{{content}}", {"token": "<sep>"}]),
format_separator=EmptyFormatter(slots=["\n"]),
stop_words=["<eod>"],
replace_eos=True,
)
_register_template(
name="zephyr",
format_user=StringFormatter(slots=["<|user|>\n{{content}}", {"eos_token"}, "<|assistant|>"]),
format_assistant=StringFormatter(slots=["\n{{content}}", {"eos_token"}]),
format_system=StringFormatter(slots=["<|system|>\n{{content}}", {"eos_token"}]),
default_system="You are a friendly chatbot who always responds in the style of a pirate",
)
_register_template(
name="ziya",
format_user=StringFormatter(slots=["<human>:{{content}}\n<bot>:"]),
format_separator=EmptyFormatter(slots=["\n"]),
)
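
# A minimal sketch (not part of the original module) of how a slot list such as
# [{"bos_token"}, "[INST] {{content}} [/INST]"] is rendered: set slots stand for
# special tokens looked up on the tokenizer, dict slots are literal tokens, and
# {{content}} is replaced by the message text. The token strings here are assumptions.
if __name__ == "__main__":
    _SPECIAL = {"bos_token": "<s>", "eos_token": "</s>"}  # assumed token values

    def _render(slots, content):
        out = []
        for slot in slots:
            if isinstance(slot, set):  # e.g. {"bos_token"} -> special-token lookup
                out.append(_SPECIAL[next(iter(slot))])
            elif isinstance(slot, dict):  # e.g. {"token": "<sep>"} -> literal token
                out.append(slot["token"])
            else:  # plain string carrying the {{content}} placeholder
                out.append(slot.replace("{{content}}", content))
        return "".join(out)

    # A "mistral"-style user turn: BOS token followed by the [INST] wrapper.
    print(_render([{"bos_token"}, "[INST] {{content}} [/INST]"], "Hi!"))
    # -> <s>[INST] Hi! [/INST]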

View File

@@ -0,0 +1,94 @@
import hashlib
from enum import Enum, unique
from typing import TYPE_CHECKING, Dict, List, Optional, Tuple, Union
from datasets import concatenate_datasets, interleave_datasets
from ..extras.logging import get_logger
if TYPE_CHECKING:
from datasets import Dataset, IterableDataset
from transformers import Seq2SeqTrainingArguments
from llmtuner.hparams import DataArguments
logger = get_logger(__name__)
@unique
class Role(str, Enum):
USER = "user"
ASSISTANT = "assistant"
SYSTEM = "system"
FUNCTION = "function"
OBSERVATION = "observation"
def checksum(data_files: List[str], file_sha1: Optional[str] = None) -> None:
if file_sha1 is None:
logger.warning("Checksum failed: missing SHA-1 hash value in dataset_info.json.")
return
if len(data_files) != 1:
logger.warning("Checksum failed: too many files.")
return
with open(data_files[0], "rb") as f:
sha1 = hashlib.sha1(f.read()).hexdigest()
if sha1 != file_sha1:
logger.warning("Checksum failed: mismatched SHA-1 hash value at {}.".format(data_files[0]))
def infer_max_len(source_len: int, target_len: int, max_len: int, reserved_label_len: int) -> Tuple[int, int]:
max_target_len = int(max_len * (target_len / (source_len + target_len)))
max_target_len = max(max_target_len, reserved_label_len)
max_source_len = max_len - max_target_len
return max_source_len, max_target_len
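# Worked example: with source_len=700, target_len=300, max_len=512 and
# reserved_label_len=1, the response keeps int(512 * 300 / 1000) = 153 tokens
# and the prompt keeps the remaining 512 - 153 = 359, i.e.
#   infer_max_len(700, 300, 512, 1) == (359, 153)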
def merge_dataset(
all_datasets: List[Union["Dataset", "IterableDataset"]],
data_args: "DataArguments",
training_args: "Seq2SeqTrainingArguments",
) -> Union["Dataset", "IterableDataset"]:
if len(all_datasets) == 1:
return all_datasets[0]
elif data_args.mix_strategy == "concat":
if data_args.streaming:
logger.warning("The samples between different datasets will not be mixed in streaming mode.")
return concatenate_datasets(all_datasets)
elif data_args.mix_strategy.startswith("interleave"):
if not data_args.streaming:
logger.warning("We recommend using `mix_strategy=concat` in non-streaming mode.")
return interleave_datasets(
datasets=all_datasets,
probabilities=data_args.interleave_probs,
seed=training_args.seed,
stopping_strategy="first_exhausted" if data_args.mix_strategy.endswith("under") else "all_exhausted",
)
else:
raise ValueError("Unknown mixing strategy.")
def split_dataset(
dataset: Union["Dataset", "IterableDataset"], data_args: "DataArguments", training_args: "Seq2SeqTrainingArguments"
) -> Dict[str, "Dataset"]:
if training_args.do_train:
if data_args.val_size > 1e-6: # Split the dataset
            if data_args.streaming:
                # shuffle before splitting so the held-out slice is not just the head of the stream
                dataset = dataset.shuffle(buffer_size=data_args.buffer_size, seed=training_args.seed)
                val_set = dataset.take(int(data_args.val_size))
                train_set = dataset.skip(int(data_args.val_size))
                return {"train_dataset": train_set, "eval_dataset": val_set}
else:
val_size = int(data_args.val_size) if data_args.val_size > 1 else data_args.val_size
dataset = dataset.train_test_split(test_size=val_size, seed=training_args.seed)
return {"train_dataset": dataset["train"], "eval_dataset": dataset["test"]}
else:
if data_args.streaming:
dataset = dataset.shuffle(buffer_size=data_args.buffer_size, seed=training_args.seed)
return {"train_dataset": dataset}
else: # do_eval or do_predict
return {"eval_dataset": dataset}

View File

@@ -0,0 +1,4 @@
from .evaluator import Evaluator
__all__ = ["Evaluator"]

View File

@@ -0,0 +1,122 @@
# Inspired by: https://github.com/hendrycks/test/blob/master/evaluate_flan.py
import inspect
import json
import os
from typing import Any, Dict, List, Optional
import numpy as np
import torch
from datasets import load_dataset
from tqdm import tqdm, trange
from transformers.utils import cached_file
from ..data import get_template_and_fix_tokenizer
from ..extras.constants import CHOICES, SUBJECTS
from ..hparams import get_eval_args
from ..model import load_model_and_tokenizer
from .template import get_eval_template
class Evaluator:
def __init__(self, args: Optional[Dict[str, Any]] = None) -> None:
self.model_args, self.data_args, self.eval_args, finetuning_args = get_eval_args(args)
self.model, self.tokenizer = load_model_and_tokenizer(self.model_args, finetuning_args)
self.tokenizer.padding_side = "right" # avoid overflow issue in batched inference for llama2
self.template = get_template_and_fix_tokenizer(self.tokenizer, self.data_args.template)
self.eval_template = get_eval_template(self.eval_args.lang)
self.choice_inputs = [
self.tokenizer.encode(self.eval_template.prefix + ch, add_special_tokens=False)[-1] for ch in CHOICES
]
@torch.inference_mode()
def batch_inference(self, batch_input: Dict[str, torch.Tensor]) -> List[str]:
logits = self.model(**batch_input).logits
lengths = torch.sum(batch_input["attention_mask"], dim=-1)
word_probs = torch.stack([logits[i, lengths[i] - 1] for i in range(len(lengths))], dim=0)
choice_probs = torch.nn.functional.softmax(word_probs[:, self.choice_inputs], dim=-1).detach()
return [chr(ord("A") + offset.item()) for offset in torch.argmax(choice_probs, dim=-1)]
def eval(self) -> None:
mapping = cached_file(
path_or_repo_id=os.path.join(self.eval_args.task_dir, self.eval_args.task),
filename="mapping.json",
cache_dir=self.model_args.cache_dir,
token=self.model_args.hf_hub_token,
)
with open(mapping, "r", encoding="utf-8") as f:
categorys: Dict[str, Dict[str, str]] = json.load(f)
category_corrects = {subj: np.array([], dtype="bool") for subj in SUBJECTS}
pbar = tqdm(categorys.keys(), desc="Processing subjects", position=0)
results = {}
for subject in pbar:
if "trust_remote_code" in inspect.signature(load_dataset).parameters: # for datasets==2.16.0
kwargs = {"trust_remote_code": True}
else:
kwargs = {}
dataset = load_dataset(
path=os.path.join(self.eval_args.task_dir, self.eval_args.task),
name=subject,
cache_dir=self.model_args.cache_dir,
download_mode=self.eval_args.download_mode,
token=self.model_args.hf_hub_token,
**kwargs,
)
pbar.set_postfix_str(categorys[subject]["name"])
inputs, outputs, labels = [], [], []
for i in trange(len(dataset[self.data_args.split]), desc="Formatting batches", position=1, leave=False):
support_set = (
dataset["train"].shuffle().select(range(min(self.eval_args.n_shot, len(dataset["train"]))))
)
messages = self.eval_template.format_example(
target_data=dataset[self.data_args.split][i],
support_set=support_set,
subject_name=categorys[subject]["name"],
)
input_ids, _ = self.template.encode_oneturn(tokenizer=self.tokenizer, messages=messages)
inputs.append({"input_ids": input_ids, "attention_mask": [1] * len(input_ids)})
labels.append(messages[-1]["content"])
for i in trange(
0, len(inputs), self.eval_args.batch_size, desc="Predicting batches", position=1, leave=False
):
batch_input = self.tokenizer.pad(
inputs[i : i + self.eval_args.batch_size], return_attention_mask=True, return_tensors="pt"
).to(self.model.device)
preds = self.batch_inference(batch_input)
outputs += preds
corrects = np.array(outputs) == np.array(labels)
category_name = categorys[subject]["category"]
category_corrects[category_name] = np.concatenate([category_corrects[category_name], corrects], axis=0)
category_corrects["Average"] = np.concatenate([category_corrects["Average"], corrects], axis=0)
results[subject] = {str(i): outputs[i] for i in range(len(outputs))}
pbar.close()
self._save_results(category_corrects, results)
    def _save_results(self, category_corrects: Dict[str, np.ndarray], results: Dict[str, Dict[str, str]]) -> None:
score_info = "\n".join(
[
"{:>15}: {:.2f}".format(category_name, 100 * np.mean(category_correct))
for category_name, category_correct in category_corrects.items()
if len(category_correct)
]
)
print(score_info)
if self.eval_args.save_dir is not None:
os.makedirs(self.eval_args.save_dir, exist_ok=False)
with open(os.path.join(self.eval_args.save_dir, "results.json"), "w", encoding="utf-8", newline="\n") as f:
json.dump(results, f, indent=2)
with open(os.path.join(self.eval_args.save_dir, "results.log"), "w", encoding="utf-8", newline="\n") as f:
f.write(score_info)
if __name__ == "__main__":
evaluator = Evaluator()
evaluator.eval()
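# A programmatic invocation sketch; the keys below mirror common LLaMA-Factory
# evaluation options and are assumptions rather than a verified schema:
#   Evaluator(args={
#       "model_name_or_path": "meta-llama/Llama-2-7b-hf",
#       "template": "llama2",
#       "task": "mmlu",        # resolved as <task_dir>/mmlu with a mapping.json
#       "lang": "en",          # selects an eval template via get_eval_template
#       "n_shot": 5,
#       "batch_size": 4,
#   }).eval()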

View File

@@ -0,0 +1,67 @@
from dataclasses import dataclass
from typing import TYPE_CHECKING, Dict, List, Tuple
from ..data import Role
from ..extras.constants import CHOICES
if TYPE_CHECKING:
from datasets import Dataset
@dataclass
class EvalTemplate:
system: str
choice: str
answer: str
prefix: str
def parse_example(self, example: Dict[str, str]) -> Tuple[str, str]:
candidates = [self.choice.format(choice=ch, content=example[ch]) for ch in CHOICES if ch in example]
return "".join([example["question"]] + candidates + [self.answer]), example["answer"]
def format_example(
self, target_data: Dict[str, str], support_set: "Dataset", subject_name: str
) -> List[Dict[str, str]]:
messages = []
for k in range(len(support_set)):
prompt, response = self.parse_example(support_set[k])
messages.append({"role": Role.USER, "content": prompt})
messages.append({"role": Role.ASSISTANT, "content": response})
prompt, response = self.parse_example(target_data)
messages.append({"role": Role.USER, "content": prompt})
messages.append({"role": Role.ASSISTANT, "content": response})
messages[0]["content"] = self.system.format(subject=subject_name) + messages[0]["content"]
return messages
eval_templates: Dict[str, "EvalTemplate"] = {}
def register_eval_template(name: str, system: str, choice: str, answer: str, prefix: str) -> None:
eval_templates[name] = EvalTemplate(system=system, choice=choice, answer=answer, prefix=prefix)
def get_eval_template(name: str) -> "EvalTemplate":
eval_template = eval_templates.get(name, None)
assert eval_template is not None, "Template {} does not exist.".format(name)
return eval_template
register_eval_template(
name="en",
system="The following are multiple choice questions (with answers) about {subject}.\n\n",
choice="\n{choice}. {content}",
answer="\nAnswer: ",
prefix=" ",
)
register_eval_template(
name="zh",
system="以下是中国关于{subject}考试的单项选择题,请选出其中的正确答案。\n\n",
choice="\n{choice}. {content}",
answer="\n答案:",
prefix="\n",
)
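
# A toy item (not part of the original module) showing how parse_example flattens
# question, choices and answer; the fields follow CHOICES plus "question"/"answer".
if __name__ == "__main__":
    item = {"question": "1 + 1 = ?", "A": "1", "B": "2", "C": "3", "D": "4", "answer": "B"}
    prompt, answer = get_eval_template("en").parse_example(item)
    print(prompt)  # 1 + 1 = ?\nA. 1\nB. 2\nC. 3\nD. 4\nAnswer: 
    print(answer)  # B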

View File

@@ -0,0 +1,153 @@
import json
import os
import time
from datetime import timedelta
from typing import TYPE_CHECKING
from transformers import TrainerCallback
from transformers.trainer_utils import PREFIX_CHECKPOINT_DIR, has_length
from .constants import LOG_FILE_NAME
from .logging import get_logger
from .misc import fix_valuehead_checkpoint
if TYPE_CHECKING:
from transformers import TrainerControl, TrainerState, TrainingArguments
logger = get_logger(__name__)
class FixValueHeadModelCallback(TrainerCallback):
def on_save(self, args: "TrainingArguments", state: "TrainerState", control: "TrainerControl", **kwargs):
r"""
Event called after a checkpoint save.
"""
if args.should_save:
fix_valuehead_checkpoint(
model=kwargs.pop("model"),
output_dir=os.path.join(args.output_dir, "{}-{}".format(PREFIX_CHECKPOINT_DIR, state.global_step)),
safe_serialization=args.save_safetensors,
)
class LogCallback(TrainerCallback):
def __init__(self, runner=None):
self.runner = runner
self.in_training = False
self.start_time = time.time()
self.cur_steps = 0
self.max_steps = 0
self.elapsed_time = ""
self.remaining_time = ""
def timing(self):
cur_time = time.time()
elapsed_time = cur_time - self.start_time
avg_time_per_step = elapsed_time / self.cur_steps if self.cur_steps != 0 else 0
remaining_time = (self.max_steps - self.cur_steps) * avg_time_per_step
self.elapsed_time = str(timedelta(seconds=int(elapsed_time)))
self.remaining_time = str(timedelta(seconds=int(remaining_time)))
def on_train_begin(self, args: "TrainingArguments", state: "TrainerState", control: "TrainerControl", **kwargs):
r"""
Event called at the beginning of training.
"""
if state.is_local_process_zero:
self.in_training = True
self.start_time = time.time()
self.max_steps = state.max_steps
if os.path.exists(os.path.join(args.output_dir, LOG_FILE_NAME)) and args.overwrite_output_dir:
logger.warning("Previous log file in this folder will be deleted.")
os.remove(os.path.join(args.output_dir, LOG_FILE_NAME))
def on_train_end(self, args: "TrainingArguments", state: "TrainerState", control: "TrainerControl", **kwargs):
r"""
Event called at the end of training.
"""
if state.is_local_process_zero:
self.in_training = False
self.cur_steps = 0
self.max_steps = 0
def on_substep_end(self, args: "TrainingArguments", state: "TrainerState", control: "TrainerControl", **kwargs):
r"""
Event called at the end of an substep during gradient accumulation.
"""
if state.is_local_process_zero and self.runner is not None and self.runner.aborted:
control.should_epoch_stop = True
control.should_training_stop = True
def on_step_end(self, args: "TrainingArguments", state: "TrainerState", control: "TrainerControl", **kwargs):
r"""
Event called at the end of a training step.
"""
if state.is_local_process_zero:
self.cur_steps = state.global_step
self.timing()
if self.runner is not None and self.runner.aborted:
control.should_epoch_stop = True
control.should_training_stop = True
def on_evaluate(self, args: "TrainingArguments", state: "TrainerState", control: "TrainerControl", **kwargs):
r"""
Event called after an evaluation phase.
"""
if state.is_local_process_zero and not self.in_training:
self.cur_steps = 0
self.max_steps = 0
def on_predict(
self, args: "TrainingArguments", state: "TrainerState", control: "TrainerControl", *other, **kwargs
):
r"""
Event called after a successful prediction.
"""
if state.is_local_process_zero and not self.in_training:
self.cur_steps = 0
self.max_steps = 0
def on_log(self, args: "TrainingArguments", state: "TrainerState", control: "TrainerControl", **kwargs) -> None:
r"""
Event called after logging the last logs.
"""
if not state.is_local_process_zero:
return
logs = dict(
current_steps=self.cur_steps,
total_steps=self.max_steps,
loss=state.log_history[-1].get("loss", None),
eval_loss=state.log_history[-1].get("eval_loss", None),
predict_loss=state.log_history[-1].get("predict_loss", None),
reward=state.log_history[-1].get("reward", None),
learning_rate=state.log_history[-1].get("learning_rate", None),
epoch=state.log_history[-1].get("epoch", None),
percentage=round(self.cur_steps / self.max_steps * 100, 2) if self.max_steps != 0 else 100,
elapsed_time=self.elapsed_time,
remaining_time=self.remaining_time,
)
if self.runner is not None:
logger.info(
"{{'loss': {:.4f}, 'learning_rate': {:2.4e}, 'epoch': {:.2f}}}".format(
logs["loss"] or 0, logs["learning_rate"] or 0, logs["epoch"] or 0
)
)
os.makedirs(args.output_dir, exist_ok=True)
with open(os.path.join(args.output_dir, "trainer_log.jsonl"), "a", encoding="utf-8") as f:
f.write(json.dumps(logs) + "\n")
def on_prediction_step(
self, args: "TrainingArguments", state: "TrainerState", control: "TrainerControl", **kwargs
):
r"""
Event called after a prediction step.
"""
eval_dataloader = kwargs.pop("eval_dataloader", None)
if state.is_local_process_zero and has_length(eval_dataloader) and not self.in_training:
if self.max_steps == 0:
self.max_steps = len(eval_dataloader)
self.cur_steps += 1
self.timing()
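
# Illustration (not part of the original module): LogCallback appends one JSON
# object per logging event, so a monitor can replay progress from the file.
# The "output" directory is an assumption; substitute your actual output_dir.
if __name__ == "__main__":
    log_path = os.path.join("output", LOG_FILE_NAME)
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                print("{current_steps}/{total_steps} loss={loss}".format(**entry))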

View File

@@ -0,0 +1,952 @@
from collections import OrderedDict, defaultdict
from enum import Enum
from typing import Dict, Optional
CHOICES = ["A", "B", "C", "D"]
DATA_CONFIG = "dataset_info.json"
DEFAULT_MODULE = defaultdict(str)
DEFAULT_TEMPLATE = defaultdict(str)
FILEEXT2TYPE = {
"arrow": "arrow",
"csv": "csv",
"json": "json",
"jsonl": "json",
"parquet": "parquet",
"txt": "text",
}
IGNORE_INDEX = -100
LAYERNORM_NAMES = {"norm", "ln"}
LOG_FILE_NAME = "trainer_log.jsonl"
METHODS = ["full", "freeze", "lora"]
PEFT_METHODS = ["lora"]
SUBJECTS = ["Average", "STEM", "Social Sciences", "Humanities", "Other"]
SUPPORTED_MODELS = OrderedDict()
TRAINING_STAGES = {
"Supervised Fine-Tuning": "sft",
"Reward Modeling": "rm",
"PPO": "ppo",
"DPO": "dpo",
"Pre-Training": "pt",
}
V_HEAD_WEIGHTS_NAME = "value_head.bin"
V_HEAD_SAFE_WEIGHTS_NAME = "value_head.safetensors"
class DownloadSource(str, Enum):
DEFAULT = "hf"
MODELSCOPE = "ms"
def register_model_group(
models: Dict[str, Dict[DownloadSource, str]],
module: Optional[str] = None,
template: Optional[str] = None,
) -> None:
prefix = None
for name, path in models.items():
if prefix is None:
prefix = name.split("-")[0]
else:
assert prefix == name.split("-")[0], "prefix should be identical."
SUPPORTED_MODELS[name] = path
if module is not None:
DEFAULT_MODULE[prefix] = module
if template is not None:
DEFAULT_TEMPLATE[prefix] = template
register_model_group(
models={
"Baichuan-7B-Base": {
DownloadSource.DEFAULT: "baichuan-inc/Baichuan-7B",
DownloadSource.MODELSCOPE: "baichuan-inc/baichuan-7B",
},
"Baichuan-13B-Base": {
DownloadSource.DEFAULT: "baichuan-inc/Baichuan-13B-Base",
DownloadSource.MODELSCOPE: "baichuan-inc/Baichuan-13B-Base",
},
"Baichuan-13B-Chat": {
DownloadSource.DEFAULT: "baichuan-inc/Baichuan-13B-Chat",
DownloadSource.MODELSCOPE: "baichuan-inc/Baichuan-13B-Chat",
},
},
module="W_pack",
template="baichuan",
)
register_model_group(
models={
"Baichuan2-7B-Base": {
DownloadSource.DEFAULT: "baichuan-inc/Baichuan2-7B-Base",
DownloadSource.MODELSCOPE: "baichuan-inc/Baichuan2-7B-Base",
},
"Baichuan2-13B-Base": {
DownloadSource.DEFAULT: "baichuan-inc/Baichuan2-13B-Base",
DownloadSource.MODELSCOPE: "baichuan-inc/Baichuan2-13B-Base",
},
"Baichuan2-7B-Chat": {
DownloadSource.DEFAULT: "baichuan-inc/Baichuan2-7B-Chat",
DownloadSource.MODELSCOPE: "baichuan-inc/Baichuan2-7B-Chat",
},
"Baichuan2-13B-Chat": {
DownloadSource.DEFAULT: "baichuan-inc/Baichuan2-13B-Chat",
DownloadSource.MODELSCOPE: "baichuan-inc/Baichuan2-13B-Chat",
},
},
module="W_pack",
template="baichuan2",
)
register_model_group(
models={
"BLOOM-560M": {
DownloadSource.DEFAULT: "bigscience/bloom-560m",
DownloadSource.MODELSCOPE: "AI-ModelScope/bloom-560m",
},
"BLOOM-3B": {
DownloadSource.DEFAULT: "bigscience/bloom-3b",
DownloadSource.MODELSCOPE: "AI-ModelScope/bloom-3b",
},
"BLOOM-7B1": {
DownloadSource.DEFAULT: "bigscience/bloom-7b1",
DownloadSource.MODELSCOPE: "AI-ModelScope/bloom-7b1",
},
},
module="query_key_value",
)
register_model_group(
models={
"BLOOMZ-560M": {
DownloadSource.DEFAULT: "bigscience/bloomz-560m",
DownloadSource.MODELSCOPE: "AI-ModelScope/bloomz-560m",
},
"BLOOMZ-3B": {
DownloadSource.DEFAULT: "bigscience/bloomz-3b",
DownloadSource.MODELSCOPE: "AI-ModelScope/bloomz-3b",
},
"BLOOMZ-7B1-mt": {
DownloadSource.DEFAULT: "bigscience/bloomz-7b1-mt",
DownloadSource.MODELSCOPE: "AI-ModelScope/bloomz-7b1-mt",
},
},
module="query_key_value",
)
register_model_group(
models={
"BlueLM-7B-Base": {
DownloadSource.DEFAULT: "vivo-ai/BlueLM-7B-Base",
DownloadSource.MODELSCOPE: "vivo-ai/BlueLM-7B-Base",
},
"BlueLM-7B-Chat": {
DownloadSource.DEFAULT: "vivo-ai/BlueLM-7B-Chat",
DownloadSource.MODELSCOPE: "vivo-ai/BlueLM-7B-Chat",
},
},
template="bluelm",
)
register_model_group(
models={
"ChatGLM2-6B-Chat": {
DownloadSource.DEFAULT: "THUDM/chatglm2-6b",
DownloadSource.MODELSCOPE: "ZhipuAI/chatglm2-6b",
}
},
module="query_key_value",
template="chatglm2",
)
register_model_group(
models={
"ChatGLM3-6B-Base": {
DownloadSource.DEFAULT: "THUDM/chatglm3-6b-base",
DownloadSource.MODELSCOPE: "ZhipuAI/chatglm3-6b-base",
},
"ChatGLM3-6B-Chat": {
DownloadSource.DEFAULT: "THUDM/chatglm3-6b",
DownloadSource.MODELSCOPE: "ZhipuAI/chatglm3-6b",
},
},
module="query_key_value",
template="chatglm3",
)
register_model_group(
models={
"ChineseLLaMA2-1.3B": {
DownloadSource.DEFAULT: "hfl/chinese-llama-2-1.3b",
DownloadSource.MODELSCOPE: "AI-ModelScope/chinese-llama-2-1.3b",
},
"ChineseLLaMA2-7B": {
DownloadSource.DEFAULT: "hfl/chinese-llama-2-7b",
DownloadSource.MODELSCOPE: "AI-ModelScope/chinese-llama-2-7b",
},
"ChineseLLaMA2-13B": {
DownloadSource.DEFAULT: "hfl/chinese-llama-2-13b",
DownloadSource.MODELSCOPE: "AI-ModelScope/chinese-llama-2-13b",
},
"ChineseLLaMA2-1.3B-Chat": {
DownloadSource.DEFAULT: "hfl/chinese-alpaca-2-1.3b",
DownloadSource.MODELSCOPE: "AI-ModelScope/chinese-alpaca-2-1.3b",
},
"ChineseLLaMA2-7B-Chat": {
DownloadSource.DEFAULT: "hfl/chinese-alpaca-2-7b",
DownloadSource.MODELSCOPE: "AI-ModelScope/chinese-alpaca-2-7b",
},
"ChineseLLaMA2-13B-Chat": {
DownloadSource.DEFAULT: "hfl/chinese-alpaca-2-13b",
DownloadSource.MODELSCOPE: "AI-ModelScope/chinese-alpaca-2-13b",
},
},
template="llama2_zh",
)
register_model_group(
models={
"DeepSeek-LLM-7B-Base": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-llm-7b-base",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-llm-7b-base",
},
"DeepSeek-LLM-67B-Base": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-llm-67b-base",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-llm-67b-base",
},
"DeepSeek-LLM-7B-Chat": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-llm-7b-chat",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-llm-7b-chat",
},
"DeepSeek-LLM-67B-Chat": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-llm-67b-chat",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-llm-67b-chat",
},
"DeepSeek-Math-7B-Base": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-math-7b-base",
},
"DeepSeek-Math-7B-Chat": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-math-7b-instruct",
},
"DeepSeek-MoE-16B-Base": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-moe-16b-base",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-moe-16b-base",
},
"DeepSeek-MoE-16B-Chat": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-moe-16b-chat",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-moe-16b-chat",
},
},
template="deepseek",
)
register_model_group(
models={
"DeepSeekCoder-6.7B-Base": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-coder-6.7b-base",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-coder-6.7b-base",
},
"DeepSeekCoder-7B-Base": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-coder-7b-base-v1.5",
},
"DeepSeekCoder-33B-Base": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-coder-33b-base",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-coder-33b-base",
},
"DeepSeekCoder-6.7B-Chat": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-coder-6.7b-instruct",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-coder-6.7b-instruct",
},
"DeepSeekCoder-7B-Chat": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-coder-7b-instruct-v1.5",
},
"DeepSeekCoder-33B-Chat": {
DownloadSource.DEFAULT: "deepseek-ai/deepseek-coder-33b-instruct",
DownloadSource.MODELSCOPE: "deepseek-ai/deepseek-coder-33b-instruct",
},
},
template="deepseekcoder",
)
register_model_group(
models={
"Falcon-7B": {
DownloadSource.DEFAULT: "tiiuae/falcon-7b",
DownloadSource.MODELSCOPE: "AI-ModelScope/falcon-7b",
},
"Falcon-40B": {
DownloadSource.DEFAULT: "tiiuae/falcon-40b",
DownloadSource.MODELSCOPE: "AI-ModelScope/falcon-40b",
},
"Falcon-180B": {
DownloadSource.DEFAULT: "tiiuae/falcon-180b",
DownloadSource.MODELSCOPE: "modelscope/falcon-180B",
},
"Falcon-7B-Chat": {
DownloadSource.DEFAULT: "tiiuae/falcon-7b-instruct",
DownloadSource.MODELSCOPE: "AI-ModelScope/falcon-7b-instruct",
},
"Falcon-40B-Chat": {
DownloadSource.DEFAULT: "tiiuae/falcon-40b-instruct",
DownloadSource.MODELSCOPE: "AI-ModelScope/falcon-40b-instruct",
},
"Falcon-180B-Chat": {
DownloadSource.DEFAULT: "tiiuae/falcon-180b-chat",
DownloadSource.MODELSCOPE: "modelscope/falcon-180B-chat",
},
},
module="query_key_value",
template="falcon",
)
register_model_group(
models={
"Gemma-2B": {
DownloadSource.DEFAULT: "google/gemma-2b",
DownloadSource.MODELSCOPE: "AI-ModelScope/gemma-2b",
},
"Gemma-7B": {
DownloadSource.DEFAULT: "google/gemma-7b",
DownloadSource.MODELSCOPE: "AI-ModelScope/gemma-2b-it",
},
"Gemma-2B-Chat": {
DownloadSource.DEFAULT: "google/gemma-2b-it",
DownloadSource.MODELSCOPE: "AI-ModelScope/gemma-7b",
},
"Gemma-7B-Chat": {
DownloadSource.DEFAULT: "google/gemma-7b-it",
DownloadSource.MODELSCOPE: "AI-ModelScope/gemma-7b-it",
},
},
template="gemma",
)
register_model_group(
models={
"InternLM-7B": {
DownloadSource.DEFAULT: "internlm/internlm-7b",
DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm-7b",
},
"InternLM-20B": {
DownloadSource.DEFAULT: "internlm/internlm-20b",
DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm-20b",
},
"InternLM-7B-Chat": {
DownloadSource.DEFAULT: "internlm/internlm-chat-7b",
DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm-chat-7b",
},
"InternLM-20B-Chat": {
DownloadSource.DEFAULT: "internlm/internlm-chat-20b",
DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm-chat-20b",
},
},
template="intern",
)
register_model_group(
models={
"InternLM2-7B": {
DownloadSource.DEFAULT: "internlm/internlm2-7b",
DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2-7b",
},
"InternLM2-20B": {
DownloadSource.DEFAULT: "internlm/internlm2-20b",
DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2-20b",
},
"InternLM2-7B-Chat": {
DownloadSource.DEFAULT: "internlm/internlm2-chat-7b",
DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2-chat-7b",
},
"InternLM2-20B-Chat": {
DownloadSource.DEFAULT: "internlm/internlm2-chat-20b",
DownloadSource.MODELSCOPE: "Shanghai_AI_Laboratory/internlm2-chat-20b",
},
},
module="wqkv",
template="intern2",
)
register_model_group(
models={
"LingoWhale-8B": {
DownloadSource.DEFAULT: "deeplang-ai/LingoWhale-8B",
DownloadSource.MODELSCOPE: "DeepLang/LingoWhale-8B",
}
},
module="qkv_proj",
)
register_model_group(
models={
"LLaMA-7B": {
DownloadSource.DEFAULT: "huggyllama/llama-7b",
DownloadSource.MODELSCOPE: "skyline2006/llama-7b",
},
"LLaMA-13B": {
DownloadSource.DEFAULT: "huggyllama/llama-13b",
DownloadSource.MODELSCOPE: "skyline2006/llama-13b",
},
"LLaMA-30B": {
DownloadSource.DEFAULT: "huggyllama/llama-30b",
DownloadSource.MODELSCOPE: "skyline2006/llama-30b",
},
"LLaMA-65B": {
DownloadSource.DEFAULT: "huggyllama/llama-65b",
DownloadSource.MODELSCOPE: "skyline2006/llama-65b",
},
}
)
register_model_group(
models={
"LLaMA2-7B": {
DownloadSource.DEFAULT: "meta-llama/Llama-2-7b-hf",
DownloadSource.MODELSCOPE: "modelscope/Llama-2-7b-ms",
},
"LLaMA2-13B": {
DownloadSource.DEFAULT: "meta-llama/Llama-2-13b-hf",
DownloadSource.MODELSCOPE: "modelscope/Llama-2-13b-ms",
},
"LLaMA2-70B": {
DownloadSource.DEFAULT: "meta-llama/Llama-2-70b-hf",
DownloadSource.MODELSCOPE: "modelscope/Llama-2-70b-ms",
},
"LLaMA2-7B-Chat": {
DownloadSource.DEFAULT: "meta-llama/Llama-2-7b-chat-hf",
DownloadSource.MODELSCOPE: "modelscope/Llama-2-7b-chat-ms",
},
"LLaMA2-13B-Chat": {
DownloadSource.DEFAULT: "meta-llama/Llama-2-13b-chat-hf",
DownloadSource.MODELSCOPE: "modelscope/Llama-2-13b-chat-ms",
},
"LLaMA2-70B-Chat": {
DownloadSource.DEFAULT: "meta-llama/Llama-2-70b-chat-hf",
DownloadSource.MODELSCOPE: "modelscope/Llama-2-70b-chat-ms",
},
},
template="llama2",
)
register_model_group(
models={
"Mistral-7B": {
DownloadSource.DEFAULT: "mistralai/Mistral-7B-v0.1",
DownloadSource.MODELSCOPE: "AI-ModelScope/Mistral-7B-v0.1",
},
"Mistral-7B-Chat": {
DownloadSource.DEFAULT: "mistralai/Mistral-7B-Instruct-v0.1",
DownloadSource.MODELSCOPE: "AI-ModelScope/Mistral-7B-Instruct-v0.1",
},
"Mistral-7B-v0.2-Chat": {
DownloadSource.DEFAULT: "mistralai/Mistral-7B-Instruct-v0.2",
DownloadSource.MODELSCOPE: "AI-ModelScope/Mistral-7B-Instruct-v0.2",
},
},
template="mistral",
)
register_model_group(
models={
"Mixtral-8x7B": {
DownloadSource.DEFAULT: "mistralai/Mixtral-8x7B-v0.1",
DownloadSource.MODELSCOPE: "AI-ModelScope/Mixtral-8x7B-v0.1",
},
"Mixtral-8x7B-Chat": {
DownloadSource.DEFAULT: "mistralai/Mixtral-8x7B-Instruct-v0.1",
DownloadSource.MODELSCOPE: "AI-ModelScope/Mixtral-8x7B-Instruct-v0.1",
},
},
template="mistral",
)
register_model_group(
models={
"OLMo-1B": {
DownloadSource.DEFAULT: "allenai/OLMo-1B",
},
"OLMo-7B": {
DownloadSource.DEFAULT: "allenai/OLMo-7B",
DownloadSource.MODELSCOPE: "AI-ModelScope/OLMo-7B",
},
"OLMo-7B-Chat": {
DownloadSource.DEFAULT: "allenai/OLMo-7B-Instruct",
},
},
module="att_proj",
template="olmo",
)
register_model_group(
models={
"OpenChat3.5-7B-Chat": {
DownloadSource.DEFAULT: "openchat/openchat-3.5-0106",
DownloadSource.MODELSCOPE: "myxiongmodel/openchat_3.5",
}
},
template="openchat",
)
register_model_group(
models={
"Orion-14B-Base": {
DownloadSource.DEFAULT: "OrionStarAI/Orion-14B-Base",
DownloadSource.MODELSCOPE: "OrionStarAI/Orion-14B-Base",
},
"Orion-14B-Chat": {
DownloadSource.DEFAULT: "OrionStarAI/Orion-14B-Chat",
DownloadSource.MODELSCOPE: "OrionStarAI/Orion-14B-Chat",
},
"Orion-14B-Long-Chat": {
DownloadSource.DEFAULT: "OrionStarAI/Orion-14B-LongChat",
DownloadSource.MODELSCOPE: "OrionStarAI/Orion-14B-LongChat",
},
"Orion-14B-RAG-Chat": {
DownloadSource.DEFAULT: "OrionStarAI/Orion-14B-Chat-RAG",
DownloadSource.MODELSCOPE: "OrionStarAI/Orion-14B-Chat-RAG",
},
"Orion-14B-Plugin-Chat": {
DownloadSource.DEFAULT: "OrionStarAI/Orion-14B-Chat-Plugin",
DownloadSource.MODELSCOPE: "OrionStarAI/Orion-14B-Chat-Plugin",
},
},
template="orion",
)
register_model_group(
models={
"Phi-1.5-1.3B": {
DownloadSource.DEFAULT: "microsoft/phi-1_5",
DownloadSource.MODELSCOPE: "allspace/PHI_1-5",
},
"Phi-2-2.7B": {
DownloadSource.DEFAULT: "microsoft/phi-2",
DownloadSource.MODELSCOPE: "AI-ModelScope/phi-2",
},
}
)
register_model_group(
models={
"Qwen-1.8B": {
DownloadSource.DEFAULT: "Qwen/Qwen-1_8B",
DownloadSource.MODELSCOPE: "qwen/Qwen-1_8B",
},
"Qwen-7B": {
DownloadSource.DEFAULT: "Qwen/Qwen-7B",
DownloadSource.MODELSCOPE: "qwen/Qwen-7B",
},
"Qwen-14B": {
DownloadSource.DEFAULT: "Qwen/Qwen-14B",
DownloadSource.MODELSCOPE: "qwen/Qwen-14B",
},
"Qwen-72B": {
DownloadSource.DEFAULT: "Qwen/Qwen-72B",
DownloadSource.MODELSCOPE: "qwen/Qwen-72B",
},
"Qwen-1.8B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-1_8B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen-1_8B-Chat",
},
"Qwen-7B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-7B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen-7B-Chat",
},
"Qwen-14B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-14B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen-14B-Chat",
},
"Qwen-72B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-72B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen-72B-Chat",
},
"Qwen-1.8B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-1_8B-Chat-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen-1_8B-Chat-Int8",
},
"Qwen-1.8B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-1_8B-Chat-Int4",
DownloadSource.MODELSCOPE: "qwen/Qwen-1_8B-Chat-Int4",
},
"Qwen-7B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-7B-Chat-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen-7B-Chat-Int8",
},
"Qwen-7B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-7B-Chat-Int4",
DownloadSource.MODELSCOPE: "qwen/Qwen-7B-Chat-Int4",
},
"Qwen-14B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-14B-Chat-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen-14B-Chat-Int8",
},
"Qwen-14B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-14B-Chat-Int4",
DownloadSource.MODELSCOPE: "qwen/Qwen-14B-Chat-Int4",
},
"Qwen-72B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-72B-Chat-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen-72B-Chat-Int8",
},
"Qwen-72B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen-72B-Chat-Int4",
DownloadSource.MODELSCOPE: "qwen/Qwen-72B-Chat-Int4",
},
},
module="c_attn",
template="qwen",
)
register_model_group(
models={
"Qwen1.5-0.5B": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-0.5B",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-0.5B",
},
"Qwen1.5-1.8B": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-1.8B",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-1.8B",
},
"Qwen1.5-4B": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-4B",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-4B",
},
"Qwen1.5-7B": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-7B",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-7B",
},
"Qwen1.5-14B": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-14B",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-14B",
},
"Qwen1.5-72B": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-72B",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-72B",
},
"Qwen1.5-0.5B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-0.5B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-0.5B-Chat",
},
"Qwen1.5-1.8B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-1.8B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-1.8B-Chat",
},
"Qwen1.5-4B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-4B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-4B-Chat",
},
"Qwen1.5-7B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-7B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-7B-Chat",
},
"Qwen1.5-14B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-14B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-14B-Chat",
},
"Qwen1.5-72B-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-72B-Chat",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-72B-Chat",
},
"Qwen1.5-0.5B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-0.5B-Chat-GPTQ-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-0.5B-Chat-GPTQ-Int8",
},
"Qwen1.5-0.5B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-0.5B-Chat-AWQ",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-0.5B-Chat-AWQ",
},
"Qwen1.5-1.8B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-1.8B-Chat-GPTQ-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-1.8B-Chat-GPTQ-Int8",
},
"Qwen1.5-1.8B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-1.8B-Chat-AWQ",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-1.8B-Chat-AWQ",
},
"Qwen1.5-4B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-4B-Chat-GPTQ-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-4B-Chat-GPTQ-Int8",
},
"Qwen1.5-4B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-4B-Chat-AWQ",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-4B-Chat-AWQ",
},
"Qwen1.5-7B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-7B-Chat-GPTQ-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-7B-Chat-GPTQ-Int8",
},
"Qwen1.5-7B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-7B-Chat-AWQ",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-7B-Chat-AWQ",
},
"Qwen1.5-14B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-14B-Chat-GPTQ-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-14B-Chat-GPTQ-Int8",
},
"Qwen1.5-14B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-14B-Chat-AWQ",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-14B-Chat-AWQ",
},
"Qwen1.5-72B-int8-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-72B-Chat-GPTQ-Int8",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-72B-Chat-GPTQ-Int8",
},
"Qwen1.5-72B-int4-Chat": {
DownloadSource.DEFAULT: "Qwen/Qwen1.5-72B-Chat-AWQ",
DownloadSource.MODELSCOPE: "qwen/Qwen1.5-72B-Chat-AWQ",
},
},
template="qwen",
)
register_model_group(
models={
"SOLAR-10.7B": {
DownloadSource.DEFAULT: "upstage/SOLAR-10.7B-v1.0",
},
"SOLAR-10.7B-Chat": {
DownloadSource.DEFAULT: "upstage/SOLAR-10.7B-Instruct-v1.0",
DownloadSource.MODELSCOPE: "AI-ModelScope/SOLAR-10.7B-Instruct-v1.0",
},
},
template="solar",
)
register_model_group(
models={
"Skywork-13B-Base": {
DownloadSource.DEFAULT: "Skywork/Skywork-13B-base",
DownloadSource.MODELSCOPE: "skywork/Skywork-13B-base",
}
}
)
register_model_group(
models={
"StarCoder2-3B": {
DownloadSource.DEFAULT: "bigcode/starcoder2-3b",
},
"StarCoder2-7B": {
DownloadSource.DEFAULT: "bigcode/starcoder2-7b",
},
"StarCoder2-15B": {
DownloadSource.DEFAULT: "bigcode/starcoder2-15b",
},
}
)
register_model_group(
models={
"Vicuna1.5-7B-Chat": {
DownloadSource.DEFAULT: "lmsys/vicuna-7b-v1.5",
DownloadSource.MODELSCOPE: "Xorbits/vicuna-7b-v1.5",
},
"Vicuna1.5-13B-Chat": {
DownloadSource.DEFAULT: "lmsys/vicuna-13b-v1.5",
DownloadSource.MODELSCOPE: "Xorbits/vicuna-13b-v1.5",
},
},
template="vicuna",
)
register_model_group(
models={
"XuanYuan-70B": {
DownloadSource.DEFAULT: "Duxiaoman-DI/XuanYuan-70B",
},
"XuanYuan-70B-Chat": {
DownloadSource.DEFAULT: "Duxiaoman-DI/XuanYuan-70B-Chat",
},
"XuanYuan-70B-int8-Chat": {
DownloadSource.DEFAULT: "Duxiaoman-DI/XuanYuan-70B-Chat-8bit",
},
"XuanYuan-70B-int4-Chat": {
DownloadSource.DEFAULT: "Duxiaoman-DI/XuanYuan-70B-Chat-4bit",
},
},
template="xuanyuan",
)
register_model_group(
models={
"XVERSE-7B": {
DownloadSource.DEFAULT: "xverse/XVERSE-7B",
DownloadSource.MODELSCOPE: "xverse/XVERSE-7B",
},
"XVERSE-13B": {
DownloadSource.DEFAULT: "xverse/XVERSE-13B",
DownloadSource.MODELSCOPE: "xverse/XVERSE-13B",
},
"XVERSE-65B": {
DownloadSource.DEFAULT: "xverse/XVERSE-65B",
DownloadSource.MODELSCOPE: "xverse/XVERSE-65B",
},
"XVERSE-65B-2": {
DownloadSource.DEFAULT: "xverse/XVERSE-65B-2",
DownloadSource.MODELSCOPE: "xverse/XVERSE-65B-2",
},
"XVERSE-7B-Chat": {
DownloadSource.DEFAULT: "xverse/XVERSE-7B-Chat",
DownloadSource.MODELSCOPE: "xverse/XVERSE-7B-Chat",
},
"XVERSE-13B-Chat": {
DownloadSource.DEFAULT: "xverse/XVERSE-13B-Chat",
DownloadSource.MODELSCOPE: "xverse/XVERSE-13B-Chat",
},
"XVERSE-65B-Chat": {
DownloadSource.DEFAULT: "xverse/XVERSE-65B-Chat",
DownloadSource.MODELSCOPE: "xverse/XVERSE-65B-Chat",
},
},
template="xverse",
)
register_model_group(
models={
"Yayi-7B": {
DownloadSource.DEFAULT: "wenge-research/yayi-7b-llama2",
DownloadSource.MODELSCOPE: "AI-ModelScope/yayi-7b-llama2",
},
"Yayi-13B": {
DownloadSource.DEFAULT: "wenge-research/yayi-13b-llama2",
DownloadSource.MODELSCOPE: "AI-ModelScope/yayi-13b-llama2",
},
},
template="yayi",
)
register_model_group(
models={
"Yi-6B": {
DownloadSource.DEFAULT: "01-ai/Yi-6B",
DownloadSource.MODELSCOPE: "01ai/Yi-6B",
},
"Yi-9B": {
DownloadSource.DEFAULT: "01-ai/Yi-9B",
DownloadSource.MODELSCOPE: "01ai/Yi-9B",
},
"Yi-34B": {
DownloadSource.DEFAULT: "01-ai/Yi-34B",
DownloadSource.MODELSCOPE: "01ai/Yi-34B",
},
"Yi-6B-Chat": {
DownloadSource.DEFAULT: "01-ai/Yi-6B-Chat",
DownloadSource.MODELSCOPE: "01ai/Yi-6B-Chat",
},
"Yi-34B-Chat": {
DownloadSource.DEFAULT: "01-ai/Yi-34B-Chat",
DownloadSource.MODELSCOPE: "01ai/Yi-34B-Chat",
},
"Yi-6B-int8-Chat": {
DownloadSource.DEFAULT: "01-ai/Yi-6B-Chat-8bits",
DownloadSource.MODELSCOPE: "01ai/Yi-6B-Chat-8bits",
},
"Yi-6B-int4-Chat": {
DownloadSource.DEFAULT: "01-ai/Yi-6B-Chat-4bits",
DownloadSource.MODELSCOPE: "01ai/Yi-6B-Chat-4bits",
},
"Yi-34B-int8-Chat": {
DownloadSource.DEFAULT: "01-ai/Yi-34B-Chat-8bits",
DownloadSource.MODELSCOPE: "01ai/Yi-34B-Chat-8bits",
},
"Yi-34B-int4-Chat": {
DownloadSource.DEFAULT: "01-ai/Yi-34B-Chat-4bits",
DownloadSource.MODELSCOPE: "01ai/Yi-34B-Chat-4bits",
},
},
template="yi",
)
register_model_group(
models={
"Yuan2-2B-Chat": {
DownloadSource.DEFAULT: "IEITYuan/Yuan2-2B-hf",
DownloadSource.MODELSCOPE: "YuanLLM/Yuan2.0-2B-hf",
},
"Yuan2-51B-Chat": {
DownloadSource.DEFAULT: "IEITYuan/Yuan2-51B-hf",
DownloadSource.MODELSCOPE: "YuanLLM/Yuan2.0-51B-hf",
},
"Yuan2-102B-Chat": {
DownloadSource.DEFAULT: "IEITYuan/Yuan2-102B-hf",
DownloadSource.MODELSCOPE: "YuanLLM/Yuan2.0-102B-hf",
},
},
template="yuan",
)
register_model_group(
models={
"Zephyr-7B-Alpha-Chat": {
DownloadSource.DEFAULT: "HuggingFaceH4/zephyr-7b-alpha",
DownloadSource.MODELSCOPE: "AI-ModelScope/zephyr-7b-alpha",
},
"Zephyr-7B-Beta-Chat": {
DownloadSource.DEFAULT: "HuggingFaceH4/zephyr-7b-beta",
DownloadSource.MODELSCOPE: "modelscope/zephyr-7b-beta",
},
},
template="zephyr",
)
register_model_group(
models={
"Atom-7B": {
DownloadSource.DEFAULT: "FlagAlpha/Atom-7B",
DownloadSource.MODELSCOPE: "FlagAlpha/Atom-7B",
},
"Atom-7B-Chat": {
DownloadSource.DEFAULT: "FlagAlpha/Atom-7B-Chat",
DownloadSource.MODELSCOPE: "FlagAlpha/Atom-7B-Chat",
},
},
template="atom",
)
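
# Lookup sketch (not part of the original module): after the registrations above,
# a UI can resolve a display name to a checkpoint, LoRA target and chat template.
if __name__ == "__main__":
    name = "Qwen1.5-7B-Chat"
    prefix = name.split("-")[0]                            # "Qwen1.5"
    print(SUPPORTED_MODELS[name][DownloadSource.DEFAULT])  # Qwen/Qwen1.5-7B-Chat
    print(DEFAULT_TEMPLATE[prefix])                        # "qwen"
    print(DEFAULT_MODULE[prefix] or "<no module registered for this prefix>")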

View File

@@ -0,0 +1,48 @@
import logging
import sys
class LoggerHandler(logging.Handler):
r"""
    Logger handler used in the Web UI.
"""
def __init__(self):
super().__init__()
self.log = ""
def reset(self):
self.log = ""
def emit(self, record):
if record.name == "httpx":
return
log_entry = self.format(record)
self.log += log_entry
self.log += "\n\n"
def get_logger(name: str) -> logging.Logger:
r"""
    Gets a standard logger with a stream handler to stdout.
"""
formatter = logging.Formatter(
fmt="%(asctime)s - %(levelname)s - %(name)s - %(message)s", datefmt="%m/%d/%Y %H:%M:%S"
)
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(formatter)
logger = logging.getLogger(name)
logger.setLevel(logging.INFO)
logger.addHandler(handler)
return logger
def reset_logging() -> None:
r"""
    Removes the basic config of the root logger (unused in scripts).
"""
root = logging.getLogger()
list(map(root.removeHandler, root.handlers))
list(map(root.removeFilter, root.filters))
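
# Usage sketch (not part of the original module): the Web UI attaches a
# LoggerHandler so formatted records accumulate in .log for display.
if __name__ == "__main__":
    ui_handler = LoggerHandler()
    ui_handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
    demo_logger = get_logger("demo")
    demo_logger.addHandler(ui_handler)
    demo_logger.info("hello")
    print(ui_handler.log, end="")  # INFO - hello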

View File

@@ -0,0 +1,216 @@
import gc
import os
from typing import TYPE_CHECKING, Dict, Tuple
import torch
from peft import PeftModel
from transformers import InfNanRemoveLogitsProcessor, LogitsProcessorList, PreTrainedModel
from transformers.utils import (
SAFE_WEIGHTS_NAME,
WEIGHTS_NAME,
is_torch_bf16_gpu_available,
is_torch_cuda_available,
is_torch_mps_available,
is_torch_npu_available,
is_torch_xpu_available,
)
from transformers.utils.versions import require_version
from .constants import V_HEAD_SAFE_WEIGHTS_NAME, V_HEAD_WEIGHTS_NAME
from .logging import get_logger
_is_fp16_available = is_torch_npu_available() or is_torch_cuda_available()
try:
_is_bf16_available = is_torch_bf16_gpu_available()
except Exception:
_is_bf16_available = False
if TYPE_CHECKING:
from trl import AutoModelForCausalLMWithValueHead
from llmtuner.hparams import ModelArguments
logger = get_logger(__name__)
class AverageMeter:
r"""
Computes and stores the average and current value.
"""
def __init__(self):
self.reset()
def reset(self):
self.val = 0
self.avg = 0
self.sum = 0
self.count = 0
def update(self, val, n=1):
self.val = val
self.sum += val * n
self.count += n
self.avg = self.sum / self.count
def check_dependencies() -> None:
if int(os.environ.get("DISABLE_VERSION_CHECK", "0")):
logger.warning("Version checking has been disabled, may lead to unexpected behaviors.")
else:
require_version("transformers>=4.37.2", "To fix: pip install transformers>=4.37.2")
require_version("datasets>=2.14.3", "To fix: pip install datasets>=2.14.3")
require_version("accelerate>=0.27.2", "To fix: pip install accelerate>=0.27.2")
require_version("peft>=0.9.0", "To fix: pip install peft>=0.9.0")
require_version("trl>=0.7.11", "To fix: pip install trl>=0.7.11")
def count_parameters(model: torch.nn.Module) -> Tuple[int, int]:
r"""
Returns the number of trainable parameters and number of all parameters in the model.
"""
trainable_params, all_param = 0, 0
for param in model.parameters():
num_params = param.numel()
# if using DS Zero 3 and the weights are initialized empty
if num_params == 0 and hasattr(param, "ds_numel"):
num_params = param.ds_numel
# Due to the design of 4bit linear layers from bitsandbytes, multiply the number of parameters by 2
if param.__class__.__name__ == "Params4bit":
num_params = num_params * 2
all_param += num_params
if param.requires_grad:
trainable_params += num_params
return trainable_params, all_param
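# Example: a plain torch.nn.Linear(10, 10) has 10 * 10 + 10 = 110 parameters,
# all trainable, so count_parameters(model) returns (110, 110); with LoRA
# adapters attached, the trainable share is typically well under 1%.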
def fix_valuehead_checkpoint(
model: "AutoModelForCausalLMWithValueHead", output_dir: str, safe_serialization: bool
) -> None:
r"""
The model is already unwrapped.
There are three cases:
1. full tuning without ds_zero3: state_dict = {"model.layers.*": ..., "v_head.summary.*": ...}
2. lora tuning without ds_zero3: state_dict = {"v_head.summary.*": ...}
3. under deepspeed zero3: state_dict = {"pretrained_model.model.layers.*": ..., "v_head.summary.*": ...}
We assume `stage3_gather_16bit_weights_on_model_save=true`.
"""
if not isinstance(model.pretrained_model, (PreTrainedModel, PeftModel)):
return
if safe_serialization:
from safetensors import safe_open
from safetensors.torch import save_file
path_to_checkpoint = os.path.join(output_dir, SAFE_WEIGHTS_NAME)
with safe_open(path_to_checkpoint, framework="pt", device="cpu") as f:
state_dict: Dict[str, torch.Tensor] = {key: f.get_tensor(key) for key in f.keys()}
else:
path_to_checkpoint = os.path.join(output_dir, WEIGHTS_NAME)
state_dict: Dict[str, torch.Tensor] = torch.load(path_to_checkpoint, map_location="cpu")
decoder_state_dict = {}
v_head_state_dict = {}
for name, param in state_dict.items():
if name.startswith("v_head."):
v_head_state_dict[name] = param
else:
decoder_state_dict[name.replace("pretrained_model.", "")] = param
os.remove(path_to_checkpoint)
model.pretrained_model.save_pretrained(
output_dir, state_dict=decoder_state_dict or None, safe_serialization=safe_serialization
)
if safe_serialization:
save_file(v_head_state_dict, os.path.join(output_dir, V_HEAD_SAFE_WEIGHTS_NAME), metadata={"format": "pt"})
else:
torch.save(v_head_state_dict, os.path.join(output_dir, V_HEAD_WEIGHTS_NAME))
logger.info("Value head model saved at: {}".format(output_dir))
def get_current_device() -> torch.device:
r"""
Gets the current available device.
"""
if is_torch_xpu_available():
device = "xpu:{}".format(os.environ.get("LOCAL_RANK", "0"))
elif is_torch_npu_available():
device = "npu:{}".format(os.environ.get("LOCAL_RANK", "0"))
elif is_torch_mps_available():
device = "mps:{}".format(os.environ.get("LOCAL_RANK", "0"))
elif is_torch_cuda_available():
device = "cuda:{}".format(os.environ.get("LOCAL_RANK", "0"))
else:
device = "cpu"
return torch.device(device)
def get_device_count() -> int:
r"""
Gets the number of available GPU devices.
"""
if not torch.cuda.is_available():
return 0
return torch.cuda.device_count()
def get_logits_processor() -> "LogitsProcessorList":
r"""
Gets logits processor that removes NaN and Inf logits.
"""
logits_processor = LogitsProcessorList()
logits_processor.append(InfNanRemoveLogitsProcessor())
return logits_processor
def infer_optim_dtype(model_dtype: torch.dtype) -> torch.dtype:
r"""
Infers the optimal dtype according to the model_dtype and device compatibility.
"""
if _is_bf16_available and model_dtype == torch.bfloat16:
return torch.bfloat16
elif _is_fp16_available:
return torch.float16
else:
return torch.float32
def torch_gc() -> None:
r"""
Collects GPU memory.
"""
gc.collect()
if torch.cuda.is_available():
torch.cuda.empty_cache()
torch.cuda.ipc_collect()
def try_download_model_from_ms(model_args: "ModelArguments") -> None:
if not use_modelscope() or os.path.exists(model_args.model_name_or_path):
return
try:
from modelscope import snapshot_download
revision = "master" if model_args.model_revision == "main" else model_args.model_revision
model_args.model_name_or_path = snapshot_download(
model_args.model_name_or_path, revision=revision, cache_dir=model_args.cache_dir
)
except ImportError:
raise ImportError("Please install modelscope via `pip install modelscope -U`")
def use_modelscope() -> bool:
return bool(int(os.environ.get("USE_MODELSCOPE_HUB", "0")))
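
# Usage sketch (not part of the original module): pick a device and dtype once,
# then track a running loss with AverageMeter.
if __name__ == "__main__":
    device = get_current_device()
    dtype = infer_optim_dtype(torch.bfloat16)  # falls back to fp16/fp32 if needed
    meter = AverageMeter()
    for loss in (0.9, 0.7, 0.5):
        meter.update(loss)
    print(device, dtype, round(meter.avg, 2))  # e.g. cpu torch.float32 0.7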

View File

@@ -0,0 +1,61 @@
import importlib.metadata
import importlib.util
def _is_package_available(name: str) -> bool:
return importlib.util.find_spec(name) is not None
def _get_package_version(name: str) -> str:
try:
return importlib.metadata.version(name)
except Exception:
return "0.0.0"
def is_fastapi_available():
return _is_package_available("fastapi")
def is_flash_attn2_available():
return _is_package_available("flash_attn") and _get_package_version("flash_attn").startswith("2")
def is_galore_available():
return _is_package_available("galore_torch")
def is_jieba_available():
return _is_package_available("jieba")
def is_matplotlib_available():
return _is_package_available("matplotlib")
def is_nltk_available():
return _is_package_available("nltk")
def is_requests_available():
return _is_package_available("requests")
def is_rouge_available():
return _is_package_available("rouge_chinese")
def is_starlette_available():
return _is_package_available("sse_starlette")
def is_unsloth_available():
return _is_package_available("unsloth")
def is_uvicorn_available():
return _is_package_available("uvicorn")
def is_vllm_available():
return _is_package_available("vllm")

View File

@@ -0,0 +1,197 @@
import math
from typing import Optional, Tuple
import torch
import torch.nn as nn
from transformers.models.llama.modeling_llama import (
Cache,
LlamaAttention,
LlamaFlashAttention2,
apply_rotary_pos_emb,
repeat_kv,
)
from transformers.utils import logging
logger = logging.get_logger(__name__)
# Modified from: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py
def llama_torch_attn_forward(
self: "LlamaAttention",
hidden_states: torch.Tensor,
attention_mask: Optional[torch.Tensor] = None,
position_ids: Optional[torch.LongTensor] = None,
past_key_value: Optional["Cache"] = None,
output_attentions: bool = False,
**kwargs,
) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]:
bsz, q_len, _ = hidden_states.size()
query_states = self.q_proj(hidden_states)
key_states = self.k_proj(hidden_states)
value_states = self.v_proj(hidden_states)
query_states = query_states.view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
key_states = key_states.view(bsz, q_len, self.num_key_value_heads, self.head_dim).transpose(1, 2)
value_states = value_states.view(bsz, q_len, self.num_key_value_heads, self.head_dim).transpose(1, 2)
kv_seq_len = key_states.shape[-2]
if past_key_value is not None:
kv_seq_len += past_key_value.get_usable_length(kv_seq_len, self.layer_idx)
cos, sin = self.rotary_emb(value_states, seq_len=kv_seq_len)
query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin, position_ids)
if past_key_value is not None:
cache_kwargs = {"sin": sin, "cos": cos} # Specific to RoPE models
key_states, value_states = past_key_value.update(key_states, value_states, self.layer_idx, cache_kwargs)
key_states = repeat_kv(key_states, self.num_key_value_groups)
value_states = repeat_kv(value_states, self.num_key_value_groups)
if getattr(self.config, "group_size_ratio", None) and self.training: # shift
groupsz = int(q_len * getattr(self.config, "group_size_ratio"))
assert q_len % groupsz == 0, "q_len {} should be divisible by group size {}.".format(q_len, groupsz)
num_groups = q_len // groupsz
def shift(state: torch.Tensor) -> torch.Tensor:
state = state.transpose(1, 2) # output: (bsz, seq_len, n_heads, head_dim)
state = torch.cat(
(state[:, :, : self.num_heads // 2], state[:, :, self.num_heads // 2 :].roll(-groupsz // 2, dims=1)),
dim=2,
)
return state.reshape(bsz * num_groups, groupsz, self.num_heads, self.head_dim).transpose(1, 2)
query_states, key_states, value_states = shift(query_states), shift(key_states), shift(value_states)
if attention_mask is not None:
attention_mask = attention_mask[:, :, :groupsz, :groupsz].repeat(num_groups, 1, 1, 1)
attn_weights = torch.matmul(query_states, key_states.transpose(2, 3)) / math.sqrt(self.head_dim)
if attention_mask is not None:
attn_weights = attn_weights + attention_mask
# upcast attention to fp32
attn_weights = nn.functional.softmax(attn_weights, dim=-1, dtype=torch.float32).to(query_states.dtype)
attn_weights = nn.functional.dropout(attn_weights, p=self.attention_dropout, training=self.training)
attn_output = torch.matmul(attn_weights, value_states) # (bsz, :, seq_len, :) or (bsz*n_group, :, groupsz, :)
attn_output = attn_output.transpose(1, 2).contiguous()
    if getattr(self.config, "group_size_ratio", None) and self.training:  # shift back
        attn_output = attn_output.reshape(bsz, q_len, self.num_heads, self.head_dim)
        attn_output = torch.cat(
            (
                attn_output[:, :, : self.num_heads // 2],
                attn_output[:, :, self.num_heads // 2 :].roll(groupsz // 2, dims=1),
            ),
            dim=2,
        )
attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)
attn_output = self.o_proj(attn_output)
if not output_attentions:
attn_weights = None
return attn_output, attn_weights, past_key_value
# Modified from: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py
def llama_flash_attn_forward(
self: "LlamaFlashAttention2",
hidden_states: torch.Tensor,
attention_mask: Optional[torch.Tensor] = None,
position_ids: Optional[torch.LongTensor] = None,
past_key_value: Optional[Tuple[torch.Tensor]] = None,
output_attentions: bool = False,
**kwargs,
) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]:
# LlamaFlashAttention2 attention does not support output_attentions
output_attentions = False
bsz, q_len, _ = hidden_states.size()
query_states = self.q_proj(hidden_states)
key_states = self.k_proj(hidden_states)
value_states = self.v_proj(hidden_states)
# FlashAttention requires the input to have the shape (bsz, seq_len, n_heads, head_dim)
query_states = query_states.view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
key_states = key_states.view(bsz, q_len, self.num_key_value_heads, self.head_dim).transpose(1, 2)
value_states = value_states.view(bsz, q_len, self.num_key_value_heads, self.head_dim).transpose(1, 2)
kv_seq_len = key_states.shape[-2]
if past_key_value is not None:
kv_seq_len += past_key_value.get_usable_length(kv_seq_len, self.layer_idx)
cos, sin = self.rotary_emb(value_states, seq_len=kv_seq_len)
query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin, position_ids)
if past_key_value is not None:
cache_kwargs = {"sin": sin, "cos": cos} # Specific to RoPE models
key_states, value_states = past_key_value.update(key_states, value_states, self.layer_idx, cache_kwargs)
key_states = repeat_kv(key_states, self.num_key_value_groups)
value_states = repeat_kv(value_states, self.num_key_value_groups)
query_states = query_states.transpose(1, 2) # (bsz, seq_len, n_heads, head_dim)
key_states = key_states.transpose(1, 2) # (bsz, seq_len, n_heads, head_dim)
value_states = value_states.transpose(1, 2) # (bsz, seq_len, n_heads, head_dim)
dropout_rate = self.attention_dropout if self.training else 0.0
input_dtype = query_states.dtype
if input_dtype == torch.float32:
if torch.is_autocast_enabled():
target_dtype = torch.get_autocast_gpu_dtype()
elif hasattr(self.config, "_pre_quantization_dtype"):
target_dtype = self.config._pre_quantization_dtype
else:
target_dtype = self.q_proj.weight.dtype
        logger.warning_once("The input hidden states seem to be silently cast to float32.")
query_states = query_states.to(target_dtype)
key_states = key_states.to(target_dtype)
value_states = value_states.to(target_dtype)
if getattr(self.config, "group_size_ratio", None) and self.training: # shift
groupsz = int(q_len * getattr(self.config, "group_size_ratio"))
assert q_len % groupsz == 0, "q_len {} should be divisible by group size {}.".format(q_len, groupsz)
num_groups = q_len // groupsz
def shift(state: torch.Tensor) -> torch.Tensor:
state = torch.cat(
(state[:, :, : self.num_heads // 2], state[:, :, self.num_heads // 2 :].roll(-groupsz // 2, dims=1)),
dim=2,
)
return state.reshape(bsz * num_groups, groupsz, self.num_heads, self.head_dim)
query_states, key_states, value_states = shift(query_states), shift(key_states), shift(value_states)
if attention_mask is not None:
attention_mask = attention_mask[:, :, :groupsz, :groupsz].repeat(num_groups, 1, 1, 1)
attn_output: torch.Tensor = self._flash_attention_forward(
query_states, key_states, value_states, attention_mask, q_len, dropout=dropout_rate
)
    if getattr(self.config, "group_size_ratio", None) and self.training:  # shift back
        attn_output = attn_output.reshape(bsz, q_len, self.num_heads, self.head_dim)
        attn_output = torch.cat(
            (
                attn_output[:, :, : self.num_heads // 2],
                attn_output[:, :, self.num_heads // 2 :].roll(groupsz // 2, dims=1),
            ),
            dim=2,
        )
attn_output = attn_output.reshape(bsz, q_len, self.hidden_size).contiguous()
attn_output = self.o_proj(attn_output)
if not output_attentions:
attn_weights = None
return attn_output, attn_weights, past_key_value
def apply_llama_patch() -> None:
LlamaAttention.forward = llama_torch_attn_forward
LlamaFlashAttention2.forward = llama_flash_attn_forward
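
# Toy demo (not part of the original module) of the shift-short-attention trick:
# the first half of the heads keeps the original token order, the second half is
# rolled by half a group so attention groups overlap across their borders.
if __name__ == "__main__":
    bsz, q_len, n_heads, head_dim = 1, 8, 2, 1
    groupsz = 4
    state = torch.arange(q_len).float().view(1, q_len, 1, 1).expand(bsz, q_len, n_heads, head_dim)
    shifted = torch.cat(
        (state[:, :, : n_heads // 2], state[:, :, n_heads // 2 :].roll(-groupsz // 2, dims=1)),
        dim=2,
    )
    print(shifted[0, :, 0, 0])  # head 0: 0..7, unshifted
    print(shifted[0, :, 1, 0])  # head 1: 2..7, 0, 1 (rolled by groupsz // 2)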

Some files were not shown because too many files have changed in this diff.