48 Commits

Author SHA1 Message Date
Bowen Liang 063191889d chore: apply ruff's pyupgrade linter rules to modernize Python code with targeted version (#2419) 2024-02-09 15:21:33 +08:00
crazywoola 243ca5b1e2 fix: typo in package path of core.splitter (#2411) 2024-02-07 15:34:02 +08:00
Bowen Liang 843280f82b enhancement: introduce Ruff for Python linter for reordering and removing unused imports with automated pre-commit and sytle check (#2366) 2024-02-06 13:21:13 +08:00
takatost 9f637ead38 bump version to 0.5.3 (#2306) 2024-02-01 18:11:57 +08:00
KVOJJJin 89fcf4ea7c Feat: chunk overlap supported (#2209)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-26 13:24:40 +08:00
takatost 6cf93379b3 fix: split chunks return empty strings (#2197) 2024-01-25 13:59:18 +08:00
Jyong 869690c485 fix notion estimate (#2090)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-19 13:27:12 +08:00
Jyong cb7a608d75 ascii filter Unicode U+FFFE (#2038)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-15 16:52:18 +08:00
Jyong a63a9c7d45 text spliter length method use default embedding model tokenizer (#2011)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-12 18:45:34 +08:00
Bowen Liang cc9e74123c improve: introduce isort for linting Python imports (#1983) 2024-01-12 12:34:01 +08:00
Jyong 24bdedf802 fix get embedding model provider in empty dataset (#1986)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-10 20:48:16 +08:00
Jyong 4a3d15b6de fix customer spliter character (#1915)
Co-authored-by: jyong <jyong@dify.ai>
2024-01-04 16:21:48 +08:00
takatost a938e1f184 fix: notion_indexing_estimate embedding_model_instance NPE (#1907) 2024-01-04 13:28:52 +08:00
Yeuoly 9134849744 fix: remove tiktoken from text splitter (#1876) 2024-01-03 13:02:56 +08:00
takatost d069c668f8 Model Runtime (#1858)
Co-authored-by: StyleZhang <jasonapring2015@outlook.com>
Co-authored-by: Garfield Dai <dai.hai@foxmail.com>
Co-authored-by: chenhe <guchenhe@gmail.com>
Co-authored-by: jyong <jyong@dify.ai>
Co-authored-by: Joel <iamjoel007@gmail.com>
Co-authored-by: Yeuoly <admin@srmxy.cn>
2024-01-02 23:42:00 +08:00
Jyong df1509983c ppt & pptx improve (#1790)
Co-authored-by: jyong <jyong@dify.ai>
2023-12-19 18:11:27 +08:00
Jyong 5e34f938c1 Feat/add unstructured support (#1780)
Co-authored-by: jyong <jyong@dify.ai>
2023-12-18 23:24:06 +08:00
crazywoola 994fceece3 fix: qa regex (#1738) 2023-12-11 15:53:37 +08:00
Pascal M bc54cdc537 refactor: typo in dataset docstore (#1711) 2023-12-07 09:24:52 +08:00
Pascal M 5d10cf0fe6 fix: error Class 'builtins.list' is not mapped (#1710) 2023-12-07 09:24:39 +08:00
Jyong 4588831bff Feat/add retriever rerank (#1560)
Co-authored-by: jyong <jyong@dify.ai>
2023-11-17 22:13:37 +08:00
crazywoola d0e1ea8f06 1506 remove duplicated code (#1511) 2023-11-13 19:05:32 +08:00
Garfield Dai 42a5b3ec17 feat: advanced prompt backend (#1301)
Co-authored-by: takatost <takatost@gmail.com>
2023-10-12 10:13:10 -05:00
Jyong 289c93d081 Feat/improve document delete logic (#1325)
Co-authored-by: jyong <jyong@dify.ai>
2023-10-12 13:30:44 +08:00
yezhwi 8b8e510bfe fix: handle AttributeError for datasets and index (#1052) 2023-08-30 11:14:16 +08:00
Jyong a55ba6e614 Fix/ignore economy dataset (#1043)
Co-authored-by: jyong <jyong@dify.ai>
2023-08-29 03:37:45 +08:00
Jyong 2d604d9330 Fix/filter empty segment (#1004)
Co-authored-by: jyong <jyong@dify.ai>
2023-08-25 15:50:29 +08:00
Jyong 5623839c71 update document segment (#950)
Co-authored-by: jyong <jyong@dify.ai>
2023-08-22 17:59:24 +08:00
takatost 3a0a9e2d8f fix: embedding get price definition missing (#922) 2023-08-19 21:31:40 +08:00
Krasus.Chen fd0fc8f4fe Fix/price calc (#862) 2023-08-19 16:41:35 +08:00
Jyong db7156dafd Feature/mutil embedding model (#908)
Co-authored-by: JzoNg <jzongcode@gmail.com>
Co-authored-by: jyong <jyong@dify.ai>
Co-authored-by: StyleZhang <jasonapring2015@outlook.com>
2023-08-18 17:37:31 +08:00
Jyong f207e180df fix multi thread app context (#868)
Co-authored-by: jyong <jyong@dify.ai>
2023-08-16 15:39:31 +08:00
takatost f18ce203b5 feat: optimize error logging (#808) 2023-08-12 02:22:43 +08:00
takatost 5fa2161b05 feat: server multi models support (#799) 2023-08-12 00:57:00 +08:00
Jyong 174ebb51db add qa thread control (#677) 2023-07-29 17:49:18 +08:00
Jyong 9eaae770a6 Feat/add thread control (#675) 2023-07-29 17:00:21 +08:00
Jyong ca60610306 logging qa error (#672) 2023-07-29 01:51:18 +08:00
Jyong 082f8b17ab Feat/milvus support (#671)
Co-authored-by: StyleZhang <jasonapring2015@outlook.com>
Co-authored-by: JzoNg <jzongcode@gmail.com>
2023-07-28 22:19:39 +08:00
KVOJJJin cf93d8d6e2 Feat: Q&A format segmentation support (#668)
Co-authored-by: jyong <718720800@qq.com>
Co-authored-by: StyleZhang <jasonapring2015@outlook.com>
2023-07-28 20:47:15 +08:00
crazywoola 998f819b04 use sub to operate all (#475) 2023-06-28 14:58:40 +08:00
Jyong 2eea114ac0 fix special code (#473) 2023-06-28 13:58:36 +08:00
John Wang 3241e4015b feat: upgrade langchain (#430)
Co-authored-by: jyong <718720800@qq.com>
2023-06-25 16:49:14 +08:00
Jyong 9253f72dea Feat/dataset notion import (#392)
Co-authored-by: StyleZhang <jasonapring2015@outlook.com>
Co-authored-by: JzoNg <jzongcode@gmail.com>
2023-06-16 21:47:51 +08:00
lisaifei@cvte.com 0abd67288b feat: support xlsx file parsing (#304)
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
2023-06-09 15:57:19 +08:00
Jyong 2bf48514bc fix markdown parser (#230) 2023-06-06 19:51:40 +08:00
John Wang 0587ff0fba fix: remove empty segment in splitter (#68) 2023-05-17 15:02:58 +08:00
John Wang 815f794eef feat: optimize split rule when use custom split segment identifier (#35) 2023-05-16 12:57:25 +08:00
John Wang db896255d6 Initial commit 2023-05-15 08:51:32 +08:00