Commit Graph

396 Commits

Author SHA1 Message Date
yunqiqiliang 7686fec8de Fix ClickZetta stability and reduce logging noise (#23632)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-08 22:57:47 +08:00
湛露先生 23de79b9c8 word extractor cleans. (#20926)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-08-08 09:37:51 +08:00
yunqiqiliang e9258e2528 fix: ensure vector database cleanup on dataset deletion regardless of document presence (affects all 33 vector databases) (#23574)
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-08 09:18:43 +08:00
Qiang Lee 2cbb0f5b9f Fix: Apply Metadata Filters Correctly in Full-Text Search Mode for Tencent Cloud Vector Database (#23564) 2025-08-07 05:36:06 -07:00
yunqiqiliang 6e050da786 feat: Add Clickzetta Lakehouse vector database integration (#22551)
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-07 14:21:46 +08:00
Yongtao Huang 64de1ed5b9 Remove unnecessary issubclass check (#23455)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
2025-08-06 13:43:55 +08:00
Yongtao Huang e560c6a58b Fix version comparison with imported_version (#23326)
Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
2025-08-04 10:40:49 +08:00
wanttobeamaster 00508b3224 chore: tablestore full text search support score normalization (#23255)
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-08-01 14:14:11 +08:00
Aurelius Huang 89de8d5271 feat(notion): Notion Database extracts Rows content in row order and appends Row Page URL (#22646)
Co-authored-by: Aurelius Huang <cm.huang@aftership.com>
2025-07-30 21:35:20 +08:00
kenwoodjw b3a3eb5fe0 feat: support metadata condition filter string array (#23111)
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2025-07-30 16:13:45 +08:00
rhochman af6a72300a Fix: Support for Elasticsearch Cloud Connector (#23017)
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-30 11:12:16 +08:00
Yongtao Huang a6296b2247 Chore: remove duplicate TYPE_CHECKING import (#23013)
Signed-off-by: Yongtao Huang <yongtaoh2022@gmail.com>
2025-07-28 10:04:45 +08:00
Asuka Minato 9bc8bb6e91 make logging not use f-str, change others to f-str (#22882) 2025-07-25 10:32:48 +08:00
Asuka Minato 8b2d7fc78e orm filter -> where (#22801)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-07-24 00:57:45 +08:00
wanttobeamaster 070560670b fix tablestore full text search bug (#22853) 2025-07-23 19:31:47 +08:00
wanttobeamaster 51dec72dcd fix: tablestore TypeError when vector is missing (#22843)
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-07-23 18:59:16 +08:00
wlleiiwang 65eba98eb4 FEAT: Tencent Vector search supports backward compatibility with the previous score calculation approach. (#22820)
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-07-23 15:38:31 +08:00
Asuka Minato b90ba47bec Mapped column (#22644)
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-23 00:39:59 +08:00
wanttobeamaster 7a2f43cf68 fix: tablestore vdb support metadata filter (#22774)
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-07-22 16:48:59 +08:00
issac2e fd792d7c5e Optimize tencent_vector knowledge base deletion error handling with batch processing support (#22726)
Co-authored-by: liuchen15 <liuchen15@gaotu.cn>
Co-authored-by: crazywoola <427733928@qq.com>
2025-07-22 08:21:41 +08:00
uply23333 7d55dc359d fix: improve document filtering in full text search(elasticsearch) (#22683) 2025-07-21 15:59:37 +08:00
8bitpd a17ec07406 fix: update analyticdb vector to do filter by metadata (#22698)
Co-authored-by: xiaozeyu <xiaozeyu.xzy@alibaba-inc.com>
2025-07-21 15:03:37 +08:00
znn 7adf1a64ec fix text splitter (#22596) 2025-07-18 13:51:58 +08:00
-LAN- 2ad05e003c refactor: decouple Node and NodeData (#22581)
Signed-off-by: -LAN- <laipz8200@outlook.com>
Co-authored-by: QuantumGhost <obelisk.reg+git@gmail.com>
2025-07-18 10:08:51 +08:00
helojo 37a4ff2b67 Fix: the pict type picture was not processed in the docx (#19305)
Co-authored-by: zqgame <zqgame@zqgame.local>
2025-07-17 22:53:35 +08:00
yihong f5afd34990 fix: drop dead code phase2 unused class (#22042)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-07-17 09:33:07 +08:00
wanttobeamaster c8236637e3 tablestore vector support more method (#22225)
Co-authored-by: xiaozhiqing.xzq <xiaozhiqing.xzq@alibaba-inc.com>
2025-07-15 09:58:48 +08:00
Jacky Wu d16d203390 fix: close session before doing long latency operation (#22306) 2025-07-14 15:16:10 +08:00
luckylhb90 070fc6a118 optimize: batch embedding and qdrant write_consistency_factor parameter (#21776)
Co-authored-by: hobo.l <hobo.l@binance.com>
2025-07-10 10:16:59 +08:00
wlleiiwang 6a53cb20eb Optimize the memory usage of Tencent Vector Database (#22079)
Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
2025-07-09 15:53:06 +08:00
baonudesifeizhai b71eee95b6 fix: prevent timeout in file encoding detection for large files (#21453)
Co-authored-by: crazywoola <427733928@qq.com>
2025-07-03 17:06:49 +08:00
efrey kong c2c558f294 Fix: prevent SQL errors when metadata filter Constant value is None or blank (#21803) 2025-07-02 14:43:01 +08:00
Dongyu Li d31b6574ed Feat/kb index (#20868)
Co-authored-by: twwu <twwu@dify.ai>
2025-06-25 17:52:59 +08:00
Jin 372048a02b fix: markdown_extractor lost chunks if it starts without a header(#21308) (#21309) 2025-06-21 23:10:00 +08:00
LiuBo 518608bd2b feat: add support for Matrixone database (#20714) 2025-06-19 10:20:12 +08:00
NeatGuyCoding 762c2d8d9e Translation fix (#21194) 2025-06-19 09:36:56 +08:00
NeatGuyCoding 5c619754a7 Minor Improvements for File Validation and Configuration Handling #21179 (#21171)
Co-authored-by: tech <cto@sb>
2025-06-18 18:33:28 +08:00
Ademílson Tonato b02f595f6a feat: add search endpoint for Firecrawl Integration (#20521)
Co-authored-by: crazywoola <427733928@qq.com>
2025-06-18 14:37:03 +08:00
Rain Wang 5161e82c0d Fixes #20748 KnowledgeRetrievalNode return all external documents when reranker disabled even top-k configed (#20762) 2025-06-18 14:35:12 +08:00
kazuya-awano d1b2c6d8b0 feat: add pagenation to notion extractor (#20919) 2025-06-18 11:30:55 +08:00
kurokobo 620912252f fix: shorten connection timeout to pypi.org for deprecation check for weaviate client (#21131) 2025-06-18 09:25:52 +08:00
Bowen Liang 3b15fd919f test: run vdb test of oceanbase with docker compose in CI tests (#20945) 2025-06-16 11:05:19 +08:00
Bowen Liang f7c7f5d942 chore: bump mypy to 1.16 (#20608) 2025-06-11 01:01:33 +08:00
QuantumGhost 66042d8153 refactor(api): Decouple ParameterExtractorNode from LLMNode (#20843)
- Extract methods used by `ParameterExtractorNode` from `LLMNode` into a separate file.
- Convert `ParameterExtractorNode` into a subclass of `BaseNode`.
- Refactor code referencing the extracted methods to ensure functionality and clarity.
- Fixes the issue that `ParameterExtractorNode` returns error when executed.
- Fix relevant test cases.

Closes #20840.
2025-06-10 11:47:50 +08:00
yihong 5a185c4ce9 fix: clean up two unreachable code (#20773)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-06-07 23:06:46 +08:00
jefferyvvv e230915fcf fix: opensearch vector search falls back to keyword search (#20723)
Co-authored-by: wenjun.gu <wenjun.gu@envision-energy.com>
2025-06-06 16:29:15 +08:00
jefferyvvv 66818f4312 fix: opensearch metadata filtering returns empty (#20701)
Co-authored-by: wenjun.gu <wenjun.gu@envision-energy.com>
Co-authored-by: crazywoola <427733928@qq.com>
2025-06-06 09:10:01 +08:00
jefferyvvv 5755965bd0 fix: opensearch fulltext search with metadata filtering dsl error (#20702)
Co-authored-by: wenjun.gu <wenjun.gu@envision-energy.com>
2025-06-05 23:09:00 +08:00
kenwoodjw 00ea85cbe5 fix: autocorrect everything in web (#20605)
Signed-off-by: kenwoodjw <blackxin55+@gmail.com>
2025-06-04 14:12:24 +08:00
zhaobingshuang 154f7aa58b fix: #20560 When elasticsearch is used as the vector database, the Retrieval Test fails to filter the data after setting the Score Threshold, and the score of the recalled results is empty (#20561) 2025-06-03 13:24:26 +08:00