10677 Commits

Author SHA1 Message Date
Spade A
9873e0ee78
fix: fix text match index / json key stats index leak when segment released (#42655)
Ref https://github.com/milvus-io/milvus/issues/42626

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-13 04:28:37 +08:00
cai.zhang
4ca1a231ad
fix: Add precheck for unsupported datatype cast (#42677)
issue: #42527

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-12 21:14:36 +08:00
congqixia
c9bc70f272
fix: [AddField] Use shared_ptr of schema in plan fixing dangling ref (#42693)
Related to #42640

The search/query plan held a reference to the schema, which could be
destructed after a schema change. This PR makes the plan hold a shared_ptr
to it, fixing the dangling reference under concurrent read & schema
change.

This PR also removes the field binlog check when loading an index, because
an old segment with an old schema may lack binlogs.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-12 20:46:36 +08:00
yihao.dai
86876682da
enhance: Enhance import integration tests and logs (#42612)
1. Optimize the import process: skip subsequent steps and mark the task
as complete if the number of imported rows is 0.
2. Improve import integration tests:
 a. Add a test to verify that autoIDs are not duplicated
 b. Add a test for the corner case where all data is deleted
 c. Shorten test execution time
3. Enhance import logging:
 a. Print imported segment information upon completion
 b. Include file name in failure logs

issue: https://github.com/milvus-io/milvus/issues/42488,
https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-12 20:02:35 +08:00
Xianhui Lin
98067f5fc6
fix: datacoord stop gets stuck after upgrading from 2.5 to 2.6 (#42674)
datacoord stop gets stuck after upgrading from 2.5 to 2.6
issue: https://github.com/milvus-io/milvus/issues/42656

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-06-12 16:56:36 +08:00
Spade A
911a8df17c
feat: impl StructArray -- data storage support in segcore (#42406)
Ref https://github.com/milvus-io/milvus/issues/42148
This PR mainly enables segcore to support array of vector (read and
write, but not indexing). Now only float vector as the element type is
supported.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-06-12 14:38:35 +08:00
cai.zhang
57c60af00d
fix: Unsorted small segments should not be considered as indexed (#42614)
issue: #42143

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-12 14:30:35 +08:00
Buqian Zheng
8511ede5f8
feat: add back queryNode.cache.warmup for compatibility (#42621)
issue: https://github.com/milvus-io/milvus/issues/41435

Also makes ChunkTranslator load in parallel.

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-06-12 10:56:40 +08:00
Bingyi Sun
6c16d3dbee
enhance: Add bulk api for json data (#42407)
issue: https://github.com/milvus-io/milvus/issues/42409

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-12 10:40:39 +08:00
foxspy
58f9278db7
fix: fix build interim index failures (#42679)
issue: #42028 

W20250522 09:52:55.785657 12779 ChunkedSegmentSealedImpl.cpp:1752]
[SERVER][generate_interim_index][CGO_LOAD][]fail to generate binlog
index, because bad optional access

After the cache layer was added, num_rows_ cannot be obtained before the
interim index is generated, so it must be passed in as an external parameter.

Signed-off-by: foxspy <xianliang.li@zilliz.com>
2025-06-12 05:12:39 +08:00
yihao.dai
a72463c619
enhance: Optimize memory usage during garbage collection (#42593)
Defer clone and decompress operations until just before removing from
meta, instead of eagerly applying them to all segments in advance.
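
The deferral described above can be sketched as follows. This is a minimal illustration, not the actual datacoord code: `SegmentMeta`, `expand`, and `collect_garbage` are hypothetical names, and `zlib` stands in for whatever compression the real metadata uses.

```python
import zlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SegmentMeta:
    """Hypothetical stand-in for a segment's metadata entry."""
    segment_id: int
    dropped: bool
    packed_paths: bytes  # compressed, newline-separated binlog paths


def expand(seg: SegmentMeta) -> list[str]:
    """Clone the entry and decompress its paths (the expensive step)."""
    clone = replace(seg)  # copy, so the shared meta entry is left untouched
    return zlib.decompress(clone.packed_paths).decode().split("\n")


def collect_garbage(segments):
    """Yield (segment_id, paths) one at a time, expanding only dropped segments."""
    for seg in segments:
        if not seg.dropped:
            continue  # live segments never pay the clone/decompress cost
        yield seg.segment_id, expand(seg)


segments = [
    SegmentMeta(1, False, zlib.compress(b"a/1\na/2")),
    SegmentMeta(2, True, zlib.compress(b"b/1\nb/2")),
]
removed = list(collect_garbage(segments))
```

Because `collect_garbage` is a generator, at most one expanded entry is held at a time, instead of an expanded copy of every segment.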

issue: https://github.com/milvus-io/milvus/issues/42592

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-11 20:40:39 +08:00
foxspy
9af6c16ea0
fix: add describeIndex timestamp for restful interface (#42104)
issue: #41431

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-06-11 15:26:38 +08:00
yihao.dai
e6da4a64b5
fix: Pre-check import message to prevent pipeline block indefinitely (#42415)
Pre-check import message to prevent pipeline block indefinitely.

issue: https://github.com/milvus-io/milvus/issues/42414

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-06-11 13:40:38 +08:00
wei liu
e7c0a6ffbb
enhance: Refine QueryNode task parallelism based on CPU core count (#42166)
issue: #42165
Implement dynamic task execution capacity calculation based on QueryNode
CPU core count instead of static configuration for better resource
utilization.

Changes include:
- Add CpuCoreNum() method and WithCpuCoreNum() option to NodeInfo
- Implement GetTaskExecutionCap() for dynamic capacity calculation
- Add QueryNodeTaskParallelismFactor parameter for tuning
- Update proto definition to include cpu_core_num field
- Add unit tests for new functionality

This allows QueryCoord to automatically adjust task parallelism based on
actual hardware resources.
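
A rough sketch of the idea, assuming the capacity is simply the core count multiplied by the tuning factor. The commit does not state the exact formula, so `task_execution_cap` and its fallback value are illustrative assumptions rather than the behaviour of `GetTaskExecutionCap()`:

```python
def task_execution_cap(cpu_core_num: int, parallelism_factor: int,
                       fallback: int = 256) -> int:
    """Return how many tasks a node may run concurrently.

    Assumed formula: cores * factor. When the node did not report a core
    count (<= 0), fall back to a static value, as an older node might require.
    """
    if cpu_core_num <= 0:
        return fallback
    return cpu_core_num * parallelism_factor
```

For example, a node reporting 8 cores with a factor of 2 would get a capacity of 16 under this assumption.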

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-11 13:20:35 +08:00
congqixia
499e9a0a73
fix: [AddField] Use corresponding datatype for int8/int16 def val (#42633)
Related to #42629

This PR handles converting a default value defined as int32 into an
int8/int16 scalar.
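
The conversion can be pictured as a range-checked narrowing. The sketch below is an assumption about the general shape of such a conversion, not the code in this PR; `narrow_default` and `INT_RANGES` are hypothetical names.

```python
# Inclusive value ranges for the narrower integer types.
INT_RANGES = {
    "int8": (-2**7, 2**7 - 1),
    "int16": (-2**15, 2**15 - 1),
    "int32": (-2**31, 2**31 - 1),
}


def narrow_default(value: int, target: str) -> int:
    """Check that an int32-defined default fits the target field type."""
    lo, hi = INT_RANGES[target]
    if not lo <= value <= hi:
        raise ValueError(f"default value {value} out of range for {target}")
    return value
```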

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-11 11:54:34 +08:00
Xianhui Lin
d5c41acec1
fix: compatibility with old sessions when upgrading from 2.5 to 2.6 in standalone mode (#42645)
compatibility with old sessions when upgrading from 2.5 to 2.6 in standalone
mode
issue: https://github.com/milvus-io/milvus/issues/42602

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-06-11 10:58:35 +08:00
Zhen Ye
43f0c56ce7
fix: limit the concurrency of zstd compression and decrease the memory usage of binlog generation (#42630)
issue: #42028

- limit the concurrency of zstd compression.
- zstd.go modified from
`github.com/apache/arrow/go/v17/parquet/compress/zstd.go`
- may be related to #42129
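
One common way to cap compression concurrency is a counting semaphore around the compress call. The sketch below only illustrates that pattern and is not the Go implementation in this PR: `zlib` stands in for zstd, and `MAX_CONCURRENT` is a hypothetical limit.

```python
import threading
import zlib
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 2  # hypothetical cap on simultaneous compressions
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)
_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of compressions observed running at once


def compress_limited(payload: bytes) -> bytes:
    """Compress payload while holding one of MAX_CONCURRENT slots."""
    global _active, peak_active
    with _slots:  # blocks while MAX_CONCURRENT compressions are in flight
        with _lock:
            _active += 1
            peak_active = max(peak_active, _active)
        try:
            return zlib.compress(payload)
        finally:
            with _lock:
                _active -= 1


payloads = [bytes([i]) * 100_000 for i in range(16)]
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(compress_limited, payloads))
```

Even with 8 worker threads, no more than `MAX_CONCURRENT` compression buffers are live at once, which is what bounds the peak memory.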

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-11 09:06:34 +08:00
Bingyi Sun
fbf5cb4e62
feat: Add json flat index (#39917)
issue: https://github.com/milvus-io/milvus/issues/35528

This PR introduces a JSON flat index that allows indexing JSON fields
and dynamic fields in the same way as other field types.

In a previous PR (#36750), we implemented a JSON index that requires
specifying a JSON path and casting a type. The only distinction lies in
the json_cast_type parameter. When json_cast_type is set to JSON type,
Milvus automatically creates a JSON flat index.

For details on how Tantivy interprets JSON data, refer to the [tantivy
documentation](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#pitfalls-limitation-and-corner-cases).

Limitations
Array handling: Arrays do not function as nested objects. See the
[limitations
section](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#arrays-do-not-work-like-nested-object)
for more details.
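
The array limitation can be illustrated with a simplified model of path flattening. This is not Tantivy's implementation; `flatten` is a hypothetical helper that only shows why element boundaries are lost once array items share the same path.

```python
def flatten(value, path=""):
    """Yield (json_path, scalar) pairs; array indices are not part of the path."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for item in value:
            yield from flatten(item, path)  # every element shares one path
    else:
        yield path, value


doc = {"cart": [{"product": "shoe", "qty": 1}, {"product": "sock", "qty": 5}]}
terms = set(flatten(doc))
```

Here `terms` contains both `("cart.product", "shoe")` and `("cart.qty", 5)`, so a filter requiring both would match this document even though no single array element has that combination.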

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-10 19:14:35 +08:00
XuanYang-cn
83877b9faf
enhance: remove extra get collection (#42042)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-06-10 18:34:35 +08:00
junjiejiangjjj
f1a4526bac
enhance: refactor rrf and weighted rerank (#42154)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-06-10 18:08:35 +08:00
wei liu
f3fe117840
fix: Use delete checkpoint to prevent delete record loss in L0 refactoring (#42628)
issue: #39333 #41570
Fix delete record missing issue introduced in PR #39552 L0 refactoring:
- Use delete checkpoint as consume start position when deleteCP <
channelCP
- Add logging when delete checkpoint is used instead of seek position
- Prevent delete record loss when deleteCP is earlier than default
channelCP
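
The position selection described in the list above amounts to taking the earlier of the two checkpoints. A minimal sketch, assuming checkpoints are comparable timestamps and using the hypothetical name `consume_start`:

```python
from typing import Optional


def consume_start(channel_cp: int, delete_cp: Optional[int]) -> int:
    """Pick the position from which to start consuming the channel.

    If a delete checkpoint exists and is earlier than the channel
    checkpoint, start there so no delete record in between is skipped.
    """
    if delete_cp is not None and delete_cp < channel_cp:
        return delete_cp
    return channel_cp
```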

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-10 17:34:35 +08:00
yihao.dai
ed55b14484
fix: Release data memory after sync task completes (#42627)
Release data memory after sync task completes to prevent datanode oom
during import.

issue: https://github.com/milvus-io/milvus/issues/42608

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-10 16:28:34 +08:00
cqy123456
c9680a5b56
fix: avoid load index or create interim index in ChunkedSegmentSealedImpl::HasRawData() (#42622)
issue: https://github.com/milvus-io/milvus/issues/42526

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-06-10 14:54:34 +08:00
Zhen Ye
af0881ee5d
fix: timetick cannot push forward when upgrading (#42567)
issue #42492

- streamingcoord starts before the old rootcoord.
- the streaming balancer checks the node session synchronously to avoid
redundant operations at cluster startup.
- DDL operations check whether streaming is enabled; if it is not,
they use msgstream.
- msgstream is initialized if streaming is not enabled, and stops once
streaming is enabled.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-10 14:52:42 +08:00
cqy123456
317bbfbf81
enhance: milvus support minhash vector and mhjaccard metric (#42036)
issue:
https://github.com/issues/assigned?issue=milvus-io%7Cmilvus%7C41746

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-06-10 14:38:34 +08:00
Bingyi Sun
b3ecf77a66
fix: Fix the bug of valid data write corruption (#42556)
issue: https://github.com/milvus-io/milvus/issues/42554

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-10 14:22:34 +08:00
zhagnlu
2861096734
fix: Add explicit move semantics to get_batch_view interface (#42403)
#42401

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-10 13:06:35 +08:00
sthuang
9439eaef52
fix: [StorageV2] sync with int8 vector data type core dumped (#42616)
related: https://github.com/milvus-io/milvus/issues/42613, #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-10 11:42:35 +08:00
aoiasd
13330bd466
fix: add concurrency and close protect for bm25 function (#42597)
relate: https://github.com/milvus-io/milvus/issues/42576

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-10 11:36:34 +08:00
sthuang
89c3afb12e
fix: [StorageV2] index/stats task level storage v2 fs (#42191)
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-10 11:06:35 +08:00
aoiasd
fd6e2b52ff
enhance: use english name as language name for all type language identifier (#42600)
Make whatlang detection return the English name of the language,
consistent with lingua.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-10 10:24:35 +08:00
congqixia
a9aaa86193
enhance: [StorageV2] Pass bucket name for compaction readers (#42607)
Related to #39173

Like logic in #41919, storage v2 fs shall use complete paths with
bucketName prefix to be compatible with its definition. This PR fills
bucket name from config when creating reader for compaction tasks.

NOTE: the bucket name shall be read from task params config for
compaction task pooling.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-10 10:20:35 +08:00
congqixia
118684afbb
enhance: [storageV2] Pass nullable converting insertMsg fieldData (#42584)
Related to #39173

The `nullable` flag is crucial for the serde logic of the v2 writer; missing
this flag causes a logic bug for v2 nullable data.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-10 10:06:34 +08:00
Bingyi Sun
ffb2877992
enhance: support auto index type for json index (#42071)
issue: https://github.com/milvus-io/milvus/issues/42070

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-09 21:22:34 +08:00
wei liu
317e7999da
fix: ReleasePartition cause delegator unserviceable. (#42423)
issue: #42098 #42404
related to: #42009 #41937

Implement new method to handle partition removal from next target
without directly modifying current target.

Changes include:
- Add RemovePartitionFromNextTarget method and deprecate RemovePartition
- Update target_observer to use new method for ReleasePartition
operations
- Add unit tests and mock methods for new functionality

This ensures that all changes to the next target propagate to the
delegator's query view.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-09 19:02:34 +08:00
Bingyi Sun
6404e02d99
fix: Check cast type is array for json contains expr (#42184)
issue: https://github.com/milvus-io/milvus/issues/42181

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-09 17:04:33 +08:00
congqixia
f1188b6781
enhance: [storagev2] Support partition key isolation index (#42574)
Related to #39173

This patch makes storage v2 support the partition key isolation index feature.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-09 14:02:33 +08:00
yihao.dai
837349dead
enhance: Adjust default import buffer size (#42541)
Increase insert buffer size from 16MB to 64MB, while keeping delete
buffer size at 16MB.

issue: https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-09 13:02:33 +08:00
sthuang
b136f85ca0
fix: storage v2 write mmap file per field per cell (#42180)
Each cell of a field should be written to its own mmap file, rather than
writing all cells of the field into a single mmap file.
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-09 11:48:33 +08:00
aoiasd
6e16653597
fix: update tantivy commit version to fix stemmer panic (#42171)
relate: https://github.com/milvus-io/milvus/issues/42168

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-09 10:34:33 +08:00
Xianhui Lin
7e46fc6618
feat: implement batch commit for JSON Stats (#42494)
implement batch commit for JSON Stats
issue:https://github.com/milvus-io/milvus/issues/41616

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-06-08 19:58:33 +08:00
Buqian Zheng
b4d549d96a
fix: pipeline/delegator leak (#42582)
the manager's logging lambda should not capture the pipeline object

this creates a circular reference between the manager and the pipeline
object, making it impossible for both to be GC-ed.
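
The retention path can be shown with a small sketch. It is written in Python rather than the project's Go, and `Manager`, `Pipeline`, and the two `register_*` helpers are hypothetical; it only demonstrates how a callback that captures an object keeps it reachable from a long-lived owner, and how capturing just the needed value avoids that.

```python
import gc
import weakref


class Pipeline:
    def __init__(self, name: str):
        self.name = name


class Manager:
    def __init__(self):
        self.log_hooks = []  # callbacks the long-lived manager keeps around


manager = Manager()


def register_leaky(p: Pipeline) -> None:
    # The lambda closes over p, so the manager now keeps the pipeline alive.
    manager.log_hooks.append(lambda: f"pipeline {p.name}")


def register_safe(p: Pipeline) -> None:
    name = p.name  # capture only the value needed for logging
    manager.log_hooks.append(lambda: f"pipeline {name}")


leaky, safe = Pipeline("a"), Pipeline("b")
leaky_ref, safe_ref = weakref.ref(leaky), weakref.ref(safe)
register_leaky(leaky)
register_safe(safe)
del leaky, safe
gc.collect()
```

After the `del`, `safe_ref()` returns `None` while `leaky_ref()` still returns the pipeline, because the manager's hook holds it.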

issue: https://github.com/milvus-io/milvus/issues/42581

Signed-off-by: Buqian Zheng <buqianzheng@Buqians-MacBook-Air.local>
Co-authored-by: Buqian Zheng <buqianzheng@Buqians-MacBook-Air.local>
2025-06-06 22:00:32 +08:00
wei liu
8511881d3f
enhance: Increase search/query retry times on proxy before timeout (#40438)
issue: #39379

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-06 18:12:32 +08:00
congqixia
b50c4a7973
enhance: Make segcore thread name set correctly (#42497)
The previous PR, #42017, did not work. This PR fixes it with the following
changes:

- Initialize the `name_map`, which was not touched at all before
- Trim the thread name to at most 15 characters to fit the syscall limit
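
On Linux, a thread name is limited to 16 bytes including the terminating NUL, which leaves 15 usable bytes. A minimal sketch of the trimming step follows; `trim_thread_name` is a hypothetical helper, not the segcore function.

```python
MAX_THREAD_NAME_BYTES = 15  # 16-byte kernel buffer minus the trailing NUL


def trim_thread_name(name: str) -> str:
    """Cut a thread name down to the bytes the kernel will accept."""
    raw = name.encode("utf-8")[:MAX_THREAD_NAME_BYTES]
    # Drop a multi-byte character that was split by the cut, if any.
    return raw.decode("utf-8", errors="ignore")
```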

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-06 16:26:32 +08:00
Bingyi Sun
cc5ac1c220
enhance: Support cast function for json index (#41949)
issue: #41948

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-05 19:42:32 +08:00
zhagnlu
0c4b12565e
fix: fix is null bug for marisa index (#42420)
#42255

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-05 16:40:32 +08:00
cai.zhang
e299c533be
fix: Just trigger stats task for Flushed segment (#42424)
issue: #42419

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-05 15:42:32 +08:00
aoiasd
b1f86f6556
enhance: run analyzer should get database name from grpc context (#42398)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-05 14:26:31 +08:00
aoiasd
2eb24fbe7c
fix: analyzer memory leak because function runner not close (#41839)
relate: https://github.com/milvus-io/milvus/issues/41213

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-05 14:24:40 +08:00
congqixia
373deba0bd
fix: Pass cluster id when transforming drop task to drop job request (#42531)
Related to #42530

The cluster id was missing from the worker drop request, causing the task
to be redone when a duplicated-task error was reported.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-05 13:20:32 +08:00