1442 Commits

Author SHA1 Message Date
cai.zhang
c54a04c71c
fix: L2 segments remain as L2 even after sort compaction (#43237)
issue: #43186

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-11 11:30:48 +08:00
congqixia
5a9efb3f81
enhance: [StorageV2] Refine storage rw option usage & validation (#43175)
Related to #39173

This PR:
- Make all datanode tasks pass storage config via the storage config option
- Remove legacy comments, rootPath & bucketName parameters
- Fix clustering compaction option behavior
- Add validation logic for `rwOptions`
- Use correct storageType from storageConfig
- Add storage config in sync task

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-11 01:14:48 +08:00
cai.zhang
3ffd44f302
fix: Fix remaining issues with Datanode pooling and StorageV2 (#43147)
issue: #43146

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-10 14:26:48 +08:00
yihao.dai
ee9a95189a
enhance: Print segments info after import done (#43200)
issue: https://github.com/milvus-io/milvus/issues/42488

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-10 12:38:47 +08:00
cai.zhang
47144429bf
fix: Fix regeneratePartitionStats failed after restore clusteringCompactionTask (#43205)
issue: #43186

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-10 10:40:47 +08:00
cai.zhang
6989e18599
enhance: Move sort stats task to sort compaction (#42562)
issue: #42560

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-08 20:22:47 +08:00
yihao.dai
9cbd194c6b
fix: Prevent import from generating small binlogs (#43132)
- Introduce dynamic buffer sizing to avoid generating small binlogs
during import
- Refactor import slot calculation based on CPU and memory constraints
- Implement dynamic pool sizing for sync manager and import tasks
according to CPU core count

issue: https://github.com/milvus-io/milvus/issues/43131

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-07 21:32:47 +08:00
Zhen Ye
e97e44d56e
enhance: limit the gc concurrency when cpu is high (#43059)
issue: #42833

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-04 09:22:43 +08:00
cai.zhang
f6b2a71c95
enhance: Remove chunkmanager-related dependencies from datanode (#43021)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-03 14:44:45 +08:00
Spade A
26ec841feb
feat: optimize Like query with n-gram (#41803)
Ref #42053

This is the first PR for optimizing `LIKE` with an n-gram inverted index.
Currently, only the VARCHAR data type and InnerMatch LIKE (%xxx%)
queries are supported.


How to use it:
```
from pymilvus import MilvusClient, DataType

milvus_client = MilvusClient("http://localhost:19530")
schema = milvus_client.create_schema()
...
# VARCHAR is the only data type the n-gram index supports for now
schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000)
...
index_params = milvus_client.prepare_index_params()
# min_gram/max_gram bound the n-gram lengths used for tokenization
index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3)
milvus_client.create_collection(COLLECTION_NAME, ...)
```

min_gram and max_gram control how documents are tokenized. For
example, with min_gram=2 and max_gram=4, each document is tokenized
into 2-grams, 3-grams and 4-grams.
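
As a rough illustration of the tokenization described above, the sketch below enumerates every character n-gram between `min_gram` and `max_gram`. The helper name `ngrams` is hypothetical, and working on characters is an assumption; the actual index may tokenize differently.

```python
def ngrams(text: str, min_gram: int, max_gram: int) -> list[str]:
    """Return every substring of length min_gram..max_gram, shortest first.

    Hypothetical helper for illustration only; not the Milvus tokenizer.
    """
    grams = []
    for n in range(min_gram, max_gram + 1):
        # Slide a window of width n across the text. When the text is
        # shorter than n, the range is empty and nothing is produced.
        for start in range(len(text) - n + 1):
            grams.append(text[start:start + n])
    return grams
```

For example, `ngrams("abc", 2, 3)` yields `["ab", "bc", "abc"]`. A wider gram range produces more tokens per document, so it presumably costs more index space.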

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-07-01 10:08:44 +08:00
Zhen Ye
2d73e6eaa8
fix: mixcoord will not handle timetick anymore (#42965)
issue: #42954

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-26 19:14:42 +08:00
yihao.dai
d7c9914eff
fix: Consider fields number when preallocating ids for import (#42810)
In corner cases where there are many fields but only a small number of
rows to import, the default preallocated IDs may be insufficient. To
address this, consider the number of fields when preallocating IDs.

issue: https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-25 23:38:41 +08:00
XuanYang-cn
0adf44e6f8
enhance: Check if segment has too many deletions together (#42668)
This PR moves the deltalog file count check into the hasTooManyDeletions
check, unifying the logic that decides whether a segment has too many
deletions: delta log count, deleted rows ratio and deltalog size.

This change removes several unnecessary traversals of a segment's binlogs
and deltalogs, and adds clearer trigger logs.
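
The unified check could look roughly like the sketch below: one pass over the delta logs feeds all three criteria, and the returned reason can go straight into a trigger log. The `DeltaLog` type, the parameter names and the thresholds are all hypothetical, not the Milvus implementation.

```python
from dataclasses import dataclass


@dataclass
class DeltaLog:
    deleted_rows: int  # rows deleted by this delta log (hypothetical field)
    size_bytes: int    # size of the delta log file (hypothetical field)


def has_too_many_deletions(total_rows, delta_logs, max_log_count,
                           max_deleted_ratio, max_size_bytes):
    """Return (triggered, reason), checking all three criteria in one place."""
    deleted = 0
    size = 0
    # A single traversal gathers everything the three checks need.
    for log in delta_logs:
        deleted += log.deleted_rows
        size += log.size_bytes

    if len(delta_logs) > max_log_count:
        return True, "delta log count"
    if total_rows > 0 and deleted / total_rows >= max_deleted_ratio:
        return True, "deleted rows ratio"
    if size > max_size_bytes:
        return True, "delta log size"
    return False, ""
```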

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-06-24 16:30:49 +08:00
Zhen Ye
2fd8f910b0
fix: data duplicated when msgdispatcher make splitting (#42827)
issue: #41570

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-19 16:32:39 +08:00
cai.zhang
a9dcd4a380
enhance: ChunkManager is no longer created during datanode initialization (#42791)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-17 17:06:38 +08:00
yihao.dai
9acba25fad
enhance: Replace pointer-based map key with id in garbage collector (#42647)
issue: https://github.com/milvus-io/milvus/issues/42592

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-13 20:50:36 +08:00
Bingyi Sun
1bf960b1a8
enhance: Check loaded segments before gc (#42639)
issue: https://github.com/milvus-io/milvus/issues/42412

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-13 17:44:38 +08:00
yihao.dai
86876682da
enhance: Enhance import integration tests and logs (#42612)
1. Optimize the import process: skip subsequent steps and mark the task
as complete if the number of imported rows is 0.
2. Improve import integration tests:
 a. Add a test to verify that autoIDs are not duplicated
 b. Add a test for the corner case where all data is deleted
 c. Shorten test execution time
3. Enhance import logging:
 a. Print imported segment information upon completion
 b. Include file name in failure logs

issue: https://github.com/milvus-io/milvus/issues/42488,
https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-12 20:02:35 +08:00
Xianhui Lin
98067f5fc6
fix: datacoord stop get stuck After upgrading from 2.5 to 2.6 (#42674)
datacoord gets stuck when stopping after upgrading from 2.5 to 2.6.
issue: https://github.com/milvus-io/milvus/issues/42656

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-06-12 16:56:36 +08:00
cai.zhang
57c60af00d
fix: Unsorted small segments should not be considered as indexed (#42614)
issue: #42143

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-12 14:30:35 +08:00
yihao.dai
a72463c619
enhance: Optimize memory usage during garbage collection (#42593)
Defer clone and decompress operations until just before removing from
meta, instead of eagerly applying them to all segments in advance.
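
A minimal sketch of the deferral idea, assuming a segment record is a dict with a compressed list of binlog paths. The record layout and the names `prepare_for_removal` and `collect_garbage` are invented for illustration. The expensive clone and decompress step runs only for segments that are about to be removed.

```python
import copy
import zlib


def prepare_for_removal(segment: dict) -> dict:
    """Clone the record and decompress its (hypothetical) binlog path list."""
    clone = copy.deepcopy(segment)
    raw = zlib.decompress(clone.pop("compressed_paths"))
    clone["paths"] = raw.decode().split(",")
    return clone


def collect_garbage(segments, should_drop, remove) -> int:
    """Remove droppable segments, cloning each one only when it is needed."""
    removed = 0
    for segment in segments:
        # Cheap check against the shared, still-compressed record.
        if not should_drop(segment):
            continue
        # The clone exists only while this one segment is being removed,
        # instead of holding decompressed copies of every segment at once.
        remove(prepare_for_removal(segment))
        removed += 1
    return removed
```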

issue: https://github.com/milvus-io/milvus/issues/42592

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-11 20:40:39 +08:00
yihao.dai
e6da4a64b5
fix: Pre-check import message to prevent pipeline block indefinitely (#42415)
Pre-check import message to prevent pipeline block indefinitely.

issue: https://github.com/milvus-io/milvus/issues/42414

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-06-11 13:40:38 +08:00
Bingyi Sun
fbf5cb4e62
feat: Add json flat index (#39917)
issue: https://github.com/milvus-io/milvus/issues/35528

This PR introduces a JSON flat index that allows indexing JSON fields
and dynamic fields in the same way as other field types.

In a previous PR (#36750), we implemented a JSON index that requires
specifying a JSON path and casting a type. The only distinction lies in
the json_cast_type parameter. When json_cast_type is set to JSON type,
Milvus automatically creates a JSON flat index.

For details on how Tantivy interprets JSON data, refer to the [tantivy
documentation](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#pitfalls-limitation-and-corner-cases).

Limitations
Array handling: Arrays do not function as nested objects. See the
[limitations
section](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#arrays-do-not-work-like-nested-object)
for more details.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-10 19:14:35 +08:00
cai.zhang
e299c533be
fix: Just trigger stats task for Flushed segment (#42424)
issue: #42419

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-05 15:42:32 +08:00
Zhen Ye
0567f512b3
fix: streamingnode get stucked when stop (#42501)
issue: #42498

- fix: sealed segment cannot be flushed after upgrading
- fix: get mvcc panic when upgrading
- ignore L0 segments during graceful stop of querynode.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-05 12:22:31 +08:00
cai.zhang
43c99a2c49
fix: Only mark segment compacting for sort stats task (#42516)
issue: #42506

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-04 22:46:32 +08:00
yihao.dai
6fda1f69c8
fix: Fix duplicate autoID between import and insert (#42519)
Remove the unlimited logID mechanism and switch to redundantly
allocating a large number of IDs.

issue: https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-04 19:58:31 +08:00
cai.zhang
5566a85bcc
enhance: Add proxy task queue metrics (#42156)
issue: #42155

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-04 11:26:32 +08:00
Chun Han
e9b5d9e8bc
enhance: refine compaction trigger to reduce read/write amplifaction(#41336) (#41728)
related: #41336

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-06-04 11:24:38 +08:00
wei liu
5a355d1e57
fix: Fix data race in global scheduler test using atomic counters (#42454)
issue: #42457

Replace unsafe ExpectedCalls modification with atomic.Int32 state
tracking to avoid race conditions in concurrent test execution. Changes
include:
- Use atomic counters instead of direct mock ExpectedCalls manipulation
- Add RunAndReturn with atomic state transitions for thread safety
- Remove github.com/samber/lo dependency

This prevents data race when mock framework and test goroutines access
ExpectedCalls concurrently.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-03 14:18:30 +08:00
yihao.dai
297331b2cc
enhance: Add slot and tasks num metrics (#42141)
issue: https://github.com/milvus-io/milvus/issues/41123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-30 21:52:30 +08:00
Zhen Ye
66cc194ab2
enhance: add partition gc at streaming arch (#42179)
issue: #41976

- make the drop partition message a broadcast message.
- add gc when the drop partition message is acked.
- add a callback to handle the broadcast message on ack.
- the ack operation of a broadcast message retries until it succeeds.
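
The retry-until-success behaviour in the last bullet can be sketched as below. The function name, the callback signature and the exponential backoff are assumptions made for illustration; the commit message does not describe how the retries are paced.

```python
import time


def ack_until_success(ack, on_acked, base_delay=0.01, max_delay=1.0,
                      sleep=time.sleep) -> int:
    """Call ack() until it succeeds, then run on_acked() once.

    Returns the number of attempts. All names here are hypothetical.
    """
    delay = base_delay
    attempts = 0
    while True:
        attempts += 1
        try:
            ack()
        except Exception:
            # Any failure is treated as retryable in this sketch.
            sleep(delay)
            delay = min(delay * 2, max_delay)
            continue
        # The callback (for example partition gc) runs only after a
        # successful ack.
        on_acked()
        return attempts
```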

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:20:30 +08:00
Zhen Ye
b94cee2413
fix: growing segment from old arch is not flushed after upgrading (#42164)
issue: #42162

- enhance: add read ahead buffer size issue #42129
- fix: rocksmq consumer's close operation may get stuck
- fix: growing segment from old arch is not flushed after upgrading

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:00:28 +08:00
Zhen Ye
c7d6e3f19b
fix: data lost when wal balance (#42149)
issue: #42147

- return the sync task's error when one occurs, so that the checkpoint
is not pushed forward.
- fix up the node id checker of UpdateChannelCheckpoint in streaming.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 17:32:29 +08:00
aoiasd
2ae4d80120
enhance: support run analyzer by loaded collection field (#42113)
relate: https://github.com/milvus-io/milvus/issues/42094

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-29 10:54:30 +08:00
yihao.dai
79b51cbb73
fix: Fix task getting stuck after recovery (#42114)
Submit tasks into the global scheduler after recovery.

issue: https://github.com/milvus-io/milvus/issues/42046

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-28 12:46:28 +08:00
cai.zhang
63246c040f
fix: Use locking to ensure the atomicity of dropping segment indexes (#42075)
issue: #41288

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-05-28 10:00:28 +08:00
yihao.dai
57b58ad778
fix: Fix concurrent l0Compaction and Stats (#42112)
Return `false` in the `Process()` function for `executing` or
`pipelining` state `l0Compaction`. This prevents the `l0Compaction` task
from being removed from the `CompactionInspector`'s executing queue,
thereby avoiding concurrent execution of `l0Compaction` and `Stats`.
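
The effect of the return value can be sketched as follows, assuming (as the message implies) that a task is dropped from the executing queue only when `Process()` returns true. The state names other than `executing` and `pipelining`, and the `inspect` helper, are hypothetical.

```python
from enum import Enum


class State(Enum):
    PIPELINING = "pipelining"
    EXECUTING = "executing"
    COMPLETED = "completed"  # hypothetical terminal state
    FAILED = "failed"        # hypothetical terminal state


def process(state: State) -> bool:
    """Return True only when the task may leave the executing queue."""
    return state not in (State.PIPELINING, State.EXECUTING)


def inspect(executing_queue: dict[str, State]) -> dict[str, State]:
    """Keep every task whose process() result is False."""
    return {task_id: state
            for task_id, state in executing_queue.items()
            if not process(state)}
```

Because an in-flight `l0Compaction` stays in the queue, a `Stats` task would still see the segments as busy and would not start alongside it.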

issue: https://github.com/milvus-io/milvus/issues/42008

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-27 20:54:28 +08:00
yihao.dai
59a6eef774
fix: Fix compaction getting stuck (#42087)
Reset the `isCompacting` flag after the JSONStats and BM25 tasks finish.

issue: https://github.com/milvus-io/milvus/issues/42083

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-27 10:26:27 +08:00
Chun Han
d1cfa58a0a
feature: support compact expiry data(#41336) (#42056)
related: #41336

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-05-25 16:46:31 +08:00
cai.zhang
344d002346
fix: Don't create index for unsorted importing segment when enable stats (#42044)
issue: #41863

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-05-23 20:46:26 +08:00
Xianhui Lin
a72492169f
feat: add NotifyDropPartition in mixcoord for droppartition in dc (#42029)
add NotifyDropPartition in mixcoord for droppartition in dc
issue: https://github.com/milvus-io/milvus/issues/41976
https://github.com/milvus-io/milvus/issues/41542

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-23 18:32:26 +08:00
XuanYang-cn
252d49d01e
fix: ChannelManager double assignment (#41837)
See also: #41876

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-05-23 14:16:29 +08:00
yihao.dai
f71930e8db
enhance: Enhance import context (#42021)
Rename `imeta` to `importMeta` to improve readability, and enhance
import related context usage.

issue: https://github.com/milvus-io/milvus/issues/41123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-23 12:58:27 +08:00
yihao.dai
83c9527e70
enhance: Use QuerySlot interface for tasks (#41989)
Use `QuerySlot` rpc instead of `QueryTask` for querying slot.

issue: https://github.com/milvus-io/milvus/issues/41123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-23 10:30:28 +08:00
cai.zhang
48419da4d2
fix: Re add json stats trigger (#41967)
issue: #41123

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-05-23 10:02:27 +08:00
yihao.dai
e04e5b41ca
enhance: Add task version monitoring (#42023)
issue: https://github.com/milvus-io/milvus/issues/41123

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-22 23:24:28 +08:00
yihao.dai
142bd2fc05
enhance: Pooling for data tasks (#41256)
1. Add global scheduler for datacoord.
2. Define and implement new CreateTask, QueryTask, DropTask interfaces.
3. Refine Import, Compaction, Stats, Index task.

issue: https://github.com/milvus-io/milvus/issues/41123

Co-authored-by: Cai Zhang <cai.zhang@zilliz.com>
2025-05-20 21:06:24 +08:00
yihao.dai
65dd3982d8
fix: Fix ants.Pool goroutine leak (#41892)
1. Release the pool after it is no longer in use.
2. Upgrade ants.Pool to fix the goroutine leak issue (see [PR
#287](https://github.com/panjf2000/ants/pull/287)).

issue: https://github.com/milvus-io/milvus/issues/41838

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-19 17:56:22 +08:00
cai.zhang
38ded7364f
fix: Don't create index for unsorted importing segment when enable stats (#41864)
issue: #41863

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-05-19 10:52:23 +08:00