247 Commits

Author SHA1 Message Date
aoiasd
5efb0cedc8
feat: support use fragment config for highlight (#45099)
relate: https://github.com/milvus-io/milvus/issues/42589

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-24 17:07:06 +08:00
wei liu
3fbee154f6
enhance: Remove large segment ID arrays from QueryNode logs (#45719)
issue: #45718

Logging complete segment ID arrays caused excessive log volume (3-6 TB
for 200k segments). Remove arrays from logger fields and keep only
segment counts for observability.

Changes:
- Remove requestSegments/preparedSegments arrays from Load logger
- Remove segmentIDs from BM25 stats logs
- Remove entries structure from sync distribution log

This reduces log volume by 99.99% for large-scale operations.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-11-20 17:18:14 +08:00
aoiasd
947c8855f3
feat: support search bm25 with highlight (#44923)
relate: https://github.com/milvus-io/milvus/issues/42589

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-11-18 16:09:39 +08:00
aoiasd
ac82bad0b3
enhance: optimize idf oracle sync logic (#44628)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-10-20 15:42:08 +08:00
aoiasd
754997ac2b
enhance: update some annotations (#44769)
relate: https://github.com/milvus-io/milvus/issues/43114

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-10-17 16:22:02 +08:00
wei liu
33d1e7de83
fix: Replace incorrect log import with milvus v2 log package (#44731)
issue: #44730
Fix the issue where logs were not outputting as expected due to
incorrect log package imports across multiple components.

Changes include:
- Add golangci-lint rule to forbid github.com/pingcap/log usage
- Replace github.com/pingcap/log with
github.com/milvus-io/milvus/pkg/v2/log

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-10-10 20:27:57 +08:00
aoiasd
78ee76f018
enhance: support preload sealed segment bm25 stats and optimize bm25 stats serialize (#44279)
relate: https://github.com/milvus-io/milvus/issues/41424

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-29 16:35:05 +08:00
jiaqizho
338ed2fed4
enhance: Introduce sparse filter in query (#44347)
issue: #44373

The current commit implements sparse filtering in query tasks using the
statistical information (Bloom filter/MinMax) of the Primary Key (PK).

The statistical information of the PK is bound to the segment during the
segment loading phase. A new filter has been added to the segment filter
to enable the sparse filtering functionality.

Signed-off-by: jiaqizho <jiaqi.zhou@zilliz.com>
2025-09-23 09:58:09 +08:00
aoiasd
9add663a08
fix: idf oracle use wrong dir (#44266)
relate: https://github.com/milvus-io/milvus/issues/44264

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-10 14:41:56 +08:00
Bingyi Sun
0c0630cc38
feat: support dropping index without releasing collection (#42941)
issue: #42942

This pr includes the following changes:
1. Added checks for index checker in querycoord to generate drop index
tasks
2. Added drop index interface to querynode
3. To avoid search failure after dropping the index, the querynode
allows the use of lazy mode (warmup=disable) to load raw data even when
indexes contain raw data.
4. In segcore, loading the index no longer deletes raw data; instead, it
evicts it.
5. In expr, the index is pinned to prevent concurrent errors.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-02 16:17:52 +08:00
sparknack
70c8114e85
enhance: cachinglayer: resource management for segment loading (#43846)
issue: #41435

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-29 11:37:50 +08:00
Chun Han
da156981c6
feat: milvus support posix-compatible mode(milvus-io#43942) (#43944)
related: #43942

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-27 16:29:50 +08:00
Tianx
c0d62268ac
feat: add timesatmptz data type (#44005)
issue: https://github.com/milvus-io/milvus/issues/27467
>
https://github.com/milvus-io/milvus/issues/27467#issuecomment-3092211420
> * [x]  M1 Create collection with timestamptz field
> * [x]  M2 Insert timestamptz field data
> * [x]  M3 Retrieve timestamptz field data
> * [x]  M4 Implement handoff[ ]  

The second PR of issue:
https://github.com/milvus-io/milvus/issues/27467, which completes M1-M4
described above.

---------

Signed-off-by: xtx <xtianx@smail.nju.edu.cn>
2025-08-26 15:59:53 +08:00
congqixia
de3e5c285b
enhance: Add downgrade tsafe switch param item (#43874)
Related to #43873

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-08-15 12:31:43 +08:00
wei liu
715b5153b8
enhance: Improve delegator serviceable check logic in PinReadableSegments (#43768)
issue: #43767
- Enhance serviceable check logic to properly handle full vs partial
result requirements
- For full result (requiredLoadRatio >= 1.0): check
queryView.Serviceable()
- For partial result (requiredLoadRatio < 1.0): check load ratio
satisfaction
- Add comprehensive unit tests covering all serviceable check scenarios

This enhancement ensures delegator correctly validates serviceability
based on the requested result completeness, improving reliability of
query operations.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-08-07 12:13:40 +08:00
Zhen Ye
5551d99425
enhance: remove old arch non-streaming arch code (#43651)
issue: #41609

- remove all dml dead code at proxy
- remove dead code at l0_write_buffer
- remove msgstream dependency at proxy
- remove timetick reporter from proxy
- remove replicate stream implementation

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-08-06 14:41:40 +08:00
sparknack
bdd65871ea
enhance: tiered storage: estimate segment loading resource usage while considering eviction (#43323)
issue: #41435 

After introducing the caching layer's lazy loading and eviction
mechanisms, most parts of a segment won't be loaded into memory or disk
immediately, even if the segment is marked as LOADED. This means
physical resource usage may be very low. However, we still need to
reserve enough resources for the segments marked as LOADED. Thus, the
logic of resource usage estimation during segment loading, which based
on physcial resource usage only for now, should be changed.

To address this issue, we introduced the concept of logical resource
usage in this patch. This can be thought of as the base reserved
resource for each LOADED segment.

A segment’s logical resource usage is derived from its final evictable
and inevictable resource usage and calculated as follows:

```
SLR = SFPIER + evitable_cache_ratio * SFPER
```

it also equals to

```
SLR = (SFPIER + SFPER) - (1.0 - evitable_cache_ratio) * SFPER
```

`SLR`: The logical resource usage of a segment.
`SFPIER`: The final physical inevictable resource usage of a segment.
`SFPER`: The final physical evictable resource usage of a segment.
`evitable_cache_ratio`: The ratio of a segment's evictable resources
that can be cached locally. The higher the ratio, the more physical
memory is reserved for evictable memory.

When loading a segment, two types of resource usage are taken into
account.

First is the estimated maximum physical resource usage:

```
PPR = HPR + CPR + SMPR - SFPER
```

`PPR`: The predicted physical resource usage after the current segment
is allowed to load.
`HPR`: The physical resource usage obtained from hardware information.  
`CPR`: The total physical resource usage of segments that have been
committed but not yet loaded. When one new segment is allow to load,
`CPR' = CPR + (SMR - SER)`. When one of the committed segments is
loaded, `CPR' = CPR - (SMR - SER)`.
`SMPR`: The maximum physical resource usage of the current segment.
`SFPER`: The final physical evictable resource usage of the current
segment.

Second is the estimated logical resource usage, this check is only valid
when eviction is enabled:

```
PLR = LLR + CLR + SLR
```

`PLR`: The predicted logical resource usage after the current segment is
allowed to load.
`LLR`: The total logical resource usage of all loaded segments. When a
new segment is loaded, `LLR` should be updated to `LLR' = LLR + SLR`.
`CLR`: The total logical resource usage of segments that have been
committed but not yet loaded. When one new segment is allow to load,
`CLR' = CLR + SLR`. When one of the committed segments is loaded, `CLR'
= CLR - SLR`.
`SLR`: The logical resource usage of the current segment.

Only when `PPR < PRL && PLR < PRL` (`PRL`: Physical resource limit of
the querynode), the segment is allowed to be loaded.

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-01 21:31:37 +08:00
wei liu
7b8bf6393b
enhance: Improve partial result evaluation with row count based strategy (#43361)
issue: #43360
Enhance the partial result evaluation mechanism in delegator to use row
count based data ratio instead of simple segment count ratio for better
accuracy.

Key improvements:
- Introduce PartialResultEvaluator interface for flexible evaluation
strategy
- Implement NewRowCountBasedEvaluator using sealed segment row count
data
- Replace segment count based ratio with row count based data ratio
calculation
- Update PinReadableSegments to return sealedRowCount information
- Modify executeSubTasks to use configurable evaluator for partial
result decisions
- Add comprehensive unit tests for the new row count based evaluation
logic

This change provides more accurate partial result evaluation by
considering the actual data volume rather than just segment quantity,
leading to better query performance and consistency when some segments
are unavailable.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-28 10:18:55 +08:00
wei liu
990a25e51a
fix: Prevent delete records loss during slow segment loading [QueryNodeV2] (#43527)
issue: #42884
Fixes an issue where delete records for a segment are lost from the
delete buffer if `load segment` execution on the delegator is too slow,
causing `syncTargetVersion` or other cleanup operations to clear them
prematurely.

Changes include:
- Introduced `Pin` and `Unpin` methods in `DeleteBuffer` interface and
its implementations (`doubleCacheBuffer`, `listDeleteBuffer`).
- Added a `pinnedTimestamps` map to track timestamps protected from
cleanup by specific segments.
- Modified `LoadSegments` in `shardDelegator` to `Pin` relevant segment
delete records before loading and `Unpin` them afterwards.
- Added `isPinned` check in `UnRegister` and `TryDiscard` methods of
`listDeleteBuffer` to skip cleanup if corresponding timestamps are
pinned.
- Added comprehensive unit tests for `Pin`, `Unpin`, and `isPinned`
functionality, covering basic, multiple pins, concurrent, and edge
cases.

This ensures the integrity of delete records by preventing their
premature removal from the delete buffer during segment loading.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-24 01:00:54 +08:00
Zhen Ye
3aacd179f7
fix: balance channel before balance segment when upgrading (#43346)
issue: #43117, #42966, #43373

- also fix channel balance may not work at 2.6.
- fix error lost at delete path
- add mvcc into s/q log
- change the log level for TestCoordDownSearch

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-17 20:16:52 +08:00
wei liu
039564199c
fix: Prevent duplicate segment results in count queries (#43173)
issue: #41570
Fix issue where growing and sealed segments could be searched
simultaneously, causing inflated count(*) results. This was caused by
logic introduced in PR #42009 that made sealed segments readable before
target version advancement.

Changes include:
- Fix conditional filtering logic in PinReadableSegments to prevent
sealed segments from becoming readable prematurely
- Use target version filter for full results (ratio=1.0) to ensure
sealed segments only become readable after target advancement
- Use query view segment list filter for partial results (ratio<1.0) to
maintain backward compatibility
- Simplify target version setting in AddDistributions to prevent
premature segment readability
- Add logging for redundant growing segments during sync
- Add comprehensive unit tests covering the duplicate segment scenario

This fix ensures count(*) queries return accurate results by preventing
the same segment from being counted in both growing and sealed states.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-14 11:10:49 +08:00
Zhen Ye
15a6631147
enhance: add quota limit based on sn consuming lag (#43105)
issue: #42995

- The consuming lag at streaming node will be reported to coordinator.
- The consuming lag will trigger the write limit and deny by quota
center.
- Set the ttProtection by default.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-11 14:10:49 +08:00
Chun Han
07745439b5
fix: empty search groupby result causing crash(#43137) (#43214)
related: #43137

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-10 12:04:48 +08:00
aoiasd
97b1c3ed96
enhance: add warn log if some segment's bm25 stats lacks (#43111)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-07-09 23:22:47 +08:00
aoiasd
54cc0b60f2
fix: dropped segment in excluded segment use wrong excluded ts (#43115)
cause some excluded growing data insert again
relate: https://github.com/milvus-io/milvus/issues/43114

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-07-08 18:04:46 +08:00
congqixia
7bc7b18ed5
fix: [AddField] Prevent concurrent load during UpdateSchema (#43043)
Related to #43028

This PR:
- Add mutex prevent concurrent load segment & schema change
- Add schema verison field in load meta
- Update schema in PutOrRef if schema verison is larger

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-02 17:38:44 +08:00
wei liu
c381bf3e41
enhance: add logs for count(*) (#43001)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-01 19:36:43 +08:00
wei liu
396120ade5
enhance: Improve delegator serviceable check with coordinator sync state (#42975)
issue: #42404
Add syncedByCoord field to ensure delegator only becomes serviceable
after coordinator sync, preventing unreliable service state when memory
is insufficient.

Issue: When memory is low, delegator may become serviceable before
current target is ready, but segments can be released at any time,
making the serviceable state unreliable.

Changes include:
- Add syncedByCoord field to track coordinator sync status
- Update Serviceable() to require both data readiness and coord sync
- Set syncedByCoord=true in SyncTargetVersion
- Add comprehensive test coverage

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-01 10:00:43 +08:00
aoiasd
e2566c0e92
enhance: bm25 stats local cache use local storage path (#42923)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-25 13:44:46 +08:00
wei liu
bf5fde1431
fix: Prevent delegator unserviceable due to shard leader change (#42689)
issue: #42098 #42404
Fix critical issue where concurrent balance segment and balance channel
operations cause delegator view inconsistency. When shard leader
switches between load and release phases of segment balance, it results
in loading segments on old delegator but releasing on new delegator,
making the new delegator unserviceable.

The root cause is that balance segment modifies delegator views, and if
these modifications happen on different delegators due to leader change,
it corrupts the delegator state and affects query availability.

Changes include:
- Add shardLeaderID field to SegmentTask to track delegator for load
- Record shard leader ID during segment loading in move operations
- Skip release if shard leader changed from the one used for loading
- Add comprehensive unit tests for leader change scenarios

This ensures balance segment operations are atomic on single delegator,
preventing view corruption and maintaining delegator serviceability.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-19 12:10:38 +08:00
wei liu
679930bb93
enhance: refine delegator state checking error msg (#42673)
issue: #42661
Add NotStopped() and IsWorking() methods to shardDelegator for better
state management and error handling.

Changes include:
- Add instance state checking methods with proper error messages
- Replace lifetime package calls with delegator instance methods
- Add comprehensive unit tests for state transitions and error cases
- Improve error reporting with channel name for better debugging

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-17 10:40:38 +08:00
aoiasd
13330bd466
fix: add concurrency and close protect for bm25 function (#42597)
relate: https://github.com/milvus-io/milvus/issues/42576

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-10 11:36:34 +08:00
cai.zhang
5566a85bcc
enhance: Add proxy task queue metrics (#42156)
issue: #42155

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-04 11:26:32 +08:00
Zhen Ye
4bad293655
enhance: make upgrading from 2.5.x less down time (#42082)
issue: #40532

- start timeticksync at rootcoord if the streaming service is not
available
- stop timeticksync if the streaming service is available
- open a read-only wal if some nodes in cluster is not upgrading to 2.6
- allow to open read-write wal after all nodes in cluster is upgrading
to 2.6

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:02:29 +08:00
aoiasd
3a74044149
fix: hybird search sub requset not set analyzer name (#41896)
relate: https://github.com/milvus-io/milvus/issues/41213

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-29 14:56:28 +08:00
aoiasd
2ae4d80120
enhance: support run analyzer by loaded collection field (#42113)
relate: https://github.com/milvus-io/milvus/issues/42094

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-29 10:54:30 +08:00
Xianhui Lin
da30e1e4df
fix: pass the ttl duration in the search request for ttl filter (#42122)
fix: pass the TTL duration in the search request for TTL filter
issue:https://github.com/milvus-io/milvus/issues/41959

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-28 11:08:29 +08:00
wei liu
54619eaa2c
feat: Implement partial result support on node down (#42009)
issue: https://github.com/milvus-io/milvus/issues/41690
This commit implements partial search result functionality when query
nodes go down, improving system availability during node failures. The
changes include:

- Enhanced load balancing in proxy (lb_policy.go) to handle node
failures with retry support
- Added partial search result capability in querynode delegator and
distribution logic
- Implemented tests for various partial result scenarios when nodes go
down
- Added metrics to track partial search results in querynode_metrics.go
- Updated parameter configuration to support partial result required
data ratio
- Replaced old partial_search_test.go with more comprehensive
partial_result_on_node_down_test.go
- Updated proto definitions and improved retry logic

These changes improve query resilience by returning partial results to
users when some query nodes are unavailable, ensuring that queries don't
completely fail when a portion of data remains accessible.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-28 00:12:28 +08:00
aoiasd
0fafb706ba
enhance: add segment bm25 stats local cache (#41775)
relate: https://github.com/milvus-io/milvus/issues/41424

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-26 18:44:27 +08:00
Zhen Ye
38c804fb01
fix: more stable recovery graceful closing and stable unittest (#42013)
issue: #41544

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-23 17:52:26 +08:00
wei liu
78010262f0
enhance: Optimize shard serviceable mechanism (#41937)
issue: https://github.com/milvus-io/milvus/issues/41690
- Merge leader view and channel management into ChannelDistManager,
allowing a channel to have multiple delegators.
- Improve shard leader switching to ensure a single replica only has one
shard leader per channel. The shard leader handles all resource loading
and query requests.
- Refine the serviceable mechanism: after QC completes loading, sync the
query view to the delegator. The delegator then determines its
serviceable status based on the query view.
- When a delegator encounters forwarding query or deletion failures,
mark the corresponding segment as offline and transition it to an
unserviceable state.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-22 11:38:24 +08:00
congqixia
186a01eef4
fix: [AddField] Broadcast update schema even there is no segment (#41780)
Related to #41744

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-13 16:02:55 +08:00
Buqian Zheng
ff5c2770e5
feat: cachinglayer: various improvements (#41546)
issue: https://github.com/milvus-io/milvus/issues/41435

this PR is based on https://github.com/milvus-io/milvus/pull/41436. 

Improvements include:

- Lazy Load support for Storage v1
- Use Low/High watermark to control eviction
- Caching Layer related config changes
- Removed ChunkCache related configs and code in golang
- Add `PinAllCells` helper method to CacheSlot class
- Modified ValueAt, RawAt, PrimitiveRawAt to Bulk version, to reduce
caching layer overhead
- Removed some unclear templated bulk_subscript methods
- CachedSearchIterator to store PinWrapper when searching on
ChunkedColumn, and removed unused contrustor.

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-05-10 09:19:16 +08:00
Zhen Ye
de8f0af20d
enhance: use dispatcher at delegator when enable streaming (#41266)
issue: #38399

- add an adaptor type to adapt the streaming service client and
msgstream client to reuse the msgdispatcher.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-06 01:12:53 +08:00
aoiasd
3892451880
fix: bm25 search failed when avgdl == nan (#41502)
relate: https://github.com/milvus-io/milvus/issues/41490

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-27 17:34:38 +08:00
congqixia
b5443ddbd0
enhance: [AddField] Reopen loaded segments after AddField (#41529)
Related to #39718

This PR:
- Add reopen logic for growing & sealed segments
- Lazy reopen when schema version increases
- Add FinishLoad api for loading progress

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-26 08:48:39 +08:00
Zhen Ye
5fd47c3c89
fix: mockery too unavailable after upgrade golang version (#41481)
issue: #41291
pr: #41318

Signed-off-by: chyezh <chyezh@outlook.com>
2025-04-24 10:46:43 +08:00
aoiasd
f52c2909c4
feat: support multi analyzer for bm25 function (#41351)
relate: https://github.com/milvus-io/milvus/issues/41213

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 18:22:38 +08:00
aoiasd
655cc7fe06
fix: bm25 stats idf oracle leak (#41425)
relate: https://github.com/milvus-io/milvus/issues/41424

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 14:28:37 +08:00
SimFG
91d40fa558
fix: Update logging context and upgrade dependencies (#41318)
- issue: #41291

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 10:52:38 +08:00