89 Commits

Author SHA1 Message Date
congqixia
b8d7045539
enhance: [Add Field] Use consistent schema for single buffer (#41891)
Related to #41873

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-17 19:46:22 +08:00
congqixia
a6d09ff4cd
enhance: [StorageV2] fix issues integrating basic RW operations (#41834)
Related to #39173

This PR:
- Upgrade milvus-storage commit to fix filesystem finalized issue
- Add bucket-name as prefix for all fs style access io
- Initial arrow fs on querynodes startup
- Fix timestamp access when loading sealed segment

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-15 09:52:23 +08:00
congqixia
476984c53e
fix: [AddField] Use latest schema instead of cached one (#41757)
Related to #41713 #41710

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-12 16:24:56 +08:00
Zhen Ye
e675da76e4
enhance: simplify the proto message, make segment assignment code more clean (#41671)
issue: #41544

- simplify the proto message for flush and create segment.
- simplify the msg handler for flowgraph.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-11 20:49:00 +08:00
congqixia
b5443ddbd0
enhance: [AddField] Reopen loaded segments after AddField (#41529)
Related to #39718

This PR:
- Add reopen logic for growing & sealed segments
- Lazy reopen when schema version increases
- Add FinishLoad api for loading progress

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-26 08:48:39 +08:00
aoiasd
f52c2909c4
feat: support multi analyzer for bm25 function (#41351)
relate: https://github.com/milvus-io/milvus/issues/41213

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-23 18:22:38 +08:00
SimFG
91d40fa558
fix: Update logging context and upgrade dependencies (#41318)
- issue: #41291

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 10:52:38 +08:00
congqixia
b36c88f3c8
enhance: [AddField] Broadcast schema change via WAL (#41373)
Related to #39718

Add Broadcast logic for collection schema change and notifies:
- Streamnode - Delegator
- Streamnode - Flush component
- QueryNodes via grpc

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-22 16:28:37 +08:00
Xianhui Lin
f9febe3bae
enhance: Merge RootCoord, DataCoord And QueryCoord into MixCoord (#41006)
Merge RootCoord, DataCoord And QueryCoord into MixCoord
Make Session into one
issue : https://github.com/milvus-io/milvus/issues/37764

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-11 16:36:30 +08:00
Ted Xu
1bcea2a775
fix: assigning the correct storage version in sync and index tasks (#41093)
See #39663 #40667

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-04-08 10:14:25 +08:00
Ted Xu
128efaa3e3
enhance: simplify size calculation in file writers (#40808)
See: #40342

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-26 20:04:22 +08:00
sthuang
d7df78a6c9
feat: Storage v2 compaction (#40667)
- Feat: Support Mix compaction. Covering tests include compatibility and
rollback ability.
  - Read v1 segments and compact with v2 format.
  - Read both v1 and v2 segments and compact with v2 format.
  - Read v2 segments and compact with v2 format.
  - Compact with duplicate primary key test.
  - Compact with bm25 segments.
  - Compact with merge sort segments.
  - Compact with no expiration segments.
  - Compact with lack binlog segments.
  - Compact with nullable field segments.
- Feat: Support Clustering compaction. Covering tests include
compatibility and rollback ability.
  - Read v1 segments and compact with v2 format.
  - Read both v1 and v2 segments and compact with v2 format.
  - Read v2 segments and compact with v2 format.
  - Compact bm25 segments with v2 format.
  - Compact with memory limit.
- Enhance: Use serdeMap serialize in BuildRecord function to support all
Milvus data types.
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-21 10:16:12 +08:00
XuanYang-cn
015a8f7631
fix: L0 brings its own start pos when syncing (#40663)
See also: #40388, #40207

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-18 15:56:16 +08:00
Ted Xu
df4285c9ef
enhance: API integration with storage v2 in clustering-compactions (#40133)
See #39173

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-13 14:12:06 +08:00
jaime
c8a96377bb
enhance: move object storage client creation to pkg package (#40440)
issue: #40439

Signed-off-by: jaime <yun.zhang@zilliz.com>
2025-03-12 20:38:07 +08:00
XuanYang-cn
e6c46a25ea
enhance: Use correct counter metrics for overall wa calculation (#40394)
- Use CounterVec to calculate sum of increase during a time period.
- Use entries number instead of binlog size

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-10 16:34:06 +08:00
XuanYang-cn
6f70e6d1e1
enhance: Log start position of delete msgs (#40315)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-10 14:58:05 +08:00
XuanYang-cn
4bebca6416
enhance: Replace currRows with NumOfRows (#40074)
See also: #40068

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-10 12:16:03 +08:00
Ted Xu
878ce56079
fix: correct memory size estimation on arrays (#40312)
See: #40342

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-03-05 16:54:09 +08:00
yihao.dai
004a1875dc
enhance: Introduce batch subscription in msgdispatcher (#39863)
Introduce a batch subscription mechanism in msgdispatcher: the
msgdispatcher now includes a vchannel watch task queue, where all
vchannels in the queue will subscribe to the MQ only once and pull
messages from the oldest vchannel checkpoint to the latest.

issue: https://github.com/milvus-io/milvus/issues/39862

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-05 14:38:02 +08:00
sthuang
63a7c4570e
feat: storage v2 sync (#39663)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-05 11:22:15 +08:00
XuanYang-cn
837ac295fa
enhance: Remove iterators in datanode (#40301)
Iterators are long deprecated, but sort are still using it. This PR
unifies stats task with the latest compaction common functions and
remove the usage of iterators.

1. Rename `datanode/compaction` to `datanode/compactor`
2. Add `internal/compaction` and move some compaction commons into it.
3. Replace `DeltalogIterators` with `ComposeDeleteFromDeltalogs`
4. Remove `datanode/iterators`

See also: #39242

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-04 12:14:00 +08:00
sthuang
de02a3ebcc
feat: Storage v2 binlog packed record reader and writer (#40221)
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-03-03 10:24:02 +08:00
Zhen Ye
84df80b5e4
enhance: refactor metrics of streaming (#40031)
issue: #38399

- add metrics for broadcaster component.
- add metrics for wal flusher component.
- add metrics for wal interceptors.
- add slow log for wal.
- add more label for some wal metrics. (local or remote/catcup or
tailing...)

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-25 12:25:56 +08:00
congqixia
cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Patrick Weizhi Xu
04fff74a56
feat: introduce Text data type (#39874)
issue: https://github.com/milvus-io/milvus/issues/39818

This PR mimics Varchar data type, allows insert, search, query, delete,
full-text search and others.
Functionalities related to filter expressions are disabled temporarily. 

Storage changes for Text data type will be in the following PRs.

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2025-02-19 11:04:51 +08:00
SimFG
047254665d
feat: support to replicate import msg (#39171)
- issue: #39849

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-02-16 00:08:13 +08:00
aoiasd
24d2bbc441
enhance: unmashall ts msg in dispatcher instead in msgstream (#38656)
relate: https://github.com/milvus-io/milvus/issues/38655

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-02-14 12:04:13 +08:00
Zhen Ye
d3e32bb599
enhance: make pchannel level flusher (#39275)
issue: #38399

- Add a pchannel level checkpoint for flush processing
- Refactor the recovery of flushers of wal
- make a shared wal scanner first, then make multi datasyncservice on it

Signed-off-by: chyezh <chyezh@outlook.com>
2025-02-10 16:32:45 +08:00
XuanYang-cn
1f14053c70
enhance: Enable to observe write amplification (#39661)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-02-08 18:38:43 +08:00
jaime
8a4ac8cccd
enhance: expose more metrics data (#39456)
issue: #36621 #39417
1. Adjust the server-side cache size.
2. Add source information for configurations.
3. Add node ID for compaction and indexing tasks.
4. Resolve localhost access issues to fix health check failures for
etcd.

Signed-off-by: jaime <yun.zhang@zilliz.com>
2025-02-07 11:50:50 +08:00
Ted Xu
f007465942
fix: unstable UT pack_writer_test (#39641)
See #39601

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-02-05 16:39:11 +08:00
junjiejiangjjj
16cbdfb3b1
feat: Add Text Embedding Function (#36366)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-01-24 14:23:06 +08:00
Ted Xu
56659bacbb
enhance: make serialization be part of sync task to support file format change (#38946)
See #38945

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-01-23 15:49:05 +08:00
aoiasd
9cb4c4e8ac
fix: bm25 import segment without bm25 stats meta (#38855)
relate: https://github.com/milvus-io/milvus/issues/38854

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-01-21 11:09:04 +08:00
yihao.dai
a5a83a0904
fix: Fix consume blocked due to too many consumers (#38455)
This PR limits the maximum number of consumers per pchannel to 10 for
each QueryNode and DataNode.

issue: https://github.com/milvus-io/milvus/issues/37630

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-01-15 21:37:01 +08:00
yihao.dai
ec2e77b5d7
enhance: Reduce memory usage of BF in DataNode and QueryNode (#38129)
1. DataNode: Skip generating BF during the insert phase (BF will be
regenerated during the sync phase).
2. QueryNode: Skip generating or maintaining BF for growing segments;
deletion checks will be handled in the segcore.

issue: https://github.com/milvus-io/milvus/issues/37630

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-01-15 01:59:01 +08:00
Zhen Ye
bb8d1ab3bf
enhance: make new go package to manage proto (#39114)
issue: #39095

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:49:01 +08:00
jaime
f03a85725a
enhance: add db name in replica (#38672)
issue: #36621

Signed-off-by: jaime <yun.zhang@zilliz.com>
2025-01-09 19:40:59 +08:00
congqixia
636e107e44
fix: Remove sync task after finished (#38681)
Related to #38680

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-24 13:34:48 +08:00
jaime
78438ef41e
fix: revert optimize CPU usage for CheckHealth requests (#35589) (#38555)
issue: #35563

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-19 00:38:45 +08:00
jaime
29e620fa6d
fix: sync task still running after DataNode has stopped (#38377)
issue: #38319

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-17 18:06:44 +08:00
jaime
28fdbc4e30
enhance: optimize CPU usage for CheckHealth requests (#35589)
issue: #35563
1. Use an internal health checker to monitor the cluster's health state,
storing the latest state on the coordinator node. The CheckHealth
request retrieves the cluster's health from this latest state on the
proxy sides, which enhances cluster stability.
2. Each health check will assess all collections and channels, with
detailed failure messages temporarily saved in the latest state.
3. Use CheckHealth request instead of the heavy GetMetrics request on
the querynode and datanode

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-17 11:02:45 +08:00
SimFG
2afe2eaf3e
feat: support to replicate collection when the services contains the system tt msg (#37559)
- issue: #37105

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-12-17 09:08:46 +08:00
SimFG
fa8ac09550
fix: the issue of replicate message exception when the ttMsgEnable config is changed dynamically (#38178)
- issue: #38177

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-12-14 23:24:51 +08:00
tinswzy
27229f7907
enhance: refine exists log print with ctx (#38080)
issue: #35917 
Refines exists log print with ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
wei liu
2035575941
fix: Datanode stop progress stuck at writer buffer memory check (#38274)
issue: #38273

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-12-06 18:20:39 +08:00
tinswzy
5768dbbb5d
enhance: refine pular related mq interfaces (#38007)
issue: #35917 
Refines the pulsar-related mq APIs to allow the ctx to be passed down

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-04 20:50:39 +08:00
XuanYang-cn
70e6a00ba1
fix: Replace outer lock with concurrent map (#37817)
See also: #37493

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-21 16:08:33 +08:00
congqixia
b0bd290a6e
enhance: Use internal json(sonic) to replace std json lib (#37708)
Related to #35020

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-18 10:46:31 +08:00