milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

Author	SHA1	Message	Date
Zhen Ye	576084fe86	enhance: support alter collection/database with WAL-based DDL framework (#45266 ) issue: #43897 - Alter collection/database is implemented by WAL-based DDL framework now. - Support AlterCollection/AlterDatabase in wal now. - Alter operation can be synced by new CDC now. - Refactor some UT for alter DDL. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-04 09:59:33 +08:00
Zhen Ye	31a609c21d	fix: kafka should auto reset the offset from earliest to read (#45237 ) issue: #44172, #45210, #44851 Kafka will auto reset the offset to "latest" if the offset is out of range, and the recovery of the Milvus WAL cannot read any message from there. So once the offset is out of range, Kafka should read from earliest in order to read the latest uncleared data. https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-03 21:07:33 +08:00
cai.zhang	01cf5c9341	enhance: Add log to debug index task (#45198 ) Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-03 20:01:34 +08:00
cai.zhang	ed8ba4a28c	enhance: Make GeometryCache an optional configuration (#45192 ) issue: #45187 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-03 19:59:32 +08:00
Spade A	ae03dee116	feat: implement ngram tokenizer with token_chars and custom_token_chars (#45040 ) issue: https://github.com/milvus-io/milvus/issues/45039 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-11-03 18:09:33 +08:00
zhuwenxing	434e0847fd	test: remove xfail after fix (#45114 ) /kind improvement Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>	2025-11-03 17:21:37 +08:00
zhuwenxing	a03c398986	test: add import case for struct array (#45146 ) /kind improvement Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>	2025-11-03 17:19:39 +08:00
Zhen Ye	25e0485a56	fix: unrecoverable when replicate from old (#45224 ) issue: #44962 Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-03 15:07:36 +08:00
yihao.dai	27734982fa	enhance: Don't start cdc by default (#45216 ) issue: https://github.com/milvus-io/milvus/issues/44123 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-11-03 13:15:32 +08:00
zhuwenxing	a47c168dd7	test: add json dumps for json string data (#45189 ) /kind improvement issue: https://github.com/milvus-io/milvus/issues/44982 Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>	2025-11-03 10:37:33 +08:00
aoiasd	ed69375f00	enhance: remove resource type from file resource config (#45103 ) File resource type was useless till now, remove it before new release. relate: https://github.com/milvus-io/milvus/issues/43687 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-11-03 10:15:32 +08:00
Zhen Ye	00d8d2c33d	enhance: support load/release collection/partition with WAL-based DDL framework (#45154 ) issue: #43897 - Load/Release collection/partition is implemented by WAL-based DDL framework now. - Support AlterLoadConfig/DropLoadConfig in wal now. - Load/Release operation can be synced by new CDC now. - Refactor some UT for load/release DDL. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-11-02 18:39:32 +08:00
Jingsong Yin	e25ee08566	fix: fix LoadMetrics bool type error (#45209 ) #44584 Signed-off-by: thekingking <1677273255@qq.com>	2025-11-01 01:19:32 +08:00
Jingsong Yin	0cc79772e7	enhance: Extend SkipIndex with IN/Match support and BloomFilter (#44581 ) issue: #44584 --------- Signed-off-by: thekingking <1677273255@qq.com>	2025-10-31 22:39:32 +08:00
zhikunyao	950e8f1f92	test: update helm to 5.0.6 for e2e (#45204 ) Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>	2025-10-31 18:46:08 +08:00
congqixia	22098c1785	fix: add null check for packed_writer_ in JsonStatsParquetWriter::Close() (#45158 ) Related to #45157 Fix a bug where DataNode panics when building json stats index throws an exception before the writer is initialized. The destructor would call Close() on an uninitialized packed_writer_ pointer, causing a null pointer dereference. Changes: - Add null check for packed_writer_ before calling Flush() and Close() - Prevents null pointer dereference in edge cases - Ignore close status as this is a cleanup operation This ensures safe cleanup even when initialization fails due to exceptions. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-30 17:40:09 +08:00
Zhen Ye	309d564796	enhance: support collection and index with WAL-based DDL framework (#45033 ) issue: #43897 - Part of collection/index related DDL is implemented by WAL-based DDL framework now. - Support following message type in wal, CreateCollection, DropCollection, CreatePartition, DropPartition, CreateIndex, AlterIndex, DropIndex. - Part of collection/index related DDL can be synced by new CDC now. - Refactor some UT for collection/index DDL. - Add Tombstone scheduler to manage the tombstone GC for collection or partition meta. - Move the vchannel allocation into streaming pchannel manager. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-30 14:24:08 +08:00
cai.zhang	3c9aa3e784	fix: Fix import null geometry data (#45161 ) issue: #44787 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-30 14:18:07 +08:00
wei liu	3566cb745c	enhance: remove max vector field number limit (#45151 ) issue: #45150 Removed the maximum limit constraint (value range [1, 10]) for vector fields in a collection to support more flexible schema design. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-10-30 12:42:07 +08:00
cqy123456	35d8213a00	fix: fail to mmap emb_list_meta in embedding list (#45127 ) issue: https://github.com/milvus-io/milvus/issues/44965 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-10-30 11:00:09 +08:00
congqixia	8c98adfeb3	fix: update QueryNode NumEntities metrics when collection has no segments (#45147 ) Related to #44509 Fix a bug where QueryNodeNumEntities metrics were not updated for collections with zero segments, causing stale metrics when all segments are flushed or compacted. The previous implementation used separate loops: one to update size metrics for all collections, and another to update num entities metrics only for collections present in the grouped segments map. Collections with no segments were skipped in the second loop, leaving their NumEntities metrics stale. Changes: - Consolidate size and num entities metric updates into single loop - Iterate over all collections instead of grouped segments - Get collection metadata from manager instead of segment instances - Correctly set NumEntities to 0 for collections with no segments - Apply the same fix to both growing and sealed segment processing - Add nil check for collection metadata before processing This ensures all collection metrics are updated consistently, even when segment count drops to zero. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-30 10:12:08 +08:00
Zhen Ye	6e5189fe19	fix: make ack of broadcaster cannot canceled by client (#45145 ) issue: #45141 - make ack of broadcaster cannot canceled by rpc. - make clone for assignment snapshot of wal balancer. - add server id for GetReplicateCheckpoint to avoid failure. Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-29 20:34:11 +08:00
Jingsong Yin	653dfcca41	fix: fix BloomFilter type name mapping is reversed in bfNames map (#45024 ) #45017 Signed-off-by: thekingking <1677273255@qq.com>	2025-10-29 16:28:12 +08:00
Zhen Ye	ce164db1f3	fix: wal state may be inconsistent after recovering from crash (#45092 ) issue: #45088, #45086 - Message on control channel should trigger the checkpoint update. - LastConfirmedMessageID should be recovered from the minimum of checkpoint or the LastConfirmedMessageID of uncommitted txn. - Add more log info for wal debugging. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-29 16:26:10 +08:00
tinswzy	2dc6134195	fix: resolve wp GCP Cloud Storage access issue with AK/SK (#45120 ) #43638 Resolve issue accessing GCP Cloud Storage with ak/sk , related wp [pr:11c0834c4](`11c0834c4f`) upgrade wp v0.1.11 Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2025-10-29 11:54:10 +08:00
zhikunyao	7cb7651523	enhance: change dockerfile user to milvus (#44524 ) Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>	2025-10-29 11:22:12 +08:00
yihao.dai	b045efc2bd	fix: Fix panic when gracefully stopping cdc (#45094 ) issue: https://github.com/milvus-io/milvus/issues/45093, https://github.com/milvus-io/milvus/issues/44123 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-10-28 16:36:10 +08:00
congqixia	511a04a6a5	enhance: Refactor go_client test wrapper to use embedding and improve test structure (#45113 ) Related to #45105 This commit refactors the test MilvusClient wrapper to leverage Go's embedding pattern and improves test organization with subtests. File: `tests/go_client/base/milvus_client.go` - Use `typeutil.NewSet` for rate limiting: Replace map-based `rateLogMethods` with `typeutil.NewSet` for cleaner and more efficient membership checking - Embed `client.Client` directly: Change `MilvusClient` structure from wrapping the client as a field to embedding it directly - Remove ~380 lines of wrapper methods: All wrapper methods (database, collection, partition, index, read/write, RBAC, etc.) are now unnecessary thanks to Go's embedding feature, which automatically promotes embedded methods to the outer type - Simplify initialization: Update `NewMilvusClient` and `Close` to use embedded client directly - Fix typo: Correct comment "Ike the actual method" → "Invoke the actual method" File: `tests/go_client/testcases/search_test.go` - Wrap assertions in subtests: Each search expression test is now wrapped in `t.Run()` with descriptive names - Dynamic subtest naming: Format: `expr={expression}_dynamic-{true/false}` for clear test identification Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-28 16:16:10 +08:00
zhikunyao	a75d19a4f0	test: macos checker refresh cache everyday (#45122 ) Signed-off-by: Zhikun Yao <zhikun.yao@zilliz.com>	2025-10-28 15:30:11 +08:00
aoiasd	ad9a0cae48	enhance: add global analyzer options (#44684 ) relate: https://github.com/milvus-io/milvus/issues/43687 Add global analyzer options, avoid having to merge some milvus params into user's analyzer params. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-28 14:52:10 +08:00
cai.zhang	c33d221536	fix: Fix bug for importing Geometry data (#45089 ) issue: #44787 , #45012 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-27 20:34:11 +08:00
zhagnlu	a38610cd5d	fix: disable build old version jsonstats from request (#45101 ) #44132 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-27 19:50:10 +08:00
congqixia	569a5b40d2	enhance: [StorageV2] add manifest path support for FFI integration (#44991 ) Related to #44956 Add manifest_path field throughout the data path to support LOON Storage V2 manifest tracking. The manifest stores metadata for segment data files and enables the unified Storage V2 FFI interface. Changes include: - Add manifest_path field to SegmentInfo and SaveBinlogPathsRequest proto messages - Add UpdateManifest operator to datacoord meta operations - Update metacache, sync manager, and meta writer to propagate manifest paths - Include manifest_path in segment load info for query coordinator This is part of the Storage V2 FFI interface integration. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-27 19:24:10 +08:00
congqixia	fd0ef09e97	fix: Handle all-null data in StringIndexSort to prevent load timeout (#45100 ) Related to #45081 StringIndexSort now properly handles collections with all-null string fields by: - Removing the error thrown when unique_count is 0 in ParseBinaryData - Adding alignment and padding support in mmap serialization (similar to ScalarIndexSort) - Separating data_size_ from mmap_size_ to correctly parse data without reading padding This fixes load collection timeout failures when all string field data is null, particularly affecting STL_SORT and TRIE index types. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-27 18:04:09 +08:00
congqixia	36a887b38b	enhance: add NewSegmentWithLoadInfo API to support segment self-managed loading (#45061 ) This commit introduces the foundation for enabling segments to manage their own loading process by passing load information during segment creation. Changes: C++ Layer: - Add NewSegmentWithLoadInfo() C API to create segments with serialized load info - Add SetLoadInfo() method to SegmentInterface for storing load information - Refactor segment creation logic into shared CreateSegment() helper function - Add comprehensive documentation for the new API Go Layer: - Extend CreateCSegmentRequest to support optional LoadInfo field - Update segment creation in querynode to pass SegmentLoadInfo when available - Add ConvertToSegcoreSegmentLoadInfo() and helper converters for proto translation Proto Definitions: - Add segcorepb.SegmentLoadInfo message with essential loading metadata - Add supporting messages: Binlog, FieldBinlog, FieldIndexInfo, TextIndexStats, JsonKeyStats - Remove dependency on data_coord.proto by creating segcore-specific definitions Testing: - Add comprehensive unit tests for proto conversion functions - Test edge cases including nil inputs, empty data, and nil array/map elements This is the first step toward issue #45060 - enabling segments to autonomously manage their loading process, which will: - Clarify responsibilities between Go and C++ layers - Reduce cross-language call overhead - Enable precise resource management at the C++ level - Support better integration with caching layer - Enable proactive schema evolution handling Related to #45060 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-27 15:28:12 +08:00
yihao.dai	dabbae0386	fix: Prevent retry when importing invalid UTF-8 strings (#45067 ) Convert the invalid UTF-8 string to hex in the failure reason. issue: https://github.com/milvus-io/milvus/issues/45066 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-10-27 12:30:06 +08:00
yihao.dai	8d11373376	enhance: Show create time for import job (#45058 ) issue: https://github.com/milvus-io/milvus/issues/45056 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-10-27 12:14:08 +08:00
Zhen Ye	9d29e6ee64	fix: append operation can be only canceled by the wal itself but not the rpc (#45078 ) issue: #45077 We need to keep the state of the wal consistent with the in-memory state of the streamingnode, so we don't allow the append operation to be cancelled by the append caller, to avoid leaving an alive wal in an inconsistent state. The wal append operation can only be cancelled when the wal is shutting down. Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-27 11:08:05 +08:00
yihao.dai	2631e7f42a	enhance: Close channel replicator more gracefully (#45029 ) issue: https://github.com/milvus-io/milvus/issues/44123 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-10-27 11:00:06 +08:00
Spade A	ce2862d325	fix: fix parquet import bug in STRUCT (#45028 ) issue: https://github.com/milvus-io/milvus/issues/45006 ref: https://github.com/milvus-io/milvus/issues/42148 Previously, the parquet import was implemented on the assumption that a STRUCT in parquet files is handled such that each field of the struct is stored in a single column. However, from the user's perspective, an array of STRUCT contains data like STRUCT_A: for one row, [struct{field1_1, field2_1, field3_1}, struct{field1_2, field2_2, field3_2}, ...], rather than {[field1_1, field1_2, ...], [field2_1, field2_2, ...], [field3_1, field3_2, field3_3, ...]}. This PR fixes this. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-27 10:26:06 +08:00
congqixia	7c627260f3	enhance: Optimize ScalarIndexSort bitmap initialization for range queries (#45085 ) Optimize bitmap initialization in ScalarIndexSort range queries by using adaptive strategy based on result density. When more than 50% of elements match the range condition, initialize bitmap with all true values and clear non-matching elements. Otherwise, use the original approach of initializing with false and setting matching elements. Also defer bitmap allocation until after early return checks to avoid unnecessary memory allocation. This optimization reduces bit operations for high-selectivity queries while maintaining the same performance for low-selectivity queries. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-27 10:08:06 +08:00
Bingyi Sun	58277c8eb0	feat: Auto add namespace field data if namespace is enabled (#44933 ) issue: #44011 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-24 18:40:05 +08:00
cai.zhang	b069eeecd2	fix: Added GetMetrics back to IndexNodeServer to ensure compatibility (#45073 ) issue: #45070 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-24 17:00:06 +08:00
tinswzy	c328fd3c6a	fix: etcd request context contamination by RBAC auth info (#44964 ) #44892 fix etcd request context contamination by RBAC auth info When RBAC is enabled, Milvus uses the gRPC metadata library to inject RBAC authentication information into the request context (ctx). Since etcd’s authentication mechanism also relies on the same metadata library, if the same ctx is passed down to the etcd request, the RBAC auth info from Milvus contaminates the auth information used by etcd. This causes the etcd server to report an invalid auth token error when RBAC is enabled but etcd auth is disabled. #43638 upgrade wp to v0.1.10 Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2025-10-24 15:38:05 +08:00
Buqian Zheng	c284e8c4a8	enhance: some minor code cleanup, prepare for scalar benchmark (#45008 ) issue: https://github.com/milvus-io/milvus/issues/44452 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-10-24 14:22:05 +08:00
congqixia	199f6d936e	fix: Update milvus-storage to fix duplicate AWS SDK initialization (#45062 ) Update milvus-storage version from aa189ad to e5f5b4c to include the fix for duplicate AWS SDK initialization that was causing init conflicts. This update removes the redundant arrow::fs::InitializeS3() call that was resulting in duplicate Aws::InitAPI() initialization. The duplicate initialization was causing AWS SDK to ignore custom configurations, particularly affecting GCP Workload Identity authentication. Changes in milvus-storage e5f5b4c: - Remove redundant arrow::fs::InitializeS3() call - Keep only the extended S3 initialization with custom AWS SDK options - Ensure GCP IAM authentication via custom HTTP client factory works correctly Relates to #44745 Reference: milvus-io/milvus-storage#285 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-24 11:32:05 +08:00
zhuwenxing	1e130683be	test: add geometry datatype in checker (#44794 ) /kind improvement --------- Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>	2025-10-24 11:28:04 +08:00
Spade A	17ab2ac622	fix: fix alter collection failed for STRUCT sub-fields (#45041 ) issue: https://github.com/milvus-io/milvus/issues/45001 ref: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-24 10:24:06 +08:00
Spade A	d8591f9548	fix: csv/json import with STRUCT adapts concatenated struct name (#45000 ) After https://github.com/milvus-io/milvus/pull/44557, the field name in a STRUCT field becomes STRUCT_NAME[FIELD_NAME]. This PR makes import account for the change. issue: https://github.com/milvus-io/milvus/issues/45006 ref: https://github.com/milvus-io/milvus/issues/42148 TODO: parquet is much more complex than csv/json, and I will leave it to a separate PR. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-24 10:22:15 +08:00
Spade A	6494c75d31	fix: collection level MMAP does not take effect for STRUCT (#44996 ) issue: https://github.com/milvus-io/milvus/issues/42148 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-23 19:52:05 +08:00

23382 Commits