milvus

mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 03:13:22 +08:00

Author	SHA1	Message	Date
cai.zhang	7527ddf50f	enhance: [test] Move R-Tree index tests into the implementation package (#45355 ) Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-07 10:03:33 +08:00
zhagnlu	59c64bee07	fix: not use json_shredding for json path is null (#45310 ) #45284 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-11-06 11:43:33 +08:00
sparknack	9032bb7668	enhance: unify the aligned buffer for both buffered and direct I/O (#45323 ) issue: #43040 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-11-06 10:53:33 +08:00
yihao.dai	121eb912ba	fix: Fix load segment failed due to get disk usage error (#45255 ) When getting disk usage, files or directories may be removed concurrently due to segment release. This PR ignores “file or directory does not exist” errors in such cases. issue: https://github.com/milvus-io/milvus/issues/45239 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-11-06 08:51:33 +08:00
congqixia	55bfd610b6	enhance: [StorageV2] Integrate FFI interface for packed reader (#45132 ) Related to #44956 Integrate the StorageV2 FFI interface as the unified storage layer for reading packed columnar data, replacing the custom iterative reader with a manifest-based approach using the milvus-storage library. Changes: - Add C++ FFI reader implementation (ffi_reader_c.cpp/h) with Arrow C Stream interface - Implement utility functions to convert CStorageConfig to milvus-storage Properties - Create ManifestReader in Go that generates manifests from binlogs - Add FFI packed reader CGO bindings (packed_reader_ffi.go) - Refactor NewBinlogRecordReader to use ManifestReader for V2 storage - Support both manifest file paths and direct manifest content - Enable configurable buffer sizes and column projection Technical improvements: - Zero-copy data exchange using Arrow C Data Interface - Optimized I/O operations through milvus-storage library - Simplified code path with manifest-based reading - Better performance with batched streaming reads --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-11-05 19:57:34 +08:00
cai.zhang	fa3d4ebfbe	fix: Compute the correct batch size for the geometry index of the growing segment (#45253 ) issue: #44648 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-04 20:25:37 +08:00
zhenshan.cao	6327c9a514	fix: Fix bugs related to TimestampTz (#45111 ) issue: https://github.com/milvus-io/milvus/issues/44527 https://github.com/milvus-io/milvus/issues/44537 https://github.com/milvus-io/milvus/issues/44538 https://github.com/milvus-io/milvus/issues/44585 https://github.com/milvus-io/milvus/issues/44622 Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>	2025-11-04 16:51:33 +08:00
sparknack	40b5e6b134	fix: avoid potential race conditions when updating the executor (#45230 ) issue: #43040 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-11-04 14:25:33 +08:00
cai.zhang	617891b436	fix: Skip create tmp dir for growing R-Tree index (#45256 ) issue: #45181 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-04 13:01:32 +08:00
Spade A	cd0b36c39e	feat: impl StructArray -- support diskann index (#45223 ) issue: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-11-04 11:57:33 +08:00
zhagnlu	653e95aaad	fix: fix bug for shredding json when empty json but not null (#45221 ) #45157 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-11-04 11:11:33 +08:00
cai.zhang	01cf5c9341	enhance: Add log to debug index task (#45198 ) Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-03 20:01:34 +08:00
cai.zhang	ed8ba4a28c	enhance: Make GeometryCache an optional configuration (#45192 ) issue: #45187 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-11-03 19:59:32 +08:00
Jingsong Yin	e25ee08566	fix: fix LoadMetrics bool type error (#45209 ) #44584 Signed-off-by: thekingking <1677273255@qq.com>	2025-11-01 01:19:32 +08:00
Jingsong Yin	0cc79772e7	enhance: Extend SkipIndex with IN/Match support and BloomFilter (#44581 ) issue: #44584 --------- Signed-off-by: thekingking <1677273255@qq.com>	2025-10-31 22:39:32 +08:00
congqixia	22098c1785	fix: add null check for packed_writer_ in JsonStatsParquetWriter::Close() (#45158 ) Related to #45157 Fix a bug where DataNode panics when building json stats index throws an exception before the writer is initialized. The destructor would call Close() on an uninitialized packed_writer_ pointer, causing a null pointer dereference. Changes: - Add null check for packed_writer_ before calling Flush() and Close() - Prevents null pointer dereference in edge cases - Ignore close status as this is a cleanup operation This ensures safe cleanup even when initialization fails due to exceptions. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-30 17:40:09 +08:00
cqy123456	35d8213a00	fix: fail to mmap emb_list_meta in embedding list (#45127 ) issue: https://github.com/milvus-io/milvus/issues/44965 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-10-30 11:00:09 +08:00
aoiasd	ad9a0cae48	enhance: add global analyzer options (#44684 ) relate: https://github.com/milvus-io/milvus/issues/43687 Add global analyzer options, avoid having to merge some milvus params into user's analyzer params. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-28 14:52:10 +08:00
congqixia	fd0ef09e97	fix: Handle all-null data in StringIndexSort to prevent load timeout (#45100 ) Related to #45081 StringIndexSort now properly handles collections with all-null string fields by: - Removing the error thrown when unique_count is 0 in ParseBinaryData - Adding alignment and padding support in mmap serialization (similar to ScalarIndexSort) - Separating data_size_ from mmap_size_ to correctly parse data without reading padding This fixes load collection timeout failures when all string field data is null, particularly affecting STL_SORT and TRIE index types. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-27 18:04:09 +08:00
congqixia	36a887b38b	enhance: add NewSegmentWithLoadInfo API to support segment self-managed loading (#45061 ) This commit introduces the foundation for enabling segments to manage their own loading process by passing load information during segment creation. Changes: C++ Layer: - Add NewSegmentWithLoadInfo() C API to create segments with serialized load info - Add SetLoadInfo() method to SegmentInterface for storing load information - Refactor segment creation logic into shared CreateSegment() helper function - Add comprehensive documentation for the new API Go Layer: - Extend CreateCSegmentRequest to support optional LoadInfo field - Update segment creation in querynode to pass SegmentLoadInfo when available - Add ConvertToSegcoreSegmentLoadInfo() and helper converters for proto translation Proto Definitions: - Add segcorepb.SegmentLoadInfo message with essential loading metadata - Add supporting messages: Binlog, FieldBinlog, FieldIndexInfo, TextIndexStats, JsonKeyStats - Remove dependency on data_coord.proto by creating segcore-specific definitions Testing: - Add comprehensive unit tests for proto conversion functions - Test edge cases including nil inputs, empty data, and nil array/map elements This is the first step toward issue #45060 - enabling segments to autonomously manage their loading process, which will: - Clarify responsibilities between Go and C++ layers - Reduce cross-language call overhead - Enable precise resource management at the C++ level - Support better integration with caching layer - Enable proactive schema evolution handling Related to #45060 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-27 15:28:12 +08:00
congqixia	7c627260f3	enhance: Optimize ScalarIndexSort bitmap initialization for range queries (#45085 ) Optimize bitmap initialization in ScalarIndexSort range queries by using adaptive strategy based on result density. When more than 50% of elements match the range condition, initialize bitmap with all true values and clear non-matching elements. Otherwise, use the original approach of initializing with false and setting matching elements. Also defer bitmap allocation until after early return checks to avoid unnecessary memory allocation. This optimization reduces bit operations for high-selectivity queries while maintaining the same performance for low-selectivity queries. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-27 10:08:06 +08:00
Buqian Zheng	c284e8c4a8	enhance: some minor code cleanup, prepare for scalar benchmark (#45008 ) issue: https://github.com/milvus-io/milvus/issues/44452 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-10-24 14:22:05 +08:00
congqixia	199f6d936e	fix: Update milvus-storage to fix duplicate AWS SDK initialization (#45062 ) Update milvus-storage version from aa189ad to e5f5b4c to include the fix for duplicate AWS SDK initialization that was causing init conflicts. This update removes the redundant arrow::fs::InitializeS3() call that was resulting in duplicate Aws::InitAPI() initialization. The duplicate initialization was causing AWS SDK to ignore custom configurations, particularly affecting GCP Workload Identity authentication. Changes in milvus-storage e5f5b4c: - Remove redundant arrow::fs::InitializeS3() call - Keep only the extended S3 initialization with custom AWS SDK options - Ensure GCP IAM authentication via custom HTTP client factory works correctly Relates to #44745 Reference: milvus-io/milvus-storage#285 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-24 11:32:05 +08:00
Buqian Zheng	22995cea3f	fix: Remove debug logging from JsonFlatIndex (#44807 ) issue: https://github.com/milvus-io/milvus/issues/44452 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com> Co-authored-by: buqian.zheng <buqian.zheng@zilliz.com>	2025-10-23 16:08:06 +08:00
Bingyi Sun	52270701ce	feat: use namespace skip index when search (#44888 ) issue: #44011 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-23 12:04:04 +08:00
Spade A	6077178553	enhance: enable STL_SORT to support VARCHAR (#44401 ) issue: https://github.com/milvus-io/milvus/issues/44399 This PR implements STL_SORT for VARCHAR data type for both RAM and MMAP mode. The general idea is that we deduplicate field values and maintain a posting list for each unique value. The serialization format of the index is: ``` [unique_count][string_offsets][string_data][post_list_offsets][post_list_data][magic_code] string_offsets: array of offsets into string_data section string_data: str_len1, str1, str_len2, str2, ... post_list_offsets: array of offsets into post_list_data section post_list_data: post_list_len1, row_id1, row_id2, ..., post_list_len2, row_id1, row_id2, ... ``` --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-23 11:00:05 +08:00
cai.zhang	3d11ba06ef	fix: Double check to avoid iter has been erased by other thread (#45013 ) issue: #44974 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-21 23:36:04 +08:00
zhagnlu	730308b1eb	fix: fix not equal not include None (#44959 ) #44816 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-21 17:08:03 +08:00
cai.zhang	b23d75a032	fix: Fix bug for gis function to filter geometry (#44966 ) issue: #44961 This PR fixes 3 geometry related bugs: 1. Implement `ToString` interface for GisFunctionFilter. 2. Ignore GisFunctionFilter `MoveCursor` for growing segment. 3. Don't skip null geometry when building the R-Tree index; it should be recorded in null_offsets. --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-21 09:52:04 +08:00
cai.zhang	a35a3b7c69	fix: Ensure fulfill promise when CreateArrowFileSystem throw an exception (#44975 ) issue: #44974 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-20 23:32:03 +08:00
zhagnlu	05df48fbe4	fix:remove duplicated '/' in jsonstats path (#44939 ) #44950 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-20 14:06:03 +08:00
Zhen Ye	f98d02b3e1	fix: use short debug string to avoid newline in debug logs (#44925 ) issue: #44924 Signed-off-by: chyezh <chyezh@outlook.com>	2025-10-20 10:16:03 +08:00
Bingyi Sun	3ddf9154ab	fix: Fix exists expr for json flat index (#44910 ) issue: https://github.com/milvus-io/milvus/issues/44915 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-19 19:46:07 +08:00
congqixia	27dbb8e75d	fix: support JSON default value in `CreateArrowScalarFromDefaultValue` (#44912 ) Related to #44897 Add missing JSON data type handling in CreateArrowScalarFromDefaultValue to fix query failures when dynamic fields are enabled. JSON default values are now properly converted to arrow::BinaryScalar using bytes_data(). Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-17 18:22:00 +08:00
cai.zhang	b0f642fb4c	fix: Fix the geometry return POINT(0 0) when growing mmap is enabled (#44889 ) issue: #44802 After a Geometry object is serialized into WKB, the resulting binary may contain '\0' bytes. When growing mmap is enabled, the append data logic uses strcpy, which stops copying at the first '\0' byte. This causes only part of the WKB, typically the portion up to the geometry type field, to be copied, leading to corrupted data. As a result, during parsing, all POINT geometries are incorrectly interpreted as POINT(0 0). To fix this issue, memcpy is used instead of strcpy. Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-10-17 17:10:11 +08:00
zhagnlu	b7935557e1	fix:unified json exists path semantic (#44916 ) #44927 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-17 16:40:02 +08:00
zhagnlu	ae19c93c14	enhance: remove timestamp filter for search_ids to optimize performance (#44634 ) #44352 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-17 16:10:01 +08:00
sparknack	4bd30a74ca	enhance: cachinglayer: add mmap and eviction support for TextMatchIndex (#44806 ) issue: #41435, #44502 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-17 14:42:02 +08:00
Bingyi Sun	633cae9461	enhance: add namespace for query and search request (#44343 ) issue: #44011 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-16 17:52:01 +08:00
congqixia	684018ca7b	fix: ensure deterministic search result ordering when scores are equal (#44870 ) Related to #44819 This fix addresses an issue(#44819) where the offset parameter did not work correctly during searches when multiple results had identical scores. The problem occurred because results with equal scores were not consistently ordered, leading to unpredictable pagination behavior. The solution adds a new sorting step (SortEqualScoresByPks) in the reduce phase that sorts results with identical scores by their primary keys in ascending order. This ensures deterministic ordering and enables proper offset functionality. Changes: - Add SortEqualScoresByPks() to sort results with equal scores by PK - Add SortEqualScoresOneNQ() to handle per-query sorting logic - Invoke sorting step after FillPrimaryKey() in Reduce() workflow --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-10-16 10:04:00 +08:00
Bingyi Sun	26d06c6340	feat: load skip index using parquet statistics (#44252 ) #44011 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-15 19:16:00 +08:00
cqy123456	822588302a	enhance: embedding_list support mmap in MemVectorIndex (#44764 ) issue: https://github.com/milvus-io/milvus/issues/44702 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-10-15 15:22:00 +08:00
Spade A	c4f3f0ce4c	feat: impl StructArray -- support more types of vector in STRUCT (#44736 ) ref: https://github.com/milvus-io/milvus/issues/42148 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-10-15 10:25:59 +08:00
Spade A	b8df1c0cc5	enhance: improve observability in trace for segcore scalar expression (#44260 ) Ref https://github.com/milvus-io/milvus/issues/44259 This PR connects the trace between Go and segcore and adds full traces for the scalar expression call chain. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-10-14 17:15:59 +08:00
Bingyi Sun	6cb1f7d7c6	enhance: optimize the performance of bitmap reverse lookup (#44804 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-10-14 11:57:58 +08:00
zhagnlu	2f178f810f	fix:fix json_contains(path, int) bug (#44814 ) #44816 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-14 00:19:59 +08:00
sparknack	df6a4dc1a0	fix: cachinglayer: avoid eviction during json handling (#44812 ) issue: #44797 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-13 22:07:58 +08:00
aoiasd	1b17e16fc7	fix: expr filter return wrong result when skipped (#44778 ) relate: https://github.com/milvus-io/milvus/issues/44777 A skipped filter should return a result of false, but it currently returns valid[0], which is almost always true. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-10-13 18:33:59 +08:00
zhagnlu	3dd5deb70a	fix:disable using shredding for json_path contains digital (#44724 ) #44132 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-10-13 17:25:59 +08:00
sparknack	c8a4d6e2ef	enhance: add cachinglayer management for TextMatchIndex (#44741 ) issue: #41435, #44502 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-10-13 14:37:58 +08:00
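
Commit 684018ca7b above describes sorting results with identical scores by primary key in ascending order so that the search offset behaves deterministically. The following is a minimal Python sketch of that idea, not the actual C++ SortEqualScoresByPks code. The function names, the (pk, score) tuple layout, and the assumption that a higher score ranks first are all illustrative.

```python
from typing import List, Tuple

Hit = Tuple[int, float]  # (primary key, score); hypothetical layout for illustration


def sort_equal_scores_by_pk(hits: List[Hit]) -> List[Hit]:
    """Order hits by descending score and break ties by ascending primary key.

    Assumes a higher score is better; a distance-style metric would flip the sign.
    """
    return sorted(hits, key=lambda h: (-h[1], h[0]))


def page(hits: List[Hit], offset: int, limit: int) -> List[Hit]:
    """Apply offset and limit after the deterministic ordering."""
    return sort_equal_scores_by_pk(hits)[offset:offset + limit]
```

Because the tie-break depends only on the primary key, the same page comes back regardless of the order in which equal-score hits arrive from the segments.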

1892 Commits
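
Commit b0f642fb4c in the log above attributes the corrupted geometry to strcpy stopping at the first '\0' byte inside the WKB payload. The sketch below illustrates only that truncation, in Python rather than the C++ of the actual fix. The helper names are hypothetical, and the little-endian 2D point layout is an assumption based on the standard WKB encoding.

```python
import struct


def point_wkb(x: float, y: float) -> bytes:
    """Little-endian WKB for a 2D point: byte order flag, geometry type 1, then x and y."""
    return struct.pack("<BIdd", 1, 1, x, y)


def strcpy_like(src: bytes) -> bytes:
    """Copy bytes up to (not including) the first NUL, mimicking C strcpy."""
    return src.split(b"\x00", 1)[0]


def memcpy_like(src: bytes, n: int) -> bytes:
    """Copy exactly n bytes regardless of content, mimicking C memcpy."""
    return src[:n]


wkb = point_wkb(1.0, 2.0)
truncated = strcpy_like(wkb)      # stops inside the 4-byte geometry type field
intact = memcpy_like(wkb, len(wkb))
```

In this sketch the strcpy-style copy keeps only the byte order flag and the first byte of the type field, so the coordinates never reach the destination buffer, while the length-based copy preserves all 21 bytes.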