1949 Commits

Author SHA1 Message Date
Bingyi Sun
ba6198a3b8
fix: Replace json.doc() calls with json.dom_doc() in JsonContainsExpr (#45785)
issue: https://github.com/milvus-io/milvus/issues/45783
pr: https://github.com/milvus-io/milvus/pull/45573

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-25 20:19:07 +08:00
Buqian Zheng
1fda4bcae4
enhance: [2.5] add ScalarFieldProto& overload to avoid unnecessary copies (#45744)
1. Array.h: Add output_data(ScalarFieldProto&) overload for both Array
and ArrayView classes
2. Use std::string_view instead of std::string for VARCHAR and GEOMETRY
types to avoid extra string copies
3. Call Reserve(length_) before writing to proto objects to reduce
memory reallocations

A simple test shows these optimizations improve bulk_subscript
performance for Array of Varchar by 20%.
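
The reserve-then-write pattern from points 2 and 3 can be sketched as follows. This is a minimal illustration, not the Milvus code: `StringColumn` is a hypothetical stand-in for a proto repeated string field, and `append_views` is an invented helper name.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a proto repeated string field.
using StringColumn = std::vector<std::string>;

// Appends every view to `out`. Reserving once up front avoids repeated
// reallocations while the output grows, and passing string_view means the
// only string copy is the one into the output container.
inline void append_views(const std::vector<std::string_view>& views,
                         StringColumn& out) {
    out.reserve(out.size() + views.size());
    for (std::string_view v : views) {
        out.emplace_back(v);  // single copy: view -> owned string
    }
}
```

The views must stay valid only for the duration of the call; the output owns its own copies afterwards.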

issue: https://github.com/milvus-io/milvus/issues/45679
pr: https://github.com/milvus-io/milvus/pull/45743

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-11-21 12:39:05 +08:00
Bingyi Sun
f1844c9841
enhance: optimize term expr performance (#45490)
issue: https://github.com/milvus-io/milvus/issues/45641
pr: https://github.com/milvus-io/milvus/pull/45491

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-11-19 11:51:06 +08:00
cai.zhang
1d6786545b
fix: [2.5] Fix filter geometry for growing with mmap (#45466)
issue: #45450 
master pr: #45464

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-11 15:41:40 +08:00
sparknack
91645d9242
enhance: [2.5] unify the aligned buffer for both buffered and direct I/O (#45324)
issue: #43040
pr: #45323

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-06 10:55:35 +08:00
sparknack
561b167f1e
fix:[2.5] avoid potential race conditions when updating the executor (#45231)
issue: #43030
pr: #45230

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-11-05 10:15:33 +08:00
cai.zhang
2e4502a4fc
fix: [2.5]Skip create tmp dir for growing R-Tree index (#45258)
issue: https://github.com/milvus-io/milvus/issues/45181

master pr: https://github.com/milvus-io/milvus/pull/45256

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-04 17:35:34 +08:00
cai.zhang
cc9735ff4f
enhance: [2.5]Make GeometryCache an optional configuration (#45197)
issue: #45187 
master pr: #45192

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-11-03 20:31:34 +08:00
foxspy
0f0ea4d206
enhance: [2.5] update knowhere version (#45148)
issue: #42937 
/kind branch-feature

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-10-30 10:08:08 +08:00
cai.zhang
3ebd1f2f26
fix: [2.5]Fix retrieve geometry null data when enable mmap (#45142)
issue: #44648

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-29 16:48:12 +08:00
cai.zhang
d43b030b4d
fix: [2.5] Fix bug for gis function to filter geometry (#44968)
issue: #44961
master pr: #44966 

This PR fixes 3 geometry related bugs:
1. Implement ToString interface for GisFunctionFilter.
2. Ignore GisFunctionFilter MoveCursor for growing segment.
3. Don't skip null geometries when building the R-Tree index; they should
be recorded in null_offsets.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-21 17:00:13 +08:00
Bingyi Sun
a0201ef98d
enhance: optimize the performance of bitmap reverse lookup (#44804) (#44958)
pr: https://github.com/milvus-io/milvus/pull/44804

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-10-21 14:38:04 +08:00
cai.zhang
c6cc3d2c25
fix: [2.5] Fix the geometry return POINT(0 0) when growing mmap is enabled (#44891)
issue: #44802 
master pr: #44889 

After a Geometry object is serialized into WKB, the resulting binary may
contain '\0' bytes.
When growing mmap is enabled, the append-data logic uses strcpy, which
stops copying at the first '\0' byte.
As a result, only part of the WKB (typically the portion up to the
geometry type field) is copied, leading to corrupted data.
During parsing, all POINT geometries are then incorrectly
interpreted as POINT(0 0).

To fix this issue, memcpy will be used instead of strcpy.
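
The failure mode can be reproduced outside Milvus. The sketch below is illustrative only: all helper names are hypothetical, the WKB buffer is built by hand for a POINT, and a little-endian host is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hand-built little-endian WKB for a POINT: 1 byte order flag, 4 byte
// geometry type, then x and y as 8-byte doubles.
// Assumption: the host is little-endian, so memcpy of a double is enough.
inline std::vector<char> make_point_wkb(double x, double y) {
    std::vector<char> wkb(1 + 4 + 8 + 8, 0);
    wkb[0] = 1;  // 1 = little-endian
    wkb[1] = 1;  // geometry type 1 = Point; bytes 2..4 stay zero
    std::memcpy(wkb.data() + 5, &x, sizeof(x));
    std::memcpy(wkb.data() + 13, &y, sizeof(y));
    return wkb;
}

// Buggy copy: strcpy treats the buffer as a C string and stops at the
// first zero byte, which sits inside the geometry type field.
inline std::vector<char> copy_with_strcpy(const std::vector<char>& src) {
    std::vector<char> dst(src.size(), 0);
    std::strcpy(dst.data(), src.data());
    return dst;
}

// Correct copy: memcpy copies exactly src.size() bytes, zeros included.
inline std::vector<char> copy_with_memcpy(const std::vector<char>& src) {
    std::vector<char> dst(src.size(), 0);
    std::memcpy(dst.data(), src.data(), src.size());
    return dst;
}

// Reads the x coordinate back out of a POINT WKB buffer.
inline double read_x(const std::vector<char>& wkb) {
    double x = 0.0;
    std::memcpy(&x, wkb.data() + 5, sizeof(x));
    return x;
}
```

With `make_point_wkb(1.0, 2.0)`, the strcpy copy keeps only the two leading bytes, so the coordinates read back as zero, while the memcpy copy is byte-identical to the source.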

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-17 17:16:08 +08:00
congqixia
93411a388c
fix: [2.5] ensure deterministic search result ordering when scores are equal (#44870) (#44885)
Cherry-pick from master
pr: #44870

Related to #44819
This fix addresses an issue(#44819) where the offset parameter did not
work correctly during searches when multiple results had identical
scores. The problem occurred because results with equal scores were not
consistently ordered, leading to unpredictable pagination behavior.

The solution adds a new sorting step (SortEqualScoresByPks) in the
reduce phase that sorts results with identical scores by their primary
keys in ascending order. This ensures deterministic ordering and enables
proper offset functionality.

Changes:
- Add SortEqualScoresByPks() to sort results with equal scores by PK
- Add SortEqualScoresOneNQ() to handle per-query sorting logic
- Invoke sorting step after FillPrimaryKey() in Reduce() workflow
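
The tie-breaking idea can be sketched as below. This is not the Milvus implementation: `Hit` and `sort_equal_scores_by_pk` are hypothetical names, and the input is assumed to hold the results of one query already ordered by score.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical result row: primary key plus similarity score.
struct Hit {
    std::int64_t pk;
    float score;
};

// Within each run of identical scores, reorder hits by ascending primary
// key. The relative order of different scores is left untouched.
inline void sort_equal_scores_by_pk(std::vector<Hit>& hits) {
    auto run_begin = hits.begin();
    while (run_begin != hits.end()) {
        const float score = run_begin->score;
        // Find the end of the current run of equal scores.
        auto run_end = std::find_if(
            std::next(run_begin), hits.end(),
            [score](const Hit& h) { return h.score != score; });
        std::sort(run_begin, run_end,
                  [](const Hit& a, const Hit& b) { return a.pk < b.pk; });
        run_begin = run_end;
    }
}
```

Because ties now resolve the same way on every call, skipping the first `offset` rows selects the same rows each time.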

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-10-16 19:34:08 +08:00
cai.zhang
52ab33ba88
fix: [2.5] Skip empty loop for process growing segment (#44608)
issue: #43427 
master pr: #44606 

The GISFunction asserts that segment_offsets is not nullptr. When size
is 0, segment_offsets is nullptr, so the loop is skipped.
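
A minimal sketch of such a guard, with hypothetical function names that are not taken from the Milvus source:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a routine that requires a valid offsets pointer.
inline std::size_t visit_offsets(const std::int32_t* segment_offsets,
                                 std::size_t size) {
    assert(segment_offsets != nullptr);
    std::size_t visited = 0;
    for (std::size_t i = 0; i < size; ++i) {
        (void)segment_offsets[i];  // placeholder for per-row work
        ++visited;
    }
    return visited;
}

// Caller-side guard: an empty batch carries a null pointer, so return
// before reaching the assertion.
inline std::size_t process_batch(const std::int32_t* segment_offsets,
                                 std::size_t size) {
    if (size == 0) {
        return 0;
    }
    return visit_offsets(segment_offsets, size);
}
```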

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-10-10 14:55:59 +08:00
cai.zhang
3dc43422be
enhance: Use GEOSGeometry directly to skip the GDAL-to-GEOS conversion (#44519)
issue: #43427

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-24 18:30:06 +08:00
Bingyi Sun
c5a7845531
enhance: optimize the performance of binary_search_string (#44470)
pr: #44469

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-09-23 14:28:05 +08:00
cai.zhang
5b8288a0ef
enhance: Refine geometry cache with offsets (#44432)
issue: #43427

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-18 20:24:02 +08:00
cai.zhang
124a1b3ce4
fix: Fix geometry bugs and add cache for create Geometry (#44376)
issue: #44102, #44079, #44075

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-17 15:24:03 +08:00
cai.zhang
7ef76058d5
enhance: Support gis filter operator st_dwithin (#44392)
issue: #43427

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-09-16 22:44:03 +08:00
congqixia
02d12619e2
fix: [2.5] Update 2.5 branch format (#44096)
Cherry-pick from master
pr: #44077
Related to #44076

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-09-15 11:16:00 +08:00
aoiasd
cb0bb7b31f
enhance: [2.5] forbid panic when tantivy index path not exist (#44136)
pr: https://github.com/milvus-io/milvus/pull/44135

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-09-15 10:30:00 +08:00
cqy123456
ec4442d39b
enhance: update knowhere version (#44292)
issue: https://github.com/milvus-io/milvus/issues/42937 
master pr:https://github.com/milvus-io/milvus/pull/44294

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-09-11 15:05:58 +08:00
cai.zhang
877e68f851
enhance: Support R-Tree index for geometry datatype (#44069)
issue: #43427
pr: #37417

Support R-Tree index for geometry datatype.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
2025-09-11 14:19:58 +08:00
zhagnlu
802026569d
enhance:add param to modify delete snapshot size (#44213)
pr: #44215

Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-05 14:31:56 +08:00
cqy123456
c17ce3cf90
enhance:[2.5]minhash support and add autoindex config (#44015)
master pr: https://github.com/milvus-io/milvus/pull/44186

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-09-03 17:39:54 +08:00
zhagnlu
4ff9e49a99
fix:expand lock range for dump_snapshot (#44131)
cherry-pick from pr: #44130

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-09-01 16:25:54 +08:00
ZhuXi
cd931a0388
feat:Geospatial Data Type and GIS Function support for milvus (#43661)
issue: #43427
pr: #37417

This PR's main goal is to merge #37417 into Milvus 2.5 without conflicts.

# Main Goals

1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search to display geospatial data
5. Support using GIS functions like ST_EQUALS in query

# Solution

1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions. Currently, only brute-force search is
supported.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. See the corresponding
changes in pymilvus.

---------

Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: cai.zhang <cai.zhang@zilliz.com>
2025-08-26 19:11:55 +08:00
zhagnlu
6c29689ca2
enhance: support expr result cache (#43882)
cherry-pick from pr: #43923

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-26 11:19:57 +08:00
cqy123456
a1ff6c89be
enhance:[2.5] Make build ratio of interim index configurable (#43938)
issue: https://github.com/milvus-io/milvus/issues/43993
master pr: https://github.com/milvus-io/milvus/pull/43939

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-08-25 16:01:52 +08:00
Alexander Guzhva
5903f049fb
enhance: Fix ArithHelperI64 for SVE in bitset (#43953)
pr: #43952

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-08-19 22:49:43 +08:00
Alexander Guzhva
84b7ec880d
enhance: remove duplicate code in ArithHelperF32 in SVE for bitset (#43951)
pr: #43950

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-08-19 22:38:44 +08:00
liliu-z
bd9fd42310
enhance: Fix template declaration order for ArithHelperF32 in SVE implementation (#43948)
pr: #43949

Signed-off-by: Li Liu <li.liu@zilliz.com>
2025-08-19 22:00:39 +08:00
liliu-z
a6bfa25054
enhance: Cp sve support for bitset (#43928)
pr: #43833

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
Signed-off-by: Li Liu <li.liu@zilliz.com>
Co-authored-by: Alexander Guzhva <alexanderguzhva@gmail.com>
2025-08-19 16:33:47 +08:00
sparknack
b57d104742
enhance: [2.5] add write rate limit for disk file writer (#43856)
issue: https://github.com/milvus-io/milvus/issues/43040
pr: #43912

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-18 23:33:46 +08:00
Bingyi Sun
26883919de
fix: Fix wrong null offsets for json path index (#43823)
pr: #43390
issue: https://github.com/milvus-io/milvus/issues/43315

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-08-18 14:47:46 +08:00
yihao.dai
1644d0b288
enhance: [2.5] Improve error message when query vector and dim mismatch (#43836)
/kind improvement

pr: https://github.com/milvus-io/milvus/pull/43835

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-08-18 10:23:45 +08:00
zhagnlu
6d86aade6c
fix: fix delete consumer bug for concurrent R-W (#43831) (#43855)
pr: #43831

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-08-14 10:19:43 +08:00
sparknack
4d944aecf7
enhance: add disk file writer with Direct IO support (#43692)
issue: #43040
pr: #42665 

This patch introduces a disk file writer that supports Direct IO.

Currently, it is exclusively utilized during the QueryNode load process.

Its parameters are:

1. `common.diskWriteMode` This parameter controls the write mode of the
local disk, which is used to write temporary data downloaded from remote
storage. Currently, only QueryNode uses 'common.diskWrite*' parameters.
Support for other components will be added in the future.
The options include 'direct' and 'buffered'. The default value is
'buffered'.

2. `common.diskWriteBufferSizeKb` Disk write buffer size in KB, only
used when disk write mode is 'direct', default is 64KB.
Current valid range is [4, 65536]. If the value is not aligned to 4KB,
it will be rounded up to the nearest multiple of 4KB.

3. `common.diskWriteNumThreads` This parameter controls the number of
writer threads used for disk write operations. The valid range is [0,
hardware_concurrency]. It is designed to limit the maximum concurrency
of disk write operations to reduce the impact on disk read performance.
For example, if you want to limit the maximum concurrency of disk write
operations to 1, you can set this parameter to 1.
The default value is 0, which means the caller will perform write
operations directly without using an additional writer thread pool. In
this case, the maximum concurrency of disk write operations is
determined by the caller's thread pool size.

Both parameters can be updated during runtime.
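
The alignment rule for `common.diskWriteBufferSizeKb` can be sketched as below. The function and constant names are hypothetical. Rejecting out-of-range values with an empty optional is also an assumption, since the description above does not say how such values are handled.

```cpp
#include <cstdint>
#include <optional>

constexpr std::uint64_t kAlignKb = 4;      // required alignment, in KB
constexpr std::uint64_t kMinKb = 4;        // lower bound of the valid range
constexpr std::uint64_t kMaxKb = 65536;    // upper bound of the valid range

// Rounds a size in KB up to the nearest multiple of 4 KB.
inline std::uint64_t round_up_to_alignment(std::uint64_t size_kb) {
    return (size_kb + kAlignKb - 1) / kAlignKb * kAlignKb;
}

// Returns the aligned buffer size, or an empty optional when the input
// lies outside [4, 65536] (assumed behavior, see the note above).
inline std::optional<std::uint64_t> normalize_buffer_size_kb(
    std::uint64_t size_kb) {
    if (size_kb < kMinKb || size_kb > kMaxKb) {
        return std::nullopt;
    }
    return round_up_to_alignment(size_kb);
}
```

Under this sketch the default of 64 is already aligned and is returned unchanged, while 5 becomes 8.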

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-08-08 12:13:41 +08:00
Spade A
a3c5e2e3c3
feat: support phrase match query for 2.5 (#43716)
pr: https://github.com/milvus-io/milvus/pull/38869
issue: https://github.com/milvus-io/milvus/issues/38930

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-08-08 11:35:41 +08:00
aoiasd
305524f99a
fix: jieba tokenizer cause panic when dict word was empty string (#43337) (#43718)
pr: https://github.com/milvus-io/milvus/pull/43337
relate: https://github.com/milvus-io/milvus/issues/42779

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-08-04 20:57:48 +08:00
Chun Han
f033294dc1
fix: try to get span raw data for variable length data type(#43544) (#43703)
related: #43544
pr: https://github.com/milvus-io/milvus/pull/43705

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-08-04 11:25:39 +08:00
Xianhui Lin
e44df1c583
fix: increment offset for invalid data rows in JsonKeyStatsInvertedIndex (#43688)
fix: increment offset for null data rows in JsonKeyStatsInvertedIndex
issue: https://github.com/milvus-io/milvus/issues/43151
pr:https://github.com/milvus-io/milvus/pull/43679

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-08-03 13:11:38 +08:00
Bingyi Sun
cc21855174
enhance: unlink mmap file when chunk and index are destructed (#43546)
pr: https://github.com/milvus-io/milvus/pull/43524
issue: https://github.com/milvus-io/milvus/issues/41636

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-08-01 11:17:37 +08:00
zhagnlu
ea7307747a
fix: fix pk in [..] skip next batch when using multi-chunk segment (#43619)
pr: #43618

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 16:59:37 +08:00
zhagnlu
4b8e8bd9fd
enhance: using set element for string term type (#43393)
pr: #43049

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-31 10:43:38 +08:00
foxspy
bb528ba065
enhance: [2.5]update knowhere version (#43623)
issue: #42937 
related: #43528 https://github.com/zilliztech/knowhere/pull/1278
pr: #43528

Signed-off-by: xianliang <xianliang.li@zilliz.com>
2025-07-29 17:27:38 +08:00
Chun Han
a8c28d174f
fix: fail to get string views due to chunk bound empty loop(#41300) (#43482)
related: #41300
pr: https://github.com/milvus-io/milvus/pull/41452

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-25 10:16:54 +08:00
aoiasd
fe39128021
enhance: update lindera version (#43457)
relate: https://github.com/milvus-io/milvus/issues/43120
pr: https://github.com/milvus-io/milvus/pull/43121

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-07-22 19:56:53 +08:00
Chun Han
ebb1ff35bb
fix: refine judgement for batch views(#38736) (#43479)
related: #38736
pr: https://github.com/milvus-io/milvus/pull/43481

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-22 14:54:53 +08:00