52 Commits

Author SHA1 Message Date
zhagnlu
77f7d19400
fix: avoid mmap rewrite by multi json fields (#44299)
issue: #44127

Signed-off-by: zhagnlu <lu.zhang@zilliz.com>
2025-09-11 10:13:57 +08:00
Spade A
911a8df17c
feat: impl StructArray -- data storage support in segcore (#42406)
Ref https://github.com/milvus-io/milvus/issues/42148
This PR mainly enables segcore to support arrays of vectors (read and
write, but not indexing). Currently only float vector is supported as
the element type.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-06-12 14:38:35 +08:00
Xianhui Lin
3bc24c264f
enhance: Add json key inverted index in stats for optimization (#38039)
Add json key inverted index in stats for optimization
https://github.com/milvus-io/milvus/issues/36995

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-04-10 15:20:28 +08:00
smellthemoon
cb1e86e17c
enhance: support add field (#39800)
After this PR is merged, insert, upsert, index building, query, and
search are supported on the added field.
These operations are available on the added field only after the add
field request completes, which is a synchronous operation.

Compaction will be supported in the next PR.
#39718

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-04-02 14:24:31 +08:00
Bingyi Sun
b59555057d
feat: support json index (#36750)
https://github.com/milvus-io/milvus/issues/35528

This PR adds json index support for json and dynamic fields. Currently
only unary queries such as 'a["b"] > 1' can use this index. More filter
types will be supported later.

basic usage:
```
collection.create_index("json_field", {"index_type": "INVERTED",
    "params": {"json_cast_type": DataType.STRING, "json_path":
'json_field["a"]["b"]'}})
```

This index has some limitations:
1. If a record does not have the json path you specify, it is ignored
and no error is raised.
2. If a value at the json path fails to be cast to the type you specify,
it is ignored and no error is raised.
3. A specific json path can have only one json index.
4. If you try to create more than one json index on one json field, the
sdk (pymilvus<=2.4.7) may return immediately because of its internal
implementation. This will be fixed in a later version.
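
Limits 1 and 2 above can be illustrated with a small sketch in plain Python. This is not the Milvus implementation: `extract_path`, `build_path_index`, and the sample records are hypothetical, and Python's `int()` stands in for the cast, so the real casting rules may differ.

```python
from collections import defaultdict


def extract_path(record, path):
    """Follow a list of keys into nested dicts; return (found, value)."""
    current = record
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return False, None
        current = current[key]
    return True, current


def build_path_index(records, path, cast):
    """Map each successfully cast value to the row offsets that hold it."""
    index = defaultdict(list)
    for offset, record in enumerate(records):
        found, value = extract_path(record, path)
        if not found:
            continue  # limit 1: a missing path is skipped without an error
        try:
            key = cast(value)
        except (TypeError, ValueError):
            continue  # limit 2: a failed cast is skipped without an error
        index[key].append(offset)
    return dict(index)


# Hypothetical sample rows for the path a -> b.
records = [
    {"a": {"b": "7"}},  # castable
    {"a": {"c": 1}},    # path missing
    {"a": {"b": "x"}},  # cast fails
    {"a": {"b": 7}},    # castable
]
int_index = build_path_index(records, ["a", "b"], int)
```

Rows 1 and 2 never reach the index, so a filter served by it cannot return them, and nothing is reported at build time.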

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-02-15 14:06:15 +08:00
Ted Xu
acc8fb7af6
enhance: eliminate compile warnings (part2) (#38535)
See #38435

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-12-25 15:30:50 +08:00
aoiasd
12951f0abb
enhance: rename tokenizer to analyzer and check analyzer params (#37478)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-10 16:12:26 +08:00
aoiasd
d67853fa89
feat: Tokenizer support build with params and clone for concurrency (#37048)
relate: https://github.com/milvus-io/milvus/issues/35853
https://github.com/milvus-io/milvus/issues/36751

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-11-06 17:48:24 +08:00
Buqian Zheng
f7b811450d
feat: add enable_tokenizer params to VarChar field (#36480)
issue: #35922

Add an enable_tokenizer param to the varchar field: it must be set to
true so that a varchar field can use enable_match or be used as input to
the BM25 function.

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-10 20:33:21 +08:00
Jiquan Long
89bf226f0b
feat: support keyword text match (#35923)
fix: #35922

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-10 15:11:08 +08:00
smellthemoon
5616b7e8d2
enhance: support null in c data_datacodec and load null value (#32183)
1. Support reading and writing null in segcore.
    valid_data is stored in fieldData (as uint8_t to save memory).
2. Support loading null.
    The binlog reader reads the data and writes it into the column
    (sealed segment) or insertRecord (growing segment). In a sealed
    segment, valid_data is stored directly. In a growing segment, to
    stay close to the prior implementation and keep the code readable,
    uint8_t is converted to fbvector<bool>, which may be optimized in
    the future.
3. Retrieve valid_data.
    valid_data is parsed in search/query.
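
The uint8_t to fbvector<bool> conversion in item 2 can be sketched as follows. The commit message does not state the layout of valid_data; this sketch assumes a bit-packed validity bitmap with the least significant bit first (the Apache Arrow convention), and `unpack_valid_bits` is a hypothetical name.

```python
from typing import List


def unpack_valid_bits(valid_data: bytes, num_rows: int) -> List[bool]:
    """Expand a packed validity bitmap into one bool per row.

    Assumption: bit i of the bitmap (LSB first within each byte) is 1
    when row i holds a value and 0 when row i is null.
    """
    return [
        bool((valid_data[row // 8] >> (row % 8)) & 1)
        for row in range(num_rows)
    ]


# Ten rows packed into two bytes; rows 0, 2 and 9 are valid.
flags = unpack_valid_bits(bytes([0b00000101, 0b00000010]), 10)
```

The packed form costs one bit per row, while the expanded form is simpler to index row by row, which matches the trade-off the message describes for growing segments.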
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-07-23 16:07:51 +08:00
zhagnlu
bd9727a1f7
fix: fix bug that set incorrect info to columnbase (#34428)
#34427

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-14 22:27:46 +08:00
Cai Yudong
246586be27
enhance: Unify data type check APIs under internal/core (#31800)
Issue: #22837 

Move and rename the following C++ APIs:
datatype_sizeof() ==> GetDataTypeSize()
datatype_name() ==> GetDataTypeName()
datatype_is_vector() / IsVectorType() ==> IsVectorDataType()
datatype_is_variable() ==> IsVariableDataType()
datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType()
datatype_is_string() / IsString() ==> IsDataTypeString()
datatype_is_floating() / IsFloat() ==> IsDataTypeFloat()
datatype_is_binary() ==> IsDataTypeBinary()
datatype_is_json() ==> IsDataTypeJson()
datatype_is_array() ==> IsDataTypeArray()
datatype_is_variable() ==> IsDataTypeVariable()
datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger()
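
For code that still calls the old names, the renames listed above could be applied mechanically. The sketch below is a hypothetical helper, not part of the commit. It covers only a subset of the old snake_case names and leaves out datatype_is_variable(), which the list maps to two different targets; the sample C++ line is made up.

```python
import re

# Old name -> new name, copied from the list above (subset).
RENAMES = {
    "datatype_sizeof": "GetDataTypeSize",
    "datatype_name": "GetDataTypeName",
    "datatype_is_vector": "IsVectorDataType",
    "datatype_is_sparse_vector": "IsSparseFloatVectorDataType",
    "datatype_is_string": "IsDataTypeString",
    "datatype_is_json": "IsDataTypeJson",
    "datatype_is_array": "IsDataTypeArray",
}

# Match an old name only as a whole identifier followed by "(".
CALL = re.compile(
    r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b(?=\s*\()"
)


def migrate(source: str) -> str:
    """Rewrite calls to the old API names into the new ones."""
    return CALL.sub(lambda match: RENAMES[match.group(1)], source)


old_line = "if (datatype_is_vector(t)) return datatype_sizeof(t);"
new_line = migrate(old_line)
```

The lookahead for an opening parenthesis keeps the rewrite from touching variables or comments that merely contain one of the old names.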

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-02 19:15:14 +08:00
Buqian Zheng
070dfc77bf
feat: [Sparse Float Vector] segcore basics and index building (#30357)
This commit adds sparse float vector support to segcore with the
following:

1. data type enum declarations
2. Adds corresponding data structures for handling sparse float vectors
in various scenarios, including:
* FieldData as a bridge between the binlog and the in memory data
structures
* mmap::Column as the in memory representation of a sparse float vector
column of a sealed segment;
* ConcurrentVector as the in memory representation of a sparse float
vector of a growing segment which supports inserts.
3. Adds logic in payload reader/writer to serialize/deserialize from/to
binlog
4. Adds the ability to allow the index node to build sparse float vector
index
5. Adds the ability to allow the query node to build growing index for
growing segment and temp index for sealed segment without index built

This commit also includes some code cleanup, comment improvements, and
unit tests for sparse vectors.

https://github.com/milvus-io/milvus/issues/29419

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-11 14:45:02 +08:00
Xu Tong
e429965f32
Add float16 approve for multi-type part (#28427)
issue: https://github.com/milvus-io/milvus/issues/22837

Add bfloat16 vector, add the index part of float16 vector.

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
yah01
04b2518ae7
enhance: fix the incorrect init parameter (#29357)
The `driver_` field is not used, so this doesn't matter for now.

Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-20 20:50:43 +08:00
yah01
8f89e9cf75
enhance: remove all unnecessary string formatting (#29323)
Done with two regular expressions:
- `PanicInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)`
- `AssertInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)`
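
A minimal sketch of how the first expression behaves, using Python's `re` module. The replacement template `PanicInfo(\1, \2)` is an assumption (the commit message lists only the search patterns), and the sample C++ line is made up for illustration.

```python
import re

# The first pattern from the list above, unchanged.
PANIC_PATTERN = re.compile(
    r"PanicInfo\((.+),[. \n]+fmt::format\(([.\s\S]+?)\)\)"
)

# Made-up call site with the nested fmt::format on a continuation line.
source = (
    "PanicInfo(SomeErrorCode,\n"
    '          fmt::format("unsupported type {}", data_type));'
)

# Assumed replacement: keep the error code and pass the format
# arguments straight to PanicInfo.
rewritten = PANIC_PATTERN.sub(r"PanicInfo(\1, \2)", source)
```

Group 1 captures the first argument, `[. \n]+` consumes the line break and indentation, and the lazy group 2 stops at the first `))`, so a format argument that itself ends in a call, such as `foo(x)`, would be cut short and needs manual review.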

related: #28811

---------

Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-20 10:04:43 +08:00
yah01
342635ed61
enhance: enable assert method to format arguments (#28812)
Currently the assert method in segcore accepts only a string message,
so much of the code doesn't print the values it asserts.

This change lets the assert method format its arguments.
related #28811

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-12-01 18:04:33 +08:00
Enwei Jiao
b80a3e19d3
Add code for PanicInfo (#27364)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-27 12:01:28 +08:00
cai.zhang
a362bb1457
Support array datatype (#26369)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-09-19 14:23:23 +08:00
Enwei Jiao
0afdfdb9af
Remove other Exceptions, keep SegcoreError only (#27017)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-14 14:05:20 +08:00
Enwei Jiao
c3f15c6b95
Refactor duplicate error class into one place (#26985)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-11 20:43:17 +08:00
Xu Tong
9166011c4a
Add float16 vector (#25852)
Signed-off-by: Writer-X <1256866856@qq.com>
2023-09-08 10:03:16 +08:00
Enwei Jiao
967a97b9bd
Support json & array types (#23408)
Signed-off-by: yah01 <yang.cen@zilliz.com>
Co-authored-by: yah01 <yang.cen@zilliz.com>
2023-04-20 11:32:31 +08:00
yah01
081572d31c
Refactor QueryNode (#21625)
Signed-off-by: yah01 <yang.cen@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: aoiasd <zhicheng.yue@zilliz.com>
2023-03-27 00:42:00 +08:00
yah01
bdd6bc7695
Re-format cpp code (#22513)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-02 15:55:49 +08:00
yah01
7478e44911
Support using mmap to load data (#22052)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-01 18:07:49 +08:00
Cai Yudong
a001412e12
Replace faiss::MetricType with knowhere::MetricType (#17891)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-06-29 14:20:19 +08:00
xige-16
b5c11a216d
Alter varChar type param's name to max_length (#17409)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-06-07 15:58:06 +08:00
xige-16
515d0369de
Support string type in segcore (#16546)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
Co-authored-by: dragondriver <jiquan.long@zilliz.com>
2022-04-29 13:35:49 +08:00
xige-16
27b4cbc098
Cherry pick remove translateHits commit to master (#16436)
Signed-off-by: xige-16 <xi.ge@zilliz.com>

Co-authored-by: bigsheeper <yihao.dai@zilliz.com>
2022-04-08 20:27:31 +08:00
zhenshan.cao
9506e75e21
[skip e2e]Update license (#13668)
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2021-12-17 20:54:42 +08:00
Cai Yudong
597523bf40
Reorder header files for FieldMeta.h (#10263)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-10-20 15:44:41 +08:00
Cai Yudong
27dcf698d3
Support set segcore chunk_size via config file (#7635)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-09-11 14:40:01 +08:00
FluorineDog
99ed122d11
Remove Dead Code, use signed type (#6398)
* make type signed

Signed-off-by: fluorinedog <fluorinedog@gmail.com>

* remove dead code

Signed-off-by: fluorinedog <fluorinedog@gmail.com>

* remove code

Signed-off-by: fluorinedog <fluorinedog@gmail.com>
2021-07-09 18:25:11 +08:00
FluorineDog
b1a9aea6a6
support get entity by ids in segcore (#5456)
Signed-off-by: fluorinedog <fluorinedog@gmail.com>
2021-05-28 10:39:30 +08:00
FluorineDog
396b3f33e9
Support TermExpr, NotExpr, LogicalExpr (#5096)
1. Support Term, like `A in [1, 2, 3]`
2. Support Not, like `! A < 3`
3. Support logical combination, like `A < 3 && B > 5 or C == 0`
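
The three expression kinds above can be sketched as a tiny tree evaluator. The class names and the row-at-a-time evaluation are hypothetical and are not segcore's implementation; the sketch also assumes `&&` binds tighter than `or`.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet

Row = Dict[str, Any]


@dataclass
class Compare:
    field: str
    op: Callable[[Any, Any], bool]
    value: Any

    def eval(self, row: Row) -> bool:
        return self.op(row[self.field], self.value)


@dataclass
class Term:
    field: str
    values: FrozenSet[Any]

    def eval(self, row: Row) -> bool:
        return row[self.field] in self.values


@dataclass
class Not:
    child: Any

    def eval(self, row: Row) -> bool:
        return not self.child.eval(row)


@dataclass
class Logical:
    op: str  # "and" or "or"
    left: Any
    right: Any

    def eval(self, row: Row) -> bool:
        if self.op == "and":
            return self.left.eval(row) and self.right.eval(row)
        return self.left.eval(row) or self.right.eval(row)


term = Term("A", frozenset({1, 2, 3}))            # A in [1, 2, 3]
negated = Not(Compare("A", operator.lt, 3))       # ! A < 3
combined = Logical(                               # A < 3 && B > 5 or C == 0
    "or",
    Logical("and", Compare("A", operator.lt, 3), Compare("B", operator.gt, 5)),
    Compare("C", operator.eq, 0),
)

row = {"A": 2, "B": 9, "C": 4}
results = (term.eval(row), negated.eval(row), combined.eval(row))
```

For this row, `results` is `(True, False, True)`: A is in the term list, A < 3 holds so its negation is false, and the `&&` branch alone satisfies the combined filter.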

Type: Feature

Signed-off-by: fluorinedog <fluorinedog@gmail.com>
2021-04-30 07:19:52 +00:00
FluorineDog
f39dcdb8f3
Support error code in segcore
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-03-26 16:18:30 +08:00
FluorineDog
66146223ca
Support flat
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-02-27 12:46:37 +08:00
cai.zhang
c35079d7e7
Update registerNode in indexservice
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2021-01-20 10:15:43 +08:00
GuoRentong
1104f059ee
Update doc:module interfaces
Signed-off-by: GuoRentong <rentong.guo@zilliz.com>
2021-01-13 11:08:03 +08:00
quicksilver
8e9d8e36e1
Update run_go_unittest.sh
Signed-off-by: quicksilver <zhifeng.zhang@zilliz.com>
2021-01-13 10:40:46 +08:00
FluorineDog
4cd42c553f
Rename field_name, make field_id strongly typed, skip multithread test
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-01-12 18:31:52 +08:00
FluorineDog
3bf205d9a8
Fix inner product
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-01-07 09:32:17 +08:00
neza2017
4015d7245d
Merge operation
Signed-off-by: neza2017 <yefu.chen@zilliz.com>
2021-01-06 14:45:50 +08:00
sunby
da6eeddbf8
Add flush timestamp
Signed-off-by: sunby <bingyi.sun@zilliz.com>
2021-01-06 13:14:38 +08:00
FluorineDog
5a26f6ef21
Enable sub_query_result
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-01-06 12:01:13 +08:00
XuanYang-cn
e6f726e73a
Add cache for thirdparty files cache
Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2020-12-08 18:51:07 +08:00
neza2017
70710dee47
Add parquet payload
Signed-off-by: neza2017 <yefu.chen@zilliz.com>
2020-12-05 16:11:03 +08:00
FluorineDog
6412ebc0d4
Add support of metric type in schema, enable binary vector, fix segfault
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2020-12-05 06:46:01 +08:00