39 Commits

Author SHA1 Message Date
Spade A
c4f3f0ce4c
feat: impl StructArray -- support more types of vector in STRUCT (#44736)
ref: https://github.com/milvus-io/milvus/issues/42148

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-10-15 10:25:59 +08:00
Spade A
7cb15ef141
feat: impl StructArray -- optimize vector array serialization (#44035)
issue: https://github.com/milvus-io/milvus/issues/42148

Optimized from
Go VectorArray → VectorArray Proto → Binary → C++ VectorArray Proto →
C++ VectorArray local impl → Memory
to
Go VectorArray → Arrow ListArray  → Memory

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-09-03 16:39:53 +08:00
Spade A
911a8df17c
feat: impl StructArray -- data storage support in segcore (#42406)
Ref https://github.com/milvus-io/milvus/issues/42148
This PR mainly enables segcore to support arrays of vectors (read and
write, but not indexing). Currently, only float vector is supported as
the element type.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-06-12 14:38:35 +08:00
congqixia
b76478378a
feat: [Tiered] Make load list work as warmup hint (#42490)
Related to #42489
See also #41435

This PR's main target is to make the partial load field list work as a caching
layer warmup policy hint. If the user specifies a load field list, the fields
not included in the list shall use the `disabled` warmup policy and can be
lazily loaded if any read op uses them.

The major changes are listed here:
- Pass the load list to segcore when creating the collection & schema
- Add util functions to check whether a field shall be proactively loaded
- Adapt to storage v2 column groups, which may cause the hint to fail if
columns share the same group

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-04 10:28:32 +08:00
congqixia
b5443ddbd0
enhance: [AddField] Reopen loaded segments after AddField (#41529)
Related to #39718

This PR:
- Add reopen logic for growing & sealed segments
- Lazy reopen when schema version increases
- Add FinishLoad api for loading progress

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-04-26 08:48:39 +08:00
sthuang
1f1c836fb9
feat: Storage v2 growing segment load (#41001)
Support parallel loading of sealed and growing segments in storage v2
format by asynchronously reading row groups.
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-04-16 17:14:33 +08:00
smellthemoon
cb1e86e17c
enhance: support add field (#39800)
After this PR is merged, insert, upsert, build index, query, and search
are supported on the added field.
These operations are available on the added field only after the add field
request completes, which is a synchronous operation.

Compaction will be supported in the next PR.
#39718

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2025-04-02 14:24:31 +08:00
Jiquan Long
89bf226f0b
feat: support keyword text match (#35923)
fix: #35922

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-09-10 15:11:08 +08:00
zhagnlu
3107701fe8
enhance: optimize retrieve on dynamic field (#35580)
#35514

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
Co-authored-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-22 14:24:56 +08:00
smellthemoon
5616b7e8d2
enhance: support null in c data_datacodec and load null value (#32183)
1. Support reading and writing null in segcore.
    valid_data is stored in fieldData (using the uint8_t type to save memory).
2. Support loading null.
The binlog reader reads and writes data into the column (sealed segment) or
insertRecord (growing segment). In a sealed segment, valid_data is stored
directly. In a growing segment, considering the prior implementation and code
readability, uint8_t is converted to fbvector<bool>, which may be optimized in
the future.
3. Retrieve valid_data.
    Parse valid_data in search/query.
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-07-23 16:07:51 +08:00
Cai Yudong
246586be27
enhance: Unify data type check APIs under internal/core (#31800)
Issue: #22837 

Move and rename following C++ APIs:
datatype_sizeof() ==> GetDataTypeSize()
datatype_name() ==> GetDataTypeName()
datatype_is_vector() / IsVectorType() ==> IsVectorDataType()
datatype_is_variable() ==> IsVariableDataType()
datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType()
datatype_is_string() / IsString() ==> IsDataTypeString()
datatype_is_floating() / IsFloat() ==> IsDataTypeFloat()
datatype_is_binary() ==> IsDataTypeBinary()
datatype_is_json() ==> IsDataTypeJson()
datatype_is_array() ==> IsDataTypeArray()
datatype_is_variable() ==> IsDataTypeVariable()
datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger()

Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-02 19:15:14 +08:00
Buqian Zheng
070dfc77bf
feat: [Sparse Float Vector] segcore basics and index building (#30357)
This commit adds sparse float vector support to segcore with the
following:

1. data type enum declarations
2. Adds corresponding data structures for handling sparse float vectors
in various scenarios, including:
* FieldData as a bridge between the binlog and the in memory data
structures
* mmap::Column as the in memory representation of a sparse float vector
column of a sealed segment;
* ConcurrentVector as the in memory representation of a sparse float
vector of a growing segment which supports inserts.
3. Adds logic in payload reader/writer to serialize/deserialize from/to
binlog
4. Adds the ability to allow the index node to build sparse float vector
index
5. Adds the ability to allow the query node to build growing index for
growing segment and temp index for sealed segment without index built

This commit also includes some code cleanup, comment improvements, and
some unit tests for sparse vector.

https://github.com/milvus-io/milvus/issues/29419

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-11 14:45:02 +08:00
yah01
02c5a649cf
enhance: store system fields in segcore (#28524)
We need the system fields info for some use cases.
fix: #28523

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-21 09:28:22 +08:00
cai.zhang
a362bb1457
Support array datatype (#26369)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-09-19 14:23:23 +08:00
foxspy
6f4ed517de
add growing segment index (#23615)
Signed-off-by: xianliang <xianliang.li@zilliz.com>
2023-04-26 10:14:41 +08:00
yah01
bdd6bc7695
Re-format cpp code (#22513)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-02 15:55:49 +08:00
Cai Yudong
a001412e12
Replace faiss::MetricType with knowhere::MetricType (#17891)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-06-29 14:20:19 +08:00
xige-16
b5c11a216d
Alter varChar type params's name to max_length (#17409)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-06-07 15:58:06 +08:00
xige-16
515d0369de
Support string type in segcore (#16546)
Signed-off-by: xige-16 <xi.ge@zilliz.com>
Co-authored-by: dragondriver <jiquan.long@zilliz.com>
2022-04-29 13:35:49 +08:00
Cai Yudong
ca7f1c1038
[skip e2e] Reorder header files for common/Schema.cpp (#14808)
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-01-05 11:55:20 +08:00
zhenshan.cao
6099d4c55f
[skip e2e]Update license (#13669)
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2021-12-17 21:14:41 +08:00
FluorineDog
b1a9aea6a6
support get entity by ids in segcore (#5456)
Signed-off-by: fluorinedog <fluorinedog@gmail.com>
2021-05-28 10:39:30 +08:00
FluorineDog
1446cd5453 Fix flat unsupported bug
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-04-07 10:39:35 +08:00
FluorineDog
2cec04ed90 Fix empty schema proto hack
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-03-17 11:35:28 +08:00
FluorineDog
bf75c2fbb4 Fix bug (#1053)
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-02-25 17:56:43 +08:00
FluorineDog
92261e38c5 Add system property and optimize easy assert
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-01-18 13:56:20 +08:00
GuoRentong
1104f059ee Update doc:module interfaces
Signed-off-by: GuoRentong <rentong.guo@zilliz.com>
2021-01-13 11:08:03 +08:00
quicksilver
8e9d8e36e1 Update run_go_unittest.sh
Signed-off-by: quicksilver <zhifeng.zhang@zilliz.com>
2021-01-13 10:40:46 +08:00
FluorineDog
4cd42c553f Rename field_name, make field_id strongly typed, skip multithread test
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-01-12 18:31:52 +08:00
neza2017
4015d7245d Merge operation
Signed-off-by: neza2017 <yefu.chen@zilliz.com>
2021-01-06 14:45:50 +08:00
sunby
da6eeddbf8 Add flush timestamp
Signed-off-by: sunby <bingyi.sun@zilliz.com>
2021-01-06 13:14:38 +08:00
FluorineDog
5a26f6ef21 Enable sub_query_result
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2021-01-06 12:01:13 +08:00
bigsheeper
1ba47ac433 Check system field in segCore
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2021-01-04 10:13:01 +08:00
XuanYang-cn
e6f726e73a Add cache for thirdparty files
Signed-off-by: XuanYang-cn <xuan.yang@zilliz.com>
2020-12-08 18:51:07 +08:00
neza2017
70710dee47 Add parquet payload
Signed-off-by: neza2017 <yefu.chen@zilliz.com>
2020-12-05 16:11:03 +08:00
FluorineDog
6412ebc0d4 Add support of metric type in schema, enable binary vector, fix segfault
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2020-12-05 06:46:01 +08:00
cai.zhang
85a544b79b Add dockerfile for each component
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2020-11-28 19:06:48 +08:00
dragondriver
568cef0730 Fix InsertTask and SearchTask
Signed-off-by: dragondriver <jiquan.long@zilliz.com>
2020-11-28 10:48:29 +08:00
FluorineDog
c9fb34142c Enable primary_key switch
Signed-off-by: FluorineDog <guilin.gou@zilliz.com>
2020-11-28 09:16:00 +08:00