1. Enable Milvus to read cipher configs
2. Enable cipher plugin in binlog reader and writer
3. Add a testCipher for unit tests
4. Support pooling for datanode
5. Add encryption in storagev2
See also: #40321
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
#42032
Also, fix the cacheoptfield method to work in storagev2.
Also, change the sparse-related interface for the knowhere version bump
(#43974).
Also, include https://github.com/milvus-io/milvus/pull/44046 to fix the
lost metric.
---------
Signed-off-by: chasingegg <chao.gao@zilliz.com>
Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/43917
1. Fix the ngram index being mistakenly used for unsupported operations
2. Fix a potential use-after-free (UAF) problem
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Ref https://github.com/milvus-io/milvus/issues/42148
This PR supports creating an index on a vector array field (currently
only for `DataType.FLOAT_VECTOR`) and searching on it.
The only index type supported in this PR is `EMB_LIST_HNSW`, and the only
metric type is `MAX_SIM`.
Usage:
```python
from pymilvus import MilvusClient, DataType

milvus_client = MilvusClient("xxx:19530")
schema = milvus_client.create_schema(enable_dynamic_field=True, auto_id=True)
...
struct_schema = milvus_client.create_struct_array_field_schema("struct_array_field")
...
struct_schema.add_field("struct_float_vec", DataType.ARRAY_OF_VECTOR, element_type=DataType.FLOAT_VECTOR, dim=128, max_capacity=1000)
...
schema.add_struct_array_field(struct_schema)
index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="struct_float_vec", index_type="EMB_LIST_HNSW", metric_type="MAX_SIM", index_params={"nlist": 128})
...
milvus_client.create_index(COLLECTION_NAME, schema=schema, index_params=index_params)
```
Note: This PR uses `Lims` to pass the offsets of each vector array to
knowhere. The vectors of multiple vector arrays are concatenated, so the
offsets are needed to tell which vectors belong to which vector array.
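The offset layout described in the note can be sketched in plain Python. This is only an illustration: the helper names `concat_with_lims` and `vectors_of` are hypothetical and not part of Milvus or knowhere, and the sketch assumes `lims` holds one more entry than the number of vector arrays, with array `i` occupying rows `lims[i]` up to (not including) `lims[i + 1]` of the concatenated data.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def concat_with_lims(
    vector_arrays: List[List[Vector]],
) -> Tuple[List[Vector], List[int]]:
    """Concatenate vector arrays and record where each one starts and ends.

    Hypothetical helper: the layout (len(lims) == number of arrays + 1) is an
    assumption made for this sketch.
    """
    flat: List[Vector] = []
    lims: List[int] = [0]
    for array in vector_arrays:
        flat.extend(array)
        lims.append(len(flat))  # end of this array, start of the next one
    return flat, lims


def vectors_of(flat: List[Vector], lims: List[int], i: int) -> List[Vector]:
    """Return the vectors that belong to the i-th vector array."""
    return flat[lims[i]:lims[i + 1]]


# Three rows, each holding a vector array of a different length.
rows = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6]],
    [[0.7, 0.8], [0.9, 1.0], [1.1, 1.2]],
]
flat, lims = concat_with_lims(rows)
second_row = vectors_of(flat, lims, 1)
```

Storing only the boundaries keeps the vector data contiguous while still letting the consumer recover each row's vectors with a single slice.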
---------
Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
Related to #43936
This PR:
- Use `folly::SharedMutex` instead of `std::shared_mutex` to prevent
starvation
- Use `folly::SharedMutex::WriteHolder/ReadHolder` instead of
`std::shared_lock` and `std::unique_lock` to get better performance
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Enables the compilation of SVE code for the bitset library if the C++
compiler supports it.
There are two conditions for enabling the SVE code:
* the C++ compiler must support the `-march=armv8-a+sve` flag
* the `arm_sve.h` header must be available
AFAIK, `gcc 7` does not support SVE, `gcc 8` and `gcc 9` support SVE
but have no `arm_sve.h` file, and only `gcc 10` has both.
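The two conditions and the gcc versions listed above can be summarized as a small decision table. The sketch below is illustrative only: `SveSupport` and `gcc_sve_support` are hypothetical names, the version thresholds simply restate the "AFAIK" claim above, and treating versions newer than `gcc 10` as also supporting both is an assumption. A real build would probe the compiler rather than rely on version numbers.

```python
from typing import NamedTuple


class SveSupport(NamedTuple):
    has_march_flag: bool  # compiler accepts -march=armv8-a+sve
    has_header: bool      # arm_sve.h is available

    @property
    def can_build_sve(self) -> bool:
        # Both conditions must hold before the SVE code is compiled.
        return self.has_march_flag and self.has_header


def gcc_sve_support(gcc_major: int) -> SveSupport:
    """Restate the gcc version notes as data (hypothetical helper)."""
    has_flag = gcc_major >= 8     # gcc 7 does not support SVE
    has_header = gcc_major >= 10  # gcc 8 and gcc 9 lack arm_sve.h
    return SveSupport(has_flag, has_header)


gcc9 = gcc_sve_support(9)
gcc10 = gcc_sve_support(10)
```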
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
The Out of Memory (OOM) error occurs because a handler retains the
entire ImportRecordBatch in memory. Consequently, even when child arrays
within the batch are flushed, the memory for the complete batch is not
released. We temporarily fixed this by deep copying the record batch in
#43724.
The proposed fix is to split the RecordBatch into smaller sub-batches by
column group. These sub-batches will be transferred via CGO, then
reassembled before being written to storage using the Storage V2 API.
This achieves zero-copy, since only references are transferred via CGO.
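The split-and-reassemble idea can be sketched with plain Python dictionaries standing in for record batches. This is not the actual Arrow or CGO code path: the function names `split_by_column_group` and `reassemble`, the column names, and the grouping are all hypothetical. The point of the sketch is that each sub-batch holds references to the original column objects, so no column data is copied.

```python
from typing import Dict, List

Batch = Dict[str, list]  # column name -> column data (stand-in for a record batch)


def split_by_column_group(batch: Batch, groups: List[List[str]]) -> List[Batch]:
    """Build one sub-batch per column group, sharing the original columns."""
    return [{name: batch[name] for name in group} for group in groups]


def reassemble(sub_batches: List[Batch]) -> Batch:
    """Merge sub-batches back into a single batch before writing."""
    merged: Batch = {}
    for sub in sub_batches:
        merged.update(sub)
    return merged


batch = {"pk": [1, 2, 3], "ts": [10, 11, 12], "vec": [[0.1], [0.2], [0.3]]}
groups = [["pk", "ts"], ["vec"]]

subs = split_by_column_group(batch, groups)
rebuilt = reassemble(subs)
```

Because each sub-batch references only its own columns, releasing one sub-batch no longer requires the whole original batch to stay alive.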
related: #43310
Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
Related to #43230
This PR:
- Move segcore setup function to `initcore` package to remove cgo
dependency from pkg
- Register core callback only for components that depend on segcore
- Rectify `UpdateLogLevel` implementation
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #43725
This patch adds an assertion that prevents a segment from reloading the
same field column. It also improves the message reported when a pk
already exists.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #43660
This patch drops the unwanted offset & ts entries that have the same
timestamp as the delete record. Under a large volume of upserts, these
false hits could significantly increase memory usage while applying
deletes.
A possible next step is to pass a callback to `search_pk_func_` so that
hit entries are handled in a streaming fashion.
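As a rough illustration of the filtering described above, the sketch below keeps only the hits whose insert timestamp is strictly earlier than the delete timestamp. It rests on an assumption not stated in this message: that an upsert writes its delete record and its new row with the same timestamp, so a hit with an equal timestamp is the freshly upserted row and should not be collected. The name `hits_to_delete` and the tuple layout are hypothetical, not the segcore API.

```python
from typing import List, Tuple

Hit = Tuple[int, int]  # (segment offset, insert timestamp)


def hits_to_delete(hits: List[Hit], delete_ts: int) -> List[Hit]:
    """Keep only rows inserted strictly before the delete record.

    Assumption: rows sharing the delete's timestamp come from the same
    upsert and are false hits.
    """
    return [(offset, ts) for offset, ts in hits if ts < delete_ts]


# Offset 0 is the old row; offset 5 is the row written by the same upsert.
hits = [(0, 100), (5, 200)]
kept = hits_to_delete(hits, delete_ts=200)
```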
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #43655
This patch adds padding when writing the mmap file for ScalarSortedIndex,
to avoid mmap failure caused by a zero mmap length.
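The same failure mode can be reproduced outside Milvus: Python's `mmap` module also refuses to map an empty file. The sketch below shows the padding idea under stated assumptions. The helper names, the single zero byte used as padding, and the choice to always append it are hypothetical and do not describe the actual ScalarSortedIndex file layout.

```python
import mmap
import os
import tempfile

PADDING = b"\0"  # hypothetical one-byte pad so the file is never empty


def write_with_padding(path: str, payload: bytes) -> int:
    """Write the payload followed by padding; return the real payload length."""
    with open(path, "wb") as f:
        f.write(payload)
        f.write(PADDING)
    return len(payload)


def map_payload(path: str, payload_len: int) -> bytes:
    """Map the file and read back only the payload, ignoring the padding."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:payload_len]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "index.bin")
    # An empty payload would otherwise produce a zero-length file.
    n = write_with_padding(path, b"")
    empty_result = map_payload(path, n)
    m2 = write_with_padding(path, b"abc")
    data_result = map_payload(path, m2)
```

The reader must track the real payload length separately, since the file size now includes the padding.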
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>