milvus/core/src/segment/Segment.cpp
groot 28eca5de38
wal implement (#3464)
2020-08-27 09:43:01 +08:00

373 lines
11 KiB
C++

// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

#include "segment/Segment.h"
#include "db/SnapshotUtils.h"
#include "db/snapshot/Snapshots.h"
#include "knowhere/index/vector_index/helpers/IndexParameter.h"
#include "utils/Log.h"

#include <algorithm>
#include <cstring>  // memcpy, memset
#include <functional>
#include <utility>

namespace milvus {
namespace engine {

const char* COLLECTIONS_FOLDER = "/collections";

Status
Segment::SetFields(int64_t collection_id) {
    snapshot::ScopedSnapshotT ss;
    STATUS_CHECK(snapshot::Snapshots::GetInstance().GetSnapshot(ss, collection_id));

    auto& fields = ss->GetResources<snapshot::Field>();
    for (auto& kv : fields) {
        const snapshot::FieldPtr& field = kv.second.Get();
        STATUS_CHECK(AddField(field));
    }

    return Status::OK();
}

Status
Segment::AddField(const snapshot::FieldPtr& field) {
    if (field == nullptr) {
        return Status(DB_ERROR, "Field is null pointer");
    }

    std::string name = field->GetName();
    auto ftype = static_cast<DataType>(field->GetFtype());
    if (IsVectorField(field)) {
        json params = field->GetParams();
        if (params.find(knowhere::meta::DIM) == params.end()) {
            std::string msg = "Vector field params must contain: dimension";
            LOG_SERVER_ERROR_ << msg;
            return Status(DB_ERROR, msg);
        }

        // width of one vector in bytes: one bit per dimension for binary vectors, one float otherwise
        int64_t field_width = 0;
        int64_t dimension = params[knowhere::meta::DIM];
        if (ftype == DataType::VECTOR_BINARY) {
            field_width += (dimension / 8);
        } else {
            field_width += (dimension * sizeof(float));
        }

        // propagate failures (e.g. a duplicate field) instead of dropping the returned status
        return AddField(name, ftype, field_width);
    }

    return AddField(name, ftype);
}

Status
Segment::AddField(const std::string& field_name, DataType field_type, int64_t field_width) {
    if (field_types_.find(field_name) != field_types_.end()) {
        return Status(DB_ERROR, "duplicate field: " + field_name);
    }

    // width in bytes of a single value of this field
    int64_t real_field_width = 0;
    switch (field_type) {
        case DataType::BOOL:
            real_field_width = sizeof(bool);
            break;
        case DataType::DOUBLE:
            real_field_width = sizeof(double);
            break;
        case DataType::FLOAT:
            real_field_width = sizeof(float);
            break;
        case DataType::INT8:
            real_field_width = sizeof(int8_t);
            break;
        case DataType::INT16:
            real_field_width = sizeof(int16_t);
            break;
        case DataType::INT32:
            real_field_width = sizeof(int32_t);
            break;
        case DataType::INT64:
            real_field_width = sizeof(int64_t);
            break;
        case DataType::VECTOR_FLOAT:
        case DataType::VECTOR_BINARY: {
            // vector fields have no intrinsic width, the caller must supply it
            if (field_width <= 0) {
                std::string msg = "vector field dimension required: " + field_name;
                LOG_SERVER_ERROR_ << msg;
                return Status(DB_ERROR, msg);
            }

            real_field_width = field_width;
            break;
        }
        default:
            break;
    }

    field_types_.insert(std::make_pair(field_name, field_type));
    fixed_fields_width_.insert(std::make_pair(field_name, real_field_width));

    return Status::OK();
}

Status
Segment::AddChunk(const DataChunkPtr& chunk_ptr) {
    if (chunk_ptr == nullptr || chunk_ptr->count_ == 0) {
        return Status(DB_ERROR, "invalid input");
    }

    return AddChunk(chunk_ptr, 0, chunk_ptr->count_);
}

Status
Segment::AddChunk(const DataChunkPtr& chunk_ptr, int64_t from, int64_t to) {
    if (chunk_ptr == nullptr || from < 0 || to < 0 || from > chunk_ptr->count_ || to > chunk_ptr->count_ ||
        from >= to) {
        return Status(DB_ERROR, "invalid input");
    }

    // check input
    for (auto& iter : chunk_ptr->fixed_fields_) {
        if (iter.second == nullptr) {
            return Status(DB_ERROR, "illegal field: " + iter.first);
        }

        auto width_iter = fixed_fields_width_.find(iter.first);
        if (width_iter == fixed_fields_width_.end()) {
            return Status(DB_ERROR, "field not yet defined: " + iter.first);
        }

        if (iter.second->Size() != width_iter->second * chunk_ptr->count_) {
            return Status(DB_ERROR, "illegal field: " + iter.first);
        }
    }

    // consume
    AppendChunk(chunk_ptr, from, to);

    return Status::OK();
}

Status
Segment::Reserve(const std::vector<std::string>& field_names, int64_t count) {
    if (count <= 0) {
        return Status(DB_ERROR, "Invalid input for segment resize");
    }

    if (field_names.empty()) {
        // no names given: resize every known fixed-width field
        for (auto& width_iter : fixed_fields_width_) {
            int64_t resize_bytes = count * width_iter.second;

            auto& data = fixed_fields_[width_iter.first];
            if (data == nullptr) {
                data = std::make_shared<BinaryData>();
            }
            data->data_.resize(resize_bytes);
        }
    } else {
        for (const auto& name : field_names) {
            auto iter_width = fixed_fields_width_.find(name);
            if (iter_width == fixed_fields_width_.end()) {
                return Status(DB_ERROR, "Invalid input for segment resize");
            }

            int64_t resize_bytes = count * iter_width->second;

            auto& data = fixed_fields_[name];
            if (data == nullptr) {
                data = std::make_shared<BinaryData>();
            }
            data->data_.resize(resize_bytes);
        }
    }

    return Status::OK();
}

Status
Segment::AppendChunk(const DataChunkPtr& chunk_ptr, int64_t from, int64_t to) {
    if (chunk_ptr == nullptr || from < 0 || to < 0 || from > to) {
        return Status(DB_ERROR, "Invalid input for segment append");
    }

    int64_t add_count = to - from;
    if (add_count == 0) {
        add_count = 1;  // from == to is treated as appending the single row at that index
    }

    for (auto& width_iter : fixed_fields_width_) {
        auto input = chunk_ptr->fixed_fields_.find(width_iter.first);
        if (input == chunk_ptr->fixed_fields_.end()) {
            continue;  // the chunk carries no data for this field
        }

        auto& data = fixed_fields_[width_iter.first];
        if (data == nullptr) {
            // first data for this field: the chunk's whole buffer is adopted, regardless of from/to
            fixed_fields_[width_iter.first] = input->second;
            continue;
        }

        int64_t origin_bytes = static_cast<int64_t>(data->data_.size());
        int64_t add_bytes = add_count * width_iter.second;
        int64_t previous_bytes = row_count_ * width_iter.second;
        int64_t target_bytes = previous_bytes + add_bytes;
        if (origin_bytes < target_bytes) {
            data->data_.resize(target_bytes);
        }
        if (input == chunk_ptr->fixed_fields_.end()) {
            // this field is not provided, pad with zeros
            // (unreachable as written, because missing fields are skipped by the early continue above)
            memset(data->data_.data() + origin_bytes, 0, target_bytes - origin_bytes);
        } else {
            // pad the gap before the new rows with zeros
            if (origin_bytes < previous_bytes) {
                memset(data->data_.data() + origin_bytes, 0, previous_bytes - origin_bytes);
            }
            // copy input into this field
            memcpy(data->data_.data() + previous_bytes, input->second->data_.data() + from * width_iter.second,
                   add_bytes);
        }
    }

    row_count_ += add_count;

    return Status::OK();
}

Status
Segment::DeleteEntity(std::vector<offset_t>& offsets) {
    if (offsets.empty()) {
        return Status::OK();
    }

    // sort offsets in descending order so that erasing a row never shifts a row still to be erased
    std::sort(offsets.begin(), offsets.end(), std::greater<>());
    // drop duplicate offsets, otherwise the same offset would erase two rows and be counted twice
    offsets.erase(std::unique(offsets.begin(), offsets.end()), offsets.end());

    // delete entity data from max offset to min offset
    for (auto& pair : fixed_fields_) {
        int64_t width = fixed_fields_width_[pair.first];
        if (width == 0 || pair.second == nullptr) {
            continue;
        }

        auto& data = pair.second;
        for (auto offset : offsets) {
            if (offset >= 0 && offset < row_count_) {
                auto step = offset * width;
                data->data_.erase(data->data_.begin() + step, data->data_.begin() + step + width);
            }
        }
    }

    // reset row count
    for (auto offset : offsets) {
        if (offset >= 0 && offset < row_count_) {
            row_count_--;
        }
    }

    return Status::OK();
}

Status
Segment::GetFieldType(const std::string& field_name, DataType& type) {
    auto iter = field_types_.find(field_name);
    if (iter == field_types_.end()) {
        return Status(DB_ERROR, "invalid field name: " + field_name);
    }

    type = iter->second;
    return Status::OK();
}

Status
Segment::GetFixedFieldWidth(const std::string& field_name, int64_t& width) {
    auto iter = fixed_fields_width_.find(field_name);
    if (iter == fixed_fields_width_.end()) {
        return Status(DB_ERROR, "invalid field name: " + field_name);
    }

    width = iter->second;
    return Status::OK();
}

Status
Segment::GetFixedFieldData(const std::string& field_name, BinaryDataPtr& data) {
    auto iter = fixed_fields_.find(field_name);
    if (iter == fixed_fields_.end()) {
        return Status(DB_ERROR, "invalid field name: " + field_name);
    }

    data = iter->second;
    return Status::OK();
}

Status
Segment::SetFixedFieldData(const std::string& field_name, BinaryDataPtr& data) {
    if (data == nullptr) {
        return Status(DB_ERROR, "Could not set null pointer");
    }

    int64_t width = 0;
    auto status = GetFixedFieldWidth(field_name, width);
    if (!status.ok()) {
        return status;
    }

    fixed_fields_[field_name] = data;
    if (row_count_ == 0 && width != 0) {
        row_count_ = data->Size() / width;
    }
    return Status::OK();
}

Status
Segment::GetVectorIndex(const std::string& field_name, knowhere::VecIndexPtr& index) {
    index = nullptr;
    auto iter = vector_indice_.find(field_name);
    if (iter == vector_indice_.end()) {
        return Status(DB_ERROR, "Invalid field name: " + field_name);
    }

    index = iter->second;
    return Status::OK();
}

Status
Segment::SetVectorIndex(const std::string& field_name, const knowhere::VecIndexPtr& index) {
    vector_indice_[field_name] = index;
    return Status::OK();
}

Status
Segment::GetStructuredIndex(const std::string& field_name, knowhere::IndexPtr& index) {
    index = nullptr;
    auto iter = structured_indice_.find(field_name);
    if (iter == structured_indice_.end()) {
        return Status(DB_ERROR, "invalid field name: " + field_name);
    }

    index = iter->second;
    return Status::OK();
}

Status
Segment::SetStructuredIndex(const std::string& field_name, const knowhere::IndexPtr& index) {
    structured_indice_[field_name] = index;
    return Status::OK();
}

} // namespace engine
} // namespace milvus
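
The row-deletion idea in Segment::DeleteEntity can be tried in isolation with the minimal sketch below. It is not Milvus code: `EraseRows` is a hypothetical helper that works on a plain `std::vector<uint8_t>` instead of `BinaryData`, and it assumes every row occupies exactly `width` bytes. Offsets are sorted in descending order so that erasing one row never shifts a row that still has to be erased, and duplicate offsets are dropped first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper (not part of Milvus): erases fixed-width rows from a flat byte buffer.
// Returns the number of rows actually removed.
int64_t
EraseRows(std::vector<uint8_t>& buffer, int64_t width, std::vector<int64_t> offsets) {
    assert(width > 0);
    const int64_t row_count = static_cast<int64_t>(buffer.size()) / width;

    // Descending order: rows below the current offset are still at their original position.
    std::sort(offsets.begin(), offsets.end(), std::greater<>());
    // Drop duplicates so the same row is not erased twice.
    offsets.erase(std::unique(offsets.begin(), offsets.end()), offsets.end());

    int64_t removed = 0;
    for (int64_t offset : offsets) {
        if (offset < 0 || offset >= row_count) {
            continue;  // ignore out-of-range offsets
        }
        auto first = buffer.begin() + offset * width;
        buffer.erase(first, first + width);
        ++removed;
    }
    return removed;
}
```

Each erase shifts the tail of the buffer, so the cost grows with both the buffer size and the number of offsets. For large batches, a single compaction pass that copies the surviving rows forward may be the better choice.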