milvus/tests/milvus_python_test/test_search_by_id.py
Commit dab74700b2 (author: Jin Hai)
Delete and WAL feature branch merge (#1436)
* add read/write lock

* change compact to ddl queue

* add api to get vector data

* add flush / merge / compact lock

* add api to get vector data

* add data size for table info

* add db recovery test

* add data_size check

* change file name to uppercase

Signed-off-by: jinhai <hai.jin@zilliz.com>

* update wal flush_merge_compact_mutex_

* update wal flush_merge_compact_mutex_

* change requirement

* change requirement

* upd requirement

* add logging

* add logging

* add logging

* add logging

* add logging

* add logging

* add logging

* add logging

* add logging

* delete part

* add all size checks

* fix bug

* update faiss get_vector_by_id

* add get_vector case

* update get vector by id

* update server

* fix DBImpl

* attempting to fix #1268

* lint

* update unit test

* fix #1259

* issue 1271 fix wal config

* update

* fix cases

Signed-off-by: del.zhenwu <zhenxiang.li@zilliz.com>

* update read / write error message

* update read / write error message

* [skip ci] get vectors by id from raw files instead of faiss

* [skip ci] update FilesByType meta

* update

* fix ci error

* update

* lint

* Hide partition_name parameter

* Remove douban pip source

Signed-off-by: zhenwu <zw@zilliz.com>

* Update epsilon value in test cases

Signed-off-by: zhenwu <zw@zilliz.com>

* Add default partition

* Caiyd crud (#1313)

* fix clang format

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* fix unittest build error

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* add faiss_bitset_test

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* avoid user directly operate partition table

* fix HasTable bug

* Caiyd crud (#1323)

* fix clang format

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* fix unittest build error

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* use compile option -O3

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* update faiss_bitset_test.cpp

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* change open flags

* change OngoingFileChecker to static instance

* mark ongoing files when applying deletes

* update clean up with ttl

* fix centos ci

* update

* lint

* update partition

Signed-off-by: zhenwu <zw@zilliz.com>

* update delete and flush to include partitions

* update

* Update cases

Signed-off-by: zhenwu <zw@zilliz.com>

* Fix test cases crud (#1350)

* fix order

* add wal case

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix wal case

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix wal case

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix wal case

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix invalid operation issue

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix invalid operation issue

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix bug

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix bug

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* crud fix

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* crud fix

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* add table info test cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* merge cases

Signed-off-by: zhenwu <zw@zilliz.com>

* Shengjun (#1349)

* Add GPU sharing solution on native Kubernetes  (#1102)

* run hadolint with reviewdog

* add LICENSE in Dockerfile

* run hadolint with reviewdog

* Reporter of reviewdog command is "github-pr-check"

* format Dockerfile

* ignore DL3007 in hadolint

* clean up old docker images

* Add GPU sharing solution on native Kubernetes

* nightly test mailer

* Fix http server bug (#1096)

* refactoring(create_table done)

* refactoring

* refactor server delivery (insert done)

* refactoring server module (count_table done)

* server refactor done

* cmake pass

* refactor server module done.

* set grpc response status correctly

* format done.

* fix redefine ErrorMap()

* optimize insert by reducing ids data copy

* optimize grpc request by reducing data copy

* clang format

* [skip ci] Refactor server module done. update changelog. prepare for PR

* remove explicit and change int32_t to int64_t

* add web server

* [skip ci] add license in web module

* modify header include & comment oatpp environment config

* add port configure & create table in handler

* modify web url

* simple url compilation done & add swagger

* make sure web url

* web functionality done. debugging

* add web unittest

* web test pass

* add web server port

* add web server port in template

* update unittest cmake file

* change web server default port to 19121

* rename method in web module & unittest pass

* add search case in unittest for web module

* rename some variables

* fix bug

* unittest pass

* web prepare

* fix cmd bug(check server status)

* update changelog

* add web port validate & default set

* clang-format pass

* add web port test in unittest

* add CORS & redirect root to swagger ui

* add web status

* web table method func cascade test pass

* add config url in web module

* modify thirdparty cmake to avoid building oatpp test

* clang format

* update changelog

* add constants in web module

* reserve Config.cpp

* fix constants reference bug

* replace web server with async module

* modify component to support async

* format

* developing controller & add test client into unittest

* add web port into demo/server_config

* modify thirdparty cmake to allow build test

* remove unnecessary comment

* add endpoint info in controller

* finish web test(bug here)

* clang format

* add web test cpp to lint exclusions

* check null field in GetConfig

* add macro RETURN STATUS DTo

* fix cmake conflict

* fix crash when exit server

* remove surplus comments & add http param check

* add uri /docs to direct swagger

* format

* change cmd to system

* add default value & unittest in web module

* add macros to judge if GPU supported

* add macros in unit & add default in index dto & print error message when bind http port fail

* format (fix #788)

* fix cors bug (not completed)

* comment cors

* change web framework to simple api

* comments optimize

* change to simple API

* remove comments in controller.hpp

* remove EP_COMMON_CMAKE_ARGS in oatpp and oatpp-swagger

* add ep cmake args to sqlite

* clang-format

* change a format

* test pass

* change name to

* fix compiler issue (oatpp-swagger depends on oatpp)

* add & in start_server.h

* specify lib location with oatpp and oatpp-swagger

* add comments

* add swagger definition

* [skip ci] change http method options status code

* remove oatpp swagger(fix #970)

* remove comments

* check Start web behavior

* add default to cpu_cache_capacity

* remove swagger component.hpp & /docs url

* remove /docs info

* remove /docs in unittest

* remove space in test rpc

* remove repeated info in CHANGELOG

* change cache_insert_data default value as a constant

* [skip ci] Fix some broken links (#960)

* [skip ci] Fix broken link

* [skip ci] Fix broken link

* [skip ci] Fix broken link

* [skip ci] Fix broken links

* fix issue 373 (#964)

* fix issue 373

* Adjustment format

* Adjustment format

* Adjustment format

* change readme

* #966 update NOTICE.md (#967)

* remove comments

* check Start web behavior

* add default to cpu_cache_capacity

* remove swagger component.hpp & /docs url

* remove /docs info

* remove /docs in unittest

* remove space in test rpc

* remove repeated info in CHANGELOG

* change cache_insert_data default value as a constant

* adjust web port config place

* rename web_port variable

* change gpu resources invoke way to cmd()

* set advanced config name add DEFAULT

* change config setting to cmd

* modify ..

* optimize code

* assign TableDto's count default value 0 (fix #995)

* check if table exists when show partitions (fix #1028)

* check table exists when drop partition (fix #1029)

* check if partition name is legal (fix #1022)

* modify status code when partition tag is illegal

* update changelog

* add info to /system url

* add binary index and add bin uri & handler method(not completed)

* optimize http insert and search time(fix #1066) | add binary vectors support(fix #1067)

* fix test partition bug

* fix test bug when check insert records

* add binary vectors test

* add default for offset and page_size

* fix unittest bug

* [skip ci] remove comments

* optimize web code for PR comments

* add new folder named utils

* check offset and page_size (fix #1082)

* improve error message if offset or page_size is not legal (fix #1075)

* add log into web module

* update changelog

* check gpu resources setting when assigning repeated value (fix #990)

* update changelog

* clang-format pass

* add default handler in http handler

* [skip ci] improve error msg when check gpu resources

* change check offset way

* remove func IsIntStr

* add case

* change int32 to int64 when check number str

* add log in web module (doing)

* update test case

* add log in web controller

Co-authored-by: jielinxu <52057195+jielinxu@users.noreply.github.com>
Co-authored-by: JackLCL <53512883+JackLCL@users.noreply.github.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>

* Filtering for specific paths in Jenkins CI  (#1107)

* run hadolint with reviewdog

* add LICENSE in Dockerfile

* run hadolint with reviewdog

* Reporter of reviewdog command is "github-pr-check"

* format Dockerfile

* ignore DL3007 in hadolint

* clean up old docker images

* Add GPU sharing solution on native Kubernetes

* nightly test mailer

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Fix Filtering for specific paths in Jenkins CI bug (#1109)

* run hadolint with reviewdog

* add LICENSE in Dockerfile

* run hadolint with reviewdog

* Reporter of reviewdog command is "github-pr-check"

* format Dockerfile

* ignore DL3007 in hadolint

* clean up old docker images

* Add GPU sharing solution on native Kubernetes

* nightly test mailer

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Fix Filtering for specific paths in Jenkins CI bug (#1110)

* run hadolint with reviewdog

* add LICENSE in Dockerfile

* run hadolint with reviewdog

* Reporter of reviewdog command is "github-pr-check"

* format Dockerfile

* ignore DL3007 in hadolint

* clean up old docker images

* Add GPU sharing solution on native Kubernetes

* nightly test mailer

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Don't skip ci when triggered by a timer (#1113)

* run hadolint with reviewdog

* add LICENSE in Dockerfile

* run hadolint with reviewdog

* Reporter of reviewdog command is "github-pr-check"

* format Dockerfile

* ignore DL3007 in hadolint

* clean up old docker images

* Add GPU sharing solution on native Kubernetes

* nightly test mailer

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Don't skip ci when triggered by a timer

* Don't skip ci when triggered by a timer

* Set default sending to Milvus Dev mail group  (#1121)

* run hadolint with reviewdog

* add LICENSE in Dockerfile

* run hadolint with reviewdog

* Reporter of reviewdog command is "github-pr-check"

* format Dockerfile

* ignore DL3007 in hadolint

* clean up old docker images

* Add GPU sharing solution on native Kubernetes

* nightly test mailer

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Test filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Filtering for specific paths in Jenkins CI

* Don't skip ci when triggered by a timer

* Don't skip ci when triggered by a timer

* Set default sending to Milvus Dev

* Support hnsw (#1131)

* add hnsw

* add config

* format...

* format..

* Remove test.template (#1129)

* Update framework

* remove files

* Remove files

* Remove ann-acc cases && Update java-sdk cases

* change cn to en

* [skip ci] remove doc test

* [skip ci] change cn to en

* Case stability

* Add mail notification when test failed

* Add mail notification

* Add mail notification

* gen milvus instance from utils

* Disable case with multiprocess

* Add mail notification when nightly test failed

* add milvus handler param

* add http handler

* Remove test.template

Co-authored-by: quicksilver <zhifeng.zhang@zilliz.com>

* Add doc for the RESTful API / Update contributor number in Milvus readme (#1100)

* [skip ci] Update contributor number.

* [skip ci] Add RESTful API doc.

* [skip ci] Some updates.

* [skip ci] Change port to 19121.

* [skip ci] Update README.md.

Update the descriptions for OPTIONS.

* Update README.md

Fix a typo.

* #1105 update error message when creating IVFSQ8H index without GPU resources (#1117)

* [skip ci] Update README (#1104)

* remove Nvidia owned files from faiss (#1136)

* #1135 remove Nvidia owned files from faiss

* Revert "#1135 remove Nvidia owned files from faiss"

This reverts commit 3bc007c28c8df5861fdd0452fd64c0e2e719eda2.

* #1135 remove Nvidia API implementation

* #1135 remove Nvidia owned files from faiss

* Update CODE_OF_CONDUCT.md (#1163)

* Improve codecov (#1095)

* Optimize config test. Dir src/config 99% lines covered

* add unittest coverage

* optimize cache&config unittest

* code format

* format

* format code

* fix merge conflict

* cover src/utils unittest

* #831 fix exe_path judge error

* #831 fix exe_path judge error

* add some unittest coverage

* add some unittest coverage

* improve coverage of src/wrapper

* improve src/wrapper coverage

* *test optimize db/meta unittest

* fix bug

* *test optimize mysqlMetaImpl unittest

* *style: format code

* improve server & scheduler unittest coverage

* handover next work

* *test: add some test_meta test case

* *format code

* *fix: fix typo

* feat(codecov): improve code coverage for src/db(#872)

* feat(codecov): improve code coverage for src/db/engine(#872)

* feat(codecov): improve code coverage(#872)

* fix config unittest bug

* feat(codecov): improve code coverage core/db/engine(#872)

* feat(codecov): improve code coverage core/knowhere

* feat(codecov): improve code coverage core/knowhere

* feat(codecov): improve code coverage

* feat(codecov): fix cpu test some error

* feat(codecov): improve code coverage

* feat(codecov): rename some fiu

* fix(db/meta): fix switch/case default action

* feat(codecov): improve code coverage(#872)
* fix error caused by merge code
* format code

* feat(codecov): improve code coverage & format code(#872)

* feat(codecov): fix test error(#872)

* feat(codecov): fix unittest test_mem(#872)

* feat(codecov): fix unittest(#872)

* feat(codecov): fix unittest for resource manager(#872)

* feat(codecov): code format (#872)

* feat(codecov): trigger ci(#872)

* fix(RequestScheduler): remove a wrong sleep statement

* test(test_rpc): fix rpc test

* Fix format issue

* Remove unused comments

* Fix unit test error

Co-authored-by: ABNER-1 <ABNER-1@users.noreply.github.com>
Co-authored-by: Jin Hai <hai.jin@zilliz.com>

* Support run dev test with http handler in python SDK (#1116)

* refactoring(create_table done)

* refactoring

* refactor server delivery (insert done)

* refactoring server module (count_table done)

* server refactor done

* cmake pass

* refactor server module done.

* set grpc response status correctly

* format done.

* fix redefine ErrorMap()

* optimize insert by reducing ids data copy

* optimize grpc request by reducing data copy

* clang format

* [skip ci] Refactor server module done. update changelog. prepare for PR

* remove explicit and change int32_t to int64_t

* add web server

* [skip ci] add license in web module

* modify header include & comment oatpp environment config

* add port configure & create table in handler

* modify web url

* simple url compilation done & add swagger

* make sure web url

* web functionality done. debugging

* add web unittest

* web test pass

* add web server port

* add web server port in template

* update unittest cmake file

* change web server default port to 19121

* rename method in web module & unittest pass

* add search case in unittest for web module

* rename some variables

* fix bug

* unittest pass

* web prepare

* fix cmd bug(check server status)

* update changelog

* add web port validate & default set

* clang-format pass

* add web port test in unittest

* add CORS & redirect root to swagger ui

* add web status

* web table method func cascade test pass

* add config url in web module

* modify thirdparty cmake to avoid building oatpp test

* clang format

* update changelog

* add constants in web module

* reserve Config.cpp

* fix constants reference bug

* replace web server with async module

* modify component to support async

* format

* developing controller & add test client into unittest

* add web port into demo/server_config

* modify thirdparty cmake to allow build test

* remove unnecessary comment

* add endpoint info in controller

* finish web test(bug here)

* clang format

* add web test cpp to lint exclusions

* check null field in GetConfig

* add macro RETURN STATUS DTo

* fix cmake conflict

* fix crash when exit server

* remove surplus comments & add http param check

* add uri /docs to direct swagger

* format

* change cmd to system

* add default value & unittest in web module

* add macros to judge if GPU supported

* add macros in unit & add default in index dto & print error message when bind http port fail

* format (fix #788)

* fix cors bug (not completed)

* comment cors

* change web framework to simple api

* comments optimize

* change to simple API

* remove comments in controller.hpp

* remove EP_COMMON_CMAKE_ARGS in oatpp and oatpp-swagger

* add ep cmake args to sqlite

* clang-format

* change a format

* test pass

* change name to

* fix compiler issue (oatpp-swagger depends on oatpp)

* add & in start_server.h

* specify lib location with oatpp and oatpp-swagger

* add comments

* add swagger definition

* [skip ci] change http method options status code

* remove oatpp swagger(fix #970)

* remove comments

* check Start web behavior

* add default to cpu_cache_capacity

* remove swagger component.hpp & /docs url

* remove /docs info

* remove /docs in unittest

* remove space in test rpc

* remove repeated info in CHANGELOG

* change cache_insert_data default value as a constant

* [skip ci] Fix some broken links (#960)

* [skip ci] Fix broken link

* [skip ci] Fix broken link

* [skip ci] Fix broken link

* [skip ci] Fix broken links

* fix issue 373 (#964)

* fix issue 373

* Adjustment format

* Adjustment format

* Adjustment format

* change readme

* #966 update NOTICE.md (#967)

* remove comments

* check Start web behavior

* add default to cpu_cache_capacity

* remove swagger component.hpp & /docs url

* remove /docs info

* remove /docs in unittest

* remove space in test rpc

* remove repeated info in CHANGELOG

* change cache_insert_data default value as a constant

* adjust web port config place

* rename web_port variable

* change gpu resources invoke way to cmd()

* set advanced config name add DEFAULT

* change config setting to cmd

* modify ..

* optimize code

* assign TableDto's count default value 0 (fix #995)

* check if table exists when show partitions (fix #1028)

* check table exists when drop partition (fix #1029)

* check if partition name is legal (fix #1022)

* modify status code when partition tag is illegal

* update changelog

* add info to /system url

* add binary index and add bin uri & handler method(not completed)

* optimize http insert and search time(fix #1066) | add binary vectors support(fix #1067)

* fix test partition bug

* fix test bug when check insert records

* add binary vectors test

* add default for offset and page_size

* fix unittest bug

* [skip ci] remove comments

* optimize web code for PR comments

* add new folder named utils

* check offset and page_size (fix #1082)

* improve error message if offset or page_size is not legal (fix #1075)

* add log into web module

* update changelog

* check gpu resources setting when assigning repeated value (fix #990)

* update changelog

* clang-format pass

* add default handler in http handler

* [skip ci] improve error msg when check gpu resources

* change check offset way

* remove func IsIntStr

* add case

* change int32 to int64 when check number str

* add log in web module (doing)

* update test case

* add log in web controller

* remove surplus dot

* add preload into /system/

* change get_milvus() to get_milvus(args['handler'])

* support load table into memory with http server (fix #1115)

* [skip ci] comment surplus dto in VectorDto

Co-authored-by: jielinxu <52057195+jielinxu@users.noreply.github.com>
Co-authored-by: JackLCL <53512883+JackLCL@users.noreply.github.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>

* Fix #1140 (#1162)

* fix

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* update...

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* fix2

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* fix3

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* update changelog

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* Update INSTALL.md (#1175)

* Update INSTALL.md

1. Change image tag and Milvus source code to latest.
2. Fix a typo

Signed-off-by: Lu Wang <yamasite@qq.com>

* Update INSTALL.md

Signed-off-by: lu.wang <yamasite@qq.com>

* add Tanimoto ground truth (#1138)

* add milvus ground truth

* add milvus groundtruth

* [skip ci] add milvus ground truth

* [skip ci]add tanimoto ground truth

* fix mix case bug (#1208)

* fix mix case bug

Signed-off-by: del.zhenwu <zhenxiang.li@zilliz.com>

* Remove case.md

Signed-off-by: del.zhenwu <zhenxiang.li@zilliz.com>

* Update README.md (#1206)

Add LFAI mailing lists.

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Add design.md to store links to design docs (#1219)

* Update README.md

Add link to Milvus design docs

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Create design.md

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Update design.md

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Add troubleshooting info about libmysqlpp.so.3 error (#1225)

* Update INSTALL.md

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Update INSTALL.md

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Update README.md (#1233)

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* #1240 Update license declaration of each file (#1241)

* #1240 Update license declaration of each files

Signed-off-by: jinhai <hai.jin@zilliz.com>

* #1240 Update CHANGELOG

Signed-off-by: jinhai <hai.jin@zilliz.com>

* Update README.md (#1258)

Add Jenkins master badge.

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Update INSTALL.md (#1265)

Fix indentation.

* support CPU profiling (#1251)

* #1250 support CPU profiling

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* #1250 fix code coverage

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* Fix HNSW crash (#1262)

* fix

Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>

* update.

Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>

* Add troubleshooting information for INSTALL.md and enhance readability (#1274)

* Update INSTALL.md

1. Add new troubleshooting message;
2. Enhance readability.

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Update INSTALL.md

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Update INSTALL.md

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Update INSTALL.md

Add CentOS link.

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* Create COMMUNITY.md (#1292)

Signed-off-by: Lutkin Wang <yamasite@qq.com>

* fix gtest

* add copyright

* fix gtest

* MERGE_NOT_YET

* fix lint

Co-authored-by: quicksilver <zhifeng.zhang@zilliz.com>
Co-authored-by: BossZou <40255591+BossZou@users.noreply.github.com>
Co-authored-by: jielinxu <52057195+jielinxu@users.noreply.github.com>
Co-authored-by: JackLCL <53512883+JackLCL@users.noreply.github.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
Co-authored-by: Tinkerrr <linxiaojun.cn@outlook.com>
Co-authored-by: del-zhenwu <56623710+del-zhenwu@users.noreply.github.com>
Co-authored-by: Lutkin Wang <yamasite@qq.com>
Co-authored-by: shengjh <46514371+shengjh@users.noreply.github.com>
Co-authored-by: ABNER-1 <ABNER-1@users.noreply.github.com>
Co-authored-by: Jin Hai <hai.jin@zilliz.com>
Co-authored-by: shiyu22 <cshiyu22@gmail.com>

* #1302 Get all record IDs in a segment given a segment id

* Remove query time ranges

Signed-off-by: zhenwu <zw@zilliz.com>

* #1295 enable wal by default

* fix cases

Signed-off-by: zhenwu <zw@zilliz.com>

* fix partition cases

Signed-off-by: zhenwu <zw@zilliz.com>

* [skip ci] update test_db

* update

* fix case bug

Signed-off-by: zhenwu <zw@zilliz.com>

* lint

* fix test case failures

* remove some code

* Caiyd crud 1 (#1377)

* fix clang format

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* fix unittest build error

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* fix build issue when enable profiling

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* fix hashtable bug

* update bloom filter

* update

* benchmark

* update benchmark

* update

* update

* remove wal record size

Signed-off-by: shengjun.li <shengjun.li@zilliz.com>

* remove wal record size config

Signed-off-by: shengjun.li <shengjun.li@zilliz.com>

* update apply deletes: switch to binary search

* update sdk_simple

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* update apply deletes: switch to binary search
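  The "switch to binary search" when applying deletes can be sketched as below. This is an illustrative Python sketch only, not the actual Milvus C++ implementation; `apply_deletes`, `segment_ids`, and `deleted_ids` are hypothetical names. The idea is that with segment ids kept sorted, each deleted id is located in O(log n) instead of a linear scan:

  ```python
  import bisect

  def apply_deletes(segment_ids, deleted_ids):
      """Return offsets of deleted ids within a sorted segment id list.

      segment_ids must be sorted ascending; each lookup uses binary
      search (bisect_left) instead of an O(n) scan per deleted id.
      """
      deleted_offsets = []
      for did in deleted_ids:
          pos = bisect.bisect_left(segment_ids, did)
          if pos < len(segment_ids) and segment_ids[pos] == did:
              deleted_offsets.append(pos)
      return deleted_offsets

  # Example: ids 10 and 40 exist in the segment, 99 does not.
  print(apply_deletes([10, 20, 30, 40], [10, 40, 99]))  # [0, 3]
  ```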

* add test_search_by_id

Signed-off-by: zhenwu <zw@zilliz.com>

* add more log

* flush error with multiple identical ids

Signed-off-by: zhenwu <zw@zilliz.com>

* modify wal config

Signed-off-by: shengjun.li <shengjun.li@zilliz.com>

* update

* add binary search_by_id

* fix case bug

Signed-off-by: zhenwu <zw@zilliz.com>

* update cases

Signed-off-by: zhenwu <zw@zilliz.com>

* fix unit test #1395

* improve merge performance

* add uids_ for VectorIndex to improve search performance

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
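  The idea behind caching a `uids_` list on the index can be sketched as follows: an index search returns internal row offsets, and the cached uid array maps them back to user-visible ids without an extra metadata lookup. A minimal Python sketch with hypothetical names (brute-force distance in place of a real faiss index; not the actual `VectorIndex` code):

  ```python
  class VectorIndex:
      """Toy index wrapper that caches user ids per internal offset."""

      def __init__(self, vectors, uids):
          self.vectors = vectors  # list of feature vectors
          self.uids_ = uids       # uids_[offset] -> user-visible id

      def search(self, query, topk=1):
          # Brute-force nearest neighbors by squared L2 distance;
          # a real index would delegate this to faiss.
          dists = [
              (sum((a - b) ** 2 for a, b in zip(vec, query)), off)
              for off, vec in enumerate(self.vectors)
          ]
          dists.sort()
          # Translate internal offsets to user ids via the cached uids_.
          return [self.uids_[off] for _, off in dists[:topk]]

  index = VectorIndex([[0.0, 0.0], [1.0, 1.0]], uids=[101, 202])
  print(index.search([0.9, 1.1]))  # [202]
  ```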

* fix error

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* update

* fix search

* fix record num

Signed-off-by: shengjun.li <shengjun.li@zilliz.com>

* refine code

* refine code

* Add get_vector_ids test cases (#1407)

* fix order

* add wal case

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix wal case

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix wal case

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix wal case

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix invalid operation issue

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix invalid operation issue

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix bug

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix bug

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* crud fix

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* crud fix

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* add table info test cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* add to compact case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* add to compact case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* add to compact case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* add case and debug compact

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* test pdb

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* test pdb

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* test pdb

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update table_info case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update table_info case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update table_info case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update get vector ids case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update get vector ids case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update get vector ids case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update get vector ids case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* update case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* pdb test

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* pdb test

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* add tests for get_vector_ids

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix case

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* add binary and ip

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix binary index

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix pdb

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* #1408 fix incorrect search result after DeleteById

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* add one case

* delete failed segment

* update serialize

* update serialize

* fix case

Signed-off-by: zhenwu <zw@zilliz.com>

* update

* update case assertion

Signed-off-by: zhenwu <zw@zilliz.com>

* [skip ci] update config

* change bloom filter msync flag to async

* #1319 add more timing debug info

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* update

* update

* add normalize

Signed-off-by: zhenwu <zw@zilliz.com>

* add normalize

Signed-off-by: zhenwu <zw@zilliz.com>

* add normalize

Signed-off-by: zhenwu <zw@zilliz.com>

* Fix compiling error

Signed-off-by: jinhai <hai.jin@zilliz.com>

* support ip (#1383)

* support ip

Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>

* IP result distance sort by descend

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* update

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* format

Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>

* get table lsn

* Remove unused third party

Signed-off-by: jinhai <hai.jin@zilliz.com>

* Refine code

Signed-off-by: jinhai <hai.jin@zilliz.com>

* #1319 fix clang format

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* fix wal applied lsn

Signed-off-by: shengjun.li <shengjun.li@zilliz.com>

* validate partition tag

* #1319 improve search performance

Signed-off-by: yudong.cai <yudong.cai@zilliz.com>

* build error

Co-authored-by: Zhiru Zhu <youny626@hotmail.com>
Co-authored-by: groot <yihua.mo@zilliz.com>
Co-authored-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
Co-authored-by: shengjh <46514371+shengjh@users.noreply.github.com>
Co-authored-by: del-zhenwu <56623710+del-zhenwu@users.noreply.github.com>
Co-authored-by: shengjun.li <49774184+shengjun1985@users.noreply.github.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
Co-authored-by: quicksilver <zhifeng.zhang@zilliz.com>
Co-authored-by: BossZou <40255591+BossZou@users.noreply.github.com>
Co-authored-by: jielinxu <52057195+jielinxu@users.noreply.github.com>
Co-authored-by: JackLCL <53512883+JackLCL@users.noreply.github.com>
Co-authored-by: Tinkerrr <linxiaojun.cn@outlook.com>
Co-authored-by: Lutkin Wang <yamasite@qq.com>
Co-authored-by: ABNER-1 <ABNER-1@users.noreply.github.com>
Co-authored-by: shiyu22 <cshiyu22@gmail.com>
2020-02-29 16:11:31 +08:00
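One commit above reads "update apply deletes: switch to binary search". A minimal sketch of that idea, assuming the segment's ids are kept sorted (`apply_deletes`, `segment_ids`, and `deleted_ids` are hypothetical names, not the actual Milvus internals):

```python
import bisect

def apply_deletes(segment_ids, deleted_ids):
    # segment_ids must be sorted; locate each deleted id with a
    # binary search instead of a linear scan over the segment.
    hit_offsets = []
    for did in deleted_ids:
        pos = bisect.bisect_left(segment_ids, did)
        if pos < len(segment_ids) and segment_ids[pos] == did:
            hit_offsets.append(pos)
    return hit_offsets
```

Each lookup drops from O(n) to O(log n), which matters when delete batches are applied against large segments.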


import pdb
import copy
import struct
import pytest
import threading
import datetime
import logging
from time import sleep
from multiprocessing import Process
import numpy
from milvus import Milvus, IndexType, MetricType
from utils import *
dim = 128
table_id = "test_search"
add_interval_time = 2
vectors = gen_vectors(6000, dim)
# vectors /= numpy.linalg.norm(vectors)
# vectors = vectors.tolist()
nprobe = 1
epsilon = 0.001
tag = "overallpaper"
top_k = 5
non_exist_id = 9527
small_size = 6000
raw_vectors, binary_vectors = gen_binary_vectors(6000, dim)
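The `gen_vectors` / `gen_binary_vectors` helpers come from the test `utils` module. A minimal sketch of what they plausibly produce — these are hypothetical re-implementations matching only the shapes used above (`gen_binary_vectors` is unpacked into a `(raw, packed)` pair at module scope):

```python
import random

def gen_vectors(num, dim):
    # hypothetical stand-in for utils.gen_vectors:
    # num float vectors, each of length dim
    return [[random.random() for _ in range(dim)] for _ in range(num)]

def gen_binary_vectors(num, dim):
    # hypothetical stand-in for utils.gen_binary_vectors:
    # returns (raw bit lists, byte-packed vectors), matching the
    # two-value unpacking used at module scope above
    raw = [[random.randint(0, 1) for _ in range(dim)] for _ in range(num)]
    packed = [bytes(
        sum(bit << (7 - j) for j, bit in enumerate(vec[i:i + 8]))
        for i in range(0, dim, 8)
    ) for vec in raw]
    return raw, packed
```

With `dim=128`, each packed binary vector is 16 bytes (one bit per dimension).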
class TestSearchBase:
    @pytest.fixture(scope="function", autouse=True)
    def skip_check(self, connect):
        if str(connect._cmd("mode")[1]) == "GPU":
            reason = "GPU mode is not supported"
            logging.getLogger().info(reason)
            pytest.skip(reason)
    def init_data(self, connect, table, nb=6000):
        '''
        Generate vectors and add them to the table before searching
        '''
        global vectors
        if nb == 6000:
            add_vectors = vectors
        else:
            add_vectors = gen_vectors(nb, dim)
        status, ids = connect.add_vectors(table, add_vectors)
        sleep(add_interval_time)
        return add_vectors, ids

    def init_data_binary(self, connect, table, nb=6000):
        '''
        Generate binary vectors and add them to the table before searching
        '''
        global binary_vectors
        if nb == 6000:
            add_vectors = binary_vectors
        else:
            # gen_binary_vectors returns (raw, packed) -- keep the packed vectors
            _, add_vectors = gen_binary_vectors(nb, dim)
        status, ids = connect.add_vectors(table, add_vectors)
        sleep(add_interval_time)
        return add_vectors, ids
    def init_data_no_flush(self, connect, table, nb=6000):
        global vectors
        if nb == 6000:
            add_vectors = vectors
        else:
            add_vectors = gen_vectors(nb, dim)
        status, ids = connect.add_vectors(table, add_vectors)
        # sleep(add_interval_time)
        return add_vectors, ids

    def init_data_no_flush_ids(self, connect, table, nb=6000):
        global vectors
        my_ids = [i for i in range(nb)]
        if nb == 6000:
            add_vectors = vectors
        else:
            add_vectors = gen_vectors(nb, dim)
        status, ids = connect.add_vectors(table, add_vectors, my_ids)
        # sleep(add_interval_time)
        return add_vectors, ids

    def init_data_ids(self, connect, table, nb=6000):
        global vectors
        my_ids = [i for i in range(nb)]
        if nb == 6000:
            add_vectors = vectors
        else:
            add_vectors = gen_vectors(nb, dim)
        status, ids = connect.add_vectors(table, add_vectors, my_ids)
        sleep(add_interval_time)
        return add_vectors, ids
    def add_data(self, connect, table, vectors):
        '''
        Add specified vectors to table
        '''
        status, ids = connect.add_vectors(table, vectors)
        # sleep(add_interval_time)
        sleep(10)
        return vectors, ids

    def add_data_ids(self, connect, table, vectors):
        my_ids = [i for i in range(len(vectors))]
        status, ids = connect.add_vectors(table, vectors, my_ids)
        sleep(add_interval_time)
        return vectors, ids

    def add_data_and_flush(self, connect, table, vectors):
        status, ids = connect.add_vectors(table, vectors)
        connect.flush([table])
        return vectors, ids

    def add_data_and_flush_ids(self, connect, table, vectors):
        my_ids = [i for i in range(len(vectors))]
        status, ids = connect.add_vectors(table, vectors, my_ids)
        connect.flush([table])
        return vectors, ids

    def add_data_no_flush(self, connect, table, vectors):
        '''
        Add specified vectors to table
        '''
        status, ids = connect.add_vectors(table, vectors)
        return vectors, ids

    def add_data_no_flush_ids(self, connect, table, vectors):
        my_ids = [i for i in range(len(vectors))]
        status, ids = connect.add_vectors(table, vectors, my_ids)
        return vectors, ids
    # delete data and auto flush - timeout due to the flush interval in config file
    def delete_data(self, connect, table, ids):
        '''
        delete vectors by id
        '''
        status = connect.delete_by_id(table, ids)
        sleep(add_interval_time)
        return status

    # delete data without flush
    def delete_data_no_flush(self, connect, table, ids):
        '''
        delete vectors by id
        '''
        status = connect.delete_by_id(table, ids)
        return status

    # delete data and manual flush
    def delete_data_and_flush(self, connect, table, ids):
        '''
        delete vectors by id
        '''
        status = connect.delete_by_id(table, ids)
        connect.flush([table])
        return status

    def check_no_result(self, results):
        if len(results) == 0:
            return True
        flag = True
        for r in results:
            flag = flag and (r.id == -1)
            if not flag:
                return False
        return flag
    def init_data_partition(self, connect, table, partition_tag, nb=6000):
        '''
        Generate vectors and add them to the table before searching
        '''
        global vectors
        if nb == 6000:
            add_vectors = vectors
        else:
            add_vectors = gen_vectors(nb, dim)
        # add_vectors /= numpy.linalg.norm(add_vectors)
        # add_vectors = add_vectors.tolist()
        status, ids = connect.add_vectors(table, add_vectors, partition_tag=partition_tag)
        sleep(add_interval_time)
        return add_vectors, ids

    def init_data_and_flush(self, connect, table, nb=6000):
        '''
        Generate vectors and add them to the table before searching
        '''
        global vectors
        if nb == 6000:
            add_vectors = vectors
        else:
            add_vectors = gen_vectors(nb, dim)
        # add_vectors /= numpy.linalg.norm(add_vectors)
        # add_vectors = add_vectors.tolist()
        status, ids = connect.add_vectors(table, add_vectors)
        connect.flush([table])
        return add_vectors, ids

    def init_data_and_flush_ids(self, connect, table, nb=6000):
        global vectors
        my_ids = [i for i in range(nb)]
        if nb == 6000:
            add_vectors = vectors
        else:
            add_vectors = gen_vectors(nb, dim)
        status, ids = connect.add_vectors(table, add_vectors, my_ids)
        connect.flush([table])
        return add_vectors, ids

    def init_data_partition_and_flush(self, connect, table, partition_tag, nb=6000):
        '''
        Generate vectors and add them to the table before searching
        '''
        global vectors
        if nb == 6000:
            add_vectors = vectors
        else:
            add_vectors = gen_vectors(nb, dim)
        # add_vectors /= numpy.linalg.norm(add_vectors)
        # add_vectors = add_vectors.tolist()
        status, ids = connect.add_vectors(table, add_vectors, partition_tag=partition_tag)
        connect.flush([table])
        return add_vectors, ids
    @pytest.fixture(
        scope="function",
        params=gen_simple_index_params()
    )
    def get_simple_index_params(self, request, connect):
        if request.param["index_type"] not in [IndexType.FLAT, IndexType.IVF_FLAT, IndexType.IVF_SQ8]:
            pytest.skip("Skip PQ Temporary")
        return request.param

    @pytest.fixture(
        scope="function",
        params=gen_simple_index_params()
    )
    def get_jaccard_index_params(self, request, connect):
        logging.getLogger().info(request.param)
        if request.param["index_type"] == IndexType.IVFLAT or request.param["index_type"] == IndexType.FLAT:
            return request.param
        else:
            pytest.skip("Skip index Temporary")

    @pytest.fixture(
        scope="function",
        params=gen_simple_index_params()
    )
    def get_hamming_index_params(self, request, connect):
        logging.getLogger().info(request.param)
        if request.param["index_type"] == IndexType.IVFLAT or request.param["index_type"] == IndexType.FLAT:
            return request.param
        else:
            pytest.skip("Skip index Temporary")

    """
    generate top-k params
    """
    @pytest.fixture(
        scope="function",
        params=[1, 99, 1024, 2048]
    )
    def get_top_k(self, request):
        yield request.param
    # auto flush
    def test_search_flat_normal_topk(self, connect, table, get_top_k):
        '''
        target: test basic search function, all the search params are correct, change top-k value
        method: search with the given vector id, check the result
        expected: search status ok, and the length of the result is top_k
        '''
        top_k = get_top_k
        vectors, ids = self.init_data(connect, table, nb=small_size)
        query_id = ids[0]
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)
        assert result[0][0].distance <= epsilon
        assert check_result(result[0], ids[0])

    def test_search_flat_max_topk(self, connect, table):
        '''
        target: test basic search function with top-k exceeding the allowed maximum (2048)
        method: search with the given vector id, check the result
        expected: search status not ok
        '''
        top_k = 2049
        vectors, ids = self.init_data(connect, table, nb=small_size)
        query_id = ids[0]
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert not status.OK()

    def test_search_id_not_existed(self, connect, table):
        '''
        target: test basic search function with a non-existing vector id
        method: search with the given vector id, check the result
        expected: search status ok, and the length of the result is top_k
        '''
        vectors, ids = self.init_data_and_flush(connect, table, nb=small_size)
        query_id = non_exist_id
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)

    # auto flush
    def test_search_ids(self, connect, table):
        vectors, ids = self.init_data_ids(connect, table, nb=small_size)
        query_id = ids[0]
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)
        assert result[0][0].distance <= epsilon
        assert check_result(result[0], ids[0])

    # manual flush
    def test_search_ids_flush(self, connect, table):
        vectors, ids = self.init_data_and_flush_ids(connect, table, nb=small_size)
        query_id = non_exist_id
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)
        assert self.check_no_result(result[0])
    # ------------------------- l2, add manual flush, delete, search ------------------------- #
    # ids, manual flush, search table, exist
    def test_search_index_l2(self, connect, table, get_simple_index_params):
        '''
        target: test basic search function, all the search params are correct; test all index params and build the index
        method: search with the given vectors, check the result
        expected: search status ok, and the length of the result is top_k
        '''
        index_params = get_simple_index_params
        vectors, ids = self.init_data_and_flush_ids(connect, table, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = ids[0]
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)
        assert result[0][0].distance <= epsilon
        assert check_result(result[0], ids[0])

    # ids, manual flush, search table, non exist
    def test_search_index_l2_id_not_existed(self, connect, table, get_simple_index_params):
        '''
        target: test basic search function with a non-existing id; test all index params and build the index
        method: search with the given vectors, check the result
        expected: search status ok, and the length of the result is top_k
        '''
        index_params = get_simple_index_params
        vectors, ids = self.init_data_and_flush_ids(connect, table, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = non_exist_id
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)

    # ids, manual flush, delete, manual flush, search table, exist
    def test_search_index_delete(self, connect, table, get_simple_index_params):
        index_params = get_simple_index_params
        vectors, ids = self.init_data_and_flush_ids(connect, table, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = ids[0]
        status = self.delete_data_and_flush(connect, table, [query_id])
        assert status.OK()
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert self.check_no_result(result[0])

    # ids, manual flush, delete, manual flush, search table, non exist
    def test_search_index_delete_id_not_existed(self, connect, table, get_simple_index_params):
        index_params = get_simple_index_params
        vectors, ids = self.init_data_and_flush_ids(connect, table, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = ids[0]
        status = self.delete_data_and_flush(connect, table, [query_id])
        assert status.OK()
        query_id = non_exist_id
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert self.check_no_result(result[0])

    # ids, manual flush, delete, no flush, search table, exist
    def test_search_index_delete_no_flush(self, connect, table, get_simple_index_params):
        index_params = get_simple_index_params
        vectors, ids = self.init_data_and_flush_ids(connect, table, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = ids[0]
        status = self.delete_data_no_flush(connect, table, [query_id])
        assert status.OK()
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert check_result(result[0], query_id)

    # ids, manual flush, delete, no flush, search table, non exist
    def test_search_index_delete_no_flush_id_not_existed(self, connect, table, get_simple_index_params):
        index_params = get_simple_index_params
        vectors, ids = self.init_data_and_flush_ids(connect, table, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = ids[0]
        status = self.delete_data_no_flush(connect, table, [query_id])
        assert status.OK()
        query_id = non_exist_id
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert self.check_no_result(result[0])

    # ids, manual flush, delete, re-add, manual flush, search table, exist
    def test_search_index_delete_add(self, connect, table, get_simple_index_params):
        index_params = get_simple_index_params
        vectors, ids = self.init_data_and_flush_ids(connect, table, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = ids[0]
        status = self.delete_data_no_flush(connect, table, [query_id])
        assert status.OK()
        vectors, new_ids = self.add_data_and_flush_ids(connect, table, vectors)
        status = connect.create_index(table, index_params)
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)
        assert result[0][0].distance <= epsilon
        assert check_result(result[0], query_id)
        status = self.delete_data_no_flush(connect, table, [query_id])
        assert status.OK()
    # add to table, auto flush, search table, search partition exist
    def test_search_l2_index_partition(self, connect, table, get_simple_index_params):
        '''
        target: test basic search function, all the search params are correct; test all index params and build the index
        method: add vectors into table, search with the given vectors, check the result
        expected: search status ok, and the length of the result is top_k; searching the table with the partition tag returns empty
        '''
        index_params = get_simple_index_params
        status = connect.create_partition(table, tag)
        vectors, ids = self.init_data(connect, table, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = ids[0]
        status, result = connect.search_by_id(table, top_k, nprobe, query_id)
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)
        assert check_result(result[0], ids[0])
        assert result[0][0].distance <= epsilon
        status, result = connect.search_by_id(table, top_k, nprobe, query_id, partition_tag_array=[tag])
        assert status.OK()
        assert len(result) == 0

    # add to partition, auto flush, search partition exist
    def test_search_l2_index_params_partition_2(self, connect, table, get_simple_index_params):
        index_params = get_simple_index_params
        status = connect.create_partition(table, tag)
        vectors, ids = self.init_data_partition(connect, table, tag, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = ids[0]
        status, result = connect.search_by_id(table, top_k, nprobe, query_id, partition_tag_array=[tag])
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)
        assert check_result(result[0], query_id)

    def test_search_l2_index_partition_id_not_existed(self, connect, table, get_simple_index_params):
        index_params = get_simple_index_params
        status = connect.create_partition(table, tag)
        vectors, ids = self.init_data(connect, table, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = non_exist_id
        status, result = connect.search_by_id(table, top_k, nprobe, query_id, partition_tag_array=[tag])
        assert status.OK()
        assert len(result) == 0

    # add to table, manual flush, search non-existing partition, non exist
    def test_search_l2_index_partition_tag_not_existed(self, connect, table, get_simple_index_params):
        index_params = get_simple_index_params
        status = connect.create_partition(table, tag)
        vectors, ids = self.init_data_partition_and_flush(connect, table, tag, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = non_exist_id
        status, result = connect.search_by_id(table, top_k, nprobe, query_id, partition_tag_array=['non_existing_tag'])
        assert status.OK()
        assert len(result) == 0

    def test_search_l2_index_partitions(self, connect, table, get_simple_index_params):
        new_tag = "new_tag"
        index_params = get_simple_index_params
        status = connect.create_partition(table, tag)
        status = connect.create_partition(table, new_tag)
        vectors, ids = self.init_data_partition_and_flush(connect, table, tag, nb=small_size)
        vectors, new_ids = self.init_data_partition_and_flush(connect, table, new_tag, nb=small_size)
        status = connect.create_index(table, index_params)
        query_id = ids[0]
        status, result = connect.search_by_id(table, top_k, nprobe, query_id, partition_tag_array=[tag, new_tag])
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)
        assert check_result(result[0], ids[0])
        assert result[0][0].distance <= epsilon
        query_id = new_ids[0]
        status, result = connect.search_by_id(table, top_k, nprobe, query_id, partition_tag_array=[tag, new_tag])
        assert status.OK()
        assert len(result[0]) == min(len(vectors), top_k)
        assert check_result(result[0], new_ids[0])
        assert result[0][0].distance <= epsilon
    @pytest.mark.level(2)
    def test_search_by_id_without_connect(self, dis_connect, table):
        '''
        target: test searching vectors without a connection
        method: use a disconnected instance, call search method and check that it fails
        expected: raise exception
        '''
        query_id = 123
        with pytest.raises(Exception) as e:
            status, ids = dis_connect.search_by_id(table, top_k, nprobe, query_id)

    def test_search_table_name_not_existed(self, connect, table):
        '''
        target: search a table that does not exist
        method: search with a random table_name, which is not in the db
        expected: status not ok
        '''
        table_name = gen_unique_str("not_existed_table")
        query_id = non_exist_id
        status, result = connect.search_by_id(table_name, top_k, nprobe, query_id)
        assert not status.OK()

    def test_search_table_name_None(self, connect, table):
        '''
        target: search a table whose name is None
        method: search with table_name: None
        expected: raise exception
        '''
        table_name = None
        query_id = non_exist_id
        with pytest.raises(Exception) as e:
            status, result = connect.search_by_id(table_name, top_k, nprobe, query_id)

    def test_search_jac(self, connect, jac_table, get_jaccard_index_params):
        index_params = get_jaccard_index_params
        vectors, ids = self.init_data_binary(connect, jac_table)
        status = connect.create_index(jac_table, index_params)
        assert status.OK()
        query_id = ids[0]
        status, result = connect.search_by_id(jac_table, top_k, nprobe, query_id)
        logging.getLogger().info(status)
        logging.getLogger().info(result)
        assert status.OK()
        assert check_result(result[0], ids[0])
        assert result[0][0].distance <= epsilon

    def test_search_ham(self, connect, ham_table, get_hamming_index_params):
        index_params = get_hamming_index_params
        vectors, ids = self.init_data_binary(connect, ham_table)
        status = connect.create_index(ham_table, index_params)
        assert status.OK()
        query_id = ids[0]
        status, result = connect.search_by_id(ham_table, top_k, nprobe, query_id)
        logging.getLogger().info(status)
        logging.getLogger().info(result)
        assert status.OK()
        assert check_result(result[0], ids[0])
        assert result[0][0].distance <= epsilon
"""
******************************************************************
# The following cases are used to test the `search_by_id` function
# with invalid table_name / top-k / nprobe / query_range
******************************************************************
"""
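The valid/invalid boundary exercised above (the top-k fixture goes up to 2048, and 2049 is rejected in `test_search_flat_max_topk`) can be mimicked against a stand-in connection. `FakeConnect` and `FakeStatus` below are hypothetical stubs, not the SDK's classes; they only sketch the observable behavior the tests assert on:

```python
class FakeStatus:
    """Hypothetical stand-in for the SDK's Status object."""
    def __init__(self, ok):
        self._ok = ok

    def OK(self):
        return self._ok


class FakeConnect:
    """Hypothetical stub: rejects top_k outside [1, 2048],
    mirroring the boundary the tests above assert on."""
    MAX_TOP_K = 2048

    def search_by_id(self, table, top_k, nprobe, query_id):
        if not isinstance(top_k, int):
            # non-integer top_k surfaces as an exception, as the
            # invalid-param tests below expect
            raise TypeError("top_k must be an int")
        ok = 1 <= top_k <= self.MAX_TOP_K
        return FakeStatus(ok), []
```

This mirrors the test pattern: integer-but-out-of-range values come back as a not-OK status, while wrong types raise before any request is made.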
class TestSearchParamsInvalid(object):
    nlist = 16384
    index_param = {"index_type": IndexType.IVF_SQ8, "nlist": nlist}

    """
    Test search table with invalid table names
    """
    @pytest.fixture(
        scope="function",
        params=gen_invalid_table_names()
    )
    def get_table_name(self, request):
        yield request.param
    @pytest.mark.level(2)
    def test_search_with_invalid_tablename(self, connect, get_table_name):
        table_name = get_table_name
        query_id = non_exist_id
        status, result = connect.search_by_id(table_name, top_k, nprobe, query_id)
        assert not status.OK()

    @pytest.mark.level(1)
    def test_search_with_invalid_tag_format(self, connect, table):
        query_id = non_exist_id
        with pytest.raises(Exception) as e:
            status, result = connect.search_by_id(table, top_k, nprobe, query_id, partition_tag_array="tag")

    """
    Test search table with invalid top-k
    """
    @pytest.fixture(
        scope="function",
        params=gen_invalid_top_ks()
    )
    def get_top_k(self, request):
        yield request.param

    @pytest.mark.level(1)
    def test_search_with_invalid_top_k(self, connect, table, get_top_k):
        top_k = get_top_k
        query_id = non_exist_id
        if isinstance(top_k, int):
            status, result = connect.search_by_id(table, top_k, nprobe, query_id)
            assert not status.OK()
        else:
            with pytest.raises(Exception) as e:
                status, result = connect.search_by_id(table, top_k, nprobe, query_id)

    @pytest.mark.level(2)
    def test_search_with_invalid_top_k_ip(self, connect, ip_table, get_top_k):
        top_k = get_top_k
        query_id = non_exist_id
        if isinstance(top_k, int):
            status, result = connect.search_by_id(ip_table, top_k, nprobe, query_id)
            assert not status.OK()
        else:
            with pytest.raises(Exception) as e:
                status, result = connect.search_by_id(ip_table, top_k, nprobe, query_id)
    """
    Test search table with invalid nprobe
    """
    @pytest.fixture(
        scope="function",
        params=gen_invalid_nprobes()
    )
    def get_nprobes(self, request):
        yield request.param

    @pytest.mark.level(1)
    def test_search_with_invalid_nprobe(self, connect, table, get_nprobes):
        nprobe = get_nprobes
        logging.getLogger().info(nprobe)
        query_id = non_exist_id
        if isinstance(nprobe, int):
            status, result = connect.search_by_id(table, top_k, nprobe, query_id)
            assert not status.OK()
        else:
            with pytest.raises(Exception) as e:
                status, result = connect.search_by_id(table, top_k, nprobe, query_id)

    @pytest.mark.level(2)
    def test_search_with_invalid_nprobe_ip(self, connect, ip_table, get_nprobes):
        '''
        target: test search function with an invalid nprobe
        method: search with invalid nprobe values
        expected: raise an error, and the connection stays usable
        '''
        nprobe = get_nprobes
        logging.getLogger().info(nprobe)
        query_id = non_exist_id
        if isinstance(nprobe, int):
            status, result = connect.search_by_id(ip_table, top_k, nprobe, query_id)
            assert not status.OK()
        else:
            with pytest.raises(Exception) as e:
                status, result = connect.search_by_id(ip_table, top_k, nprobe, query_id)

    """
    Test search table with invalid ids
    """
    @pytest.fixture(
        scope="function",
        params=gen_invalid_vector_ids()
    )
    def get_vector_ids(self, request):
        yield request.param

    @pytest.mark.level(1)
    def test_search_flat_with_invalid_vector_id(self, connect, table, get_vector_ids):
        '''
        target: test search function with an invalid vector id
        method: search with invalid vector ids
        expected: raise an error, and the connection stays usable
        '''
        query_id = get_vector_ids
        logging.getLogger().info(query_id)
        with pytest.raises(Exception) as e:
            status, result = connect.search_by_id(table, top_k, nprobe, query_id)

    @pytest.mark.level(2)
    def test_search_flat_with_invalid_vector_id_ip(self, connect, ip_table, get_vector_ids):
        query_id = get_vector_ids
        logging.getLogger().info(query_id)
        with pytest.raises(Exception) as e:
            status, result = connect.search_by_id(ip_table, top_k, nprobe, query_id)
def check_result(result, id):
    if len(result) >= 5:
        return id in [x.id for x in result[:5]]
    else:
        return id in (i.id for i in result)
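The module-level `check_result` helper can be exercised in isolation with stub hit objects (`Hit` here is a hypothetical stand-in for one row of the SDK's search result, which the tests access via `.id` and `.distance`):

```python
from collections import namedtuple

# Hypothetical stand-in for one row of a search result.
Hit = namedtuple("Hit", ["id", "distance"])

def check_result(result, id):
    # same logic as the helper above: when five or more hits are
    # returned, only the top-5 hits are inspected
    if len(result) >= 5:
        return id in [x.id for x in result[:5]]
    return id in (i.id for i in result)

hits = [Hit(id=7, distance=0.0), Hit(id=3, distance=0.4), Hit(id=9, distance=0.9)]
```

The top-5 cutoff means an id that is present but ranked sixth or lower in a long result list is treated as "not found".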