milvus/tests/milvus_python_test/test_compact.py
groot a08b51c2b6
merge json to master to get docker image (#1500)
* General proto api for NNS libraries

Signed-off-by: groot <yihua.mo@zilliz.com>

* refactor confadapter

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* fix unittest failures

Signed-off-by: groot <yihua.mo@zilliz.com>

* update test_add

Signed-off-by: zhenwu <zw@zilliz.com>

* update knowhere

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* update test cases

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* Update cases

* C++ sdk for json parameters

Signed-off-by: groot <yihua.mo@zilliz.com>

* update unittest

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* fix unittest failures

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix case

Signed-off-by: del-zhenwu <zw@zilliz.com>

* modify test_index.py

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* update

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* update sptag

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* update...

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* Build Pass

Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>

* knowhere/wrapper ut pass

Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>

* update util

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* fix wal case

Signed-off-by: del-zhenwu <zw@zilliz.com>

* modify test_search_vectors

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* update ci

Signed-off-by: del-zhenwu <zw@zilliz.com>

* update util

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* modify test_search_vectors

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* add hnsw in http module & modify index apis

Signed-off-by: Yhz <yinghao.zou@zilliz.com>

* modify search in http module

Signed-off-by: Yhz <yinghao.zou@zilliz.com>

* fix build error

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix typo in test_index and test_search

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* update...

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* index apis in http module done

Signed-off-by: Yhz <yinghao.zou@zilliz.com>

* fix build index bug

Signed-off-by: groot <yihua.mo@zilliz.com>

* search apis unittest pass

Signed-off-by: Yhz <yinghao.zou@zilliz.com>

* web test pass

Signed-off-by: Yhz <yinghao.zou@zilliz.com>

* update confadapter

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* update util

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* code format

Signed-off-by: groot <yihua.mo@zilliz.com>

* code format

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix vectors results bug (fix #1476)

Signed-off-by: Yhz <yinghao.zou@zilliz.com>

* clang format

Signed-off-by: Yhz <yinghao.zou@zilliz.com>

* update test

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* fix unittest

Signed-off-by: groot <yihua.mo@zilliz.com>

* add test_config

Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>

* add log

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix a build error

Signed-off-by: groot <yihua.mo@zilliz.com>

* add invalid param search test

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* fix range check

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* compact/flush case passed

Signed-off-by: del-zhenwu <zhenxiang.li@zilliz.com>

* fix unittest failures

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix unittest failures

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix unittest failures

Signed-off-by: groot <yihua.mo@zilliz.com>

* validate json parameters in request

Signed-off-by: groot <yihua.mo@zilliz.com>

* add unittest cases

Signed-off-by: groot <yihua.mo@zilliz.com>

* update test index/search

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* update test_config

Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>

* fix

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* support nsg and ivf-nlist

Signed-off-by: Nicky <nicky.xj.lin@gmail.com>

* update

Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>

* fix validation bug

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix python test bug

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix python test bug

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix python test bug

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix python test bug

Signed-off-by: groot <yihua.mo@zilliz.com>

* code format

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix python test failure

Signed-off-by: groot <yihua.mo@zilliz.com>

* remove rnsg cases

Signed-off-by: zhenwu <zw@zilliz.com>

* fix python test failure

Signed-off-by: groot <yihua.mo@zilliz.com>

* Update changelog

Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* Fix typo

Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* add pq to test_index && multithread test

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* add pq to test_search

Signed-off-by: shengjh <jianghong.sheng@zilliz.com>

* Fix format

Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* Update CHANGELOG

Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* Fix compiling error

Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* Fix compiling error

Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* fix config bug

Signed-off-by: groot <yihua.mo@zilliz.com>

* code format

Signed-off-by: groot <yihua.mo@zilliz.com>

* fix config test

Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>

* Update CHANGELOG.md

Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* Update CHANGELOG.md

Signed-off-by: JinHai-CN <hai.jin@zilliz.com>

* disable config test case

Signed-off-by: zhenwu <zw@zilliz.com>

Co-authored-by: Nicky <nicky.xj.lin@gmail.com>
Co-authored-by: zhenwu <zw@zilliz.com>
Co-authored-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
Co-authored-by: shengjh <jianghong.sheng@zilliz.com>
Co-authored-by: xiaojun.lin <xiaojun.lin@zilliz.com>
Co-authored-by: Yhz <yinghao.zou@zilliz.com>
Co-authored-by: del-zhenwu <zhenxiang.li@zilliz.com>
Co-authored-by: JinHai-CN <hai.jin@zilliz.com>
2020-03-07 15:23:34 +08:00


import time
import pdb
import threading
import logging
from multiprocessing import Pool, Process
import pytest
from milvus import IndexType, MetricType
from utils import *

dim = 128
index_file_size = 10
COMPACT_TIMEOUT = 30
nprobe = 1
top_k = 1
tag = "1970-01-01"
nb = 6000


class TestCompactBase:
    """
    ******************************************************************
      The following cases are used to test `compact` function
    ******************************************************************
    """
    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_compact_table_name_None(self, connect, table):
        '''
        target: compact table where table name is None
        method: compact with the table_name: None
        expected: exception raised
        '''
        table_name = None
        with pytest.raises(Exception) as e:
            status = connect.compact(table_name)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_compact_table_name_not_existed(self, connect, table):
        '''
        target: compact table not existed
        method: compact with a random table_name, which is not in db
        expected: status not ok
        '''
        table_name = gen_unique_str("not_existed_table")
        status = connect.compact(table_name)
        assert not status.OK()

    @pytest.fixture(
        scope="function",
        params=gen_invalid_table_names()
    )
    def get_table_name(self, request):
        yield request.param

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_compact_table_name_invalid(self, connect, get_table_name):
        '''
        target: compact table with invalid name
        method: compact with invalid table_name
        expected: status not ok
        '''
        table_name = get_table_name
        status = connect.compact(table_name)
        assert not status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vector_and_compact(self, connect, table):
        '''
        target: test add vector and compact
        method: add vector and compact table
        expected: status ok, vector added
        '''
        vector = gen_single_vector(dim)
        status, ids = connect.add_vectors(table, vector)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(table)
        assert status.OK()
        logging.getLogger().info(info)
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_and_compact(self, connect, table):
        '''
        target: test add vectors and compact
        method: add vectors and compact table
        expected: status ok, vectors added
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_delete_part_and_compact(self, connect, table):
        '''
        target: test add vectors, delete part of them and compact
        method: add vectors, delete a few and compact table
        expected: status ok, data size is smaller after compact
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        delete_ids = [ids[0], ids[-1]]
        status = connect.delete_by_id(table, delete_ids)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        logging.getLogger().info(size_before)
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        logging.getLogger().info(size_after)
        assert(size_before > size_after)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_delete_all_and_compact(self, connect, table):
        '''
        target: test add vectors, delete them and compact
        method: add vectors, delete all and compact table
        expected: status ok, no data size in table info because table is empty
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        status = connect.delete_by_id(table, ids)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(table)
        assert status.OK()
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        assert(len(info.partitions_stat[0].segments_stat) == 0)

    @pytest.fixture(
        scope="function",
        params=gen_simple_index()
    )
    def get_simple_index(self, request, connect):
        if str(connect._cmd("mode")[1]) == "CPU":
            if request.param["index_type"] not in [IndexType.IVF_SQ8, IndexType.IVFLAT, IndexType.FLAT]:
                pytest.skip("Only support index_type: flat/ivf_flat/ivf_sq8")
        else:
            pytest.skip("Only support CPU mode")
        return request.param

    def test_compact_after_index_created(self, connect, table, get_simple_index):
        '''
        target: test compact table after index created
        method: add vectors, create index, delete part of vectors and compact
        expected: status ok, index description no change, data size smaller after compact
        '''
        count = 10
        index_param = get_simple_index["index_param"]
        index_type = get_simple_index["index_type"]
        vectors = gen_vector(count, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        status = connect.create_index(table, index_type, index_param)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        logging.getLogger().info(info.partitions_stat)
        delete_ids = [ids[0], ids[-1]]
        status = connect.delete_by_id(table, delete_ids)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before > size_after)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vector_and_compact_twice(self, connect, table):
        '''
        target: test add vector and compact twice
        method: add vector and compact table twice
        expected: status ok, data size no change
        '''
        vector = gen_single_vector(dim)
        status, ids = connect.add_vectors(table, vector)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact twice
        status, info = connect.table_info(table)
        assert status.OK()
        size_after_twice = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_after == size_after_twice)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_delete_part_and_compact_twice(self, connect, table):
        '''
        target: test add vectors, delete part of them and compact twice
        method: add vectors, delete part and compact table twice
        expected: status ok, data size smaller after first compact, no change after second
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        delete_ids = [ids[0], ids[-1]]
        status = connect.delete_by_id(table, delete_ids)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before > size_after)
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact twice
        status, info = connect.table_info(table)
        assert status.OK()
        size_after_twice = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_after == size_after_twice)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_compact_multi_tables(self, connect):
        '''
        target: test compact works or not with multiple tables
        method: create 50 tables, add vectors into them and compact in turn
        expected: status ok
        '''
        nq = 100
        num_tables = 50
        vectors = gen_vectors(nq, dim)
        table_list = []
        for i in range(num_tables):
            table_name = gen_unique_str("test_compact_multi_table_%d" % i)
            table_list.append(table_name)
            param = {'table_name': table_name,
                     'dimension': dim,
                     'index_file_size': index_file_size,
                     'metric_type': MetricType.L2}
            connect.create_table(param)
        time.sleep(6)
        for i in range(num_tables):
            status, ids = connect.add_vectors(table_name=table_list[i], records=vectors)
            assert status.OK()
            status = connect.compact(table_list[i])
            assert status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vector_after_compact(self, connect, table):
        '''
        target: test add vector after compact
        method: after compact operation, add vector
        expected: status ok, vector added
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)
        vector = gen_single_vector(dim)
        status, ids = connect.add_vectors(table, vector)
        assert status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_index_creation_after_compact(self, connect, table, get_simple_index):
        '''
        target: test index creation after compact
        method: after compact operation, create index
        expected: status ok, index description no change
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        index_param = get_simple_index["index_param"]
        index_type = get_simple_index["index_type"]
        status = connect.create_index(table, index_type, index_param)
        assert status.OK()
        status, result = connect.describe_index(table)
        assert result._table_name == table
        assert result._index_type == index_type

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_delete_vectors_after_compact(self, connect, table):
        '''
        target: test delete vectors after compact
        method: after compact operation, delete vectors
        expected: status ok, vectors deleted
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        status = connect.delete_by_id(table, ids)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_search_after_compact(self, connect, table):
        '''
        target: test search after compact
        method: after compact operation, search vector
        expected: status ok
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        status = connect.compact(table)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        query_vecs = [vectors[0]]
        status, res = connect.search_vectors(table, top_k, query_records=query_vecs)
        logging.getLogger().info(res)
        assert status.OK()

    def test_compact_server_crashed_recovery(self, connect, table):
        '''
        target: test compact when server crashed unexpectedly and restarted
        method: add vectors, delete and compact table; server stopped and restarted during compact
        expected: status ok, request recovered
        '''
        vectors = gen_vector(nb * 100, dim)
        status, ids = connect.add_vectors(table, vectors)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        delete_ids = ids[0:1000]
        status = connect.delete_by_id(table, delete_ids)
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # start to compact, kill and restart server
        logging.getLogger().info("compact starting...")
        status = connect.compact(table)
        # pdb.set_trace()
        assert status.OK()
        status = connect.flush([table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(table)
        assert status.OK()
        assert info.partitions_stat[0].count == nb * 100 - 1000


class TestCompactJAC:
    """
    ******************************************************************
      The following cases are used to test `compact` function
    ******************************************************************
    """
    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vector_and_compact(self, connect, jac_table):
        '''
        target: test add vector and compact
        method: add vector and compact table
        expected: status ok, vector added
        '''
        tmp, vector = gen_binary_vectors(1, dim)
        status, ids = connect.add_vectors(jac_table, vector)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_and_compact(self, connect, jac_table):
        '''
        target: test add vectors and compact
        method: add vectors and compact table
        expected: status ok, vectors added
        '''
        tmp, vectors = gen_binary_vectors(nb, dim)
        status, ids = connect.add_vectors(jac_table, vectors)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_delete_part_and_compact(self, connect, jac_table):
        '''
        target: test add vectors, delete part of them and compact
        method: add vectors, delete a few and compact table
        expected: status ok, data size is smaller after compact
        '''
        tmp, vectors = gen_binary_vectors(nb, dim)
        status, ids = connect.add_vectors(jac_table, vectors)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        delete_ids = [ids[0], ids[-1]]
        status = connect.delete_by_id(jac_table, delete_ids)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        logging.getLogger().info(size_before)
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        logging.getLogger().info(size_after)
        assert(size_before > size_after)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_delete_all_and_compact(self, connect, jac_table):
        '''
        target: test add vectors, delete them and compact
        method: add vectors, delete all and compact table
        expected: status ok, no data size in table info because table is empty
        '''
        tmp, vectors = gen_binary_vectors(nb, dim)
        status, ids = connect.add_vectors(jac_table, vectors)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        status = connect.delete_by_id(jac_table, ids)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        assert(len(info.partitions_stat[0].segments_stat) == 0)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vector_and_compact_twice(self, connect, jac_table):
        '''
        target: test add vector and compact twice
        method: add vector and compact table twice
        expected: status ok
        '''
        tmp, vector = gen_binary_vectors(1, dim)
        status, ids = connect.add_vectors(jac_table, vector)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info after compact twice
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_after_twice = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_after == size_after_twice)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_delete_part_and_compact_twice(self, connect, jac_table):
        '''
        target: test add vectors, delete part of them and compact twice
        method: add vectors, delete part and compact table twice
        expected: status ok, data size smaller after first compact, no change after second
        '''
        tmp, vectors = gen_binary_vectors(nb, dim)
        status, ids = connect.add_vectors(jac_table, vectors)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        delete_ids = [ids[0], ids[-1]]
        status = connect.delete_by_id(jac_table, delete_ids)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before > size_after)
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info after compact twice
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_after_twice = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_after == size_after_twice)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_compact_multi_tables(self, connect):
        '''
        target: test compact works or not with multiple tables
        method: create 10 tables, add vectors into them, delete a few and compact in turn
        expected: status ok
        '''
        nq = 100
        num_tables = 10
        tmp, vectors = gen_binary_vectors(nq, dim)
        table_list = []
        for i in range(num_tables):
            table_name = gen_unique_str("test_compact_multi_table_%d" % i)
            table_list.append(table_name)
            param = {'table_name': table_name,
                     'dimension': dim,
                     'index_file_size': index_file_size,
                     'metric_type': MetricType.JACCARD}
            connect.create_table(param)
        time.sleep(6)
        for i in range(num_tables):
            status, ids = connect.add_vectors(table_name=table_list[i], records=vectors)
            assert status.OK()
            status = connect.delete_by_id(table_list[i], [ids[0], ids[-1]])
            assert status.OK()
            status = connect.flush([table_list[i]])
            assert status.OK()
            status = connect.compact(table_list[i])
            assert status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vector_after_compact(self, connect, jac_table):
        '''
        target: test add vector after compact
        method: after compact operation, add vector
        expected: status ok, vector added
        '''
        tmp, vectors = gen_binary_vectors(nb, dim)
        status, ids = connect.add_vectors(jac_table, vectors)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(jac_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)
        tmp, vector = gen_binary_vectors(1, dim)
        status, ids = connect.add_vectors(jac_table, vector)
        assert status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_delete_vectors_after_compact(self, connect, jac_table):
        '''
        target: test delete vectors after compact
        method: after compact operation, delete vectors
        expected: status ok, vectors deleted
        '''
        tmp, vectors = gen_binary_vectors(nb, dim)
        status, ids = connect.add_vectors(jac_table, vectors)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        status = connect.delete_by_id(jac_table, ids)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_search_after_compact(self, connect, jac_table):
        '''
        target: test search after compact
        method: after compact operation, search vector
        expected: status ok
        '''
        tmp, vectors = gen_binary_vectors(nb, dim)
        status, ids = connect.add_vectors(jac_table, vectors)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        status = connect.compact(jac_table)
        assert status.OK()
        status = connect.flush([jac_table])
        assert status.OK()
        query_vecs = [vectors[0]]
        status, res = connect.search_vectors(jac_table, top_k, query_records=query_vecs)
        logging.getLogger().info(res)
        assert status.OK()


class TestCompactIP:
    """
    ******************************************************************
      The following cases are used to test `compact` function
    ******************************************************************
    """
    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vector_and_compact(self, connect, ip_table):
        '''
        target: test add vector and compact
        method: add vector and compact table
        expected: status ok, vector added
        '''
        vector = gen_single_vector(dim)
        status, ids = connect.add_vectors(ip_table, vector)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_and_compact(self, connect, ip_table):
        '''
        target: test add vectors and compact
        method: add vectors and compact table
        expected: status ok, vectors added
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(ip_table, vectors)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert(size_before == size_after)


    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_delete_part_and_compact(self, connect, ip_table):
        '''
        target: test add vectors, delete part of them and compact
        method: add vectors, delete a few and compact table
        expected: status ok, data size is smaller after compact
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(ip_table, vectors)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        delete_ids = [ids[0], ids[-1]]
        status = connect.delete_by_id(ip_table, delete_ids)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        logging.getLogger().info(size_before)
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        logging.getLogger().info(size_after)
        assert size_before > size_after

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_delete_all_and_compact(self, connect, ip_table):
        '''
        target: test add vectors, delete them and compact
        method: add vectors, delete all and compact table
        expected: status ok, no segment stats in table info because the table is empty
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(ip_table, vectors)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        status = connect.delete_by_id(ip_table, ids)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        logging.getLogger().info(info.partitions_stat)
        assert len(info.partitions_stat[0].segments_stat) == 0

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vector_and_compact_twice(self, connect, ip_table):
        '''
        target: test add vector and compact twice
        method: add vector and compact table twice
        expected: status ok, data size unchanged after each compact
        '''
        vector = gen_single_vector(dim)
        status, ids = connect.add_vectors(ip_table, vector)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert size_before == size_after
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info after compact twice
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_after_twice = info.partitions_stat[0].segments_stat[0].data_size
        assert size_after == size_after_twice

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vectors_delete_part_and_compact_twice(self, connect, ip_table):
        '''
        target: test add vectors, delete part of them and compact twice
        method: add vectors, delete part and compact table twice
        expected: status ok, data size smaller after first compact, no change after second
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(ip_table, vectors)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        delete_ids = [ids[0], ids[-1]]
        status = connect.delete_by_id(ip_table, delete_ids)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert size_before > size_after
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info after compact twice
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_after_twice = info.partitions_stat[0].segments_stat[0].data_size
        assert size_after == size_after_twice

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_compact_multi_tables(self, connect):
        '''
        target: test that compact works with multiple tables
        method: create 50 tables, add vectors into each and compact them in turn
        expected: status ok
        '''
        nq = 100
        num_tables = 50
        vectors = gen_vectors(nq, dim)
        table_list = []
        for i in range(num_tables):
            table_name = gen_unique_str("test_compact_multi_table_%d" % i)
            table_list.append(table_name)
            param = {'table_name': table_name,
                     'dimension': dim,
                     'index_file_size': index_file_size,
                     'metric_type': MetricType.IP}
            connect.create_table(param)
        # wait for the newly created tables before inserting
        time.sleep(6)
        for i in range(num_tables):
            status, ids = connect.add_vectors(table_name=table_list[i], records=vectors)
            assert status.OK()
            status = connect.compact(table_list[i])
            assert status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_add_vector_after_compact(self, connect, ip_table):
        '''
        target: test add vector after compact
        method: after compact operation, add vector
        expected: status ok, vector added
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(ip_table, vectors)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info before compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_before = info.partitions_stat[0].segments_stat[0].data_size
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        # get table info after compact
        status, info = connect.table_info(ip_table)
        assert status.OK()
        size_after = info.partitions_stat[0].segments_stat[0].data_size
        assert size_before == size_after
        vector = gen_single_vector(dim)
        status, ids = connect.add_vectors(ip_table, vector)
        assert status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_delete_vectors_after_compact(self, connect, ip_table):
        '''
        target: test delete vectors after compact
        method: after compact operation, delete vectors
        expected: status ok, vectors deleted
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(ip_table, vectors)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        status = connect.delete_by_id(ip_table, ids)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()

    @pytest.mark.timeout(COMPACT_TIMEOUT)
    def test_search_after_compact(self, connect, ip_table):
        '''
        target: test search after compact
        method: after compact operation, search vector
        expected: status ok
        '''
        vectors = gen_vector(nb, dim)
        status, ids = connect.add_vectors(ip_table, vectors)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        status = connect.compact(ip_table)
        assert status.OK()
        status = connect.flush([ip_table])
        assert status.OK()
        query_vecs = [vectors[0]]
        status, res = connect.search_vectors(ip_table, top_k, query_records=query_vecs)
        logging.getLogger().info(res)
        assert status.OK()