mirror of https://gitee.com/milvus-io/milvus.git synced 2026-01-07 19:31:51 +08:00

feat: optimize Like query with n-gram (#41803 )

Ref #42053

This is the first PR for optimizing `LIKE` with an n-gram inverted index.
For now, only the VARCHAR data type is supported, and only InnerMatch
LIKE queries (`%xxx%`) can use the index.
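
As a rough illustration of why an n-gram inverted index helps an InnerMatch query, here is a minimal sketch in plain Python. It is not the Milvus implementation: the helper names (`grams`, `build_index`, `inner_match`), the fixed gram size of 2, and the sample documents are all assumptions made for the example. The idea is that any document containing the literal must contain every gram of that literal, so intersecting posting lists yields a small candidate set, which is then verified with a real substring check.

```python
from collections import defaultdict

GRAM = 2  # illustrative gram size, not a Milvus default


def grams(text, n=GRAM):
    """Return the set of all n-character substrings of text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(docs):
    """Map each gram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for g in grams(text):
            index[g].add(doc_id)
    return index


def inner_match(docs, index, literal):
    """Evaluate LIKE '%literal%' via the index, then verify the candidates."""
    query_grams = grams(literal)
    if not query_grams:
        # Literal shorter than the gram size: the index cannot help, so scan.
        candidates = set(range(len(docs)))
    else:
        # A matching document must contain every gram of the literal.
        candidates = set.intersection(*(index.get(g, set()) for g in query_grams))
    # Sharing all grams does not guarantee a contiguous match, so re-check.
    return sorted(d for d in candidates if literal in docs[d])


docs = ["vector database", "data lake", "base vector"]
index = build_index(docs)
print(inner_match(docs, index, "base"))
```

The final verification step matters because a document can contain all grams of the literal without containing the literal itself as one contiguous run.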


How to use it:
```
from pymilvus import DataType, MilvusClient

milvus_client = MilvusClient("http://localhost:19530")
schema = milvus_client.create_schema()
...
# The n-gram index currently supports VARCHAR fields only.
schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000)
...
index_params = milvus_client.prepare_index_params()
# min_gram and max_gram set the range of n-gram lengths that get indexed.
index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3)
milvus_client.create_collection(COLLECTION_NAME, ...)
```

`min_gram` and `max_gram` control how documents are tokenized. For
example, with min_gram=2 and max_gram=4, each document is tokenized
into 2-grams, 3-grams, and 4-grams.
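
To make the effect of the two parameters concrete, the following sketch enumerates the tokens for one string. It assumes character-level grams and uses a hypothetical helper name (`ngram_tokens`); the actual tokenizer inside Milvus may differ in details such as how it handles multi-byte characters.

```python
def ngram_tokens(text, min_gram, max_gram):
    """List every substring of text whose length is in [min_gram, max_gram]."""
    tokens = []
    for n in range(min_gram, max_gram + 1):
        # Slide a window of width n across the text.
        tokens.extend(text[i:i + n] for i in range(len(text) - n + 1))
    return tokens


print(ngram_tokens("milvus", 2, 4))
```

For the six-character input above this yields five 2-grams, four 3-grams, and three 4-grams, so a wider `min_gram`..`max_gram` range means more tokens per document and a larger index.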

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>

2025-07-01 10:08:44 +08:00

allocator

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

broker

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

session

enhance: Use QuerySlot interface for tasks (#41989 )

2025-05-23 10:30:28 +08:00

task

enhance: Add proxy task queue metrics (#42156 )

2025-06-04 11:26:32 +08:00

.mockery.yaml

fix: Pre-check import message to prevent pipeline block indefinitely (#42415 )

2025-06-11 13:40:38 +08:00

analyze_inspector.go

fix: Fix task getting stuck after recovery (#42114 )

2025-05-28 12:46:28 +08:00

analyze_meta_test.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

analyze_meta.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

build_index_policy.go

Format the code (#27275 )

2023-09-21 09:45:27 +08:00

channel_manager_factory.go

fix: drop collection failed if enable streaming service (#37444 )

2024-11-07 10:26:26 +08:00

channel_manager_test.go

fix: ChannelManager double assignment (#41837 )

2025-05-23 14:16:29 +08:00

channel_manager.go

fix: ChannelManager double assignment (#41837 )

2025-05-23 14:16:29 +08:00

channel_store_test.go

fix: ChannelManager double assignment (#41837 )

2025-05-23 14:16:29 +08:00

channel_store.go

fix: ChannelManager double assignment (#41837 )

2025-05-23 14:16:29 +08:00

channel.go

enhance: support run analyzer by loaded collection field (#42113 )

2025-05-29 10:54:30 +08:00

cluster_test.go

enhance: Use QuerySlot interface for tasks (#41989 )

2025-05-23 10:30:28 +08:00

cluster.go

enhance: Use QuerySlot interface for tasks (#41989 )

2025-05-23 10:30:28 +08:00

compaction_inspector_test.go

fix: Fix task getting stuck after recovery (#42114 )

2025-05-28 12:46:28 +08:00

compaction_inspector.go

fix: Fix task getting stuck after recovery (#42114 )

2025-05-28 12:46:28 +08:00

compaction_l0_view_test.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

compaction_l0_view.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

compaction_policy_clustering_test.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

compaction_policy_clustering.go

fix: Only mark segment compacting for sort stats task (#42516 )

2025-06-04 22:46:32 +08:00

compaction_policy_l0_test.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

compaction_policy_l0.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

compaction_policy_single_test.go

enhance: Check if segment has too many deletions together (#42668 )

2025-06-24 16:30:49 +08:00

compaction_policy_single.go

enhance: Check if segment has too many deletions together (#42668 )

2025-06-24 16:30:49 +08:00

compaction_queue_test.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

compaction_queue.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

compaction_task_clustering_test.go

fix: Fix concurrent l0Compaction and Stats (#42112 )

2025-05-27 20:54:28 +08:00

compaction_task_clustering.go

enhance: Add task version monitoring (#42023 )

2025-05-22 23:24:28 +08:00

compaction_task_l0_test.go

fix: Fix concurrent l0Compaction and Stats (#42112 )

2025-05-27 20:54:28 +08:00

compaction_task_l0.go

fix: Fix concurrent l0Compaction and Stats (#42112 )

2025-05-27 20:54:28 +08:00

compaction_task_meta_test.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

compaction_task_meta.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

compaction_task_mix_test.go

fix: Fix concurrent l0Compaction and Stats (#42112 )

2025-05-27 20:54:28 +08:00

compaction_task_mix.go

enhance: Add task version monitoring (#42023 )

2025-05-22 23:24:28 +08:00

compaction_task.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

compaction_trigger_test.go

enhance: refine compaction trigger to reduce read/write amplifaction(#41336 ) (#41728 )

2025-06-04 11:24:38 +08:00

compaction_trigger_v2_test.go

enhance: Enhance import context (#42021 )

2025-05-23 12:58:27 +08:00

compaction_trigger_v2.go

enhance: Enhance import context (#42021 )

2025-05-23 12:58:27 +08:00

compaction_trigger.go

enhance: Check if segment has too many deletions together (#42668 )

2025-06-24 16:30:49 +08:00

compaction_util.go

fix: Fix delete data loss due to duplicate binlogID (#40960 )

2025-04-01 10:36:22 +08:00

compaction_view.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

const.go

enhance: pass partition key scalar info if enabled when build vector index (#29931 )

2024-01-24 00:04:55 +08:00

create_meta_test.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

errors_test.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

errors.go

Refine DataCoord status (#27262 )

2023-09-26 17:15:27 +08:00

garbage_collector_test.go

enhance: Check loaded segments before gc (#42639 )

2025-06-13 17:44:38 +08:00

garbage_collector.go

fix: data duplicated when msgdispatcher make splitting (#42827 )

2025-06-19 16:32:39 +08:00

go_channel_singleton.go

fix: Fix improper use of offset in HybridSearch (#36244 )

2024-09-13 22:05:15 +08:00

handler_test.go

fix: create multiple idential indexes by accident (#40179 )

2025-04-08 15:06:25 +08:00

handler.go

enhance: Check loaded segments before gc (#42639 )

2025-06-13 17:44:38 +08:00

import_checker_test.go

enhance: Enhance import integration tests and logs (#42612 )

2025-06-12 20:02:35 +08:00

import_checker.go

enhance: Enhance import integration tests and logs (#42612 )

2025-06-12 20:02:35 +08:00

import_inspector_test.go

fix: Fix task getting stuck after recovery (#42114 )

2025-05-28 12:46:28 +08:00

import_inspector.go

fix: Fix task getting stuck after recovery (#42114 )

2025-05-28 12:46:28 +08:00

import_job.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

import_meta_test.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

import_meta.go

enhance: Enhance import context (#42021 )

2025-05-23 12:58:27 +08:00

import_task_import_test.go

enhance: Enhance import context (#42021 )

2025-05-23 12:58:27 +08:00

import_task_import.go

enhance: Enhance import context (#42021 )

2025-05-23 12:58:27 +08:00

import_task_preimport_test.go

enhance: Enhance import context (#42021 )

2025-05-23 12:58:27 +08:00

import_task_preimport.go

enhance: Enhance import context (#42021 )

2025-05-23 12:58:27 +08:00

import_task.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

import_util_test.go

fix: Pre-check import message to prevent pipeline block indefinitely (#42415 )

2025-06-11 13:40:38 +08:00

import_util.go

fix: Consider fields number when preallocating ids for import (#42810 )

2025-06-25 23:38:41 +08:00

index_engine_version_manager_test.go

fix: solve incompitable problem for none-encoding index(#40838 ) (#41369 )

2025-04-20 22:56:44 +08:00

index_engine_version_manager.go

fix: solve incompitable problem for none-encoding index(#40838 ) (#41369 )

2025-04-20 22:56:44 +08:00

index_inspector_test.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

index_inspector.go

fix: Fix task getting stuck after recovery (#42114 )

2025-05-28 12:46:28 +08:00

index_meta_test.go

fix: Use locking to ensure the atomicity of dropping segment indexes (#42075 )

2025-05-28 10:00:28 +08:00

index_meta.go

feat: Add json flat index (#39917 )

2025-06-10 19:14:35 +08:00

index_service_test.go

feat: Add json flat index (#39917 )

2025-06-10 19:14:35 +08:00

index_service.go

feat: Add json flat index (#39917 )

2025-06-10 19:14:35 +08:00

knapsack_test.go

enhance: refine compaction trigger to reduce read/write amplifaction(#41336 ) (#41728 )

2025-06-04 11:24:38 +08:00

knapsack.go

enhance: refine compaction trigger to reduce read/write amplifaction(#41336 ) (#41728 )

2025-06-04 11:24:38 +08:00

meta_test.go

feat: add NotifyDropPartition in mixcoord for droppartition in dc (#42029 )

2025-05-23 18:32:26 +08:00

meta_util.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

meta.go

fix: Only mark segment compacting for sort stats task (#42516 )

2025-06-04 22:46:32 +08:00

metrics_info_test.go

enhance: Refine task meta with key lock (#40613 )

2025-03-14 15:44:22 +08:00

metrics_info.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

mock_channel_manager.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

mock_channel_store.go

fix: ChannelManager double assignment (#41837 )

2025-05-23 14:16:29 +08:00

mock_cluster.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

mock_compaction_inspector.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

mock_compaction_meta.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

mock_handler.go

enhance: Check loaded segments before gc (#42639 )

2025-06-13 17:44:38 +08:00

mock_import_meta.go

fix: Pre-check import message to prevent pipeline block indefinitely (#42415 )

2025-06-11 13:40:38 +08:00

mock_index_engine_version_manager.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

mock_segment_manager.go

feat: add NotifyDropPartition in mixcoord for droppartition in dc (#42029 )

2025-05-23 18:32:26 +08:00

mock_stats_job_manager.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

mock_sub_cluster.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

mock_test.go

enhance: Check loaded segments before gc (#42639 )

2025-06-13 17:44:38 +08:00

mock_trigger_manager.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

mock_trigger.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

OWNERS

[skip ci]Update OWNERS files (#11898 )

2021-11-16 15:41:11 +08:00

partition_stats_meta_test.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

partition_stats_meta.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

policy_test.go

fix: Fix channel not balance on datanodes (#40422 )

2025-03-11 14:56:16 +08:00

policy.go

fix: Fix channel not balance on datanodes (#40422 )

2025-03-11 14:56:16 +08:00

README.md

[skip ci]Change etcd to lowercase (#9983 )

2021-10-15 18:58:37 +08:00

segment_allocation_policy_test.go

fix: Ignore growing segment without start pos for seal policy (#41130 )

2025-04-07 22:16:23 +08:00

segment_allocation_policy.go

fix: Ignore growing segment without start pos for seal policy (#41130 )

2025-04-07 22:16:23 +08:00

segment_info_test.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

segment_info.go

enhance: refine compaction trigger to reduce read/write amplifaction(#41336 ) (#41728 )

2025-06-04 11:24:38 +08:00

segment_manager_test.go

feat: add NotifyDropPartition in mixcoord for droppartition in dc (#42029 )

2025-05-23 18:32:26 +08:00

segment_manager.go

feat: add NotifyDropPartition in mixcoord for droppartition in dc (#42029 )

2025-05-23 18:32:26 +08:00

segment_operator_test.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

segment_operator.go

enhance: Add json key inverted index in stats for optimization (#38039 )

2025-04-10 15:20:28 +08:00

server_test.go

fix: Pre-check import message to prevent pipeline block indefinitely (#42415 )

2025-06-11 13:40:38 +08:00

server.go

enhance: Check loaded segments before gc (#42639 )

2025-06-13 17:44:38 +08:00

services_test.go

fix: Pre-check import message to prevent pipeline block indefinitely (#42415 )

2025-06-11 13:40:38 +08:00

services.go

fix: mixcoord will not handle timetick anymore (#42965 )

2025-06-26 19:14:42 +08:00

stats_inspector_test.go

fix: Just trigger stats task for Flushed segment (#42424 )

2025-06-05 15:42:32 +08:00

stats_inspector.go

feat: optimize Like query with n-gram (#41803 )

2025-07-01 10:08:44 +08:00

stats_task_meta_test.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

stats_task_meta.go

enhance: Add slot and tasks num metrics (#42141 )

2025-05-30 21:52:30 +08:00

sync_segments_scheduler_test.go

enhance: Optimize datacoord meta mutex (#40552 )

2025-03-25 13:46:25 +08:00

sync_segments_scheduler.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

task_analyze_test.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

task_analyze.go

enhance: Add task version monitoring (#42023 )

2025-05-22 23:24:28 +08:00

task_index_test.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

task_index.go

enhance: support run analyzer by loaded collection field (#42113 )

2025-05-29 10:54:30 +08:00

task_stats_test.go

enhance: Pooling for data tasks (#41256 )

2025-05-20 21:06:24 +08:00

task_stats.go

fix: Only mark segment compacting for sort stats task (#42516 )

2025-06-04 22:46:32 +08:00

util_test.go

enhance: Use v2 package name for pkg module (#39990 )

2025-02-22 23:15:58 +08:00

util.go

fix: datacoord stop get stuck After upgrading from 2.5 to 2.6 (#42674 )

2025-06-12 16:56:36 +08:00

README.md

Data Coordinator

Data coordinator (datacoord for short) is the component that organizes DataNodes and segment allocations.

Dependency

KV store: a KV store (etcd) that holds all the meta info datacoord needs to operate.
Message stream: a message stream (Pulsar) used to exchange statistics information with data nodes.
Root Coordinator: the source of timestamps, IDs, and meta.
Data Node(s): a single instance or a cluster; the actual worker group that handles data modification operations.