Milvus Vector Database

Preface

Official documentation: Milvus documentation

Chinese documentation: Milvus overview – Milvus vector database Chinese documentation (milvus-io.com)

GitHub repository: milvus-io/milvus (A cloud-native vector database, storage for next generation AI applications)

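Usage

Features

Milvus is an open-source vector similarity search engine. It supports vectorizing unstructured data with a variety of AI models and provides search over the resulting vectors. Milvus integrates widely used vector index libraries such as Faiss and Annoy, so developers can choose the index type that suits each scenario. With Milvus, a minimum viable product can be built at fairly low cost.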
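
To make "vector similarity search" concrete, here is a minimal brute-force sketch in plain Python. It is neither Milvus's API nor its indexing method: the function names cosine_similarity and top_k and the toy embeddings are made up for illustration. Index libraries such as Faiss or Annoy exist to avoid this kind of exhaustive scan on large collections.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical 3-dimensional embeddings; real models produce far more dimensions.
    embeddings = {
        "cat.jpg": [0.9, 0.1, 0.0],
        "dog.jpg": [0.8, 0.3, 0.1],
        "car.jpg": [0.0, 0.2, 0.9],
    }
    for name, score in top_k([1.0, 0.0, 0.0], embeddings, k=2):
        print(f"{name}: {score:.3f}")
```

The scan costs one similarity computation per stored vector, which is the cost that an approximate index trades against recall.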

Use cases

Reverse image search, intelligent question answering, audio search

Configuration

#
# Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software distributed under the License
# is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
# or implied. See the License for the specific language governing permissions and limitations under the License.

version: 0.5

#----------------------+------------------------------------------------------------+------------+-----------------+
# Cluster Config       | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# enable               | If running with Mishards, set true, otherwise false.       | Boolean    | false           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# role                 | Milvus deployment role: rw / ro                            | Role       | rw              |
#----------------------+------------------------------------------------------------+------------+-----------------+
cluster:
  enable: false
  role: rw

#----------------------+------------------------------------------------------------+------------+-----------------+
# General Config       | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# timezone             | Use UTC-x or UTC+x to specify a time zone.                 | Timezone   | UTC+8           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# meta_uri             | URI for metadata storage, using SQLite (for single server  | URI        | sqlite://:@:/   |
#                      | Milvus) or MySQL (for distributed cluster Milvus).         |            |                 |
#                      | Format: dialect://username:password@host:port/database     |            |                 |
#                      | Keep 'dialect://:@:/', 'dialect' can be either 'sqlite' or |            |                 |
#                      | 'mysql', replace other texts with real values.             |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
general:
  timezone: UTC+1
  meta_uri: sqlite://:@:/

#----------------------+------------------------------------------------------------+------------+-----------------+
# Network Config       | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# bind.address         | IP address that Milvus server monitors.                    | IP         | 0.0.0.0         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# bind.port            | Port that Milvus server monitors. Port range (1024, 65535) | Integer    | 19530           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# http.enable          | Enable HTTP server or not.                                 | Boolean    | true            |
#----------------------+------------------------------------------------------------+------------+-----------------+
# http.port            | Port that Milvus HTTP server monitors.                     | Integer    | 19121           |
#                      | Port range (1024, 65535)                                   |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
network:
  bind.address: 0.0.0.0
  bind.port: 19530
  http.enable: true
  http.port: 19121

#----------------------+------------------------------------------------------------+------------+-----------------+
# Storage Config       | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# path                 | Path used to save meta data, vector data and index data.   | Path       | /var/lib/milvus |
#----------------------+------------------------------------------------------------+------------+-----------------+
# auto_flush_interval  | The interval, in seconds, at which Milvus automatically    | Integer    | 1 (s)           |
#                      | flushes data to disk.                                      |            |                 |
#                      | 0 means disable the regular flush.                         |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
storage:
  path: /var/lib/milvus
  auto_flush_interval: 1

#----------------------+------------------------------------------------------------+------------+-----------------+
# WAL Config           | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# enable               | Whether to enable write-ahead logging (WAL) in Milvus.     | Boolean    | true            |
#                      | If WAL is enabled, Milvus writes all data changes to log   |            |                 |
#                      | files in advance before implementing data changes. WAL     |            |                 |
#                      | ensures the atomicity and durability for Milvus operations.|            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# recovery_error_ignore| Whether to ignore logs with errors that happen during WAL  | Boolean    | false           |
#                      | recovery. If true, when Milvus restarts for recovery and   |            |                 |
#                      | there are errors in WAL log files, log files with errors   |            |                 |
#                      | are ignored. If false, Milvus does not restart when there  |            |                 |
#                      | are errors in WAL log files.                               |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# buffer_size          | Sum total of the read buffer and the write buffer in Bytes.| String     | 256MB           |
#                      | buffer_size must be in range [64MB, 4096MB].               |            |                 |
#                      | If the value you specified is out of range, Milvus         |            |                 |
#                      | automatically uses the boundary value closest to the       |            |                 |
#                      | specified value. It is recommended you set buffer_size to  |            |                 |
#                      | a value greater than the inserted data size of a single    |            |                 |
#                      | insert operation for better performance.                   |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# path                 | Location of WAL log files.                                 | String     |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
wal:
  enable: true
  recovery_error_ignore: false
  buffer_size: 256MB
  path: /var/lib/milvus/wal

#----------------------+------------------------------------------------------------+------------+-----------------+
# Cache Config         | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# cache_size           | The size of CPU memory used for caching data for faster    | String     | 4GB             |
#                      | query. The sum of 'cache_size' and 'insert_buffer_size'    |            |                 |
#                      | must be less than system memory size.                      |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# insert_buffer_size   | Buffer size used for data insertion.                       | String     | 1GB             |
#                      | The sum of 'insert_buffer_size' and 'cache_size'           |            |                 |
#                      | must be less than system memory size.                      |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# preload_collection   | A comma-separated list of collection names that need to    | StringList |                 |
#                      | be pre-loaded when Milvus server starts up.                |            |                 |
#                      | '*' means preload all existing tables (single-quote or     |            |                 |
#                      | double-quote required).                                    |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
cache:
  cache_size: 4GB
  insert_buffer_size: 1GB
  preload_collection:

#----------------------+------------------------------------------------------------+------------+-----------------+
# GPU Config           | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# enable               | Use GPU devices or not.                                    | Boolean    | false           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# cache_size           | The size of GPU memory per card used for cache.            | String     | 1GB             |
#----------------------+------------------------------------------------------------+------------+-----------------+
# gpu_search_threshold | A Milvus performance tuning parameter. This value will be  | Integer    | 1000            |
#                      | compared with 'nq' to decide if the search computation will|            |                 |
#                      | be executed on GPUs only.                                  |            |                 |
#                      | If nq >= gpu_search_threshold, the search computation will |            |                 |
#                      | be executed on GPUs only;                                  |            |                 |
#                      | if nq < gpu_search_threshold, the search computation will  |            |                 |
#                      | be executed on both CPUs and GPUs.                         |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# search_devices       | The list of GPU devices used for search computation.       | DeviceList | gpu0            |
#                      | Must be in format gpux.                                    |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# build_index_devices  | The list of GPU devices used for index building.           | DeviceList | gpu0            |
#                      | Must be in format gpux.                                    |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
gpu:
  enable: false
  cache_size: 1GB
  gpu_search_threshold: 1000
  search_devices:
    - gpu0
  build_index_devices:
    - gpu0

#----------------------+------------------------------------------------------------+------------+-----------------+
# Logs                 | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# trace.enable         | Whether to enable trace level logging in Milvus.           | Boolean    | true            |
#----------------------+------------------------------------------------------------+------------+-----------------+
# debug.enable         | Whether to enable debug level logging in Milvus.           | Boolean    | true            |
#----------------------+------------------------------------------------------------+------------+-----------------+
# info.enable          | Whether to enable info level logging in Milvus.            | Boolean    | true            |
#----------------------+------------------------------------------------------------+------------+-----------------+
# warning.enable       | Whether to enable warning level logging in Milvus.         | Boolean    | true            |
#----------------------+------------------------------------------------------------+------------+-----------------+
# error.enable         | Whether to enable error level logging in Milvus.           | Boolean    | true            |
#----------------------+------------------------------------------------------------+------------+-----------------+
# fatal.enable         | Whether to enable fatal level logging in Milvus.           | Boolean    | true            |
#----------------------+------------------------------------------------------------+------------+-----------------+
# path                 | Absolute path to the folder holding the log files.         | String     |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
# max_log_file_size    | The maximum size of each log file, size range [512, 4096]  | Integer    | 1024 (MB)       |
#----------------------+------------------------------------------------------------+------------+-----------------+
# log_rotate_num       | The maximum number of log files that Milvus keeps for each | Integer    | 0               |
#                      | logging level, num range [0, 1024], 0 means unlimited.     |            |                 |
#----------------------+------------------------------------------------------------+------------+-----------------+
logs:
  level: info
  trace.enable: true
  path: /var/lib/milvus/logs
  max_log_file_size: 512MB
  log_rotate_num: 100

#----------------------+------------------------------------------------------------+------------+-----------------+
# Metric Config        | Description                                                | Type       | Default         |
#----------------------+------------------------------------------------------------+------------+-----------------+
# enable               | Enable monitoring function or not.                         | Boolean    | false           |
#----------------------+------------------------------------------------------------+------------+-----------------+
# address              | Pushgateway address                                        | IP         | 127.0.0.1       |
#----------------------+------------------------------------------------------------+------------+-----------------+
# port                 | Pushgateway port, port range (1024, 65535)                 | Integer    | 9091            |
#----------------------+------------------------------------------------------------+------------+-----------------+
metric:
  enable: false
  address: 127.0.0.1
  port: 9091
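
Several rules stated in the configuration comments (the [64MB, 4096MB] clamp on the WAL buffer_size, the memory limit on cache_size plus insert_buffer_size, and the gpu_search_threshold comparison with nq) can be sketched as a small checker. This is an illustration in Python, not Milvus code: the helper names parse_size, clamp_wal_buffer, memory_fits and search_target are hypothetical, the unit table assumes binary multiples (1 MB = 1024 * 1024 bytes), and the system memory size is supplied by the caller.

```python
import re

# Assumption: sizes use binary multiples; the config comments do not specify this.
_UNITS = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
_SIZE_RE = re.compile(r"^\s*(\d+)\s*(KB|MB|GB)\s*$", re.IGNORECASE)

WAL_MIN_BYTES = 64 * _UNITS["MB"]
WAL_MAX_BYTES = 4096 * _UNITS["MB"]


def parse_size(text: str) -> int:
    """Convert a string such as '256MB' or '4GB' into a byte count."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.upper()]


def clamp_wal_buffer(text: str) -> int:
    """Mimic the documented rule: out-of-range values snap to the nearest bound."""
    return min(max(parse_size(text), WAL_MIN_BYTES), WAL_MAX_BYTES)


def memory_fits(cache_size: str, insert_buffer_size: str, system_memory_bytes: int) -> bool:
    """Check that cache_size + insert_buffer_size is strictly below system memory."""
    return parse_size(cache_size) + parse_size(insert_buffer_size) < system_memory_bytes


def search_target(nq: int, gpu_search_threshold: int = 1000) -> str:
    """Return where a search with nq query vectors runs, per the GPU Config comment."""
    return "gpu" if nq >= gpu_search_threshold else "cpu+gpu"


if __name__ == "__main__":
    print(clamp_wal_buffer("32MB") == WAL_MIN_BYTES)   # below range, raised to 64MB
    print(clamp_wal_buffer("256MB"))                   # in range, unchanged
    print(memory_fits("4GB", "1GB", 8 * _UNITS["GB"]))
    print(search_target(10), search_target(2000))
```

Running such a check before starting the server is one way to catch a cache_size and insert_buffer_size pair that does not fit the host's memory.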

Troubleshooting

  1. If Milvus fails to restart, try any one of the following:
    1. Remove the WAL folder.
    2. Set the configuration item that ignores errors during WAL recovery:
      wal:
        recovery_error_ignore: true
    3. Set the configuration item that disables the WAL:
      wal:
        enable: false