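Weaviate Vector Database Cluster: Multi-Node Deployment (Part 1)

Note: this article covers Docker deployment on a single host and deployment in a Kubernetes environment. Running Docker on several separate hosts, one node per host, runs into inter-node communication problems in practice, and the deployment fails.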

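Single Machine + Docker Mode

Deploying the vector database cluster with Docker on a single machine works; this setup has been tested.

According to the official documentation (https://weaviate.io/developers/weaviate/installation/docker-compose#multi-node-configuration), a multi-node deployment also works in a Kubernetes (K8S) environment.

Deploying with Docker across different hosts, however, failed in testing because the nodes could not reach each other's internal cluster ports: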

{"build_git_commit":"87a0924","build_go_version":"go1.22.8","build_image_tag":"","build_wv_version":"","level":"debug","msg":" memberlist: Failed UDP ping: node1 (timeout reached)","time":"2024-10-12T08:43:58Z"}
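Weaviate writes its logs as one JSON object per line, so failures like the one above can be picked out mechanically. The following is a minimal sketch, assuming the log lines keep the "msg" and "time" fields seen in the sample; the function name memberlist_failures is hypothetical and not part of any Weaviate tooling.

```python
import json


def memberlist_failures(lines):
    """Yield (time, message) for JSON log lines that report a memberlist failure."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON log record
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        # The sample message has a leading space, so strip before matching.
        message = record.get("msg", "").strip()
        if message.startswith("memberlist:") and "Failed" in message:
            yield record.get("time"), message


# The log line quoted above, used as sample input.
SAMPLE_LOG = (
    '{"build_git_commit":"87a0924","build_go_version":"go1.22.8",'
    '"build_image_tag":"","build_wv_version":"","level":"debug",'
    '"msg":" memberlist: Failed UDP ping: node1 (timeout reached)",'
    '"time":"2024-10-12T08:43:58Z"}'
)

for when, what in memberlist_failures([SAMPLE_LOG]):
    print(when, what)
```

In practice the input would be the captured output of docker logs for one of the containers, read line by line.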

Prepare the Machine

Deployment host list

The Troubleshooting section of the Weaviate Kubernetes documentation page ("Kubernetes | Weaviate") says:

If you see "No private IP address found, and explicit IP not provided", set the pod subnet to be in a valid IP address range of the following:

10.0.0.0/8

100.64.0.0/10

172.16.0.0/12

192.168.0.0/16

198.19.0.0/16

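The host IP address must fall within one of the ranges listed above; otherwise startup fails. The following IP is for reference only:

  • 10.1.1.1 (single node)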
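Whether a candidate address satisfies this requirement can be checked before deploying. The sketch below uses Python's standard ipaddress module with the five ranges copied from the list above; the function name in_allowed_range is only illustrative.

```python
import ipaddress

# Ranges copied from the troubleshooting note above.
ALLOWED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",
        "100.64.0.0/10",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "198.19.0.0/16",
    )
]


def in_allowed_range(address: str) -> bool:
    """Return True if the address falls inside one of the listed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_RANGES)


for candidate in ("10.1.1.1", "172.18.0.3", "8.8.8.8"):
    print(candidate, in_allowed_range(candidate))
```

The second candidate, 172.18.0.3, is one of the container addresses that appear later in the cluster statistics output.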

Create Directories

Create the required directories on the node machine:

# Holds the Weaviate configuration files
mkdir -p /data/weaviate/conf
# Holds the Weaviate vector database data
mkdir -p /data/weaviate/data
mkdir -p /data/weaviate/data/data-node-1
mkdir -p /data/weaviate/data/data-node-2
mkdir -p /data/weaviate/data/data-node-3

Create the Configuration File

On the single node, create docker-compose-3-node.yml in the /data/weaviate/conf directory with the following content:

services:
  weaviate-node-1:
    container_name: weaviate-node-1
    init: true
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.27.0
    ports:
    - 8080:8080
    - 6060:6060
    - 50051:50051
    restart: on-failure:0
    volumes:
      - /data/weaviate/data/data-node-1:/var/lib/weaviate
    environment:
      LOG_LEVEL: 'debug'
      QUERY_DEFAULTS_LIMIT: 200
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'false'
      AUTHENTICATION_APIKEY_ENABLED: 'true'
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: '8R09Z1SqUC5q3vrJP1y0'
      AUTHENTICATION_APIKEY_USERS: 'knowledge-dev'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_MODULES: ''
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node1'
      CLUSTER_GOSSIP_BIND_PORT: '7100'
      CLUSTER_DATA_BIND_PORT: '7101'
      RAFT_JOIN: 'node1,node2,node3'
      RAFT_BOOTSTRAP_EXPECT: 3
      REPLICATION_MINIMUM_FACTOR: 3

  weaviate-node-2:
    container_name: weaviate-node-2
    init: true
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.27.0
    ports:
    - 8081:8080
    - 6061:6060
    - 50052:50051
    restart: on-failure:0
    volumes:
      - /data/weaviate/data/data-node-2:/var/lib/weaviate
    environment:
      LOG_LEVEL: 'debug'
      QUERY_DEFAULTS_LIMIT: 200
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'false'
      AUTHENTICATION_APIKEY_ENABLED: 'true'
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: '8R09Z1SqUC5q3vrJP1y0'
      AUTHENTICATION_APIKEY_USERS: 'knowledge-dev'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_MODULES: ''
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node2'
      CLUSTER_GOSSIP_BIND_PORT: '7102'
      CLUSTER_DATA_BIND_PORT: '7103'
      CLUSTER_JOIN: 'weaviate-node-1:7100'
      RAFT_JOIN: 'node1,node2,node3'
      RAFT_BOOTSTRAP_EXPECT: 3
      REPLICATION_MINIMUM_FACTOR: 3

  weaviate-node-3:
    container_name: weaviate-node-3
    init: true
    command:
    - --host
    - 0.0.0.0
    - --port
    - '8080'
    - --scheme
    - http
    image: cr.weaviate.io/semitechnologies/weaviate:1.27.0
    ports:
    - 8082:8080
    - 6062:6060
    - 50053:50051
    restart: on-failure:0
    volumes:
      - /data/weaviate/data/data-node-3:/var/lib/weaviate
    environment:
      LOG_LEVEL: 'debug'
      QUERY_DEFAULTS_LIMIT: 200
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'false'
      AUTHENTICATION_APIKEY_ENABLED: 'true'
      AUTHENTICATION_APIKEY_ALLOWED_KEYS: '8R09Z1SqUC5q3vrJP1y0'
      AUTHENTICATION_APIKEY_USERS: 'knowledge-dev'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      ENABLE_MODULES: ''
      DEFAULT_VECTORIZER_MODULE: 'none'
      CLUSTER_HOSTNAME: 'node3'
      CLUSTER_GOSSIP_BIND_PORT: '7104'
      CLUSTER_DATA_BIND_PORT: '7105'
      CLUSTER_JOIN: 'weaviate-node-1:7100'
      RAFT_JOIN: 'node1,node2,node3'
      RAFT_BOOTSTRAP_EXPECT: 3
      REPLICATION_MINIMUM_FACTOR: 3
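The three service definitions differ only in a handful of values: the container name, the published host ports, the data directory, the cluster hostname, the gossip and data bind ports, and the CLUSTER_JOIN entry that nodes 2 and 3 use to find node 1. The sketch below reproduces that pattern in Python so the numbering is easier to see. The function node_settings is hypothetical, and extending the pattern beyond the three nodes shown here is an assumption rather than something this article tested.

```python
def node_settings(index: int, total: int) -> dict:
    """Per-node values for the node at 1-based position `index` out of `total`."""
    offset = index - 1
    environment = {
        "CLUSTER_HOSTNAME": f"node{index}",
        # Each node takes the next pair of ports: 7100/7101, 7102/7103, ...
        "CLUSTER_GOSSIP_BIND_PORT": str(7100 + 2 * offset),
        "CLUSTER_DATA_BIND_PORT": str(7101 + 2 * offset),
        "RAFT_JOIN": ",".join(f"node{i}" for i in range(1, total + 1)),
        "RAFT_BOOTSTRAP_EXPECT": total,
    }
    if index > 1:
        # Every node except the first joins through node 1's gossip port.
        environment["CLUSTER_JOIN"] = "weaviate-node-1:7100"
    return {
        "container_name": f"weaviate-node-{index}",
        "ports": [
            f"{8080 + offset}:8080",
            f"{6060 + offset}:6060",
            f"{50051 + offset}:50051",
        ],
        "volume": f"/data/weaviate/data/data-node-{index}:/var/lib/weaviate",
        "environment": environment,
    }


for i in range(1, 4):
    settings = node_settings(i, 3)
    print(settings["container_name"], settings["ports"], settings["environment"])
```

The settings that are identical on every node (authentication, LOG_LEVEL, PERSISTENCE_DATA_PATH, REPLICATION_MINIMUM_FACTOR and so on) are left out of the sketch on purpose.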


Start the Services

cd /data/weaviate/conf
docker-compose -f docker-compose-3-node.yml up -d

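If pulling the Docker image fails because of network problems, download the image manually or build it from source, then import it into the target Docker environment.

Check the Services

Once startup has finished, run the following command to check whether the deployment succeeded: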

docker ps -a

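If the deployment succeeded, the output lists the three Weaviate containers (weaviate-node-1, weaviate-node-2 and weaviate-node-3).

Stop the Services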

docker-compose -f docker-compose-3-node.yml down

View the Logs

docker logs -f --tail=10 weaviate-node-1

docker logs -f --tail=10 weaviate-node-2

docker logs -f --tail=10 weaviate-node-3

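Verify That the Services Are Healthy

Because API key authentication is enabled, every request must pass the token 8R09Z1SqUC5q3vrJP1y0 as a Bearer token in the Authorization header.

Check the Nodes

Request http://10.1.1.1:8080/v1/nodes to view the node status: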

{
    "nodes": [
        {
            "batchStats": {
                "queueLength": 0,
                "ratePerSecond": 0
            },
            "gitHash": "87a0924",
            "name": "node1",
            "shards": null,
            "status": "HEALTHY",
            "version": "1.27.0-alpha"
        },
        {
            "batchStats": {
                "queueLength": 0,
                "ratePerSecond": 0
            },
            "gitHash": "87a0924",
            "name": "node2",
            "shards": null,
            "status": "HEALTHY",
            "version": "1.27.0-alpha"
        },
        {
            "batchStats": {
                "queueLength": 0,
                "ratePerSecond": 96
            },
            "gitHash": "87a0924",
            "name": "node3",
            "shards": null,
            "status": "HEALTHY",
            "version": "1.27.0-alpha"
        }
    ]
}
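The same check can be scripted. This is a minimal sketch using only the Python standard library: fetch_json sends the API key as a Bearer token, and unhealthy_nodes inspects a response shaped like the one above. Both function names are illustrative, and the sample payload is a trimmed copy of the output shown here.

```python
import json
import urllib.request


def fetch_json(url: str, api_key: str, timeout: float = 5.0) -> dict:
    """GET a REST endpoint, passing the API key as a Bearer token."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def unhealthy_nodes(payload: dict) -> list:
    """Names of nodes whose status is anything other than HEALTHY."""
    nodes = payload.get("nodes") or []
    return [node["name"] for node in nodes if node.get("status") != "HEALTHY"]


# Trimmed copy of the /v1/nodes response shown above.
SAMPLE_NODES = {
    "nodes": [
        {"name": "node1", "status": "HEALTHY"},
        {"name": "node2", "status": "HEALTHY"},
        {"name": "node3", "status": "HEALTHY"},
    ]
}

print(unhealthy_nodes(SAMPLE_NODES))
```

Against the running cluster, the payload would come from fetch_json("http://10.1.1.1:8080/v1/nodes", "8R09Z1SqUC5q3vrJP1y0") instead of the sample.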


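Check the Cluster Information

Request http://10.1.1.1:8080/v1/cluster/statistics to see whether each node is the Raft Leader or a Follower.

Pay close attention to each node's "state" value. If every node reports Leader, the cluster may not have been started with an odd number of nodes (2n+1).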

{
    "statistics": [
        {
            "bootstrapped": true,
            "candidates": {},
            "dbLoaded": true,
            "isVoter": true,
            "leaderAddress": "172.18.0.3:8300",
            "leaderId": "node3",
            "name": "node1",
            "open": true,
            "raft": {
                "appliedIndex": "2",
                "commitIndex": "2",
                "fsmPending": "0",
                "lastContact": "2.058561ms",
                "lastLogIndex": "2",
                "lastLogTerm": "2",
                "lastSnapshotIndex": "0",
                "lastSnapshotTerm": "0",
                "latestConfiguration": [
                    {
                        "address": "172.18.0.4:8300",
                        "id": "node1",
                        "suffrage": 0
                    },
                    {
                        "address": "172.18.0.2:8300",
                        "id": "node2",
                        "suffrage": 0
                    },
                    {
                        "address": "172.18.0.3:8300",
                        "id": "node3",
                        "suffrage": 0
                    }
                ],
                "latestConfigurationIndex": "0",
                "numPeers": "2",
                "protocolVersion": "3",
                "protocolVersionMax": "3",
                "protocolVersionMin": "0",
                "snapshotVersionMax": "1",
                "snapshotVersionMin": "0",
                "state": "Follower",
                "term": "2"
            },
            "ready": true,
            "status": "HEALTHY"
        },
        {
            "bootstrapped": true,
            "candidates": {},
            "dbLoaded": true,
            "isVoter": true,
            "leaderAddress": "172.18.0.3:8300",
            "leaderId": "node3",
            "name": "node2",
            "open": true,
            "raft": {
                "appliedIndex": "2",
                "commitIndex": "2",
                "fsmPending": "0",
                "lastContact": "15.768885ms",
                "lastLogIndex": "2",
                "lastLogTerm": "2",
                "lastSnapshotIndex": "0",
                "lastSnapshotTerm": "0",
                "latestConfiguration": [
                    {
                        "address": "172.18.0.4:8300",
                        "id": "node1",
                        "suffrage": 0
                    },
                    {
                        "address": "172.18.0.2:8300",
                        "id": "node2",
                        "suffrage": 0
                    },
                    {
                        "address": "172.18.0.3:8300",
                        "id": "node3",
                        "suffrage": 0
                    }
                ],
                "latestConfigurationIndex": "0",
                "numPeers": "2",
                "protocolVersion": "3",
                "protocolVersionMax": "3",
                "protocolVersionMin": "0",
                "snapshotVersionMax": "1",
                "snapshotVersionMin": "0",
                "state": "Follower",
                "term": "2"
            },
            "ready": true,
            "status": "HEALTHY"
        },
        {
            "bootstrapped": true,
            "candidates": {},
            "dbLoaded": true,
            "isVoter": true,
            "leaderAddress": "172.18.0.3:8300",
            "leaderId": "node3",
            "name": "node3",
            "open": true,
            "raft": {
                "appliedIndex": "2",
                "commitIndex": "2",
                "fsmPending": "0",
                "lastContact": "0",
                "lastLogIndex": "2",
                "lastLogTerm": "2",
                "lastSnapshotIndex": "0",
                "lastSnapshotTerm": "0",
                "latestConfiguration": [
                    {
                        "address": "172.18.0.2:8300",
                        "id": "node2",
                        "suffrage": 0
                    },
                    {
                        "address": "172.18.0.4:8300",
                        "id": "node1",
                        "suffrage": 0
                    },
                    {
                        "address": "172.18.0.3:8300",
                        "id": "node3",
                        "suffrage": 0
                    }
                ],
                "latestConfigurationIndex": "0",
                "numPeers": "2",
                "protocolVersion": "3",
                "protocolVersionMax": "3",
                "protocolVersionMin": "0",
                "snapshotVersionMax": "1",
                "snapshotVersionMin": "0",
                "state": "Leader",
                "term": "2"
            },
            "ready": true,
            "status": "HEALTHY"
        }
    ],
    "synchronized": true
}
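The "state" check described above can be automated. The sketch below assumes a payload shaped like the response shown here; the function names are hypothetical. It also includes the standard Raft majority formula, which is the usual reason for preferring an odd node count: three voters need a majority of two and tolerate one failure, while four voters need three and still tolerate only one.

```python
def raft_roles(payload: dict) -> dict:
    """Map each node name to its Raft state (for example Leader or Follower)."""
    entries = payload.get("statistics") or []
    return {entry["name"]: entry["raft"]["state"] for entry in entries}


def has_single_leader(payload: dict) -> bool:
    """True when exactly one node is Leader and every node names it as leaderId."""
    entries = payload.get("statistics") or []
    leaders = [e["name"] for e in entries if e["raft"]["state"] == "Leader"]
    if len(leaders) != 1:
        return False
    return all(e.get("leaderId") == leaders[0] for e in entries)


def quorum_size(voters: int) -> int:
    """Majority of voters that Raft needs to elect a leader or commit an entry."""
    return voters // 2 + 1


# Trimmed copy of the /v1/cluster/statistics response shown above.
SAMPLE_STATISTICS = {
    "statistics": [
        {"name": "node1", "leaderId": "node3", "raft": {"state": "Follower"}},
        {"name": "node2", "leaderId": "node3", "raft": {"state": "Follower"}},
        {"name": "node3", "leaderId": "node3", "raft": {"state": "Leader"}},
    ]
}

print(raft_roles(SAMPLE_STATISTICS))
print(has_single_leader(SAMPLE_STATISTICS))
print(quorum_size(3))
```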


Check the Data

Check whether each node holds the replicated data.

Single machine, multiple ports:


curl --location 'http://10.1.1.1:8080/v1/objects' --header 'Authorization: Bearer 8R09Z1SqUC5q3vrJP1y0'
curl --location 'http://10.1.1.1:8081/v1/objects' --header 'Authorization: Bearer 8R09Z1SqUC5q3vrJP1y0'
curl --location 'http://10.1.1.1:8082/v1/objects' --header 'Authorization: Bearer 8R09Z1SqUC5q3vrJP1y0'
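Comparing the three responses by eye gets tedious once there are more than a few objects. The sketch below compares the object IDs returned through each port and reports any that are missing. It assumes each /v1/objects response is a JSON object with an "objects" list whose items carry an "id" field; the function names and the sample IDs are made up for illustration.

```python
def object_ids(payload: dict) -> set:
    """Collect the IDs from one /v1/objects response (assumed shape)."""
    return {obj["id"] for obj in (payload.get("objects") or [])}


def missing_by_node(responses: dict) -> dict:
    """For each node, the IDs returned by some other node but not by this one."""
    ids = {node: object_ids(payload) for node, payload in responses.items()}
    everything = set().union(*ids.values())
    return {node: everything - seen for node, seen in ids.items()}


# Hypothetical responses keyed by the host port each one was fetched from.
SAMPLE_RESPONSES = {
    "8080": {"objects": [{"id": "a"}, {"id": "b"}]},
    "8081": {"objects": [{"id": "a"}, {"id": "b"}]},
    "8082": {"objects": [{"id": "a"}]},
}

print(missing_by_node(SAMPLE_RESPONSES))
```

An empty set for every port means all three responses listed the same objects.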


Environment Variable Reference

For the full list of environment variables, see:
https://weaviate.io/developers/weaviate/config-refs/env-vars

Key parameters:

Setting                       Description
REPLICATION_MINIMUM_FACTOR    Sets the global replica count