ElasticSearch master election

2025-01-22 08:19:30 1.1k words #elasticSearch


ElasticSearch version: 6.5.0 (see the official documentation)

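ElasticSearch uses a master-slave model: the master node manages the cluster metadata, while the data itself is decentralized.

Master election is the process by which the cluster picks a master from the known master-eligible nodes. Node roles are covered in the post ElasticSearch-基础-02-基本概念 (ElasticSearch Basics 02: Basic Concepts).

In production, keep the DataNode and MasterNode roles separate where possible. Only nodes with node.master set to true are eligible to take part in master election (these are called master-eligible nodes, or master candidate nodes).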

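1. Election-related parameters

Master election happens automatically. The following parameters matter most:

discovery.zen.ping_timeout: the election (ping) timeout, 3 seconds by default. Increase it on slow or unreliable networks.
discovery.zen.join_timeout: the join timeout. When a new node joins the cluster it sends a join request to the master; the timeout defaults to 20 times the ping timeout, and the request is retried if it times out.
discovery.zen.master_election.ignore_non_master_pings: defaults to false. When the master fails unexpectedly and the cluster re-elects, a value of true means pings from nodes that are not master-eligible are ignored, so only master-eligible nodes take part in the election; a value of false means pings from all nodes are taken into account.
discovery.zen.minimum_master_nodes: the minimum number of master-eligible nodes. If too few master-eligible nodes are available, no election is held; the election starts only once enough of them are present.

The two parameters discovery.zen.minimum_master_nodes and discovery.zen.ping_timeout are used to prevent split brain.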

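2. Basic rules

ElasticSearch elections follow the quorum (majority) principle: only a node that receives more than half of the votes can become master.

The quorum is the value configured in discovery.zen.minimum_master_nodes.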
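To make the quorum rule concrete, here is a minimal sketch. It assumes the commonly recommended setting of (master-eligible nodes / 2) + 1 for discovery.zen.minimum_master_nodes. The class and method names (QuorumSketch, quorum, canElect) are hypothetical and are not part of ElasticSearch.

```java
// Hypothetical helper for illustration only, not ElasticSearch code.
class QuorumSketch {

    /** Majority of the master-eligible nodes: (n / 2) + 1, using integer division. */
    static int quorum(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }

    /** A partition may hold an election only if it sees at least a quorum of master-eligible nodes. */
    static boolean canElect(int visibleMasterEligibleNodes, int minimumMasterNodes) {
        return visibleMasterEligibleNodes >= minimumMasterNodes;
    }

    public static void main(String[] args) {
        int min = quorum(3); // cluster with 3 master-eligible nodes
        System.out.println("minimum_master_nodes = " + min);
        // The network splits into a 2-node side and a 1-node side.
        System.out.println("2-node side can elect: " + canElect(2, min));
        System.out.println("1-node side can elect: " + canElect(1, min));
    }
}
```

With 3 master-eligible nodes the quorum is 2, so after a network split only the side that still sees 2 of them can elect a master. The isolated node cannot, which is how this setting guards against split brain.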

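3. Bully algorithm

The Bully algorithm is one of the basic algorithms for master election.

Assume every node (master-eligible node) has a unique ID, and that nodes are ordered by this ID. The node with the highest ID becomes the master.

However, the algorithm has problems in some scenarios and can lead to split brain.

ElasticSearch's election deviates from the Bully algorithm: it elects the node with the smallest ID.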

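3.1. Bully election flow

The cluster has 6 nodes, each of which maintains contact with the others. P6 is the current master.
At some point, P6 (the master) goes down.
P3 is the first node to notice that the master is down, so it sends an election message to the nodes with higher IDs than its own: P4, P5, and P6.
Because P6 is down, only P4 and P5 respond. Their response tells P3 that they will take over the election to replace P6 as master.
P4 sends an election message to P5 and P6.
Because P6 is already down, only P5 responds.
P5 sends an election message to P6. P6 is down and does not respond, so P5 becomes the master of the whole cluster.
P5, as master, manages the other slave nodes.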
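The walk-through above can be sketched as a small simulation. This is a simplified, single-threaded model of the classic Bully algorithm (highest ID wins), not ElasticSearch code; the class BullySketch and its elect method are hypothetical names. In a real implementation all responders start their own elections concurrently. Here the lowest responder takes over each round, which mirrors the P3, P4, P5 sequence described above and reaches the same result.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Simplified Bully election for illustration only, not ElasticSearch code.
class BullySketch {

    /**
     * Runs an election started by `initiator` among nodes 1..nodeCount.
     * `alive` holds the IDs of the nodes that still respond.
     * Returns the ID of the node that ends up as master.
     */
    static int elect(int initiator, int nodeCount, Set<Integer> alive) {
        if (!alive.contains(initiator)) {
            throw new IllegalArgumentException("initiator must be alive");
        }
        int current = initiator;
        while (true) {
            // The current node messages every node with a higher ID.
            Integer firstResponder = null;
            for (int id = current + 1; id <= nodeCount; id++) {
                if (alive.contains(id)) {
                    System.out.println("P" + id + " responds to P" + current);
                    if (firstResponder == null) {
                        firstResponder = id;
                    }
                }
            }
            if (firstResponder == null) {
                // Nobody with a higher ID answered: the current node wins.
                System.out.println("P" + current + " becomes master");
                return current;
            }
            // A higher node answered, so it takes over the election.
            current = firstResponder;
        }
    }

    public static void main(String[] args) {
        // P6 is down; P3 notices first and starts the election.
        Set<Integer> alive = new TreeSet<>(Arrays.asList(1, 2, 3, 4, 5));
        elect(3, 6, alive);
    }
}
```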

4. Rough flow of master election

ElectMasterService.java

/**
 * compares two candidates to indicate which is the better master.
 * A higher cluster state version is better
 *
 * @return -1 if c1 is a better candidate, 1 if c2.
 */
public static int compare(MasterCandidate c1, MasterCandidate c2) {
    // we explicitly swap c1 and c2 here. the code expects "better" is lower in a sorted
    // list, so if c2 has a higher cluster state version, it needs to come first.
    int ret = Long.compare(c2.clusterStateVersion, c1.clusterStateVersion);
    if (ret == 0) {
        ret = compareNodes(c1.getNode(), c2.getNode());
    }
    return ret;
}

/** master nodes go before other nodes, with a secondary sort by id **/
private static int compareNodes(DiscoveryNode o1, DiscoveryNode o2) {
    if (o1.isMasterNode() && !o2.isMasterNode()) {
        return -1;
    }
    if (!o1.isMasterNode() && o2.isMasterNode()) {
        return 1;
    }
    return o1.getId().compareTo(o2.getId());
}

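clusterStateVersion: the version of the cluster state that the node holds (not the ElasticSearch release version).

The comparison checks clusterStateVersion first, and only if the versions are equal does it call compareNodes(), which compares node IDs. In summary:

The higher the clusterStateVersion, the higher the priority.
If clusterStateVersion is equal, the smaller the node ID, the higher the priority (nodes with small IDs are preferred).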
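To see the ordering in action, the sketch below re-implements the same comparison on a simplified, hypothetical Candidate class that stands in for MasterCandidate and DiscoveryNode. It only mirrors the logic of the excerpt above and is not the ElasticSearch implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-alone illustration of the ordering; Candidate is a hypothetical stand-in type.
class CandidateOrderSketch {

    static final class Candidate {
        final String id;
        final boolean masterEligible;
        final long clusterStateVersion;

        Candidate(String id, boolean masterEligible, long clusterStateVersion) {
            this.id = id;
            this.masterEligible = masterEligible;
            this.clusterStateVersion = clusterStateVersion;
        }
    }

    /** Higher cluster state version first, then master-eligible nodes, then smaller ID. */
    static int compare(Candidate c1, Candidate c2) {
        // Arguments are swapped on purpose so that the higher version sorts first.
        int ret = Long.compare(c2.clusterStateVersion, c1.clusterStateVersion);
        if (ret == 0) {
            if (c1.masterEligible && !c2.masterEligible) {
                return -1;
            }
            if (!c1.masterEligible && c2.masterEligible) {
                return 1;
            }
            ret = c1.id.compareTo(c2.id);
        }
        return ret;
    }

    /** The best candidate is the first element after sorting. */
    static Candidate best(List<Candidate> candidates) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(CandidateOrderSketch::compare);
        return sorted.get(0);
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
                new Candidate("node-a", true, 10),
                new Candidate("node-c", true, 12),
                new Candidate("node-b", true, 12));
        System.out.println("best candidate: " + best(candidates).id);
    }
}
```

Here node-a has the smallest ID but an older cluster state, so it loses to the two nodes at version 12. Between those two, the smaller ID (node-b) wins.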

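Each node looks for master-eligible nodes whose clusterStateVersion is higher than its own and sends its vote to them.
If the clusterStateVersion values are equal, the node determines which of the master-eligible nodes it can see (including itself) has the smallest node ID and sends its vote to that node.
If a node receives enough votes (as set by minimum_master_nodes) and has also voted for itself, it becomes the master.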
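The final rule (enough votes plus a vote for itself) can be sketched as a simple tally. This is an illustrative model only: VoteTallySketch and electedMaster are hypothetical names, every voter is assumed to be master-eligible, and the asynchronous join handling of the real ZenDiscovery code is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative vote tally, not ElasticSearch code.
class VoteTallySketch {

    /**
     * `votes` maps a voter's node ID to the ID of the candidate it voted for.
     * Returns the ID of the elected master, or null if no candidate qualifies.
     * Assumes minimumMasterNodes is a true majority, so at most one candidate can qualify.
     */
    static String electedMaster(Map<String, String> votes, int minimumMasterNodes) {
        Map<String, Integer> tally = new HashMap<>();
        for (String candidate : votes.values()) {
            tally.merge(candidate, 1, Integer::sum);
        }
        for (Map.Entry<String, Integer> entry : tally.entrySet()) {
            String candidate = entry.getKey();
            // The candidate must also have voted for itself.
            boolean votedForItself = candidate.equals(votes.get(candidate));
            if (entry.getValue() >= minimumMasterNodes && votedForItself) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> votes = new HashMap<>();
        votes.put("node-a", "node-a");
        votes.put("node-b", "node-a");
        votes.put("node-c", "node-a");
        System.out.println("elected: " + electedMaster(votes, 2));
    }
}
```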

Once a master has been elected from the master-eligible nodes, it starts publishing the cluster state and handles cluster-level and index-level operation requests.

5. Detailed flow

Source location: org.elasticsearch.discovery.zen.ZenDiscovery#startInitialJoin

To be filled in later.