# Master Node Expansion

OushuDB master nodes are stateless, so the cluster can be scaled out online. Assume the existing cluster consists of master nodes oushu1, oushu2, and oushu3, Magma nodes magma1, magma2, and magma3, and HDFS nodes hdfs1 and hdfs2, and that the node to be added is oushu4.

## Installation

Configure the yum repository and install the lava command-line management tool (the yum repository itself must be prepared in advance):

```sh
ssh oushu4
# Fetch the repo file from the machine hosting the yum repository (assumed to be 192.168.1.10)
scp root@192.168.1.10:/etc/yum.repos.d/oushu.repo /etc/yum.repos.d/oushu.repo
# Append the yum repository host to /etc/hosts
yum clean all
yum makecache
yum install -y lava
```

Install OushuDB with yum install:

```bash
yum install -y oushudb
```

## Configuration

**1. System configuration**

Append the following to the system configuration file /etc/sysctl.conf on the oushu4 node:

```bash
kernel.shmmax = 3000000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 250 512000 100 2048
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 200000
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 10000 65535
net.core.netdev_max_backlog = 200000
net.netfilter.nf_conntrack_max = 524288
fs.nr_open = 3000000
kernel.threads-max = 798720
kernel.pid_max = 798720
net.core.rmem_max=2097152
net.core.wmem_max=2097152
net.core.somaxconn=4096
kernel.core_pattern=/data1/oushudb/cores/core-%e-%s-%u-%g-%p-%t
```

If the cluster will be deployed on the Kylin operating system, append these additional network parameters as well:

```bash
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ipfrag_high_thresh = 41943040
net.ipv4.ipfrag_low_thresh = 40894464
net.ipv4.udp_mem = 9242685 12323580 18485370
net.ipv4.tcp_mem = 9240912 12321218 18481824
```

After editing, run `sysctl -p` as root so the new settings take effect.

To make debugging and analysis easier, we recommend allowing OushuDB to generate core dump files. Create the file `/etc/security/limits.d/oushu.conf`:

```bash
touch /etc/security/limits.d/oushu.conf
```

and write the following into it:

```bash
* soft nofile 1048576
* hard nofile 1048576
* soft nproc 131072
* hard nproc 131072
oushu soft core unlimited
oushu hard core unlimited
```

Add the IPs of the existing OushuDB cluster nodes to this node's `/etc/hosts`:

```bash
echo 192.168.1.11 oushu1 >>/etc/hosts
echo 192.168.1.12 oushu2 >>/etc/hosts
echo 192.168.1.13 oushu3 >>/etc/hosts
```

Add the IPs of the Magma cluster nodes to this node's `/etc/hosts`:

```bash
echo 192.168.1.21 magma1 >>/etc/hosts
echo 192.168.1.22 magma2 >>/etc/hosts
echo 192.168.1.23 magma3 >>/etc/hosts
```

If HDFS is configured as the storage engine, add the IPs of the HDFS cluster nodes to this node's `/etc/hosts`:

```bash
echo 192.168.1.21 hdfs1 >>/etc/hosts
echo 192.168.1.22 hdfs2 >>/etc/hosts
```

If Kerberos is enabled for the HDFS storage engine, also add the IP of the corresponding KDC node to the cluster's `/etc/hosts` and install the Kerberos client:

```bash
echo 192.168.1.31 kdcserver >>/etc/hosts
yum install -y krb5-libs krb5-workstation
```

**2. Cluster configuration**

As the oushu user, create an `oushuhosts` file listing all machines in the OushuDB cluster:

```bash
touch ~/oushuhosts
```

Edit the `oushuhosts` file so it contains:

```
oushu1
oushu2
oushu3
oushu4
```

Exchange SSH keys with the cluster: `oushudb ssh-exkeys -f ~/oushuhosts`

Add this node's IP to the hosts files of the cluster nodes:

```
sudo su root
lava ssh -f ~/oushuhosts -e "echo '192.168.1.14 oushu4' >>/etc/hosts"
```

**3. Data directories**

Create the data directories for the main node:

```
mkdir -p /data1/oushudb/masterdd
mkdir -p /data1/oushudb/tmp
chown -R oushu:oushu /data1/oushudb
```

Create the directory where core files are generated:

```
mkdir -p /data1/oushudb/cores
chmod 777 /data1/oushudb/cores
```
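Before moving on to the configuration files, it can be worth confirming that host-name resolution and passwordless SSH from oushu4 to every machine listed in `oushuhosts` actually work, since the remaining steps rely on both. A minimal sketch, assuming it is run as the oushu user and that the `~/oushuhosts` file created above is in place:

```bash
# Attempt a non-interactive SSH login to every cluster host.
# BatchMode=yes makes ssh fail immediately instead of prompting for a password.
for host in $(cat ~/oushuhosts); do
    ssh -o BatchMode=yes "$host" hostname \
        || echo "WARNING: passwordless SSH to $host failed"
done
```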
**4. Modify configuration files**

Copy all configuration files from `/usr/local/oushu/conf/` on an existing main node and save them to the same path on this node.

Modify the oushudb-topology.yaml file. Assume it currently looks like this:

```yaml
nodes:
- id: m[001]
  addr: 192.168.1.11
  label: { region: "regionA", zone: "zoneA"}
- id: m[002]
  addr: 192.168.1.12
  label: { region: "regionA", zone: "zoneA"}
- id: m[003]
  addr: 192.168.1.13
  label: { region: "regionA", zone: "zoneA"}
vc:
- name: mains
  vci:
  - nodes: m[001], m[002]
- name: vc_default
  vci:
  - name: vci1
    nodes: m[001-002]
- name: vc1
  vci:
  - name: vci2
    nodes: m[003]
```

Add this node under `nodes`, and add it to the `vci` of `mains`:

```yaml
nodes:
- id: m[001]
  addr: 192.168.1.11
  label: { region: "regionA", zone: "zoneA"}
- id: m[002]
  addr: 192.168.1.12
  label: { region: "regionA", zone: "zoneA"}
- id: m[003]
  addr: 192.168.1.13
  label: { region: "regionA", zone: "zoneA"}
- id: m[004]
  addr: 192.168.1.14
  label: { region: "regionA", zone: "zoneA"}
vc:
- name: mains
  vci:
  - nodes: m[001], m[002], m[004]
- name: vc_default
  .....
```

Distribute oushudb-topology.yaml to the other nodes:

```
lava scp -f ~/oushuhosts /usr/local/oushu/conf/oushudb-topology.yaml =:/usr/local/oushu/conf/
```

## Initialization

Initialize this node:

```bash
oushudb init main
```

On the current node, reload oushudb-topology.yaml across the cluster:

```bash
oushudb reload cluster -a
oushudb reload vcluster -a
```

## Verification

Log in to the database and run the following check:

```psql
select * from gp_segment_configuration;
```

Check that there is an entry for the oushu4 node, and that its status is `u` and its vc is `mains`.
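For a narrower check, you can filter the catalog for the new node directly. The sketch below assumes the new node registers in `gp_segment_configuration` under the hostname `oushu4` and that a database named `postgres` is available; adjust both to match your environment:

```bash
# Hostname and database name below are assumptions; change them as needed.
psql -d postgres -c \
  "SELECT * FROM gp_segment_configuration WHERE hostname = 'oushu4';"
# Expect one row whose status is 'u' and whose vc is 'mains'.
```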