Today I'm not writing about Arch Linux but about Proxmox VE: after rebooting one of my cluster's nodes, I found it had lost all of its network configuration thanks to Debian's horrible and broken apt autoremove feature… Once you are used to pacman, it is clear that apt needs a major rewrite to escape the dependency hell it is stuck in.
Back to the topic: if you need to add or re-add a node to an existing cluster, run this command from the node you want to add:
# pvecm add clustered_node_IP_or_name
When a node is added for the first time, the usual behavior is to copy the keys from the cluster node to the new node, modify cluster.conf to add an entry for the new node, and then start all related daemons, such as cman or rgmanager.
But if you are adding the node again, you will probably end up with this error:
# pvecm add clustered_node_IP_or_name
authentication key already exists
I searched the Internet for this message and found that many people ended up reinstalling the conflicting node. That is not a good solution at all, so I tried to find a better one.
Obviously, the key for that node is somewhere in the current cluster configuration. After some time searching for it everywhere on the system, I decided to use a trick that takes advantage of the key already being in the configuration.
So the first thing to do is to edit cluster.conf manually and add the node. In Proxmox, that means copying /etc/pve/cluster.conf to a file called /etc/pve/cluster.conf.new and editing the copy:
# cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
# nano /etc/pve/cluster.conf.new
<cluster name="pvecluster" config_version="5">
<clusternode name="pve01" votes="1" nodeid="1"/>
<clusternode name="pve02" votes="1" nodeid="2"/>
<clusternode name="quormox" votes="1" nodeid="3"/>
We need to increase the config_version value by one, and then add the line <clusternode name="quormox" votes="1" nodeid="3"/> with the desired name and ID.
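The version bump described above can also be scripted, which avoids forgetting it. This is only a minimal sketch: bump_config_version is a hypothetical helper name, not part of Proxmox, and it assumes the attribute is written with straight quotes as config_version="N". Adding the clusternode line is still done by hand.

```shell
#!/bin/sh
# bump_config_version FILE  (hypothetical helper, not a Proxmox tool)
# Increments the config_version="N" attribute in FILE and prints the new value.
bump_config_version() {
    file=$1
    # Read the current version number from the first line that carries it.
    current=$(sed -n 's/.*config_version="\([0-9][0-9]*\)".*/\1/p' "$file" | head -n 1)
    if [ -z "$current" ]; then
        echo "config_version not found in $file" >&2
        return 1
    fi
    next=$((current + 1))
    tmp=$(mktemp) || return 1
    # Write the result to a temp file first, then overwrite the original's contents.
    sed "s/config_version=\"$current\"/config_version=\"$next\"/" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    [ "$status" -eq 0 ] && echo "$next"
}

# Usage with the paths from this post:
#   cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
#   bump_config_version /etc/pve/cluster.conf.new
```

The helper only touches the version number, so the edited copy should still be reviewed before it is activated.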
Then, in the Proxmox GUI, under the HA tab, we press "Activate", as shown below.
We will then see the changes, including the new node.
Now we need to copy all the required files to the node we want to add. First we delete the existing folders on that node (make a backup first, just in case), and this has to be done in a specific order: follow the commands typed at the root@quormox prompt below; the other lines are their output. In my example, the node I want to add is called quormox and the node with the working configuration is pve01. I also removed all references to quormox from .ssh/known_hosts on every node in the cluster.
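For the backup and the known_hosts cleanup mentioned above, something like the following could be used. It is a sketch under assumptions: backup_cluster_state is a hypothetical helper name and /root/cluster-backup is an arbitrary destination I picked for illustration; only the two source paths come from this post.

```shell
#!/bin/sh
# backup_cluster_state DEST_DIR PATH...  (hypothetical helper)
# Packs the given paths into a timestamped tar.gz inside DEST_DIR
# and prints the name of the archive it created.
backup_cluster_state() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    archive="$dest/cluster-state-$(date +%Y%m%d-%H%M%S).tar.gz"
    tar -czf "$archive" "$@" || return 1
    echo "$archive"
}

# On the node being re-added, before deleting anything
# (destination directory is an assumption):
#   backup_cluster_state /root/cluster-backup /etc/cluster/cluster.conf /var/lib/pve-cluster
#
# On every other cluster node, drop the stale host key for quormox:
#   ssh-keygen -R quormox
```

ssh-keygen -R only removes entries for the exact name given, so it has to be repeated for the node's IP address if that was recorded as well.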
root@quormox:~# /etc/init.d/pve-cluster stop
Stopping pve cluster filesystem: pve-cluster.
root@quormox:~# umount /etc/pve
umount: /etc/pve: not mounted
root@quormox:~# /etc/init.d/cman stop
Stopping dlm_controld… [ OK ]
Stopping fenced… [ OK ]
Stopping cman… [ OK ]
Unloading kernel modules… [ OK ]
Unmounting configfs… [ OK ]
root@quormox:~# rm /etc/cluster/cluster.conf
root@quormox:~# rm -rf /var/lib/pve-cluster/*
root@quormox:~# scp pve01:/etc/cluster/cluster.conf /etc/cluster/
The authenticity of host 'pve01 (192.168.96.11)' can't be established.
ECDSA key fingerprint is 89:02:2e:79:f3:2a:54:30:2d:78:a8:9c:2c:55:03:e5.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'pve01,192.168.96.11' (ECDSA) to the list of known hosts.
cluster.conf 100% 340 0.3KB/s 00:00
root@quormox:~# mkdir -p /var/lib/pve-cluster
root@quormox:~# scp pve01:/var/lib/pve-cluster/* /var/lib/pve-cluster/
config.db 100% 64KB 64.0KB/s 00:00
config.db-shm 100% 32KB 32.0KB/s 00:00
config.db-wal 100% 1028KB 1.0MB/s 00:00
corosync.authkey 100% 128 0.1KB/s 00:00
root@quormox:~# /etc/init.d/pve-cluster start
Starting pve cluster filesystem : pve-cluster
After that, a reboot is needed to start all the daemons in the right order. Once rebooted, the node is correctly added to the cluster! 😀
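To double-check the result after the reboot, a quick test like this might help. It is only a sketch: node_in_conf is a hypothetical helper that looks for a clusternode entry written with straight quotes, and it says nothing about quorum, which is why the pvecm calls are listed as the real check.

```shell
#!/bin/sh
# node_in_conf NAME FILE  (hypothetical helper)
# Succeeds if FILE contains a <clusternode name="NAME" ...> entry.
node_in_conf() {
    grep -q "<clusternode name=\"$1\"" "$2"
}

# On the re-added node, with the names from this post:
#   node_in_conf quormox /etc/cluster/cluster.conf && echo "quormox is listed"
#   pvecm nodes     # the node should appear in the member list
#   pvecm status    # shows quorum and membership information
```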