‘authentication key already exists’ error when adding a Proxmox node to a cluster


Today I am not writing about Arch Linux but about Proxmox VE. After rebooting one of the cluster’s nodes, I found it had lost all its network configuration thanks to Debian’s horrible and broken apt autoremove feature… Once you are used to pacman, it is clear that apt needs a major rewrite to escape the dependency hell it cannot seem to leave.

Returning to the topic: if you need to add or re-add a node to an existing cluster, run this command on the node you want to add:

# pvecm add clustered_node_IP_or_name

When the node is added for the first time, the usual behavior is to copy the keys from the cluster node to the new node, modify cluster.conf to add an entry for the new node, and then start all the related daemons, such as cman or rgmanager.

But if you are adding this node again, you will probably end up with this error:

# pvecm add clustered_node_IP_or_name
authentication key already exists

I searched the Internet for this message, and many people ended up reinstalling the conflicting node. That is not a good solution at all, so I tried to find a better one.

Obviously, the key for that node is somewhere in the current cluster configuration. After spending some time searching for it all over the system, I decided to use a trick that takes advantage of the key already being in the configuration.

So, the first thing to do is modify cluster.conf manually and add this node. In Proxmox, we copy /etc/pve/cluster.conf to a file called /etc/pve/cluster.conf.new and edit that copy:

# cp /etc/pve/cluster.conf /etc/pve/cluster.conf.new
# nano /etc/pve/cluster.conf.new

<?xml version="1.0"?>
<cluster name="pvecluster" config_version="5">

<cman keyfile="/var/lib/pve-cluster/corosync.authkey">
</cman>

<clusternodes>
<clusternode name="pve01" votes="1" nodeid="1"/>
<clusternode name="pve02" votes="1" nodeid="2"/>
<clusternode name="quormox" votes="1" nodeid="3"/>
</clusternodes>

</cluster>

We need to increase the config_version value by one, and then add the line <clusternode name="quormox" votes="1" nodeid="3"/> with the desired name and ID.
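The version bump can also be scripted. This is only a minimal sketch: bump_config_version is a hypothetical helper of mine, not a Proxmox tool, and it assumes the file contains a single config_version="N" attribute as in the listing above. The clusternode line still has to be added by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of Proxmox): copy a cluster.conf to a new
# file with config_version increased by one.
bump_config_version() {
    src=$1
    dst=$2
    # Read the current number out of config_version="N".
    current=$(sed -n 's/.*config_version="\([0-9][0-9]*\)".*/\1/p' "$src" | head -n 1)
    if [ -z "$current" ]; then
        echo "no config_version found in $src" >&2
        return 1
    fi
    next=$((current + 1))
    # Write the copy with only the version attribute changed.
    sed "s/config_version=\"$current\"/config_version=\"$next\"/" "$src" > "$dst"
}

# Example with the paths used in this post:
# bump_config_version /etc/pve/cluster.conf /etc/pve/cluster.conf.new
```

The original file is left untouched, so the copy can be reviewed in nano before activating it.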

Then, in the Proxmox GUI, under the HA tab, we press “Activate”.

After that, the changes show up with the new node included.

Now we need to copy all the required files to the node we want to add. First we delete the existing files and folders on that node (make a backup first, just in case), and this has to be done in the order shown. The commands to type are the lines that begin with the root@quormox prompt; everything else is output. In my example, the node I want to add is called quormox and the node with the working configuration is pve01. I also removed all references to quormox from .ssh/known_hosts on every node in the cluster.
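For the backup, something like the following would do. It is a rough sketch, not what I actually ran: backup_cluster_state is a hypothetical helper name, and the destination directory in the example is just an assumption.

```shell
#!/bin/sh
# Hypothetical helper: archive the given paths into a timestamped tarball
# inside a destination directory, and print the archive name on success.
backup_cluster_state() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    archive="$dest/cluster-state-$(date +%Y%m%d-%H%M%S).tar.gz"
    # "$@" now holds only the paths to save.
    tar -czf "$archive" "$@" && echo "$archive"
}

# Example on quormox, before running the rm commands (destination is an assumption):
# backup_cluster_state /root/cluster-backup /etc/cluster/cluster.conf /var/lib/pve-cluster
```

Keeping the tarball outside /var/lib/pve-cluster matters here, because that folder is wiped in the steps that follow.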

root@quormox:~# /etc/init.d/pve-cluster stop
Stopping pve cluster filesystem: pve-cluster.
root@quormox:~# umount /etc/pve
umount: /etc/pve: not mounted
root@quormox:~# /etc/init.d/cman stop
Stopping cluster:
Stopping dlm_controld... [  OK  ]
Stopping fenced... [  OK  ]
Stopping cman... [  OK  ]
Unloading kernel modules... [  OK  ]
Unmounting configfs... [  OK  ]
root@quormox:~# rm /etc/cluster/cluster.conf
root@quormox:~# rm -rf /var/lib/pve-cluster/*
root@quormox:~# scp pve01:/etc/cluster/cluster.conf /etc/cluster/
The authenticity of host 'pve01 (192.168.96.11)' can't be established.
ECDSA key fingerprint is 89:02:2e:79:f3:2a:54:30:2d:78:a8:9c:2c:55:03:e5.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'pve01,192.168.96.11' (ECDSA) to the list of known hosts.
root@pve01's password:
cluster.conf                                                100%  340     0.3KB/s   00:00
root@quormox:~# mkdir -p /var/lib/pve-cluster
root@quormox:~# scp pve01:/var/lib/pve-cluster/* /var/lib/pve-cluster/
root@pve01's password:
config.db                                                   100%   64KB  64.0KB/s   00:00
config.db-shm                                               100%   32KB  32.0KB/s   00:00
config.db-wal                                               100% 1028KB   1.0MB/s   00:00
corosync.authkey                                            100%  128     0.1KB/s   00:00
root@quormox:~# /etc/init.d/pve-cluster start
Starting pve cluster filesystem : pve-cluster

After that, a reboot is needed to start all the daemons in the right order. Once rebooted, the node is correctly added to the cluster! 😀
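To double-check the result after the reboot, a quick test like this could be used. It is only a sketch: node_in_cluster_conf is a hypothetical helper that looks for the clusternode entry in a cluster.conf file, and it assumes the entry is written as in the listing above. The pvecm status and pvecm nodes commands in the comments are the usual way to see quorum and membership.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given node name has a <clusternode>
# entry in the given cluster.conf file.
node_in_cluster_conf() {
    node=$1
    conf=$2
    grep -q "<clusternode name=\"$node\"" "$conf"
}

# Example on the re-added node, after the reboot:
# node_in_cluster_conf quormox /etc/pve/cluster.conf && echo "quormox is in cluster.conf"
# pvecm status
# pvecm nodes
```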

 
