TL;DR Keep control of the entire cluster pool of IPs from your networking plane. Avoid potential IP conflicts and streamline automated deployments with DHCP-managed, albeit statically reserved, assignments.
ORIGINAL POST DHCP setup of a cluster
PVE static network configuration ^ is not actually a real prerequisite, not even for clusters. The intended use case for this guide is to cover a rather stable environment, but allow for centralised management.
CAUTION While it actually is possible to change IPs or hostnames without a reboot (more on that below), you WILL suffer from the same issues as with static network configuration in terms of managing the transition.
IMPORTANT This guide assumes that the nodes satisfy all of the below requirements, at the latest before you start adding them to the cluster and at all times after. The nodes:
- have reserved their IP address at the DHCP server;
- obtain a reasonable lease time for those IPs;
- get the nameserver handed out via DHCP Option 6; and
- can reliably resolve their hostname via DNS lookup.
TIP There is also a much simpler guide for single node DHCP setups which does not pose any special requirements.
Taking dnsmasq ^ as an example, you will need at least the equivalent of the following (excerpt):

```
dhcp-range=set:DEMO_NET,10.10.10.100,10.10.10.199,255.255.255.0,1d
domain=demo.internal,10.10.10.0/24,local
dhcp-option=tag:DEMO_NET,option:domain-name,demo.internal
dhcp-option=tag:DEMO_NET,option:router,10.10.10.1
dhcp-option=tag:DEMO_NET,option:dns-server,10.10.10.11
dhcp-host=aa:bb:cc:dd:ee:ff,set:DEMO_NET,10.10.10.101
host-record=pve1.demo.internal,10.10.10.101
```
There are appliance-like solutions, e.g. VyOS, ^ that allow for this in an error-proof way.
Some tools that will help with troubleshooting during the deployment:
```
ip -c a
```

should reflect the dynamically assigned IP address (excerpt):

```
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether aa:bb:cc:dd:ee:ff brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.101/24 brd 10.10.10.255 scope global dynamic enp1s0
```
```
hostnamectl
```

checks the hostname; if the static one is unset or set to localhost, the transient one is decisive (excerpt):

```
Static hostname: (unset)
Transient hostname: pve1
```
```
dig nodename
```

confirms correct DNS name lookup (excerpt):

```
;; ANSWER SECTION:
pve1.    50    IN    A    10.10.10.101
```
```
hostname -I
```

can essentially verify that all is well, in the same way the official docs actually suggest.
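The checks above can be tied together in one go. Below is a minimal, hedged sketch (not official tooling) comparing the transient hostname, its name lookup, and the first address actually assigned to the node:

```shell
# Sanity check: does the hostname's lookup match the assigned address?
name=$(hostname)
dns_ip=$(getent hosts "$name" 2>/dev/null | awk '{print $1; exit}')
have_ip=$(hostname -I 2>/dev/null | awk '{print $1}')
echo "hostname=$name lookup=${dns_ip:-none} assigned=${have_ip:-none}"
if [ -n "$dns_ip" ] && [ "$dns_ip" = "$have_ip" ]; then
    echo "OK: lookup matches the assigned address"
else
    echo "CHECK: lookup and assigned address differ (or are missing)"
fi
```

Note that getent honours nsswitch.conf ordering, so this reflects what corosync will actually resolve, not just raw DNS.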
You may use either of the two manual installation methods. Unattended install is out of scope here.
The ISO installer ^ leaves you with a static configuration. Change this by editing /etc/network/interfaces - your vmbr0 stanza will look like this (excerpt):

```
iface vmbr0 inet dhcp
        bridge-ports enp1s0
        bridge-stp off
        bridge-fd 0
```
Remove the FQDN hostname entry from /etc/hosts and remove the /etc/hostname file. Reboot.
See below for more details.
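The interfaces edit can be sketched as follows - demonstrated on a sample copy rather than the live /etc/network/interfaces (the vmbr0/enp1s0 names and the sample static stanza are assumptions matching the excerpt above):

```shell
# Sample of a static stanza as left by the ISO installer:
cat > /tmp/interfaces.sample <<'EOF'
auto vmbr0
iface vmbr0 inet static
        address 10.10.10.101/24
        gateway 10.10.10.1
        bridge-ports enp1s0
        bridge-stp off
        bridge-fd 0
EOF
# Switch the stanza to DHCP and drop the now-superfluous address/gateway:
sed -i -e 's/^iface vmbr0 inet static$/iface vmbr0 inet dhcp/' \
       -e '/^[[:space:]]*address /d' \
       -e '/^[[:space:]]*gateway /d' /tmp/interfaces.sample
cat /tmp/interfaces.sample
```

On the real node, apply the same edit to /etc/network/interfaces (as root) and reboot.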
There is an official Debian installation walkthrough; ^ simply skip the initial (static) part, i.e. install plain Debian (i.e. with DHCP). You can fill in any hostname (even localhost) and any domain (or no domain at all) in the installer.
After the installation, upon the first boot, remove the static hostname file:
```
rm /etc/hostname
```

The static hostname will be unset and the transient one will start showing in the hostnamectl output.
NOTE If your initially chosen hostname was localhost, you could actually get away with keeping this file populated.
It is also necessary to remove the 127.0.1.1 hostname entry from /etc/hosts.
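A hedged sketch of that removal, demonstrated on a sample copy (the hostname in the sample is an assumption; on the node itself, apply the same filter to /etc/hosts as root):

```shell
# Sample of an /etc/hosts as left by the Debian installer:
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1       localhost
127.0.1.1       pve1.demo.internal pve1
::1     localhost ip6-localhost ip6-loopback
EOF
# Drop the 127.0.1.1 line; non-loopback lookups are then handled by DNS:
grep -v '^127\.0\.1\.1[[:space:]]' /tmp/hosts.sample > /tmp/hosts.cleaned
cat /tmp/hosts.cleaned
```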
Your /etc/hosts will be plain, like this:

```
127.0.0.1 localhost
# NOTE: Non-loopback lookup managed via DNS
# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
```
This is also where you should actually start the official guide - “Install Proxmox VE”. ^
TIP This guide may ALSO be used to set up a SINGLE NODE. Simply do NOT follow the instructions beyond this point.
This part logically follows the manual installs.
Unfortunately, PVE tooling populates the cluster configuration (corosync.conf) ^ with resolved IP addresses upon its inception.
Creating a cluster from scratch:
```
pvecm create demo-cluster
```

```
Corosync Cluster Engine Authentication key generator.
Gathering 2048 bits for key from /dev/urandom.
Writing corosync key to /etc/corosync/authkey.
Writing corosync config to /etc/pve/corosync.conf
Restart corosync and cluster filesystem
```
While all is well, the hostname got resolved and put into cluster configuration as an IP address:
```
cat /etc/pve/corosync.conf
```

```
logging {
  debug: off
  to_syslog: yes
}

nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.101
  }
}

quorum {
  provider: corosync_votequorum
}

totem {
  cluster_name: demo-cluster
  config_version: 1
  interface {
    linknumber: 0
  }
  ip_version: ipv4-6
  link_mode: passive
  secauth: on
  version: 2
}
```
This will of course work just fine, but it defeats the purpose. You may choose to do the following now (one by one, as nodes are added), or you may defer the repetitive work until you have gathered all nodes for your cluster. The below demonstrates the former.
All there is to do is to replace the ringX_addr with the hostname. The official docs ^ are rather opinionated about how such edits should be performed.
CAUTION Be sure to include the domain as well in case your nodes do not share one. Do NOT change the name entry for the node.
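The edit itself can be sketched as below - shown on a sample snippet in /tmp (file path and the sample contents are illustrative assumptions). The official procedure amounts to editing a COPY of /etc/pve/corosync.conf, bumping config_version, and only then moving it into place:

```shell
# Sample snippet of a freshly created corosync.conf:
cat > /tmp/corosync.conf.new <<'EOF'
nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.10.10.101
  }
}
totem {
  config_version: 1
}
EOF
# Replace the resolved IP with the node's FQDN and bump config_version:
sed -i -e 's/ring0_addr: 10\.10\.10\.101/ring0_addr: pve1.demo.internal/' \
       -e 's/config_version: 1/config_version: 2/' /tmp/corosync.conf.new
cat /tmp/corosync.conf.new
# On a live node, the final step would be (as root):
# mv /tmp/corosync.conf.new /etc/pve/corosync.conf
```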
At any point, you may check journalctl -u pve-cluster to see that all went well:

```
[dcdb] notice: wrote new corosync config '/etc/corosync/corosync.conf' (version = 2)
[status] notice: update cluster info (cluster name demo-cluster, version = 2)
```
Now, when you are going to add a second node to the cluster (in the CLI, this is done, counter-intuitively, from the to-be-added node, referencing a node already in the cluster):
```
pvecm add pve1.demo.internal
```

```
Please enter superuser (root) password for 'pve1.demo.internal': **********
Establishing API connection with host 'pve1.demo.internal'
The authenticity of host 'pve1.demo.internal' can't be established.
X509 SHA256 key fingerprint is 52:13:D6:A1:F5:7B:46:F5:2E:A9:F5:62:A4:19:D8:07:71:96:D1:30:F2:2E:B7:6B:0A:24:1D:12:0A:75:AB:7E.
Are you sure you want to continue connecting (yes/no)? yes
Login succeeded.
check cluster join API version
No cluster network links passed explicitly, fallback to local node IP '10.10.10.102'
Request addition of this node
cluster: warning: ring0_addr 'pve1.demo.internal' for node 'pve1' resolves to '10.10.10.101' - consider replacing it with the currently resolved IP address for stability
Join request OK, finishing setup locally
stopping pve-cluster service
backup old database to '/var/lib/pve-cluster/backup/config-1726922870.sql.gz'
waiting for quorum...OK
(re)generate node files
generate new node certificate
merge authorized SSH keys
generated new node certificate, restart pveproxy and pvedaemon services
successfully added node 'pve2' to cluster.
```
It hints at using the resolved IP as a static entry (fallback to local node IP '10.10.10.102') for this action (despite the hostname having been provided), and indeed you would have to change this second incarnation of corosync.conf again.
So your nodelist (after the second change) should look like this:
```
nodelist {
  node {
    name: pve1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: pve1.demo.internal
  }
  node {
    name: pve2
    nodeid: 2
    quorum_votes: 1
    ring0_addr: pve2.demo.internal
  }
}
```
NOTE If you wonder about the warnings on “stability” and how corosync actually supports resolving names, you may wish to consult ^ (excerpt):
```
ADDRESS RESOLUTION

corosync resolves ringX_addr names/IP addresses using the getaddrinfo(3)
call with respect of totem.ip_version setting.

getaddrinfo() function uses a sophisticated algorithm to sort node
addresses into a preferred order and corosync always chooses the first
address in that list of the required family. As such it is essential that
your DNS or /etc/hosts files are correctly configured so that all
addresses for ringX appear on the same network (or are reachable with
minimal hops) and over the same IP protocol.
```

CAUTION At this point, it is suitable to point out the importance of the ip_version parameter (it defaults to ipv6-4 when unspecified, but PVE actually populates it as ipv4-6), ^ but also the configuration of hosts in nsswitch.conf. ^
You may want to check whether everything is well with your cluster at this point, either with pvecm status ^ or the generic corosync-cfgtool. Note you will still see IP addresses and IDs in this output, as they got resolved.
Particularly useful to check at any time is netstat (you may need to install net-tools):

```
netstat -pan | egrep '5405.*corosync'
```
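If you prefer not to install net-tools, the same view is available via ss from iproute2. A hedged equivalent (with a fallback message so the pipeline does not simply fail when corosync is not running or not bound):

```shell
# Show corosync's UDP socket on port 5405, or say so if there is none:
ss -pan 2>/dev/null | grep -E '5405.*corosync' || echo "no corosync socket on :5405"
```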
This is especially true if you are wondering why your node is missing from the cluster. Why could this happen? If you have, e.g., improperly configured DHCP and your node suddenly gets a new IP leased, corosync will NOT automatically take this into account:
```
DHCPREQUEST for 10.10.10.103 on vmbr0 to 10.10.10.11 port 67
DHCPNAK from 10.10.10.11
DHCPDISCOVER on vmbr0 to 255.255.255.255 port 67 interval 4
DHCPOFFER of 10.10.10.113 from 10.10.10.11
DHCPREQUEST for 10.10.10.113 on vmbr0 to 255.255.255.255 port 67
DHCPACK of 10.10.10.113 from 10.10.10.11
bound to 10.10.10.113 -- renewal in 57 seconds.
[KNET  ] link: host: 2 link: 0 is down
[KNET  ] link: host: 1 link: 0 is down
[KNET  ] host: host: 2 (passive) best link: 0 (pri: 1)
[KNET  ] host: host: 2 has no active links
[KNET  ] host: host: 1 (passive) best link: 0 (pri: 1)
[KNET  ] host: host: 1 has no active links
[TOTEM ] Token has not been received in 2737 ms
[TOTEM ] A processor failed, forming new configuration: token timed out (3650ms), waiting 4380ms for consensus.
[QUORUM] Sync members[1]: 3
[QUORUM] Sync left[2]: 1 2
[TOTEM ] A new membership (3.9b) was formed. Members left: 1 2
[TOTEM ] Failed to receive the leave message. failed: 1 2
[QUORUM] This node is within the non-primary component and will NOT provide any services.
[QUORUM] Members[1]: 3
[MAIN  ] Completed service synchronization, ready to provide service.
[status] notice: node lost quorum
[dcdb] notice: members: 3/1080
[status] notice: members: 3/1080
[dcdb] crit: received write while not quorate - trigger resync
[dcdb] crit: leaving CPG group
```
This is because corosync still has its link bound to the old IP. What is worse, however, even if you restart the corosync service on the affected node, it will NOT be sufficient; the remaining cluster nodes will keep rejecting its traffic with:
```
[KNET  ] rx: Packet rejected from 10.10.10.113:5405
```
It is necessary to restart corosync on ALL nodes to get them back into (eventually) the primary component of the cluster. Finally, you ALSO need to restart the pve-cluster service on the affected node (only).
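From an admin host, that recovery sequence might be scripted roughly as below. Node names pve1..pve3 and root SSH access are assumptions; it is shown as a dry-run (the commands are printed via echo, not executed), so remove the echo once you have verified it matches your cluster:

```shell
# Dry-run: print the restart commands instead of executing them.
for n in pve1 pve2 pve3; do
  echo ssh "root@$n" systemctl restart corosync
done
# pve3 stands in for the affected node, which alone needs pve-cluster restarted:
echo ssh root@pve3 systemctl restart pve-cluster
```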
TIP If you see a wrong IP address even after the restart, and you have all the correct configuration in corosync.conf, you need to troubleshoot starting with journalctl -t dhclient (and checking the DHCP server configuration if necessary), but eventually you may even need to check nsswitch.conf ^ and gai.conf. ^