WIP: introduce VLAN for private networking #38

Draft · wants to merge 1 commit into base: main
Conversation

@displague (Member) commented Jul 10, 2024

This is a non-functional, work-in-progress attempt to change the network configuration of this module so that all nodes except the bastion have restricted access to the internet (relying on the bastion for forwarding). This change will remove the EM-assigned IPs from these nodes; they will instead rely on DHCP from the bastion, using a range of addresses assigned to an EM Gateway VRF.

Inspiration is drawn from other Equinix Metal Terraform modules, including https://github.com/equinix-labs/terraform-equinix-metal-nutanix-cluster
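The VRF-plus-gateway arrangement described above might be sketched roughly as follows. This is only an illustrative outline of the resource shapes involved, not this module's actual code; the resource names, `var.metro`, and `var.project_id` are assumptions:

```hcl
# Illustrative sketch: reserve a VRF address pool and attach a Metal
# Gateway to the VLAN so the bastion can serve DHCP/forwarding for
# the private nodes. Names and values are hypothetical.
resource "equinix_metal_vrf" "example" {
  name        = "openshift-vrf"
  description = "VRF for private node networking"
  metro       = var.metro
  local_asn   = 65000
  ip_ranges   = ["192.168.100.0/25"]
  project_id  = var.project_id
}

resource "equinix_metal_reserved_ip_block" "example" {
  project_id = var.project_id
  metro      = var.metro
  type       = "vrf"
  vrf_id     = equinix_metal_vrf.example.id
  cidr       = 25
  network    = "192.168.100.0"
}

resource "equinix_metal_vlan" "example" {
  project_id = var.project_id
  metro      = var.metro
}

resource "equinix_metal_gateway" "example" {
  project_id        = var.project_id
  vlan_id           = equinix_metal_vlan.example.id
  ip_reservation_id = equinix_metal_reserved_ip_block.example.id
}
```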

@@ -0,0 +1,59 @@
## template: jinja

TODO: create an equivalent RHEL 7 config

```yaml
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    COMMIT
- path: /etc/network/interfaces
```

This is certainly not correct for non-Debian-based systems.
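For context, a Debian-style `/etc/network/interfaces` for a bonded VLAN subinterface (the ifupdown format the path above implies) might look roughly like this. All names and addresses here are illustrative, not taken from the module:

```text
# Hypothetical Debian/ifupdown sketch for a bonded VLAN subinterface.
# RHEL-family systems use /etc/sysconfig/network-scripts/ifcfg-* instead.
auto bond0
iface bond0 inet manual
    bond-slaves eth0 eth1
    bond-mode active-backup
    bond-miimon 100

auto bond0.1000
iface bond0.1000 inet static
    address 192.168.100.2
    netmask 255.255.255.128
    vlan-raw-device bond0
```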

@displague (Member Author) commented Jul 10, 2024

As a starting point for changes to consider, ChatGPT suggests:

```yaml
...
- path: /etc/sysconfig/iptables
  content: |
    ... same
- path: /etc/sysconfig/ip6tables
  content: |
    ... same
- path: /etc/sysconfig/network-scripts/ifcfg-bond0.${metal_vlan_id}
  content: |
    DEVICE=bond0.${metal_vlan_id}
    BOOTPROTO=static
    ONBOOT=yes
    IPADDR=${address}
    NETMASK=${netmask}
    VLAN=yes
...
```

@displague (Member Author) commented Jul 10, 2024

With a little more prodding (again, this is just to serve as a pointer in the general direction of what may be needed):

```yaml
- path: /etc/sysconfig/network-scripts/ifcfg-bond0.${metal_vlan_id}
  content: |
    DEVICE=bond0.${metal_vlan_id}
    NAME=bond0.${metal_vlan_id}
    TYPE=Bond
    BONDING_MASTER=yes
    IPADDR=${address}
    NETMASK=${netmask}
    ONBOOT=yes
    BOOTPROTO=none
    BONDING_OPTS="mode=1 miimon=100"
    NM_CONTROLLED="no"
- path: /etc/sysconfig/network-scripts/ifcfg-eth0
  content: |
    DEVICE=eth0
    NAME=bond0-slave
    TYPE=Ethernet
    ONBOOT=yes
    BOOTPROTO=none
    MASTER=bond0
    SLAVE=yes
    NM_CONTROLLED="no"
- path: /etc/sysconfig/network-scripts/ifcfg-eth1
  content: |
    DEVICE=eth1
    NAME=bond0-slave
    TYPE=Ethernet
    ONBOOT=yes
    BOOTPROTO=none
    MASTER=bond0
    SLAVE=yes
    NM_CONTROLLED="no"
...
runcmd:
- modprobe bonding
- ifup eth0
- ifup eth1
- ifup bond0.${metal_vlan_id}
...
```

@displague (Member Author) commented Jul 10, 2024

I don't know whether we'll have eth0/eth1 names. On the bootstrap node this may be tricky unless we can disable predictable interface names for the first boot (the Arch guidance may be relevant: https://wiki.archlinux.org/title/Network_configuration#Revert_to_traditional_interface_names).

We'll also need to add kernel command-line args to the iPXE scripts so the nodes come up bonded, which may also need to include `net.ifnames=0` (e.g. `bootloader --location=mbr --append="net.ifnames=0 biosdevname=0"`).
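A rough sketch of what an `ipxe.tpl` fragment with those args might look like. The `bastion_ip` and `node_type` variables match the `templatefile` call elsewhere in this PR, but the paths and kickstart wiring are hypothetical:

```text
#!ipxe
# Hypothetical fragment: net.ifnames=0 biosdevname=0 keep the eth0/eth1
# names the bonding config expects. Paths are illustrative only.
kernel http://${bastion_ip}/vmlinuz net.ifnames=0 biosdevname=0 inst.ks=http://${bastion_ip}/${node_type}.ks
initrd http://${bastion_ip}/initrd.img
boot
```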

@displague (Member Author) commented Jul 10, 2024

Alternatively, enp0sX naming may work fine if we document the server plans (c3.medium, etc.) on which these names are known to work.

```hcl
port_id  = [for p in equinix_metal_device.lb.ports : p.id if p.name == "bond0"][0]
layer2   = false
bonded   = true
vlan_ids = [var.vlan]
```

Rename to `metal_vlan` for consistency.

```diff
@@ -72,7 +103,7 @@ resource "null_resource" "ipxe_files" {
 content = templatefile("${path.module}/assets/ipxe.tpl", {
   node_type  = each.value
-  bastion_ip = equinix_metal_device.lb.access_public_ipv4
+  bastion_ip = equinix_metal_device.lb.access_private_ipv4
```
@displague (Member Author) commented Jul 10, 2024

`access_private_ipv4` needs to be replaced with an address from the VRF range, 192.168.100.2.
(Every mention of `access_private_ipv4` should be replaced, since we are not using the EM private management network; we are using VRF over VLAN 1000 instead.)
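One way to derive that fixed VRF address in Terraform, rather than hard-coding it, is `cidrhost()`. This is a sketch with assumed local names, not the module's code:

```hcl
# Hypothetical sketch: compute the bastion's VRF address from the range
# instead of reading access_private_ipv4.
locals {
  vrf_cidr   = "192.168.100.0/25"
  bastion_ip = cidrhost(local.vrf_cidr, 2) # 192.168.100.2
}
```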


```hcl
resource "equinix_metal_vrf" "openshift" {
  count       = var.create_vrf ? 1 : 0
  description = "VRF with ASN 65000 and a pool of address space that includes 192.168.100.0/25"
```

Suggested change:

```diff
-description = "VRF with ASN 65000 and a pool of address space that includes 192.168.100.0/25"
+description = "VRF with ASN 65000 and a pool of address space"
```

```hcl
@@ -0,0 +1,11 @@
output "vrf_id" {
```
@displague (Member Author) commented Jul 10, 2024

This entire module needs a `terraform fmt -recursive`, for existing code too.
Opting to do that in a separate PR to avoid noise.

```hcl
required_providers {
  equinix = {
    source  = "equinix/equinix"
    version = "~> 1.14"
```

Consider bumping all Equinix provider version deps to `~> 2`.

```hcl
@@ -8,5 +8,9 @@ resource "equinix_metal_device" "node" {
  count         = var.node_count
  billing_cycle = "hourly"
  project_id    = var.project_id
  ip_address {
```

This should be removed. The initial thought was that all nodes could use the EM private IPs, but the need to manage DHCP and configure a gateway meant that a VRF, with DHCP services running on the bastion, would be a better fit.
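For illustration, DHCP on the bastion could be served by something like dnsmasq over the VLAN subinterface. This is a sketch only; the interface name, range, and gateway address are assumptions, not part of the module:

```text
# Hypothetical /etc/dnsmasq.d/vrf.conf on the bastion: serve DHCP from
# the VRF range to nodes on the VLAN. All values are illustrative.
interface=bond0.1000
dhcp-range=192.168.100.10,192.168.100.120,255.255.255.128,12h
dhcp-option=option:router,192.168.100.2
```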

```diff
@@ -1,5 +1,5 @@
 output "node_ip" {
-  value = equinix_metal_device.node.*.access_public_ipv4
+  value = equinix_metal_device.node.*.access_private_ipv4
```

This may pose a problem. `access_private_ipv4` should certainly be removed, but we cannot depend on `access_public_ipv4` either, since the nodes will not have that address. As far as I recall, `node_ip` is only used for DNS, and that is used within the cluster. Either we don't need to assign these addresses in DNS (they may be optional in the OpenShift install guidance), or we'll need to pin the DHCP-assigned addresses so that we can predict each node's IP for assignment to its DNS name.
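Pinning could look like dnsmasq static leases keyed on MAC address, so each node's IP is predictable and can be mapped to its DNS name. The MACs, hostnames, and addresses below are purely illustrative:

```text
# Hypothetical sketch: static DHCP leases plus matching DNS records so
# node IPs are predictable ahead of time.
dhcp-host=aa:bb:cc:dd:ee:01,master-0,192.168.100.10
dhcp-host=aa:bb:cc:dd:ee:02,master-1,192.168.100.11
host-record=master-0.example.com,192.168.100.10
```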

```hcl
@@ -32,6 +32,11 @@ variable "metal_project_id" {
  description = "Your Equinix Metal Project ID"
}

variable "metal_vlan" {
```

We'll want to allow for existing VLANs to be plumbed in. The network module already supports that.
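A common Terraform pattern for plumbing in an existing VLAN is to make creation conditional on the variable being unset. This is a sketch with assumed names, not what the network module actually does:

```hcl
# Hypothetical sketch: use an existing VLAN when an id is supplied,
# otherwise create one. Variable and resource names are illustrative.
variable "metal_vlan_id" {
  description = "Existing VLAN id to use; leave null to create one"
  type        = string
  default     = null
}

resource "equinix_metal_vlan" "this" {
  count      = var.metal_vlan_id == null ? 1 : 0
  project_id = var.project_id
  metro      = var.metro
}

locals {
  vlan_id = var.metal_vlan_id != null ? var.metal_vlan_id : equinix_metal_vlan.this[0].id
}
```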
