Basic docs for A2 release
Frostman committed Dec 22, 2023
1 parent bba8110 commit afdd5dd
Showing 44 changed files with 1,725 additions and 171 deletions.
7 changes: 4 additions & 3 deletions docs/.pages
@@ -1,12 +1,13 @@
# https://github.com/lukasgeiter/mkdocs-awesome-pages-plugin
nav:
- index.md
- getting-started
- concepts
- Wiring Diagram: wiring
- Concepts: concepts
- Getting Started: getting-started
- Virtual Lab (VLAB): vlab
- Install & Upgrade: install-upgrade
- User Guide: user-guide
- Reference: reference
- Architecture: architecture
- Troubleshooting: troubleshooting
- ...
- release-notes
4 changes: 4 additions & 0 deletions docs/architecture/.pages
@@ -0,0 +1,4 @@
nav:
- Overview: overview.md
- Fabric Implementation: fabric.md
- ...
File renamed without changes
File renamed without changes
92 changes: 92 additions & 0 deletions docs/architecture/fabric.md
@@ -0,0 +1,92 @@
# Hedgehog Network Fabric

The Hedgehog Open Network Fabric is an open source network architecture that provides connectivity between virtual and
physical workloads and a way to isolate different groups of workloads from each other using standard
BGP EVPN and VXLAN technology. The fabric provides standard Kubernetes interfaces to manage the elements of the
physical network and a mechanism to configure virtual networks and define attachments to them.
The Hedgehog Fabric isolates different groups of workloads by placing them in different virtual
networks called VPCs. To achieve this we define several abstractions, starting from the physical network, where a
`Connection` defines how a physical server on the network connects to a physical switch on the fabric.

## Underlay Network

The Hedgehog Fabric currently supports two underlay network topologies.

### Collapsed Core

A collapsed core topology is just a pair of switches connected in an MCLAG configuration with no other network elements.
All workloads attach to these two switches.

![image](./fabric-collapsedcore.png)

The leaves in this setup are configured as an MCLAG pair, and servers can either be connected to both switches as
an MCLAG port channel or as orphan ports connected to only one switch. Both leaves peer with external networks using
BGP and act as the gateway for the workloads attached to them. The underlay configuration in the collapsed core is very
simple, which makes it ideal for very small deployments.

### Spine - Leaf

A spine-leaf topology is a standard Clos network with workloads attaching to leaf switches and spines providing
connectivity between different leaves.

![image](./fabric-spineleaf.png)

This kind of topology is useful for bigger deployments and provides all the advantages of a typical Clos network.
The underlay network is established using eBGP, where each leaf has a separate ASN and peers with all spines in the
network. We used [RFC7938](https://datatracker.ietf.org/doc/html/rfc7938) as the reference for establishing the
underlay network.
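
As a rough illustration of this peering pattern, the Python sketch below enumerates the eBGP sessions of a small spine-leaf fabric. It is not fabric code: the switch names, the ASN values, and the choice to give all spines one shared ASN are assumptions made for the example. The text above only states that each leaf has its own ASN and peers with all spines.

```python
from itertools import product

SPINE_ASN = 65100      # assumption: all spines share one private ASN
LEAF_ASN_BASE = 65101  # assumption: leaf ASNs are numbered upward from here


def underlay_sessions(spines, leaves):
    """Return one eBGP session per (leaf, spine) pair, with a unique ASN per leaf."""
    leaf_asn = {leaf: LEAF_ASN_BASE + i for i, leaf in enumerate(leaves)}
    return [
        {"leaf": leaf, "leaf_asn": leaf_asn[leaf], "spine": spine, "spine_asn": SPINE_ASN}
        for leaf, spine in product(leaves, spines)
    ]


# Hypothetical two-spine, three-leaf fabric.
sessions = underlay_sessions(["spine-1", "spine-2"], ["leaf-1", "leaf-2", "leaf-3"])
for s in sessions:
    print(f'{s["leaf"]} (AS{s["leaf_asn"]}) <-> {s["spine"]} (AS{s["spine_asn"]})')
```

With two spines and three leaves the sketch yields six sessions, which shows how the session count grows with the product of spines and leaves.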

## Overlay Network

The overlay network runs on top of the underlay network to create a virtual network. The overlay network isolates control
and data plane traffic between different virtual networks and the underlay network. Virtualization is achieved in the
Hedgehog Fabric by encapsulating workload traffic over VXLAN tunnels that are sourced and terminated on the leaf switches
in the network. The fabric uses BGP EVPN/VXLAN to enable creation and management of virtual networks on top of the
underlay. The fabric supports multiple virtual networks over the same underlay network to support multi-tenancy. Each
virtual network in the Hedgehog Fabric is identified by a VPC. The following sections give a high-level overview of
how VPCs and their associated objects are implemented in the Hedgehog Fabric.
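
To make the encapsulation more concrete, here is a minimal Python sketch of the 8-byte VXLAN header from RFC 7348, which carries the 24-bit VNI that identifies the virtual network. It is only an illustration of the wire format, not fabric code; the function names and the example VNI are made up for this sketch.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag defined in RFC 7348


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, then 24 reserved bits.
    # Word 2: VNI in the top 24 bits, then 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking that the VNI flag is set."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8


header = pack_vxlan_header(100)  # example VNI, chosen arbitrarily
print(header.hex(), unpack_vni(header))
```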

## VPC
We know what a VPC is and how to attach workloads to a specific VPC. Let us now take a look at how this is actually
implemented on the network to provide the view of a private network.

- Each VPC is modeled as a VRF on every switch that has VPC attachments defined for that VPC.
The VRF is allocated its own VNI. The VRF is local to each switch, while the VNI is global for the entire fabric. By
mapping the VRF to a VNI and configuring an EVPN instance in each VRF, we establish a shared L3VNI across the entire
fabric. All VRFs participating in this VNI can communicate freely with each other without the need for a policy. A VLAN
is allocated for each VRF and functions as the IRB VLAN for that VRF.
- The VRF created on each switch for a VPC configures a BGP instance with EVPN to advertise its locally
attached subnets and import routes from its peered VPCs. The BGP instance in a tenant VRF does not establish
neighbor relationships; it is used purely to advertise locally attached routes into the VPC (all VRFs with the same
L3VNI) across the leaves in the network.
- A VPC can have multiple subnets. Each subnet in the VPC is modeled as a VLAN on the switch. The VLAN is only locally
significant, and a given subnet might have different VLANs on different leaves in the network. We assign a globally
significant VNI to each subnet. This VNI is used to extend the subnet across different leaves in the network and
provides the view of a single stretched L2 domain if the applications need it.
- The Hedgehog Fabric has a built-in DHCP server that automatically assigns IP addresses to each workload
depending on the VPC it belongs to. This is achieved by configuring a DHCP relay on each of the server-facing VLANs.
The DHCP server is accessible through the underlay network and is shared by all VPCs in the fabric. The built-in DHCP
server can identify the source VPC of a request and assigns IP addresses from a pool allocated to the
VPC at creation.
- By default a VPC cannot communicate with anything outside the VPC; specific peering rules must be defined to allow
communication with external networks or with other VPCs.
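
The split between switch-local and fabric-wide identifiers can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the fabric's allocator: the class name, the `Vrf-` naming, and the starting values for VNIs and VLANs are all hypothetical, and the per-VRF IRB VLAN is left out for brevity.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyFabric:
    """Toy allocator: VNIs are fabric-wide, VLANs are allocated per switch."""
    _next_vni: count = field(default_factory=lambda: count(1000))  # assumed start
    _next_vlan: dict = field(default_factory=dict)  # switch -> VLAN counter
    vni: dict = field(default_factory=dict)         # VPC or (VPC, subnet) -> VNI
    vlan: dict = field(default_factory=dict)        # (switch, VPC, subnet) -> VLAN

    def global_vni(self, key):
        """Return the fabric-wide VNI for a key, allocating it on first use."""
        if key not in self.vni:
            self.vni[key] = next(self._next_vni)
        return self.vni[key]

    def attach(self, switch, vpc, subnet):
        """Attach a VPC subnet on a switch and return the resulting identifiers."""
        vlans = self._next_vlan.setdefault(switch, count(100))  # assumed start
        local_key = (switch, vpc, subnet)
        if local_key not in self.vlan:
            self.vlan[local_key] = next(vlans)
        return {
            "vrf": f"Vrf-{vpc}",                           # local to the switch
            "l3vni": self.global_vni(vpc),                 # same on every switch
            "subnet_vni": self.global_vni((vpc, subnet)),  # same on every switch
            "vlan": self.vlan[local_key],                  # may differ per switch
        }


fabric = ToyFabric()
fabric.attach("leaf-1", "vpc-2", "default")  # uses up the first VLAN on leaf-1
a = fabric.attach("leaf-1", "vpc-1", "default")
b = fabric.attach("leaf-2", "vpc-1", "default")
print(a)
print(b)
```

In this run the same subnet of `vpc-1` ends up on different VLANs on the two leaves, while both VNIs match, which is the property that lets the subnet stretch across leaves.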

## VPC Peering
To enable communication between two different VPCs we need to configure a VPC peering policy. The Hedgehog Fabric
supports two different peering modes.

- Local Peering - A local peering directly imports routes from the other VPC locally. This is achieved by a simple
route import from the peer VPC. If there are no locally attached workloads for the peer VPC, the fabric
automatically creates a stub VPC for peering and imports routes from it. This allows VPCs to peer with each other
without the need for a dedicated peering leaf. If a local peering is configured for a pair of VPCs that both have locally
attached workloads, the fabric automatically allocates a pair of ports on the switch to route traffic between their
VRFs using static routes. This is required because of limitations in the underlying platform. The net result
is that the bandwidth between these two VPCs is limited by the bandwidth of the loopback interfaces allocated
on the switch.
- Remote Peering - Remote peering is implemented using one or more dedicated peering switches that act as a
rendezvous point for the two VPCs in the fabric. The set of switches to be used for peering is determined by
configuration in the peering policy. When a remote peering policy is applied for a pair of VPCs, the VRFs
corresponding to these VPCs on the peering switch advertise default routes into their specific VRFs identified
by the L3VNI. All traffic that does not belong to the local VPC is forwarded to the peering switch, which has routes
to the other VPC, and is forwarded from there. The bandwidth limitation that exists in the local peering solution
is solved here, as the bandwidth between the two VPCs is determined by the fabric cross-section bandwidth.
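
The difference between the two modes can be pictured with toy route tables. The Python sketch below is a simplification under stated assumptions: the VPC names, prefixes, the `border-leaf-1` switch name, and the `local`/`import:`/`via:` labels are invented for illustration, and details such as stub VPCs and loopback ports are ignored.

```python
DEFAULT_ROUTE = "0.0.0.0/0"


def local_peering(subnets, vpc_a, vpc_b):
    """Each VPC keeps its own subnets and imports the peer's subnets directly."""
    def table(own, peer):
        routes = {prefix: "local" for prefix in subnets[own]}
        routes.update({prefix: f"import:{peer}" for prefix in subnets[peer]})
        return routes
    return {vpc_a: table(vpc_a, vpc_b), vpc_b: table(vpc_b, vpc_a)}


def remote_peering(subnets, vpc_a, vpc_b, peering_switch):
    """Each VPC keeps its own subnets and follows a default route to the peering switch."""
    tables = {}
    for vpc in (vpc_a, vpc_b):
        routes = {prefix: "local" for prefix in subnets[vpc]}
        routes[DEFAULT_ROUTE] = f"via:{peering_switch}"
        tables[vpc] = routes
    return tables


# Hypothetical VPCs with one subnet each.
subnets = {"vpc-1": ["10.0.1.0/24"], "vpc-2": ["10.0.2.0/24"]}
local = local_peering(subnets, "vpc-1", "vpc-2")
remote = remote_peering(subnets, "vpc-1", "vpc-2", "border-leaf-1")
print(local["vpc-1"])
print(remote["vpc-1"])
```

With local peering the workload leaf holds the peer's specific prefixes; with remote peering it holds only a default route, and the peering switch is the one that knows both VPCs.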
4 changes: 4 additions & 0 deletions docs/architecture/overview.md
@@ -0,0 +1,4 @@
# Overview

!!! warning ""
    Under construction.
98 changes: 97 additions & 1 deletion docs/concepts/overview.md
@@ -1 +1,97 @@
# Concepts
# Concepts

## Introduction

Hedgehog Open Network Fabric is built on top of Kubernetes and uses the Kubernetes API to manage its resources. This means
that all user-facing APIs are [Kubernetes Custom Resources (CRDs)](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/),
so you can use standard Kubernetes tools to manage Fabric resources.

Hedgehog Fabric consists of the following components:

* Fabricator - special tool that installs and configures Fabric and runs virtual labs
* Control Node - one or more Kubernetes nodes in a single cluster running Fabric software
* Das Boot - set of services providing switch boot and installation
* Fabric Controller - main control plane component that manages Fabric resources
* Fabric Kubectl plugin (Fabric CLI) - plugin for kubectl that makes it easy to manage Fabric resources
* Fabric Agent - runs on every switch and manages switch configuration

## Fabric API

All infrastructure is represented as a set of Fabric resources (Kubernetes CRDs) called the Wiring Diagram. It lets you
define switches, servers, control nodes, external systems and the connections between them in a single place and then use
it to deploy and manage the whole infrastructure. On top of it, Fabric provides a set of APIs to manage the VPCs and the
connections between them and between VPCs and external systems.

### Wiring Diagram API

Wiring Diagram consists of the following resources:

* "Devices": describes any device in the Fabric
    * __Switch__: configuration of the switch, like port group speeds, port breakouts, switch IP/ASN, etc.
    * __Server__: any physical server attached to the Fabric including control nodes
* __Connection__: *any* logical connection for devices
    * usually it's a connection between two or more ports on two different devices
    * incl. MCLAG Peer Link, Unbundled/MCLAG server connections, Fabric connection between spine and leaf etc.
* __VLANNamespace__: non-overlapping VLAN ranges for attaching servers
* __IPv4Namespace__: non-overlapping IPv4 ranges for VPC subnets
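
The "non-overlapping" requirement for the two namespace resources can be checked mechanically. The Python sketch below shows one way to express such a check using only the standard library; the function names and sample ranges are assumptions for illustration, and how and where the Fabric itself enforces this is not described here.

```python
import ipaddress
from itertools import combinations


def vlan_ranges_overlap(ranges):
    """Return True if any two inclusive (start, end) VLAN ranges share a VLAN ID."""
    return any(
        a_start <= b_end and b_start <= a_end
        for (a_start, a_end), (b_start, b_end) in combinations(ranges, 2)
    )


def ipv4_subnets_overlap(cidrs):
    """Return True if any two IPv4 prefixes share at least one address."""
    networks = [ipaddress.ip_network(cidr) for cidr in cidrs]
    return any(a.overlaps(b) for a, b in combinations(networks, 2))


# Sample values chosen only for the example.
print(vlan_ranges_overlap([(1000, 1999), (2000, 2999)]))
print(ipv4_subnets_overlap(["10.0.0.0/16", "10.0.1.0/24"]))
```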

### User-facing API

* VPC API
    * __VPC__: Virtual Private Cloud; similar to a public cloud VPC, it provides an isolated private network for the
      resources, with support for multiple subnets, each with user-provided VLANs and on-demand DHCP
    * __VPCAttachment__: represents the assignment of a specific VPC subnet to a Connection object, that is, the binding of an exact server port to a VPC
    * __VPCPeering__: enables VPC-to-VPC connectivity (either Local, on the switches where the VPCs are used, or Remote, on the border/mixed leaves)
* External API
    * __External__: definition of the "external system" to peer with (could be one or multiple devices such as edge/provider routers)
    * __ExternalAttachment__: configuration for a specific switch (using a Connection object) describing how it connects to an external system
    * __ExternalPeering__: enables VPC-to-External connectivity by exposing specific VPC subnets to the external system and allowing inbound routes from it

## Fabricator

Installer builder and VLAB.

* Installer builder based on a preset (currently vlab for virtual & lab for physical)
* Main input - wiring diagram
* All input artifacts coming from OCI registry
* Always full airgap (everything running from private registry)
* Flatcar Linux for control node, generated ignition.json
* Automatic k3s installation and private registry setup
* All components and their dependencies running in K8s
* Integrated Virtual Lab (VLAB) management
* Future:
* In-cluster (control) Operator to manage all components
* Upgrade handling for everything, starting with the control node OS
* Installation progress, status and retries
* Disaster recovery and backups

## Das Boot

Switch boot and installation.

* Seeder
* Actual switch provisioning
* ONIE on a switch discovers control node using LLDP
* It loads and runs our multi-stage installer
* Network configuration & identity setup
* Performs device registration
* Hedgehog identity partition gets created on the switch
* Downloads SONiC installer and runs it
* Downloads Agent and its config and installs them on the switch
* Registration Controller
* Device identity and registration
* Actual SONiC installers
* Misc: rsyslog/ntp

## Fabric

Control plane and switch agent.

* Currently Fabric is basically a single controller manager running in K8s
* It includes controllers for different CRDs and needs
* For example, auto-assigning VNIs to VPCs or generating Agent config
* Additionally, it runs the admission webhook for our CRD APIs
* Agent watches for the corresponding Agent CRD in the K8s API
* It applies the changes and saves the new config locally
* It reports status and information back to the API
* Can perform reinstall and reboot of SONiC
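
The agent behaviour listed above (watch the desired state, apply changes, save the config locally, report status) follows the usual reconcile pattern. The Python sketch below is a toy model of that pattern only; the class, the `generation` and `spec` keys, and the status field are hypothetical names and do not reflect the real Agent CRD or agent code.

```python
import json


class ToyAgent:
    """Toy reconcile loop: apply the desired config when its generation changes."""

    def __init__(self):
        self.applied_generation = None  # generation of the last applied config
        self.local_config = None        # stands in for the config saved on the switch
        self.apply_count = 0            # how many times a config was actually applied

    def reconcile(self, agent_resource):
        """Apply the desired spec if it is new, then return the status to report back."""
        generation = agent_resource["generation"]
        if generation != self.applied_generation:
            # A real agent would push this to the switch; here we only store it.
            self.local_config = json.dumps(agent_resource["spec"], sort_keys=True)
            self.applied_generation = generation
            self.apply_count += 1
        return {"applied_generation": self.applied_generation}


agent = ToyAgent()
status_1 = agent.reconcile({"generation": 1, "spec": {"hostname": "leaf-1"}})
status_2 = agent.reconcile({"generation": 1, "spec": {"hostname": "leaf-1"}})  # no-op
status_3 = agent.reconcile({"generation": 2, "spec": {"hostname": "leaf-1", "asn": 65101}})
print(status_1, status_2, status_3, agent.apply_count)
```

Keying the apply step on a generation number keeps repeated notifications for the same desired state from reconfiguring the switch again.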
4 changes: 4 additions & 0 deletions docs/contribute/overview.md
@@ -0,0 +1,4 @@
# Overview

!!! warning ""
    Under construction.
3 changes: 3 additions & 0 deletions docs/getting-started/.pages
@@ -0,0 +1,3 @@
nav:
- download.md
- ...
35 changes: 35 additions & 0 deletions docs/getting-started/download.md
@@ -0,0 +1,35 @@
# Download

## Getting access

Prior to General Availability, access to the full software is limited and requires a Design Partner Agreement.
Please submit a ticket with the request using the [Hedgehog Support Portal](https://support.githedgehog.com/).

After that you will be provided with credentials to access the software on [GitHub Packages](https://ghcr.io).
To use it, you need to log in to the registry using the following command:

```bash
docker login ghcr.io
```

## Downloading the software

The main entry point for the software is the Hedgehog Fabricator CLI named `hhfab`. All software is published to the
OCI registry [GitHub Packages](https://ghcr.io), including binaries, container images, Helm charts, etc.
The `hhfab` binary can be downloaded from [GitHub Packages](https://ghcr.io) using the following command:

```bash
curl -fsSL https://i.hhdev.io/hhfab | VERSION=alpha-2 bash
```

The `VERSION` environment variable can be used to specify the version of the software to download. If it's not specified,
the latest release will be downloaded. You can pick a specific release series (e.g. `alpha-2`) or a specific release.

The script requires [ORAS](https://oras.land/), which is used to download the binary from the OCI registry and
can be installed using the following command:

```bash
curl -fsSL https://i.hhdev.io/oras | bash
```

Currently only Linux x86 is supported for running `hhfab`.
1 change: 0 additions & 1 deletion docs/getting-started/overview.md

This file was deleted.

12 changes: 12 additions & 0 deletions docs/index.md
@@ -1,2 +1,14 @@
# Introduction

Hedgehog Open Network Fabric is an open networking platform that brings the user experience enjoyed by so many in the
public cloud to private environments, without vendor lock-in.

Fabric is built around the concept of VPCs (Virtual Private Clouds), similar to the public clouds, and provides a multi-tenant
API to define user intent on network isolation, connectivity, etc., which is automatically transformed into switch
and software appliance configuration.

You can read more about [concepts](concepts/overview.md), [how to get started](getting-started/overview.md) and
[architecture](architecture/overview.md) in the documentation.

You can find out how to [download](getting-started/download.md) Fabric and try it in a self-hosted
[fully virtualized lab](vlab/overview.md) or on [hardware](install-upgrade/overview.md).
7 changes: 7 additions & 0 deletions docs/install-upgrade/.pages
@@ -0,0 +1,7 @@
nav:
- Overview: overview.md
- Supported Devices: supported-devices.md
- System Requirements: requirements.md
- Build Wiring Diagram: build-wiring.md
- ONiE Update (prepare switch): onie-update.md
- ...
34 changes: 34 additions & 0 deletions docs/install-upgrade/build-wiring.md
@@ -0,0 +1,34 @@
# Build Wiring Diagram

!!! warning ""
    Under construction.

In the meantime, to have a look at a working wiring diagram for the Hedgehog Fabric, run the sample generator, which
produces VLAB-compatible wiring diagrams:

```bash
ubuntu@sl-dev:~$ hhfab wiring sample -h
NAME:
hhfab wiring sample - sample wiring diagram (would work for vlab)

USAGE:
hhfab wiring sample [command options] [arguments...]

OPTIONS:
--brief, -b brief output (only warn and error) (default: false)
--fabric-mode value, -m value fabric mode (one of: collapsed-core, spine-leaf) (default: "spine-leaf")
--help, -h show help
--verbose, -v verbose output (includes debug) (default: false)

wiring generator options:

--chain-control-link chain control links instead of all switches directly connected to control node if fabric mode is spine-leaf (default: false)
--control-links-count value number of control links if chain-control-link is enabled (default: 0)
--fabric-links-count value number of fabric links if fabric mode is spine-leaf (default: 0)
--mclag-leafs-count value number of mclag leafs (should be even) (default: 0)
--mclag-peer-links value number of mclag peer links for each mclag leaf (default: 0)
--mclag-session-links value number of mclag session links for each mclag leaf (default: 0)
--orphan-leafs-count value number of orphan leafs (default: 0)
--spines-count value number of spines if fabric mode is spine-leaf (default: 0)
--vpc-loopbacks value number of vpc loopbacks for each switch (default: 0)
```
File renamed without changes
File renamed without changes