Network-Automation with Salt, NAPALM and Kubernetes

February 6, 2018

Since Mircea Ulinic's native integration of NAPALM into the Salt core, officially included since Salt Carbon (2016.11), it has been possible to manage network-devices directly with Salt. I thought about how to scale this out and how to manage a large number of network-devices (hundreds or even thousands) with this solution. To be more precise, my goal was to manage legacy devices that cannot run software natively on themselves. In this case the salt-minion client can't be used; the job has to be done by proxy minions instead. Each salt-managed network-device requires its own salt-proxy process, which reserves about 60 MB of memory - so 1,000 devices already add up to roughly 60 GB of memory just for the proxy processes. The following image shows the high-level logic of this construct.

salt-minion-proxy

If we think about possible solutions for an environment running this there are different options:

  1. One very big salt-master server with a huge amount of memory/CPU and all of the salt-proxies running locally on it
  2. One salt-master and a lot of additional instances (e.g. VMs) where the salt-proxies are running
  3. Multiple salt-masters, for example in different locations, so that the salt-proxies can communicate with the master closest to them
  4. One salt-master and, per managed network-device, one container which only runs the salt-proxy process
  5. The same approach as the fourth, but with multiple salt-masters

In the following sections I describe how I tried to implement the fourth possibility.

The components

Because I wanted the solution to be very scalable, and because I like containers a lot, continuing with a containerized solution seemed like a good idea to me. But managing hundreds or even thousands of containers is no easy task.

This was when Kubernetes came to my mind. A few days earlier I had been reading the great book The Kubernetes Book, which made the key concepts of Kubernetes much more familiar to me. Kubernetes is great for managing a huge number of containers and provides a lot of very cool features to do this well. Among them you can find super helpful things like: automatic binpacking, self-healing, horizontal scaling, service discovery and load balancing, automated rollouts and rollbacks, secret and configuration management, storage orchestration, batch execution. Wow! - especially the self-healing and the automated rollouts/rollbacks are what I was looking for.

So I decided to spin up a test-environment with all of these components: One salt-master and a Kubernetes-Cluster running one container per salt-managed network-device. The platform I used for testing was just my local Mac OS installation and some cool software on it: Vagrant with an ubuntu/xenial64 virtual-machine acting as salt-master, minikube as a local Kubernetes cluster and Docker for Mac with minideb to build a small container image which can be used by the salt-proxy containers.

The picture shows these pieces connected together to build the full solution. Not every single detail is contained in the picture, so please consider it a high-level overview. In the next sections I will describe the setup of the different tools and what to take into account when configuring them for our use-case. In one of the last sections I will show some working examples.

high-level-network

If you look at the routing table of the minikube-VM you will see something like this:

$ ip route
default via 192.168.64.1 dev eth0 proto dhcp src 192.168.64.2 metric 1024
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1
192.168.64.0/24 dev eth0 proto kernel scope link src 192.168.64.2
192.168.64.1 dev eth0 proto dhcp scope link src 192.168.64.2 metric 1024

Each Pod (which is the smallest unit in Kubernetes and contains the container) will get its own IP and has a default route to 172.17.0.1. Communication to the outside will be source-NATed to 192.168.64.2 (that's the IP of the Kubernetes node running the Pods). This is done inside the minikube-VM with iptables. If traffic coming from the bridge100 interface, where the minikube-VM is connected, wants to communicate with outside networks, it also gets source-NATed to the DHCP-allocated IP of the Macbook. There are NAT rules in place on Mac OS which were automatically created when installing minikube:

nat on en0 inet from 192.168.64.0/24 to any -> (en0:0)
no nat on bridge100 inet from 192.168.64.1 to 192.168.64.0/24
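
Inside the minikube-VM the source-NAT for the Pod network is done with iptables, as mentioned above. I have not reproduced the exact rule set here, but it is roughly equivalent to the standard Docker masquerade rule shown below (iptables-save notation, a sketch rather than a literal copy); you can inspect the real rules yourself with minikube ssh and sudo iptables -t nat -S:

# sketch of what to expect inside the minikube-VM, not a literal copy
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE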

For communication with the salt-master only the two required TCP ports, 4505 and 4506, are forwarded from the Macbook (not limited to one IP) to the Vagrant-VM.

$ vagrant port
The forwarded ports for the machine are listed below. Please note that
these values may differ from values configured in the Vagrantfile if the
provider supports automatic port collision detection and resolution.

    22 (guest) => 2222 (host)
  4505 (guest) => 4505 (host)
  4506 (guest) => 4506 (host)

That's how the traffic gets to my Macbook, and from there communication between the salt-master and device1 (in both directions) is made possible.

Set up the salt-master VM with Vagrant

We need a server where the salt-master will run. In our testing environment Vagrant will be used to set up an ubuntu/xenial server. Follow the instructions on the Vagrant-Homepage to install it on your operating system. Also install a Vagrant provider, which in most cases is VirtualBox. I recommend following the install instructions on the linked pages and not installing the software with the package managers of your operating system.

After installing Vagrant and VirtualBox, create a new folder called salt-master and create the Vagrantfile in it. We will forward the ports 4505 and 4506 from our host-system to the salt-master VM and then start the VM.

$ mkdir salt-master
$ cd salt-master
$ echo "VAGRANTFILE_API_VERSION = "2"
Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|

   config.vm.box = "ubuntu/xenial64"

   config.vm.network "forwarded_port", guest: 4505, host: 4505
   config.vm.network "forwarded_port", guest: 4506, host: 4506
end" >> Vagrantfile
$ vagrant up

[... snipped output ...]

$ vagrant ssh

Vagrant will download the needed ubuntu image and spin up the VM. As soon as it is finished you can connect to it with vagrant ssh. Now we need to set up the salt-master and NAPALM. At the time of writing this blog-post this installs Salt 2017.7.3 and NAPALM 2.3.0.

$ wget -O bootstrap-salt.sh https://bootstrap.saltstack.com
[... snipped output ...]

$ sudo sh bootstrap-salt.sh
[... snipped output ...]

$ wget https://bootstrap.pypa.io/get-pip.py
[... snipped output ...]

$ python get-pip.py
[... snipped output ...]

$ pip install napalm
[... snipped output ...]

$ pip install -U pyOpenSSL
[... snipped output ...]

We will configure the file_roots and pillar_roots variables for our master and set up a pillar file for our test-device, which we will name device1 and which is reachable in the network at 192.168.0.201. We will also configure the salt-proxy so we can use it for an initial test directly on our master. The proxy configuration file will also be needed later when the container image is built. I will not describe the details of how Salt works in this blog post. For the final testing I also included a formula for NTP, which can be found with install instructions on GitHub. The formula uses the NTP parts of the OpenConfig system YANG model, which makes it vendor agnostic and perfect for integration with NAPALM. If you are new to Salt I recommend reading the documentation on the Salt-Homepage or on Mircea Ulinic's Blog.

$ cat /etc/salt/master
file_roots:
  base:
    - /etc/salt/pillar
    - /etc/salt/states
    - /etc/salt/reactors
    - /etc/salt/templates
    - /etc/salt/extmods
    - /etc/salt/formulas/napalm-ntp-formula

pillar_roots:
  base:
    - /etc/salt/pillar
$ cat /etc/salt/proxy
master: localhost
pki_dir: /etc/salt/pki/proxy
cachedir: /var/cache/salt/proxy
multiprocessing: false
mine_enabled: true
$ cat /etc/salt/pillar/top.sls
base:
  '*':
    - openconfig_ntp_servers
  'device1':
    - device1_pillar
$ cat /etc/salt/pillar/device1_pillar.sls
proxy:
  proxytype: napalm
  driver: ios
  hostname: 192.168.0.201
  username: admin
  password: password
$ salt-run pillar.show_pillar device1
[... snipped output ...]

  proxy:
      ----------
      proxytype:
          napalm
      driver:
          ios
      hostname:
          192.168.0.201
      username:
          admin
      password:
          password

We can now do a first test: start a local salt-proxy for device1, accept the exchanged key and check whether the device is managed by Salt. Of course you have to ensure that the user/password combination defined in the pillar-file is valid for connecting to the Cisco IOS device.

$ sudo salt-proxy --proxyid device1 -l debug
[... snipped output ...]
$ salt-key -L
Accepted Keys:
Denied Keys:
Unaccepted Keys:
device1
Rejected Keys:
$ salt-key -A device1
[... snipped output ...]
$ salt-key -L
Accepted Keys:
device1
Denied Keys:
Unaccepted Keys:
Rejected Keys:
$ salt-run manage.status device1
down:
up:
    - device1

Your salt-master is now in a state which is good enough to continue with the setup of our other components.

Set up Docker and build the salt-proxy container image

For easy availability of Docker on my Macbook I decided to use Docker for Mac, which installs a complete docker-environment with the help of a helper-VM in which the docker-daemon runs and in which your docker containers are spun up. The whole thing gets managed locally with the well-known docker cli-commands. That's an easy way to get started with docker locally on your client and helps a lot when developing things with containers. To install Docker for Mac please look at the documentation on the Docker-Homepage.

When thinking about our container image there are several important things to consider: the image should be small, salt-minion and salt-proxy should be installed and ready for use, the proxy configuration file has to point to the correct master, and NAPALM has to be installed. There is also one more special thing: because of the self-healing functions of Kubernetes it is absolutely possible that in a failure-case the Pod/Container with our running salt-proxy just gets deleted and replaced by a completely new Pod. If we left everything as it is we would run into problems with the Salt key-exchange: a salt-minion which was well known before would reappear with the same name but another key after the Pod was deleted and a new one re-deployed. This would lead to a non-functional master/minion relationship, and the Kubernetes self-healing would badly impact our solution. That's why in this case I decided to use the same key-pair for every Pod which gets deployed. The easiest way to accomplish this is to just include the two files in the container-image. If you want to do this in production be sure to use a private container-registry which is secured. The files to use are located on our salt-master VM: /etc/salt/pki/proxy/minion.pub and /etc/salt/pki/proxy/minion.pem.
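
To get the key-pair from the salt-master VM onto the Macbook and next to the Dockerfile you can, for example, use Vagrant's default synced folder /vagrant (which maps to the salt-master project folder on the host). The commands below are a small sketch and assume you created the salt-master folder in your home directory and use the build directory shown further down:

$ sudo cp /etc/salt/pki/proxy/minion.pub /etc/salt/pki/proxy/minion.pem /vagrant/   # inside the salt-master VM
$ cp ~/salt-master/minion.pub ~/salt-master/minion.pem ~/code/docker/salt-minion-proxy/   # on the Macbook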

To get a small but stable salt-proxy image I decided to use minideb, which is a small image based on Debian designed for use in containers. The following Dockerfile contains all the information needed to build our container image. Be sure to have the three files in the same directory: proxy (the proxy configuration file -> take the one from our master), minion.pub, minion.pem. In the proxy configuration file we change the line master: localhost to master: 192.168.64.1. Remember: 192.168.64.1 is the IP of the bridge100-interface created by minikube on our Macbook. Because of our Vagrant configuration the two TCP ports 4505/4506 are forwarded to the salt-master VM, so the proxies are able to use the mentioned IP as the salt-master IPv4-address.
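
The adjusted proxy configuration file that goes into the image then looks like this (same content as on the master, only the master line changed; the path is simply the build directory I use here):

$ cat ~/code/docker/salt-minion-proxy/proxy
master: 192.168.64.1
pki_dir: /etc/salt/pki/proxy
cachedir: /var/cache/salt/proxy
multiprocessing: false
mine_enabled: true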

$ cat ~/code/docker/salt-minion-proxy/Dockerfile
FROM bitnami/minideb:stretch
MAINTAINER No reply smnmtzgr@gmail.com

RUN apt-get update && install_packages iputils-ping wget gnupg ca-certificates python-setuptools && wget -O gpgkey https://repo.saltstack.com/apt/debian/9/amd64/latest/SALTSTACK-GPG-KEY.pub && apt-key add gpgkey && echo 'deb http://repo.saltstack.com/apt/debian/9/amd64/latest stretch main' >> /etc/apt/sources.list.d/saltstack.list && apt-get update && install_packages salt-minion=2017.7.3+ds-1 python-pip && pip --no-cache-dir install -U pyOpenSSL napalm

ADD ./proxy /etc/salt/proxy
ADD ./minion.pub /etc/salt/pki/proxy/minion.pub
ADD ./minion.pem /etc/salt/pki/proxy/minion.pem

This Dockerfile pulls the minideb base-image, installs salt-minion (and with it salt-proxy), installs NAPALM and copies the needed files into the correct directories. I built and pushed the image to Docker Hub with the following docker cli-commands:

$ docker build -t salt-minion-proxy .
[... snipped output ...]

$ docker tag salt-minion-proxy smnmtzgr/salt-minion-proxy:0.1.14
[... snipped output ...]

$ docker push smnmtzgr/salt-minion-proxy:0.1.14
[... snipped output ...]

If you want to build this lab/solution on your own you can directly use my container image and skip the image build parts: https://hub.docker.com/r/smnmtzgr/salt-minion-proxy/. The container image is now ready to be used by Kubernetes.
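
If you want a quick local smoke test of the image before handing it over to Kubernetes, you can start it once with Docker for Mac. Depending on your local setup the baked-in master IP may not be reachable from there, in which case the proxy simply keeps retrying the connection - the test mainly shows that the image starts and salt-proxy runs:

$ docker run --rm -it smnmtzgr/salt-minion-proxy:0.1.14 salt-proxy --proxyid device1 -l debug
[... snipped output ...]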

Set up the Kubernetes cluster with minikube

When you install minikube it creates a VM on your host-system and runs the Kubernetes master/node inside of the VM. Inside the VM this is done with a localkube construct which runs a Kubernetes node and a Kubernetes master. The master includes the API server and all other components of the Kubernetes control plane. The container runtime is pre-installed and defaults to docker.

Kubernetes will be managed directly from your host-system via the kubectl cli-tool which is normally also used for all other kinds of Kubernetes installations. For the VM-hypervisor we use the lightweight xhyve. The following steps are required to install all of these components:

$ brew install kubectl
[... snipped output ...]

$ brew cask install minikube
[... snipped output ...]

$ brew install docker-machine-driver-xhyve
[... snipped output ...]

$ sudo chown root:wheel $(brew --prefix)/opt/docker-machine-driver-xhyve/bin/docker-machine-driver-xhyve
$ sudo chmod u+s $(brew --prefix)/opt/docker-machine-driver-xhyve/bin/docker-machine-driver-xhyve

$ minikube start --vm-driver=xhyve
[... snipped output ...]

$ kubectl config use-context minikube
$ kubectl get nodes
[... snipped output ...]

Configuration of the Kubernetes Deployment

Now that we have a working Kubernetes-cluster installation directly available on our system we can continue and fill it with some life. Our goal is to use it for running one container per salt-managed legacy network-device. We also want to benefit from the self-healing, update- and rollback-features of Kubernetes. The object we want to use is called a Deployment. It is a construct that wraps an object called a ReplicaSet, and the ReplicaSet itself wraps the Pod object (which holds our container). The Deployment gives us update and rollback possibilities. The ReplicaSet ensures that our Pod is self-healing, scalable and has the desired state (which is READY). The following picture shows these relationships.

k8s-deployment

source: the-k8s-book

The best practice for doing something in Kubernetes is to use manifest-files. You define your desired state in these files in a well-structured format, which is normally yaml. In production environments it is highly recommended to manage these manifest-files with a version-control system like git. For our use-case it is enough to just create the file locally on the client; let's say we use the folder ~/code/kubernetes/Deployments and create the file deploy_device1.yml in it.

$ cat ~/code/kubernetes/Deployments/deploy_device1.yml
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: device1
spec:
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      app: device1
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: device1
    spec:
      containers:
      - name: device1-container
        image: smnmtzgr/salt-minion-proxy:0.1.14
        command: ["salt-proxy", "--proxyid", "device1", "-l", "debug"]
        resources:
          limits:
            cpu: 70m
            memory: 70Mi
          requests:
            cpu: 60m
            memory: 60Mi

The file contains the Deployment completely defined in yaml. The Deployment will be named device1, we will have one replica which means one Pod, and the Pod will contain one container named device1-container. Here we also reference the container image we built before, which is now located on Docker Hub. That is where Kubernetes will pull the image from to spin up the containers. It is also important that we define the command to run in the container. In our case this is salt-proxy --proxyid device1 -l debug, which starts the salt-proxy, gives it the id device1 and activates logging at debug level. If we didn't give a command the Pod would just start up, not know what to do and shut down. This would lead to a loop: Kubernetes would destroy the Pod and deploy a new one, because the Pod-state was not READY. That's the self-healing mechanism. We also limit the resources the container can use.

If you really want to use such a solution with hundreds or thousands of devices it would, for example, be possible to build the manifest file in a more generic way: instead of the hard-coded device1 name you would use a placeholder which you fill in when generating the per-device file, as sketched below.
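
A minimal sketch of this idea: assume deploy_template.yml is a copy of the Deployment manifest above with every occurrence of device1 replaced by the placeholder DEVICE_NAME (both the template name and the placeholder are my own choice here). A simple loop then renders and applies one manifest per device:

$ for dev in device1 device2 device3; do
>   sed "s/DEVICE_NAME/${dev}/g" deploy_template.yml > "deploy_${dev}.yml"
>   kubectl create -f "deploy_${dev}.yml"
> done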

We also have to talk about the self-healing functionality here. If we deploy the Deployment as defined in the manifest, Kubernetes will only re-deploy the Pod when the command we call exits with a non-zero exit-code. This is for example the case when the command simply fails. But it is also possible that it throws some ERRORS without exiting completely. This happens for example when the IP of the network-device is simply not reachable at the moment; the proxy-process will just retry the connection again and again. For Kubernetes this means that everything is fine and it considers the state of the Pod as READY. In a production environment we should be more precise at this point and define a liveness probe in the Deployment manifest, as sketched below. Details are described in the official Kubernetes documentation.
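
As an illustration only (this is not part of my tested Deployment): a livenessProbe could be added under the container definition. The exec command below just checks that a salt-proxy process still exists and assumes the procps package (for pgrep) is added to the image - both the check and the timing values are assumptions of mine:

        # hypothetical addition under spec.template.spec.containers[0]
        livenessProbe:
          exec:
            command: ["pgrep", "-f", "salt-proxy"]
          initialDelaySeconds: 60
          periodSeconds: 30
          failureThreshold: 3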

The last remaining step is to use the manifest-file to deploy the Deployment to the Kubernetes-cluster. If you want to do such things in Kubernetes you normally use the kubectl tool to do an HTTP POST against the Kubernetes API Server, which provides a RESTful API. The API Server runs as part of the Kubernetes control-plane on the Kubernetes master. After receiving the POST it inspects the content of the file and knows exactly what to do to reach the desired state defined in the file. This desired state is also saved to the Kubernetes cluster store, which is based on etcd and acts as the brain of the cluster. To ensure the desired state, ReplicaSets implement a background reconciliation loop that constantly monitors the cluster. It checks whether the current state matches the desired state. If that's not the case it wakes up the control-plane and Kubernetes fixes the situation with whatever steps are needed to get back to the desired state. The following command does the HTTP POST:

$ kubectl create -f deploy_device1.yml
[... snipped output ...]

This is all that is needed to instruct Kubernetes to set up all of these cool objects. After successful execution you can check the state with different commands:

$ kubectl get deployment
NAME      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
device1   1         1         1            1           1d
$ kubectl get rs
NAME                 DESIRED   CURRENT   READY     AGE
device1-859ffc6cb7   1         1         1         1d
$ kubectl get pod
NAME                       READY     STATUS    RESTARTS   AGE
device1-859ffc6cb7-fbwrc   1/1       Running   0          1d
$ kubectl logs <pod_name>
[... snipped output ...]

If you continue to check the logs of the Pod you should see after some time that the salt-proxy is active, connected as a minion to the master and acting as proxy for the network-device. On the master you have to accept the key for device1 (if not already done in our first steps):

$ salt-key -A device1
[... snipped output ...]

Manage the network-device with Salt

After setting up all these things we should now be able to manage device1 with Salt. On the salt-master we execute some commands to show this. Most of them should be self-explanatory. If you need more examples or ideas of what Salt can do with network devices I recommend reading through the great book Network Automation at Scale.

$ salt device1 net.load_config text='ntp server 172.17.18.1'
device1:
    ----------
    already_configured:
        False
    comment:
    diff:
        +!
        +ntp server 172.17.18.1
        +switch-01#^@
    loaded_config:
    result:
        True
$ salt device1 grains.get interfaces
device1:
    - Vlan1
    - FastEthernet0/1
    - FastEthernet0/2
    - FastEthernet0/3
    - FastEthernet0/4
    - FastEthernet0/5
    - FastEthernet0/6
    - FastEthernet0/7
    - FastEthernet0/8
    - GigabitEthernet0/1
$ salt device1 grains.get model
device1:
    WS-C2960-8TC-L
$ salt device1 grains.get model --output=json
{
    "device1": "WS-C2960-8TC-L"
}
$ salt device1 state.sls ntp.netconfig
device1:
----------
          ID: oc_ntp_netconfig
    Function: netconfig.managed
      Result: True
     Comment: Configuration changed!
     Started: 20:31:50.924403
    Duration: 23426.954 ms
     Changes:
              ----------
              diff:
                  +!
                  -no ntp
                  +ntp server 172.17.19.1 prefer
                  +ntp server 172.17.19.2

Summary for device1
------------
Succeeded: 1 (changed=1)
Failed:    0
------------
Total states run:     1
Total run time:  23.427 s

Note: For the last command to work you have to set up the napalm-ntp-formula.

Running the solution in production?

If you really think about running a solution like this in production there are surely areas where you have to go deeper and make decisions. I want to mention three sticking points.

The first is how to handle the key-exchange between salt-master and salt-minions (proxies). In my solution we just used the same key-pair over and over again, and we even integrated it into the container image we built. This works fine, but if you think about securing your application that's not the best way to go. One alternative is sketched below.
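
One direction worth evaluating (an idea of mine, not something I tested in this lab) is to keep the key-pair out of the image and hand it to the Pods via Kubernetes' secret management, which was already mentioned in the feature list above:

$ kubectl create secret generic salt-proxy-keys --from-file=minion.pub --from-file=minion.pem

In the Deployment manifest this Secret would then be mounted as a volume at /etc/salt/pki/proxy, and the two ADD lines for the keys could be dropped from the Dockerfile.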

The second thing is how you want to set up your Kubernetes cluster. That's no easy task and has to be discussed with Kubernetes and networking experts. There are a lot of questions about how to design it, where to run it and which pieces to use. For example, Kubernetes has its own model of how networking should work. It imposes the following fundamental requirements on any networking implementation:

  • all containers can communicate with all other containers without NAT
  • all nodes can communicate with all containers (and vice-versa) without NAT
  • the IP that a container sees itself as is the same IP that others see it as

In practice this means that you can't just take two docker-hosts and run Kubernetes on top of them, because that's not how docker-networking is implemented. In the Kubernetes documentation you can read more about these aspects in detail. An interesting part is how to achieve this behaviour. Kubernetes gives some ideas and references for setting up the networking in a Kubernetes-model way, for example: Cisco ACI, Cilium, Contiv, Contrail, Flannel, Google Compute Engine (GCE), Kube-router, L2 networks and linux bridging, Multus, NSX-T, Nuage Networks VCS, OpenVSwitch, OVN, Project Calico, Romana, Weave Net from Weaveworks, CNI-Genie from Huawei.

The third is how you design the salt-master. I've just used a single Vagrant-VM, which was enough for testing purposes. But in production you should probably set up an HA salt-master construct or think about running multiple salt-masters in different locations. This depends a bit on what you want to do with the solution. A small configuration sketch for the multi-master case follows below.
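
For the multi-master case, Salt's minion/proxy configuration accepts a list of masters instead of a single address. A minimal sketch of the proxy file baked into the image could then look like this (the second IP is just a placeholder for an additional master, not part of my lab):

# sketch: multi-master list, second address is a placeholder
master:
  - 192.168.64.1
  - 192.168.65.1
pki_dir: /etc/salt/pki/proxy
cachedir: /var/cache/salt/proxy
multiprocessing: false
mine_enabled: true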

Summary

Seeing all these tools and components working together is very cool, is a lot of fun and builds a robust framework for network-automation purposes. At the end I want to mention that I'm neither a Salt expert nor a Kubernetes guru: I started using these tools two weeks before writing this blog-post and just wanted to integrate them into my overall solution. So certainly there are other and even better ways to do the things I've done here. I would be happy if you get in touch with me in such cases. You can reach me (@smnmtzgr) via Twitter or directly in the networktocode slack community (networktocode.slack.com).
