Network Automation Architecture – An Example
In the previous blogs of this series about Network Automation Architecture (parts 1 and 2), we presented the key architectural components and their respective functions. Now it’s time to apply that architecture to a real scenario: firewall rule automation in a hybrid network environment.
This is not the last blog of the series; more detailed ones for each of the primary components will come later. Before they do, we want to bring the architecture to life by showing how to map a network automation solution to it.
The Example: Firewall Rule Automation in a Hybrid Environment
Firewall rule automation is a common network operation task that usually gets complicated due to the number of potential requestors and applications, and the variety of different platforms involved.
To illustrate it, we will use a simple scenario with only two firewall platforms: one running on-premises and another running in a cloud provider. Each environment has different configuration processes, and until both are updated and in sync, the network communication between the source and target application will not succeed.
From a user perspective, we want to offer an easy interface where the user can define the required communication. From the information provided, the firewall services will be updated to allow the network flow between the desired applications. (Notice that we are not talking about IPs. We keep the request more general: an application composed of multiple IP endpoints and different network service ports.)
Hopefully this scenario looks familiar to you. Now it’s time to start defining the automation solution that could implement it. The first step in automating a process or workflow is always to understand it, and that is exactly where we start.
Describe the Manual Operations Workflow
Even though describing an operation that your network team performs frequently could seem a trivial exercise, we recommend taking your time and collecting information from different perspectives. Talk not only to the network operators performing the firewall changes (two people usually do things in slightly different ways), but also to the other people involved in the process: the requestors and other teams, such as the security team that defines the security policy.
You can check other NTC blogs covering this topic in detail at https://blog.networktocode.com/tags/work-intake.
At this stage, automation is not the main topic. We should focus first on capturing, step-by-step, what happens from the workflow kick-off until it is completed.
In our example, a simplified workflow could look like this:
- An Application Owner notifies us about the source and destination of the flow.
- A network operator checks if it matches the security policy rules.
- If accepted, the operator runs some traceroutes to determine which firewall devices must be updated.
- The operator connects to the different firewalls and updates the rules according to the requested communication flow.
- The operator notifies the Application Owner that the configuration is ready, so the Application Owner can proceed with the verification.
Disclaimer: Do not take this as an exhaustive analysis. In this example we are using the minimum information needed to illustrate the point. A real example will contain many more details.
With this draft of the manual operational steps to perform the network task, we can start thinking about what an automated version could look like.
Translate the Workflow Steps to Automated Tasks
Once the manual workflow steps are defined, it’s time to translate these steps into automated tasks, understanding what is required and adding some improvements along the way.
The next figure shows a candidate automated workflow to solve the firewall rule automation operation described.
You can notice that some steps are just taken from the manual workflow, and defined as automated tasks, such as “Application Owner requests access from App A to App B”. However, even though not described here in detail (for brevity), each one of these steps comes with specific requirements about data management.
Every automated workflow requires structured data that can be understood by a machine. So, at the workflow entry point we have to clearly define the minimum information required. In our example, to simplify, we will assume that an application name (source and destination) will suffice, and that the actual IPs and services will be looked up elsewhere, making the user experience simpler.
Simply by automating the same steps, we are already getting some benefits. Automation enforces data normalization and validation, which minimizes potential misunderstandings or copy-and-paste errors. Another thing we get out of the box from automation is consistency. Every operation will behave the same, depending not on an individual operator’s criteria but on those of the whole team that defined the automation solution.
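As a minimal sketch of what that entry-point data could look like, the following Python dataclass captures a request using only application names. The class name, the `requester` field, and the naming convention enforced by the regular expression are assumptions made for illustration; they are not part of the workflow described here.

```python
from dataclasses import dataclass
import re

# Hypothetical naming convention: lowercase letters, digits, and hyphens.
_APP_NAME = re.compile(r"^[a-z0-9][a-z0-9-]*$")


@dataclass(frozen=True)
class FlowRequest:
    """Minimum data the workflow entry point asks for (illustrative only)."""

    source_app: str
    destination_app: str
    requester: str

    def __post_init__(self):
        # Normalize free-form input so " App-A " and "app-a" mean the same thing.
        for field in ("source_app", "destination_app"):
            value = getattr(self, field).strip().lower()
            if not _APP_NAME.match(value):
                raise ValueError(f"{field} is not a valid application name: {value!r}")
            # The dataclass is frozen, so normalized values are set explicitly.
            object.__setattr__(self, field, value)
        if self.source_app == self.destination_app:
            raise ValueError("source and destination must be different applications")


request = FlowRequest(source_app=" App-A ", destination_app="app-b", requester="alice")
print(request)
```

Rejecting malformed input at this point means every later step can rely on the request being well formed, instead of each step re-checking it.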
But, once automation is in place, we can introduce some advanced steps that were not possible, or were harder, in the manual workflow. For instance, we can execute a pre- and post-validation of the firewall changes, and get feedback about the change that we are deploying.
When a workflow is automated, it is not mandatory to automate it 100 percent. In some cases, adding a manual judgment step could make sense, especially during the adoption phase.
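To make the last two ideas concrete, here is a small Python sketch of a change runner that wraps the deployment with pre- and post-validation and accepts an optional manual approval step. The function name, the callables, and the status strings are all hypothetical; a real solution would plug in the actual validation and deployment tooling.

```python
from typing import Callable, Optional


def run_change(
    pre_check: Callable[[], bool],
    deploy: Callable[[], None],
    post_check: Callable[[], bool],
    approve: Optional[Callable[[], bool]] = None,
) -> str:
    """Run one change and return a status string (all names are illustrative)."""
    if not pre_check():
        return "rejected: pre-validation failed"
    # Optional human judgment step, for example during the adoption phase.
    if approve is not None and not approve():
        return "rejected: not approved"
    deploy()
    if not post_check():
        return "failed: post-validation failed"
    return "success"


# Toy usage: the "deployment" just records the rule in a list.
deployed = []
status = run_change(
    pre_check=lambda: True,
    deploy=lambda: deployed.append("rule-1"),
    post_check=lambda: "rule-1" in deployed,
    approve=lambda: True,
)
print(status)  # success
```

Because the approval step is just another optional callable, it can be removed later without touching the rest of the workflow.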
Once we have described and understood what is needed in each of the automated workflow steps, it is time to use the network automation architecture.
Map the Automated Tasks to Architecture Components
At this point, we know what we require and expect from each of the steps that the workflow needs to implement. By using the network automation architecture, we can group some of these tasks into the same functional blocks, which gives us better insight when determining which tools best solve the requirements.
In the next figure you can observe how the workflow tasks are mapped to the different architecture components:
- User Interaction: contains the first and the last steps of the workflow: initiating it, providing the required data, and getting the confirmation that the request has been executed.
- Source of Truth: processes and ingests the data from the user and, using other data that defines the network intent, validates whether the request matches the security policy and determines which firewalls are in the network path between the two applications. Notice that this SoT implicitly converts the application names into the network data we require: IP addresses and service ports.
- Orchestration: triggers the automation engine process, which can be started in different ways.
- Automation Engine: here is where the magic happens. It converts the intent from the Source of Truth into real configuration artifacts that can be pre-validated, and finally deployed to the different network elements.
- Telemetry and Observability: after the network state is updated, this component collects the data that provides evidence that the change has succeeded, or not, and sends a user notification accordingly.
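The Source of Truth step above can be sketched in a few lines of Python. The application records, IP addresses, firewall names, and the site-based lookup below are all made up for this example; in particular, picking firewalls by site is a simplification of determining the real network path, and real data would come from the Source of Truth rather than from in-memory dictionaries.

```python
from dataclasses import dataclass
from itertools import product

# Hypothetical Source of Truth content (assumed data, not from a real system).
APPLICATIONS = {
    "app-a": {"ips": ["10.0.1.10", "10.0.1.11"], "site": "on-prem", "ports": []},
    "app-b": {
        "ips": ["172.16.5.20"],
        "site": "cloud",
        "ports": [("tcp", 443), ("tcp", 8443)],
    },
}
# Which firewall protects each site (also an assumption for this sketch).
SITE_FIREWALL = {"on-prem": "fw-onprem-01", "cloud": "cloud-fw-prod"}


@dataclass(frozen=True)
class Rule:
    src_ip: str
    dst_ip: str
    protocol: str
    port: int


def resolve_flow(source_app, destination_app):
    """Expand an app-to-app request into concrete rules and the firewalls to update."""
    src, dst = APPLICATIONS[source_app], APPLICATIONS[destination_app]
    # One rule per (source IP, destination IP, destination service) combination.
    rules = [
        Rule(s, d, proto, port)
        for s, d, (proto, port) in product(src["ips"], dst["ips"], dst["ports"])
    ]
    firewalls = sorted({SITE_FIREWALL[src["site"]], SITE_FIREWALL[dst["site"]]})
    return rules, firewalls


rules, firewalls = resolve_flow("app-a", "app-b")
print(len(rules), firewalls)
```

The user only ever supplies two application names, while the expansion into IP addresses and service ports stays inside the Source of Truth.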
Even though this is a really simplified explanation of the mapping process, it should give you an initial understanding of how the network automation architecture can be used.
Choose the Tools to Implement Each Component’s Tasks
Finally, it’s time to figure out which tools best solve the needs of each architectural component. Keep in mind that, in reality, this process is usually strongly influenced by the current tech stack or by other automation projects, which could add some constraints to the decision.
In the next figure, we can see a selection of open-source tooling that could solve the requirements of the automation workflow presented in this example.
Disclaimer: To illustrate this example we have selected some open-source tools. But there are many other options, both open-source and from vendors, that are totally valid to solve it.
- User Interaction: Mattermost is an instant messaging application to allow user interaction, and Grafana offers a dashboard for visualization of the flow statistics.
- Source of Truth: Consul provides a dynamic mapping of application names to IP addresses. Git contains the Jinja templates to create the CLI configuration artifacts for the on-premises firewall. And Nautobot, with the Firewall Models extension, offers the abstraction of firewall rules for both environments, along with the definition of the on-premises and cloud network inventory and CMDB.
- Orchestration: AWX will start the workflow execution via a manual trigger, or from a webhook triggered from Nautobot when a new firewall rule is created, updated, or deleted.
- Automation Engine: Batfish provides pre-validation of network configurations. Then Ansible will be used to configure the on-premises network firewall, rendering the configuration from the Source of Truth, and Terraform will also use the Source of Truth intent to provision the cloud firewall services.
- Telemetry and Observability: the Telegraf collector will get network data metrics from the firewall services and store them in Prometheus for later consumption by other automation processes or by Grafana.
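One point worth illustrating from this tool selection is that a single intent from the Source of Truth feeds two very different outputs: a CLI artifact for the on-premises firewall and structured data for the cloud firewall. The Python sketch below shows the idea under clear assumptions: `string.Template` stands in for a Jinja template to keep the example dependency-free, the CLI syntax is invented and matches no real vendor, and the cloud rule fields are hypothetical rather than any provider's schema.

```python
from string import Template

# Stand-in for a Jinja template stored in Git (string.Template keeps this stdlib-only).
# The CLI syntax is invented for illustration and matches no real vendor.
ONPREM_TEMPLATE = Template("permit $protocol from $src_ip to $dst_ip port $port")


def render_onprem(rule):
    """Render one rule as a CLI line for the on-premises firewall."""
    return ONPREM_TEMPLATE.substitute(rule)


def render_cloud(rule):
    """Render the same rule as structured data for a cloud provisioning tool."""
    # Field names are hypothetical, not a real provider schema.
    return {
        "action": "allow",
        "protocol": rule["protocol"],
        "source_cidr": f"{rule['src_ip']}/32",
        "destination_cidr": f"{rule['dst_ip']}/32",
        "port": rule["port"],
    }


# The same intent drives both platforms.
rule = {"src_ip": "10.0.1.10", "dst_ip": "172.16.5.20", "protocol": "tcp", "port": 443}
print(render_onprem(rule))
print(render_cloud(rule))
```

Keeping the intent separate from the rendering means a new firewall platform only needs a new renderer, while the data model stays the same.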
And this is just the beginning of the game! Once all these tools are in place, you can look to reuse them for other workflows, or add new functionalities that they provide that were not in the initial set of requirements. Now that they are broken out by component, you can also replace tools if they are no longer the correct tool or add new ones to complement them when new requirements appear.
What’s Next
Hopefully you liked this blog and got a better understanding of how the network automation architecture can help you to approach building network automation solutions.
But fasten your seat belts, because the meat and potatoes comes now! The next blogs will cover in more detail each of the architecture components, describing the features and challenges that must be taken into account.
-Christian