This past week I had the honor and privilege of traveling out to Santa Clara, CA with some of my esteemed colleagues to participate in Networking Field Day 21 on behalf of Network to Code. My contribution to our joint presentation was an overview of the various components that go into building a successful network automation platform. While this was only one section of our overall presentation, the delegates proved to be very engaged with these concepts. At Network to Code, we try not to focus on individual technologies, and instead focus on transformational ideas and workflows which bring value to our clients. To that end, this section dealt with the higher-level concepts and components that go into building a successful network automation platform. I want to call out a couple of sections and points here, but be sure to check out the full video from NFD21 that goes into even more detail and covers other architectural components such as Configuration Management, Orchestration, and Verification & Testing.
The tools and technologies that fall into this section deal with directly exposing interfaces to the automation that we build for the network. These are things like our IT Service Management (ITSM) ticketing applications, but also chat and communication platforms. ChatOps is a term used a lot in the industry and continues to pick up steam in network automation. Integrations with chat platforms allow a business to push network data and workflows into channels where conversations are already taking place. Perhaps more importantly, these avenues expose our network automation directly to business stakeholders outside of the networking teams.
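To make that concrete, here is a minimal sketch of the ChatOps idea: a chat command handler that looks up network data and posts the result back into the channel where the conversation is happening. The webhook URL, the command format, and the get_device_status() helper are all hypothetical placeholders rather than any specific chat platform's API.

```python
# Sketch of a ChatOps-style integration: handle a chat command, fetch network
# data from the automation platform, and post the answer back to the channel.
import json
import urllib.request

CHAT_WEBHOOK_URL = "https://chat.example.com/hooks/netops"  # placeholder URL


def get_device_status(hostname: str) -> dict:
    """Stand-in for a real lookup against the automation platform's API."""
    return {"hostname": hostname, "status": "up", "uptime_days": 42}


def handle_chat_command(text: str) -> None:
    """Handle a message like '/network status core-sw-01' from a chat channel."""
    _, _, hostname = text.split()
    status = get_device_status(hostname)
    message = f"{status['hostname']} is {status['status']} ({status['uptime_days']} days uptime)"
    payload = json.dumps({"text": message}).encode("utf-8")
    request = urllib.request.Request(
        CHAT_WEBHOOK_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(request)  # post the result back into the conversation


# With a real webhook URL configured, this would answer the question in-channel:
# handle_chat_command("/network status core-sw-01")
```

The point is not the plumbing; it is that the people asking "is this device up?" never have to leave the conversation to get an answer.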
If I were to pick a single section in my talk to call out as the most important, it would be this one. In terms of network automation, the industry is not talking about data enough. As with any other vertical in the tech space, data underpins everything we do, and network automation is no different. As the network automation community has grown, so has understanding of the concept of a Source of Truth (SoT): an authoritative system of record for a particular data domain. That last part is key, because we can (and realistically do) have multiple sources of truth whose data domains do not overlap. For example, our IPAM and DCIM can be different systems because they control different domains of data. This is valid as long as we do not have more than one IPAM or more than one DCIM tool; that is where the phrase "Single Source of Truth" comes from, not from there being only one system in total.
Still though, having many different systems creates problems of its own. At first pass, each system in our network automation toolbox would potentially need to reference many different systems to get the data needed to perform automation. More importantly, this tightly couples the client to the data source and format. To combat this, we strive to implement an aggregation layer between the sources of truth and the systems consuming their data. This aggregation layer serves a few important purposes.
First, it creates a single pane of glass for accessing all of the data from our various authoritative systems, thus allowing our tooling to reference a single place. Second, the aggregator implements a data translation layer which transforms the data coming from each source of truth into an abstracted data model. This data model intentionally strips away any features of the data or its structure which make it identifiable with any vendor or source implementation.
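As a rough illustration of what such an abstracted model might look like, here is a minimal sketch of a vendor-neutral IPAM record. The field names are illustrative assumptions, not a published schema; the point is that nothing about the structure betrays which IPAM product the data came from.

```python
# Sketch of an abstracted, vendor-neutral data model the aggregation layer
# could normalize source of truth data into.
from dataclasses import dataclass


@dataclass
class IPAddressRecord:
    """Abstracted IPAM record, stripped of any vendor-specific fields."""
    address: str      # e.g. "10.0.10.5/24"
    vrf: str          # routing context the address lives in
    description: str  # free-form description
    source: str       # which source of truth the record came from
```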
In doing so, we segue into the third point, which is that the aggregator interacts with the various source of truth systems in a pluggable way. By implementing an adapter, the aggregator understands the data and format coming from an individual source of truth and how to transform that data to conform to the abstracted data model. This allows the aggregator to easily implement support for different source of truth platforms, such that they can be swapped out at any time. If you want to switch IPAM vendors, all you have to do is create an adapter for the aggregator (or engage with NTC to build one) that understands what the data looks like coming out of the new IPAM.
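Here is a minimal sketch of that pluggable adapter idea, assuming the IPAddressRecord model from the sketch above. Each adapter knows one vendor's payload shape and translates it into the abstracted model, so the consumers of the aggregator never see vendor-specific structure. The vendor payload field names here are hypothetical.

```python
# Sketch of a pluggable source of truth adapter for the aggregation layer.
from typing import Protocol


class IPAMAdapter(Protocol):
    """Contract every IPAM adapter implements for the aggregator."""
    def to_records(self, raw: list[dict]) -> list[IPAddressRecord]: ...


class VendorAIPAMAdapter:
    """Adapter for a hypothetical 'Vendor A' IPAM export format."""

    def to_records(self, raw: list[dict]) -> list[IPAddressRecord]:
        return [
            IPAddressRecord(
                address=item["cidr"],
                vrf=item.get("routing_instance", "default"),
                description=item.get("notes", ""),
                source="vendor-a-ipam",
            )
            for item in raw
        ]


# Swapping IPAM vendors means writing a new adapter, not changing consumers:
adapter: IPAMAdapter = VendorAIPAMAdapter()
records = adapter.to_records([{"cidr": "10.0.10.5/24", "routing_instance": "prod"}])
```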
It may seem a bit odd to be talking about monitoring and alerting in the context of network automation, but there is more to what we do than just configuration management. In fact, the core of this topic is centered around the concept of "closed loop automation," or manufacturing a feedback loop into the automation platform that we build. In the video, you will hear me talk about the automation platform as a stack: on one side we travel down the stack to the actual network infrastructure, and on the other side, events come out of the network and travel back up the stack. Those events still come in the traditional forms of SNMP polling, syslog messages, and so on, but they can also come in newer forms such as time series metrics and streaming telemetry. We have also revisited the storage and processing layer to implement more modern time series databases, which allow data points to be tagged with metadata labels and open the door to more intelligent querying and visualization. Speaking of visualization, we want to empower business stakeholders with network data, and we want to do it in a self-service fashion through modern dashboarding tools. Again, this is a case of bringing the network data and workflows to the business.
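To illustrate what "tagging data points with metadata labels" buys us, here is a minimal sketch of a labeled time series data point. The structure and label names are illustrative only and not tied to any specific time series database.

```python
# Sketch of a time series data point carrying metadata labels, so it can be
# queried by site, role, or interface rather than only by device name.
import time
from dataclasses import dataclass, field


@dataclass
class DataPoint:
    metric: str
    value: float
    timestamp: float = field(default_factory=time.time)
    labels: dict[str, str] = field(default_factory=dict)


point = DataPoint(
    metric="interface_in_octets",
    value=1.2e9,
    labels={"device": "core-sw-01", "site": "iad1", "role": "core", "interface": "Ethernet1/1"},
)

# Labels make questions like "show me all core interfaces at site iad1"
# answerable without parsing hostnames or relying on naming conventions.
```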
Back to the storage engine, we need to get those network events out of their monitoring silos and fed back into the automation platform. We do this with the aid of rules processing engines which assert business logic based on the metadata attached to the collected data points. Once the event streams have been plumbed back into our network automation platform, our feedback loop begins to take shape. We can now build automated remediation workflows which let engineers focus more on actual engineering and architecture, and less on troubleshooting and remediating known, repeated events. In situations where human involvement is needed, the platform can at least collect important context from the network as a form of first-level triage, obviating the need for manual data collection and reducing the overall time necessary to respond to network events.
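Here is a minimal sketch of that rules processing step in the feedback loop: inbound events carry labels, rules assert business logic against those labels, and a matching event triggers a remediation workflow. The rule format and the remediate_bgp_neighbor() workflow are hypothetical examples, not a specific product's rule syntax.

```python
# Sketch of a rules engine asserting business logic on labeled network events
# and kicking off automated remediation when a rule matches.
def remediate_bgp_neighbor(event: dict) -> None:
    """Stand-in for launching a remediation workflow in the automation platform."""
    print(f"launching BGP remediation workflow for {event['labels']['device']}")


RULES = [
    {
        "match": {"metric": "bgp_neighbor_state", "role": "edge"},
        "condition": lambda event: event["value"] == 0,  # neighbor is down
        "action": remediate_bgp_neighbor,
    },
]


def process_event(event: dict) -> None:
    """Run an inbound network event through the rules engine."""
    for rule in RULES:
        labels = {**event.get("labels", {}), "metric": event["metric"]}
        if all(labels.get(k) == v for k, v in rule["match"].items()) and rule["condition"](event):
            rule["action"](event)


process_event({
    "metric": "bgp_neighbor_state",
    "value": 0,
    "labels": {"device": "edge-rtr-02", "role": "edge", "site": "iad1"},
})
```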
The final topic I want to bring up is the idea that no one network automation platform will ever suit the needs of all networks and organizations. The more important takeaway from this talk is the set of components that go into architecting the platform that best suits your needs. It is true that company A may be able to purchase an off-the-shelf automation product from a vendor and be perfectly happy, while company B may require an entirely custom solution built over a longer evolution timeline. In all cases, Network to Code is here to provide training, enablement, and implementation services.
-John Anderson (@lampwins)