Introducing Design Builder: Design Driven Network Automation


Most people involved in network automation are familiar with the concept of a Source of Truth (SoT). The SoT is usually some form of database that maintains the intended state of objects as well as their interdependencies. The SoT provides a way to quickly ascertain what a network’s intended state should be, while often providing a way to see what the network’s actual state is. A new concept is emerging, known as the Design Oriented Source of Truth. This idea takes network designs and codifies them, attaching additional meaning to the objects within the SoT. Nautobot is a source of truth that contains all sorts of information about a network’s state. Although many of the pieces of information within Nautobot are related, they are discretely managed. A new Nautobot App aims to simplify the process of codifying network designs and populating Nautobot objects based on these designs.

Introduction

It is very common to have a small set of standardized designs that are used to deploy many sites and services in enterprise networks. For example, branch office sites may have a few different designs depending on their size. There could be a design that uses a single branch office router for small sites. Another design could have two routers and an access switch for sites with a moderate user base. A third design could include a more complex switching infrastructure for sites with many employees. When companies do tech refreshes or new site builds, these standardized designs are used and new data must be created in the source of truth. The newly open-sourced Design Builder application was created to address this problem, and fulfills the idea that a standardized design can be taken from a network engineer and transformed into a format that can be consumed and executed by Nautobot. Design Builder can expand a minimal set of inputs into a full-fledged set of configuration objects within Nautobot. This includes any kind of data object that Nautobot can model, from Rack and Device objects to IP addresses and BGP peering information.

Design Builder provides powerful mechanisms that keep designs simple. The first is the ability to represent interrelated data in a meaningful hierarchy. For example, devices have interfaces, and interfaces have IP addresses. Conceptually this seems like a very simple structure. However, if we were to create objects like this manually through the REST API or ORM, we would first have to create a device object and keep its ID in memory. We would then have to create interfaces with their device foreign key set to the device ID we just created. Finally, we’d have to save all of the interface IDs and do the same with IP addresses. Design Builder provides a means to represent objects in YAML and produce their representation within the Nautobot database. A typical design workflow is illustrated in the following diagram:

Following this process, we can produce YAML files that intuitively represent the structure of the data we want to create. Here is an example Design Builder design:

devices:
  - name: "Router 1"
    status__name: "Active"
    interfaces:
      - name: "GigabitEthernet0"
        type: "1000base-t"
        status__name: "Active"
        ip_addresses:
          - address: "192.168.0.1/24"
            status__name: "Active"

This YAML document would produce a single device, with a single Gigabit Ethernet interface. The interface itself has a single IP address. As demonstrated in the example, Design Builder automatically associates the parent/child relationships correctly, and there is no need to keep copies of primary and foreign keys. We can visually represent this YAML design with the following diagram:
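To make that bookkeeping concrete, here is a minimal Python sketch (a hypothetical illustration, not Design Builder’s actual implementation) of how a nested design structure can be flattened into creation records, with each child automatically linked to its parent:

```python
# Hypothetical illustration of the parent/child bookkeeping Design Builder
# performs for you; the key-to-model mapping below is invented for the example.
CHILD_KEYS = {"interfaces": "interface", "ip_addresses": "ip_address"}

def flatten_design(objects, model="device", parent=None):
    """Walk a nested design structure and yield one creation record per object."""
    for obj in objects:
        attrs = {k: v for k, v in obj.items() if k not in CHILD_KEYS}
        yield {"model": model, "attrs": attrs, "parent": parent}
        for key, child_model in CHILD_KEYS.items():
            # Recurse into nested lists, linking children back to this object.
            yield from flatten_design(obj.get(key, []), child_model, attrs.get("name"))

design = [
    {
        "name": "Router 1",
        "status__name": "Active",
        "interfaces": [
            {
                "name": "GigabitEthernet0",
                "type": "1000base-t",
                "status__name": "Active",
                "ip_addresses": [
                    {"address": "192.168.0.1/24", "status__name": "Active"}
                ],
            }
        ],
    }
]

records = list(flatten_design(design))
# Three records: the device, its interface (parent "Router 1"),
# and the IP address (parent "GigabitEthernet0").
```

Design Builder performs this traversal for us, so the YAML never needs to mention a primary or foreign key.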

Design Builder also provides a system to query for existing related objects using some attribute of the associated object. In the above example, the status field is actually a related object. Statuses are not just simple strings; they are first-class objects within the Nautobot database. In this case, the Status object named Active is predefined in Nautobot and does not need to be created. It does, however, need to be associated with the Device, Interface, and IPAddress objects.

This object relationship is actually a foreign-key relationship in the database and ORM. If we were using the Django ORM to associate objects, we would first need to look up the status before creating the associated objects. Design Builder provides a way to perform that lookup as part of the model hierarchy. Note that we’re looking up the status by its name: status__name. Design Builder has adopted a syntax similar to Django’s field lookups: the field name and related field are separated by double underscores.
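The double-underscore convention can be illustrated with a small, self-contained sketch (again, not Design Builder’s internal code) that splits a key like status__name into the field to set and the attribute to query by:

```python
def split_lookup(key):
    """Split a Django-style lookup key into (field, query_attribute).

    "status__name" -> ("status", "name")   # look up the Status object by name
    "address"      -> ("address", None)    # plain attribute, no related lookup
    """
    field, sep, query_attr = key.partition("__")
    return (field, query_attr if sep else None)

assert split_lookup("status__name") == ("status", "name")
assert split_lookup("address") == ("address", None)
```

Any key containing a double underscore triggers a query for an existing object rather than setting a literal value.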

Use Cases

Design Builder covers many use cases, but we will highlight a very simple one in this post. Our example handles the creation of edge site designs within Nautobot, a need often seen when doing tech refreshes or new site build-outs.

Engineers commonly need to add a completely new set of data for a site. This could be the result of a project to refresh a site’s network infrastructure, or it could be part of deploying a new site entirely. Even with small sites, the objects needing to be created or updated in Nautobot could number in the dozens or even hundreds. However, if a standardized design is developed, Design Builder can auto-populate all of the data for new or refreshed sites.

Consider the following design, which will create a new site with two edge routers, a single /24 prefix, and two circuits for the site:

---
sites:
  - name: "LWM1"
    status__name: "Staging"
    prefixes:
      - prefix: "10.37.27.0/24"
        status__name: "Reserved"
    devices:
      - name: "LWM1-LR1"
        status__name: "Planned"
        device_type__model: "C8300-1N1S-6T"
        device_role__name: "Edge Router"
        interfaces:
          - name: "GigabitEthernet0/0"
            type: "1000base-t"
            description: "Uplink to backbone"
            status__name: "Planned"
      - name: "LWM1-LR2"
        status__name: "Planned"
        device_type__model: "C8300-1N1S-6T"
        device_role__name: "Edge Router"      
        interfaces:
          - name: "GigabitEthernet0/0"
            type: "1000base-t"
            description: "Uplink to backbone"
            status__name: "Planned"

circuits:
  - cid: "LWM1-CKT-1"
    status__name: "Planned"
    provider__name: "NTC"
    type__name: "Ethernet"
    terminations:
      - term_side: "A"
        site__name: "LWM1"
      - term_side: "Z"
        provider_network__name: "NTC-WAN"

  - cid: "LWM1-CKT-2"
    status__name: "Planned"
    provider__name: "NTC"
    type__name: "Ethernet"
    terminations:
      - term_side: "A"
        site__name: "LWM1"
      - term_side: "Z"
        provider_network__name: "NTC-WAN"

This is still quite a bit of information to write. Luckily, Design Builder can consume Jinja templates to produce the design files. With some Jinja templating, we can reduce the above design a bit:


---
sites:
  - name: "LWM1"
    status__name: "Staging"
    prefixes:
      - prefix: "10.37.27.0/24"
        status__name: "Reserved"
    devices:
    {% for i in range(1, 3) %}
      - name: "LWM1-LR{{ i }}"
        status__name: "Planned"
        device_type__model: "C8300-1N1S-6T"
        device_role__name: "Edge Router"
        interfaces:
          - name: "GigabitEthernet0/0"
            type: "1000base-t"
            description: "Uplink to backbone"
            status__name: "Planned"
    {% endfor %}
circuits:
  {% for i in range(1, 3) %}
  - cid: "LWM1-CKT-{{ i }}"
    status__name: "Planned"
    provider__name: "NTC"
    type__name: "Ethernet"
    terminations:
      - term_side: "A"
        site__name: "LWM1"
      - term_side: "Z"
        provider_network__name: "NTC-WAN"
  {% endfor %}

The above design file gets closer to a reusable design. It reduces the amount of information we have to represent by leveraging Jinja2 control structures, but there is still statically defined information. At the moment, the design hard-codes site information (the site name, device names, and circuit IDs) as well as an IP prefix. Design Builder also provides a way for this information to be gathered dynamically. Fundamentally, all designs are just Nautobot Jobs. Therefore, a design Job can include user-supplied variables that are then copied into the Jinja2 render context. Consider the design Job for our edge site design:

# Import paths shown here are assumptions and may vary by Nautobot/Design Builder version.
from nautobot.extras.jobs import IPNetworkVar, StringVar
from nautobot_design_builder.design_job import DesignJob


class EdgeDesign(DesignJob):
    """A basic design for design builder."""

    site_name = StringVar(label="Site Name", regex=r"\w{3}\d+")
    site_prefix = IPNetworkVar(label="Site Prefix")

    # ...

This design Job collects a site_name variable as well as a site_prefix variable from the user. Users provide values for these variables through the normal Job launch entrypoint:

Once the job has been launched, Design Builder adds these input variables to the Jinja2 rendering context. The variable names within the Jinja2 template match the attribute names used in the design Job class. With the site_name and site_prefix variables now defined dynamically, we can produce a final design document using them:

---

sites:
  - name: "{{ site_name }}"
    status__name: "Staging"
    prefixes:
      - prefix: "{{ site_prefix }}"
        status__name: "Reserved"
    devices:
    {% for i in range(1, 3) %}
      - name: "{{ site_name }}-LR{{ i }}"
        status__name: "Planned"
        device_type__model: "C8300-1N1S-6T"
        device_role__name: "Edge Router"
        interfaces:
          - name: "GigabitEthernet0/0"
            type: "1000base-t"
            description: "Uplink to backbone"
            status__name: "Planned"
    {% endfor %}
circuits:
  {% for i in range(1, 3) %}
  - cid: "{{ site_name }}-CKT-{{ i }}"
    status__name: "Planned"
    provider__name: "NTC"
    type__name: "Ethernet"
    terminations:
      - term_side: "A"
        site__name: "{{ site_name }}"
      - term_side: "Z"
        provider_network__name: "NTC-WAN"
  {% endfor %}

The design render context is actually much more flexible than simple user entry via Job variables. Design Builder provides a complete system for managing the render context, including loading variables from YAML files and providing dynamic content via Python code. The official documentation covers all of the capabilities of the design context.

In addition to the YAML rendering capabilities, Design Builder includes a way to perform just-in-time operations while creating and updating Nautobot objects. For instance, in the above example, the site prefix is specified by the user who launches the job. It may be desirable for this prefix to be auto-assigned and provisioned out of a larger parent prefix. Design Builder provides a means to perform these just-in-time lookups and calculations in the form of “action tags”. Action tags are evaluated during the object creation phase of a design’s implementation, so database lookups and computations can take place as the design is being implemented. One of the provided action tags is next_prefix. This tag accepts query parameters to find a parent prefix, as well as a parameter that specifies the length of the new prefix to provision. For example, if we want to provision a /24 prefix from the 10.0.0.0/16 parent, we could use the following:

prefixes:
  - "!next_prefix":
      prefix: "10.0.0.0/16"
      length: 24
    status__name: "Active"

The next_prefix action tag will find the parent prefix 10.0.0.0/16 and look for the first available /24 within it. Once found, Design Builder will create that child prefix with the status Active.
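The behavior can be approximated with Python’s standard ipaddress module. This is a simplified stand-in for the real tag, which queries Nautobot for allocated child prefixes rather than taking them as a list:

```python
import ipaddress

def next_available_prefix(parent, new_length, allocated):
    """Return the first subnet of `parent` with prefix length `new_length`
    that does not overlap any prefix in `allocated`.

    Simplified stand-in for the next_prefix action tag's logic.
    """
    parent_net = ipaddress.ip_network(parent)
    taken = [ipaddress.ip_network(p) for p in allocated]
    for candidate in parent_net.subnets(new_prefix=new_length):
        if not any(candidate.overlaps(existing) for existing in taken):
            return candidate
    raise ValueError(f"no /{new_length} available in {parent}")

# With 10.0.0.0/24 already allocated, the next free /24 is 10.0.1.0/24.
print(next_available_prefix("10.0.0.0/16", 24, ["10.0.0.0/24"]))  # 10.0.1.0/24
```

The action tag performs this kind of search at implementation time, so the chosen prefix always reflects the current state of the database.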

Several action tags are provided out of the box, but one of the most powerful features of Design Builder is the ability to include custom action tags in a design. Action tags are implemented in Python as specialized classes, and can perform any operation necessary to produce a just-in-time result.

There is quite a lot to understand with Design Builder, and we have only touched on a few of its capabilities. While there are several moving parts, the following diagram illustrates the high-level process that the Design Builder application uses to go from design files and templates to an implemented design.

Design Builder starts with optional input variables from the Nautobot Job and combines them with optional context variables written in YAML, Python, or both. This render context is used by the Jinja2 renderer to resolve variable names in Jinja2 templates. The templates are rendered into YAML documents that are unmarshaled into Python dictionaries and provided to the Builder. The Builder iterates over all of the objects in this dictionary and performs the necessary database creations and updates. In the process of creating and updating objects, any action tags that are present are evaluated. The final result is a set of objects in Nautobot that have been created or updated by Design Builder.

Roadmap

Our plans for Design Builder are far from complete. There are many more features we’re currently working on, as well as some that are still in the planning stages. Near-term features include design lifecycle and object protection.

The design lifecycle feature allows implementations of a design to be tracked. Design instances (such as an instance of the edge site design above) can be created and subsequently decommissioned. Objects that belong to a decommissioned design instance will be reverted to their state prior to the design implementation, or removed entirely (if created specifically for the design). Designs can also track inter-design dependencies, so that a design instance cannot be decommissioned while other design instances depend on it. The design lifecycle feature will also allow designs to be versioned, so that an implementation can be updated over time.

The ability to protect objects that belong to a design is also planned. The idea is that if an object is created as part of a design implementation, any attributes that were initially set in that design cannot be updated outside of the design’s lifecycle. This object protection ensures that the source of truth holds data that complies with the design and prevents manually introduced errors.


Conclusion

Design Builder is a great tool that ensures your network designs are followed for every deployment, and it simplifies populating data in Nautobot along the way. It provides a streamlined way to represent hierarchical relationships, with a clear syntax and concepts that should be familiar to those who have embarked on their NetDevOps journey. I encourage you to try it out.

-Andrew, Christian and Paddy




Circuit Maintenance Parser Powered by AI/ML


More than two years ago, NTC released the circuit-maintenance-parser Python library to facilitate the arduous job of understanding what network service providers say when they send circuit maintenance notifications without any normalized format. We explained the why and the how in two blogs: 1 and 2. The library has proven useful, but recently we challenged ourselves: how could technologies like Artificial Intelligence and Machine Learning (AI/ML) make it even better?

Recap of Two Years, and What’s Next?

The circuit-maintenance-parser library provides several parsers for transforming circuit maintenance notifications from many network providers into normalized ones, making it very easy to digest them programmatically.

Over these two years, we have seen new parsers added, together with updates and fixes for the existing ones (you can check the complete list of currently supported providers in the repository README); just to name a few: NTT, AWS, Equinix, Cogent, COLT, and EXA. We have also received notice of many users of the library worldwide!

An example of an application leveraging the library is the Nautobot Circuit Maintenance App that fetches emails from network providers, parses them, and updates the related circuits in Nautobot.

The parsers can work on many different data types (iCal, plain text, HTML, CSV, etc.). There is a generic implementation that works on a proposed reference format defined in this BCOP.

To better understand the new changes, it helps to first explain the four basic entities of the library:

  • Provider: represents a network service provider and can leverage several Processors in a specific order (if one fails, it tries the next).
  • Processor: combines the structured data parsed by one or several Parsers to create one or several Maintenances.
  • Parser: extracts structured data from a raw notification.
  • Maintenance: the outcome of the parsing process; it adheres to the reference format mentioned above.
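The Provider’s fallback behavior can be sketched in a few lines of plain Python. This is a simplified, hypothetical model of the chain, not the library’s actual API:

```python
class ParsingError(Exception):
    """Raised when a processor cannot produce a Maintenance."""

def run_processors(processors, raw_notification):
    """Try each processor in order; return the first successful result.

    Mirrors, in simplified form, how a Provider falls back through its
    Processors when one of them fails.
    """
    errors = []
    for processor in processors:
        try:
            return processor(raw_notification)
        except ParsingError as exc:
            errors.append(str(exc))
    raise ParsingError(f"all processors failed: {errors}")

def strict(raw):
    # Stands in for a human-defined parser that only accepts one format.
    raise ParsingError("not in the expected format")

def lenient(raw):
    # Stands in for a more forgiving, lower-confidence parser.
    return {"status": "CONFIRMED", "raw": raw}

result = run_processors([strict, lenient], "maintenance notice")
# The strict processor fails, so the lenient one produces the result.
```

This ordering matters later in the post: the LLM parser slots in as the last element of such a chain.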

So far, so good. The library has been able to evolve and adapt to new requirements. However, every update requires a human to modify or create a parser (i.e., develop the logic, open a PR, and have it accepted and released).

Nowadays, with the explosion of Large Language Models (LLMs) as a subset of Machine Learning technologies, text processing is being transformed by new opportunities, and we believe the circuit-maintenance-parser is a great use case to explore them. So, let’s see how we approached it.

Understanding How LLM Parsers Work

In short, a circuit maintenance notification is text containing key information that needs to be extracted and normalized according to the library requirements. This is what we set out to solve, following these guidelines:

  • A new Parser, called LLM, implements the logic required to prompt a model for the parsed response. It needs to be implemented for a specific platform (e.g., OpenAI) in order to interact with that platform using predefined hooks (i.e., to craft the API calls that each platform provides).
  • Every Provider can include, as a last resort, a Processor that contains an LLM parser implementation, when certain conditions are met. Thus, the LLM parser is never the first parsing option: human-defined parsers are used first, and only if all of them fail are the LLM parsers taken into account.
  • The Maintenance object comes with a new Metadata attribute which provides information about the Provider, Processor, and Parsers used in the information extraction. This is very important for library users to consider when using the data, because the level of confidence is not the same for all the parsers.

Hopefully this makes sense; now it’s time to see it in action.

Let’s Use It

First, we need to install the library with the openai extra (OpenAI is the only LLM platform implemented for now).

pip install "circuit-maintenance-parser[openai]"

Then, using the built-in CLI tool (i.e., circuit-maintenance-parser), we can see how it works, leveraging example data from the tests.

You could reproduce the same results by interacting directly with the library, but the CLI offers a simpler interface for demonstration purposes.

Before getting into the magic of LLM, let’s see how the library works without LLM-powered parsers (the default option).

$ circuit-maintenance-parser --data-file tests/unit/data/aws/aws1.eml --data-type email  --provider-type aws -v
Circuit Maintenance Notification #0
{
  "account": "0000000000001",
  "circuits": [
    {
      "circuit_id": "aaaaa-00000001",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000002",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000003",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000004",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000005",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000006",
      "impact": "OUTAGE"
    }
  ],
  "end": 1621519200,
  "maintenance_id": "15faf02fcf2e999792668df97828bc76",
  "organizer": "aws-account-notifications@amazon.com",
  "provider": "aws",
  "sequence": 1,
  "stamp": 1620337976,
  "start": 1621497600,
  "status": "CONFIRMED",
  "summary": "Planned maintenance has been scheduled on an AWS Direct Connect router in A Block, New York, NY from Thu, 20 May 2021 08:00:00 GMT to Thu, 20 May 2021 14:00:00 GMT for 6 hours. During this maintenance window, your AWS Direct Connect services listed below may become unavailable.",
  "uid": "0"
}
Metadata #0
provider='aws' processor='CombinedProcessor' parsers=['EmailDateParser', 'TextParserAWS1', 'SubjectParserAWS1'] generated_by_llm=False

At this point, you can see that the parsing ran successfully, producing one Maintenance, with the new Metadata providing info on how it was parsed.

You can see that we used the provider-type option to tell the library which provider to use (aws). Without this information, the library can’t parse the notification properly, because it defaults to the GenericProvider, which only understands the iCal data type in the BCOP recommended format. Let’s try it:

$ circuit-maintenance-parser --data-file tests/unit/data/aws/aws1.eml --data-type email -v
Provider processing failed: Failed creating Maintenance notification for GenericProvider.
Details:
- Processor SimpleProcessor from GenericProvider failed due to: None of the supported parsers for processor SimpleProcessor (ICal) was matching any of the provided data types (email-header-date, email-header-subject, text/plain).

Now, let’s see how the new OpenAI parser (implementing the LLM) can help us. The only mandatory step to activate it is setting the PARSER_OPENAI_API_KEY environment variable:

export PARSER_OPENAI_API_KEY="use your token here"

By default, it uses the GPT-3.5 model, but you can change this with the PARSER_OPENAI_MODEL environment variable. To see all the available options (including options to customize the LLM prompt), check the docs.

At this point, every Provider will have the OpenAI parser as a last resort.

Let’s repeat the previous example without providing the provider-type (your output may differ; LLM output is not deterministic), and notice the Metadata associated with this output, which lists the parsers used. You will also see that this run takes slightly longer than before because the OpenAI API is being called.

$ circuit-maintenance-parser --data-file tests/unit/data/aws/aws1.eml --data-type email -v
Circuit Maintenance Notification #0
{
  "account": "Amazon Web Services",
  "circuits": [
    {
      "circuit_id": "aaaaa-00000001",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000002",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000003",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000004",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000005",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000006",
      "impact": "OUTAGE"
    }
  ],
  "end": 1621519200,
  "maintenance_id": "aaaaa-00000001",
  "organizer": "unknown",
  "provider": "genericprovider",
  "sequence": 1,
  "stamp": 1620337976,
  "start": 1621497600,
  "status": "CONFIRMED",
  "summary": "Planned maintenance has been scheduled on an AWS Direct Connect router in A Block, New York, NY for 6 hours.",
  "uid": "0"
}
Metadata #0
provider='genericprovider' processor='CombinedProcessor' parsers=['EmailDateParser', 'OpenAIParser'] generated_by_llm=True

The output should be a “similar” successful parsing to the one above. However, a closer look will reveal some differences; some may be acceptable, others not. With the metadata (including a generated_by_llm boolean), the library user can choose how this information should be managed, perhaps adding extra validation before accepting it.

If you use one of the available tools to compare JSON objects (such as https://www.jsondiff.com/), you can see the differences (your output may vary). Keep in mind that you may need to discard or adjust some of the information.

{
  "account": "Amazon Web Services",
  "maintenance_id": "aaaaa-00000001",
  "organizer": "unknown",
  "provider": "genericprovider",
  "summary": "Planned maintenance has been scheduled on an AWS Direct Connect router in A Block, New York, NY for 6 hours."
}

And, if you are wondering what would happen if you properly set the provider type, the result will be exactly the same as before, because the aws provider knows how to parse the notification and the LLM parser is never reached.
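As an example of acting on that metadata, a consumer of the library might gate LLM-derived results behind extra validation using the generated_by_llm flag. The helper below is hypothetical; only the Metadata fields mirror the CLI output shown above:

```python
from dataclasses import dataclass, field

@dataclass
class Metadata:
    # Mirrors the fields shown in the CLI output above.
    provider: str
    processor: str
    parsers: list = field(default_factory=list)
    generated_by_llm: bool = False

def accept_maintenance(maintenance, metadata, validate):
    """Accept human-parsed results directly; validate LLM-derived ones.

    Hypothetical consumer-side helper; the library only supplies the metadata.
    """
    if metadata.generated_by_llm:
        return validate(maintenance)
    return True

llm_meta = Metadata("genericprovider", "CombinedProcessor",
                    ["EmailDateParser", "OpenAIParser"], generated_by_llm=True)
# Require a known organizer before trusting an LLM-parsed notification.
ok = accept_maintenance({"organizer": "unknown"}, llm_meta,
                        lambda m: m["organizer"] != "unknown")
# ok is False: this LLM output needs human review before use.
```

The same pattern extends to any per-field check a team cares about, while notifications parsed by human-written parsers flow through unchanged.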


Conclusion

At NTC, we are constantly considering how to leverage AI/ML technologies to support network automation use cases for all the different components of our recommended architecture (more info in this blog series), and this new feature is an example of how our open source projects can be powered by them.

We would like to encourage you to give it a try, and provide constructive feedback in the form of Issues or Feature Requests in the library repository.

Thanks for reading!

-Christian




Network Programmability & Automation, 2nd Edition, Is Out There!


The second edition of Network Programmability and Automation is already out there!

As Jason and I announced more than one year ago in this blog, I had the honor to join the original authors (Scott Lowe and Matt Oswalt) to work on this new edition.

The goal of the book remains the same—to help network engineers who want to explore network automation and transform themselves with the skills that modern network engineering demands. Because of the broad concepts and technologies involved, this is not a simple goal. We did our best to revise the first edition, extending existing topics (for instance, covering classes, exceptions, and multithreading in the Python chapter) and adding new ones, such as:

  • Cloud: Cloud Networking, Containers, Kubernetes
  • Network Development Environments: Text editors, development tools, and emulation tools (e.g., VirtualBox, Vagrant, Containerlab)
  • Go programming language
  • RESTCONF and gRPC/gNMI: new API interfaces with examples in Python and Go
  • Nornir: a Python framework to orchestrate network operations, with examples using the NAPALM plugin
  • Terraform: provisioning cloud networking resources as code
  • Network Automation Architecture: a structured approach to building network automation solutions integrating complementary solutions

We also wanted to facilitate the reproducibility of the numerous code examples, so we have published a GitHub repository with the examples referenced in the book. And, due to book length constraints, we also had to relocate some content from the first edition into an extras website.

Personally, it has been an amazing opportunity to improve how I communicate technical concepts and to help all the network engineers who, like me, are looking forward to learning and getting better. We hope this book helps you get started on your network automation journey. Enjoy it!

-Christian

PS: You can find it at Amazon.com in paperback and Kindle format.


