Introduction to Event-Driven Ansible and Nautobot


At Network to Code, we are continually working on new solutions to extend automation capabilities for our customers. One project that I recently worked on used Event-Driven Ansible, or EDA, to simplify the process of automating other systems based on changes in Nautobot. This blog post will cover the basics of EDA, and how we used it to update ServiceNow CMDB records based on changes in Nautobot.

What Was the Problem We Were Trying to Solve?

The customer is using ServiceNow as their CMDB and Nautobot as their source of truth for network infrastructure. They wanted to be able to update ServiceNow records when changes were made in Nautobot. For example, when a device is added to Nautobot, they wanted to create a corresponding record in ServiceNow. There are other systems that we are integrating with Nautobot using EDA, but for this blog post we will focus on ServiceNow. Any system with an API or Ansible Galaxy role/collection can be integrated with Nautobot using EDA.

What Is Event-Driven Ansible?

Event-Driven Ansible was developed by Red Hat to listen for events from various sources and take action on them. Rulebooks define three components: sources, rules, and actions.

  • Sources — where the events are coming from. This can be a webhook, Kafka, Azure Service Bus, or other sources.
  • Rules — define the conditions that must be met for an action to be taken.
  • Actions — an action is commonly running a local playbook, but could also be generating an event, running a job template in AAP, or other actions.

How Did We Use EDA to Update ServiceNow Based on an Event from Nautobot?

We developed a small custom plugin for Nautobot that uses Nautobot Job Hooks to publish events to an Azure Service Bus queue. An added benefit of using ASB as our event bus was that Event-Driven Ansible already ships a source listener plugin for ASB, so no additional work was needed (see event source plugins). This lets us initiate the connection from Nautobot and send events to the queue whenever changes are made in Nautobot.

The flow of events is as follows:

  1. A device create (or update, or delete) in Nautobot triggers a Job Hook.
  2. A Nautobot App receives the Job Hook event from Nautobot and publishes the payload to the defined Azure Service Bus queue.
  3. The Ansible EDA source plugin connects and subscribes to the Azure Service Bus queue and listens for events.
  4. EDA runs an Ansible playbook to update ServiceNow.
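
The payload published in step 2 is simply the Job Hook's serialized change record. A minimal Python sketch of building that message body (the field names here are illustrative, not the exact schema our App emits) shows the structure that the rulebook condition `event.body.data.action == 'create'` later inspects:

```python
import json

def build_event_body(action, object_type, object_data):
    """Build the JSON body published to the Azure Service Bus queue.

    The rulebook condition `event.body.data.action == 'create'` matches
    against the `data.action` field of this structure.
    """
    return json.dumps(
        {
            "data": {
                "action": action,          # "create", "update", or "delete"
                "object_type": object_type,
                "object": object_data,
            }
        }
    )

body = build_event_body("create", "dcim.device", {"name": "nyc-rtr-01"})

# EDA deserializes the message body before evaluating conditions,
# so this is the same check a rulebook condition performs:
event = {"body": json.loads(body)}
assert event["body"]["data"]["action"] == "create"
```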

What Do the Rulebooks and Playbooks Look Like?

Below is an example of a basic rulebook we are using. This rulebook will run the playbook add_device_to_servicenow.yml when a device is created in Nautobot.

Rulebook

---
- name: "LISTEN TO ASB QUEUE"
  hosts: localhost
  sources:
    - ansible.eda.azure_service_bus:
        connection_string: ""
        queue_name: ""

  rules:
    - name: "ADD DEVICE TO SERVICENOW"
      condition: "event.body.data.action == 'create'"
      action:
        run_playbook:
          name: "add_device_to_servicenow.yml"
          verbosity: 1

You can add different sources, conditions, and rules as needed. Any information that you can extract from the event can be used in the condition.

Playbook

---
- name: "ADD DEVICE TO SERVICENOW"
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: "ADD DEVICE TO SERVICENOW"
      servicenow.servicenow.snow_record:
        state: present
        table: "cmdb_ci_netgear"
        instance: ""
        username: ""
        password: ""
        data:
          name: ""
          description: ""
          serial_number: ""
          model_id: ""
          manufacturer_id: ""

Playbooks are structured as normal, with the addition of the event variable, which contains the event data sent from Nautobot. In this example, we use event.body.data to extract the device name, description, serial number, model, and manufacturer (the empty strings above are placeholders for those values and for your ServiceNow connection details).

In the above example, we used the ServiceNow Ansible Collection to update ServiceNow. You can use any Ansible module, role, or collection to update the system you are integrating with Nautobot. One of the systems I was updating did not have an Ansible module, so I used the uri module to make API calls to the system.

Conclusion

Event-Driven Ansible is a powerful tool that can be used to integrate Nautobot with other systems. It can solve the very real problem of keeping multiple systems in sync and can be used to automate many different tasks. Feel free to join us at the Network to Code Slack channel to discuss this and other automation topics.

-Susan




Circuit Maintenance Parser Powered by AI/ML


More than two years ago, NTC released the circuit-maintenance-parser Python library to facilitate the arduous job of understanding what network service providers say when sending circuit maintenance notifications without any normalized format. We explained the why and how in these two blogs: 1 and 2. This has proven useful, but recently we challenged ourselves: how could technology like Artificial Intelligence and Machine Learning (AI/ML) make it even better?

Recap of Two Years, and What’s Next?

The circuit-maintenance-parser library provides several parsers for transforming circuit maintenance notifications from many network providers into normalized ones, making it very easy to digest them programmatically.

Over those two years, we have added new parsers and shipped updates and fixes for existing ones, covering providers such as NTT, AWS, Equinix, Cogent, COLT, and EXA (you can check the complete list of currently supported providers in the repository README). We have also heard from many users of the library worldwide!

An example of an application leveraging the library is the Nautobot Circuit Maintenance App that fetches emails from network providers, parses them, and updates the related circuits in Nautobot.

The parsers can work on many different data types (e.g., ICal, plain text, HTML, CSV, etc.). There is a generic implementation that works on a proposed reference format in this BCOP.

To better understand the new changes introduced, it's convenient to first explain the four basic entities of the library:

  • Provider: represents a network service provider that can leverage several Processors, in a specific order (if one fails, it tries the next).
  • Processor: combines the structured data parsed by one or several Parsers to create one or several Maintenances.
  • Parser: extracts structured data from a raw notification.
  • Maintenance: it’s the outcome of the parsing process, and it adheres to the reference format mentioned above.
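
The fallback behavior described for Provider (if one Processor fails, it tries the next) can be sketched in a few lines of Python. This is a toy illustration of the control flow, not the library's actual classes or API:

```python
class ParsingError(Exception):
    """Raised by a processor that cannot handle the notification."""

def run_processors(processors, raw_notification):
    """Toy sketch of a Provider's fallback: try each processor in
    order and return the first successful result (a Maintenance-like
    dict here); raise only if every processor fails."""
    errors = []
    for processor in processors:
        try:
            return processor(raw_notification)
        except ParsingError as err:
            errors.append(err)
    raise ParsingError(f"All processors failed: {errors}")

def strict_ical(raw):
    # Simulates a processor whose parsers do not match the data type.
    raise ParsingError("not ICal data")

def text_parser(raw):
    # Simulates a processor that successfully extracts structured data.
    return {"provider": "aws", "status": "CONFIRMED"}

maintenance = run_processors([strict_ical, text_parser], "Planned maintenance ...")
assert maintenance["status"] == "CONFIRMED"
```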

So far, so good. The library has been able to evolve and adapt to new requirements. However, every update requires a human modifying or creating a new parser (i.e., developing the logic, creating a PR, and accepting and releasing the parser).

Nowadays, with the explosion of Large Language Models (LLMs), a subset of Machine Learning technologies, text processing is being transformed by new opportunities, and we believe the circuit-maintenance-parser is a great use case to explore them. So, let's see how we approached it.

Understanding How LLM Parsers Work

In short, a circuit maintenance notification is a text that contains key information that needs to be extracted and normalized according to the library requirements. This is what we tried to solve following the next guidelines:

  • A new Parser, called LLM, implements the logic required to craft the question that should elicit the parsed response. It has to be implemented for a specific platform (e.g., OpenAI) to interact with it using predefined hooks (i.e., to craft the API calls that each platform provides).
  • Every Provider can include, as a last resort, a Processor containing an LLM parser implementation, used only when certain conditions are met. The LLM parser is never the first parsing option: human-defined parsers run first, and only if all of them fail are the LLM parsers taken into account.
  • The Maintenance object comes with a new Metadata attribute which provides information about the Provider, Processor, and Parsers used in the information extraction. This matters because the level of confidence is not the same for all parsers, and library users should take it into account when consuming the data.

Hopefully this makes sense to you; so now it’s time to see it in action.

Let’s Use It

First, we need to install the library with the openai extra (the only LLM platform implemented for now).

pip install circuit-maintenance-parser[openai]

Then, using the built-in CLI tool (i.e., circuit-maintenance-parser), we can see how it works, leveraging example data from the tests.

You could reproduce the same interacting directly with the library, but the CLI offers a simpler interface for demonstrating it.

Before getting into the magic of LLM, let’s see how the library works without LLM-powered parsers (the default option).

$ circuit-maintenance-parser --data-file tests/unit/data/aws/aws1.eml --data-type email  --provider-type aws -v
Circuit Maintenance Notification #0
{
  "account": "0000000000001",
  "circuits": [
    {
      "circuit_id": "aaaaa-00000001",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000002",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000003",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000004",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000005",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000006",
      "impact": "OUTAGE"
    }
  ],
  "end": 1621519200,
  "maintenance_id": "15faf02fcf2e999792668df97828bc76",
  "organizer": "aws-account-notifications@amazon.com",
  "provider": "aws",
  "sequence": 1,
  "stamp": 1620337976,
  "start": 1621497600,
  "status": "CONFIRMED",
  "summary": "Planned maintenance has been scheduled on an AWS Direct Connect router in A Block, New York, NY from Thu, 20 May 2021 08:00:00 GMT to Thu, 20 May 2021 14:00:00 GMT for 6 hours. During this maintenance window, your AWS Direct Connect services listed below may become unavailable.",
  "uid": "0"
}
Metadata #0
provider='aws' processor='CombinedProcessor' parsers=['EmailDateParser', 'TextParserAWS1', 'SubjectParserAWS1'] generated_by_llm=False

At this point, you can see that the parsing ran successfully, producing one Maintenance, with the new Metadata describing how it was parsed.

You can see that it leveraged the provider-type to tell the library which provider had to be used (aws). However, without this information, the library can’t parse it properly, because it defaults to the GenericProvider which only understands the ICal data type using the BCOP recommended format. Let’s try it:

$ circuit-maintenance-parser --data-file tests/unit/data/aws/aws1.eml --data-type email -v
Provider processing failed: Failed creating Maintenance notification for GenericProvider.
Details:
- Processor SimpleProcessor from GenericProvider failed due to: None of the supported parsers for processor SimpleProcessor (ICal) was matching any of the provided data types (email-header-date, email-header-subject, text/plain).

Now, let's see how the new OpenAI parser (implementing the LLM) can help us. The only mandatory step to activate it is setting the PARSER_OPENAI_API_KEY environment variable:

export PARSER_OPENAI_API_KEY="use your token here"

By default, it uses the GPT-3.5 model, but you can change it with the PARSER_OPENAI_MODEL environment variable. To see all the available options (including options to customize the LLM question), check the docs.

At this point, every Provider will have the OpenAI parser as the last resort.

Let's repeat the previous example without providing the provider-type (your output may differ; LLM output is not deterministic), and notice the Metadata associated with this output, which lists the parsers used. You will also notice that this takes slightly longer than before because the OpenAI API is being called.

$ circuit-maintenance-parser --data-file tests/unit/data/aws/aws1.eml --data-type email -v
Circuit Maintenance Notification #0
{
  "account": "Amazon Web Services",
  "circuits": [
    {
      "circuit_id": "aaaaa-00000001",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000002",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000003",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000004",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000005",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000006",
      "impact": "OUTAGE"
    }
  ],
  "end": 1621519200,
  "maintenance_id": "aaaaa-00000001",
  "organizer": "unknown",
  "provider": "genericprovider",
  "sequence": 1,
  "stamp": 1620337976,
  "start": 1621497600,
  "status": "CONFIRMED",
  "summary": "Planned maintenance has been scheduled on an AWS Direct Connect router in A Block, New York, NY for 6 hours.",
  "uid": "0"
}
Metadata #0
provider='genericprovider' processor='CombinedProcessor' parsers=['EmailDateParser', 'OpenAIParser'] generated_by_llm=True

The output should provide a "similar" successful parsing to the one above. However, a closer look reveals some differences; some may be acceptable, and others not. With the metadata (including the generated_by_llm boolean), the library user can decide how this information should be handled, perhaps adding extra validation before accepting it.
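
For example, a consumer of the library might only auto-apply maintenances parsed by human-written parsers and flag LLM-generated ones for review. A sketch (the Metadata fields mirror the CLI output above, but the validation policy is entirely hypothetical and up to you):

```python
from dataclasses import dataclass, field

@dataclass
class Metadata:
    # Mirrors the fields shown in the CLI Metadata output above.
    provider: str
    processor: str
    parsers: list = field(default_factory=list)
    generated_by_llm: bool = False

def should_auto_apply(metadata):
    """Illustrative policy: trust human-written parsers outright,
    but hold LLM-generated results for manual review."""
    return not metadata.generated_by_llm

human = Metadata("aws", "CombinedProcessor", ["TextParserAWS1"], generated_by_llm=False)
llm = Metadata("genericprovider", "CombinedProcessor", ["OpenAIParser"], generated_by_llm=True)

assert should_auto_apply(human)
assert not should_auto_apply(llm)
```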

If you use one of the available tools to diff the two JSON objects (such as https://www.jsondiff.com/), you can see the differences (yours may vary depending on your results). Keep in mind that you may need to discard or adjust some of this information.

{
  "account": "Amazon Web Services",
  "maintenance_id": "aaaaa-00000001",
  "organizer": "unknown",
  "provider": "genericprovider",
  "summary": "Planned maintenance has been scheduled on an AWS Direct Connect router in A Block, New York, NY for 6 hours."
}

And, if you are wondering what would happen if you properly set the provider type, the result will be exactly the same as before because the aws provider knows how to parse it properly, and the LLM parser is not actually hit.


Conclusion

At NTC, we are constantly considering how to leverage AI/ML technologies to support network automation use cases for all the different components of our recommended architecture (more info in this blog series), and this new feature is an example of how our open source projects can be powered by them.

We would like to encourage you to give it a try, and provide constructive feedback in the form of Issues or Feature Requests in the library repository.

Thanks for reading!

-Christian




Writing Your First Nautobot Job, Pt.2


Welcome to Part 2 of our blog series "Writing Your First Nautobot Job." The goal of this series is to provide Nautobot users with everything they need to start writing Nautobot Jobs from scratch. As before, we assume you have a basic understanding of the Python programming language and that you have Nautobot up and running in your environment.

The first entry in this series (Part 1) reviewed fundamental topics, such as the Django ORM and Django data models. Now that we have a good understanding of how we can access and manipulate the data in Nautobot, we can start exploring the mechanisms within Nautobot that allow us to perform those manipulations in a structured and repeatable way.

Introduction

At the time of writing this blog post, Nautobot has just released a new major version 2.0, which comes with some significant changes to Nautobot Jobs. A full explanation of the changes can be found in our Job Migration Guide. As for this blog post, we will be focusing on Jobs compatible with version 2.0.

Historically, when Network Engineers start down their automation journey, they usually begin with one of two technologies: Python scripts or Ansible Playbooks. These are great for tasks you need to execute frequently. However, when it comes time to expose these tools to other people with limited knowledge or experience, those users can easily be overwhelmed by the unfamiliar environments in which these technologies run. Nautobot provides a framework to safely give users access to our automation scripts inside a UI that is intuitive and familiar. This framework is implemented as the Job class inside of Nautobot.

The Job Class

The Job class is defined in the Nautobot source code here. You will see the Job class is empty but inherits all of its functionality from the parent class BaseJob. This is a common development practice that adds a layer of abstraction and minimizes the impact of changes to the BaseJob class on the Jobs you develop.

When we define our Jobs we will always inherit from the Job class and not the BaseJob class. Here is an example of a Job definition.

# jobs.py
from nautobot.apps.jobs import Job

class ExampleJob(Job):
    """This is our example Job definition.""
    ...

In the example above, we first imported the Job class from the Nautobot source code. Then we defined a class called ExampleJob that inherits from Job.

Settings

Nautobot Jobs have several settings that modify their behavior. These settings can be defined in the Job's Meta class or overridden in the UI.

To override the defaults, we can set each attribute with an appropriate value under a Meta class definition for our Job.

# jobs.py
from nautobot.apps.jobs import Job

class ExampleJob(Job):
    """This is our example Job definition."""

    class Meta:
        name = "Example Job"
        description = "This is the description of my ExampleJob."

The full list of available settings, their default values, and a general description follows.

  • name (default: the name of your Job class) – The name of the job as it appears in the UI.
  • description (no default) – A general description of what functions the job performs. This can accept either plain text or Markdown-formatted text.
  • approval_required (default: False) – This boolean dictates whether an approval is required before the job can be executed.
  • dryrun_default (default: False) – This boolean represents the default state of the Dryrun checkbox when a job is run.
  • has_sensitive_variables (default: True) – This boolean has several implications. First, it prevents input parameters from being saved to the database, which protects against inadvertent exposure of sensitive information such as credentials. It also enables/disables the ability to rerun jobs (i.e., refill the job input parameters with a click of a button), and it prevents the job from being scheduled or marked as requiring approval.
  • hidden (default: False) – This boolean prevents the job from being displayed by default in the UI; users must apply specific filters to the Job list view to see it.
  • read_only (default: False) – This boolean is just a flag to indicate that the job does not make any changes to the environment. It is up to the author of the job to ensure the actual behavior of the job is "read only".
  • soft_time_limit (default: 300) – An integer or float value, in seconds, at which the celery.exceptions.SoftTimeLimitExceeded exception will be raised. Jobs can be written to catch this and clean up before the hard time_limit cutoff.
  • time_limit (default: 600) – An integer or float value, in seconds, at which the task is silently terminated.
  • task_queues (default: []) – A list of task queue names that the job is allowed to be routed to. By default, only the default queue can be used. The queue listed first will be used for a job run via an API call.
  • template_name (no default) – A path, relative to the job source code, to a Django template that provides additional code to customize the Job's submission form.
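
Putting a few of these settings together in one Meta class looks like this. It's a sketch: the Job base class is stubbed so the snippet runs standalone, but in a real jobs.py you would import Job from nautobot.apps.jobs, and the job name and queue names are invented for illustration:

```python
class Job:
    """Stand-in for nautobot.apps.jobs.Job so this sketch runs standalone."""

class ConfigBackupJob(Job):
    """Hypothetical job illustrating several Meta settings at once."""

    class Meta:
        name = "Config Backup"
        description = "Backs up device configurations (illustrative)."
        has_sensitive_variables = False   # allow rerun/scheduling; inputs are not sensitive
        soft_time_limit = 240             # SoftTimeLimitExceeded raised at 4 minutes
        time_limit = 300                  # task silently terminated at 5 minutes
        task_queues = ["default", "backups"]

# The soft limit should leave headroom before the hard cutoff,
# and the first listed queue is the one used for API-triggered runs.
assert ConfigBackupJob.Meta.soft_time_limit < ConfigBackupJob.Meta.time_limit
assert ConfigBackupJob.Meta.task_queues[0] == "default"
```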

User Inputs

The next piece of Nautobot Job functionality to consider is input variables. We will often want users to provide input data to set the scope of the Job. User inputs are optional, and it can sometimes be better to have none when the Job performs a specific, fixed task. When we run a job, the first thing displayed is a user input form. The types of input options shown in this form are controlled by the variable types we use when defining the attributes of our Job class.

All job variables support the following default options:

  • default – The field’s default value
  • description – A brief user-friendly description of the field
  • label – The field name to be displayed in the rendered form
  • required – Indicates whether the field is mandatory (all fields are required by default)
  • widget – The class of form widget to use (see the Django documentation)

The full list of input variable types can be found here. However, I will explain some of the nuances of a few of the variable types below.

ChoiceVar and MultiChoiceVar

This input variable type allows you to define a set of choices from which the user can select one; it is rendered as a dropdown menu.

  • choices – A list of (value, label) tuples representing the available choices. For example:
CHOICES = (
    ('n', 'North'),
    ('s', 'South'),
    ('e', 'East'),
    ('w', 'West')
)

direction = ChoiceVar(choices=CHOICES)

In the example above, we first define a tuple of (value, label) pairs to represent our choices, then pass it as the choices parameter of the ChoiceVar. The user will see a dropdown menu with the choices North, South, East, and West. If a user selects North in the dropdown menu, then the direction variable will equal 'n'.

Similar to ChoiceVar, the MultiChoiceVar allows for the selection of multiple choices and results in a list of values.
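
Because choices is just a tuple of (value, label) pairs, you can reuse it in plain Python, independent of Nautobot, to translate submitted values back to their labels:

```python
CHOICES = (
    ('n', 'North'),
    ('s', 'South'),
    ('e', 'East'),
    ('w', 'West'),
)

# Build a value -> label lookup from the same tuple passed to the variable.
labels = dict(CHOICES)

# A ChoiceVar submission yields a single value...
direction = 'n'
assert labels[direction] == 'North'

# ...while a MultiChoiceVar submission yields a list of values.
directions = ['n', 'e']
assert [labels[d] for d in directions] == ['North', 'East']
```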

ObjectVar and MultiObjectVar

When your user needs to select a particular object within Nautobot, you will need to use an ObjectVar. Each ObjectVar specifies a particular model and allows the user to select one of the available instances. ObjectVar accepts several arguments, listed below.

  • model – The model class (Device, IPAddress, VLAN, Status, etc.)
  • display_field – The name of the REST API object field to display in the selection list (default: ‘display’)
  • query_params – A dictionary of REST API query parameters to use when retrieving available options (optional)
  • null_option – A label representing a “null” or empty choice (optional)

As you can probably tell, this input type performs an API call to Nautobot itself and constructs the list of options out of the values returned. The display_field argument is useful in cases where using the display API field is not desired for referencing the object. For example, when displaying a list of IP Addresses, you might want to use the dns_name field:

address = ObjectVar(
    model=IPAddress,
    display_field="dns_name",
)

To limit the selections available within the list, additional query parameters can be passed as the query_params dictionary. For example, to show only devices with an “active” status:

device = ObjectVar(
    model=Device,
    query_params={
        'status': 'Active'
    }
)

Multiple values can be specified by assigning a list to the dictionary key. It is also possible to reference the value of other fields in the form by prepending a dollar sign ($) to the variable’s name. The keys you can use in this dictionary are the same ones that are available in the REST API — as an example, it is also possible to filter the Location ObjectVar for its location_type.

location_type = ObjectVar(
    model=LocationType
)
location = ObjectVar(
    model=Location,
    query_params={
        "location_type": "$location_type"
    }
)

Similar to ObjectVar, the MultiObjectVar allows for the selection of multiple objects.

FileVar

An uploaded file. Note that uploaded files are present in memory only for the duration of the job’s execution and they will not be automatically saved for future use. The job is responsible for writing file contents to disk where necessary. This input option is good for when you need to perform bulk operations inside of your job.
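
Since the uploaded file exists only in memory, a job typically consumes it immediately inside run(). A sketch of reading an uploaded CSV for a bulk operation (here the upload is simulated with io.BytesIO; in a real job, the FileVar hands you a file-like object, and the column names are invented for illustration):

```python
import csv
import io

def parse_device_csv(uploaded):
    """Read device rows from an in-memory uploaded file.

    `uploaded` is any binary file-like object; a FileVar supplies an
    in-memory upload, simulated here with io.BytesIO."""
    text = io.TextIOWrapper(uploaded, encoding="utf-8")
    return list(csv.DictReader(text))

fake_upload = io.BytesIO(b"name,site\nnyc-rtr-01,NYC\nlax-rtr-01,LAX\n")
rows = parse_device_csv(fake_upload)
assert rows[0]["name"] == "nyc-rtr-01"
assert len(rows) == 2
```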

IPAddressVar

An IPv4 or IPv6 address, without a mask. Returns a netaddr.IPAddress object. This is rendered as a one-line text box.

10.11.12.13

IPAddressWithMaskVar

An IPv4 or IPv6 address with a mask. Returns a netaddr.IPNetwork object which includes the mask. This is rendered as a one-line text box.

10.11.12.13/24

IPNetworkVar

An IPv4 or IPv6 network with a mask. Returns a netaddr.IPNetwork object. This is rendered as a one-line text box.

Two attributes are available to validate the provided mask:

  • min_prefix_length – Minimum length of the mask
  • max_prefix_length – Maximum length of the mask
10.11.12.0/24
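
The prefix-length check these two attributes perform can be illustrated with the standard library. (The real variable returns netaddr objects; this sketch uses the stdlib ipaddress module and a hypothetical helper purely to show the validation logic.)

```python
import ipaddress

def validate_prefix(value, min_prefix_length=None, max_prefix_length=None):
    """Illustrative version of IPNetworkVar's mask validation."""
    network = ipaddress.ip_network(value, strict=False)
    if min_prefix_length is not None and network.prefixlen < min_prefix_length:
        raise ValueError(f"Mask /{network.prefixlen} is shorter than /{min_prefix_length}")
    if max_prefix_length is not None and network.prefixlen > max_prefix_length:
        raise ValueError(f"Mask /{network.prefixlen} is longer than /{max_prefix_length}")
    return network

# A /24 passes when the allowed range is /16 through /28.
net = validate_prefix("10.11.12.0/24", min_prefix_length=16, max_prefix_length=28)
assert net.prefixlen == 24
```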

Run()

Now that we have our user inputs defined, we can start defining what actions our job will perform. In our Job class, we define these actions in the run() method. This method takes the self argument and every variable defined on the job as keyword arguments.

# jobs.py
from nautobot.apps.jobs import Job, ObjectVar
from nautobot.dcim.models import Device

class ExampleJob(Job):
    """This is our example Job definition."""

    device = ObjectVar(
        model=Device,
        query_params={
            'status': 'Active'
        }
    )

    class Meta:
        name = "Example Job"
        description = "This is the description of my ExampleJob."
    
    def run(self, device):
        """Do all the things here."""
        ...

In the example above, we can now reference the variable device in our run method and it will contain the object selected by the user when the job is run.

Job Outputs

The main output of Nautobot Jobs, other than the actions the job performs, is the JobResult. The job results page appears once the job starts running. From this page we can watch as the job executes. The status will be set to “running” and if you have log statements in your job they will be displayed as the job runs. Once the job is finished executing, the status will be updated to either “completed” or “failed”. Additional data, such as who ran the job, what the input variables were set to, and how long the job ran for, are also displayed.

Logging

We can log information from inside our jobs using the logger property of the Job class. This returns a logger object from the standard Python logging module (documentation). You can log messages at the different logging levels with the logger's level methods. So, to log a warning message to the users, we would simply add the statement self.logger.warning("My warning message here.") to our job.

An optional grouping and/or object may be provided in log messages by passing them in the log function call’s extra kwarg. If a grouping is not provided, it will default to the function name that logged the message. The object will default to None.

# jobs.py
from nautobot.apps.jobs import Job, ObjectVar
from nautobot.dcim.models import Device

class ExampleJob(Job):
    """This is our example Job definition."""

    device = ObjectVar(
        model=Device,
        query_params={
            'status': 'Active'
        }
    )

    class Meta:
        name = "Example Job"
        description = "This is the description of my ExampleJob."
    
    def run(self, device):
        """Do all the things here."""
        logger.warning("This object is not made by Cisco.", extra={"grouping": "validation", "object": device})
        ...

Status Control

As long as a job is completed without raising any exceptions, the job will be marked as “completed” when it finishes running. However, sometimes we want to mark the job as “failed” if certain conditions aren’t met or the result isn’t what we expected. To do this, we have to raise an Exception within our run() method.

# jobs.py
from nautobot.apps.jobs import Job, ObjectVar
from nautobot.dcim.models import Device

class ExampleJob(Job):
    """This is our example Job definition."""

    device = ObjectVar(
        model=Device,
        query_params={
            'status': 'Active'
        }
    )

    class Meta:
        name = "Example Job"
        description = "This is the description of my ExampleJob."
    
    def run(self, device):
        """Do all the things here."""
        if not device.device_type.manufacturer.name == "Cisco":
            self.logger.warning("This object is not made by Cisco.", extra={"grouping": "validation", "object": device})
            raise Exception("A non-Cisco device was selected.")

Advanced Features

We have now covered everything you need to write your first Nautobot Job. Most jobs perform data-manipulation tasks or pull/push data between Nautobot and an external system, but Nautobot Jobs also enable some advanced actions.

  • Job buttons allow us to add a button at the top of a specified object’s ObjectDetailView, which will pass that object as the input parameter for a job and execute it. Read more about Job Buttons here.
  • Job Hooks are jobs that run whenever objects of a specified model are created, updated, or deleted. We define specific tasks to be executed depending on what action was performed against the object. The documentation for Job Hooks can be found here.
  • A job can call another job using the enqueue_job method. We have plans to make this easier in the future.
  • Permissions to run or enable a job can be set per job/user via constraints.
  • Jobs can be approved via API, so an external ticketing system could be configured to read and update approval requests. You can see an example of such an API call here.

Conclusion

In Part 3, we will explore the different ways to load our jobs into Nautobot and the file structure each way requires. Then we will write our first Nautobot Job together.

-Allen


