Circuit Maintenance Parser Powered by AI/ML

More than two years ago, NTC released the circuit-maintenance-parser Python library to ease the arduous job of understanding what network service providers say when they send circuit maintenance notifications without any normalized format. We explained the why and how in two earlier blog posts. The library has proven useful, but recently we challenged ourselves: how could technology like Artificial Intelligence and Machine Learning (AI/ML) make it even better?

Recap of Two Years, and What’s Next?

The circuit-maintenance-parser library provides several parsers for transforming circuit maintenance notifications from many network providers into normalized ones, making it very easy to digest them programmatically.

In two years, we have seen new parsers added, along with updates and fixes to the existing ones. You can check the complete list of currently supported providers in the repository README; to name a few: NTT, AWS, Equinix, Cogent, COLT, and EXA. We have also heard from many users of the library worldwide!

An example of an application leveraging the library is the Nautobot Circuit Maintenance App that fetches emails from network providers, parses them, and updates the related circuits in Nautobot.

The parsers can work on many different data types (e.g., ICal, plain text, HTML, CSV, etc.). There is also a generic implementation that works on the reference format proposed in this BCOP.

To better understand the new changes, it helps to first explain the four basic entities of the library:

  • Provider: represents a network service provider that can leverage several Processors, in a specific order (if one fails, it tries the next).
  • Processor: combines the structured data parsed by one or several Parsers to create one or several Maintenances.
  • Parser: extracts structured data from a raw notification.
  • Maintenance: it’s the outcome of the parsing process, and it adheres to the reference format mentioned above.

So far, so good. The library has been able to evolve and adapt to new requirements. However, every update requires a human to modify an existing parser or create a new one (i.e., developing the logic, creating a PR, and accepting and releasing the parser).

Nowadays, with the explosion of Large Language Models (LLMs) as a subset of Machine Learning technologies, text processing is being transformed by new opportunities, and we believe the circuit-maintenance-parser is a great use case to explore them. So, let’s see how we approached it.

Understanding How LLM Parsers Work

In short, a circuit maintenance notification is a text that contains key information that needs to be extracted and normalized according to the library requirements. We approached this problem following these guidelines:

  • A new Parser, called LLM, has been created to implement the logic required to ask the question that should produce the parsed response. The LLM parser needs to be implemented for a specific platform (e.g., OpenAI) so it can interact with that platform through predefined hooks (i.e., to craft the API calls that every platform provides).
  • Every Provider can include, as a last resort, a Processor that contains an LLM parser implementation when some conditions are met. Thus, the LLM parser is never the first parsing option. Human-defined parsers are used first, and the LLM parsers are tried only if all of them fail.
  • The Maintenance object comes with a new Metadata attribute, which provides information about the Provider, Processor, and Parsers used in the information extraction. Library users should take this into account when using the data, because the level of confidence is not the same for all the parsers.
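The ordering described in these guidelines can be sketched in a few lines of Python. This is a minimal illustration and not the library's implementation: the `Processor` dataclass, `run_provider`, `human_parser`, and `fake_llm_parser` are hypothetical names, and the LLM call is replaced by a local placeholder. Only the `Metadata` field names are taken from the CLI output shown later in this post.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Metadata:
    """How a result was produced (field names follow the CLI output)."""
    provider: str
    processor: str
    parsers: List[str]
    generated_by_llm: bool = False


@dataclass
class Processor:
    """Hypothetical stand-in: a parse function plus a flag for LLM usage."""
    name: str
    parsers: List[str]
    parse: Callable[[str], dict]
    uses_llm: bool = False


def run_provider(provider_name: str, processors: List[Processor], raw: str):
    """Try each processor in order and return the first successful result."""
    errors = []
    for processor in processors:
        try:
            data = processor.parse(raw)
        except ValueError as exc:
            errors.append(f"{processor.name}: {exc}")
            continue
        metadata = Metadata(
            provider_name, processor.name, processor.parsers, processor.uses_llm
        )
        return data, metadata
    raise RuntimeError("All processors failed: " + "; ".join(errors))


def human_parser(raw: str) -> dict:
    """A human-written parser that only understands one made-up format."""
    if not raw.startswith("MAINTENANCE:"):
        raise ValueError("unrecognized format")
    return {"summary": raw.split(":", 1)[1].strip()}


def fake_llm_parser(raw: str) -> dict:
    """Placeholder for a call to an LLM platform; no network access here."""
    return {"summary": raw.strip()}


# The LLM-backed processor is always listed last.
PROCESSORS = [
    Processor("HumanProcessor", ["HumanParser"], human_parser),
    Processor("LLMProcessor", ["FakeLLMParser"], fake_llm_parser, uses_llm=True),
]

data, meta = run_provider("example", PROCESSORS, "Router upgrade tonight")
print(meta.processor, meta.generated_by_llm)  # LLMProcessor True
```

Because the metadata is built from whichever processor succeeded, the caller can always tell whether the result came from a human-defined parser or from the last-resort LLM path.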

Hopefully this makes sense to you; so now it’s time to see it in action.

Let’s Use It

First, we need to install the library with the openai extra (it’s the only implemented LLM platform for now).

pip install "circuit-maintenance-parser[openai]"

Then, using the built-in CLI tool (i.e., circuit-maintenance-parser), we can see how it works, leveraging example data from the tests.

You could reproduce the same results by interacting directly with the library, but the CLI offers a simpler interface for demonstrating it.

Before getting into the magic of LLM, let’s see how the library works without LLM-powered parsers (the default option).

$ circuit-maintenance-parser --data-file tests/unit/data/aws/aws1.eml --data-type email --provider-type aws -v
Circuit Maintenance Notification #0
{
  "account": "0000000000001",
  "circuits": [
    {
      "circuit_id": "aaaaa-00000001",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000002",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000003",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000004",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000005",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000006",
      "impact": "OUTAGE"
    }
  ],
  "end": 1621519200,
  "maintenance_id": "15faf02fcf2e999792668df97828bc76",
  "organizer": "aws-account-notifications@amazon.com",
  "provider": "aws",
  "sequence": 1,
  "stamp": 1620337976,
  "start": 1621497600,
  "status": "CONFIRMED",
  "summary": "Planned maintenance has been scheduled on an AWS Direct Connect router in A Block, New York, NY from Thu, 20 May 2021 08:00:00 GMT to Thu, 20 May 2021 14:00:00 GMT for 6 hours. During this maintenance window, your AWS Direct Connect services listed below may become unavailable.",
  "uid": "0"
}
Metadata #0
provider='aws' processor='CombinedProcessor' parsers=['EmailDateParser', 'TextParserAWS1', 'SubjectParserAWS1'] generated_by_llm=False

At this point, you can see that the parsing ran successfully and produced one Maintenance, with the new Metadata describing how it was parsed.
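The start, end, and stamp fields in this output are Unix epoch timestamps in seconds. As a quick sanity check, the following sketch converts the start and end values to UTC datetimes using only the standard library; the helper name `to_utc` is ours, not part of the library.

```python
from datetime import datetime, timezone


def to_utc(epoch_seconds: int) -> datetime:
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


start = to_utc(1621497600)  # "start" from the output above
end = to_utc(1621519200)    # "end" from the output above

print(start.isoformat())  # 2021-05-20T08:00:00+00:00
print(end.isoformat())    # 2021-05-20T14:00:00+00:00
print(end - start)        # 6:00:00
```

The values match the window stated in the summary field: Thu, 20 May 2021 from 08:00 to 14:00 GMT, six hours in total.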

Notice that we used the --provider-type option to tell the library which provider to use (aws). Without this information, the library can’t parse the notification properly, because it defaults to the GenericProvider, which only understands the ICal data type using the BCOP recommended format. Let’s try it:

$ circuit-maintenance-parser --data-file tests/unit/data/aws/aws1.eml --data-type email -v
Provider processing failed: Failed creating Maintenance notification for GenericProvider.
Details:
- Processor SimpleProcessor from GenericProvider failed due to: None of the supported parsers for processor SimpleProcessor (ICal) was matching any of the provided data types (email-header-date, email-header-subject, text/plain).
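The error message lists the data types found in the email: email-header-date, email-header-subject, and text/plain. To illustrate where such typed parts could come from, here is a rough sketch that splits a raw email with Python's standard `email` module. It is not the library's code: the function `split_into_parts` and the sample message are made up for this example, and only the data type labels are copied from the error message above.

```python
from email import message_from_string

# A made-up sample message, not the content of aws1.eml.
RAW_EMAIL = """\
Date: Thu, 6 May 2021 21:52:56 +0000
Subject: Example maintenance notification
From: provider@example.com
Content-Type: text/plain; charset="utf-8"

Planned maintenance has been scheduled.
"""


def split_into_parts(raw: str):
    """Return (data_type, content) pairs for the headers and body parts."""
    message = message_from_string(raw)
    parts = [
        ("email-header-date", message["Date"]),
        ("email-header-subject", message["Subject"]),
    ]
    # walk() yields the message itself and, for multipart emails, each subpart.
    for part in message.walk():
        if part.is_multipart():
            continue
        parts.append((part.get_content_type(), part.get_payload()))
    return parts


for data_type, content in split_into_parts(RAW_EMAIL):
    print(data_type, "->", content.strip())
```

A parser that only accepts ICal content finds nothing to work with among these parts, which is why the GenericProvider gives up.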

Now, let’s see how the new OpenAI parser (implementing the LLM) can help us. The only mandatory step to activate it is setting the PARSER_OPENAI_API_KEY environment variable:

export PARSER_OPENAI_API_KEY="use your token here"

By default, it uses OpenAI’s GPT-3.5 model, but you can change it with the PARSER_OPENAI_MODEL environment variable. To see all the available options (including options to customize the LLM question), check the docs.

At this point, every Provider will have the OpenAI parser as the last resort.

Let’s repeat the previous example without providing the provider-type (your output may differ because the LLM is not deterministic), and notice that the Metadata associated with this output lists the parsers used. You will also see that this takes slightly longer than before because the OpenAI API is being called.

$ circuit-maintenance-parser --data-file tests/unit/data/aws/aws1.eml --data-type email -v
Circuit Maintenance Notification #0
{
  "account": "Amazon Web Services",
  "circuits": [
    {
      "circuit_id": "aaaaa-00000001",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000002",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000003",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000004",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000005",
      "impact": "OUTAGE"
    },
    {
      "circuit_id": "aaaaa-00000006",
      "impact": "OUTAGE"
    }
  ],
  "end": 1621519200,
  "maintenance_id": "aaaaa-00000001",
  "organizer": "unknown",
  "provider": "genericprovider",
  "sequence": 1,
  "stamp": 1620337976,
  "start": 1621497600,
  "status": "CONFIRMED",
  "summary": "Planned maintenance has been scheduled on an AWS Direct Connect router in A Block, New York, NY for 6 hours.",
  "uid": "0"
}
Metadata #0
provider='genericprovider' processor='CombinedProcessor' parsers=['EmailDateParser', 'OpenAIParser'] generated_by_llm=True

The output should be a successful parsing “similar” to the one above. However, a closer look reveals some differences. Some of them may be acceptable, and others not. With the metadata (including a generated_by_llm boolean), the library user can choose how this information should be handled, for example by adding extra validation before accepting it.
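As one possible way to add that extra validation, the sketch below trusts results from human-defined parsers and runs a few sanity checks on LLM-generated ones. Everything here is an assumption made for illustration: the `Metadata` dataclass only mirrors the fields printed by the CLI, the maintenance is handled as a plain dictionary shaped like the JSON output, and the specific checks in `review_reasons` are ours.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Metadata:
    """Mirrors the metadata fields printed by the CLI (illustrative only)."""
    provider: str
    processor: str
    parsers: List[str]
    generated_by_llm: bool


def review_reasons(maintenance: dict, metadata: Metadata) -> List[str]:
    """Return reasons why a result needs human review (empty list = accept)."""
    if not metadata.generated_by_llm:
        return []  # Human-defined parsers are trusted as-is in this sketch.
    reasons = []
    if maintenance["start"] >= maintenance["end"]:
        reasons.append("start is not before end")
    if not maintenance["circuits"]:
        reasons.append("no circuits found")
    if maintenance["organizer"] == "unknown":
        reasons.append("organizer could not be determined")
    return reasons


# Values taken from the LLM-generated output above (circuits shortened).
llm_result = {
    "start": 1621497600,
    "end": 1621519200,
    "organizer": "unknown",
    "circuits": [{"circuit_id": "aaaaa-00000001", "impact": "OUTAGE"}],
}
llm_metadata = Metadata(
    provider="genericprovider",
    processor="CombinedProcessor",
    parsers=["EmailDateParser", "OpenAIParser"],
    generated_by_llm=True,
)

print(review_reasons(llm_result, llm_metadata))
# ['organizer could not be determined']
```

Which checks are worth enforcing depends on how the data is consumed; the point is that the generated_by_llm flag gives you a clear place to hook them in.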

If you compare the two JSON objects with any of the available diff tools (such as https://www.jsondiff.com/), you can see exactly where they differ (your output may vary depending on your results). Keep in mind that you may need to discard or adjust some information.

{
  "account": "Amazon Web Services",
  "maintenance_id": "aaaaa-00000001",
  "organizer": "unknown",
  "provider": "genericprovider",
  "summary": "Planned maintenance has been scheduled on an AWS Direct Connect router in A Block, New York, NY for 6 hours."
}
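If you prefer to do the comparison in code, a top-level diff of two dictionaries takes only a few lines of Python. The helper `changed_fields` below is a hypothetical name, and it does not recurse into nested structures. The sample values are a subset of the two outputs shown earlier.

```python
def changed_fields(reference: dict, candidate: dict) -> dict:
    """Return the candidate's values for every top-level key that differs."""
    keys = reference.keys() | candidate.keys()
    return {
        key: candidate.get(key)
        for key in sorted(keys)
        if reference.get(key) != candidate.get(key)
    }


# Subset of the output produced by the aws provider.
human_parsed = {
    "account": "0000000000001",
    "maintenance_id": "15faf02fcf2e999792668df97828bc76",
    "organizer": "aws-account-notifications@amazon.com",
    "provider": "aws",
    "status": "CONFIRMED",
}
# Subset of the output produced by the LLM parser.
llm_parsed = {
    "account": "Amazon Web Services",
    "maintenance_id": "aaaaa-00000001",
    "organizer": "unknown",
    "provider": "genericprovider",
    "status": "CONFIRMED",
}

for field, value in changed_fields(human_parsed, llm_parsed).items():
    print(f"{field}: {value}")
```

Fields that match in both results, such as status here, are left out of the diff.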

And if you are wondering what would happen if you properly set the provider type, the result will be exactly the same as in the first example, because the aws provider knows how to parse the notification and the LLM parser is never invoked.


Conclusion

At NTC, we are constantly considering how to leverage AI/ML technologies to support network automation use cases for all the different components of our recommended architecture (more info in this blog series), and this new feature is an example of how our open source projects can be powered by them.

We would like to encourage you to give it a try, and provide constructive feedback in the form of Issues or Feature Requests in the library repository.

Thanks for reading!

-Christian


