Developing Batfish – Developer Summary (Part 1)

This is Part 1 of a series of posts to help programmers learn how to contribute to Batfish. The goal of the series is to get a contributor up to speed on the different elements of Batfish from a developer's viewpoint, even with NO Java/ANTLR experience.

From Batfish’s definition:

Batfish is a network validation tool that provides correctness guarantees for security, reliability, and compliance by analyzing the configuration of network devices. It builds complete models of network behavior from device configurations and finds violations of network policies (built-in, user-defined, and best-practices).

Before getting started, I can't say enough about the generous help I received from the Batfish developers throughout my initial contributions. Contributing to an open source project can be daunting, especially when using a language you're not overly familiar with. The developers at Batfish were kind, helpful, and a pleasure to work with while getting these updates added to the codebase. See the #dev channel in the Batfish Slack for any additional help.

Key Concepts and Terminology

When I first cloned the Batfish repository and started to browse the source code, it quickly became evident that I'd have to do some research before I could get started. First, get a basic understanding of how Java projects are structured. Next, at a minimum, follow a basic Java tutorial to get your feet wet. Finally, dive in! I had very limited Java coding experience when I started making contributions, but with a background in Python and a small amount of Golang, the learning curve was gentle.

Grammars

While browsing the code I saw a directory named antlr4; this is where the grammars are stored. Before I started to research what ANTLR is, I looked through a few of the files and could quickly tell that they describe how network configuration files are parsed.

What is ANTLR:

From ANTLR’s documentation – ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It’s widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees.

ANTLR’s Definition of a Grammar:

From a formal language description called a grammar, ANTLR generates a parser for that language that can automatically build parse trees, which are data structures representing how a grammar matches the input. ANTLR also automatically generates tree walkers that you can use to visit the nodes of those trees to execute application-specific code.

Batfish uses ANTLR to create structured data in the form of a vendor-specific representation. Extending a grammar was pretty straightforward for about 90% of my use cases; after investigating other .g4 files in the project, the basic steps become obvious. I will go into further detail in Part 2 of this blog series.

A simple representation can be seen below:

Configuration Line:

set switch-options vtep-source-interface lo0.0

Grammar Definition:

s_switch_options
:
  SWITCH_OPTIONS
    (
      so_vtep_source_interface
    )
;

so_vtep_source_interface
:
  VTEP_SOURCE_INTERFACE iface = interface_id
;

Lexer Definitions:

SWITCH_OPTIONS: 'switch-options';

VTEP_SOURCE_INTERFACE
:
   'vtep-source-interface' -> pushMode(M_Interface)
;

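This is a simplified view; details will be provided in Part 2.

By investigating the snippets above you can piece together the different elements that are needed: the lexer rules turn keywords such as switch-options into tokens, and the parser rules describe the order in which those tokens must appear. Together they determine how ANTLR builds the parse tree.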
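To make the lexer/parser split more concrete, here is a minimal hand-written sketch of the same two stages in plain Java. It is not what ANTLR generates and it is not Batfish code: the class name MiniSwitchOptionsParser, the SET and INTERFACE_ID token names, and the boolean flag standing in for pushMode(M_Interface) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written illustration only; NOT code generated by ANTLR or taken from Batfish. */
final class MiniSwitchOptionsParser {

  /** SET and INTERFACE_ID are assumed names; the other two mirror the lexer rules above. */
  enum TokenType { SET, SWITCH_OPTIONS, VTEP_SOURCE_INTERFACE, INTERFACE_ID }

  static final class Token {
    final TokenType type;
    final String text;

    Token(TokenType type, String text) {
      this.type = type;
      this.text = text;
    }
  }

  /** Lexing: turn each whitespace-separated word into a typed token. */
  static List<Token> lex(String line) {
    List<Token> tokens = new ArrayList<>();
    boolean expectInterface = false; // crude imitation of pushMode(M_Interface)
    for (String word : line.trim().split("\\s+")) {
      if (expectInterface) {
        tokens.add(new Token(TokenType.INTERFACE_ID, word));
        expectInterface = false;
      } else if (word.equals("set")) {
        tokens.add(new Token(TokenType.SET, word));
      } else if (word.equals("switch-options")) {
        tokens.add(new Token(TokenType.SWITCH_OPTIONS, word));
      } else if (word.equals("vtep-source-interface")) {
        tokens.add(new Token(TokenType.VTEP_SOURCE_INTERFACE, word));
        expectInterface = true; // the next word is read as an interface name
      } else {
        throw new IllegalArgumentException("Unrecognized word: " + word);
      }
    }
    return tokens;
  }

  /** Parsing: require the exact token sequence and return the interface name. */
  static String parseVtepSourceInterface(String line) {
    List<Token> tokens = lex(line);
    TokenType[] expected = {
      TokenType.SET,
      TokenType.SWITCH_OPTIONS,
      TokenType.VTEP_SOURCE_INTERFACE,
      TokenType.INTERFACE_ID
    };
    if (tokens.size() != expected.length) {
      throw new IllegalArgumentException("Unexpected number of tokens: " + tokens.size());
    }
    for (int i = 0; i < expected.length; i++) {
      if (tokens.get(i).type != expected[i]) {
        throw new IllegalArgumentException("Expected " + expected[i] + " at position " + i);
      }
    }
    return tokens.get(3).text; // plays the role of the 'iface' label in the grammar rule
  }

  public static void main(String[] args) {
    String line = "set switch-options vtep-source-interface lo0.0";
    System.out.println(parseVtepSourceInterface(line));
  }
}
```

The real grammar gets all of this from the .g4 files; the sketch only shows why both a lexer rule and a parser rule are needed for one configuration line.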

Vendor-Specific Representation

The next concept to understand is what to do with the structured data that's retrieved from the parse tree. In this step you import the grammar contexts and walk the tree, taking the parsed data and storing it in a vendor-specific data structure. In most cases this is built as a Java class that represents a vendor-specific dataset (e.g., VLAN, PROTOCOL). I will go into further detail in Part 3 of this blog series.
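As a rough sketch of that step, the plain-Java example below copies a value from a stand-in parse-tree context into a vendor-specific holder class. Everything here is hypothetical: VendorSpecificSketch, VtepSourceInterfaceContext, SwitchOptions, and Extractor are names invented for this illustration, not Batfish classes. The callback name exitSo_vtep_source_interface only follows the exit-plus-rule-name convention ANTLR uses for generated listener methods.

```java
import java.util.Optional;

/** Illustration only; every class in here is hypothetical, not taken from Batfish. */
final class VendorSpecificSketch {

  /** Stand-in for the context object ANTLR would generate for so_vtep_source_interface. */
  static final class VtepSourceInterfaceContext {
    final String iface;

    VtepSourceInterfaceContext(String iface) {
      this.iface = iface;
    }
  }

  /** Vendor-specific holder for the Junos switch-options settings seen so far. */
  static final class SwitchOptions {
    private String vtepSourceInterface; // null until the config line is seen

    Optional<String> getVtepSourceInterface() {
      return Optional.ofNullable(vtepSourceInterface);
    }

    void setVtepSourceInterface(String vtepSourceInterface) {
      this.vtepSourceInterface = vtepSourceInterface;
    }
  }

  /** Extractor shaped like a parse-tree listener: one callback per grammar rule. */
  static final class Extractor {
    private final SwitchOptions switchOptions = new SwitchOptions();

    /** Called when the walker leaves a so_vtep_source_interface node. */
    void exitSo_vtep_source_interface(VtepSourceInterfaceContext ctx) {
      switchOptions.setVtepSourceInterface(ctx.iface);
    }

    SwitchOptions getSwitchOptions() {
      return switchOptions;
    }
  }

  public static void main(String[] args) {
    Extractor extractor = new Extractor();
    // In a real project the tree walker would invoke this callback for us.
    extractor.exitSo_vtep_source_interface(new VtepSourceInterfaceContext("lo0.0"));
    System.out.println(
        extractor.getSwitchOptions().getVtepSourceInterface().orElse("<unset>"));
  }
}
```

Returning an Optional from the getter is a design choice of this sketch; it makes "the line was never configured" explicit instead of handing back null.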

Vendor-Independent Data Model

The overall Batfish codebase contains multiple projects. The vendor-independent models are held inside the batfish-common-protocol Java project. Within this project there is a data model that represents aspects of the network in a vendor-independent way. I will go into further detail in Part 4 of this blog series.

Most of the vendor-independent data models are pretty complex. A simple example to demonstrate here is the Batfish InterfaceType data model, which is an enum of the available options. This is a basic example; other data models, such as Vxlan, are far more complex because the model has to capture the items defined by an RFC.

public enum InterfaceType {
  /** Logical interface that aggregates multiple (physical) interfaces */
  AGGREGATED,
  /** Child of a aggregate interface: logical, sub-interface of an AGGREGATED interface */
  AGGREGATE_CHILD,
  /** Generic Logical interface, (e.g., units on Juniper devices) */
  LOGICAL,
  /** Logical loopback interface */
<omitted>
  /** Logical VLAN/irb interface */
  VLAN,
  /** Logical VPN interface, (i.e., IPSec tunnel) */
  VPN,
}
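To give a feel for what a vendor-specific to vendor-independent conversion might look like, here is a small sketch that maps Junos-style interface names onto the enum values visible in the excerpt above. It is an assumption-heavy illustration, not Batfish's conversion logic: InterfaceTypeSketch, SketchInterfaceType, and classify are invented names, the reduced enum only repeats the values shown in the excerpt, and the name-based rules (ae for aggregated Ethernet, irb for VLAN routing interfaces, st for secure tunnels) are simple heuristics chosen for the example.

```java
/** Illustration only; the names and rules here are hypothetical, not Batfish's. */
final class InterfaceTypeSketch {

  /** Reduced copy of the values visible in the excerpt; the real enum has more. */
  enum SketchInterfaceType { AGGREGATED, AGGREGATE_CHILD, LOGICAL, VLAN, VPN }

  /** Guess a vendor-independent type from a Junos-style interface name. */
  static SketchInterfaceType classify(String name) {
    int dot = name.indexOf('.');
    boolean hasUnit = dot >= 0;
    String base = hasUnit ? name.substring(0, dot) : name;

    if (base.equals("irb")) {
      return SketchInterfaceType.VLAN; // integrated routing and bridging interface
    }
    if (base.matches("st\\d+")) {
      return SketchInterfaceType.VPN; // secure tunnel interface
    }
    if (base.matches("ae\\d+")) {
      // A unit on an aggregated Ethernet bundle is a child of that bundle.
      return hasUnit ? SketchInterfaceType.AGGREGATE_CHILD : SketchInterfaceType.AGGREGATED;
    }
    if (hasUnit) {
      return SketchInterfaceType.LOGICAL; // any other unit, e.g. ge-0/0/0.0
    }
    throw new IllegalArgumentException("Not covered by this sketch: " + name);
  }

  public static void main(String[] args) {
    for (String name : new String[] {"ae0", "ae0.100", "irb.200", "st0.0", "ge-0/0/0.0"}) {
      System.out.println(name + " -> " + classify(name));
    }
  }
}
```

The point is only that the vendor-independent model gives every vendor's interfaces a shared vocabulary; the actual mapping rules live in each vendor's conversion code.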

Developer Prep Work

To get started, everything you need is already written up in the Batfish wiki under the “For developers” section. I would also recommend setting up pre-commit to validate code quality before submitting a pull request. The pre-commit config is already created and lives in the root of the Batfish repo. Follow the steps below to get started quickly.

Helpers and Utilities

  • Using Pre-commit:

    Note: You will most likely want to do the next step inside a virtual environment.

    Install pre-commit:

      pip install pre-commit

    Next, make contributions and run the commit check:

      pre-commit run --all-files

    Which results in output similar to:

      ▶ pre-commit run --all-files
      black....................................................................Passed
      java-format..............................................................Passed
      isort....................................................................Passed
      autoflake................................................................Passed

    Pro Tip: You can install pre-commit as a Git hook so that it runs automatically every time you execute a git commit.

      ▶ pre-commit install
      pre-commit installed at .git/hooks/pre-commit

  • Formatting:

    Beyond this, Batfish also comes with a nice helper utility to fix Java formatting. This lives within the tools directory in repo root, and it can be run by executing the shell script.

      ./tools/fix_java_format.sh

  • Reviewable:

    Reviewable is used within the Batfish repo while a PR is being reviewed. It is helpful to know and understand the basics about Reviewable (and know how to utilize the feedback) in order to get your PR merged!

My First Contributions

My first few contributions to Batfish extended VXLAN/EVPN support for Juniper (Junos).

I will be covering the details of these pull requests (PRs) in the future blog posts outlined below; however, the PRs are provided for reference and research and can be used to help prepare a new contributor.


Conclusion

In the coming months I will be creating a specific blog post on each of the concepts mentioned above.

  • Developing Batfish – Extending a Grammar (Part 2)
  • Developing Batfish – Converting Config Text into Structured Data (Part 3)
  • Developing Batfish – Converting Vendor-Specific to Vendor-Independent (Part 4)

-Jeff


