Developing Batfish – Extending a Grammar (Part 2)

This is part 2 of a blog series to help learn how to contribute to Batfish. Here’s the previous post in this series: Developing Batfish – Developer Summary (Part 1).

In this post I will cover in detail how to extend a grammar definition, which includes updating the ANTLR files and adding tests to validate that the new configuration commands are parsed as expected. There are entire books and blog posts that cover the in-depth details of ANTLR, so this blog will focus on the pieces needed for a successful Batfish contribution.

What Is a Grammar?

In part 1 of this series I covered a generic definition of what a grammar is. The list below covers some of the more in-depth grammar-related concepts:

  • Lexer – A lexer (often called a scanner) breaks up an input stream of characters into vocabulary symbols (tokens) for a parser, which applies a grammatical structure to that symbol stream. I think of these as the different command elements, whether that is a parent line or sub parent (e.g., interface, vlan, prefix-list). The lexer needs to have all the elements built out for the parsing to work accurately. Lexer rules define token types. They start with an uppercase letter to distinguish them from parser rules.
  • Parser Rules – A parse tree is made up of the structure of parser rules. Parser rule names start with a lowercase letter. ANTLR has detailed documentation on parser rules.
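
To make the split between the two concepts concrete, here is a minimal Java sketch. It is a toy and not ANTLR or Batfish code: the class, the method names, and the four-word vocabulary are all hypothetical. The lexer step turns text into token types, and the parser step checks that the token types arrive in an expected order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy illustration only: not ANTLR and not Batfish code. */
class ToyLexer {
  // Hypothetical vocabulary: literal text -> token type (uppercase, like lexer rules).
  private static final Map<String, String> TOKEN_TYPES =
      Map.of(
          "set", "SET",
          "switch-options", "SWITCH_OPTIONS",
          "vrf-target", "VRF_TARGET",
          "auto", "AUTO");

  /** Lexer step: split the character stream into token types. */
  static List<String> tokenize(String line) {
    List<String> tokens = new ArrayList<>();
    for (String word : line.trim().split("\\s+")) {
      // Anything outside the vocabulary is reported as UNKNOWN.
      tokens.add(TOKEN_TYPES.getOrDefault(word, "UNKNOWN"));
    }
    return tokens;
  }

  /** Parser step: a lowercase "rule" that accepts one specific token sequence. */
  static boolean soVrfTargetAuto(List<String> tokens) {
    return tokens.equals(List.of("SET", "SWITCH_OPTIONS", "VRF_TARGET", "AUTO"));
  }
}
```

In this sketch, ToyLexer.tokenize("set switch-options vrf-target auto") produces the four known token types, so soVrfTargetAuto accepts it. A word the lexer does not know becomes UNKNOWN and the rule rejects the line.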

Basic Steps

  1. Create a testconfig file with the lines to be parsed
  2. Update grammar definition and lexer
  3. Add extraction tests

Extending a grammar blends concepts from part 2 and part 3 of this blog series, specifically the structured data representation and extraction testing, since extraction tests commonly require those representations to be created first.

Command Additions

In this blog post I will be covering the steps to add parser rules for a few commands that I need access to for future enhancements. These commands live in the switch-options Junos stanza.

The commands I’d like to add parsing support for are:

set switch-options vrf-target target:65320:7999999
set switch-options vrf-target auto
set switch-options vrf-target import target:65320:7999999
set switch-options vrf-target export target:65320:7999999

More information on these commands can be found in the Juniper documentation.

At the start of this blog, the FlatJuniper_switch_options.g4 file has the contents:

Note: This file lives here in the Batfish directory structure.

parser grammar FlatJuniper_switch_options;

import FlatJuniper_common;

options {
   tokenVocab = FlatJuniperLexer;
}

s_switch_options
:
  SWITCH_OPTIONS
    (
      so_vtep_source_interface
      | so_route_distinguisher
      | so_vrf_target
      | so_vrf_export
      | so_vrf_import
    )
;

so_vtep_source_interface
:
  VTEP_SOURCE_INTERFACE iface = interface_id
;

so_route_distinguisher
:
  ROUTE_DISTINGUISHER route_distinguisher
;

so_vrf_target
:
  VRF_TARGET null_filler
;

so_vrf_export
:
  VRF_EXPORT null_filler
;

so_vrf_import
:
  VRF_IMPORT null_filler
;

You can see that the vrf-target, export, and import sections of the commands are going to null_filler, which is implemented in Batfish specifically for Juniper to raise the error about the command syntax not being supported. null_filler is defined in FlatJuniper_common.g4 and shown below:

null_filler
:
  ~( APPLY_GROUPS | NEWLINE )* apply_groups?
;

Note that null_filler is specific to Juniper within Batfish, and each vendor has its own definition/implementation of similar logic.

We will work on revamping this file and adding the necessary support for these command sections.

Create a Testconfig

As is true with most large-scale projects, testing is required and is an integral piece to be included in any contribution to a project. The first step we need to complete is to create a simple Testconfig file that Batfish’s testing environment can pull in and attempt to parse. The commands in our Testconfigs are very simple. I will create a file per command; this will give me added flexibility when adding testing.

First, I will create a Testconfig file called juniper-so-vrf-target-auto. This file is located in this directory.

I will test the auto vrf-target first.

#
set system host-name juniper-so-vrf-target-auto
#
set switch-options vrf-target auto
#

These files are simple test-configuration files, and they don’t need to be complicated. Their purpose is to test that a command can be parsed as expected. In this example I have a host-name and the command I want to parse out.

Note: A good way to determine how many Testconfig files are needed is to think about whether or not certain configuration lines will overwrite each other. If the configurations can be grouped or scoped, a single Testconfig can be used.

Updating the Grammar

Updating this grammar should be straightforward. I want to extend what is mentioned above to become our starting point for the switch-options grammar.

The updated grammar definition is shown below. I will be explaining the changes in the next section.

parser grammar FlatJuniper_switch_options;

import FlatJuniper_common;

options {
   tokenVocab = FlatJuniperLexer;
}

s_switch_options
:
  SWITCH_OPTIONS
    (
      so_route_distinguisher
      | so_vrf_target
      | so_vtep_source_interface
   )
;

so_vrf_target:
   VRF_TARGET
        (
          sovt_auto
          | sovt_community
          | sovt_export
          | sovt_import
        )
;

so_vtep_source_interface
:
  VTEP_SOURCE_INTERFACE iface = interface_id
;

so_route_distinguisher
:
  ROUTE_DISTINGUISHER route_distinguisher
;

sovt_auto
:
  AUTO
;

sovt_community
:
   extended_community
;

sovt_export
:
   EXPORT extended_community
;

sovt_import
:
   IMPORT extended_community
;

Understanding the Grammar Updates

To help with the visualization, a diff is shown below:

▶ sdiff -bBWs before.g4 after.g4
      so_vtep_source_interface				      |	      so_route_distinguisher
      | so_route_distinguisher				      <
      | so_vrf_export					      |	      | so_vtep_source_interface
      | so_vrf_import					      |	   )
							      >	;
							      >
							      >	so_vrf_target:
							      >	   VRF_TARGET
							      >	        (
							      >	          sovt_auto
							      >	          | sovt_community
							      >	          | sovt_export
							      >	          | sovt_import
so_vrf_target						      |	sovt_auto
  VRF_TARGET null_filler				      |	  AUTO
so_vrf_export						      |	sovt_community
  VRF_EXPORT null_filler				      |	   extended_community
so_vrf_import						      |	sovt_export
  VRF_IMPORT null_filler				      |	   EXPORT extended_community
							      >
							      >	sovt_import
							      >	:
							      >	   IMPORT extended_community
							      >	;

I’ll go through the changes from top to bottom:

  • First, I show the change from so_vrf_target to VRF_TARGET. This one was actually an overall fix from the initial implementation I did in a previous PR into Batfish. Instead of defining so_vrf_target from scratch, I looked through the lexer and noticed that VRF_TARGET was already defined with a rule that pushes the lexer mode M_VrfTarget, which I will cover in the testing section below. In this case it made sense to reuse what was already built.
  • Next, I added the OR block to catch the multiple different command syntaxes that are supported.
  • sovt_auto will support the syntax set switch-options vrf-target auto.
  • The sovt_community definition is next; it supports set switch-options vrf-target target:65320:7999999. The rule also reuses extended_community, which is defined in FlatJuniper_common.g4.
  • Finally, the last two are catching statically defined import or export community targets. These simply allow for the command set switch-options vrf-target import target:65320:7999999 or the identical command replacing import with export.

Note that the names of the rules have changed to follow the Batfish standard. sovt in this case stands for switch-options vrf-target. I also rearranged the rules to be alphabetical within each context block.
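
As a rough illustration of how the OR block chooses between alternatives, the Java sketch below mimics the four sovt_ rules with plain string checks. It is not Batfish or ANTLR code: the class and method names are hypothetical, and it assumes extended_community only takes the target:<number>:<number> form used in this post, whereas the real rule in FlatJuniper_common.g4 may accept more forms.

```java
import java.util.regex.Pattern;

/** Illustration only: mimics the alternatives under so_vrf_target with string checks. */
class SovtAlternatives {
  // Assumption: only the "target:<number>:<number>" form of extended_community.
  private static final Pattern COMMUNITY = Pattern.compile("target:\\d+:\\d+");

  /** Returns the name of the rule matching the text that follows "vrf-target". */
  static String match(String afterVrfTarget) {
    String[] words = afterVrfTarget.trim().split("\\s+");
    if (words.length == 1 && words[0].equals("auto")) {
      return "sovt_auto";
    }
    if (words.length == 1 && isCommunity(words[0])) {
      return "sovt_community";
    }
    if (words.length == 2 && isCommunity(words[1])) {
      if (words[0].equals("export")) {
        return "sovt_export";
      }
      if (words[0].equals("import")) {
        return "sovt_import";
      }
    }
    // None of the four alternatives applies.
    return "no match";
  }

  private static boolean isCommunity(String word) {
    return COMMUNITY.matcher(word).matches();
  }
}
```

Each of the four commands from the Command Additions section lands on exactly one alternative here, which is the property the real grammar needs as well.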

Add Boilerplate for Testing

The actual testing of the extraction code is going to be covered in part 3 of this series. For the purposes of validating the grammar I will demonstrate how to create a simple test. This will validate that Batfish can parse the configuration lines. In order to do this we create our Testconfig files and call the parseJuniperConfig method. At a bare minimum this can ensure the parsing of the ANTLR tree is successful.

This extraction test will be created in FlatJuniperGrammarTest.java. It will test the auto option via the config line set switch-options vrf-target auto.

  @Test
  public void testSwitchOptionsVrfTargetAutoExtraction() {
    parseJuniperConfig("juniper-so-vrf-target-auto");
  }

As seen above, I simply call the parseJuniperConfig method and pass my Testconfig filename into it.

When I run this test I should get PASSED.

Using IntelliJ I can easily execute the single test right within the application. When I run the test, the output below is shown.

Testing started at 2:57 PM ...
<omitted>

INFO: Elapsed time: 6.295s, Critical Path: 5.87s
INFO: 5 processes: 1 internal, 3 darwin-sandbox, 1 worker.
INFO: Build completed successfully, 5 total actions
//projects/batfish/src/test/java/org/batfish/grammar/flatjuniper:tests   PASSED in 2.6s

<omitted>

I see the test has a status of PASSED, which validates the Testconfig file could be parsed by ANTLR.

To demonstrate a failure and how to use the output to help troubleshoot, I’m purposely updating the Testconfig file to have the command set switch-options vrf-target nauto (notice nauto instead of auto). I realize this is not a valid config in Junos, and it would never be in show configuration | display set output. This is used for demonstration purposes only.

Executed 1 out of 1 test: 1 fails locally.
INFO: Build completed, 1 test FAILED, 4 total actions
INFO: Build Event Protocol files produced successfully.
INFO: Build completed, 1 test FAILED, 4 total actions

Parser error
org.batfish.main.ParserBatfishException: 
  <omitted>
Caused by: org.batfish.common.DebugBatfishException: 
lexer: FlatJuniperLexer: line 4:30: token recognition error at: 'n'
Current rule stack: '[s_switch_options s_common statement set_line_tail set_line flat_juniper_configuration]'.
Current rule starts at: line: 4, col 4
Parse tree for current rule:
(s_switch_options
  SWITCH_OPTIONS:'switch-options')
Lexer mode: M_VrfTarget
Lexer state variables:
markWildcards: false
Error context lines:
   1:      #
   2:      set system host-name juniper-so-vrf-target-auto
   3:      #
>>>4:      set switch-options vrf-target nauto
   5:      #

  <omitted>
	at org.batfish.grammar.flatjuniper.FlatJuniperCombinedParser.parse(FlatJuniperCombinedParser.java:12)
	at org.batfish.main.Batfish.parse(Batfish.java:410)
	... 36 more

Quite a bit of output here was omitted, but some helpful parts remain.

  1. There is a DebugBatfishException getting raised, and on the next line we get some details on where the issue lies. lexer: FlatJuniperLexer: line 4:30: token recognition error at: 'n'.
  2. The current rule stack can help you trace the grammar resolution order that was followed. In this case, the most recent grammar where the failure occurred is s_switch_options.
  3. The parser tree rule details are shown, and they include helpful information about Lexer mode and the current rule that failed.
  4. Error context lines shows the exact line that failed to parse.
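
The line 4:30 part of the message is worth a closer look. The small Java sketch below pulls the line and column out of the message text so they can be checked against the Testconfig. The class name, method name, and regular expression are my own and are not part of Batfish; the only thing taken from the output above is the message format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only: extracts "line L:C" from a lexer error message. */
class LexerErrorLocation {
  // Assumption: the message follows the "line <line>:<column>: token recognition error" shape.
  private static final Pattern LOCATION =
      Pattern.compile("line (\\d+):(\\d+): token recognition error");

  /** Returns {line, column}, or null if the message has a different shape. */
  static int[] parse(String message) {
    Matcher m = LOCATION.matcher(message);
    if (!m.find()) {
      return null;
    }
    return new int[] {Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))};
  }
}
```

For the message above this gives line 4 and column 30. Counting columns from zero, position 30 of set switch-options vrf-target nauto is the n in nauto, which matches the character reported in the error.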

According to this information we can determine that the parsing tree got into the Lexer mode: M_VrfTarget.

Taking a step back to the VRF_TARGET token, which resolves in FlatJuniperLexer.g4, it defines the new mode to push into.

VRF_TARGET
:
   'vrf-target' -> pushMode ( M_VrfTarget )
;

This means that if vrf-target is found in the input, it will push the lexer into a new mode called M_VrfTarget. In order to troubleshoot that further, I look in FlatJuniperLexer.g4 for M_VrfTarget.

mode M_VrfTarget;

M_VrfTarget_COLON: ':' -> type ( COLON );
M_VrfTarget_DEC: F_Digit+ -> type ( DEC );
M_VrfTarget_AUTO: 'auto' -> type ( AUTO );
M_VrfTarget_EXPORT: 'export' -> type ( EXPORT );
M_VrfTarget_IMPORT: 'import' -> type ( IMPORT );
M_VrfTarget_L: 'L' -> type ( L );
M_VrfTarget_NEWLINE: F_NewlineChar+ -> type(NEWLINE), popMode;
M_VrfTarget_PERIOD: '.' -> type ( PERIOD );
M_VrfTarget_TARGET: 'target' -> type ( TARGET );
M_VrfTarget_WS: F_WhitespaceChar+ -> channel ( HIDDEN );

Lexer modes act as sublexers, which add flexibility when the input that follows a command is more complex. More details on modes, types, and channels can be found in the ANTLR README.

At a high level, this mode allows any of these token types to be matched until a NEWLINE is found, at which point popMode returns the lexer to the previous mode.
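
The push and pop behavior can be pictured as a stack of modes. The Java sketch below is a simplified simulation and not how ANTLR is implemented: the class and method names are hypothetical, it works word by word instead of character by character, and it treats a whole target:... community as one word, whereas the real mode lexes target, the colons, and the digits as separate tokens.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;

/** Illustration only: a word-level simulation of pushMode and popMode. */
class LexerModeSketch {
  // Keywords the simplified M_VrfTarget mode accepts.
  private static final Set<String> VRF_TARGET_WORDS = Set.of("auto", "export", "import");

  /** Returns one entry per word, tagged with the mode that handled it. */
  static List<String> lex(String line) {
    Deque<String> modes = new ArrayDeque<>();
    modes.push("DEFAULT_MODE");
    List<String> out = new ArrayList<>();
    for (String word : line.trim().split("\\s+")) {
      if (modes.peek().equals("DEFAULT_MODE")) {
        out.add(word + "@DEFAULT_MODE");
        if (word.equals("vrf-target")) {
          modes.push("M_VrfTarget"); // like: -> pushMode ( M_VrfTarget )
        }
      } else if (VRF_TARGET_WORDS.contains(word) || word.startsWith("target:")) {
        out.add(word + "@M_VrfTarget");
      } else {
        // No rule in the current mode matches this text.
        out.add("ERROR:" + word + "@M_VrfTarget");
      }
    }
    // End of line: the NEWLINE rule pops back to the previous mode.
    if (modes.peek().equals("M_VrfTarget")) {
      modes.pop();
    }
    out.add("NEWLINE->" + modes.peek());
    return out;
  }
}
```

With this sketch, auto after vrf-target is handled in M_VrfTarget and the lexer is back in DEFAULT_MODE once the line ends, while nauto is flagged as an error inside M_VrfTarget.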

Now that I see the mode definition for the vrf-target clause, it’s evident why my Testconfig is failing. This mode does not allow for the token nauto; therefore, it’s raising the exception seen in the stack trace. If I needed to add additional types, I would define them in the same manner as the others. For this example I could add M_VrfTarget_NAUTO: 'nauto' -> type ( NAUTO ); and define NAUTO: 'nauto'; in FlatJuniperLexer.g4.

It’s important to understand that I’m showing a Lexer mode in this example to show its flexibility. I want to be clear that lexer modes are not the common case. They’re needed only when you want to limit what is lexed after a command; most commands do not need their own lexer mode.

Summary

In this post I provided more details on what a grammar is and how we can define and/or update an existing parser file. I explained the new commands that I wanted to add, along with updating an existing parser file to support the additional commands. In order to validate the parsing additions I created a simplified test, which we will add on to in the next blog post. Finally, I touched on how to utilize push and pop modes to add more flexibility to the lexer context.

In the next post I will be covering how to extract and use the parsed token data to create structured data in a Junos vendor datamodel. Once the datamodel is enhanced to support the additional command data, I will cover how to extend the test we wrote in this blog post to test the datamodel instead of just the simple file parsing capabilities.


Conclusion

Additional posts in the series coming soon.

  • Developing Batfish – Converting Config Text into Structured Data (Part 3)
  • Developing Batfish – Converting Vendor Specific to Vendor Independent (Part 4)

-Jeff


