Developing Batfish – Extending a Grammar (Part 2)
This is part 2 of a blog series to help learn how to contribute to Batfish. Here’s the previous post in this series: Developing Batfish – Developer Summary (Part 1).
In this post I will be covering how to extend a grammar definition in detail, which includes updating the ANTLR files and adding tests to validate that the new configuration commands parse as expected. There are entire books and blog posts that cover the in-depth details of ANTLR, so this blog will focus on the pieces needed for a successful Batfish contribution.
What Is a Grammar?
In part 1 of this series I covered a generic definition of what a grammar is. The list below covers some of the more in-depth grammar-related concepts:
- Lexer – A lexer (often called a scanner) breaks up an input stream of characters into vocabulary symbols (tokens) for a parser, which applies a grammatical structure to that token stream. I think of the tokens as the different command elements, whether that is a parent line or a sub-parent (e.g., interface, vlan, prefix-list). The lexer needs to have all the elements built out for the parsing to work accurately. Lexer rules define token types. They start with an uppercase letter to distinguish them from parser rules.
- Parser Rules – A parse tree is made up of the structure of parser rules. Parser rule names start with a lowercase letter. ANTLR has detailed documentation on parser rules.
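To make the lexer's job concrete, here is a minimal Java sketch. It is not ANTLR output and not Batfish code: the class name TokenizeSketch and its four-word vocabulary are invented for illustration, and it only maps whitespace-separated words to uppercase token type names in the style of lexer rules such as SWITCH_OPTIONS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: a hand-written stand-in for what a generated lexer does.
final class TokenizeSketch {
  // Hypothetical vocabulary; the real FlatJuniperLexer defines far more tokens.
  private static final Map<String, String> KEYWORDS = Map.of(
      "set", "SET",
      "switch-options", "SWITCH_OPTIONS",
      "vrf-target", "VRF_TARGET",
      "auto", "AUTO");

  /** Splits a line on whitespace and maps each word to a token type name. */
  static List<String> tokenize(String line) {
    List<String> tokens = new ArrayList<>();
    for (String word : line.trim().split("\\s+")) {
      // Words outside the vocabulary are reported rather than silently dropped.
      tokens.add(KEYWORDS.getOrDefault(word, "UNRECOGNIZED(" + word + ")"));
    }
    return tokens;
  }

  public static void main(String[] args) {
    // Prints [SET, SWITCH_OPTIONS, VRF_TARGET, AUTO]
    System.out.println(tokenize("set switch-options vrf-target auto"));
  }
}
```

A parser rule then describes which sequences of these token types are allowed, which is the part this post extends.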
Basic Steps
- Create a testconfig file with the lines to be parsed
- Update grammar definition and lexer
- Add extraction tests
Extending a grammar merges concepts from part 2 and part 3 of this blog series, specifically the structured data representation and extraction testing, because extraction tests commonly require the representations to be created first.
Command Additions
In this blog post I will be covering the steps to add parser rules for a few commands that I need access to for future enhancements. These commands live in the switch-options Junos stanza.
The commands I’d like to add parsing support for are:
set switch-options vrf-target target:65320:7999999
set switch-options vrf-target auto
set switch-options vrf-target import target:65320:7999999
set switch-options vrf-target export target:65320:7999999
More information on these commands can be found in the Juniper documentation.
At the start of this blog post, the FlatJuniper_switch_options.g4 file (which lives here in the Batfish directory structure) has the following contents:
parser grammar FlatJuniper_switch_options;

import FlatJuniper_common;

options {
  tokenVocab = FlatJuniperLexer;
}

s_switch_options
:
  SWITCH_OPTIONS
  (
    so_vtep_source_interface
    | so_route_distinguisher
    | so_vrf_target
    | so_vrf_export
    | so_vrf_import
  )
;

so_vtep_source_interface
:
  VTEP_SOURCE_INTERFACE iface = interface_id
;

so_route_distinguisher
:
  ROUTE_DISTINGUISHER route_distinguisher
;

so_vrf_target
:
  VRF_TARGET null_filler
;

so_vrf_export
:
  VRF_EXPORT null_filler
;

so_vrf_import
:
  VRF_IMPORT null_filler
;
You can see that the vrf-target, export, and import sections of the commands are going to null_filler, which is implemented in Batfish specifically for Juniper to raise the error about the command syntax not being supported. null_filler is defined in FlatJuniper_common.g4 and shown below:
null_filler
:
  ~( APPLY_GROUPS | NEWLINE )* apply_groups?
;
Note that null_filler is specific to Juniper within Batfish, and each vendor has its own definition/implementation of similar logic.
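Read as a loop, the negated set in null_filler consumes tokens until it reaches APPLY_GROUPS or NEWLINE. The Java sketch below imitates only that part; the optional apply_groups suffix is left out, tokens are represented by their type names as plain strings, and the class and method names are invented for illustration.

```java
import java.util.List;

// Illustrative only: mimics the ~( APPLY_GROUPS | NEWLINE )* portion of null_filler.
final class NullFillerSketch {
  /** Returns the index of the first token at or after start that the negated set would not consume. */
  static int consumeFiller(List<String> tokenTypes, int start) {
    int i = start;
    while (i < tokenTypes.size()
        && !tokenTypes.get(i).equals("APPLY_GROUPS")
        && !tokenTypes.get(i).equals("NEWLINE")) {
      i++; // any other token type is swallowed without being interpreted
    }
    return i;
  }

  public static void main(String[] args) {
    // Token types for "vrf-target target:65320:7999999" followed by a newline.
    List<String> tokens =
        List.of("VRF_TARGET", "TARGET", "COLON", "DEC", "COLON", "DEC", "NEWLINE");
    // Start after VRF_TARGET; everything up to NEWLINE is consumed.
    System.out.println(consumeFiller(tokens, 1)); // prints 6
  }
}
```

Because nothing inside the filler is given structure, there is no value for later extraction code to read, which is why the rules need to be rewritten.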
We will work on revamping this file and adding the necessary support for these command sections.
Create a Testconfig
As is true with most large-scale projects, testing is required and is an integral piece of any contribution to a project. The first step we need to complete is to create a simple Testconfig file that Batfish’s testing environment can pull in and attempt to parse. The commands in our Testconfigs are very simple. I will create a file per command; this will give me added flexibility when adding testing.
First, I will create a Testconfig file called juniper-so-vrf-target-auto. This file is located in this directory.
I will test the auto vrf-target first.
#
set system host-name juniper-so-vrf-target-auto
#
set switch-options vrf-target auto
#
These files are simple test-configuration files, and they don’t need to be complicated. Their purpose is to test that a command can be parsed as expected. In this example I have a host-name and the command I want to parse out.
Note: A good way to determine how many Testconfig files are needed is to think about whether or not certain configuration lines will overwrite each other. If the configurations can be grouped or scoped, a single Testconfig can be used.
Updating the Grammar
Updating this grammar should be straightforward. I want to extend what is mentioned above to become our starting point for the switch-options grammar.
The updated grammar definition is shown below. I will be explaining the changes in the next section.
parser grammar FlatJuniper_switch_options;

import FlatJuniper_common;

options {
  tokenVocab = FlatJuniperLexer;
}

s_switch_options
:
  SWITCH_OPTIONS
  (
    so_route_distinguisher
    | so_vrf_target
    | so_vtep_source_interface
  )
;

so_vrf_target:
  VRF_TARGET
  (
    sovt_auto
    | sovt_community
    | sovt_export
    | sovt_import
  )
;

so_vtep_source_interface
:
  VTEP_SOURCE_INTERFACE iface = interface_id
;

so_route_distinguisher
:
  ROUTE_DISTINGUISHER route_distinguisher
;

sovt_auto
:
  AUTO
;

sovt_community
:
  extended_community
;

sovt_export
:
  EXPORT extended_community
;

sovt_import
:
  IMPORT extended_community
;
Understanding the Grammar Updates
To help with the visualization, a diff is shown below:
▶ sdiff -bBWs before.g4 after.g4
    so_vtep_source_interface      |      so_route_distinguisher
    | so_route_distinguisher      <
    | so_vrf_export               |      | so_vtep_source_interface
    | so_vrf_import               |    )
                                  >  ;
                                  >
                                  >  so_vrf_target:
                                  >    VRF_TARGET
                                  >    (
                                  >      sovt_auto
                                  >      | sovt_community
                                  >      | sovt_export
                                  >      | sovt_import
so_vrf_target                     |  sovt_auto
  VRF_TARGET null_filler          |    AUTO
so_vrf_export                     |  sovt_community
  VRF_EXPORT null_filler          |    extended_community
so_vrf_import                     |  sovt_export
  VRF_IMPORT null_filler          |    EXPORT extended_community
                                  >
                                  >  sovt_import
                                  >  :
                                  >    IMPORT extended_community
                                  >  ;
I’ll go through the changes from top to bottom:
- First, I show the change from so_vrf_target to VRF_TARGET. This one was actually an overall fix from the initial implementation I did in a previous PR into Batfish. Instead of defining so_vrf_target from scratch, I looked through the lexer and noticed that VRF_TARGET was already defined with a push rule for the Lexer mode: M_VrfTarget, which I will be covering in the testing section below. In this case it made sense to reuse what was already built.
- Next, I added the OR block to catch the multiple different command syntaxes that are supported.
- sovt_auto will support the syntax set switch-options vrf-target auto.
- The sovt_community definition is next; it supports set switch-options vrf-target target:65320:7999999. The rule also reuses extended_community, which is defined in FlatJuniper_common.g4.
- Finally, the last two catch statically defined import or export community targets. These simply allow for the command set switch-options vrf-target import target:65320:7999999 or the identical command replacing import with export.
Note that the names of the rules have changed to follow the Batfish standard. sovt in this case stands for switch-options vrf-target. I also rearranged the rules to be alphabetical within each context block.
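The four alternatives inside so_vrf_target can be pictured as a simple decision over the words that follow vrf-target. The Java sketch below is a hand-written approximation and not generated parser code. The class name is invented, and as a loud assumption it only recognizes the target:<number>:<number> form used in this post, whereas the real extended_community rule is defined elsewhere and may accept more.

```java
import java.util.List;

// Illustrative only: which sovt_* alternative would the words after "vrf-target" match?
final class VrfTargetAlternatives {
  /** Returns the matching rule name, or "no match" when no alternative fits. */
  static String classify(List<String> words) {
    if (words.equals(List.of("auto"))) {
      return "sovt_auto";
    }
    if (words.size() == 1 && isCommunity(words.get(0))) {
      return "sovt_community";
    }
    if (words.size() == 2 && isCommunity(words.get(1))) {
      if (words.get(0).equals("export")) {
        return "sovt_export";
      }
      if (words.get(0).equals("import")) {
        return "sovt_import";
      }
    }
    return "no match";
  }

  // Assumption: only the target:<number>:<number> form from this post is handled.
  static boolean isCommunity(String word) {
    return word.matches("target:\\d+:\\d+");
  }

  public static void main(String[] args) {
    System.out.println(classify(List.of("auto")));
    System.out.println(classify(List.of("target:65320:7999999")));
    System.out.println(classify(List.of("import", "target:65320:7999999")));
    System.out.println(classify(List.of("export", "target:65320:7999999")));
  }
}
```

Each branch corresponds to one line of the OR block, which is why a single VRF_TARGET rule can now cover all four commands.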
Add Boilerplate for Testing
The actual testing of the extraction code is going to be covered in part 3 of this series. For the purposes of validating the grammar I will demonstrate how to create a simple test. This will validate that Batfish can parse the configuration lines. In order to do this we create our Testconfig files and attempt to run the parseJuniperConfig method. At a bare minimum this can ensure the parsing of the ANTLR tree is successful.
This extraction test will be created in FlatJuniperGrammarTest.java. It will test the auto option via the config line set switch-options vrf-target auto.
@Test
public void testSwitchOptionsVrfTargetAutoExtraction() {
  // Passes if the Testconfig parses without raising an exception.
  parseJuniperConfig("juniper-so-vrf-target-auto");
}
As seen above, I simply call the parseJuniperConfig method and pass my Testconfig filename into it.
When I run this test I should get PASSED.
Using IntelliJ I can easily execute the single test right within the application. When I run the test, the output below is shown.
Testing started at 2:57 PM ...
<omitted>
INFO: Elapsed time: 6.295s, Critical Path: 5.87s
INFO: 5 processes: 1 internal, 3 darwin-sandbox, 1 worker.
INFO: Build completed successfully, 5 total actions
//projects/batfish/src/test/java/org/batfish/grammar/flatjuniper:tests PASSED in 2.6s
<omitted>
I see the test has a status of PASSED, which validates the Testconfig file could be parsed by ANTLR.
To demonstrate a failure and how to use the output to help troubleshoot, I’m purposely updating the Testconfig file to have the command set switch-options vrf-target nauto (notice nauto instead of auto). I realize this is not a valid config in Junos, and it would never be in show configuration | display set output. This is used for demonstration purposes only.
Executed 1 out of 1 test: 1 fails locally.
INFO: Build completed, 1 test FAILED, 4 total actions
INFO: Build Event Protocol files produced successfully.
INFO: Build completed, 1 test FAILED, 4 total actions
Parser error
org.batfish.main.ParserBatfishException:
<omitted>
Caused by: org.batfish.common.DebugBatfishException:
lexer: FlatJuniperLexer: line 4:30: token recognition error at: 'n'
Current rule stack: '[s_switch_options s_common statement set_line_tail set_line flat_juniper_configuration]'.
Current rule starts at: line: 4, col 4
Parse tree for current rule:
(s_switch_options
SWITCH_OPTIONS:'switch-options')
Lexer mode: M_VrfTarget
Lexer state variables:
markWildcards: false
Error context lines:
1: #
2: set system host-name juniper-so-vrf-target-auto
3: #
>>>4: set switch-options vrf-target nauto
5: #
<omitted>
at org.batfish.grammar.flatjuniper.FlatJuniperCombinedParser.parse(FlatJuniperCombinedParser.java:12)
at org.batfish.main.Batfish.parse(Batfish.java:410)
... 36 more
Quite a bit of output here was omitted, but some helpful parts remain.
- There is a DebugBatfishException getting raised, and on the next line we get some details on where the issue lies: lexer: FlatJuniperLexer: line 4:30: token recognition error at: 'n'.
- The current rule stack can help you trace the grammar resolution order that was followed. In this case, the most recent rule where the failure occurred is s_switch_options.
- The parse tree for the current rule is shown, along with helpful information about the Lexer mode.
- Error context lines shows the exact line that failed to parse.
According to this information we can determine that the lexer was in Lexer mode: M_VrfTarget when it hit the error.
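The 4:30 in the lexer message is a line number followed by a character position within that line; ANTLR counts lines from 1 and positions from 0. The small Java sketch below, with an invented class name and no dependency on Batfish, pulls those two numbers out of the message and marks the offending character in the Testconfig.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: locate the character a lexer error message points at.
final class ErrorLocationSketch {
  private static final Pattern LOCATION = Pattern.compile("line (\\d+):(\\d+)");

  /** Returns the reported config line followed by a caret under the reported column. */
  static String pointAt(String message, String[] configLines) {
    Matcher m = LOCATION.matcher(message);
    if (!m.find()) {
      throw new IllegalArgumentException("no line:column in message");
    }
    int line = Integer.parseInt(m.group(1));   // 1-based line number
    int column = Integer.parseInt(m.group(2)); // 0-based character position
    String text = configLines[line - 1];
    return text + "\n" + " ".repeat(column) + "^";
  }

  public static void main(String[] args) {
    String[] config = {
      "#",
      "set system host-name juniper-so-vrf-target-auto",
      "#",
      "set switch-options vrf-target nauto",
      "#"
    };
    // The caret lands under the 'n' of nauto.
    System.out.println(pointAt(
        "lexer: FlatJuniperLexer: line 4:30: token recognition error at: 'n'", config));
  }
}
```

Position 30 is the first character after vrf-target and its trailing space, so the lexer accepted everything up to the keyword and rejected only the unexpected n.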
Taking a step back to the VRF_TARGET token, which is defined in FlatJuniperLexer.g4, it defines the new mode to push into.
VRF_TARGET
:
  'vrf-target' -> pushMode ( M_VrfTarget )
;
This means that if vrf-target is found in the configuration text, it will push the lexer into a new mode called M_VrfTarget. In order to troubleshoot that further, I look in FlatJuniperLexer.g4 for M_VrfTarget.
mode M_VrfTarget;
M_VrfTarget_COLON: ':' -> type ( COLON );
M_VrfTarget_DEC: F_Digit+ -> type ( DEC );
M_VrfTarget_AUTO: 'auto' -> type ( AUTO );
M_VrfTarget_EXPORT: 'export' -> type ( EXPORT );
M_VrfTarget_IMPORT: 'import' -> type ( IMPORT );
M_VrfTarget_L: 'L' -> type ( L );
M_VrfTarget_NEWLINE: F_NewlineChar+ -> type(NEWLINE), popMode;
M_VrfTarget_PERIOD: '.' -> type ( PERIOD );
M_VrfTarget_TARGET: 'target' -> type ( TARGET );
M_VrfTarget_WS: F_WhitespaceChar+ -> channel ( HIDDEN );
This lexer mode allows for additional flexibility in the parse tree when the outputs are more complex, using sublexers. More details on modes, types, and channels can be found in the ANTLR README.
At a high level this mode allows for any of these types to be found until the NEWLINE is found, at which point popMode will return the lexer back to the previous context.
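The push and pop behavior can be imitated in a few lines of plain Java. This is a sketch under heavy simplification and not how ANTLR generates lexers: the class name LexerModeSketch is invented, the default mode is reduced to splitting words on whitespace, each newline character becomes its own NEWLINE entry, and only the literals from the M_VrfTarget listing are recognized inside the mode.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Illustrative only: a toy lexer with a mode stack.
final class LexerModeSketch {
  // Literal text and token type pairs allowed inside the M_VrfTarget mode.
  private static final String[][] VRF_TARGET_LITERALS = {
    {"auto", "AUTO"}, {"export", "EXPORT"}, {"import", "IMPORT"},
    {"target", "TARGET"}, {":", "COLON"}, {".", "PERIOD"}, {"L", "L"}
  };

  static List<String> lex(String input) {
    List<String> out = new ArrayList<>();
    Deque<String> modes = new ArrayDeque<>();
    modes.push("DEFAULT");
    int i = 0;
    while (i < input.length()) {
      char c = input.charAt(i);
      if (c == '\n') {
        out.add("NEWLINE");
        if (modes.size() > 1) {
          modes.pop(); // popMode: return to the previous context
        }
        i++;
      } else if (Character.isWhitespace(c)) {
        i++; // whitespace goes to the hidden channel, so nothing is emitted
      } else if (modes.peek().equals("M_VrfTarget")) {
        i = lexVrfTarget(input, i, out);
      } else {
        // Default mode, heavily simplified: one token per whitespace-delimited word.
        int end = i;
        while (end < input.length() && !Character.isWhitespace(input.charAt(end))) {
          end++;
        }
        String word = input.substring(i, end);
        out.add(word);
        if (word.equals("vrf-target")) {
          modes.push("M_VrfTarget"); // pushMode
        }
        i = end;
      }
    }
    return out;
  }

  /** Lexes one token in M_VrfTarget mode and returns the next input position. */
  private static int lexVrfTarget(String input, int i, List<String> out) {
    if (Character.isDigit(input.charAt(i))) {
      int end = i;
      while (end < input.length() && Character.isDigit(input.charAt(end))) {
        end++;
      }
      out.add("DEC");
      return end;
    }
    for (String[] literal : VRF_TARGET_LITERALS) {
      if (input.startsWith(literal[0], i)) {
        out.add(literal[1]);
        return i + literal[0].length();
      }
    }
    // Nothing in this mode matches: report the character and skip it.
    out.add("ERROR('" + input.charAt(i) + "')");
    return i + 1;
  }

  public static void main(String[] args) {
    System.out.println(lex("set switch-options vrf-target target:65320:7999999\n"));
    System.out.println(lex("set switch-options vrf-target nauto\n"));
  }
}
```

With this sketch, the nauto line produces an error entry for n followed by an AUTO token, because no literal in the mode starts with n while the remaining auto matches normally.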
Now that I see the mode definition for the vrf-target clause, it’s evident why my Testconfig is failing. This mode does not allow for the token nauto; therefore, it’s raising the exception seen in the stack trace. If I needed to add additional types, I would define them in the same manner as the others. For this example I could add M_VrfTarget_NAUTO: 'nauto' -> type ( NAUTO ); and define NAUTO: 'nauto'; in FlatJuniperLexer.g4.
It’s important to understand that I’m showing a Lexer mode in this example to show its flexibility. I want to be clear that lexer modes are not the common case. They’re needed only when you want to limit what is lexed after a command; most commands do not need their own lexer mode.
Summary
In this post I provide more details on what a grammar is and how we can define and/or update an existing parser file. I explained the new commands that I wanted to add, along with updating an existing parser file to support the additional commands. In order to validate the parsing additions I created a simplified test, which we will add on to in the next blog post. Finally, I touched on how to utilize push and pop modes to add more flexibility to the lexer context.
In the next post I will be covering how to extract and use the parsed token data to create structured data in a Junos vendor datamodel. Once the datamodel is enhanced to support the additional command data, I will cover how to extend the test we wrote in this blog post to test the datamodel instead of just the simple file parsing capabilities.
Conclusion
Additional posts in the series coming soon.
- Developing Batfish – Converting Config Text into Structured Data (Part 3)
- Developing Batfish – Converting Vendor Specific to Vendor Independent (Part 4)
-Jeff