How to Build an ACL Auditor with Batfish

One of the domains that I have worked in during my career is network security. In this space, when it comes to firewalls, I’ve seen many problem areas, such as:

Human error – ACL updates where entries are added incorrectly, e.g. in the wrong position within the list (think “deny any any” or “permit any any” at position 1!), resulting in either service outages or unintentional access into your network.
ACL clutter – Over time, rules are added to the rule base. However, newly added rules often encompass existing rules, which are then no longer required as they are never hit. Not only does this present unnecessary clutter for anyone reading the ACL, it also requires more cycles for the firewall to process.
Bad actors – Individuals, groups, or organizations interested in attacking IT systems, who have managed to get into the firewall through a vulnerability or another attack vector and have opened ports to allow themselves access into the network.

Based on these pain points, I wanted to write an ACL auditing tool, based on Batfish, that automates the checks needed to prevent these issues from occurring, whilst also providing you with a springboard into the world of Batfish and network security automation.

Why Batfish? Batfish provides a great open source, vendor agnostic way to validate ACLs, as we will dive into later.

Note: To fully follow this guide you will need to have both Docker and Docker Compose installed.

Let’s begin…

Batfish 101

What is Batfish?

Batfish is an open-source network configuration analysis tool that provides the ability to validate configuration data, query network adjacencies, verify firewall ACL rule sets and also analyze routing/flow paths.¹

Batfish runs as a service, i.e. a Dockerized container. Snapshots of your network are then uploaded to the Batfish service. A snapshot is a collection of information that represents your network, such as device configurations, link/connectivity data, and server details such as IP and iptables settings. Batfish therefore requires no direct access to your network and operates on a purely offline model.
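As a rough illustration of what a snapshot can look like on disk, the sketch below lays out a directory with device configurations under a configs/ subfolder, which is the layout the Batfish documentation describes for file-based snapshots. The helper name build_snapshot_dir, the file name fw1.cfg and the one-line config are made up for illustration; later in this guide a single config is uploaded as text instead.

```python
import tempfile
from pathlib import Path


def build_snapshot_dir(root: Path, configs: dict) -> Path:
    """Write each config into <root>/configs/ and return the snapshot root."""
    configs_dir = root / "configs"
    configs_dir.mkdir(parents=True, exist_ok=True)
    for filename, text in configs.items():
        (configs_dir / filename).write_text(text)
    return root


# Hypothetical one-line config, used only to show the directory layout.
with tempfile.TemporaryDirectory() as tmp:
    snapshot = build_snapshot_dir(Path(tmp) / "snapshot", {"fw1.cfg": "hostname fw1\n"})
    created = sorted(p.relative_to(snapshot).as_posix() for p in snapshot.rglob("*"))
    print(created)
```

Because only files are involved, a snapshot like this can be built from configuration backups without touching the live network.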

Batfish then ingests your network snapshot and builds a series of internal, vendor-agnostic models of your network. These models include not only configuration, but also control plane state such as BGP sessions. Questions are then issued to the Batfish service via the Python SDK (pybatfish) or an Ansible Batfish role. Numerous question types are available, as we will see when listing them later in this guide.

Furthermore, Batfish also supports uploading multiple snapshots, which you can then compare against each other, as we will see later. Below is an example of using pybatfish to check the session status of BGP.

>>> bfq.bgpSessionStatus(nodes="/spine|leaf/").answer().frame()
status: TERMINATEDNORMALLY
.... Wed Jun 26 15:01:16 2019 DST Begin job.
     Node      VRF Local_AS Local_Interface Local_IP Remote_AS Remote_Node Remote_Interface Remote_IP   Session_Type Established_Status
0   leaf1  default    64521            None  3.3.3.3     64520      spine1             None   1.1.1.1  EBGP_MULTIHOP        ESTABLISHED
1   leaf1  default    64521            None  3.3.3.3     64520      spine2             None   2.2.2.2  EBGP_MULTIHOP        ESTABLISHED
2   leaf2  default    64522            None  4.4.4.4     64520      spine1             None   1.1.1.1  EBGP_MULTIHOP        ESTABLISHED
3   leaf2  default    64522            None  4.4.4.4     64520      spine2             None   2.2.2.2  EBGP_MULTIHOP        ESTABLISHED
4  spine1  default    64520            None  1.1.1.1     64521       leaf1             None   3.3.3.3  EBGP_MULTIHOP        ESTABLISHED
5  spine1  default    64520            None  1.1.1.1     64522       leaf2             None   4.4.4.4  EBGP_MULTIHOP        ESTABLISHED
6  spine2  default    64520            None  2.2.2.2     64521       leaf1             None   3.3.3.3  EBGP_MULTIHOP        ESTABLISHED
7  spine2  default    64520            None  2.2.2.2     64522       leaf2             None   4.4.4.4  EBGP_MULTIHOP        ESTABLISHED

Installation

To install Batfish, run the following commands to pull down and then run the Batfish container image.²

docker pull batfish/allinone
docker run --name batfish -d -v batfish-data:/data -p 8888:8888 -p 9997:9997 -p 9996:9996 batfish/allinone

However, for this tutorial we can use a pre-built environment via docker-compose using the following commands.

git clone git@github.com:networktocode/ntc-soteria.git -b v0.1

cd ntc-soteria

docker-compose build
docker-compose up -d
docker-compose exec ntc-soteria bash

Once run, you will have 2 running containers (Batfish and ntc-soteria) and will be placed into the shell of the ntc-soteria container. This container will have pybatfish installed and access to the Batfish container.
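If the containers have started but pybatfish cannot connect, a quick TCP check from inside the ntc-soteria shell can help narrow things down. The sketch below is a convenience under stated assumptions and is not part of ntc-soteria: the function name port_open is hypothetical, the hostname batfish is the one this guide later assigns to bf_session.host, and ports 9996 and 9997 are those published by the docker run command shown earlier.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_batfish(host: str = "batfish", ports=(9996, 9997)) -> dict:
    """Map each assumed Batfish service port to its reachability."""
    return {port: port_open(host, port) for port in ports}
```

Calling check_batfish() from the ntc-soteria container returns a dictionary of port to True/False; a False value suggests the Batfish container is not up yet or is not reachable under that hostname.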

We will use ntc-soteria again when building our ACL auditor, and dive into it in more detail later in this guide.

Example

Let’s look at a small example. From the ntc-soteria repo previously cloned, we will use an example Cisco ASA configuration and run a question against our Batfish service.

Next, we fire up our Python interpreter, import the required pybatfish modules and create a snapshot from the ASA configuration contained within the ./data directory.

from pybatfish.client.commands import bf_session
from pybatfish.question import bfq
from pybatfish.question.question import load_questions
from acl_auditor.helpers import read_file

asa_config = read_file('data/asa.cfg')

bf_session.host = 'batfish'
load_questions()
bf_session.init_snapshot_from_text(asa_config, snapshot_name="base", overwrite=True)

We can now start asking questions about our snapshot. Below we use the ipOwners question to get the IP details of the device. Note: answer() runs the question and returns the answer in a JSON format; frame() wraps the answer as a pandas DataFrame. The pandas DataFrame provides us with a data structure and various methods to parse, manipulate and iterate over the results.

>>> bfq.ipOwners().answer().frame()
status: TRYINGTOASSIGN
.... no task information
status: TERMINATEDNORMALLY
.... 2020-07-03 08:56:32.506000+01:00 Begin job.
  Node      VRF Interface              IP Mask Active
0  fw1  default   webfarm      10.0.1.254   24   True
1  fw1  default      mgmt  172.29.132.100   24  False
2  fw1  default    inside       10.0.0.13   30   True
3  fw1  default   outside   192.168.0.254   24   True
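To make the DataFrame point concrete, here is a small sketch of filtering and iterating such a result. To keep it runnable without a Batfish service, the frame is rebuilt by hand from the rows printed above; in a live session it would come from bfq.ipOwners().answer().frame(). The variable names are illustrative only.

```python
import pandas as pd

# Rows copied from the ipOwners output above (stand-in for the live answer frame).
ip_owners = pd.DataFrame(
    [
        {"Node": "fw1", "VRF": "default", "Interface": "webfarm", "IP": "10.0.1.254", "Mask": 24, "Active": True},
        {"Node": "fw1", "VRF": "default", "Interface": "mgmt", "IP": "172.29.132.100", "Mask": 24, "Active": False},
        {"Node": "fw1", "VRF": "default", "Interface": "inside", "IP": "10.0.0.13", "Mask": 30, "Active": True},
        {"Node": "fw1", "VRF": "default", "Interface": "outside", "IP": "192.168.0.254", "Mask": 24, "Active": True},
    ]
)

# Boolean indexing: keep only the rows whose Active column is True.
active = ip_owners[ip_owners["Active"]]

# Iterate the remaining rows and build an interface -> prefix mapping.
prefixes = {row.Interface: f"{row.IP}/{row.Mask}" for row in active.itertuples(index=False)}
print(prefixes)
```

The inactive mgmt interface is dropped by the filter, leaving the three active interfaces.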


As mentioned in the previous section, there are numerous questions available. This can also be seen by printing the names (questions) within the bfq namespace. Like so:

>>> from pprint import pprint
>>> pprint(dir(bfq))
['__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'aaaAuthenticationLogin',
 'bgpEdges',
 'bgpPeerConfiguration',
 'bgpProcessConfiguration',
 'bgpSessionCompatibility',
 'bgpSessionStatus',
 'bidirectionalReachability',
 'bidirectionalTraceroute',
 'compareFilters',
 'definedStructures',
 'detectLoops',
 'differentialReachability',
 'edges',
 'eigrpEdges',
 'evpnL3VniProperties',
 'f5BigipVipConfiguration',
 'fileParseStatus',
 'filterLineReachability',
 'filterTable',
 'findMatchingFilterLines',
 'initIssues',
 'interfaceMtu',
 'interfaceProperties',
 'ipOwners',
 'ipsecEdges',
 'ipsecSessionStatus',
 'isisEdges',
...

From this list you will see 2 questions – filterLineReachability and compareFilters. These questions will form the basis of our ACL auditor.
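The listing above also contains module internals such as __builtins__. A small helper can hide those so only question names remain. This is a sketch: question_names is a hypothetical helper, and a SimpleNamespace stands in for bfq so the example runs without pybatfish.

```python
from types import SimpleNamespace


def question_names(namespace) -> list:
    """Return the sorted attribute names that do not start with an underscore."""
    return sorted(name for name in dir(namespace) if not name.startswith("_"))


# Stand-in for bfq, populated with a few of the names shown above.
fake_bfq = SimpleNamespace(compareFilters=None, filterLineReachability=None, ipOwners=None)
names = question_names(fake_bfq)
print(names)
```

In a live session the same helper could be called as question_names(bfq).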

Creating an ACL Auditor

We will now look at how to build an ACL auditor. We will be using the environment from the pre-built repo ntc-soteria (https://github.com/networktocode/ntc-soteria) that we used previously to run a simple Batfish example.

Many of you may be asking: why the strange name, ntc-soteria? Well:

In Greek mythology, Soteria was the goddess or spirit (daimon) of safety and salvation, deliverance, and preservation from harm.

ACL Auditor Overview

Our ACL auditor will be a CLI based tool, written in Python, powered by Batfish and will provide two types of audits:

Differential – Compares and reports the differences between a set of reference flows and a configured (implemented) ACL. Reference flows are 5-tuple policy definitions that define what should be permitted or denied by the firewall. By calculating and reporting the difference between the reference flows and implemented flows, we can ensure no unintended traffic is being permitted (or denied) by the firewall. This will be performed via the Batfish question compareFilters.
Unreachable Entries – Reports any entries within an ACL that will never be hit due to being shadowed by prior lines within the ACL. This will be performed via the Batfish question filterLineReachability.

Audit Types

Let’s look at each audit type in more detail.

Differential

This audit takes 3 pieces of information: a single YAML file containing a set of reference flows, the configuration of your firewall, and the name of the ACL in question. It then compares your reference flows and implemented flows to provide you with a set of results showing the differences. The results include:

Flows that your firewall IS permitting or denying but should not, as they are not included in your reference flow definition.
Flows that your firewall IS NOT permitting or denying but should, as they are included in your reference flow definition.
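The two result categories can be pictured as a pair of set differences. The sketch below is a deliberately simplified model and not how the tool works internally: it compares flows by exact tuple equality, whereas Batfish reasons about what each ACL actually matches, so broader or overlapping rules are handled there and not here. The Flow type and diff_flows function are hypothetical names; the field names follow the flows.yml example used later in this guide.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    source_ip: str
    dest_ip: str
    dest_port: int
    proto: str
    action: str


def diff_flows(reference, implemented):
    """Return (missing, unexpected) flows using exact tuple comparison."""
    reference, implemented = set(reference), set(implemented)
    missing = reference - implemented      # in the reference, not on the firewall
    unexpected = implemented - reference   # on the firewall, not in the reference
    return missing, unexpected


dns_primary = Flow("10.0.1.1/32", "8.8.8.8/32", 53, "udp", "permit")
dns_secondary = Flow("10.0.1.1/32", "8.8.4.4/32", 53, "udp", "permit")
mysql = Flow("10.0.1.1/32", "10.200.1.1/32", 3306, "tcp", "permit")

missing, unexpected = diff_flows(
    reference=[dns_primary, mysql],
    implemented=[dns_primary, dns_secondary],
)
print("missing:", missing)
print("unexpected:", unexpected)
```

Here the MySQL flow is reported as missing and the second DNS flow as unexpected, mirroring the two bullet points above.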

Some use cases for this audit include:

Preventing human error during firewall changes, for example the incorrect addition of an “ip any any” entry.
Adding to the previous point, you can also add this to your ACL CI pipelines.
Running routine scripted checks against your ACL base to ensure no flows are opened incorrectly (for example by bad actors).

Unreachable

This check takes a firewall configuration containing your ACL rule sets. It then reports on any lines that will not match any packet, because of being shadowed by prior lines. The key use cases for this are:

Prevent human error during firewall changes. For example, incorrect placement of an encompassing deny rule.
Adding to the previous point, you can also add this to your ACL CI pipelines.
Assist in keeping your ACL rule sets minimal and free of unnecessary lines. This helps in clarity and also can reduce firewall overhead.
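To illustrate what "shadowed by prior lines" means, here is a minimal sketch using only the standard library. It flags a line when a single earlier line fully covers its protocol, addresses and destination port. This is a simplification under stated assumptions: the AclLine type and both functions are hypothetical, only IPv4 CIDRs and a single destination port are modeled, and a line hidden by a combination of several earlier lines is not detected here. The sample lines mirror the acl-inside example used later in this guide.

```python
import ipaddress
from typing import NamedTuple, Optional


class AclLine(NamedTuple):
    action: str
    proto: str                      # "ip" matches any protocol
    src: str                        # source network in CIDR form
    dst: str                        # destination network in CIDR form
    dst_port: Optional[int] = None  # None matches any port


def covers(earlier: AclLine, later: AclLine) -> bool:
    """True if every packet matched by `later` is also matched by `earlier`."""
    proto_ok = earlier.proto in ("ip", later.proto)
    port_ok = earlier.dst_port is None or earlier.dst_port == later.dst_port
    src_ok = ipaddress.ip_network(later.src).subnet_of(ipaddress.ip_network(earlier.src))
    dst_ok = ipaddress.ip_network(later.dst).subnet_of(ipaddress.ip_network(earlier.dst))
    return proto_ok and port_ok and src_ok and dst_ok


def unreachable_lines(acl):
    """Return (line, first_blocking_line) pairs for lines that can never match."""
    results = []
    for index, line in enumerate(acl):
        blockers = [prev for prev in acl[:index] if covers(prev, line)]
        if blockers:
            results.append((line, blockers[0]))
    return results


ANY = "0.0.0.0/0"
acl_inside = [
    AclLine("deny", "ip", ANY, ANY),
    AclLine("permit", "udp", "10.0.2.1/32", "8.8.8.8/32", 53),
    AclLine("permit", "udp", "10.0.2.1/32", "8.8.4.4/32", 53),
]
shadowed = unreachable_lines(acl_inside)
for line, blocker in shadowed:
    print(f"{line.action} {line.proto} {line.src} -> {line.dst} is blocked by {blocker.action} {blocker.proto}")
```

Both permit lines are reported because the leading deny covers all traffic before they are evaluated.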

The Code

Code Layout/Files

From the shell you entered during the earlier Batfish example, you will be presented with the following code structure for our tool.

tree .
.
├── Dockerfile                  // How to assemble the Docker image.
├── Makefile                    // Set of shell shortcuts. See avail via `make`. 
├── README.md                   // Details about repo.
├── acl_auditor                  
│   ├── __init__.py  
│   ├── auditor.py              // Main script file. 
│   ├── helpers.py              // Various helpers (file, acl generators).
│   ├── report.j2               // HTML report jinja2 template.
│   └── reporter.py             // Formats outputs, and renders outputs/report.
├── data
│   ├── asa.cfg                 // Example ASA configuration.
│   ├── csr.cfg                 // Example CSR configuration.
│   ├── flows.yml               // Example flow reference.
│   ├── report-example.png      // Example image of HTML report.
│   └── report.html             // Example HTML report.
├── docker-compose.yml          // Docker environment definition.
├── poetry.lock                 // Package management file for Poetry.
├── pyproject.toml              // Package management file for Poetry.
└── tests
    ├── __init__.py
    ├── test_config.cfg         // Test config for unit tests.
    ├── test_flows.yml          // Test flows for unit tests.
    └── unit
        ├── __init__.py
        └── test_helpers.py     // Unit tests

Based on the files above, at a high level we will:

Via the CLI, run the auditor.py module and pass in a set of inputs. Example inputs have been included within the data directory.
auditor.py contains a class ACLAuditor. This class contains various methods for performing the required Batfish actions.
Once the Batfish operations have been performed the results will be parsed and formatted via the reporter.py module, for output via the CLI and/or HTML.
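The CLI entry point described above could be sketched with argparse as follows. The short flags (-c, -d, -r, -a, -o) and the values all, unreachable and html are taken from the commands shown later in this guide; the long option names, the differential and cli values, and the validation rule are assumptions for illustration, and the repo's auditor.py may be built differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="auditor.py", description="ACL auditor (sketch)")
    # Which audit to run; "differential" is an assumed value name.
    parser.add_argument("-c", "--check", required=True,
                        choices=["differential", "unreachable", "all"])
    parser.add_argument("-d", "--device-config", required=True,
                        help="path to the firewall configuration")
    parser.add_argument("-r", "--reference-flows",
                        help="path to the YAML reference flows")
    parser.add_argument("-a", "--acl-name", help="name of the ACL to audit")
    # "cli" as the default output name is an assumption.
    parser.add_argument("-o", "--output", choices=["cli", "html"], default="cli")
    return parser


def parse_args(argv):
    """Parse argv and enforce the extra inputs needed by the differential audit."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.check in ("differential", "all") and not (args.reference_flows and args.acl_name):
        parser.error("-r and -a are required for the differential check")
    return args


args = parse_args(["-c", "all", "-d", "data/asa.cfg", "-r", "data/flows.yml",
                   "-a", "acl-inside", "-o", "html"])
print(args.check, args.device_config, args.output)
```

Passing an explicit argument list keeps the parser easy to unit test, in the spirit of the tests directory shown in the layout above.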

[Figure: visual representation of the auditor workflow.]

Unreachable Entry Audit

Let’s now look at how we build our unreachable audit. As mentioned previously this audit will report on any ACL entries that are shadowed by another ACL rule, and therefore would never be hit. To calculate this result we will use the Batfish question:

bfq.filterLineReachability().answer().frame()

[Figure: overview of the steps performed within this audit.]

Build Batfish Session

As with the differential audit (covered later in this guide), the Batfish session is created when ACLAuditor is instantiated, like so:

./acl_auditor/auditor.py

...

class ACLAuditor:
    def __init__(self, config_file):
        bf_session.host = "batfish"
        load_questions()
        self.config_file = config_file

Create Snapshot

Next, we need to create a snapshot using our device configuration. We use the same method as we used before, as shown below:

./acl_auditor/auditor.py

...

def _create_base_snapshot(self):
    bf_session.init_snapshot_from_text(
        self.config_file, snapshot_name="base", overwrite=True
    )

Query Batfish

We now query Batfish via the bfq.filterLineReachability() question, like so:

./acl_auditor/auditor.py

...
    def get_unreachable_lines(self):
        ...
        return bfq.filterLineReachability().answer()

Output Reports

We then pass our results into various reporting functions within reporter.py, which format the output and also handle the rendering of the HTML template using Jinja2.

Example

We will again use the example ASA configuration supplied. Within this configuration let’s focus in on the following ACL:

access-list acl-inside extended deny ip any4 any4
access-list acl-inside extended permit udp host 10.0.2.1 host 8.8.8.8 eq domain
access-list acl-inside extended permit udp host 10.0.2.1 host 8.8.4.4 eq domain

When we run the audit, we get the following results:

./acl_auditor/auditor.py -c unreachable -d data/asa.cfg
+---------------------+-------------------------------------------------+---------------------------+-----------------------+----------------+
| Sources             | Unreachable Line                                | Unreachable Line Action   | Blocking Lines        | Reason         |
|---------------------+-------------------------------------------------+---------------------------+-----------------------+----------------+
| ['fw1: acl-inside'] | permit udp host 10.0.2.1 host 8.8.4.4 eq domain | PERMIT                    | ['deny ip any4 any4'] | BLOCKING_LINES |
| ['fw1: acl-inside'] | permit udp host 10.0.2.1 host 8.8.8.8 eq domain | PERMIT                    | ['deny ip any4 any4'] | BLOCKING_LINES |
+---------------------+-------------------------------------------------+---------------------------+-----------------------+----------------+

Here we can see that the line deny ip any4 any4 is blocking the 2 lines for DNS access out to Google. Great!

Differential Audit

So how do we use Batfish to perform a differential audit? That is, how do we compare and report on the differences between a set of reference flows and an ACL? In short, we use the Batfish question bfq.compareFilters(). The question takes a node name along with 2 snapshots containing your ACLs, and returns the differences.

bfq.compareFilters(nodes='rtr-with-acl').answer(snapshot='filters-change',reference_snapshot='filters').frame()

Unlike the previous audit, this one is a little more advanced. To summarize the steps involved, we will:

Create a snapshot from our device configuration.
Convert our YAML reference flows into an ACL.
Create a reference snapshot using the reference ACL.
Compare the 2 snapshots.
Return the results.

Let’s step through the key steps and code:

Build Batfish Session

Our Batfish session will be built within the constructor of the ACLAuditor class. Like so:

./acl_auditor/auditor.py

class ACLAuditor:
    def __init__(self, config_file):
        bf_session.host = "batfish"
        load_questions()
        self.config_file = config_file
...

Convert Reference Flows

First we take a set of reference flows, that we have defined as YAML (as shown below), and convert them into an ACL based format.

./data/flows.yml

---
- source_ip: 10.0.1.1/32
  dest_ip: 8.8.8.8/32
  dest_port: 53
  proto: udp
  action: permit
- source_ip: 10.0.1.1/32
  dest_ip: 10.200.1.1/32
  dest_port: 3306
  proto: tcp
  action: permit

For this we use the YAML-to-ACL converter helper functions found within helpers.py, for example generate_acl_syntax_juniper_srx().

Create Snapshots

We now have our reference flows in an ACL based format. We will use this ACL to generate a reference snapshot. We will then use our device config to generate a base snapshot.

Like so:

...    
    def _create_base_snapshot(self):
        bf_session.init_snapshot_from_text(
            self.config_file, snapshot_name="base", overwrite=True
        )

    def _create_reference_snapshot(self, hostname):
        platform = "juniper_srx"
        reference_acl = create_acl_from_yaml(
            self.flows_file, hostname, self.acl_name, platform
        )
        bf_session.init_snapshot_from_text(
            reference_acl,
            platform=platform,
            snapshot_name="reference",
            overwrite=True,
        )
        self.validate_reference_snapshot()

Query Batfish

With the 2 snapshots created, we can run our bfq.compareFilters() question, as shown below.

def get_acl_differences(self, flows_file, acl_name):
    ...
    return bfq.compareFilters().answer(
        snapshot="base", reference_snapshot="reference"
    )

Output Reports

Once done we then pass our results into various reporting functions within reporter.py, which formats the outputs and also deals with the rendering of the HTML template using jinja2.

Example

Let’s take our reference flows, which are shown below. These are the flows that should be configured; nothing more, nothing less.

---
- source_ip: 10.0.1.1/32
  dest_ip: 8.8.8.8/32
  dest_port: 53
  proto: udp
  action: permit
- source_ip: 10.0.1.1/32
  dest_ip: 10.200.1.1/32
  dest_port: 3306
  proto: tcp
  action: permit

In this case, we will use an ASA configuration as our device config. Below shows the ACL in question:

access-list acl-webfarm extended permit tcp any host 10.0.2.1 eq 3306
access-list acl-webfarm extended permit udp host 10.0.1.1 host 8.8.8.8 eq domain
access-list acl-webfarm extended permit udp host 10.0.1.1 host 8.8.4.4 eq domain
access-list acl-webfarm extended deny ip any4 any4

When we run the audit we get the following results:

+------------------------+---------------------------------------------------------+---------------------------+-------------------------------------------------+
| Reference Flow Index   | Reference Flow Content                                  | Implemented Flow Action   | Implemented Flow Content                        |
|------------------------+---------------------------------------------------------+---------------------------+-------------------------------------------------+
| 1                      | "flow2 (10.0.1.1/32 any 10.200.1.1/32 3306 tcp permit)" | DENY                      | deny ip any4 any4                               |
| No Match               |                                                         | PERMIT                    | permit tcp any host 10.0.2.1 eq 3306            |
| No Match               |                                                         | PERMIT                    | permit udp host 10.0.1.1 host 8.8.4.4 eq domain |
+------------------------+---------------------------------------------------------+---------------------------+-------------------------------------------------+

So the audit has returned 3 differences (failures). Let’s step through them line by line:

The reference flow 10.0.1.1/32 any 10.200.1.1/32 3306 tcp permit is not permitted due to the implemented line deny ip any4 any4.
The implemented ACL is permitting permit tcp any host 10.0.2.1 eq 3306. However, no match for this flow is found within the reference flows.
Likewise, the implemented ACL is permitting permit udp host 10.0.1.1 host 8.8.4.4 eq domain. However, no match for this flow is found within the reference flows.

Great, we have detected flows that should have been implemented and also flows that were incorrectly implemented.
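For use in a CI pipeline, results like these can be split into the two categories and turned into a pass/fail exit code. The sketch below works on rows copied from the table above; the dictionary keys, the function names and the exit-code convention are assumptions for illustration and do not reflect how reporter.py structures its data.

```python
NO_MATCH = "No Match"

# Rows copied from the results table above.
rows = [
    {"reference_index": "1",
     "reference_flow": "flow2 (10.0.1.1/32 any 10.200.1.1/32 3306 tcp permit)",
     "implemented_action": "DENY",
     "implemented_line": "deny ip any4 any4"},
    {"reference_index": NO_MATCH, "reference_flow": "",
     "implemented_action": "PERMIT",
     "implemented_line": "permit tcp any host 10.0.2.1 eq 3306"},
    {"reference_index": NO_MATCH, "reference_flow": "",
     "implemented_action": "PERMIT",
     "implemented_line": "permit udp host 10.0.1.1 host 8.8.4.4 eq domain"},
]


def classify(rows):
    """Split rows into reference flows not honoured and unexpected implemented lines."""
    not_implemented = [r for r in rows if r["reference_index"] != NO_MATCH]
    unexpected = [r for r in rows if r["reference_index"] == NO_MATCH]
    return not_implemented, unexpected


def exit_code(rows) -> int:
    """Return 1 when any difference exists, so a CI job can fail on it."""
    return 1 if rows else 0


not_implemented, unexpected = classify(rows)
print(len(not_implemented), "reference flow(s) not implemented")
print(len(unexpected), "implemented line(s) without a reference flow")
```

A wrapper script could pass exit_code(rows) to sys.exit() so that the pipeline stops whenever the firewall drifts from the reference flows.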

HTML Report

We previously ran the audits individually, with output to the CLI only. However, I’ve also included the option to output the results as an HTML report (an example image is included in the repo as data/report-example.png).

This report is generated via an additional -o html option, which can be used when running either or both audits. For example:

./acl_auditor/auditor.py -c all -d data/asa.cfg -r data/flows.yml -a acl-inside -o html

A detailed dive into how the template is constructed and rendered is outside the scope of this article. But the key points are:

The report is rendered using a jinja2 template (report.j2) via the reporter.py module.
The HTML report generated uses the following Material/Bootstrap framework: https://fezvrasta.github.io/bootstrap-material-design/.

Thanks

A thanks goes out to Ratul Mahajan and Dan Halperin at Intentionet for their help and input into this tool.

References

1. “A Hands-on Guide to Multi-Tiered Firewall … – PacketFlow.” 13 Dec. 2019, https://www.packetflow.co.uk/a-hands-on-guide-to-multi-tiered-firewall-changes-with-ansible-and-batfish-part-1/. Accessed 23 Jun. 2020.
2. “A Hands-on Guide to Multi-Tiered Firewall … – PacketFlow.” 13 Dec. 2019, https://www.packetflow.co.uk/a-hands-on-guide-to-multi-tiered-firewall-changes-with-ansible-and-batfish-part-1/. Accessed 24 Jun. 2020.

Conclusion

I hope you have enjoyed reading this article as much as I have enjoyed writing it. When it comes to Batfish, I have only really scratched the surface of what you can do with flow validation. For example, this audit could be extended to check flows across multiple devices (think dual-layer firewall topologies).

I hope this has provided you with a springboard into the world of Batfish, and network security based automation.

Thanks for reading.

-Rick Donato (@rickjdon)

Tags :

automation batfish netdevops tools tutorial
