Ansible - BGP Networking Troubleshooting Guide

Network troubleshooting is a common automation use case. Network outages are costly, and investigating them is time-consuming: network engineers often have to log into network equipment and check for issues by hand. Working on network operations teams, I quickly noticed that troubleshooting network problems is a playbook of repeatable steps, hence the rationale for automating network troubleshooting with Ansible.

Use Case – BGP

Troubleshooting Layer 3 connectivity tends to lead an operations engineer to jump into multiple routers and check routing. Let’s say internet access has been lost from the WAN edge. If I were troubleshooting this, my instincts would tell me to go to my edge router(s) and check the BGP neighbor going towards my ISP.

east-rtr#show ip bgp summary

<...output omitted...>

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
4.4.4.4      4   400       0       0        0    0    0 08:11    Idle

From the output of show ip bgp summary, the issue can be determined: BGP is down toward the ISP. This is a simplified example with one router and one WAN connection, but what happens if you have 10, 15, or more BGP relationships to check? It is costly to manually log in to each router to check the status of BGP. How can Ansible help?

Checking BGP with Ansible

Here is a sequential listing of what the Ansible playbook is doing.

Run show ip bgp summary on the ISP routers and print the output.
Use napalm-ansible to get BGP facts on the neighbors for easy reporting.
Render an easy-to-consume report of BGP neighbor status for each device using a Jinja2 template.
Assemble all the device reports into a single overview report.
Iterate through the neighbors and, if a neighbor is down, ping the neighbor IP to verify Layer 3 reachability using napalm_ping.

Pre-req

There needs to be a valid Ansible inventory, either a static inventory file or dynamic inventories utilizing an existing SoT (Source of Truth). For demonstration purposes a static file will be used.

inventory.cfg

[isp_routers]

[isp_routers:vars]
ansible_network_os=ios

[isp_routers:children]
east_isp
west_isp

[east_isp]
east-rtr

[west_isp]
west-rtr

For help building an inventory file, see Ansible Inventory.
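The same grouping can also be written in Ansible's YAML inventory format. The sketch below is an assumed equivalent of inventory.cfg. The filename inventory.yml is hypothetical and is not used elsewhere in this post.

```yaml
# inventory.yml (hypothetical filename): YAML form of the static inventory above
all:
  children:
    isp_routers:
      vars:
        ansible_network_os: ios
      children:
        east_isp:
          hosts:
            east-rtr:
        west_isp:
          hosts:
            west-rtr:
```

Either file can be passed to ansible-playbook with the -i option.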

Step 1

Create a simple playbook to execute show ip bgp summary on all of the routers in the group called isp_routers.

---
- name: "PLAY:1 - GET BGP SUMMARY"
  gather_facts: False
  connection: "network_cli"
  hosts: "isp_routers"
  tasks:
  - name: "TASK:1 - 'SHOW IP BGP SUMMARY'"
    ios_command:
      commands: "show ip bgp summary"
    register: "output_ios"
  - name: "TASK:2 - PRINT BGP OUTPUT"
    debug:
      msg: "{{ output_ios.stdout[0] }}"

Running the playbook results in the following output.

▶ ansible-playbook pb.yml -u ntc -k
SSH password: 

PLAY [PLAY:1 - GET BGP SUMMARY] **************************************************************************************************************************************************************************************

TASK [TASK:1 - 'SHOW IP BGP SUMMARY'] ********************************************************************************************************************************************************************************

ok: [east-rtr]
ok: [west-rtr]

TASK [TASK:2 - PRINT BGP OUTPUT] *************************************************************************************************************************************************************************************
ok: [east-rtr] => {
    "msg": "BGP router identifier 1.1.1.1, local AS number 100\nBGP table version is 416, main routing table version 416\n28 network entries using 6944 bytes of memory\n41 path entries using 5576 bytes of memory\n8/7 BGP path/bestpath attribute entries using 2304 bytes of memory\n4 BGP AS-PATH entries using 128 bytes of memory\n0 BGP route-map cache entries using 0 bytes of memory\n0 BGP filter-list cache entries using 0 bytes of memory\nBGP using 14952 total bytes of memory\nBGP activity 124/96 prefixes, 232/191 paths, scan interval 60 secs\n32 networks peaked at 23:40:21 Jan 7 2021 UTC (6w5d ago)\n\nNeighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd\n4.4.4.4      4   400       0       0        0    0    0 08:21    Idle"
}
ok: [west-rtr] => {
    "msg": "BGP router identifier 2.2.2.2, local AS number 100\nBGP table version is 579, main routing table version 579\n28 network entries using 6944 bytes of memory\n41 path entries using 5576 bytes of memory\n8/7 BGP path/bestpath attribute entries using 2304 bytes of memory\n4 BGP AS-PATH entries using 128 bytes of memory\n0 BGP route-map cache entries using 0 bytes of memory\n0 BGP filter-list cache entries using 0 bytes of memory\nBGP using 14952 total bytes of memory\nBGP activity 158/130 prefixes, 267/226 paths, scan interval 60 secs\n32 networks peaked at 23:40:21 Jan 7 2021 UTC (6w5d ago)\n\nNeighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd\n8.8.8.8      4   400       0       0        0    0    0 18:52    1"
}

PLAY RECAP ***********************************************************************************************************************************************************************************************************
east-rtr   : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
west-rtr   : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

At this point you have a single pane to quickly check all the BGP neighbors; however, it’s hard to read the output. To take this playbook to the next level, we can easily take command output and create structured data using one of the various cli parsing modules.

Ansible Parsing
Parsing Strategies on NTC Blog by Mikhail Yohman.
- Parsing Strategies – An Introduction
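
As a rough sketch of what parsing could look like, the tasks below use the ansible.utils.cli_parse module with the ntc_templates parser. They are not part of the original playbook. They assume the ansible.utils and ansible.netcommon collections and the ntc-templates Python library are installed, and that a template matching show ip bgp summary is available for the platform set in ansible_network_os. The fact name bgp_summary_parsed is made up for illustration.

```yaml
# Sketch only: run the command and parse it into structured data in one task.
- name: "PARSE 'SHOW IP BGP SUMMARY' INTO STRUCTURED DATA"
  ansible.utils.cli_parse:
    command: "show ip bgp summary"
    parser:
      name: "ansible.netcommon.ntc_templates"
    set_fact: "bgp_summary_parsed"  # hypothetical fact name

# Print whatever structure the parser returned for each router.
- name: "PRINT PARSED BGP SUMMARY"
  debug:
    var: "bgp_summary_parsed"
```

The shape of the parsed data depends on the template, so inspect the debug output before building logic on top of it.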

What if more information is needed? You could check route counts or layer 3 reachability.

Let’s dig into this use case further.

Step 2

Use the napalm_get_facts module from napalm-ansible to gather BGP neighbor facts.

Note: For readability, the rest of the tasks will be in PLAY:2 of the playbook.

- name: "PLAY:2 - USE NAPALM BGP FACTS"
  gather_facts: False
  connection: "network_cli"
  hosts: "isp_routers"
  tasks:
  - name: "TASK:1 - 'GET BGP FACTS'"
    napalm_get_facts:
      filter:
        - "bgp_neighbors"
    register: "bgp"
  - debug: var=bgp

Results in:

▶ ansible-playbook napalm_pb.yml -u ntc -k
SSH password: 

PLAY [PLAY:2 - USE NAPALM BGP FACTS] *********************************************************************************************************************************************************************************

TASK [TASK:1 - 'GET BGP FACTS'] **************************************************************************************************************************************************************************************
ok: [east-rtr]
ok: [west-rtr]

TASK [debug] *********************************************************************************************************************************************************************************************************
ok: [east-rtr] => {
    "bgp": {
        "ansible_facts": {
            "napalm_bgp_neighbors": {
                "global": {
                    "peers": {
                        "4.4.4.4": {
                            "address_family": {
                                "ipv4 unicast": {
                                    "accepted_prefixes": 0,
                                    "received_prefixes": 0,
                                    "sent_prefixes": 0
                                }
                            },
                            "description": "",
                            "is_enabled": true,
                            "is_up": false,
                            "local_as": 100,
                            "remote_as": 400,
                            "remote_id": "4.4.4.4",
                            "uptime": 0
                        }
                    },
                    "router_id": "1.1.1.1"
                }
            }
        },
        "changed": false,
        "failed": false
    }
}
ok: [west-rtr] => {
    "bgp": {
        "ansible_facts": {
            "napalm_bgp_neighbors": {
                "global": {
                    "peers": {
                        "8.8.8.8": {
                            "address_family": {
                                "ipv4 unicast": {
                                    "accepted_prefixes": 1,
                                    "received_prefixes": 1,
                                    "sent_prefixes": 11
                                }
                            },
                            "description": "",
                            "is_enabled": true,
                            "is_up": true,
                            "local_as": 100,
                            "remote_as": 400,
                            "remote_id": "8.8.8.8",
                            "uptime": 1641600
                        }
                    },
                    "router_id": "2.2.2.2"
                }
            }
        },
        "changed": false,
        "failed": false
    }
}

PLAY RECAP ***********************************************************************************************************************************************************************************************************
east-rtr   : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
west-rtr   : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

The playbook returns structured operational data on the BGP neighbors. This data can easily be used to build a report.
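
For instance, the registered data can be filtered down to just the neighbors that are not established. The tasks below are a minimal sketch and not part of the original playbook. They assume the bgp variable registered in TASK:1, and the fact name down_neighbors is hypothetical.

```yaml
# Sketch only: collect the IPs of all peers whose is_up flag is false.
- name: "BUILD LIST OF DOWN BGP NEIGHBORS"
  set_fact:
    down_neighbors: >-
      {{ bgp['ansible_facts']['napalm_bgp_neighbors']['global']['peers']
         | dict2items
         | rejectattr('value.is_up')
         | map(attribute='key')
         | list }}

# Print the list per router; an empty list means every neighbor is up.
- name: "PRINT DOWN BGP NEIGHBORS"
  debug:
    var: "down_neighbors"
```

With the facts shown above, this would yield a list containing 4.4.4.4 on east-rtr and an empty list on west-rtr.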

Step 3

Create a report of BGP neighbor status.

We can take the registered data from TASK:1 and pass it to the template module where a Jinja2 template can be used to create a report.

We will add TASK:2 to our PLAY.

- name: "TASK:2 - 'GENERATE REPORTS'"
  template:
    src: "./templates/bgp_report.j2"
    dest: "./build/{{ inventory_hostname }}.txt"
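
The template task writes into a build directory, and a later step writes into a reports directory, so both need to exist before the play runs. One way to guarantee that is a small task placed before the report task. This is a sketch rather than part of the original playbook, and it assumes relative paths resolve from the directory where ansible-playbook is run.

```yaml
# Sketch only: make sure the output directories exist on the control node.
- name: "CREATE REPORT DIRECTORIES"
  file:
    path: "{{ item }}"
    state: directory
  loop:
    - "./build"
    - "./reports"
  delegate_to: localhost  # the directories live on the control node
  run_once: true          # no need to repeat this for every router
```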

An example of a Jinja2 template can be seen below:

bgp_report.j2


Hostname: {{ inventory_hostname }}
-----------------
{% for neighbor, details in bgp["ansible_facts"]["napalm_bgp_neighbors"]["global"]["peers"].items() %}
Neighbor:           {{ neighbor }}
Enabled:            {{ details["is_enabled"] }}
Neighbor_UP:        {{ details["is_up"] }}
Accepted Prefixes:  {{ details['address_family']['ipv4 unicast']['accepted_prefixes'] }}
Received Prefixes:  {{ details['address_family']['ipv4 unicast']['received_prefixes'] }}
Sent Prefixes:      {{ details['address_family']['ipv4 unicast']['sent_prefixes'] }}

{% endfor %}

The Jinja2 template will render the structured data into a human-readable format.

Example:

Hostname: east-rtr
-----------------
Neighbor:          4.4.4.4
Enabled:           True
Neighbor_UP:       False
Accepted Prefixes: 0
Received Prefixes: 0
Sent Prefixes:     0

Step 4

With multiple devices in our inventory group, a file per device will be written. Parsing through multiple files can slow down the time to resolution; therefore, merging all these files together into one all-encompassing report will be done in the next task.

The Ansible assemble module will be used to merge all the reports together.

- name: "TASK:3 - ASSEMBLE REPORTING FROM HOST DETAILS"
  assemble:
    src: "./build"  # Directory with files to merge.
    dest: "./reports/report.txt"  # Merged output filename.
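
Because the merged report is a single file on the control node, this task only needs to run once rather than once per router. A possible variant is sketched below. It assumes the default linear strategy, so that every host has finished writing its own report before the merge starts.

```yaml
# Sketch only: merge the per-device reports a single time on the control node.
- name: "TASK:3 - ASSEMBLE REPORTING FROM HOST DETAILS"
  assemble:
    src: "./build"                # Directory with files to merge.
    dest: "./reports/report.txt"  # Merged output filename.
  delegate_to: localhost
  run_once: true
```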

Once TASK:3 executes, one report is generated with the following output:

Hostname: east-rtr
-----------------
Neighbor:          4.4.4.4
Enabled:           True
Neighbor_UP:       False
Accepted Prefixes: 0
Received Prefixes: 0
Sent Prefixes:     0

Hostname: west-rtr
-----------------
Neighbor:          8.8.8.8
Enabled:           True
Neighbor_UP:       True
Accepted Prefixes: 1
Received Prefixes: 1
Sent Prefixes:     11

Now a single easy-to-read file exists to look at neighbors. We see east-rtr has a BGP neighbor that is DOWN.

Step 5

Check whether any DOWN neighbor is reachable via ping.

- name: "TASK:4 - PING BGP NEIGHBORS THAT ARE DOWN"
  napalm_ping:
    hostname: "{{ inventory_hostname }}"
    username: "{{ ansible_user }}"
    password: "{{ ansible_password }}"
    dev_os: "{{ ansible_network_os }}"
    destination: "{{ item.key }}"
  loop: "{{ bgp['ansible_facts']['napalm_bgp_neighbors']['global']['peers'] | dict2items }}"
  when: "not item.value.is_up"
  register: neighbor_down

After the reachability check is completed, print the results for the DOWN neighbors.

- name: "TASK:5 - PRINT PING RESULTS FOR DOWN NEIGHBORS"
  debug:
    msg: "{{ item['ping_results'] }}"
  loop: "{{ neighbor_down['results'] }}"
  when: "item['ping_results'] is defined"

TASK:5 example output (all 5 probes sent were lost, so the neighbor is not reachable at Layer 3):

    "msg": {
        "success": {
            "packet_loss": 5,
            "probes_sent": 5,
            "results": [],
            "rtt_avg": 0.0,
            "rtt_max": 0.0,
            "rtt_min": 0.0,
            "rtt_stddev": 0.0
        }
    }
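
To avoid reading every ping result by eye, a follow-up task could flag only the neighbors that lost every probe. The sketch below is not part of the original playbook. It assumes the result structure shown in the example output above (a success key holding packet_loss and probes_sent) and skips any result that does not have that shape.

```yaml
# Sketch only: report DOWN neighbors that did not answer a single probe.
- name: "FLAG DOWN NEIGHBORS THAT ARE UNREACHABLE"
  debug:
    msg: "{{ item['item']['key'] }} is unreachable from {{ inventory_hostname }}"
  loop: "{{ neighbor_down['results'] }}"
  loop_control:
    label: "{{ item['item']['key'] }}"  # keep the loop output short
  when:
    - "item['ping_results'] is defined"
    - "item['ping_results']['success'] is defined"
    - "item['ping_results']['success']['packet_loss'] == item['ping_results']['success']['probes_sent']"
```

Each entry in a registered loop result carries the original loop item under the item key, which is how the neighbor IP is recovered here.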

Playbook Summary

Valuable troubleshooting data was gathered by running this playbook. A BGP neighbor is down on east-rtr. Details about all neighbors were also collected, including: enabled state, current neighbor state, and sent/received route counts. Finally, for any DOWN neighbors a reachability check using ping was performed. Most importantly, all this data was assembled across all our isp_routers in just seconds. This was still a simplified example with only two routers, but extrapolating this across tens, hundreds, or more routers is very powerful.

It is important to mention that additional tasks could be added to this playbook to troubleshoot further, for example:

Check the routing to the neighbor IP.
Grab the next-hop IP from the route entry.
Verify that the ARP table for the next-hop IP has a MAC entry.
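
The first of these checks could be sketched as follows. These tasks are not part of the original playbook. They reuse the bgp variable from PLAY:2, and the variable name route_check is hypothetical. Extracting the next-hop IP and checking the ARP table would require parsing the command output and is left out here.

```yaml
# Sketch only: look up the route toward each DOWN neighbor.
- name: "CHECK THE ROUTE TO EACH DOWN NEIGHBOR"
  ios_command:
    commands: "show ip route {{ item.key }}"
  loop: "{{ bgp['ansible_facts']['napalm_bgp_neighbors']['global']['peers'] | dict2items }}"
  when: "not item.value.is_up"
  register: "route_check"  # hypothetical variable name

# Print the raw route lookup for each neighbor that was checked.
- name: "PRINT ROUTE LOOKUP RESULTS"
  debug:
    msg: "{{ item['stdout'][0] }}"
  loop: "{{ route_check['results'] }}"
  loop_control:
    label: "{{ item['item']['key'] }}"
  when: "item['stdout'] is defined"
```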

Full Playbook

- name: "PLAY:1 - GET BGP SUMMARY"
  gather_facts: False
  connection: "network_cli"
  hosts: "isp_routers"
  tasks:
    - name: "TASK:1 - 'SHOW IP BGP SUMMARY'"
      ios_command:
        commands: "show ip bgp summary"
      register: "output_ios"
    - name: "TASK:2 - PRINT BGP OUTPUT"
      debug:
        msg: "{{ output_ios.stdout[0] }}"
- name: "PLAY:2 - USE NAPALM BGP FACTS"
  gather_facts: False
  connection: "network_cli"
  hosts: "isp_routers"
  tasks:
    - name: "TASK:1 - 'GET BGP FACTS'"
      napalm_get_facts:
        filter:
          - "bgp_neighbors"
      register: "bgp"
    - debug: var=bgp
    - name: "TASK:2 - 'GENERATE REPORT'"
      template:
        src: "./templates/bgp_report.j2"
        dest: "./build/{{ inventory_hostname }}.txt"
    - name: "TASK:3 - ASSEMBLE REPORTING FROM HOST DETAILS"
      assemble:
        src: "./build"
        dest: "./reports/report.txt"
    - name: "TASK:4 - PING BGP NEIGHBORS THAT ARE DOWN"
      napalm_ping:
        hostname: "{{ inventory_hostname }}"
        username: "{{ ansible_user }}"
        password: "{{ ansible_password }}"
        dev_os: "{{ ansible_network_os }}"
        destination: "{{ item['key'] }}"
      with_dict: "{{ bgp['ansible_facts']['napalm_bgp_neighbors']['global']['peers'] }}"
      when: "not item['value']['is_up']"
      register: "neighbor_down"
    - name: "TASK:5 - PRINT PING RESULTS FOR DOWN NEIGHBORS"
      debug:
        msg: "{{ item['ping_results'] }}"
      loop: "{{ neighbor_down['results'] }}"
      when: "item['ping_results'] is defined"

Conclusion

BGP troubleshooting is one of many operational playbooks that could be run when troubleshooting connectivity issues. Applying these same steps to other use cases can greatly improve MTTR on network issues and outages. Furthermore, these playbooks can be extended with a module that updates ITSM ticket notes, or reused as part of an existing daily network readiness check.

-Jeff

