One of the most important components of a network automation strategy is how to manage your data and/or Source of Truth (SoT). In networking, the data sets are often massive. As an example, most Fortune 500 type companies will have to manage a few hundred thousand to potentially millions of switchports, each with several data points (VLAN, MTU, IP, etc.) per interface. The scale and criticality of this data present several challenges. The intention of this blog is to describe the need for a versionable database in network automation and to serve as a primer on one such database.
Problem Space
When managing and storing your Source of Truth data, there are two primary mechanisms, each with its own pros and cons.
Source Control (Git)
Provides the ability to version data in such a way the exact state of the data at any point in the past is known.
Provides the ability to know who is the owner of the data.
Provides the ability to populate data in a staging area, without modifying the production data.
Tooling integration with things such as CI Systems.
Database
Has a native querying language to obtain, filter, and all around work with the data.
Provides schema enforcement of the data.
The ability to scale to large datasets.
There is a clear dichotomy between these two choices. On one hand, there are great integrations with all standard DevOps tooling; on the other hand, there is an enterprise-grade manager of the data.
History
For several years, a few of us have been searching for solutions that combine these two concepts. If one could build a database with Git constructs, this would allow versionable management of your data, with the ability to query, effectively store, and manage at scale. Through various searches it would seem that this has not been solved in any meaningful way. The closest I have come across was a project called NOMS, but it has limitations.
Recently, a startup company called Liquidata was founded and is creating a solution: a library called Dolt. Dolt is written in Go, on top of the NOMS project. The intention is to support all Git semantics (such as branch, add, commit, etc.), as well as all MySQL semantics (such as insert, create, update, etc.).
By supporting all MySQL semantics, applications that use a MySQL server should be able to simply use a Dolt-SQL server with no disruption. Whether you are using an ODBC driver or raw SQL, in theory it should still work. You would naturally have to build in the capability for your application to take advantage of the Git capabilities or use traditional “Git-like” workflows.
By supporting all Git semantics, data should be able to be managed in a decentralized fashion using Git workflows to branch, add, diff, and merge the data. In a similar vein to GitHub, they have developed DoltHub to provide the level of tooling expected in a standard Git user interface, such as forking, APIs, webhooks, and CI integrations.
Primer
In this brief introduction to the technology, we will create a Dolt data repository called “simple-inventory” and work with it through the Dolt SQL shell. If you care to follow along, the only requirements are Docker and internet access.
Setup
Start the Docker container.
docker run -it --rm --name dolt golang:1.12.14
Install Dolt by running curl -L https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/install.sh | bash
root@2a29f8a34dd3:/go# curl -L https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/install.sh | bash
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   601    0   601    0     0   2647      0 --:--:-- --:--:-- --:--:--  2647
100  3038  100  3038    0     0   8392      0 --:--:-- --:--:-- --:--:--  8392
Downloading: https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/dolt-linux-amd64.tar.gz
Installing dolt, git-dolt and git-dolt-smudge to /usr/local/bin.
root@2a29f8a34dd3:/go#
You can create a Dolt data repository by creating a folder, and then initializing the repository. Just like Git, in Dolt, you need to add yourself to the Dolt config. If you are familiar with Git, these commands will be familiar, only changing the command from git to dolt, but keeping all of the other options the same.
root@2a29f8a34dd3:/go# mkdir simple-inventory
root@2a29f8a34dd3:/go# cd simple-inventory
root@2a29f8a34dd3:/go/simple-inventory# dolt config --global --add user.email "ken@celenza.org"
Config successfully updated.
root@2a29f8a34dd3:/go/simple-inventory# dolt config --global --add user.name "itdependsnetworks"
Config successfully updated.
root@2a29f8a34dd3:/go/simple-inventory# dolt init
Successfully initialized dolt data repository.
root@2a29f8a34dd3:/go/simple-inventory# dolt status
On branch master
nothing to commit, working tree clean
root@2a29f8a34dd3:/go/simple-inventory# dolt log
commit hc8gq0434ckkjeo36rccn55mo14gk87q
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:33 +0000 2020
    Initialize data repository
root@2a29f8a34dd3:/go/simple-inventory#
Creating Tables
Ideally, if you are familiar with MySQL, everything should feel the same; the only difference is that you are using the Dolt engine instead of a MySQL server. To enter the Dolt SQL shell, simply use the command dolt sql.
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
Create tables as you normally would. In this example, we will create two tables. The table device_inventory will hold the inventory of the devices, with columns for hostname and IP address. The table vlan will have columns for the hostname (with a foreign key relationship to the device_inventory table), the VLAN ID, and the name of the VLAN.
Note: Though the foreign key relationship syntax is accepted, foreign keys are not currently a supported Dolt feature.
doltsql> CREATE TABLE device_inventory (
      -> hostname varchar(32) NOT NULL,
      -> ip_address varchar(15) NOT NULL,
      -> primary key (`hostname`)
      -> );
doltsql> CREATE TABLE vlan (
      -> hostname varchar(32),
      -> vlan int NOT NULL,
      -> name varchar(32) NOT NULL,
      -> PRIMARY KEY (hostname, vlan),
      -> FOREIGN KEY (hostname) REFERENCES device_inventory(hostname)
      -> );
doltsql> exit
Here is where it starts to get interesting: we can combine these two different concepts and see how they interact. So far, the new tables have not been committed into master; they exist only in our local working set. We can view this by issuing standard Git-like commands when we return to the command line from the Dolt SQL shell.
Run the dolt status, dolt diff, and dolt diff -q commands.
root@2a29f8a34dd3:/go/simple-inventory# dolt status
On branch master
Untracked files:
  (use "dolt add <table>" to include in what will be committed)
        new table:      device_inventory
        new table:      vlan
root@2a29f8a34dd3:/go/simple-inventory# dolt diff
diff --dolt a/device_inventory b/device_inventory
added table
diff --dolt a/vlan b/vlan
added table
root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q
CREATE TABLE `device_inventory` (
  `hostname` LONGTEXT NOT NULL COMMENT 'tag:0',
  `ip_address` LONGTEXT NOT NULL COMMENT 'tag:1',
  PRIMARY KEY (`hostname`)
);
CREATE TABLE `vlan` (
  `hostname` LONGTEXT NOT NULL COMMENT 'tag:0',
  `vlan` BIGINT NOT NULL COMMENT 'tag:1',
  `name` LONGTEXT NOT NULL COMMENT 'tag:2',
  PRIMARY KEY (`hostname`,`vlan`)
);
root@2a29f8a34dd3:/go/simple-inventory#
The dolt status command shows that the schema has not “taken effect” by being committed into master. The dolt diff command shows that two new tables have been created, and with the optional -q flag it prints the equivalent SQL. Note the additional tag parameters: Dolt uses these to track columns, since column names may change over time and cause name conflicts. The new schema can now be committed into master.
Issue the commands dolt add -a, dolt commit -m 'initial schema', and dolt log.
root@2a29f8a34dd3:/go/simple-inventory# dolt add -a
root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'initial schema'
commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:41 +0000 2020
    initial schema
root@2a29f8a34dd3:/go/simple-inventory# dolt log
commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:41 +0000 2020
    initial schema
commit hc8gq0434ckkjeo36rccn55mo14gk87q
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:33 +0000 2020
    Initialize data repository
root@2a29f8a34dd3:/go/simple-inventory#
Awesome! The first piece of information in the database was committed.
Adding data
Defining a schema is not all that valuable without the data. Again, using standard SQL constructs, we can add the data. Let’s use a proper branching strategy this time.
Create a branch and enter the Dolt SQL shell.
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout -b intial_data
Switched to branch 'intial_data'
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
Add data
doltsql> INSERT INTO device_inventory (hostname, ip_address) VALUES ("nyc-sw01","10.1.1.1");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO device_inventory (hostname, ip_address) VALUES ("nyc-sw02","10.1.1.2");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw01",10,"user");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw01",20,"printer");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw01",30,"wap");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw02",10,"user");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw02",20,"printer");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw02",30,"wap");
+---------+
| updated |
+---------+
| 1       |
+---------+
doltsql> exit
Bye
Now that the data is in our working set, we can view the changes in two different ways. One is a table view more akin to a Unix diff, and the other is the raw SQL statements.
View the diff. In the console, diffs are actually color coded green and red for add and remove respectively.
root@2a29f8a34dd3:/go/simple-inventory# dolt diff
diff --dolt a/device_inventory b/device_inventory
--- a/device_inventory @ tktob1spsoos6isfdj4o9benp9c57iic
+++ b/device_inventory @ b37haqr0n9j1e5bckj35sqt4kntojlp9
+-----+----------+------------+
|     | hostname | ip_address |
+-----+----------+------------+
|  +  | nyc-sw01 | 10.1.1.1   |
|  +  | nyc-sw02 | 10.1.1.2   |
+-----+----------+------------+
diff --dolt a/vlan b/vlan
--- a/vlan @ 7f8lqlv2k9cpdth68kob9hl8atuejrpn
+++ b/vlan @ e2kovelv2sfjuo584o0v6hp76207k720
+-----+----------+------+---------+
|     | hostname | vlan | name    |
+-----+----------+------+---------+
|  +  | nyc-sw01 | 10   | user    |
|  +  | nyc-sw01 | 20   | printer |
|  +  | nyc-sw01 | 30   | wap     |
|  +  | nyc-sw02 | 10   | user    |
|  +  | nyc-sw02 | 20   | printer |
|  +  | nyc-sw02 | 30   | wap     |
+-----+----------+------+---------+
root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q
INSERT INTO `device_inventory` (`hostname`,`ip_address`) VALUES ("nyc-sw01","10.1.1.1");
INSERT INTO `device_inventory` (`hostname`,`ip_address`) VALUES ("nyc-sw02","10.1.1.2");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw01",10,"user");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw01",20,"printer");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw01",30,"wap");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw02",10,"user");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw02",20,"printer");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw02",30,"wap");
root@2a29f8a34dd3:/go/simple-inventory#
Personally, I’m pretty impressed with what is happening here, but there is still more to do to complete this workflow. The data needs to be committed on the “feature” branch (with dolt add -a and dolt commit -m 'initial data', in the same way as the schema; that step is not captured in the output below) and then merged into the master branch.
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout master
Switched to branch 'master'
root@2a29f8a34dd3:/go/simple-inventory# dolt branch
  intial_data
* master
root@2a29f8a34dd3:/go/simple-inventory# dolt merge intial_data
Updating 4n29eaqb1v9pmnpnda2r43fc91p1vp9l..gbt6r21mtbd0cvs5iahog912essjhn0n
Fast-forward
root@2a29f8a34dd3:/go/simple-inventory# dolt log
commit gbt6r21mtbd0cvs5iahog912essjhn0n
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:50 +0000 2020
    initial data
commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:41 +0000 2020
    initial schema
commit hc8gq0434ckkjeo36rccn55mo14gk87q
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:22:33 +0000 2020
    Initialize data repository
root@2a29f8a34dd3:/go/simple-inventory#
So far this is showing the “Create” part of standard CRUD operations is working, meaning we can add data to the repository.
Update Data
Naturally, as soon as data is entered, you will want to modify it. The process to modify is the same: branch, SQL statements, add, commit, and merge.
Check out a new branch, and enter the Dolt SQL shell.
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout -b change_vlan
Switched to branch 'change_vlan'
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
Update the data.
doltsql> update vlan set name = "prnt" where name = "printer";
+---------+---------+
| matched | updated |
+---------+---------+
| 2       | 2       |
+---------+---------+
doltsql> exit
Bye
View the diff.
root@2a29f8a34dd3:/go/simple-inventory# dolt diff
diff --dolt a/vlan b/vlan
--- a/vlan @ bjg5ou111lmu03ciu6aisphf7m6jbk1r
+++ b/vlan @ 7f8lqlv2k9cpdth68kob9hl8atuejrpn
+-----+----------+------+---------+
|     | hostname | vlan | name    |
+-----+----------+------+---------+
|  <  | nyc-sw01 | 20   | printer |
|  >  | nyc-sw01 | 20   | prnt    |
|  <  | nyc-sw02 | 20   | printer |
|  >  | nyc-sw02 | 20   | prnt    |
+-----+----------+------+---------+
root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q
UPDATE `vlan` SET `name`="prnt" WHERE (`hostname`="nyc-sw01" AND `vlan`=20);
UPDATE `vlan` SET `name`="prnt" WHERE (`hostname`="nyc-sw02" AND `vlan`=20);
root@2a29f8a34dd3:/go/simple-inventory#
A keen eye will note that the update statements captured in the diff are not the same as the update statement entered in the SQL shell. The diff records one statement per changed row, identified by its primary key (hostname and vlan). This conversion makes sense, since the change could later be merged into a branch where more or fewer rows match the original WHERE clause. It also illustrates the complexity of building this technology.
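To make the idea of a row-level diff concrete, here is a minimal sketch of how two snapshots of a table could be compared by primary key and turned into per-row SQL statements. This is an illustration only and not how Dolt is implemented; the function diff_to_sql and its simplified value quoting are assumptions made for the example.

```python
def diff_to_sql(table, key_cols, before, after):
    """Emit one SQL statement per changed row, matching rows by primary key.

    `before` and `after` are lists of dicts mapping column name to value.
    """

    def key_of(row):
        return tuple(row[c] for c in key_cols)

    def literal(value):
        # Simplified quoting for the sketch; real code should use parameters.
        return str(value) if isinstance(value, int) else f'"{value}"'

    def where_clause(row):
        return " AND ".join(f"`{c}`={literal(row[c])}" for c in key_cols)

    old = {key_of(r): r for r in before}
    new = {key_of(r): r for r in after}
    statements = []

    for key in sorted(new):
        row = new[key]
        if key not in old:
            cols = ",".join(f"`{c}`" for c in row)
            vals = ",".join(literal(v) for v in row.values())
            statements.append(f"INSERT INTO `{table}` ({cols}) VALUES ({vals});")
        elif row != old[key]:
            # Only the columns whose values differ go into the SET clause.
            changed = ",".join(
                f"`{c}`={literal(v)}" for c, v in row.items() if old[key][c] != v
            )
            statements.append(
                f"UPDATE `{table}` SET {changed} WHERE ({where_clause(row)});"
            )

    for key in sorted(old):
        if key not in new:
            statements.append(
                f"DELETE FROM `{table}` WHERE ({where_clause(old[key])});"
            )
    return statements


before = [
    {"hostname": "nyc-sw01", "vlan": 20, "name": "printer"},
    {"hostname": "nyc-sw02", "vlan": 20, "name": "printer"},
    {"hostname": "nyc-sw02", "vlan": 30, "name": "wap"},
]
# Apply the same logical change as the walkthrough: printer becomes prnt.
after = [dict(r, name="prnt") if r["name"] == "printer" else r for r in before]

statements = diff_to_sql("vlan", ["hostname", "vlan"], before, after)
for statement in statements:
    print(statement)
# UPDATE `vlan` SET `name`="prnt" WHERE (`hostname`="nyc-sw01" AND `vlan`=20);
# UPDATE `vlan` SET `name`="prnt" WHERE (`hostname`="nyc-sw02" AND `vlan`=20);
```

Because each statement names its row by primary key, replaying the statements on another branch touches exactly the rows that changed here, regardless of what else that branch contains.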
Commit to a feature branch and merge to master.
root@2a29f8a34dd3:/go/simple-inventory# dolt add -a
root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'update data'
commit jqe3vb84fhfunn7uapksdt5thb4rr7r7
Author: itdependsnetworks <ken@celenza.org>
Date:   Tue Jan 28 04:23:03 +0000 2020
    update data
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout master
Switched to branch 'master'
root@2a29f8a34dd3:/go/simple-inventory# dolt merge change_vlan
Updating gbt6r21mtbd0cvs5iahog912essjhn0n..jqe3vb84fhfunn7uapksdt5thb4rr7r7
Fast-forward
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
View the data from dolt sql.
doltsql> select * from vlan;
+----------+------+------+
| hostname | vlan | name |
+----------+------+------+
| nyc-sw01 | 10   | user |
| nyc-sw01 | 20   | prnt |
| nyc-sw01 | 30   | wap  |
| nyc-sw02 | 10   | user |
| nyc-sw02 | 20   | prnt |
| nyc-sw02 | 30   | wap  |
+----------+------+------+
doltsql>
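The branch, SQL, add, commit, and merge cycle used in this section lends itself to scripting. The sketch below builds the command sequence in Python and, optionally, runs it with subprocess. The helper names change_commands and apply_change are hypothetical, and the -q flag on dolt sql (to run a single statement without entering the interactive shell) is an assumption, since the walkthrough only uses the interactive shell; check dolt sql --help for your version.

```python
import subprocess


def change_commands(branch, sql, message, base="master"):
    """Return the dolt commands for one change, in the order they should run."""
    return [
        ["dolt", "checkout", "-b", branch],
        # Assumption: -q runs one statement non-interactively.
        ["dolt", "sql", "-q", sql],
        ["dolt", "add", "-a"],
        ["dolt", "commit", "-m", message],
        ["dolt", "checkout", base],
        ["dolt", "merge", branch],
    ]


def apply_change(repo_dir, branch, sql, message):
    """Run the commands inside the Dolt repository, stopping on the first failure."""
    for argv in change_commands(branch, sql, message):
        subprocess.run(argv, cwd=repo_dir, check=True)


commands = change_commands(
    "change_vlan",
    'update vlan set name = "prnt" where name = "printer";',
    "update data",
)
for argv in commands:
    print(" ".join(argv))
```

Keeping the command list separate from the code that executes it makes the sequence easy to review or log before anything touches the repository. In practice a review step, such as inspecting dolt diff, would likely sit between the commit and the merge.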
You can also see who owns the data with the dolt blame <table> command, which traces each row to the commit that last changed it and shows the commit message, author, time, and commit hash.
root@2a29f8a34dd3:/go/simple-inventory# dolt blame vlan
+----------+------+--------------+-------------------+------------------------------+-------------------+
| HOSTNAME | VLAN | COMMIT MSG   | AUTHOR            | TIME                         | COMMIT            |
+----------+------+--------------+-------------------+------------------------------+-------------------+
| nyc-sw01 | 30   | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw02 | 10   | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw02 | 20   | update data  | itdependsnetworks | Tue Jan 28 04:23:03 UTC 2020 | jqe3vb84fhfunn7ua |
| nyc-sw02 | 30   | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw01 | 10   | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw01 | 20   | update data  | itdependsnetworks | Tue Jan 28 04:23:03 UTC 2020 | jqe3vb84fhfunn7ua |
+----------+------+--------------+-------------------+------------------------------+-------------------+
root@2a29f8a34dd3:/go/simple-inventory#
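Conceptually, a row-level blame walks the commit history and records, for each row, the most recent commit in which that row was added or changed. The sketch below illustrates the idea on in-memory snapshots; it is not Dolt's implementation, the blame function is hypothetical, and commit messages stand in for commit hashes.

```python
def blame(history, key_cols):
    """Map each current row's primary key to the commit that last changed it.

    `history` is a list of (commit, rows) pairs ordered oldest to newest,
    where `rows` is the full table content at that commit.
    """

    def key_of(row):
        return tuple(row[c] for c in key_cols)

    last_changed = {}
    previous = {}
    for commit, rows in history:
        current = {key_of(r): r for r in rows}
        for key, row in current.items():
            # A row is attributed to this commit if it is new or modified.
            if previous.get(key) != row:
                last_changed[key] = commit
        # Rows deleted in this commit no longer appear in the blame output.
        for key in list(last_changed):
            if key not in current:
                del last_changed[key]
        previous = current
    return last_changed


initial = [
    {"hostname": host, "vlan": vlan, "name": name}
    for host in ("nyc-sw01", "nyc-sw02")
    for vlan, name in ((10, "user"), (20, "printer"), (30, "wap"))
]
updated = [dict(r, name="prnt") if r["name"] == "printer" else r for r in initial]

history = [
    ("initial schema", []),
    ("initial data", initial),
    ("update data", updated),
]
result = blame(history, ["hostname", "vlan"])
for key in sorted(result):
    print(key, result[key])
# ('nyc-sw01', 10) initial data
# ('nyc-sw01', 20) update data
# ('nyc-sw01', 30) initial data
# ('nyc-sw02', 10) initial data
# ('nyc-sw02', 20) update data
# ('nyc-sw02', 30) initial data
```

The attribution matches the dolt blame output above: only the two VLAN 20 rows point at the “update data” commit.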
Conclusion
This is just a primer, and there is still a lot of work to be done to truly have feature parity with both MySQL and Git, but this is a great step in the right direction. I plan to continue to monitor and follow up with examples, use cases, and library updates over time. Specifically, next time, I want to extend the workflow to include DoltHub to review those capabilities as well.
Does this all sound amazing? Want to know more about how Network to Code can help you do this? Reach out to our sales team. If you want to help make this a reality for our clients, check out our careers page.