Blog Detail
One of the most important components of a network automation strategy is how to manage your data and/or Source of Truth (SoT). In networking, the data elements are often massive. As an example, most Fortune 500 type companies will have to manage a few hundred thousand to potentially millions of switchports, each with several data points (VLAN, MTU, IP, etc) per interface. The scale and criticality of this data present several challenges. The intention of this blog is describe the need for a versionable database in network automation and serve as an primer to a versionable database.
Problem Space
When managing and storing your Source of Truth data, there are two primary mechanisms, each with it’s own pros and cons.
Source Control (Git)
- Provides the ability to version data in such a way the exact state of the data at any point in the past is known.
- Provides the ability to know who is the owner of the data.
- Provides the ability to populate data in a staging area, without modifying the production data.
- Tooling integration with things such as CI Systems.
- Decentralized storage of the data.
Database
- Provides ACID-based transactions of the data.
- Has a native querying language to obtain, filter, and all around work with the data.
- Provides schema enforcement of the data.
- The ability to scale to large datasets.
There is a clear dichotomy between these two choices. On one hand there is great integrations with all standard DevOps tooling, on the other hand there is an enterprise-grade manager of the data.
History
For several years, a few of us have been searching for solutions that combine these two concepts. If one could build a database with Git constructs, this would allow versionable management of your data, with the ability to query, effectively store, and manage at scale. Through various searches it would seem that this has not been solved in any meaningful way. The closest I have come across was a project called NOMS, but it has limitations.
Recently a startup company called liquidata was founded and is creating a solution–a library called Dolt. Dolt is is written on top of the NOMS project, in Go. The intention is to support all Git semantics (such as branch, add, commit, etc..), as well as all MySQL semantics (such as insert, create, update, etc…).
By supporting all MySQL semantics, applications that use a MySQL server should be able to simply use a Dolt-SQL server with no disruption. Whether you are using an ODBC driver or raw SQL, in theory it should still work. You would naturally have to build in the capability for your application to take advantage of the Git capabilities or use traditional “Git-like” workflows.
By supporting all Git semantics, data should be able to be managed in a decentralized fashion using Git workflows to branch, add, diff, and merge the data. In a similar vein to GitHub, they have developed DoltHub, to provide that level of tooling expected in a standard Git user interface, such as forking, API’s, webhooks, and CI integrations.
Primer
In this brief introduction to the technology, we will create a Dolt data repository, called “simple-inventory” and run it on a dolt-sql server. If you care to follow along the only requirements are to have Docker and internet access.
Setup
Start the Docker container.
docker run -it --rm --name dolt golang:1.12.14
Install Dolt by running
curl -L https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/install.sh | bash
root@2a29f8a34dd3:/go# curl -L https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/install.sh | bash
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 601 0 601 0 0 2647 0 --:--:-- --:--:-- --:--:-- 2647
100 3038 100 3038 0 0 8392 0 --:--:-- --:--:-- --:--:-- 8392
Downloading: https://github.com/liquidata-inc/dolt/releases/download/v0.12.0/dolt-linux-amd64.tar.gz
Installing dolt, git-dolt and git-dolt-smudge to /usr/local/bin.
root@2a29f8a34dd3:/go#
You can create a Dolt data repository by creating a folder, and then initializing the repository. Just like Git, in Dolt, you need to add yourself to the Dolt config. If you are familiar with Git, these commands will be familiar, only changing the command from git
to dolt
, but keeping all of the other options the same.
root@2a29f8a34dd3:/go# mkdir simple-inventory
root@2a29f8a34dd3:/go# cd simple-inventory
root@2a29f8a34dd3:/go/simple-inventory# dolt config --global --add user.email "ken@celenza.org"
Config successfully updated.
root@2a29f8a34dd3:/go/simple-inventory# dolt config --global --add user.name "itdependsnetworks"
Config successfully updated.
root@2a29f8a34dd3:/go/simple-inventory# dolt init
Successfully initialized dolt data repository.
root@2a29f8a34dd3:/go/simple-inventory# dolt status
On branch master
nothing to commit, working tree clean
root@2a29f8a34dd3:/go/simple-inventory# dolt log
commit hc8gq0434ckkjeo36rccn55mo14gk87q
Author: itdependsnetworks <ken@celenza.org>
Date: Tue Jan 28 04:22:33 +0000 2020
Initialize data repository
root@2a29f8a34dd3:/go/simple-inventory#
Creating Tables
Ideally, if you are familiar with MySQL the only difference should be using Dolt engine instead of a MySQL server, everything should be the same. To enter the Dolt SQL server, simply use the command dolt sql
.
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
Create tables as you normally would, in this example, we will create two tables. The table device_inventory
will have the inventory of the devices, with columns for hostname and IP address. The table vlan
will have the hostname (with foreign key relationaship to the device_inventory
table) column, VLAN , and name of the VLAN.
Note: Though the foreign key relationaship syntax is defined, it is not a currently supported Dolt feature.
doltsql> CREATE TABLE device_inventory (
-> hostname varchar(32) NOT NULL,
-> ip_address varchar(15) NOT NULL,
-> primary key (`hostname`)
-> );
doltsql> CREATE TABLE vlan (
-> hostname varchar(32),
-> vlan int NOT NULL,
-> name varchar(32) NOT NULL,
-> PRIMARY KEY (hostname, vlan),
-> FOREIGN KEY (hostname) REFERENCES device_inventory(hostname)
-> );
doltsql> exit
Here is where it starts to get interesting–we can combine these two different concepts and see how they interact. So far, the data has not been committed into master, only staged into our local environment. We can view this by issuing standard Git-like commands when we return back to the command line from the dolt-sql server.
Run the
dolt status
,dolt diff
, anddolt diff -q
commands.
root@2a29f8a34dd3:/go/simple-inventory# dolt status
On branch master
Untracked files:
(use "dolt add <table>" to include in what will be committed)
new table: device_inventory
new table: vlan
root@2a29f8a34dd3:/go/simple-inventory# dolt diff
diff --dolt a/device_inventory b/device_inventory
added table
diff --dolt a/vlan b/vlan
added table
root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q
CREATE TABLE `device_inventory` (
`hostname` LONGTEXT NOT NULL COMMENT 'tag:0',
`ip_address` LONGTEXT NOT NULL COMMENT 'tag:1',
PRIMARY KEY (`hostname`)
);
CREATE TABLE `vlan` (
`hostname` LONGTEXT NOT NULL COMMENT 'tag:0',
`vlan` BIGINT NOT NULL COMMENT 'tag:1',
`name` LONGTEXT NOT NULL COMMENT 'tag:2',
PRIMARY KEY (`hostname`,`vlan`)
);
root@2a29f8a34dd3:/go/simple-inventory#
The dolt status
command shows that the schema has not “taken effect” by being merged into master. The dolt diff
with optional -q
flag, shows that a new table has been created. Note the additional tag
parameters, this is logic to track column names, which may change over time and cause name conflicts. The new schema can be committed into master.
Issue the commands
dolt add -a
,dolt commit -m 'initial schema'
, anddolt log
.
root@2a29f8a34dd3:/go/simple-inventory# dolt add -a
root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'initial schema'
commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l
Author: itdependsnetworks <ken@celenza.org>
Date: Tue Jan 28 04:22:41 +0000 2020
initial schema
root@2a29f8a34dd3:/go/simple-inventory# dolt log
commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l
Author: itdependsnetworks <ken@celenza.org>
Date: Tue Jan 28 04:22:41 +0000 2020
initial schema
commit hc8gq0434ckkjeo36rccn55mo14gk87q
Author: itdependsnetworks <ken@celenza.org>
Date: Tue Jan 28 04:22:33 +0000 2020
Initialize data repository
root@2a29f8a34dd3:/go/simple-inventory#
Awesome! The first piece of information of the database was committed.
Adding data
Defining schema is not all the valuable without the data. Again, using standard SQL constructs we can commit the data. Let’s use proper branching strategies this time.
Create a branch and Enter dolt-sql
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout -b intial_data
Switched to branch 'intial_data'
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
Add data
doltsql> INSERT INTO device_inventory (hostname, ip_address) VALUES ("nyc-sw01", "10.1.1.1");
+---------+
| updated |
+---------+
| 1 |
+---------+
doltsql> INSERT INTO device_inventory (hostname, ip_address) VALUES ("nyc-sw02", "10.1.1.2");
+---------+
| updated |
+---------+
| 1 |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw01", 10, "user");
+---------+
| updated |
+---------+
| 1 |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw01", 20, "printer");
+---------+
| updated |
+---------+
| 1 |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw01", 30, "wap");
+---------+
| updated |
+---------+
| 1 |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw02", 10, "user");
+---------+
| updated |
+---------+
| 1 |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw02", 20, "printer");
+---------+
| updated |
+---------+
| 1 |
+---------+
doltsql> INSERT INTO vlan (hostname, vlan, name) VALUES ("nyc-sw02", 30, "wap");
+---------+
| updated |
+---------+
| 1 |
+---------+
doltsql> exit
Bye
Now that we have staged the data, we can view the date in two different ways. The first is via raw sql entries, and the other is more akin to a unix diff.
View the diff. In the console, diffs are actually color coded green and red for add and remove respectively.
root@2a29f8a34dd3:/go/simple-inventory# dolt diff
diff --dolt a/device_inventory b/device_inventory
--- a/device_inventory @ tktob1spsoos6isfdj4o9benp9c57iic
+++ b/device_inventory @ b37haqr0n9j1e5bckj35sqt4kntojlp9
+-----+----------+------------+
| | hostname | ip_address |
+-----+----------+------------+
| + | nyc-sw01 | 10.1.1.1 |
| + | nyc-sw02 | 10.1.1.2 |
+-----+----------+------------+
diff --dolt a/vlan b/vlan
--- a/vlan @ 7f8lqlv2k9cpdth68kob9hl8atuejrpn
+++ b/vlan @ e2kovelv2sfjuo584o0v6hp76207k720
+-----+----------+------+---------+
| | hostname | vlan | name |
+-----+----------+------+---------+
| + | nyc-sw01 | 10 | user |
| + | nyc-sw01 | 20 | printer |
| + | nyc-sw01 | 30 | wap |
| + | nyc-sw02 | 10 | user |
| + | nyc-sw02 | 20 | printer |
| + | nyc-sw02 | 30 | wap |
+-----+----------+------+---------+
root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q
INSERT INTO `device_inventory` (`hostname`,`ip_address`) VALUES ("nyc-sw01","10.1.1.1");
INSERT INTO `device_inventory` (`hostname`,`ip_address`) VALUES ("nyc-sw02","10.1.1.2");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw01",10,"user");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw01",20,"printer");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw01",30,"wap");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw02",10,"user");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw02",20,"printer");
INSERT INTO `vlan` (`hostname`,`vlan`,`name`) VALUES ("nyc-sw02",30,"wap");
root@2a29f8a34dd3:/go/simple-inventory#
Personally, I’m pretty impressed with what is happening here, but there is still more to do to complete this workflow. The data needs to be committed, and merged from the “feature” branch into master branch.
Commit to the feature branch.
root@2a29f8a34dd3:/go/simple-inventory# dolt add -a
root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'initial data'
commit gbt6r21mtbd0cvs5iahog912essjhn0n
Author: itdependsnetworks <ken@celenza.org>
Date: Tue Jan 28 04:22:50 +0000 2020
initial data
root@2a29f8a34dd3:/go/simple-inventory#
Merge into the master branch.
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout master
Switched to branch 'master'
root@2a29f8a34dd3:/go/simple-inventory# dolt branch
intial_data
* master
root@2a29f8a34dd3:/go/simple-inventory# dolt merge intial_data
Updating 4n29eaqb1v9pmnpnda2r43fc91p1vp9l..gbt6r21mtbd0cvs5iahog912essjhn0n
Fast-forward
root@2a29f8a34dd3:/go/simple-inventory# dolt log
commit gbt6r21mtbd0cvs5iahog912essjhn0n
Author: itdependsnetworks <ken@celenza.org>
Date: Tue Jan 28 04:22:50 +0000 2020
initial data
commit 4n29eaqb1v9pmnpnda2r43fc91p1vp9l
Author: itdependsnetworks <ken@celenza.org>
Date: Tue Jan 28 04:22:41 +0000 2020
initial schema
commit hc8gq0434ckkjeo36rccn55mo14gk87q
Author: itdependsnetworks <ken@celenza.org>
Date: Tue Jan 28 04:22:33 +0000 2020
Initialize data repository
root@2a29f8a34dd3:/go/simple-inventory#
So far this is showing the “Create” part of standard CRUD operations is working, meaning we can add data to the repository.
Update Data
Naturally, as soon as data is entered, you will want to modify it. The process is the same to modify: branch, sql statements, add, commit, and merge.
Checkout a new branch, and enter Dolt SQL.
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout -b change_vlan
Switched to branch 'change_vlan'
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
Update the data.
doltsql> update vlan set name = "prnt" where name = "printer";
+---------+---------+
| matched | updated |
+---------+---------+
| 2 | 2 |
+---------+---------+
doltsql> exit
Bye
View the diff.
root@2a29f8a34dd3:/go/simple-inventory# dolt diff
diff --dolt a/vlan b/vlan
--- a/vlan @ bjg5ou111lmu03ciu6aisphf7m6jbk1r
+++ b/vlan @ 7f8lqlv2k9cpdth68kob9hl8atuejrpn
+-----+----------+------+---------+
| | hostname | vlan | name |
+-----+----------+------+---------+
| < | nyc-sw01 | 20 | printer |
| > | nyc-sw01 | 20 | prnt |
| < | nyc-sw02 | 20 | printer |
| > | nyc-sw02 | 20 | prnt |
+-----+----------+------+---------+
root@2a29f8a34dd3:/go/simple-inventory# dolt diff -q
UPDATE `vlan` SET `name`="prnt" WHERE (`hostname`="nyc-sw01" AND `vlan`=20);
UPDATE `vlan` SET `name`="prnt" WHERE (`hostname`="nyc-sw02" AND `vlan`=20);
root@2a29f8a34dd3:/go/simple-inventory#
A keen eye will note the captured update statements shown in the diff are not the same as the update statement inputted in the SQL server. This conversion makes sense, since the data could be merged on a different set of data that had more or less data in it. It also illustrates the complexity of building this technology.
Commit to a feature branch and merge to master.
root@2a29f8a34dd3:/go/simple-inventory# dolt add -a
root@2a29f8a34dd3:/go/simple-inventory# dolt commit -m 'update data'
commit jqe3vb84fhfunn7uapksdt5thb4rr7r7
Author: itdependsnetworks <ken@celenza.org>
Date: Tue Jan 28 04:23:03 +0000 2020
update data
root@2a29f8a34dd3:/go/simple-inventory# dolt checkout master
Switched to branch 'master'
root@2a29f8a34dd3:/go/simple-inventory# dolt merge change_vlan
Updating gbt6r21mtbd0cvs5iahog912essjhn0n..jqe3vb84fhfunn7uapksdt5thb4rr7r7
Fast-forward
root@2a29f8a34dd3:/go/simple-inventory# dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# "exit" or "quit" (or Ctrl-D) to exit.
doltsql>
View the data from dolt sql.
doltsql> select * from vlan;
+----------+------+------+
| hostname | vlan | name |
+----------+------+------+
| nyc-sw01 | 10 | user |
| nyc-sw01 | 20 | prnt |
| nyc-sw01 | 30 | wap |
| nyc-sw02 | 10 | user |
| nyc-sw02 | 20 | prnt |
| nyc-sw02 | 30 | wap |
+----------+------+------+
doltsql>
You can also view who is the owner of the data by tracking to the commit, which includes the auther, time, and commit hash metadata as well, using the dolt blame <table>
command.
root@2a29f8a34dd3:/go/simple-inventory# dolt blame vlan
+----------+------+--------------+-------------------+------------------------------+-------------------+
| HOSTNAME | VLAN | COMMIT MSG | AUTHOR | TIME | COMMIT |
+----------+------+--------------+-------------------+------------------------------+-------------------+
| nyc-sw01 | 30 | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw02 | 10 | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw02 | 20 | update data | itdependsnetworks | Tue Jan 28 04:23:03 UTC 2020 | jqe3vb84fhfunn7ua |
| nyc-sw02 | 30 | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw01 | 10 | initial data | itdependsnetworks | Tue Jan 28 04:22:50 UTC 2020 | gbt6r21mtbd0cvs5i |
| nyc-sw01 | 20 | update data | itdependsnetworks | Tue Jan 28 04:23:03 UTC 2020 | jqe3vb84fhfunn7ua |
+----------+------+--------------+-------------------+------------------------------+-------------------+
root@2a29f8a34dd3:/go/simple-inventory#
Conclusion
This is just a primer, and there is still a lot of work to be done to truly have feature parity with both MySQL and Git, but this is a great step in the right direction. I plan to continue to monitor and follow up with examples, use cases, and library updates over time. Specifically, next time, I want to extend the workflow to include DoltHub to review those capabilities as well.
-Ken
Contact Us to Learn More
Share details about yourself & someone from our team will reach out to you ASAP!