From Beginner to Expert: How Code Evolves – Part 1

What separates a coding beginner from an expert? Is it sheer knowledge or something more? During our Network to Code University (NTCU) program, we watched beginners work through a challenge lab on Coderbyte (a platform for evaluating candidates’ coding), then compared their work with that of internal experts completing the same challenge. A surprising answer emerged: the gap wasn’t purely a matter of experience, but often came down to a handful of basic stumbling blocks. We saw firsthand how subtle differences in approach could lead to dramatically different outcomes. In this three-part series, I’ll share these observations from the NTCU program.

In this post, we will be starting with the often-overlooked fundamental issues that trip up beginners.

This is not intended to replace the many articles and blogs on beginner mistakes, but rather to offer a practical look at the approaches beginners and experts took to the exact same challenge.

Everything Comes from Something... Except When It Doesn’t

Examples taken from the internet will often presume that other code already exists. Let’s take two code snippets from Stack Overflow as an example. The question:

net_connect.send_command('save config')
net_connect.send_command('y')

and the answer:

net_connect.save_config() 

Running either of these code snippets on its own will result in a NameError.

>>> net_connect.send_command('save config')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'net_connect' is not defined
>>>

For beginners, this is a bit confusing. The confusion is compounded because, even though Netmiko has no function, class, or method called net_connect, the name is well understood by those familiar with the library. If you look at this sample code taken from one of Netmiko’s examples, you will see why.

#!/usr/bin/env python
from netmiko import ConnectHandler
from getpass import getpass

cisco1 = {
    "device_type": "cisco_ios",
    "host": "cisco1.lasthop.io",
    "username": "pyclass",
    "password": getpass(),
    "session_log": "output.txt",
}

# Show command that we execute
command = "show ip int brief"
with ConnectHandler(**cisco1) as net_connect:
    output = net_connect.send_command(command)

As you can see, net_connect is just a convention often used but not required. If you are not familiar with the code, it can be nearly impossible to understand the Stack Overflow answer, even if it directly addresses your question.

As the heading hints, there is an exception to the rule that everything comes from something. Let’s take a step back and talk about three categories of Python functions and libraries.

  1. Python Built-in Functions – These are the functions that are always loaded and available, and thus the exception of not being explicitly defined.
  2. Python Standard Library – These are libraries that come by default with Python, but are not automatically loaded.
  3. Third-Party Libraries – These are libraries that are outside of the core Python ecosystem. Generally speaking, they are hosted on PyPI.

If a variable is not one of the built-ins, it must always be defined earlier in the code. Whether the definition comes from a standard library import, a third-party library import, or a variable you defined yourself does not matter; it just must be there. With that in mind, if you get a NameError, it should be immediately obvious what to do: find where the name should have been defined or imported.

Two helpful tricks for finding out whether something is a built-in are to call dir on the __builtins__ variable, as in dir(__builtins__), and to use the globals() function to see all variables currently defined.

>>> globals()
{'__name__': '__main__', '__doc__': None, '__package__': None, '__loader__': <class '_frozen_importlib.BuiltinImporter'>, '__spec__': None, '__annotations__': {}, '__builtins__': <module 'builtins' (built-in)>}
>>> dir(__builtins__)
['AssertionError', 'EOFError', 'Exception', 'False', 'FileNotFoundError', 'IndexError', 'KeyError', 'None', 'OSError', 'RuntimeError', 'TypeError', 'ValueError', 'ZeroDivisionError', 'abs', 'bin', 'bool', 'callable', 'dict', 'dir', 'eval', 'exec', 'float', 'getattr', 'globals', 'hasattr', 'help', 'input', 'int', 'len', 'list', 'map', 'max', 'min', 'next', 'open', 'ord', 'pow', 'print', 'range', 'repr', 'round', 'set', 'sorted', 'str', 'sum', 'tuple', 'type', 'vars', 'zip', ...] # Truncated output
>>>
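The same check can be sketched in a script. This is a minimal illustration rather than a standard recipe: the helper name where_is_it_defined is made up for this example, and it relies only on the standard library builtins module and a namespace dictionary such as the one returned by globals().

```python
import builtins  # standard library: available by default, but must be imported


def where_is_it_defined(name, namespace):
    """Classify a name as defined in your code, built-in, or undefined.

    Hypothetical helper for illustration only.
    """
    if name in namespace:
        return "defined in your code (or imported)"
    if hasattr(builtins, name):
        return "built-in"
    return "not defined: using it raises NameError"


# len is always available; net_connect is not unless you create it first
print("len:", where_is_it_defined("len", globals()))
print("net_connect:", where_is_it_defined("net_connect", globals()))
```

Once the script assigns net_connect (for example, with ConnectHandler), the second call reports the name as defined.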

Using Input

So much of the beginner content involves interacting with a Python script directly. A common pattern to get data into the script is to use the input function, which stops the program and waits for a response from the user. Let’s take a look at an example found in Adrian Giacometti’s blog:

import netmiko
import sys

device_info = {
    "device_type": "cisco_ios",
    "username": "cisco",
    "password": "cisco",
    "secret": "cisco",
}

host = input('Enter host IP: ')
command = input('Enter command to run: ')
device_info['host'] = host

ssh_session = netmiko.ConnectHandler(**device_info)
ssh_session.enable()
result = ssh_session.send_command(command)
ssh_session.disconnect()

print(result)

Coding in this manner is super helpful to get started; there is no complication to worry about how to make this interactive. The person running this script intuitively knows how to answer the question, and they are off to the races. 

However, beginners who have mostly worked in a lab or run scripts themselves may not realize that this is generally not how automation runs in production environments. In fact, I have never used input in production, or even seen someone else do so.

It became apparent on the Coderbyte platform, which does not support interaction of any kind, that alternatives to input were unfamiliar to many. There are many other ways to get values into a script; to name a few:

  1. Statically defining within the script
  2. Sending in as command-line variables (e.g., sys.argv or argparse)
  3. Interacting with a Source of Truth

You should become familiar with all of these methods as you continue your network automation journey.
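As a rough sketch of the command-line option, the two input calls from the earlier example could be replaced with argparse from the standard library. The flag names --host and --command, the default command, and the parse_args helper are assumptions made for this illustration, not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Read the host and command from the command line instead of input()."""
    parser = argparse.ArgumentParser(description="Run one command on one device.")
    parser.add_argument("--host", required=True, help="Device IP or hostname")
    parser.add_argument("--command", default="show version", help="Command to run")
    # argv=None makes argparse read sys.argv; a list can be passed for testing
    return parser.parse_args(argv)


# The list stands in for: python script.py --host 10.10.10.250 --command "show ip int brief"
args = parse_args(["--host", "10.10.10.250", "--command", "show ip int brief"])

device_info = {"device_type": "cisco_ios", "host": args.host}
print(device_info["host"], "->", args.command)
```

Because nothing waits on a prompt, the script can run unattended from a scheduler or pipeline.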

Building Functions

I suspect that when someone learns from command snippets and code with pre-built functions, how functions work is not always clear. It’s hard to bundle up all of the reasons, but here are a few common observations:

  1. Not understanding that the data passed into the function is determined by its position in the argument list when the function is called, not by the variable’s name.
  2. Not understanding that a function used a variable defined outside of it, so the code worked because of lexical scoping, but not for the reason one expected.
  3. Not realizing that a function can be called multiple times with different arguments.

That may not mean much on its own, so let’s take a look at a few examples. 

1.

>>> def add_octet(base, fourth_octet):
...     return f"{base}.{fourth_octet}"
...
>>> base = "10.10.10"
>>> fourth_octet = "250"
>>> add_octet(fourth_octet, base)
'250.10.10.10'
>>>

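In this example, you can see why it is confusing. The variable names we use, especially in contrived lab and coding-challenge examples, often match the function’s parameter names, so it can seem like you are passing values by name. In reality, the first argument goes to base and the second to fourth_octet, whatever the variables are called. Even more confusingly, the following calls all work, because they use keyword arguments, which match values to parameters by name rather than by position.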

>>> add_octet(fourth_octet=fourth_octet, base=base)
'10.10.10.250'
>>> add_octet(fourth_octet="250", base="10.10.10")
'10.10.10.250'
>>> fourth_octet = "10.10.10"
>>> base = "250"
>>> add_octet(fourth_octet=base, base=fourth_octet)
'10.10.10.250'
>>>

If that doesn’t make sense, it’s worthwhile to spend some time in the Python interpreter yourself.

2. 

>>> site = "nyc01"
>>> if site.endswith("01"):
...   fourth_octet = 250
...
>>> def add_octet(base):
...     return f"{base}.{fourth_octet}"
...
>>> add_octet("10.10.10")
'10.10.10.250'
>>>

In this example, the function worked, but let’s try again in a fresh interpreter session with slightly different data.

>>> site = "nyc02"
>>> if site.endswith("01"):
...   fourth_octet = 250
...
>>> def add_octet(base):
...     return f"{base}.{fourth_octet}"
...
>>> add_octet("10.10.10")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in add_octet
NameError: name 'fourth_octet' is not defined
>>>

While both of these examples are a bit contrived, you can see why there are challenges. In general, these issues were hidden inside the logic of an if statement or a loop, so the code worked in some circumstances but not in others.

In general, a function should rely only on its parameters and on variables created inside the function, and those variables should only affect that function.
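One possible way to restructure the example above is to pass the octet in explicitly and to move the site logic into its own function. The helper fourth_octet_for and the fallback value "100" are assumptions added for this sketch so that every site produces a value.

```python
def fourth_octet_for(site):
    """Pick the fourth octet for a site (the values here are illustrative)."""
    if site.endswith("01"):
        return "250"
    return "100"  # assumed default, so the name is never left undefined


def add_octet(base, fourth_octet):
    """Depend only on the arguments, not on variables defined elsewhere."""
    return f"{base}.{fourth_octet}"


for site in ("nyc01", "nyc02"):
    print(site, add_octet("10.10.10", fourth_octet_for(site)))
```

With this shape, a site such as nyc02 gets an address instead of raising a NameError.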

3.

>>> def add_octet_250(base):
...   fourth_octet = "250"
...   return f"{base}.{fourth_octet}"
...
>>> def add_octet_100(base):
...   fourth_octet = "100"
...   return f"{base}.{fourth_octet}"
...
>>> base = "10.10.10"
>>> add_octet_250(base)
'10.10.10.250'
>>> add_octet_100(base)
'10.10.10.100'
>>>

You can see from this example that the reuse of code has not been mastered. Using the add_octet function from the first example, you can call add_octet("10.10.10", "250") or add_octet("10.10.10", "100") instead of writing one function per value.

Not Trusting the Stack Trace

Oftentimes, it seemed the number one response to a stack trace was to rerun the code. Now, don’t get me wrong, I am guilty of an occasional rage-peat, but after two runs I will at least adjust the code a bit before repeating again because, almost always, errors don’t go away on their own.

>>> device_info = {
...     "device_type": "cisco_ios",
...     "username": "cisco",
...     "password": "cisco",
...     "secret": "cisco",
... }
>>>
>>> device_info['user']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'user'
>>> device_info[user]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'user' is not defined. Did you mean: 'super'?
>>>

Rerunning code while insisting that “I did define user” or “I did define that variable” has cost every programmer hours of frustration. The minor misspellings we make are endless.

I have worked hard to presume that I am wrong and reframe my mind to say things like “There is clearly an AttributeError, now, why?”. This can be extremely frustrating, but I promise controlling your emotions and presuming you are wrong will save you a tremendous amount of time. 

Overall, you have to trust the stack trace. I have personally never seen a bug in Python itself. More often than not, the bugs I swear come from a third-party library turn out to be me using it wrong, not reading the documentation thoroughly enough, or, most commonly, making a simple spelling mistake.
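One way to act on a KeyError instead of rerunning the code is to compare the key the error names with the keys that actually exist. The helper get_setting below is a hypothetical name used only for this sketch.

```python
device_info = {
    "device_type": "cisco_ios",
    "username": "cisco",
    "password": "cisco",
    "secret": "cisco",
}


def get_setting(settings, key):
    """Look up a key; on failure, show what the traceback already reported."""
    try:
        return settings[key]
    except KeyError as err:
        # err.args[0] is the exact key Python could not find
        print(f"Missing key {err.args[0]!r}; available keys: {sorted(settings)}")
        return None


get_setting(device_info, "user")
```

Seeing 'user' next to 'username' in the printed list makes the misspelling hard to miss.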

Know the State

While using print can be moderately controversial compared to using proper logging or debugging with pdb, let’s take a look at this example of using print to better understand the state.

>>> sites = ["nyc", "sfo", "chi"]
>>> sorted_sites = sites.sort()
>>> if sorted_sites[0] == "chi":
...     core_in_scope = True
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not subscriptable
>>>

Now, I am supremely confident that “c” comes before “n” or “s”. I know that in this example “chi” has to be the 0th element. Of course, we learned in the previous section that the error is almost certainly right and my assumption is wrong, so what happened here?

>>> sites = ["nyc", "sfo", "chi"]
>>> sorted_sites = sites.sort()
>>> print(sorted_sites)
None
>>>

Well, color me shocked: how is sorted_sites None when it is used directly after being set? The important part here isn’t why, but rather accepting that it is, so that one can focus on troubleshooting.

In the interest of understanding why: the list method .sort() sorts the list in place and returns None, whereas the built-in function sorted() leaves the original list intact and returns a new sorted list. So this would work: sorted_sites = sorted(sites)
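The difference between the two is easy to confirm by printing the state after each call, using the same sites list as above:

```python
sites = ["nyc", "sfo", "chi"]

sorted_sites = sorted(sites)  # returns a new list; sites is unchanged
print(sorted_sites)           # ['chi', 'nyc', 'sfo']
print(sites)                  # ['nyc', 'sfo', 'chi']

result = sites.sort()         # sorts sites in place and returns None
print(result)                 # None
print(sites)                  # ['chi', 'nyc', 'sfo']
```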

Using Pandas

Pandas was a go-to tool for many of the potential hires. Pandas is a powerful tool, but often overkill for the scenario at hand. I suspect that Pandas is used so often because of the influence of a Computer Science background, but that is merely a hunch.

When it comes to manipulating data and ETL (Extract Transform Load) functionality, good old regex and split tend to be simpler to deal with given the type of data we often get (e.g., command-line text) in network automation.
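As a small sketch of the split-and-regex approach, the snippet below parses text shaped like "show ip int brief" output. The sample text is invented for illustration and abbreviated; real device output varies by platform, so the field positions are an assumption.

```python
import re

# Hypothetical, abbreviated command output used only for this example
output = """\
Interface              IP-Address      OK? Method Status                Protocol
GigabitEthernet0/0     10.10.10.250    YES manual up                    up
GigabitEthernet0/1     unassigned      YES unset  administratively down down
Loopback0              10.0.0.1        YES manual up                    up
"""

interfaces = []
for line in output.splitlines()[1:]:  # skip the header row
    fields = line.split()
    if len(fields) < 6:
        continue  # ignore blank or malformed lines
    interfaces.append(
        {
            "name": fields[0],
            "ip": fields[1],
            "status": " ".join(fields[4:-1]),  # status may be more than one word
            "protocol": fields[-1],
        }
    )

# Keep interfaces that have a dotted-quad address and are up
ipv4 = re.compile(r"\d+(\.\d+){3}")
up_with_ip = [i["name"] for i in interfaces if ipv4.fullmatch(i["ip"]) and i["status"] == "up"]
print(up_with_ip)
```

A plain list of dictionaries like this is often enough for the filtering that a challenge or a quick script needs.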

Pandas is a great tool, but make sure to choose the right tool for the job.  


Wrapping It Up

Many experts may forget the struggles of getting started, but for those earlier in their journey, hopefully these practical examples provide clarity on some of the most common issues. Obviously, the way we create our challenges emphasizes certain testable scenarios and sidesteps others, but each of these challenges offers helpful examples for beginners.

Over the next two blogs in this series, we will review what experts are doing and some key takeaways.

-Ken


