Python // Misc: Netmiko and what constitutes “done”

Note, in order to reproduce the results specified here, you will need to use Netmiko >= 2.2.2.

Screen-scraping suffers from two fundamental problems–the first is that screen-scraping returns unstructured data (i.e. a big, block of text) and the second is that the communication channel has no good notion of being done. While people frequently bring up the former issue, the latter is probably a bigger problem.

Let’s discuss the second issue in some more detail (and what it implies).

The communication channel has no good notion of being done

What does this statement mean? It means that when you send a command down the SSH channel, there is no clear indicator that your command has completed. The general practice for screen-scraping is to search for the trailing router prompt and to use this to indicate “done”.

Let’s look at a fairly typical Netmiko send_command() call:

#!/usr/bin/env python
from netmiko import ConnectHandler
from getpass import getpass

host = input("Enter your hostname: ")
device = { 
    'device_type': 'cisco_ios',
    'host': host,
    'username': 'pyclass',
    'password': getpass,
} 

net_connect = ConnectHandler(**device)
output = net_connect.send_command("show ip int brief")
print(output)

What does send_command() do here (to determine that the SSH channel is done)?

By default send_command() will first try to find the current device prompt using the find_prompt() method. The reason for this is that network device prompts frequently change during an SSH session (enable mode, configuration mode, et cetera). This implies Netmiko will send a newline down the SSH channel before executing the command.

Using Pdb() in the Netmiko code, let’s inspect this:

(Pdb) p search_pattern
'pynet\-rtr1\#'


This is from inside of the send_command() method and shortly after the call to find_prompt(). Netmiko is going to search for the trailing router prompt (the backslash escaping of the prompt happens automatically so that this pattern can be used in a regular expression).


Now there are multiple things that can go wrong here…

  1. Sending a newline right before the command might not work (for example, you might be in a context that doesn’t allow another ‘enter’ to be executed),
  2. Even if find_prompt() works, the command we are executing might not directly return a trailing prompt (for example, a command that prompts for additional information),
  3. Even if our search pattern is entirely correct how long should we wait for the trailing prompt to be returned: 10 seconds, a minute, an hour?



Let’s look at this some more:

output = net_connect.send_command("show ip int brief")


We send the above command down the SSH channel and it will work successfully (as we successfully match the trailing router prompt). Note, I have added some minor code to track the code execution time and to make the output printing cleaner.

$ python test2_send_command.py 
Enter your hostname: test1.twb-tech.com
Password: 


############################################################
Interface           IP-Address      OK? Method Status   Protocol
FastEthernet0       unassigned      YES unset  down     down    
FastEthernet1       unassigned      YES unset  down     down    
FastEthernet2       unassigned      YES unset  down     down    
FastEthernet3       unassigned      YES unset  down     down    
FastEthernet4       10.220.88.20    YES NVRAM  up       up      
Vlan1               unassigned      YES unset  down     down    
############################################################


Total time: 0:00:05.922922



This SSH login process and show command took about 6 seconds to execute (but it did execute successfully and we found the trailing router prompt).



Now what about executing this second command on the CLI (this is for a Cisco IOS router)?

copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin


The output of this command looks as follows (on the router’s CLI):

# copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin
Destination filename [test1.bin]? 
Copy in progress...CCCCCCCCCCCCCCCCCCCC
...   [large number of C's omitted]
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
42628912 bytes copied in 143.676 secs (296702 bytes/sec)


What are the issues with executing this copy command?

First of all the command prompt us after we hit ‘enter’. The router asks us, “is the destination filename really test1.bin” and we need to respond to this.

Secondly, the command starts outputting a bunch of ‘C’ characters (hundreds of them).

Thirdly, the command takes a long time to complete (over two minutes).



What happens if we try to execute this copy command using Netmiko’s send_command().

net_connect = ConnectHandler(**device)
cmd = 'copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin'
output = net_connect.send_command(cmd)


And then executing this:

$ python test3_send_command_copy.py 
Traceback (most recent call last):
  File "test3_send_command_copy.py", line 16, in <module>
    output = net_connect.send_command(cmd)
  File "/home/gituser/netmiko/netmiko/base_connection.py", line 1072, in send_command
    search_pattern))
OSError: Search pattern never detected in send_command_expect: pynet\-rtr1\#


Netmiko sent the command and the trailing router prompt never came back (because the router was stuck waiting for us to confirm the filename).




How can we handle this?

There are a few different ways in Netmiko to handle this, one of them is to use the ‘expect_string’ argument to send_command(). The ‘expect_string’ argument tells Netmiko what to look for in the output.


The updated code now looks as follows:

starttime_time = datetime.now()
net_connect = ConnectHandler(**device)
cmd = 'copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin'
output = net_connect.send_command(
    cmd, 
    expect_string=r'Destination filename'
)
try:
    output += net_connect.send_command('\n', expect_string=r'#')
except:
    end_time = datetime.now()
    print("Total time: {}".format(end_time - start_time))
    raise


Note, I included the exception handling and start_time/end_time code so that I can track how long the code runs.


Here is the output that we receive when we execute this:

$ python test3_send_command_copy.py 
Enter your hostname: cisco1.twb-tech.com
Password: 
Total time: 0:01:46.065829
Traceback (most recent call last):
  File "test3_send_command_copy.py", line 19, in <module>
    output += net_connect.send_command('\n', expect_string=r'#')
  File "/home/gituser/netmiko/netmiko/base_connection.py", line 1072, in send_command
    search_pattern))
OSError: Search pattern never detected in send_command_expect: #



Netmiko waited 106 seconds for the command to complete.

Under the hood, the send_command() method defaults to a timeout of roughly 100 seconds; the additional 6 seconds is additional overhead time (the connection setup, disabling of output paging, and additional delay at the beginning of send_command).

Now this timeout makes sense…as we saw that our earlier manual copy required over 143 seconds and Netmiko’s send_command() only waits for 100 seconds. 




How do we tell Netmiko to wait longer? 

There are two main ways to do this: delay_factor and global_delay_factor.


delay_factor is an argument that we can pass into send_command(); its effect is to modify all of the delays embedded in that method (for example, a delay_factor=2 will double all of the delays; a delay_factor=4 will quadruple all of the delays).

Consequently, if our command requires 143 seconds to complete, we can set delay_factor=2. This will cause send_command to wait at least 200 seconds before the command times out. Note, Netmiko will still end the loop earlier if the expect_string is detected.


Our updated code now looks as follows:

start_time = datetime.now()
net_connect = ConnectHandler(**device)
cmd = 'copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin'
output = net_connect.send_command(
    cmd, 
    expect_string=r'Destination filename'
)
output += net_connect.send_command(
    '\n', 
    expect_string=r'#', 
    delay_factor=2
)
end_time = datetime.now()


And executing it:

$ python test3_send_command_copy.py

Enter your hostname: test1.twb-tech.com
Password: 

############################################################
Destination filename [test1.bin]? Copy in progress...CCCCC
....
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
42628912 bytes copied in 146.256 secs (291468 bytes/sec)
############################################################

Total time: 0:02:32.690754 


…the copy completed successfully.




Now what about global_delay_factor?

global_delay_factor also let’s you increase the delays in your program, but it essentially increases ALL of the delays in your program.

When would you want to use global_delay_factor and not just delay_factor? Occasionally you run into issues where the SSH session preparation fails (for example, due to the network device being too slow). In this situation, you generally need to tell Netmiko to slow down.


Let’s update our code and use global_delay_factor:

#!/usr/bin/env python
from datetime import datetime
from netmiko import ConnectHandler
from getpass import getpass

host = input("Enter your hostname: ")
device = { 
    'device_type': 'cisco_ios',
    'host': host,
    'username': 'pyclass',
    'password': getpass(),
    'global_delay_factor': 2,
} 

start_time = datetime.now()
net_connect = ConnectHandler(**device)
cmd = 'copy flash:/c880data-universalk9-mz.154-2.T1.bin flash:/test1.bin'
output = net_connect.send_command(
    cmd, 
    expect_string=r'Destination filename'
)
output += net_connect.send_command('\n', expect_string=r'#')
end_time = datetime.now()
 
print("\n")
print("#" * 60) 
print(output)
print("#" * 60) 
print("\n")
print("Total time: {}".format(end_time - start_time))
print("\n")

And executing this yields:

$ python test4_global_delay.py 
Enter your hostname: test1.twb-tech.com
Password: 

############################################################
Destination filename [test1.bin]? Copy in progress...CCCCCCCCCCCCCCC
...
CCCCCCCCCCCCCCCCCCCCCC
42628912 bytes copied in 142.880 secs (298355 bytes/sec)
############################################################

Total time: 0:02:33.498948 


Once again the file copied successfully. This time we used global_delay_factor=2 and it took marginally longer to execute when compared to using delay_factor=2 (about a second). The dominant factor for this script to execute is the time it takes to execute the copy operation.


 

Let’s look at a more extreme example of slowing down a script. 

What if we go back to our initial send_command(“show ip int brief”) script (that took about 6 seconds to execute) and this time we set the global_delay_factor equal to 4.

device = {
    'device_type': 'cisco_ios',
    'host': host,
    'username': 'pyclass',
    'password': getpass(),
    'global_delay_factor': 4,
} 

start_time = datetime.now()
net_connect = ConnectHandler(**device)
output = net_connect.send_command("show ip int brief")
end_time = datetime.now() 


If we re-execute this script, it now takes over 19 seconds to complete:

$ python test5_send_command_global.py
Enter your hostname: test1.twb-tech.com
Password:

############################################################
Interface          IP-Address      OK? Method Status    Protocol
FastEthernet0      unassigned      YES unset  down      down    
FastEthernet1      unassigned      YES unset  down      down    
FastEthernet2      unassigned      YES unset  down      down    
FastEthernet3      unassigned      YES unset  down      down    
FastEthernet4      10.220.88.20    YES NVRAM  up        up      
Vlan1              unassigned      YES unset  down      down    
############################################################

Total time: 0:00:19.672448




Now one additional item worth noting…

What happens if you set both delay_factor and global_delay_factor at the same time?

By default the larger delay factor will win (so whichever value is the larger of the two will win out). In other words, whichever one slows you down the most will be selected.