Debugging Resources
If only we wrote perfect code from the get-go…
This section describes in detail the AEtest built-in tools that may be useful in your quest to zap those pesky bugs. Keep in mind that there may be more than one way to debug something: the following tools supplement the existing, publicly known ways of debugging Python code.
Hint
There is no magic: the AEtest engine is developed purely in Python.
PDB & Multiprocessing
Python’s native interactive source code debugger, pdb, is quite powerful when it comes to zapping bugs: setting breakpoints, stepping through code, inspecting stack frames and such. This documentation does not cover its use: for all intents and purposes, read the official documentation: The Python Debugger.
However, one thing to note is that pdb does not play well with Python’s multiprocessing module. When multiprocessing forks a child process, it closes stdin as part of the default child process bootstrap. Thus, as pdb relies on stdin to provide an interactive session, it crashes with the following cryptic exception:
Traceback (most recent call last):
  File "/auto/pysw/cel63/python/3.4.1/lib/python3.4/multiprocessing/process.py", line 254, in _bootstrap
    self.run()
  File "/auto/pysw/cel63/python/3.4.1/lib/python3.4/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "<stdin>", line 2, in func
  File "/auto/pysw/cel63/python/3.4.1/lib/python3.4/bdb.py", line 52, in trace_dispatch
    return self.dispatch_return(frame, arg)
  File "/auto/pysw/cel63/python/3.4.1/lib/python3.4/bdb.py", line 96, in dispatch_return
    if self.quitting: raise BdbQuit
bdb.BdbQuit
This error is typically seen when you use pdb in a script that runs under Easypy Execution (as a Task/subprocess), forcing you to debug the debugger for hours…
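The stdin behaviour described above can be observed without pdb at all. The following is a minimal sketch, assuming a POSIX system where the fork start method is available; the helper names `_read_stdin` and `child_stdin_contents` are made up for this illustration and are not part of AEtest or Easypy.

```python
import multiprocessing
import sys


def _read_stdin(queue):
    # Runs in the child process: report what reading sys.stdin yields there.
    queue.put(sys.stdin.read())


def child_stdin_contents():
    """Fork a child and return whatever it could read from its stdin."""
    ctx = multiprocessing.get_context('fork')   # assumption: POSIX only
    queue = ctx.Queue()
    child = ctx.Process(target=_read_stdin, args=(queue,))
    child.start()
    contents = queue.get(timeout=10)
    child.join()
    return contents


if __name__ == '__main__':
    # An empty string means the child hit end-of-file immediately, so an
    # interactive debugger in that child has nothing to read commands from.
    print(repr(child_stdin_contents()))
```

The child never sees the parent's terminal input, which is why an interactive pdb session started there quits right away.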
Luckily, there are a few known ways to work around this problem:
Open /dev/stdin manually in the child process before launching pdb, or use the following ForkedPdb class:
import sys
import pdb

class ForkedPdb(pdb.Pdb):
    '''A Pdb subclass that may be used from a forked multiprocessing child
    '''
    def interaction(self, *args, **kwargs):
        _stdin = sys.stdin
        try:
            # re-attach the terminal before the interactive session starts
            sys.stdin = open('/dev/stdin')
            pdb.Pdb.interaction(self, *args, **kwargs)
        finally:
            # restore whatever stdin the child had before
            sys.stdin = _stdin

# to use:
ForkedPdb().set_trace()
Use the Easypy Task argument pdb = True, which automatically opens /dev/stdin in the given task process and also turns on PDB on Failure.
Use a multiprocess-aware debugger like Winpdb.
Any other creative methods you may cook up :-)
Hint
StackOverflow is your friend: http://stackoverflow.com/questions/4716533
PDB on Failure
pdb is golden, but putting pdb everywhere in your scripts and libraries trying to catch a runaway bug is… not pretty. There are times when you can’t seem to figure out what, where, and why an exception is raised, why your script is failing, and where to put import pdb; pdb.set_trace()…
PDB on Failure to the rescue!
When AEtest scripts are run with the flag pdb = True, whenever an error, failure, or exception is encountered during execution, the testing engine pauses and starts an interactive post_mortem debugging session at the stack frame of the failure.
Tip
In addition to pdb = True, pyATS debugging supports any debugger that has post_mortem capabilities. Want to use pudb? Set pdb = 'pudb'. Want to use web-pdb? Set pdb = 'web_pdb'.
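To make the term post_mortem concrete, here is a minimal sketch of the general technique using only the standard library. It is not the AEtest implementation: `run_with_post_mortem` is a hypothetical helper, and the `post_mortem` parameter merely stands in for whichever debugger's post_mortem function is selected.

```python
import pdb
import sys


def run_with_post_mortem(func, post_mortem=pdb.post_mortem):
    """Call func(); on an exception, open a debugger at the failing frame.

    Hypothetical helper for illustration. Any callable that accepts a
    traceback object can be passed as post_mortem.
    """
    try:
        return func()
    except Exception:
        traceback = sys.exc_info()[2]
        post_mortem(traceback)      # interactive session starts here
        raise                       # let the failure propagate afterwards


# Interactive usage (commented out so that importing this file never blocks):
#
#     def test():
#         blablabla()               # NameError, as in the example that follows
#
#     run_with_post_mortem(test)
```

Because the debugger receives the traceback rather than the current frame, the session opens where the exception was raised, not where it was caught.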
# Example
# -------
#
# PDB on Failure demonstration
from pyats import aetest

class CommonSetup(aetest.CommonSetup):
    @aetest.subsection
    def subsection(self):
        pass

class TestcaseOne(aetest.Testcase):
    @aetest.setup
    def setup(self):
        pass

    @aetest.test
    def test(self):
        # raise an exception by calling something that doesn't exist
        blablabla()

    @aetest.cleanup
    def cleanup(self):
        pass

class CommonCleanup(aetest.CommonCleanup):
    @aetest.subsection
    def subsection(self):
        pass

aetest.main(pdb = True)
# output result:
#
# +------------------------------------------------------------------------------+
# | Starting common setup |
# +------------------------------------------------------------------------------+
# +------------------------------------------------------------------------------+
# | Starting subsection subsection |
# +------------------------------------------------------------------------------+
# The result of subsection subsection is => PASSED
# The result of common setup is => PASSED
# +------------------------------------------------------------------------------+
# | Starting testcase TestcaseOne |
# +------------------------------------------------------------------------------+
# +------------------------------------------------------------------------------+
# | Starting section setup |
# +------------------------------------------------------------------------------+
# The result of section setup is => PASSED
# +------------------------------------------------------------------------------+
# | Starting section test |
# +------------------------------------------------------------------------------+
# Caught exception during execution:
# Traceback (most recent call last):
# File "a.py", line 16, in test
# blablabla()
# NameError: name 'blablabla' is not defined
# > example_pdb_on_failure.py(16)test()
# > blablabla()
# (Pdb)
#
The interactive debugging session starts with the (Pdb) prompt and waits for user input. Refer to Debugger Commands for details on how to use the pdb debugger.
You can also use pdb from the command line by including --pdb. This drops you into the default Python pdb debugger. If you have a third-party debugger installed that supports post_mortem, you can specify it like so: --pdb <debugger>.
Examples:
pyats run job job.py --pdb
pyats run job job.py --pdb pudb
pyats run job job.py --pdb web_pdb
Tip
Submitting a testscript to sanity/regression with pdb = True may be career-limiting.
Pause on Phrase
import pdb; pdb.set_trace() and the PDB on Failure feature both have their limitations: the first requires editing the script, and the second only pauses when an exception or non-passing result occurs. What about cases where you just want to pause the testscript execution somewhere and check the current state and/or configuration of your environment, logs, and testbed devices?
The pause on phrase feature allows you to pause on any log message generated by the current script run, including CLI output from devices, without requiring any modifications to the scripts and/or their libraries. When enabled, it actively filters all log messages propagated to the logging.root logger, and pauses when a match is found.
Three distinct types of pause actions are supported:
email
pauses, creates a pause file and notifies the user via email. Deleting this file (or after a set timeout value is reached) resumes the script execution.
pdb
pauses and opens a Python Pdb Debugger at the caller stack.
code
pauses and opens a Python Interactive Shell at the caller stack.
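The filtering mechanism described above can be approximated with the standard logging module. The sketch below is an assumption about the general approach, not the engine's code: `PauseOnPhraseHandler` is a hypothetical class, its `action` callback stands in for the email/pdb/code actions, and whether the engine uses `re.search` or `re.match` is not stated in this document.

```python
import logging
import re


class PauseOnPhraseHandler(logging.Handler):
    """Hypothetical handler: invoke `action` when a log message matches."""

    def __init__(self, pattern, action):
        super().__init__()
        self.regex = re.compile(pattern)
        self.action = action

    def emit(self, record):
        message = record.getMessage()
        # assumption: a match anywhere in the message triggers the pause
        if self.regex.search(message):
            self.action(message)


# Usage sketch: attaching the handler to the root logger lets it see every
# record that propagates up from child loggers.
#
#     handler = PauseOnPhraseHandler(r'state: down', print)
#     logging.root.addHandler(handler)
```

Hooking the root logger is what makes the feature work without touching the scripts or libraries that emit the messages.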
To enable this feature, provide the pause_on argument (see Standard Arguments) to your script run with one of the following types of values:
full path & name to a YAML input file, satisfying the schema described below, or
the content of said YAML file, in corresponding dictionary format, or
a regex string for the engine to look and pause on.
The following schema describes the YAML pause file format:
# Pause On Phrase YAML Schema
# ---------------------------
timeout:    # timeout value in seconds (int)
            # specifies the max pause time before the script resumes
            # execution automatically. Set to 0 to wait indefinitely.
            # (only applies to email action)
            # (default: 0)
            # (optional)
action:     # action to be performed on pause (email/pdb/code)
            # choose to either send user a notification email, provide a
            # pdb debugger, or an interactive code shell on pause.
            # (disables timeout value when action is pdb or code)
            # (default: email)
            # (optional)
patterns:   # patterns to search & pause on (list)
            # each list item needs to follow a particular structure as below
            # (mandatory)
    - pattern:  # pattern to pause on (str)
                # this is internally compiled into a regex pattern used
                # to match log messages with
                # (mandatory)
      section:  # section/uid to enable pattern searching (str)
                # this is internally compiled into regex, used
                # to match the current executing section uid
                # note that you can use Testcase.setup to denote setup
                # section of a testcase, etc.
                # (if not provided, the pattern is used globally)
                # (optional)
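As a reading aid, the schema above can be restated as a small validation function. This is a sketch under stated assumptions: `check_pause_on` and `ALLOWED_ACTIONS` are hypothetical names, and the real engine may validate differently or more strictly.

```python
ALLOWED_ACTIONS = ('email', 'pdb', 'code')


def check_pause_on(data):
    """Validate pause data against the documented schema and fill defaults.

    Hypothetical helper: raises ValueError on the first problem found.
    """
    if not isinstance(data, dict):
        raise ValueError('pause data must be a dict')

    timeout = data.get('timeout', 0)            # default 0: wait indefinitely
    if not isinstance(timeout, int) or timeout < 0:
        raise ValueError('timeout must be a non-negative int')

    action = data.get('action', 'email')        # default action is email
    if action not in ALLOWED_ACTIONS:
        raise ValueError('action must be one of %s' % (ALLOWED_ACTIONS,))

    patterns = data.get('patterns')             # the only mandatory key
    if not isinstance(patterns, list):
        raise ValueError('patterns is mandatory and must be a list')

    checked = []
    for item in patterns:
        if not isinstance(item, dict) or not isinstance(item.get('pattern'), str):
            raise ValueError('each item needs a "pattern" string')
        entry = {'pattern': item['pattern']}
        if 'section' in item:                   # optional; global when absent
            if not isinstance(item['section'], str):
                raise ValueError('section must be a string')
            entry['section'] = item['section']
        checked.append(entry)

    return {'timeout': timeout, 'action': action, 'patterns': checked}
```

Calling `check_pause_on({'patterns': [{'pattern': '.*pass.*'}]})` returns the same data with `timeout` set to 0 and `action` set to `'email'`, mirroring the defaults listed in the schema.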
Normally, the default action is email: when the script is paused after matching a log phrase, it sends a notification email to the current executing user, including instructions on how to resume execution (e.g., remove the pause file).
Note
If the execution is paused in email mode and the pause file is modified, the timer is disabled and execution stays paused until the pause file is deleted.
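The pause-file behaviour in email mode (resume on deletion, resume on timeout, timer disabled once the file is modified) can be pictured as a simple polling loop. This is only a sketch of that description: `wait_for_pause_file`, its parameters, and its return values are hypothetical, and the engine's actual mechanism is not shown in this document.

```python
import os
import time


def wait_for_pause_file(path, timeout=0, poll_interval=0.5):
    """Block until `path` is deleted or `timeout` seconds pass (0 = no limit).

    Hypothetical helper. Returns 'deleted' or 'timeout'.
    """
    try:
        original_mtime = os.path.getmtime(path)
    except FileNotFoundError:
        return 'deleted'

    deadline = time.monotonic() + timeout if timeout else None
    while True:
        try:
            mtime = os.path.getmtime(path)
        except FileNotFoundError:
            return 'deleted'                # user removed the file: resume
        if mtime != original_mtime:
            deadline = None                 # file was edited: stop the timer
        if deadline is not None and time.monotonic() >= deadline:
            return 'timeout'                # max pause time reached: resume
        time.sleep(poll_interval)
```

With this shape, editing the pause file is a way to say "keep waiting for me", and deleting it is the only way to resume afterwards.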
However, if the action is set to pdb or code, the script pauses and provides an interactive Python debugger/shell to the user, allowing for look-arounds, debugging, etc. Be careful with what you do in these modes: modifications persist when the script continues.
Hint
Use Ctrl-D to resume from the interactive shell, and c to resume from pdb.
Warning
The timeout counter is disregarded in pdb and code (interactive) modes.
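For readers unfamiliar with what an interactive shell "at the caller stack" means, the sketch below shows one way to build it from the standard library. `caller_namespace` and `pause_here` are hypothetical names and this is not how the engine is known to do it. In this sketch the console works on a copy of the caller's variables, so rebinding a name inside the shell does not leak back, but mutating an object (for example, a device or a dict) does persist.

```python
import code
import inspect


def caller_namespace(depth=1):
    """Return a dict of the names visible in the frame `depth` levels up."""
    frame = inspect.currentframe()
    try:
        for _ in range(depth):
            frame = frame.f_back
        namespace = dict(frame.f_globals)
        namespace.update(frame.f_locals)    # locals shadow globals
        return namespace
    finally:
        del frame                           # avoid a reference cycle


def pause_here(interact=code.interact):
    """Open an interactive console seeded with the caller's variables."""
    namespace = caller_namespace(depth=2)   # skip pause_here's own frame
    interact(banner='(press Ctrl-D to resume execution)', local=namespace)
```

Calling `pause_here()` inside a test section would stop there and expose that section's local variables at the `>>>` prompt until Ctrl-D is pressed.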
Here’s an example YAML pause file:
# Example
# -------
#
# yaml input pause file
timeout: 600                                # pause a maximum of 10 minutes
patterns:
    - pattern: '.*pass.*'                   # pause on all log messages including
                                            # .*pass.* in them globally
    - pattern: '.*state: down.*'            # pause whenever 'state: down' is found
      section: '^common_setup\..*$'         # enable for all common_setup sections
    - pattern: '.*should pause.*'           # pause whenever 'should pause' is found
      section: '^TestcaseTwo\.setup$'       # pause on TestcaseTwo setup section
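To see how the section expressions in this file scope each pattern, the following sketch evaluates them against a few section uids. The uid strings used here (such as `common_setup.subsection`) are assumed from the dotted notation in the schema comments, and `active_patterns` is a hypothetical helper, not an engine API.

```python
import re

# Same content as the YAML example, written as Python data.
PATTERNS = [
    {'pattern': '.*pass.*'},
    {'pattern': '.*state: down.*', 'section': r'^common_setup\..*$'},
    {'pattern': '.*should pause.*', 'section': r'^TestcaseTwo\.setup$'},
]


def active_patterns(section_uid, patterns=PATTERNS):
    """Return the log patterns that are enabled for the given section uid."""
    return [entry['pattern'] for entry in patterns
            if 'section' not in entry
            or re.search(entry['section'], section_uid)]


if __name__ == '__main__':
    for uid in ('common_setup.subsection', 'TestcaseTwo.setup',
                'TestcaseTwo.test'):
        print(uid, active_patterns(uid))
```

An entry without a `section` key is active everywhere, while the anchored expressions restrict the other two entries to the common setup sections and to the setup section of TestcaseTwo, respectively.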
Keep in mind that the content of YAML seamlessly translates to Python dict types. Therefore, as a convenience feature, it is also possible to use the content of a YAML file in its dict format as input to pause_on instead of a file:
# Example
# -------
#
# dict format corresponding to the above
# raw strings keep the regex backslashes intact in Python source
pause_on = {'timeout': 600,
            'patterns': [{'pattern': '.*pass.*'},
                         {'pattern': '.*state: down.*',
                          'section': r'^common_setup\..*$'},
                         {'pattern': '.*should pause.*',
                          'section': r'^TestcaseTwo\.setup$'}]}
# command-line call
# -----------------
# python testscript.py -pause_on="{'timeout': 600,\
# 'patterns': [{'pattern': '.*pass.*'},\
# {'pattern': '.*state: down.*',\
# 'section': '^common_setup\..*$'},\
# {'pattern': '.*should pause.*',\
# 'section': '^TestcaseTwo\.setup$'}]}"
Hint
If your pause regex patterns are extremely generic, e.g., .*, it pauses … a lot. However, if multiple regexes match the same log message, the engine is pretty smart about pausing only once, avoiding duplicate pauses on the same message.
In addition, as a convenience measure, it is also possible to simply provide a string to the pause_on argument. This simplified input mode is mostly intended for local debugging sessions, avoiding the need to use a properly formatted dictionary or YAML file. As this is solely intended for interactive debugging, the default action with simplified input mode is pdb.
# Example
# -------
#
# using simple string regex inputs
# pause on all instances of 'some phrase'
bash$ python testscript.py -pause_on='.*some phrase.*'
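Read against the schema, a plain string appears to be shorthand for a one-pattern dictionary whose action is pdb. The sketch below spells out that reading; `expand_simple_pause_on` is a hypothetical helper, the equivalence is inferred from the text rather than confirmed, and the sketch does not attempt to tell a regex string apart from a path to a YAML file.

```python
def expand_simple_pause_on(value):
    """Expand a regex string into the dict form; pass dicts through as-is.

    Hypothetical helper illustrating the simplified input mode.
    """
    if isinstance(value, str):
        # simplified mode: a single global pattern, default action pdb
        return {'action': 'pdb', 'patterns': [{'pattern': value}]}
    return value


if __name__ == '__main__':
    print(expand_simple_pause_on('.*some phrase.*'))
```

Under this reading, the command above would behave like passing `{'action': 'pdb', 'patterns': [{'pattern': '.*some phrase.*'}]}` as pause_on.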
Below are some examples of the pause on phrase feature in action:
# Example 1
# ---------
#
# pause on phrase with code action
from pyats import aetest
import logging

logger = logging.getLogger()

class CommonSetup(aetest.CommonSetup):
    @aetest.subsection
    def subsection(self):
        pass

class tc_one(aetest.Testcase):
    @aetest.setup
    def setup(self):
        pass

    @aetest.test
    def test(self):
        logger.info('i should pause')

    @aetest.cleanup
    def cleanup(self):
        pass

class CommonCleanup(aetest.CommonCleanup):
    @aetest.subsection
    def subsection(self):
        pass

if __name__ == '__main__':
    logging.root.setLevel(logging.INFO)
    aetest.main(pause_on = dict(action = 'code',
                                patterns = [{'pattern': 'i should pause'}]))
# output of the script:
#
# Starting common setup
# Starting subsection subsection
# The result of subsection subsection is => PASSED
# The result of common setup is => PASSED
# Starting testcase tc_one
# Starting section setup
# The result of section setup is => PASSED
# Starting section test
# i should pause
#
# --------------------------------------------------------------------------------
# Pause On Phrase: Interactive Console
# -> /path/to/my/example/testscript.py[20] test()
# (press Ctrl-D to resume execution)
# >>>
# Example 2
# ---------
#
# pause on phrase from the command line, simplified mode
# output of the script:
# +------------------------------------------------------------------------------+
# | Starting common setup |
# +------------------------------------------------------------------------------+
# +------------------------------------------------------------------------------+
# | Starting subsection subsection |
# +------------------------------------------------------------------------------+
# The result of subsection subsection is => PASSED
# The result of common setup is => PASSED
# +------------------------------------------------------------------------------+
# | Starting testcase tc_one |
# +------------------------------------------------------------------------------+
# +------------------------------------------------------------------------------+
# | Starting section setup |
# +------------------------------------------------------------------------------+
# The result of section setup is => PASSED
# +------------------------------------------------------------------------------+
# | Starting section test |
# +------------------------------------------------------------------------------+
# i should pause
#
# --------------------------------------------------------------------------------
# Pause On Phrase: Interactive Console
# -> /path/to/my/example/testscript.py[20] test()
# (press Ctrl-D to resume execution)
# >>>
The pause will also dump the connection information, including which devices are currently connected.
# --------------------------------------------------------------------------------
# Pause On Phrase: Connection Information
#
# +-------------------+--------------------------------+
# | Device Property | Value |
# +-------------------+--------------------------------+
# | Name | device.name |
# | Alias | device.alias |
# | Active Connection | alias of active connection |
# | Status | Connected|Disconnected |
# | Spawn Command | command for active connection |
# | -------- | -------- |
# | Connection Alias | connection alias e.g. cli |
# | Class | Unicon, Gnmi, Netconf, etc |
# | IP | IPV4|IPV6 address |
# | Protocol | ssh, telnet, etc |
# | -------- | -------- |
# | URL | <protocol>://<ip>:<port> |
# +-------------------+--------------------------------+
Note
If you need to connect to the device directly while the test is paused, the existing device connection must first be disconnected.
>>> import pyats.easypy as ep
>>> dev = ep.runtime.testbed.devices['<device>']
>>> dev.disconnect()
# the device can now be connected to in a separate terminal. Once ready to resume the test,
# break the connection and reattach with
>>> dev.connect()