The Basics are Kinda Important, see also: Being Dumb with NAPALM-Ansible

Hello friends!

This is a tiny post to remind you that the basics (and the obvious stuff!) are kinda important, or so it turns out! I was messing around using napalm-ansible to push templatized (templetized? template-ized?) configurations to some devices. Everything was going great until it wasn’t. The configurations rendered by the Ansible template module were looking good, but for whatever reason napalm-ansible kept timing out and leaving me with a nasty-gram like this:

"msg": "cannot install config: Search pattern never detected in send_command_expect: [>##]\\s*$"

Well then… that clearly isn’t ideal. The extra fun part was that if I ran the exact same playbook again immediately after the failure, it would complete and everything would be great. I hate when you reboot a switch (or anything) and the thing you were troubleshooting suddenly works… this is kinda how this felt for me. So, determined to get to the bottom of it, I cloned my CSR a bajillion times and started testing.

I found this GitHub issue where Kirk suggested setting the global delay factor. As I understand it, this basically stretches the timeouts, so that if the underlying napalm config merge is taking a long time for one or more config lines we gain some buffer time. I originally set it to 2 as Kirk suggested, then I tried 4, and when that didn’t work I tried 20. Needless to say that took a really long time, and it still eventually failed :(
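For anyone who hasn’t touched that knob before, here is a rough sketch of where the delay factor goes when you drive NAPALM straight from Python. As far as I know the ios driver hands global_delay_factor from optional_args down to Netmiko. The helper names, the 192.0.2.1 address, and the credentials are all made up for illustration, and open_device only works with NAPALM installed and a real device on the other end.

```python
def build_napalm_kwargs(hostname, username, password, delay_factor=2):
    """Build keyword arguments for a NAPALM driver (hypothetical helper).

    global_delay_factor is passed through optional_args, which the ios
    driver forwards to Netmiko.
    """
    return {
        "hostname": hostname,
        "username": username,
        "password": password,
        "optional_args": {"global_delay_factor": delay_factor},
    }


def open_device(kwargs, driver_name="ios"):
    """Open a NAPALM connection (needs NAPALM and a reachable device)."""
    # Imported lazily so build_napalm_kwargs works without NAPALM installed.
    from napalm import get_network_driver

    driver = get_network_driver(driver_name)
    device = driver(**kwargs)
    device.open()
    return device


# Placeholder address and credentials, not a real device.
kwargs = build_napalm_kwargs("192.0.2.1", "admin", "not-a-real-password", delay_factor=4)
print(kwargs["optional_args"])
```

A bigger factor only makes everything wait longer. If the device is stuck on something that will never finish quickly, it just fails more slowly.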

Doing as Kirk suggested in that same issue and manually doing the config replace didn’t really work since 1) I wasn’t doing a config replace, and 2) my configuration template did not contain management access stuff, so doing a replace would gut my connectivity (because it was a CSR I still had access via console, but still).

Eventually, I realized I was just being a noob and not paying attention to the little obvious stuff we always overlook (because you would think it would just work). Turns out that the router I was poking had no public interweb access. Why would this matter, you may ask yourself? If napalm/ansible can get to it, shouldn’t life be all rainbows and puppy dogs? Oh, one would think, but yours truly was doing a dumb thing and pointing not one, but FOUR NTP servers at DNS names. Yeah… without internet access that whole resolving DNS thing doesn’t work out so well. I have no idea what the timeouts were (and with the global delay factor set to 20 or whatever I used, it took forever to time out), but the router sitting there trying to resolve those names was clearly angering things! Flipping those NTP servers to dummy IPs immediately solved my problem, duh :)
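If you want to catch this kind of thing before it bites, a quick sanity check on the rendered config is easy enough. This is just a minimal sketch, not something I actually ran at the time: it assumes IOS-style "ntp server [vrf NAME] <target>" lines, and the function names and sample config are made up for illustration.

```python
import ipaddress


def is_ip_literal(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def find_hostname_ntp_servers(config_text):
    """Return NTP server targets that are names rather than IP addresses.

    Assumes IOS-style lines: ntp server [vrf NAME] <target> [options]
    """
    hostnames = []
    for line in config_text.splitlines():
        tokens = line.split()
        if tokens[:2] != ["ntp", "server"]:
            continue
        args = tokens[2:]
        # Skip an optional "vrf NAME" pair before the target.
        if args[:1] == ["vrf"]:
            args = args[2:]
        if args and not is_ip_literal(args[0]):
            hostnames.append(args[0])
    return hostnames


# Made-up rendered config for illustration.
rendered = """\
hostname csr-lab
ntp server 0.pool.example.com
ntp server 192.0.2.10
ntp server vrf MGMT 1.pool.example.com
"""

print(find_hostname_ntp_servers(rendered))
# prints ['0.pool.example.com', '1.pool.example.com']
```

Anything this flags needs working DNS on the device itself, so if the box can’t resolve names, swap in IPs before pushing.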

This has been your friendly public service reminder that it’s always something obvious and simple!