debugging puppet and facter with strace

I worked on a slightly annoying problem yesterday and thought I’d share the process with you. On one of our linux boxes puppet stopped working with an error that was not very helpful:

[root@portrix]# puppet agent --test
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve local facts: private method `split' called for nil:NilClass
Error: Failed to apply catalog: Could not retrieve local facts: private method `split' called for nil:NilClass

All this told me was that something was wrong with facter during the puppet run. I was able to reproduce the error with just facter but even the debug output was not much more helpful:

[root@portrix]# facter --puppet --debug
Found no suitable resolves of 1 for ec2_metadata
value for ec2_metadata is still nil
private method `split' called for nil:NilClass

In fact, this was even more misleading since I was drawn to the message about ec2_metadata. This box is a physical server and should not have anything to do with ec2. I chased this rabbit for a while but without any success. The best idea then became to run strace on the command and see if this returned anything useful. Strace records and prints all system calls that are made during the execution of a program and among other things it will show you which libraries, files etc are being touched. I produces a ton of output and I will spare you the whole thing. But the general idea is to scan the output from the bottom up. The first things (from the bottom) will be the printing of the error message to your screen and hopefully the stuff that caused the error will show up slightly above that. This is the (not very interesting) part where the error message is printed:

[root@portrix]# strace facter --puppet --debug
execve("/usr/bin/facter", ["facter", "--puppet", "--debug"], [/* 28 vars */]) = 0
brk(0)                                  = 0x18d9000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f78f4f7a000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
[... many, many lines deleted ...]
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
write(2, "private method `split' called fo"..., 46private method `split' called for nil:NilClass) = 46
write(2, "\n", 1
)                       = 1
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigaction(SIGINT, {SIG_DFL, [INT], SA_RESTORER|SA_RESTART, 0x30162326a0}, {0x3c6a295ef0, [INT], SA_RESTORER|SA_RESTART, 0x30162326a0}, 8) = 0
close(3)                                = 0
munmap(0x7fe785c7e000, 4096)            = 0
exit_group(12)                          = ?

And just a bit above that is the last file it read before failing:

stat("/etc/oratab", {st_mode=S_IFREG|0664, st_size=741, ...}) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
open("/etc/oratab", O_RDONLY)           = 4
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
fstat(4, {st_mode=S_IFREG|0664, st_size=741, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe785c7c000
read(4, "#\n\n\n\n# This file is used by ORAC"..., 4096) = 741

So it read my the /etc/oratab file and if we scan the output just a bit further up we can also find the module that does this:

stat("/var/lib/puppet/lib/facter/oracle_sid.rb", {st_mode=S_IFREG|0644, st_size=255, ...}) = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
open("/var/lib/puppet/lib/facter/oracle_sid.rb", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=255, ...}) = 0
close(4)                                = 0
rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
open("/var/lib/puppet/lib/facter/oracle_sid.rb", O_RDONLY) = 4
close(4)                                = 0
open("/var/lib/puppet/lib/facter/oracle_sid.rb", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0644, st_size=255, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe785c7c000
read(4, "require 'facter'\n\nisOracle = Fil"..., 4096) = 255
read(4, "", 4096)                       = 0
close(4)                                = 0

This file oracle_sid.rb is a facter plugin that we wrote ourselves and immediately when I saw this I knew where to look. This is the part of our code that caused this error in facter:

result = open('/etc/oratab') { |f| f.grep(/dbhome_1/) }
result = result[0].split(":").first

The code is looking for the first line with the string “/dbhome_1” in it (which usually works for our databases) and then extracts the oracle SID from that line (seperated by colons). But the oratab on that server was empty, so we were trying to split an empty string, resulting in the error “private method `split’ called for nil:NilClass”.

Once this was known the fix was rather easy and this is not what I wanted to show in the blog post. But I hope you got the idea of using strace to debug a linux program that is not returning the level of logging that you would need to find the error yourself.

One thought on “debugging puppet and facter with strace

  1. Puppet in Solaris11.3

    So far I have the certs step done (but not the database yet)
    My next step will be to be able to schedule some script and get it execute on the slave system.

Leave a Reply

Your email address will not be published. Required fields are marked *