Archive for the ‘SysAdmin’ Category

Commonality in Unix Command Errors

Friday, July 10th, 2009

We’ve got a script that parses the install logs for each Solaris client, looks for errors, and generates a summary report of what happened and what went wrong. It works well, but until recently it’s been quite limited, as it only flagged variations of the words Error and Warning. That was OK originally, but the install scripts have become more complicated and have now outgrown it.

After seeing a few variations on “cannot” showing up in the logs, including “can not”, “can’t”, “could not” and “couldn’t”, I thought it’d be interesting to audit some of the common Unix commands we use in the scripts and see what sort of words they use to describe errors. It turned out to be quite a long list.

These are all taken from Solaris 10 10/08 clients. Where case matters, I’ve kept the capitalisation the commands use.

cp/mv/ln

  • failed / Failed
  • cannot
  • could not
  • unable
  • not
  • can’t
  • Insufficient
  • exceeds
  • Invalid

mkdir

  • Failed
  • but is not
  • not permitted

touch

  • bad
  • cannot

chown

  • can’t
  • unknown
  • invalid
  • too large

chmod

  • can’t
  • could not
  • invalid
  • required
  • not permitted
  • WARNING
  • ERROR

ls

  • can’t

ksh

  • too big
  • required
  • couldn’t
  • prohibited
  • cannot
  • Bad / bad
  • failure
  • unknown
  • invalid
  • is not
  • not
  • denied
  • too many
  • corrupted
  • can’t
  • out of range
  • exceeds
  • already
  • restricted
  • missing
  • expected
  • failed
  • requires
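
The lists above can be folded into a single case-insensitive pattern. What follows is only a rough sketch, not the script described in this post: `ERROR_WORDS` and `scan_log` are made-up names, the pattern covers a subset of the words (very generic ones such as "not", "bad" and "already" are left out because they would likely flag harmless lines), and it assumes the logs contain plain ASCII apostrophes. It also assumes a grep that accepts `-E`; on Solaris 10 that probably means `/usr/xpg4/bin/grep` or `egrep` rather than `/usr/bin/grep`.

```shell
#!/bin/sh
# Sketch: flag log lines containing any of the error words collected above.
# ERROR_WORDS and scan_log are hypothetical names used for illustration.

# One alternation built from a subset of the lists above.
# "can ?not" covers both "cannot" and "can not".
ERROR_WORDS="error|warning|fail(ed|ure)|can ?not|can't|could not|couldn't|unable|insufficient|exceeds|invalid|not permitted|unknown|too (big|large|many)|denied|prohibited|corrupted|out of range|restricted|missing"

scan_log() {
    # -E: extended regex, -i: ignore case, -n: prefix matches with line numbers
    grep -E -i -n "$ERROR_WORDS" "$1"
}
```

Calling `scan_log /path/to/install.log` (a placeholder path) prints each matching line with its line number. Matching case-insensitively trades away the case distinctions noted above for a shorter pattern.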

The thing about most Unix commands is that they’re generally not very chatty. If they’re producing any sort of output at all, there’s a good chance it’s an error of some description. With that in mind, it’s quite possible there’s a much more elegant way of writing the parse scripts.
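
One way to read that idea, assuming the install scripts themselves can be changed rather than only the log parser, is to stop matching words and flag any command that writes to stderr or exits non-zero. The sketch below is an illustration under that assumption, not what our scripts actually do; `run_step` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: treat any stderr output or a non-zero exit status as a failure,
# instead of matching specific words. run_step is a hypothetical helper.

run_step() {
    errfile=${TMPDIR:-/tmp}/run_step.$$
    status=0
    # Run the command, leaving stdout alone and capturing stderr in a file.
    "$@" 2>"$errfile" || status=$?
    if [ "$status" -ne 0 ] || [ -s "$errfile" ]; then
        echo "FLAGGED (exit $status): $*"
        # Indent the captured stderr under the flagged command.
        sed 's/^/    /' "$errfile"
    fi
    rm -f "$errfile"
    return "$status"
}
```

A step would then be written as, for example, `run_step cp /some/source /some/target` (placeholder paths), and the report only needs to look for lines starting with FLAGGED.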

Purging Log Files a.k.a. A logadm Letdown

Friday, May 15th, 2009

This morning I set out to do something fairly simple. I’ve got a directory full of JumpStart log files (generated as a client is JumpStarted) and I want to purge everything older than six months.

Should be a simple one-liner cron job, right? You’re right, it is. But I had to discover what it wasn’t before I could reach that conclusion.

Here’s my directory structure:

/export/home/loguser/clienthostname-timestamp/

Inside each clienthostname-timestamp directory is a bunch of log files, some with unique names.

I wanted to use logadm to rotate the old logs right out of existence. After discovering all sorts of things logadm could do for me, it turns out that just deleting a directory full of files isn’t one of them. Delete the contents of a file? No problem. Delete the file itself? Not a chance. Bummer.

In the end, find is going to do exactly what I want:

find /export/home/loguser -type d -mtime +180 -exec rm -rf {} \;
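
Before letting a command like that loose from cron, it can be worth previewing what it would remove. As written, the one-liner also considers /export/home/loguser itself and any nested directories, and find may complain when it tries to descend into a directory that rm has just deleted. The sketch below is one hedged variation that only looks at the immediate subdirectories; `purge_old_dirs` and its `preview`/`delete` modes are hypothetical names, not part of find or logadm.

```shell
#!/bin/sh
# Sketch: preview, then purge, top-level directories older than N days.
# purge_old_dirs is a hypothetical helper written for illustration.

purge_old_dirs() {
    root=$1
    days=$2
    mode=${3:-preview}
    # "$root"/* expands to the immediate entries only, and -prune stops
    # find from descending into them, so the root itself is never a
    # candidate and find never walks a tree that was just removed.
    for dir in "$root"/*; do
        [ -d "$dir" ] || continue
        if [ "$mode" = delete ]; then
            find "$dir" -prune -type d -mtime +"$days" -exec rm -rf {} \;
        else
            find "$dir" -prune -type d -mtime +"$days" -print
        fi
    done
}
```

Running `purge_old_dirs /export/home/loguser 180` lists the candidates, and adding `delete` as a third argument removes them. Either way the test is against each directory’s own modification time, which changes only when entries are added to or removed from it.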

That’s all good and simple and I’m happy with that solution. But if I knew I could just do that, why did I burn so much time trying to get logadm to play nice?

I wanted to use what seemed to be the right tool for the job. From the logadm man page:

logadm - manage endlessly growing log files

It sounds right, and more importantly, whoever comes after me looking for why log files seem to be disappearing is probably going to start their search there. Log files… think logadm. Unfortunately, it’s not going to be quite that simple for them and I’m keenly aware that “them” could very well be me.