Have you ever experienced a hard lockup and found no trace of the cause in your log files? Those situations are even more of a pain if you do not have physical access to the machine, since you cannot look for a kernel oops on the console. You could buy a serial console or an IP KVM, but if you don’t need remote control and just want to debug without being physically present, check out netconsole. Netconsole is a kernel module that sends printk messages over UDP to another machine.

Setting up netconsole is not difficult but the syntax can be a bit tiresome. Netconsole needs the local port, IP and network interface to send from, plus the port, IP and MAC address of the remote machine that will receive the messages.

Find MAC of remote agent in same subnet

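REMOTE_AGENT=172.16.0.1
# ping once to populate the ARP cache, then read the MAC from the matching arp entry
MAC=$(ping -c 1 "$REMOTE_AGENT" > /dev/null ; arp -n "$REMOTE_AGENT" | awk -v ip="$REMOTE_AGENT" '$1 == ip {print $3}')
echo "Remote MAC: $MAC"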

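Find MAC of default gateway (use this instead if the remote agent is on a different subnet)

# take the gateway of the first default route
GATEWAY=$(netstat -rn | awk '/^0\.0\.0\.0/ {print $2; exit}')
MAC=$(ping -c 1 "$GATEWAY" > /dev/null ; arp -n "$GATEWAY" | awk -v ip="$GATEWAY" '$1 == ip {print $3}')
echo "Gateway MAC: $MAC"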
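
If the host did not answer the ping, arp -n shows "(incomplete)" instead of a hardware address and the captured value will not be a MAC at all. A quick sanity check before passing it to netconsole might look like the sketch below. The name is_mac is a hypothetical helper made up for this example, not part of arp or any other tool.

```shell
# is_mac VALUE
# Succeeds only if VALUE looks like a 48-bit MAC address
# (six colon-separated pairs of hex digits). Hypothetical helper.
is_mac() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Check whatever the lookup above left in $MAC (empty if it was never set).
MAC=${MAC:-}
if is_mac "$MAC"; then
    echo "Remote MAC: $MAC"
else
    echo "No usable MAC found (got: '$MAC')" >&2
fi
```

If the check fails, make sure the address is reachable and on the same subnet, then run the lookup again.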

Initialize netconsole

Now you should have enough information to initialize netconsole, so let's give it a test.

modprobe netconsole netconsole=local_port@local_ip/dev_name,remote_port@remote_ip/remote_mac
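
Since the parameter is easy to mistype, it can help to assemble it from variables first. The kernel documentation gives the format as source port, "@", source IP, "/", device, then a comma followed by target port, "@", target IP, "/", target MAC. The sketch below only prints the resulting command. Every value in it is a placeholder assumed for illustration (6665 and 6666 are the documented default ports), so substitute your own.

```shell
# Placeholder values, assumed for illustration only.
LOCAL_PORT=6665                 # source port on the machine running netconsole
LOCAL_IP=172.16.0.2             # its IP address
DEV_NAME=eth0                   # its network interface
REMOTE_PORT=6666                # port the remote listener uses
REMOTE_IP=172.16.0.1            # IP of the remote agent
REMOTE_MAC=00:11:22:33:44:55    # MAC of the remote agent (or of the gateway)

# Assemble the module parameter in netconsole's expected order.
NETCONSOLE_PARAM="netconsole=${LOCAL_PORT}@${LOCAL_IP}/${DEV_NAME},${REMOTE_PORT}@${REMOTE_IP}/${REMOTE_MAC}"

# Print the command for review instead of running it.
echo "modprobe netconsole $NETCONSOLE_PARAM"
```

Once the printed line looks right, run it as root.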

Now we still need to get something listening on the remote and test if it actually works. Log into your remote machine and run

nc -l -p remote_port -u | tee somelogfile.log
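
The listener by itself does not record when each message arrived, which can make a long-running capture harder to read. One option is a small filter between nc and tee that prepends the receive time to every line. This is only a sketch, and timestamp_lines is a hypothetical name chosen for this example.

```shell
# timestamp_lines
# Reads lines from stdin and writes each one to stdout prefixed with
# the local time at which it was read. Hypothetical helper.
timestamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Usage on the remote machine (remote_port as above):
#   nc -l -p remote_port -u | timestamp_lines | tee somelogfile.log
```

The timestamps reflect when the remote machine received the line, not when the kernel on the other side emitted it.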

For a more permanent setup you might want to use syslog but this will suffice for now. If it’s a short term but long running test you might be well advised to run that from a screen session.

Good, now we have the remote listening on UDP with netcat. We should make sure that the messages are getting logged. Log back into the machine that's running netconsole (local_ip) and run the following.

dmesg -n 8

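This raises the console log level to 8, so kernel messages of every priority, including debug messages, are sent to the console and therefore to netconsole.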
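
To confirm the setting took effect, the current value can be read back from /proc/sys/kernel/printk, whose first field is the console log level. A minimal sketch follows. The function console_loglevel is a hypothetical helper that just picks that first field out of whatever it is given on stdin.

```shell
# console_loglevel
# Prints the first whitespace-separated field of the first input line.
# Fed with /proc/sys/kernel/printk, that field is the console log level.
console_loglevel() {
    awk '{print $1; exit}'
}

# Only read the file if this system exposes it.
if [ -r /proc/sys/kernel/printk ]; then
    echo "console loglevel: $(console_loglevel < /proc/sys/kernel/printk)"
fi
```

After dmesg -n 8 the reported level should be 8.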

Now find an innocuous kernel module that you can load and unload (I like to use floppy).

rmmod floppy    # in case it's already loaded
modprobe floppy

You should have seen some output on your remote machine that looks something like

Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077

Great, now you have netconsole working! If you get a kernel oops, your remote box should display it and log it to a file as well.

Want to make netconsole active through reboots? No problem, we just need to edit a few files.

First let's get netconsole loading on boot by adding the module to /etc/modules

echo "netconsole" >> /etc/modules

That was easy enough, but we need to make sure it has the proper options as well, so let's add the module options to /etc/modprobe.d/netconsole

echo "options netconsole netconsole=local_port@local_ip/dev_name,remote_port@remote_ip/remote_mac" > /etc/modprobe.d/netconsole
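
Before touching /etc, it can be worth generating the options line into a scratch file and checking it by eye. The sketch below does that. The function write_netconsole_options is a hypothetical helper, and the ports, addresses and MAC are placeholders assumed for illustration. Note that newer modprobe versions generally only read files in /etc/modprobe.d whose names end in .conf, so on a current distribution the final file would likely need to be named netconsole.conf.

```shell
# write_netconsole_options FILE PARAM
# Writes a single modprobe "options" line for netconsole to FILE.
# Hypothetical helper for this example.
write_netconsole_options() {
    printf 'options netconsole %s\n' "$2" > "$1"
}

# Placeholder values; the scratch path keeps a dry run away from /etc.
CONF_FILE="${TMPDIR:-/tmp}/netconsole.conf"
PARAM="netconsole=6665@172.16.0.2/eth0,6666@172.16.0.1/00:11:22:33:44:55"

write_netconsole_options "$CONF_FILE" "$PARAM"
cat "$CONF_FILE"
```

When the line looks right, copy the file into /etc/modprobe.d as root.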

That should do it. Go ahead and try rebooting the machine running netconsole and watch your remote to see the boot messages that are printed after netconsole loads.

Note: there is a dynamic way to configure netconsole, but you need to have CONFIG_NETCONSOLE_DYNAMIC enabled in your kernel, and since Debian etch does not have this by default I won't cover it here. For more information check out the netconsole doc in the kernel source /usr/src/linux/Documentation/networking/netconsole.txt.

Now if you would like to make the remote side a bit more permanent, that's pretty easy as well. Let's install and configure syslog-ng.

aptitude install syslog-ng

Append the following to your /etc/syslog-ng/syslog-ng.conf

Note: make sure you set remote_port to the same value you used above

source net { udp(ip("0.0.0.0") port(remote_port)); };
destination netconsole { file("/var/log/$HOST/netconsole.log"); };
log { source(net); destination(netconsole); };
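
To avoid editing the port by hand, the three lines above can be generated with the port filled in. This is a sketch only. The function syslog_ng_snippet is a hypothetical helper, and 6666 is a placeholder port assumed for illustration. The $HOST macro is escaped so that it reaches syslog-ng literally instead of being expanded by the shell.

```shell
# syslog_ng_snippet PORT
# Prints the syslog-ng source/destination/log stanza with PORT filled in.
# Hypothetical helper for this example.
syslog_ng_snippet() {
    cat <<EOF
source net { udp(ip("0.0.0.0") port($1)); };
destination netconsole { file("/var/log/\$HOST/netconsole.log"); };
log { source(net); destination(netconsole); };
EOF
}

# Print the stanza for a placeholder port; review before appending it, e.g.
#   syslog_ng_snippet 6666 >> /etc/syslog-ng/syslog-ng.conf
syslog_ng_snippet 6666
```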

Now restart syslog-ng

/etc/init.d/syslog-ng restart

Now you should be able to find the logs in /var/log/local_ip/netconsole.log on your remote machine. Note: local_ip is the IP of the machine that is running netconsole.