Wednesday, 24 March 2010

Network lockup during heavy load

Since Karmic (Ubuntu 9.10), the stock r8169 driver, which the kernel uses for Realtek RTL8111/8168 cards, has been broken. If you try to copy large amounts of data over the network, your machine will hang and require a hard reset.

I got this solution from here. The commands below need to be run as root:

1) Check which Realtek module is loaded and which driver the card is using
lsmod | grep r816
r8168 41104 0
lspci -v
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 03)
Subsystem: ASRock Incorporation Device 8168
Kernel driver in use: r8169
Kernel modules: r8169
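
The check in step 1 can be scripted. This is a minimal sketch, assuming `lspci -v` output laid out as above (one block per device, separated by blank lines). The function name driver_in_use is my own invention and not part of any tool.

```shell
#!/bin/sh
# Print the kernel driver bound to the first Ethernet controller found in
# `lspci -v` style output read from standard input.
# driver_in_use is a hypothetical helper name, not a standard command.
driver_in_use() {
    awk '
        /Ethernet controller/                 { in_eth = 1; next }
        in_eth && /Kernel driver in use:/     { print $NF; exit }
        /^[ \t]*$/                            { in_eth = 0 }  # blank line ends a device block
    '
}
```

Usage would be `lspci -v | driver_in_use`. If it prints r8169, the stock driver is still in charge of the card.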

2) Download the official Realtek driver
Realtek RTL8111/RTL8168

Update: I'm using the 8.017 driver, but the current driver on that website is 8.018. I ran into trouble after installing 8.018, so it may contain the same regression that causes these lockups. If you have trouble, download the older driver from here and add a comment to this post.

3) Remove the r8169 module
rmmod r8169
mv /lib/modules/`uname -r`/kernel/drivers/net/r8169.ko ~/r8169.ko.backup
(Note: the ` is a backtick, not an apostrophe or single quote.)
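
If the backticks are awkward to type, the shell also accepts `$(uname -r)`, which does the same thing. As a sketch, the helper below (r8169_path is a name I made up) builds the module path used in step 3 so you can look at it before moving anything. The path layout is taken from the command above and may differ on other kernel versions.

```shell
#!/bin/sh
# Build the path of the stock r8169 module for a kernel release.
# r8169_path is a hypothetical helper. With no argument it uses the
# running kernel via $(uname -r), which is equivalent to `uname -r`.
r8169_path() {
    release="${1:-$(uname -r)}"
    printf '/lib/modules/%s/kernel/drivers/net/r8169.ko\n' "$release"
}
```

For example, `ls -l "$(r8169_path)"` shows whether the file exists before step 3 moves it. I believe `modinfo -n r8169` prints the real path on most systems, but treat that as an assumption and check it on yours.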

4) Build the new r8168 module for the kernel (substitute the version number of the driver you downloaded)
bzip2 -d r8168-8.009.00.tar.bz2
tar -xf r8168-8.009.00.tar
cd r8168-8.009.00
make clean modules
make install
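
Because the version number in the file name changes with each release, it can help to derive the directory name from the tarball name instead of typing it twice. This is only a sketch: driver_dir is a hypothetical helper, and it assumes the archive unpacks into a directory named after the tarball, as the commands above imply.

```shell
#!/bin/sh
# Print the directory a driver tarball is expected to unpack into,
# i.e. the file name without its leading path and .tar.bz2 suffix.
# driver_dir is a hypothetical helper name.
driver_dir() {
    base=$(basename "$1")
    printf '%s\n' "${base%.tar.bz2}"
}

# Possible use, not run here (tar -xjf decompresses and unpacks in one go):
#   tarball=r8168-8.009.00.tar.bz2
#   tar -xjf "$tarball" && cd "$(driver_dir "$tarball")" \
#       && make clean modules && make install
```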

5) Rebuild the kernel module dependencies and load the new module
depmod -a
insmod ./src/r8168.ko
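
After loading the module it is worth confirming that r8168, and not r8169, is the one listed. The `grep r816` from step 1 matches both names, so here is a small sketch that matches the module name exactly. module_listed is a made-up helper name, and it assumes the usual `lsmod` layout with the module name in the first column.

```shell
#!/bin/sh
# Succeed (exit status 0) if module $1 appears as an exact name in the
# first column of `lsmod` style output read from standard input.
# module_listed is a hypothetical helper name.
module_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}
```

Usage would be something like `lsmod | module_listed r8168 && echo "r8168 is loaded"`.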

6) Remove the r8169 module from initrd
cp /boot/initrd.img-`uname -r` ~/initrd.img-`uname -r`.backup
mkinitramfs -o /boot/initrd.img-`uname -r` `uname -r`

7) Add r8168 module to /etc/modules
echo "r8168" >> /etc/modules
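
Running the echo in step 7 more than once leaves duplicate lines in /etc/modules. A minimal sketch of a repeatable version follows. add_module_line is a hypothetical helper, and it takes the file as an argument so you can try it on a scratch file first.

```shell
#!/bin/sh
# Append module name $1 to file $2 unless an identical line is already there.
# add_module_line is a hypothetical helper name.
add_module_line() {
    # -q quiet, -x match the whole line, -F treat the name as a fixed string
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}
```

As root, `add_module_line r8168 /etc/modules` would then have the same effect as the echo above the first time, and do nothing on later runs.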

Reboot, and you are done!

1 comment:

Dennis said...

THANK YOU! I've been hunting down these damn lockups on my home server for weeks now. I was fearing for the life of my 5TB RAID5 array; after one or two of the lockups it had to resync. I was starting to fear drive failure until I realized that I could write to the array just fine from the system itself; it was only over the network that I had problems. My switch also started reporting CRC/Alignment errors on the server's port, with timestamps close to or almost matching the lockups.