Help troubleshoot a fussy server

Zuhaib

So I am frustrated with my NAS right now. For some reason it does not want to run stably when it's doing something CPU intensive.

It is stable 99% of the time UNLESS it's doing a hash check on a large torrent (usually 10GB+), in which case it runs for a bit and then just hangs. If I have a monitor plugged in, it just goes blank, and the system log says nothing. You just see basic background noise and then, boom, the log stops around the time the system stops responding to pings.
Now, I did run Prime95 and ran into some issues, which got solved by pumping up the vcore on the processor, but when I fire up Vuze CLI, after a while of checking I still get this error


Sounds like the main OS HD, which is /dev/sdc, is going. But I ran the Seagate HD test and it passes with no errors, which makes me o_0.

Is this a hardware failure? If so, what? Or is this just Vuze being stupid?

Quick Specs
Soyo Dragon + Mobo
AMD Athlon 1800+
512MB DDR RAM
40GB Seagate Drive (OS, Ubuntu Server)

SATA Card with two SATA Drives
 
I think we just had this problem at work the other day, and if it's the same thing, it's actually an I/O buffering problem. When your server hits a full CPU load for an extended period of time, eventually the buffer runs dry and it can't process I/O requests properly anymore, so it starts throwing up whenever it tries to access the drives.
 
I'm going to say it is a bad hard disk. It could be something else, but I'm assuming that it is pretty old (40GB). Your average hard disk lasts for 3 years before dying (yes, there are exceptions: my Raptor lasted 7, and I've had Deathstars last less than a year).
 
So I swapped out the drive, and while it seems to solve some issues, I am still getting hang-ups, but now I have logs of them. And this log is very interesting
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000089] ------------[ cut here ]------------
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000097] WARNING: at /build/buildd/linux-2.6.28/net/sched/sch_generic.c:226 dev_watchdog+0x219/0x230()
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000101] NETDEV WATCHDOG: eth1 (via-rhine): transmit timed out
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000104] Modules linked in: appletalk input_polldev video output it87 hwmon_vid lp i2c_viapro via_ircc ppdev via_agp irda parport_pc serio_raw agpgart shpchp pcspkr parport crc_ccitt usbhid via_rhine mii floppy fbcon tileblit font bitblit softcursor
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000135] Pid: 3100, comm: java Not tainted 2.6.28-15-server #52-Ubuntu
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000139] Call Trace:
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000151] [<c01412e0>] warn_slowpath+0x60/0x80
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000163] [<c0198bee>] ? mempool_alloc_slab+0xe/0x10
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000169] [<c0198e7c>] ? mempool_alloc+0x2c/0xd0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000177] [<c015e683>] ? getnstimeofday+0x53/0x110
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000187] [<c02d421d>] ? strlcpy+0x1d/0x60
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000197] [<c043c722>] ? netdev_drivername+0x32/0x40
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000202] [<c0450e09>] dev_watchdog+0x219/0x230
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000210] [<c044690b>] ? neigh_table_init_no_netlink+0x14b/0x1d0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000215] [<c04460e0>] ? neigh_periodic_timer+0x0/0x190
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000222] [<c014c257>] ? mod_timer+0x37/0x80
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000226] [<c0446204>] ? neigh_periodic_timer+0x124/0x190
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000231] [<c014b830>] run_timer_softirq+0x130/0x200
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000236] [<c0450bf0>] ? dev_watchdog+0x0/0x230
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000240] [<c0450bf0>] ? dev_watchdog+0x0/0x230
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000247] [<c01468f7>] __do_softirq+0x97/0x170
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000252] [<c0146a2d>] do_softirq+0x5d/0x60
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000257] [<c0146ba5>] irq_exit+0x55/0x90
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000264] [<c010c1c3>] do_IRQ+0x83/0xa0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000270] [<c019e6ed>] ? __do_page_cache_readahead+0xad/0x1d0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000275] [<c010ab13>] common_interrupt+0x23/0x28
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000280] [<c02d62e4>] ? __copy_to_user_ll+0x44/0xf0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000286] [<c019629f>] file_read_actor+0xbf/0xe0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000291] [<c0197efc>] do_generic_file_read+0x3ac/0x4b0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000296] [<c01980a3>] generic_file_aio_read+0xa3/0x210
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000301] [<c01961e0>] ? file_read_actor+0x0/0xe0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000309] [<c0133e9c>] ? enqueue_entity+0x13c/0x360
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000317] [<c01c6445>] do_sync_readv_writev+0xb5/0xf0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000324] [<c01569e0>] ? autoremove_wake_function+0x0/0x50
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000333] [<c02b1ef0>] ? apparmor_file_permission+0x20/0x30
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000343] [<c029002f>] ? security_file_permission+0xf/0x20
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000348] [<c01c66f4>] ? rw_verify_area+0x54/0xd0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000353] [<c01c718c>] do_readv_writev+0x9c/0x1c0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000358] [<c0198000>] ? generic_file_aio_read+0x0/0x210
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000364] [<c01ca222>] ? sys_fstat64+0x22/0x30
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000369] [<c01c73f7>] vfs_readv+0x47/0x60
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000373] [<c01c744d>] sys_readv+0x3d/0xa0
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000377] [<c0109eef>] sysenter_do_call+0x12/0x2f
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000381] ---[ end trace f8f199c4f67f95bb ]---
Oct 21 19:59:31 HomeNAS kernel: [ 6627.000527] eth1: Transmit timed out, status 0000, PHY status 782d, resetting...
Oct 21 19:59:31 HomeNAS kernel: [ 6627.001232] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1
Oct 21 20:00:03 HomeNAS /USR/SBIN/CRON[4435]: (root) CMD ([ -x /usr/sbin/update-motd ] && /usr/sbin/update-motd hourly 2>/dev/null)
Oct 21 20:00:03 HomeNAS /USR/SBIN/CRON[4436]: (root) CMD ([ -x /usr/sbin/update-motd ] && /usr/sbin/update-motd 2>/dev/null)
Oct 21 20:00:17 HomeNAS kernel: [ 6673.000190] eth1: Transmit timed out, status 0000, PHY status 782d, resetting...
Oct 21 20:00:17 HomeNAS kernel: [ 6673.000916] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1

Just as this happens, the system becomes unresponsive. If I try to log in physically, sometimes it just sits there after taking my password, and sometimes the screen just goes blank.
 
From the log, it looks like the ethernet device/drivers are malfunctioning. Any chance of swapping out the NIC for a different one or using a USB ethernet device temporarily?
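If it helps to see how often the NIC is actually resetting before and after a swap, here is a rough Python sketch that counts the transmit timeouts per interface. It only matches the two line formats visible in the log above (the "NETDEV WATCHDOG" warning and the "Transmit timed out ... resetting" line); other drivers may word these differently, so treat the patterns as assumptions. The function name and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Patterns are based only on the lines pasted in this thread, e.g.
#   "NETDEV WATCHDOG: eth1 (via-rhine): transmit timed out"
#   "[ 6627.000527] eth1: Transmit timed out, status 0000, ..."
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (\w+) \(([\w-]+)\): transmit timed out")
TIMEOUT_RE = re.compile(r"\] (\w+): Transmit timed out")


def count_tx_timeouts(lines):
    """Return (timeout count per interface, driver name per interface)."""
    timeouts = Counter()
    drivers = {}
    for line in lines:
        m = WATCHDOG_RE.search(line)
        if m:
            # The watchdog warning names the driver behind the interface.
            drivers[m.group(1)] = m.group(2)
            continue
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[m.group(1)] += 1
    return timeouts, drivers


# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "Oct 21 19:59:31 HomeNAS kernel: [ 6627.000101] NETDEV WATCHDOG: eth1 (via-rhine): transmit timed out",
    "Oct 21 19:59:31 HomeNAS kernel: [ 6627.000527] eth1: Transmit timed out, status 0000, PHY status 782d, resetting...",
    "Oct 21 19:59:31 HomeNAS kernel: [ 6627.001232] eth1: link up, 100Mbps, full-duplex, lpa 0x45E1",
    "Oct 21 20:00:17 HomeNAS kernel: [ 6673.000190] eth1: Transmit timed out, status 0000, PHY status 782d, resetting...",
]

timeouts, drivers = count_tx_timeouts(SAMPLE)
for iface, n in timeouts.items():
    print(f"{iface} ({drivers.get(iface, 'unknown driver')}): {n} transmit timeouts")

# To scan a real log, pass an open file instead of SAMPLE. The path below is
# an assumption; use wherever your kernel messages are actually written:
#   with open("/var/log/syslog", errors="replace") as f:
#       timeouts, drivers = count_tx_timeouts(f)
```

If the count drops to zero with a different NIC under the same hash-check load, that points at the via-rhine card or its driver rather than the disks.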
 