FE/BE randomly locks up
I have a combined FE/BE MD5 setup that randomly (and regularly) crashes. I've searched various posts, and this one seems to describe my problem perfectly, but there's really no indication of a solution: http://www.mythdora.com/?q=node/1770
I've tried almost everything that was mentioned in this post, including testing memory, replacing hard drives and power supplies, removing capture cards, disabling real-time priority threads, etc., to no avail. The crashes typically occur when there is absolutely NOTHING happening on the machine. Often it just freezes at night, when there is no recording, comm flagging, or anything going on. I have two friends at work who also installed MD5 and are having similar issues, which tends to make me think it might not be a hardware problem. But three people is hardly a representative statistical sampling.
I'd love to hear from anyone else who has similar problems, especially if you've found a solution. At the moment I rarely have an uptime of more than 2 days or so, and I'd love to get it fixed.
If it helps, here's my hardware setup:
ECS NFORCE6M-A motherboard
AMD 64 X2 5200+
2GB Crucial DDR2 800 memory
1TB Seagate HD (SATA)
2x400GB HD (one SATA, one PATA)
OCZ 650W PSU
PVR-350
PVR-250
HDHomeRun
Turtle Beach Riviera sound card
GeForce 6200LE video card
Case with plenty o' ventilation
CPU temps typically around 30-35C (highest I've seen is 54C using 200% CPU)
Any ideas or suggestions would be greatly appreciated!
thanks,
darren

Problem solved (well, problem found, anyways)
It turns out that I had bad memory after all. Well, the memory itself wasn't necessarily bad, just incompatible with my motherboard. I got a smokin' deal on some high-performance memory, but that memory required 2.2V, and my motherboard BIOS doesn't provide for adjusting memory voltages. Initially it seemed to work fine, as memtest showed when I first put the machine together. But over time, errors started popping up.
Moral #1: Just because your memory tested good at one point doesn't mean it won't cause problems in the future.
Moral #2: If the memory requires 2.2V, don't try to run it at 1.8V.
Moral #3: Linux and MythTV (and maybe just computers in general) are far more tolerant of memory errors than I would have ever imagined. Memtest turned up 27,000 (!) errors in 10 minutes, yet I still see average uptimes of 3-4 days on my myth box...
Time for some different memory!

Anything meaningful in the logs? dmesg or /var/log/messages? Is the kernel core dumping anything or just hanging up entirely? You could try enabling sysrq so that you can force a kernel dump on the next hang. Does it even answer pings?
I've seen a few hard-lockups on a machine at home, but never my myth boxes. However, it's very few and far between.
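Before relying on sysrq at the next hang, it helps to confirm that it is actually enabled. The following is a minimal sketch, assuming a Linux kernel that exposes the setting at /proc/sys/kernel/sysrq; the function name describe_sysrq is made up for this example.

```python
from pathlib import Path

# Standard Linux location of the SysRq setting (assumed to exist on this box).
SYSRQ_PATH = Path("/proc/sys/kernel/sysrq")


def describe_sysrq(path=SYSRQ_PATH):
    """Return a short description of the current kernel.sysrq value."""
    value = int(Path(path).read_text().strip())
    if value == 0:
        return "disabled"
    if value == 1:
        return "all functions enabled"
    # Any other value is a bitmask that enables only some functions.
    return f"partially enabled (bitmask {value})"


if __name__ == "__main__":
    try:
        print("SysRq:", describe_sysrq())
    except OSError as exc:
        print("Could not read", SYSRQ_PATH, "-", exc)
```

If it reports "disabled", writing 1 to that same file as root turns all SysRq functions on until the next reboot.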

Absolutely nothing in /var/log/messages or in the mythtv logs around the time the machine hangs. I haven't checked dmesg. I'll do that next time. I'll look into sysrq, too. When the machine hangs, it won't even answer a ping.
Thanks for the tips.
-darren
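Since the logs are empty and the box stops answering pings, one way to at least pin down when each hang happens is to ping it from a second machine and record the moment it goes quiet. This is only a sketch, assuming the Linux iputils ping command (-c for count, -W for timeout in seconds); the names ping_once and watch are made up for this example.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """Return True if a single ICMP echo to host gets a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(host, interval_s=30, max_checks=None, probe=ping_once, sleep=time.sleep):
    """Probe host repeatedly and print a line whenever it flips between up and down.

    Runs forever when max_checks is None. Returns the list of
    (timestamp, is_up) transitions that were seen.
    """
    previous = None
    done = 0
    transitions = []
    while max_checks is None or done < max_checks:
        is_up = probe(host)
        if is_up != previous:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(stamp, host, "UP" if is_up else "DOWN", flush=True)
            transitions.append((stamp, is_up))
            previous = is_up
        done += 1
        sleep(interval_s)
    return transitions
```

Calling watch() with the myth box's address on another always-on machine and redirecting the output to a file gives a timestamp for each hang, which can then be compared against recording schedules or cron jobs.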

You never mention if you did an lspci -v to see if any IRQs are shared. If you don't find anything in the logs then you may have to resort to removing all installed cards and non-essential hard drives and booting up. Let that go overnight and see if it hangs. If not, then try adding one thing at a time to see if that is the issue. What hardware are your friends using? The fact that you and your friends all experience this makes me believe that you all have something in common pointing to a hardware related issue.
"Please ignore the man behind the curtain"
Dennis

Oh, and as for hardware similarities, anything that is similar is coincidental. We all put our systems together at different times from different sources. Two of us have AMD64 x2, and one has a Core2 Duo. All of us have nvidia graphics cards (either 5200 or 6200). I think all of us ended up with Crucial Ballistix memory, which might be the only common component between us (we all happen to work at Micron, so we have to help the stock price any way we can!)
The really strange part is that MD5 was rock-solid on my system for close to three months before my problems began. Now it locks up hard about every other day...
Thanks again for all your help.
-darren

No, I didn't run lspci. Thanks for pointing that out. It turns out that I *DO* have some IRQs that are shared. Is this a problem, and if so, what can I do about it? The first two (on IRQ 18) seem to be the same device perhaps? But the ones on IRQ 19 might be causing me a problem...
00:08.0 IDE interface: nVidia Corporation MCP61 SATA Controller (rev a2) (prog-if 85 [Master SecO PriO])
	Subsystem: Elitegroup Computer Systems Unknown device 2608
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 18
	I/O ports at 09f0 [size=8]
	I/O ports at 0bf0 [size=4]
	I/O ports at 0970 [size=8]
	I/O ports at 0b70 [size=4]
	I/O ports at d800 [size=16]
	Memory at fe02c000 (32-bit, non-prefetchable) [size=4K]
	Capabilities:
	Kernel driver in use: sata_nv
	Kernel modules: sata_nv

00:08.1 IDE interface: nVidia Corporation MCP61 SATA Controller (rev a2) (prog-if 85 [Master SecO PriO])
	Subsystem: Elitegroup Computer Systems Unknown device 2608
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 18
	I/O ports at 09e0 [size=8]
	I/O ports at 0be0 [size=4]
	I/O ports at 0960 [size=8]
	I/O ports at 0b60 [size=4]
	I/O ports at c400 [size=16]
	Memory at fe02b000 (32-bit, non-prefetchable) [size=4K]
	Capabilities:
	Kernel driver in use: sata_nv
	Kernel modules: sata_nv

01:04.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
	Subsystem: C-Media Electronics Inc CMI8738/C3DX PCI Audio Device
	Flags: bus master, stepping, medium devsel, latency 64, IRQ 19
	I/O ports at bc00 [size=256]
	Capabilities:
	Kernel driver in use: C-Media PCI
	Kernel modules: snd-cmipci

02:00.0 VGA compatible controller: nVidia Corporation NV44 [GeForce 6200 LE] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: eVga.com. Corp. Unknown device c297
	Flags: bus master, fast devsel, latency 0, IRQ 19
	Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
	Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Memory at fb000000 (64-bit, non-prefetchable) [size=16M]
	Capabilities:
	Kernel driver in use: nvidia
	Kernel modules: nvidia, nvidiafb
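
Picking shared IRQs out of a long lspci -v listing by eye is error-prone, so here is a rough sketch that groups devices by the IRQ shown on their "Flags:" line. It assumes the output has been saved as text in the usual lspci -v layout (a device line starting with the bus address, followed by its detail lines); the function name shared_irqs and the trimmed SAMPLE are made up for this example.

```python
import re
from collections import defaultdict

# A device line starts with a PCI address such as "00:08.0" (optionally "0000:00:08.0").
DEVICE_RE = re.compile(r"^(?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]\s")
IRQ_RE = re.compile(r"\bIRQ (\d+)")


def shared_irqs(lspci_text):
    """Return {irq: [device lines]} for every IRQ used by more than one device."""
    by_irq = defaultdict(list)
    current = None
    for line in lspci_text.splitlines():
        stripped = line.strip()
        if DEVICE_RE.match(stripped):
            current = stripped  # remember which device the next lines belong to
            continue
        if current and stripped.startswith("Flags:"):
            found = IRQ_RE.search(stripped)
            if found:
                by_irq[int(found.group(1))].append(current)
    return {irq: devs for irq, devs in by_irq.items() if len(devs) > 1}


# Trimmed-down copy of the listing above, for illustration only.
SAMPLE = """\
00:08.0 IDE interface: nVidia Corporation MCP61 SATA Controller (rev a2)
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 18
00:08.1 IDE interface: nVidia Corporation MCP61 SATA Controller (rev a2)
	Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 18
01:04.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
	Flags: bus master, stepping, medium devsel, latency 64, IRQ 19
02:00.0 VGA compatible controller: nVidia Corporation NV44 [GeForce 6200 LE] (rev a1)
	Flags: bus master, fast devsel, latency 0, IRQ 19
"""

if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print("IRQ", irq)
        for device in devices:
            print("   ", device)
```

Run against the full listing, this groups the two SATA controller functions under IRQ 18 and the sound card together with the video card under IRQ 19, which matches what I spotted by hand.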
me too
I also have a front end that keeps freezing and becoming completely unresponsive. As far as I can tell it is not a hardware issue, because I can run Knoppix on the system with no problems. The system freezes even when I am not running the front end, and it is a front end only, so no mythbackend process is running in the background. The backend server runs fine with no problems, but it is a backend only. I have tried reinstalling MD5, and the system will work for a little bit (as long as I clear the settings for the front end from the backend database), but once I change a few myth front end settings the system starts to freeze again. Could one of the front end settings change something in the OS to make it unstable?
Any help would be much appreciated.

Quote: I have tried to reinstall MD5 and the system will work for a little bit (as long as I clear the settings for the front end from the backend database) but once I change a few myth front end settings the system starts to freeze again.
Well, what is it that you're changing if it's just a few settings? Does it freeze if you don't make it a frontend? If it does, and your backend works fine like you say, then it's hardware related. What do the logs say?
"Please ignore the man behind the curtain"
Dennis
the issue occurs on other hardware
I don't believe it is hardware related, because the system that is now my dedicated backend was a front end/backend and it did this same thing until I reinstalled MD5 as a backend only (this is so I would always be able to record shows even if the front ends crash). As for the settings I change, I have noticed that it happens mostly after I configure the GUI (set the theme, resize the window, set my overscan, and things like that). That being said, if I clear the database in the backend, which would theoretically reset everything to the defaults, the issue does not go away. This issue has haunted me for the last few months, and I have reinstalled MD5 so many times on different hardware that I can almost do it in my sleep.
I will try reinstalling MD5 without making it a front end this weekend if I have time.

For shits and giggles, are these mythboxes updated using the latest mythtv RPMs? If so, try a fresh install without the updates and see if things still freeze up. Long shot, but I've seen it before.
"Please ignore the man behind the curtain"
Dennis