Frequent Crashes on OpenSuSE 13.2

I’m really not sure if this falls under the much-discussed “Native Mono Crashing”, but I do not believe it does. Nothing shows up in the kernel/console logs. It happens multiple times per day, every day. Sonarr doesn’t seem to be doing any particular thing when it happens; in this case it was pretty much idle. The system is an x86_64 OpenSuSE VM on vSphere 5.5.

Running Sonarr Version 2.0.0.3732

Crash example:

[Info] RssSyncService: Starting RSS Sync
[Info] DownloadDecisionMaker: Processing 200 releases
[Info] RssSyncService: RSS Sync Completed. Reports found: 200, Reports grabbed: 0
EPIC FAIL: System.NullReferenceException: Object reference not set to an instance of an object
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x0002f] in /home/abuild/rpmbuild/BUILD/mono-3.8.0/mcs/class/corlib/System.Threading/Timer.cs:328
  at System.Threading.Thread.StartInternal () [0x00016] in /home/abuild/rpmbuild/BUILD/mono-3.8.0/mcs/class/corlib/System.Threading/Thread.cs:691
[Fatal] GlobalExceptionHandlers: EPIC FAIL: Object reference not set to an instance of an object

System.NullReferenceException: Object reference not set to an instance of an object
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x0002f] in /home/abuild/rpmbuild/BUILD/mono-3.8.0/mcs/class/corlib/System.Threading/Timer.cs:328
  at System.Threading.Thread.StartInternal () [0x00016] in /home/abuild/rpmbuild/BUILD/mono-3.8.0/mcs/class/corlib/System.Threading/Thread.cs:691

[ERROR] FATAL UNHANDLED EXCEPTION: System.NullReferenceException: Object reference not set to an instance of an object
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x0002f] in /home/abuild/rpmbuild/BUILD/mono-3.8.0/mcs/class/corlib/System.Threading/Timer.cs:328
  at System.Threading.Thread.StartInternal () [0x00016] in /home/abuild/rpmbuild/BUILD/mono-3.8.0/mcs/class/corlib/System.Threading/Thread.cs:691

System info:

sabnzbd@neon:~> cat /etc/os-release
NAME=openSUSE
VERSION="13.2 (Harlequin)"
VERSION_ID="13.2"
PRETTY_NAME="openSUSE 13.2 (Harlequin) (x86_64)"
ID=opensuse
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:opensuse:13.2"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://opensuse.org/"
ID_LIKE="suse"
sabnzbd@neon:~> uname -a
Linux neon 3.16.7-29-desktop #1 SMP PREEMPT Fri Oct 23 00:46:04 UTC 2015 (6be6a97) x86_64 x86_64 x86_64 GNU/Linux
sabnzbd@neon:~> mono --version
Mono JIT compiler version 3.8.0 (tarball Fri Oct 10 15:00:11 UTC 2014)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       altstack
        Notifications: epoll
        Architecture:  amd64
        Disabled:      none
        Misc:          softdebug
        LLVM:          supported, not enabled.
        GC:            sgen
sabnzbd@neon:~> mediainfo --version
MediaInfo Command line,
MediaInfoLib - v0.7.73
sabnzbd@neon:~>

The exception looks very similar, so I suspect it’s a similar issue. It’s failing outside of Sonarr’s code, within mono, which could be an issue with mono itself or an incompatibility with the kernel, as we saw in the Native Mono Crashing threads.

First thing to do would be to upgrade mono to at least 3.10.
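A rough sketch of how that check could be automated, assuming the first line of `mono --version` keeps the "Mono JIT compiler version X.Y.Z" shape shown above. The function names and the `MIN_VERSION` constant are made up for illustration; they are not part of Sonarr or mono.

```python
import re

# Minimum version suggested in this thread (an assumption, not an official requirement).
MIN_VERSION = (3, 10)

def parse_mono_version(text):
    """Return the version found in `mono --version` output as a tuple of ints, or None."""
    match = re.search(r"Mono JIT compiler version (\d+(?:\.\d+)*)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))

def needs_upgrade(text, minimum=MIN_VERSION):
    """True when the reported version is below `minimum` or cannot be parsed."""
    version = parse_mono_version(text)
    return version is None or version < minimum
```

Comparing integer tuples rather than strings matters here: as text, "3.8" sorts after "3.10", which would give the wrong answer. The input could come from something like `subprocess.run(["mono", "--version"], capture_output=True, text=True).stdout`.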

I guess I’ll bring the system up to OpenSuSE Leap 42.1 and see what I end up with. I’d like to avoid a shoehorned-in non-Novell build of Mono on this box, and 3.8 is the latest I’m going to get on 13.2. Will update the thread with results.

All kinds of crashes still.

First one:

[Info] RssSyncService: RSS Sync Completed. Reports found: 200, Reports grabbed: 0
Stacktrace:


Native stacktrace:

        mono() [0x4ba7a8]
        mono() [0x51092e]
        mono() [0x42b358]
        /lib64/libpthread.so.0(+0xf890) [0x7fd5e96a9890]
        [0x41baaca3]

Debug info from gdb:


=================================================================
Got a SIGSEGV while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

Aborted

Attempted restart:

[Debug] Router: Console selected
[Info] OwinHostController: Listening on the following URLs:
[Info] OwinHostController:   http://*:8989/
[Debug] OwinAppFactory: Attaching NzbDroneVersionMiddleWare to host
[Debug] OwinAppFactory: Attaching SignalRMiddleWare to host
[Debug] OwinAppFactory: Attaching NancyMiddleWare to host
[Info] NancyBootstrapper: Starting NzbDrone API
[Trace] EventAggregator: Publishing ApplicationStartedEvent
[Trace] EventAggregator: ApplicationStartedEvent -> UserService
[Trace] EventAggregator: ApplicationStartedEvent <- UserService
[Trace] EventAggregator: ApplicationStartedEvent -> DownloadClientFactory
[Debug] DownloadClientFactory: Initializing Providers. Count 11
[Trace] EventAggregator: ApplicationStartedEvent <- DownloadClientFactory
[Trace] EventAggregator: ApplicationStartedEvent -> IndexerFactory
[Debug] IndexerFactory: Initializing Providers. Count 14
[Trace] EventAggregator: ApplicationStartedEvent <- IndexerFactory
[Trace] EventAggregator: ApplicationStartedEvent -> ReconfigureLogging
EPIC FAIL: System.NullReferenceException: Object reference not set to an instance of an object
  at System.Threading.EventWaitHandle.Reset () [0x00000] in <filename unknown>:0
  at (wrapper remoting-invoke-with-check) System.Threading.EventWaitHandle:Reset ()
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x00000] in <filename unknown>:0
  at System.Threading.Thread.StartInternal () [0x00000] in <filename unknown>:0
[Fatal] GlobalExceptionHandlers: EPIC FAIL: Object reference not set to an instance of an object

System.NullReferenceException: Object reference not set to an instance of an object
  at System.Threading.EventWaitHandle.Reset () [0x00000] in <filename unknown>:0
  at (wrapper remoting-invoke-with-check) System.Threading.EventWaitHandle:Reset ()
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x00000] in <filename unknown>:0
  at System.Threading.Thread.StartInternal () [0x00000] in <filename unknown>:0

[ERROR] FATAL UNHANDLED EXCEPTION: System.NullReferenceException: Object reference not set to an instance of an object
  at System.Threading.EventWaitHandle.Reset () [0x00000] in <filename unknown>:0
  at (wrapper remoting-invoke-with-check) System.Threading.EventWaitHandle:Reset ()
  at System.Threading.Timer+Scheduler.SchedulerThread () [0x00000] in <filename unknown>:0
  at System.Threading.Thread.StartInternal () [0x00000] in <filename unknown>:0

Attempted restart again:

[Info] OwinHostController:   http://*:8989/
[Debug] OwinAppFactory: Attaching NzbDroneVersionMiddleWare to host
[Debug] OwinAppFactory: Attaching SignalRMiddleWare to host
[Debug] OwinAppFactory: Attaching NancyMiddleWare to host
[Info] NancyBootstrapper: Starting NzbDrone API
[Trace] EventAggregator: Publishing ApplicationStartedEvent
[Trace] EventAggregator: ApplicationStartedEvent -> UserService
[Trace] EventAggregator: ApplicationStartedEvent <- UserService
[Trace] EventAggregator: ApplicationStartedEvent -> DownloadClientFactory
[Debug] DownloadClientFactory: Initializing Providers. Count 11
[Trace] EventAggregator: ApplicationStartedEvent <- DownloadClientFactory
[Trace] EventAggregator: ApplicationStartedEvent -> IndexerFactory
[Debug] IndexerFactory: Initializing Providers. Count 14
[Trace] EventAggregator: ApplicationStartedEvent <- IndexerFactory
[Trace] EventAggregator: ApplicationStartedEvent -> ReconfigureLogging
[Info] SceneMappingService: Updating Scene mappings
[Info] HousekeepingService: Running housecleaning tasks
[Info] Database: Vacuuming Log database
[Info] Database: Log database compressed
[Info] Database: Vacuuming Main database
[Info] Database: Main database compressed
[Info] SceneMappingService: Updating Scene mappings
[Info] RssSyncService: Starting RSS Sync
Stacktrace:


Native stacktrace:

Segmentation fault

It finally restarted on the 3rd attempt after the first crash.

New system info:

sabnzbd@neon:~> cat /etc/os-release
NAME="openSUSE Leap"
VERSION="42.1"
VERSION_ID="42.1"
PRETTY_NAME="openSUSE Leap 42.1 (x86_64)"
ID=opensuse
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:opensuse:42.1"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://opensuse.org/"
ID_LIKE="suse"
sabnzbd@neon:~> uname -a
Linux neon 4.1.13-5-default #1 SMP PREEMPT Thu Nov 26 16:35:17 UTC 2015 (49475c3) x86_64 x86_64 x86_64 GNU/Linux
sabnzbd@neon:~> mono --version
Mono JIT compiler version 4.0.4 (Stable 4.0.4.1/5ab4c0d Sun Oct 25 12:51:49 UTC 2015)
Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com
        TLS:           __thread
        SIGSEGV:       altstack
        Notifications: epoll
        Architecture:  amd64
        Disabled:      none
        Misc:          softdebug
        LLVM:          supported, not enabled.
        GC:            sgen
sabnzbd@neon:~> mediainfo --version
MediaInfo Command line,
MediaInfoLib - v0.7.73
sabnzbd@neon:~>

It’s probably worth noting this is my first attempt with Sonarr. I decided to give it a good look as a replacement for Sickrage. I’ve not had much experience with mono prior to this.

Another crash:

[Info] DownloadDecisionMaker: Processing 200 releases
[Info] RssSyncService: RSS Sync Completed. Reports found: 200, Reports grabbed: 0
* Assertion: should not be reached at monitor.c:721

Stacktrace:

  at <unknown> <0xffffffff>
  at System.Threading.Timer/Scheduler.SchedulerThread () <0x0009e>
  at System.Threading.Thread.StartInternal () <0x000bf>
  at (wrapper runtime-invoke) object.runtime_invoke_void__this__ (object,intptr,intptr,intptr) <0xffffffff>

Native stacktrace:

        mono() [0x4ba7a8]
        /lib64/libpthread.so.0(+0xf890) [0x7fcaa914a890]
        /lib64/libc.so.6(gsignal+0x37) [0x7fcaa8dc9187]
        /lib64/libc.so.6(abort+0x118) [0x7fcaa8dca538]
        mono() [0x62f159]
        mono() [0x62f35f]
        mono() [0x62f4a6]
        mono() [0x5ab72d]
        [0x40440b6e]

Debug info from gdb:


=================================================================
Got a SIGABRT while executing native code. This usually indicates
a fatal error in the mono runtime or one of the native libraries
used by your application.
=================================================================

Aborted

Unfortunately that looks like the same issue as mentioned in the other threads and is happening outside of Sonarr’s code. @Taloth do you have any thoughts on this one or are we at the mercy of the kernel + mono here?
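For anyone sorting through a pile of these logs, here is a minimal sketch that tells the crash flavours in this thread apart. The marker strings are copied from the excerpts above; the function name and the category labels are invented for illustration. Note that "managed" only means the failure surfaced as a .NET exception; in this thread those traces still end inside mono's own Timer code rather than Sonarr's.

```python
import re

# Marker patterns taken from the console output quoted in this thread.
NATIVE_SIGNAL = re.compile(r"Got a (SIG[A-Z]+) while executing native code")
RUNTIME_ASSERT = re.compile(r"^\* Assertion: .* at (\S+)", re.MULTILINE)
MANAGED_FATAL = re.compile(r"FATAL UNHANDLED EXCEPTION: ([\w.]+)")

def classify_crash(log_text):
    """Return a (kind, detail) pair describing the crash seen in log_text."""
    match = NATIVE_SIGNAL.search(log_text)
    if match:
        return ("native", match.group(1))
    match = RUNTIME_ASSERT.search(log_text)
    if match:
        return ("native", "assertion at " + match.group(1))
    match = MANAGED_FATAL.search(log_text)
    if match:
        return ("managed", match.group(1))
    # Bare shell message with no mono banner, as in the second restart above.
    if "Segmentation fault" in log_text:
        return ("native", "SIGSEGV")
    return ("unknown", "")
```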

Definitely the famed native crash… that or damaged memory.
On a VM it’s even worse.

Suggestions:

  1. run a memtest for 12h.
  2. compile the kernel yourself. (edit this line, replacing notrace with notrace noinline; be warned that I’m recalling this from about 6 months ago, so I could be totally wrong)
  3. try again
  4. give up (sry for the sarcasm)

Definitely not damaged memory. This is on an enterprise-grade blade with lots of other workload on it with no issues.

Was really hoping for a stable solution with a supported vendor kernel. Seems like mono really isn’t up to par. I guess I’ll continue with SickRage.

That seems like jumping to conclusions, because everything worked fine until a certain kernel update.
The point is that the problem happens on the boundary between mono and the kernel; on which side, I don’t know.
What I do know is that when I patched the kernel with a noinline somewhere, the symptoms went away. That doesn’t mean it fixed the underlying problem, of course.

But I suppose we’re done here.

I’ve got two TV management systems, both on OpenSuSE. Basically, we’ll call it one for my shows and one for my wife’s shows. I moved her server to Sonarr and don’t really want to move it back, because I really like the app aside from the stability issues. I think what I’ll do is put it on life support with a restart script and hope someone figures out what’s going on with these ‘external’ crashes.
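Something along these lines is what I have in mind for the restart script. It is only a sketch: the `supervise` function, its defaults, and the install path in the usage comment are all assumptions, and the `run`/`sleep` parameters exist purely so the loop can be exercised without actually launching mono.

```python
import subprocess
import time

def supervise(command, max_restarts=5, base_delay=5.0, max_delay=300.0,
              run=subprocess.call, sleep=time.sleep):
    """Run `command`, restarting it after an abnormal exit.

    Returns the last exit code. On POSIX, subprocess.call reports death by
    signal as a negative number (for example -11 for SIGSEGV).
    """
    restarts = 0
    while True:
        code = run(command)
        if code == 0:
            return code  # clean shutdown, so do not restart
        if restarts >= max_restarts:
            return code  # too many crashes in a row, give up
        # Exponential backoff so a crash loop does not hammer the box.
        delay = min(base_delay * (2 ** restarts), max_delay)
        sleep(delay)
        restarts += 1

# Hypothetical usage; the path below is an assumption, adjust to your install:
# supervise(["mono", "/opt/NzbDrone/NzbDrone.exe"])
```

The backoff and the restart cap are there because the log above shows two failed restarts in quick succession before the third one held.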
