Sonarr becomes unresponsive randomly

Sonarr version (exact version): 2.0.0.4916
Mono version (if Sonarr is not running on Windows): 5.0.1.1
OS: CentOS 7

Description of issue: Immediately following an update, Sonarr crashes and the web interface dies. If I stop the systemd unit (service), the web page immediately comes back with a 504 gateway timeout (I use nginx). A manual restart of the service is needed to get Sonarr going again. In the debug logs, this is the last entry before it pukes:

(Debug logs):

mono[31639]: Stacktrace:
mono[31639]: at <0xffffffff>

This is still happening, but I’m not finding any errors except the one occurrence below. The crashing does now appear to happen consistently during the scheduled refresh series events.

17-8-6 22:27:28.2|Trace|Owin|SQLite error (5): database is locked

The events continue past this error but then suddenly stop, and Sonarr quits logging.

Any insights?

Perhaps https://github.com/Sonarr/Sonarr/issues/1928

But without more logs it’s tough to say (the two lines don’t tell me much).

I thought that too… can I remove the Legacy section yet or is that still broken with indexers?

The issue I have with posting logs is that I can’t seem to find any errors… argh :frowning:

Still broken.

Capture the standard output/error to a file and check that; if mono is crashing, Sonarr usually won’t be able to log anything when it goes down.

Check the mono logs then, not Sonarr’s?

I added this to my systemd unit:

/bin/mono /apps/NzbDrone/NzbDrone.exe -nobrowser --data=/apps/data/.sonarr 2>/tmp/sonarr.mono.log

Hopefully, that’s the correct syntax and will do something :slight_smile:
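If systemd doesn’t interpret the 2> redirect on its own (I don’t believe ExecStart goes through a shell), my assumption is the command would need to be wrapped in one, roughly like this (paths are my setup):

ExecStart=/bin/sh -c '/bin/mono /apps/NzbDrone/NzbDrone.exe -nobrowser --data=/apps/data/.sonarr 2>/tmp/sonarr.mono.log'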

I’m using systemd as well, and journalctl has logs. Is this still something I should pursue? Running it manually, perhaps? The log file appears to contain the same output that journalctl gets.

Your other logging may already capture that crash, but it may depend on how the service handles the crash (I’m not sure it matters). You could run it outside of systemd (make sure you use the same user) and redirect the output, if that works.
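Roughly like this, reusing the paths from your unit; the sonarr user and unit names here are just guesses, so substitute whatever account and service name you actually use:

systemctl stop sonarr     # stop the service first so two instances don't fight over the database
sudo -u sonarr /bin/mono /apps/NzbDrone/NzbDrone.exe -nobrowser --data=/apps/data/.sonarr > /tmp/sonarr.console.log 2>&1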

I’ll try that. It crashed again since posting this last night.

If it helps, the service is running, but the web interface refuses to load. I’m looking through journalctl for any error indications in the meantime.

Finally… this appears to have crashed almost immediately after starting… I’ll run it again as the user and verify the service isn’t running as well, to make sure this isn’t false information here…

https://pastebin.com/rE2yYY5g

@markus101

Here you go sir! I got a nice crash dump for you guys running it manually.

https://pastebin.com/VEU0cZtc

Thanks for looking into this!

Looks to me like memory corruption, caused either by damaged physical memory or by a native library corrupting memory.

I’d suggest you first do an extended memory test overnight, if possible. AFAIK most Linux distros have memtest86 in their bootloader options.
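On CentOS 7, something along these lines should get you a boot-menu entry if one isn’t there already (BIOS boot; UEFI needs the separate memtest86 UEFI image). Package and helper names are from memory, so double-check them:

grep -i memtest /boot/grub2/grub.cfg      # is there already an entry?
yum install memtest86+                    # if not, install it
memtest-setup                             # add the GRUB entry
grub2-mkconfig -o /boot/grub2/grub.cfg    # regenerate the config, then reboot into it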

I can certainly try that, but I’m not noticing corruption in any other services and/or my virtual machines.

Any idea which library might be causing the corruption? One from Mono, I assume?

Just-in-time-compiler-based frameworks such as mono are far more susceptible to small memory corruptions, and we’ve seen it in the past.
I’m not saying this is the cause; in fact, I think the chance is a small one. But it’s easily tested by running a >8h memtest (hence ‘overnight’), and if we can exclude it from the possibilities, we save ourselves a wild goose chase.

One of the mono libs, or libmediainfo, or a myriad of other libraries that get loaded. Could even be a bug in mono. But first things first.

Also, at some point we do need to get a full trace log, so we know exactly what Sonarr/mono was doing in the minutes before the crash.
Understand that a ‘crash’ log is post-mortem and often insufficient: it’ll only tell us how the application died (hit by a blunt object), not what the blunt object was or who swung it.
So we ‘detectives’ need as much information as possible.

Understood. I’m running the memtest now.

I usually have trace logging on. Do I need to run it manually like before with different parameters or just zip you the log files?

Yup, just zip it, but it needs to go together with the console log file from the same time period, coz we have to try to correlate the two.
The Sonarr logs are generated by the ‘managed’ part, so they won’t actually contain the native crash. That’s the nasty part of these kinds of things.
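Something like this would bundle them; the paths are taken from earlier in this thread, so adjust them to wherever your data directory and console log actually live:

zip -r sonarr-crash-logs.zip /apps/data/.sonarr/logs /tmp/sonarr.mono.log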

Will do. Next time I notice the crash, I’ll zip them all together and let you know.

Thanks guys!

Taloth,

I let the memory scan run for almost 10 hours with a 100% pass.

I moved the old trace logs from previous runs to a temp folder, so Sonarr is starting a new one now. Is there anything I could search for in the old logs to assist?

Great, tnx.

Regarding the old logs, probably not, coz they’ve rolled over by now. When it crashes again, zip all the logs together.

What you could do in the meantime is list the version of libmediainfo and libsqlite3.
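On CentOS 7 the quickest checks are probably something like this (package names are a guess and may differ depending on where the libs came from):

rpm -q libmediainfo sqlite                                                 # packaged versions
ldconfig -p | grep -E 'mediainfo|sqlite3'                                  # what the linker sees
grep -E 'mediainfo|sqlite' /proc/$(pgrep -f NzbDrone.exe | head -1)/maps   # libs mapped into the running process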