Sonarr downloading duplicate episodes that aren't upgrades, sometimes exact duplicate NZB

Sonarr version (exact version): 3.0.1.522
Mono version (if Sonarr is not running on Windows):
OS: Windows 7
Debug logs: https://pastebin.com/tK6EBtNk (just info log, sorry - will get a debug log next time!)
(Make sure debug logging is enabled in settings and post the full log to hastebin/pastebin/dropbox/google drive or something similar, do not post them directly here. Post in .txt not .doc, .rtf or some other formatted document)
Description of issue: below

screenshots: https://imgur.com/a/Hz1wacg

I’m new to Sonarr (moved from Sickbeard) and have only been playing with it for a couple of weeks. I love it so far, but right off the bat I started having issues with it grabbing duplicate episodes. Sometimes it grabs a different NZB but it’s not a quality upgrade nor any higher ranking on my preferred terms list and I just can’t figure out why it downloaded it. Sometimes it’s the EXACT same NZB in which case SABnzbd pauses it automatically because it recognizes it’s a duplicate. Sometimes it even realizes after grabbing that it’s no different and won’t import it. (E.g. the screenshot where it says “Not an upgrade for existing episode file(s)”… which was true, but raises the question of why it grabbed it in the first place.)

Unfortunately I’ve been too lazy to grab screenshots and logs (partly because I kept tinkering with my settings, figuring I just needed to adjust something or find something I’d done wrong). I got one right now, but I had never turned on debugging so it’s just an info log. I figured I’d still post while I have a few minutes, and now that I’ve turned debugging on, I’ll grab a debug log the next time it happens.

(Side note, you may notice in my Preferred Terms list a couple of them have a 0 score - I’m aware this does nothing, they’re just there from experimenting with different scoring and I left them there with a 0 temporarily as a sort of reminder of what I was playing with.)

Screenshots show one instance where it grabbed a duplicate of ‘Jane the Virgin’ but then didn’t import it because it was “not an upgrade”. Didn’t grab any logs for that. And then screenshots from today where it grabbed and downloaded “The Crystal Maze”, and then shortly grabbed the EXACT same NZB again. And screenshots of my settings.

Apologies about not having debug logs yet, but like I said I just wanted to get this all down while I have time, and I’ll get debug logs next time!

Okay so last night it tried to get another one that SABnzbd paused as a duplicate, The Crystal Maze 2017 S06E02 once again. (Not the exact same NZB this time, but probably the exact same file within.) Seems to be the same quality/word score as the original, so I can’t see why it grabbed it.

Debug logs:

  1. In this one it considers what seems to be the same NZB at 2:04am and 2:20am, both times deciding not to grab it.
    https://www.dropbox.com/s/kja4b34yf671pqv/sonarr.debug.21.txt

  2. In this one it considers it at 2:36am and 2:52am and still doesn’t grab it, but at 3:07am it DOES decide to grab it. (But then SABnzbd pauses it as a duplicate.)
    https://www.dropbox.com/s/sugwt8yn94y2lit/sonarr.debug.20.txt

Two more now…

  1. Big Little Lies - S02E04
    Debug log - at 7:50pm grabs a WEBDL-1080p, at 8:06pm grabs the exact same NZB again (SABnzbd pauses it as a duplicate)
    (Later it grabs a 720p but that’s expected behavior as 720p is ranked higher.)


  2. Last Week Tonight with John Oliver - S06E17
    Debug log 1 - at 9:10pm grabs a WEBDL-720p
    https://www.dropbox.com/s/zfc6lh76e8y1jdu/sonarr.debug.2.txt

    Debug log 2 - at 9:26pm grabs the exact same NZB again (SABnzbd pauses it as a duplicate)


Thanks for all the details. I’m having a hard time pinning down the exact issue. I believe it’s related to preferred words, but the current logging doesn’t confirm that.

The latest build adds more logging when comparing the quality, language and preferred word score for the release to the existing file, which I hope will illuminate the issue. It’d be great if you could upgrade to 3.0.1.526 and post the logs when you run into the issue again.

I’ve seen the exact same behavior recently. In my case, the titles look VERY similar to the first example shared above, where the only difference between one NZB title and the “new” one is “.1” (e.g., H.264.1) in the NZB name. Double checking after a duplicate file is downloaded, the scores for the two files are also identical.

I make extensive use of “Preferred” settings. I’ve upgraded and will also provide the requested information.

Okay, got one with the new logs (not sure if this was 3.0.1.526 or 3.0.1.527 - I see that Sonarr updated to .527 but I don’t know if that was before or after this). This was the morning of July 1st but I didn’t have time to post until now.

It appears 19-7-1 09:29:22.0 is the time when it downloaded the duplicate. (Last.Week.Tonight.with.John.Oliver.S06E17.720p.WEB-DL.AAC2.0.H.264-doosh) But 19-7-1 09:13:47.2 (a little bit earlier) is of interest too because it appears to consider the exact same NZB and that time it rejected it - but the next time it accepted it.

In both cases it appears to check the word score three times - the first time it thinks the “new” item has a higher score, the second two times it thinks they’re the same (both 1).

Looks like you were on the new version, since the added logging shows up:

19-7-1 09:13:47.3|Debug|UpgradableSpecification|Comparing preferred word score. Current: 0 New: 1

So the existing file is 0, but the new one has a score of 1, so definitely your preferred words. To calculate the preferred word score for an existing file Sonarr uses the “scene name” (used when importing) or the file name.

Do you have SAB replacing periods with spaces? That could lead to Sonarr incorrectly rejecting the release name as the scene name.


(Config: Switches)

If that’s enabled please disable and see if the issue persists.

The SAB “replace dots in folder name” setting was enabled, so I disabled it a few days ago. Didn’t have much to download the last few days so it wasn’t put to the test, but it did it again tonight so I guess that didn’t make a difference.


“The Crystal Maze” at 19-7-6 22:47:53.3 and 19-7-6 22:48:10.5 in debug.2a.txt show it accepting the exact same NZB that it downloaded successfully a little earlier (in debug.3a.txt).

I notice sometimes it seems to check the preferred word score twice (or more), and the second time it gets the current one correct. For example (this was on a Crystal Maze that it DID correctly reject):

========================

19-7-6 23:50:47.2|Debug|UpgradableSpecification|Comparing preferred word score. Current: 0 New: -1

19-7-6 23:50:47.2|Debug|UpgradableSpecification|Existing item has a better preferred word score, skipping

19-7-6 23:50:47.2|Debug|HistorySpecification|Performing history status check on report

19-7-6 23:50:47.2|Debug|HistorySpecification|Checking current status of episode [9655] in history

19-7-6 23:50:47.2|Debug|UpgradableSpecification|Comparing preferred word score. Current: 1 New: -1

19-7-6 23:50:47.2|Debug|UpgradableSpecification|Existing item has a better preferred word score, skipping

======================

The first time it says the current file is a 0, but it actually should be 1. The second time it says the current file is a 1, which is correct. I’ve seen it do this on some of the ones where it downloads a duplicate, too, where it first says “Current: 0 New: 1” and then “Current: 1 New: 1”. For example (from a previous log where it downloaded a duplicate of “Last Week Tonight with John Oliver”):

======================

19-7-1 09:29:22.1|Debug|UpgradableSpecification|Comparing preferred word score. Current: 0 New: 1

19-7-1 09:29:22.1|Debug|UpgradableSpecification|New item has a better preferred word score

19-7-1 09:29:22.1|Debug|DelaySpecification|Quality and language is highest in profile for preferred protocol, will not delay

19-7-1 09:29:22.1|Debug|HistorySpecification|Performing history status check on report

19-7-1 09:29:22.1|Debug|HistorySpecification|Checking current status of episode [6926] in history

19-7-1 09:29:22.1|Debug|UpgradableSpecification|Comparing preferred word score. Current: 1 New: 1

19-7-1 09:29:22.1|Debug|UpgradableSpecification|Existing item meets cut-off. skipping.

19-7-1 09:29:22.1|Debug|UpgradableSpecification|Comparing preferred word score. Current: 1 New: 1

19-7-1 09:29:22.1|Debug|UpgradableSpecification|Existing item has a better preferred word score, skipping

19-7-1 09:29:22.1|Debug|DownloadDecisionMaker|Release accepted

=====================

Maybe that could have something to do with it? What makes it able to get the current word score correct the second time? And why doesn’t that prevent it from grabbing it?
In that last excerpt, twice it said it was skipping (once because it met the cut-off and once because the existing item had a better word score) but then it accepted it anyway. That seems weird.
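
For anyone digging through their own debug logs for the same pattern, here is a rough sketch of a script that pulls out the "Comparing preferred word score" lines and flags timestamps where more than one "Current" value was logged. It assumes the pipe-delimited line format shown in the excerpts above. Grouping by timestamp is only a heuristic (several releases can be evaluated within the same tenth of a second), and the function names are made up for this example.

```python
import re

# Matches the message part of lines such as:
# 19-7-1 09:29:22.1|Debug|UpgradableSpecification|Comparing preferred word score. Current: 0 New: 1
SCORE_RE = re.compile(r"Comparing preferred word score\. Current: (-?\d+) New: (-?\d+)")


def score_comparisons(lines):
    """Yield (timestamp, current, new) for each preferred word comparison."""
    for line in lines:
        parts = line.strip().split("|", 3)
        if len(parts) != 4:
            continue  # not a pipe-delimited log line
        timestamp, _level, _logger, message = parts
        match = SCORE_RE.search(message)
        if match:
            yield timestamp, int(match.group(1)), int(match.group(2))


def inconsistent_current_scores(lines):
    """Map each timestamp to its 'Current' scores when more than one value was logged."""
    seen = {}
    for timestamp, current, _new in score_comparisons(lines):
        seen.setdefault(timestamp, set()).add(current)
    return {ts: sorted(scores) for ts, scores in seen.items() if len(scores) > 1}


# Lines copied from the excerpt above.
sample = [
    "19-7-1 09:29:22.1|Debug|UpgradableSpecification|Comparing preferred word score. Current: 0 New: 1",
    "19-7-1 09:29:22.1|Debug|UpgradableSpecification|New item has a better preferred word score",
    "19-7-1 09:29:22.1|Debug|UpgradableSpecification|Comparing preferred word score. Current: 1 New: 1",
    "19-7-1 09:29:22.1|Debug|DownloadDecisionMaker|Release accepted",
]

print(inconsistent_current_scores(sample))  # {'19-7-1 09:29:22.1': [0, 1]}
```

To run it against a real log, pass an open file instead of `sample`, for example `inconsistent_current_scores(open("sonarr.debug.txt", encoding="utf-8"))` (the file name here is just a placeholder).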

Side note, it would be handy if the Logs page could show the last write time (as opposed to just the date) even after it’s the next day - it would make it easier to find the right log.


I missed this last time, but SAB is truncating folder names in a way that leaves them partially readable, and I wonder if that’s the culprit. It could also explain why you’re not seeing the issue constantly. That truncation is controlled by folder_max_length in SAB’s Switches settings.

I can’t think of any reason why the preferred word score for the file could (or would) change, assuming those are the same release (the truncated logs don’t tell us much).

If you raise the log level to Trace, Sonarr will log “Calculating preferred word score for {0}” (where {0} is the title), so we can see exactly what it’s using to calculate the score, followed by “Calculated preferred word score for '{0}': {1}” (where {0} is the title and {1} is the score). That’ll take the guesswork out of this issue.

I just noticed that too the other day (the same day, funny enough), but I didn’t have time to post. I noticed it in the History, on the Episode Imported info - the “Name” was the same as the folder in the Source, but the folder itself (and hence the Name) was shorter than the actual file name - and that difference meant it was missing key words that would for sure affect the preferred word score.


When I saw that, I was going to ask if it would be possible to have it calculate based on both folder and file name, but if they don’t agree, trust the higher one.
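
To make the idea concrete, here is a minimal sketch of "score both names and trust the higher one". It is not how Sonarr works internally. The term list is hypothetical (my real list is in the screenshots), the matching is a plain case-insensitive substring check, which is an assumption, and the folder name is simulated by cutting the release name from the earlier log to 48 characters.

```python
# Hypothetical preferred terms and scores, for illustration only.
PREFERRED_TERMS = {"WEB-DL": 1, "HDTV": -1}


def preferred_word_score(name, terms):
    """Sum the scores of every term found in name (case-insensitive substring match)."""
    lowered = name.lower()
    return sum(score for term, score in terms.items() if term.lower() in lowered)


def best_score(folder_name, file_name, terms):
    """Score both names and keep the higher result, as suggested above."""
    return max(
        preferred_word_score(folder_name, terms),
        preferred_word_score(file_name, terms),
    )


file_name = "Last.Week.Tonight.with.John.Oliver.S06E17.720p.WEB-DL.AAC2.0.H.264-doosh"
folder_name = file_name[:48]  # simulate a folder name cut off at 48 characters

print(preferred_word_score(folder_name, PREFERRED_TERMS))  # 0, the term was cut off
print(preferred_word_score(file_name, PREFERRED_TERMS))    # 1
print(best_score(folder_name, file_name, PREFERRED_TERMS))  # 1
```

One trade-off with taking the maximum: if a negatively scored term only appears in the longer name, the higher score would hide it, so a real implementation might prefer whichever name is longer or more complete instead.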

At the time I had no guess as to why the folder name would be truncated. But based on your post, I checked SAB’s folder_max_length setting and yep, it was set to only 48. Then I vaguely remembered shortening it in the past when I was having some problem with certain downloads tripping over Windows character limit for paths. It was probably some years ago, there’s no way I would have remembered that on my own!

So at that point I was betting that was the cause of the problem. But before I changed that, first I upped Sonarr to trace logging just to see what we’d get there.

And indeed Sonarr is (sometimes) calculating using the truncated folder name - but not always, which I don’t understand. For example, in this section it appears to use “Jane.the.Virgin.S05E15.Chapter.Ninety-Six.720p.A” to calculate 0 (which is ostensibly correct because SAB’s short path cut off the preferred words), but then uses “Jane.the.Virgin.S05E15.Chapter.Ninety-Six.720p.AMZN.WEB-DL.DDP5.1.H.264-KiNGS-xpost” to (correctly) calculate 5.


19-7-11 12:36:51.7|Trace|PreferredWordService|Calculating preferred word score for 'Jane.the.Virgin.S05E15.Chapter.Ninety-Six.720p.A'
19-7-11 12:36:51.7|Trace|PreferredWordService|Calculated preferred word score for 'Jane.the.Virgin.S05E15.Chapter.Ninety-Six.720p.A': 0
19-7-11 12:36:51.7|Trace|ConfigService|Using default config value for 'downloadpropersandrepacks' defaultValue:'PreferAndUpgrade'
19-7-11 12:36:51.7|Debug|UpgradableSpecification|Comparing preferred word score. Current: 0 New: 1
19-7-11 12:36:51.8|Debug|UpgradableSpecification|New item has a better preferred word score
19-7-11 12:36:51.8|Debug|HistorySpecification|Performing history status check on report
19-7-11 12:36:51.8|Debug|HistorySpecification|Checking current status of episode [1268] in history
19-7-11 12:36:51.8|Trace|PreferredWordService|Calculating preferred word score for 'Jane.the.Virgin.S05E15.Chapter.Ninety-Six.720p.AMZN.WEB-DL.DDP5.1.H.264-KiNGS-xpost'
19-7-11 12:36:51.8|Trace|PreferredWordService|Calculated preferred word score for 'Jane.the.Virgin.S05E15.Chapter.Ninety-Six.720p.AMZN.WEB-DL.DDP5.1.H.264-KiNGS-xpost': 5
19-7-11 12:36:51.8|Debug|UpgradableSpecification|Comparing preferred word score. Current: 5 New: 1


(This particular excerpt isn’t where it downloaded a duplicate. It did get duplicates of both Jane the Virgin and a House Hunters that day, but either I’m unable to find those moments in the logs, or I saved the wrong logs the other day. Hopefully this still shows useful info.)

I’ve since changed SAB’s folder_max_length to 128, but it probably won’t get a good test case to make sure that helped until maybe Sunday night. But I’m optimistic.
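
As a sanity check that the old setting explains the short name in the trace log, here is a small sketch that cuts the full release name to a maximum length. It assumes the truncation is a plain cut at the character limit, which matches the name in the log but is not taken from SAB's source; `truncate_folder_name` is a made-up helper.

```python
def truncate_folder_name(name, max_length):
    """Cut a folder name to at most max_length characters (plain slicing)."""
    return name[:max_length]


release = "Jane.the.Virgin.S05E15.Chapter.Ninety-Six.720p.AMZN.WEB-DL.DDP5.1.H.264-KiNGS-xpost"

# With the old limit of 48, everything after "720p.A" is lost, including "AMZN" and "WEB-DL".
print(truncate_folder_name(release, 48))
# Jane.the.Virgin.S05E15.Chapter.Ninety-Six.720p.A

# With the default of 128, this release name fits and is left untouched.
print(truncate_folder_name(release, 128) == release)  # True
```

The 48-character result is exactly the name Sonarr scored as 0 in the trace excerpt above.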

Although even if that solves my particular problem at the moment, still it might be ideal to avoid this scenario somehow, in case someone needs a shorter path setting for whatever reason and then they get tripped up by this?

Question - if I were to change folder_max_length in SAB to HIGHER than the default of 128 (say 200 maybe) - to try to make sure any particularly long NZB names would never get truncated in this way (and since Windows 10 now allows disabling the 260 path limit) - would that backfire on me later in some other way? Can Sonarr support long paths?

Being unable to get the correct scene name is going to be a problem regardless, including in this case. I’m not sure there is much we can do there, though I’ll make a note of it.

Should be fine. Sonarr supports importing from long paths, but it can’t import to some file names because they exceed the 255 character limit for a single path section. That only tends to happen on multiple-episode files with long episode titles.
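
If you want to check a destination path against that per-section limit yourself, a rough sketch follows. It counts characters in each segment of a Windows-style path. Some file systems count bytes rather than characters, so treat it as an approximation; `overlong_components` is a hypothetical helper, not part of Sonarr.

```python
from pathlib import PureWindowsPath

MAX_COMPONENT_LENGTH = 255  # limit for a single path section, as mentioned above


def overlong_components(path, limit=MAX_COMPONENT_LENGTH):
    """Return the path segments (folders or file name) longer than limit characters."""
    return [part for part in PureWindowsPath(path).parts if len(part) > limit]


# A made-up multi-episode file name with a very long title portion.
long_name = "Show - S01E01-E02 - " + "Very Long Episode Title " * 12 + ".mkv"
path = "D:\\TV\\Show\\Season 01\\" + long_name

print(len(long_name) > MAX_COMPONENT_LENGTH)       # True
print(overlong_components(path) == [long_name])    # True
print(overlong_components("D:\\TV\\Show\\ok.mkv"))  # []
```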