RssSync extremely slow with large number of series

I’ve got a setup where I’m using 4 indexers: a private one, Spotweb, AnimeZB and Fanzub. Every time Drone checks the RSS feeds it has to check 325 reports, and they’re processed pretty slowly, often taking at least 5 seconds per report.

It got me thinking: what if we could use certain indexers only for manual searches? That way I could keep Drone feeling snappy while at the same time still having all the indexers at my disposal if something fails to be picked up, which should be rare if your primary indexers are any good.
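To make the idea concrete, here is a minimal sketch of what per-indexer usage flags could look like. The `Indexer` class, the `use_for_rss` and `use_for_search` flags, and the `indexers_for` helper are all hypothetical names made up for illustration; nothing here reflects how Drone actually stores or selects indexers.

```python
from dataclasses import dataclass


@dataclass
class Indexer:
    name: str
    use_for_rss: bool      # hypothetical flag: include in the periodic RSS sync
    use_for_search: bool   # hypothetical flag: include in manual searches


def indexers_for(indexers, task):
    """Return the indexers enabled for the given task ('rss' or 'search')."""
    if task == "rss":
        return [i for i in indexers if i.use_for_rss]
    if task == "search":
        return [i for i in indexers if i.use_for_search]
    raise ValueError(f"unknown task: {task}")


# The four indexers from the post; which ones are "primary" is an assumption.
indexers = [
    Indexer("Private", use_for_rss=True, use_for_search=True),
    Indexer("Spotweb", use_for_rss=True, use_for_search=True),
    Indexer("AnimeZB", use_for_rss=False, use_for_search=True),
    Indexer("Fanzub", use_for_rss=False, use_for_search=True),
]

rss_names = [i.name for i in indexers_for(indexers, "rss")]
search_names = [i.name for i in indexers_for(indexers, "search")]
print("RSS sync:", rss_names)
print("Manual search:", search_names)
```

With flags like these, the RSS sync would only see reports from the primary indexers, while a manual search could still fall back on all four.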

This is how we want to deal with indexer prioritization, instead of checking one indexer at a time: https://trello.com/c/gwz5wiAo/350-indexer-prioritization

Anything that makes Drone not parse reports back to back all day while still allowing me to have plenty of indexers works for me. :slight_smile: Thanks.

Euh, my RSS feed parses around 300 reports on my develop setup in just a couple of seconds tops… total.

I’m really wondering where those 5 seconds/report come from. Sounds like you have a huge performance hog somewhere.

Could you first confirm by comparing the timestamps on the “Starting RSS Sync” and “RSS Sync Completed. Reports found: xxx, Reports downloaded: 0” log lines?

If the duration is confirmed, please enable Trace loglevel, trigger an RSS Sync and gather the relevant log file lines. Maybe that will give an indication of what’s making it so terribly slow.

Just grabbing the first one I can find from a recent log. I think I’ve seen worse but this one seems representative.

14-8-6 08:15:30.2|Info|RssSyncService|Starting RSS Sync
14-8-6 08:15:31.5|Info|DownloadDecisionMaker|Processing 325 reports

14-8-6 08:30:15.2|Info|RssSyncService|RSS Sync Completed. Reports found: 325, Reports grabbed: 1

2.7 seconds per report. :stuck_out_tongue:
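For anyone who wants to check their own logs the same way, here is a rough sketch that works out the per-report time from the “Starting RSS Sync” and “RSS Sync Completed” lines. It assumes the timestamp layout seen in this log (two-digit year, month, day, then time with fractional seconds) and the pipe-separated fields; the function names are made up for this example.

```python
from datetime import datetime

# Assumed layout, based on the excerpt: "14-8-6 08:15:30.2" read as yy-m-d H:M:S.f
TIMESTAMP_FORMAT = "%y-%m-%d %H:%M:%S.%f"


def parse_line(line):
    """Split a 'timestamp|level|logger|message' line into (datetime, message)."""
    stamp, _level, _logger, message = line.split("|", 3)
    return datetime.strptime(stamp, TIMESTAMP_FORMAT), message


def seconds_per_report(start_line, completed_line):
    """Return (total_seconds, report_count, seconds_per_report) for one sync."""
    started, _ = parse_line(start_line)
    finished, message = parse_line(completed_line)
    # The message looks like "RSS Sync Completed. Reports found: 325, Reports grabbed: 1"
    count = int(message.split("Reports found:")[1].split(",")[0])
    total = (finished - started).total_seconds()
    return total, count, total / count


total, count, per_report = seconds_per_report(
    "14-8-6 08:15:30.2|Info|RssSyncService|Starting RSS Sync",
    "14-8-6 08:30:15.2|Info|RssSyncService|RSS Sync Completed. "
    "Reports found: 325, Reports grabbed: 1",
)
print(f"{total:.0f} s for {count} reports = {per_report:.1f} s per report")
```

This measures from the start of the sync, so it includes the short time spent fetching the feeds before processing begins.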

A bit later it goes a bit faster, but that seems to be because Fanzub screwed something up:

14-8-6 08:52:05.6|Info|RssSyncService|Starting RSS Sync
14-8-6 08:52:05.9|Error|FetchFeedService|An error occurred while processing feed. https://fanzub.com/rss/?cat=anime&max=100
System.Xml.XmlException: Text node cannot appear in this state. Line 1, position 7.
at Mono.Xml2.XmlTextReader.ReadText (Boolean notWhitespace) [0x00000] in :0
at Mono.Xml2.XmlTextReader.ReadContent () [0x00000] in :0
at Mono.Xml2.XmlTextReader.Read () [0x00000] in :0
at System.Xml.XmlTextReader.Read () [0x00000] in :0
at Mono.Xml.XmlFilterReader.Read () [0x00000] in :0
at Mono.Xml.XmlFilterReader.Read () [0x00000] in :0
at System.Xml.Linq.XElement.LoadCore (System.Xml.XmlReader r, LoadOptions options) [0x00000] in :0
at System.Xml.Linq.XNode.ReadFrom (System.Xml.XmlReader r, LoadOptions options) [0x00000] in :0
at System.Xml.Linq.XContainer.ReadContentFrom (System.Xml.XmlReader reader, LoadOptions options) [0x00000] in :0
at System.Xml.Linq.XDocument.ReadContent (System.Xml.XmlReader reader, LoadOptions options) [0x00000] in :0
at System.Xml.Linq.XDocument.LoadCore (System.Xml.XmlReader reader, LoadOptions options) [0x00000] in :0
at System.Xml.Linq.XDocument.Load (System.Xml.XmlReader reader, LoadOptions options) [0x00000] in :0
at System.Xml.Linq.XDocument.Load (System.Xml.XmlReader reader) [0x00000] in :0
at NzbDrone.Core.Indexers.RssParserBase.Process (System.String xml, System.String url) [0x00000] in :0
at NzbDrone.Core.Indexers.FetchFeedService.Fetch (IIndexer indexer, IEnumerable`1 urls) [0x00000] in :0
14-8-6 08:52:07.2|Info|DownloadDecisionMaker|Processing 225 reports
14-8-6 09:01:03.2|Info|RssSyncService|RSS Sync Completed. Reports found: 225, Reports grabbed: 0

And another one:

14-8-6 09:17:11.5|Info|RssSyncService|Starting RSS Sync
14-8-6 09:18:48.7|Info|DownloadDecisionMaker|Processing 325 reports
14-8-6 09:32:00.8|Info|RssSyncService|RSS Sync Completed. Reports found: 325, Reports grabbed: 0

I mean, Drone is running on a NAS and as such it hasn’t got that much hardware available (Intel Atom D2700 @ 2.13GHz with 3GB of RAM), but during processing I never see my CPU usage go above 40% (usually it’s steady around 30%), nor is my RAM anywhere near full. Most of the time I don’t hear my hard drives working either.

The following are the results of top while not syncing and while syncing, in that order:

The biggest difference between your setup and mine will most likely be this:

Monitored: 399
Unmonitored: 0

Yeah fanzub is down atm.

Could you pastebin me the complete tracelog for one rss sync?

It definitely might have to do with the number of monitored shows. I believe we could still optimize that a bit, but only if a significant portion of those rss releases are about series you have added.

It’s too big for Pastebin. :smiley:

I uploaded it to my own server instead: log.

Edit: I should add that not all of the reports take seconds to process. Sometimes it races through a bunch of reports in the wink of an eye, only to get stuck on one afterwards. Most reports do seem to take longer than a few seconds each though.

Could you disable FailedDownloadHandling and then trigger the RSS Sync?
You can enable it again afterwards.

I’m not saying FailedDownloadHandling is the culprit. I want to prevent it from running the blacklist check, which is only done if FDH is enabled.

Oh, and knowing how many pages you have in History->Blacklist would also be useful.

There are 6 pages in my blacklist. I turned FDH off, turned Trace logging back on and ran another sync, no noticeable difference in processing time: log file.

Let me know if there is more info I can give you, or poke me on IRC if you’d like direct access to my setup.

For the record: the issue was caused by a bug in the database layer, which resulted in a really inefficient db call on every release that had a known series.

I had this problem of slow RSS syncs too, but I thought it was because of the number of shows I had. Now, after upgrading to the latest develop version, the RSS sync is so fast. Thanks man.