(Disclaimer: possibly one of my longest posts ever; you may want to scroll to the bottom and only look at the pictures.)

So, what’s new in SD since last week?

Fetching everything

From http://bugs.freedesktop.org/, it was possible to download all the bugs I reported there (that means 2…), but trying to download those reported by Julien led to a crash without any trace. Dichotomy FTW: by bisecting the list, I came up with a bunch of bugs which would lead to the same result, which confirmed that the number of bugs processed at once wasn’t the issue.

Playing around with strace, exchanges with the remote server looked fine, so I suspected an issue in the client-side SOAP machinery. So let’s enable tracing:

use SOAP::Lite +trace => 'all';

Tada! The issue was indeed local: malformed XML was received, leading to a die call issued from the XML-RPC layer. Thankfully, one can set up a fault handler, which I used to display the range of bug IDs which triggered the failure, so that people can look into it and determine what to add to the blacklist until the underlying issue’s been investigated (presumably, bugzilla’s at fault).
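Here’s a minimal sketch of the “report the failing range and carry on” idea, in plain Perl. It doesn’t use the SOAP::Lite fault handler itself: fetch_bugs is a hypothetical stand-in for the real network call, and fetch_in_chunks and the chunk size are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real network call: dies on a "bad" bug ID,
# the way the XML-RPC layer dies when it receives malformed XML.
sub fetch_bugs {
    my ($ids, $bad) = @_;
    for my $id (@$ids) {
        die "malformed XML for bug $id\n" if $bad->{$id};
    }
    return [ map { +{ id => $_ } } @$ids ];
}

# Fetch IDs chunk by chunk; when a chunk fails, record its ID range
# instead of aborting the whole run.
sub fetch_in_chunks {
    my ($ids, $chunk_size, $bad) = @_;
    my (@bugs, @failed_ranges);
    my @queue = @$ids;
    while (my @chunk = splice(@queue, 0, $chunk_size)) {
        my $result = eval { fetch_bugs(\@chunk, $bad) };
        if ($result) {
            push @bugs, @$result;
        }
        else {
            warn "Failed on bug IDs $chunk[0]..$chunk[-1]: $@";
            push @failed_ranges, [ $chunk[0], $chunk[-1] ];
        }
    }
    return (\@bugs, \@failed_ranges);
}
```

The failed ranges can then be narrowed down by hand (or by calling the same function again with a smaller chunk size) to find the IDs worth blacklisting.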

Speeding things up

With that blacklisting of buggy bugs, one can then try to move to other queries, dealing with more bugs. Some examples follow:

reporter=kibi@d.o
product=xorg&component=Driver/VMWare
product=xorg&component=Driver/nouveau
product=xorg&component=Server/general
product=xorg&component=Driver/intel

Moving from 2 bugs to 10-100 bugs (Julien’s or VMWare’s) was OK. But then moving to several hundred bugs (Driver/nouveau = 500+ bugs) led to noticeable performance issues. Not to mention what happens when one reaches several thousand bugs (Driver/intel = 2500+ bugs). Indeed, even with network exchanges cached in a local file, processing the data was taking tens of minutes.

I knew about Perl’s -d:DProf, which helps figure out where time is spent, but was pointed to -d:NYTProf (and its accompanying tool, nytprofhtml). A few hotspots showed up:

  • There’s a huge pile of stuff relying heavily on UUIDs, so a cache is going to be introduced to avoid recomputing a value once it’s known. That’s going to benefit all replica types, not just bugzilla.
  • I didn’t care much about date/time at the beginning, but that turned out to be a very bad idea: since the format returned by bugzilla didn’t match any “well-known” format, time was spent in the DateTime::Format::Natural fallback, leading to a big performance penalty. Fixed with a trivial regular expression.
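Both fixes boil down to a few lines. What follows is only a sketch under assumptions, not the actual SD/Prophet code: compute_uuid is a hypothetical stand-in for the expensive computation, and the timestamp shape (20090824T12:34:56, as in XML-RPC’s dateTime.iso8601, taken as UTC) is assumed rather than copied from a real response.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %uuid_cache;
my $computations = 0;    # only here to make the cache's effect visible

# Hypothetical stand-in for an expensive UUID derivation.
sub compute_uuid {
    my ($key) = @_;
    $computations++;
    return "uuid-for-$key";
}

# Compute each value once, then serve it from the cache.
sub cached_uuid {
    my ($key) = @_;
    return $uuid_cache{$key} //= compute_uuid($key);
}

# Parse the assumed timestamp shape with one regular expression instead of
# going through a generic natural-language date parser.
# Returns epoch seconds, or undef if the string doesn't match.
sub parse_timestamp {
    my ($stamp) = @_;
    my ($y, $mo, $d, $h, $mi, $s) =
        $stamp =~ /^(\d{4})(\d{2})(\d{2})T(\d{2}):(\d{2}):(\d{2})$/
        or return;
    return timegm($s, $mi, $h, $d, $mo - 1, $y);
}
```

Returning undef on a mismatch leaves room for a slower fallback parser, so unknown formats still get handled, just not on the fast path.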

Things got better, but not good enough. There are several Prophet (the engine under the hood) backends, so one can play with:

PROPHET_REPLICA_TYPE=sqlite   # the default for SD
PROPHET_REPLICA_TYPE=prophet

Switching to prophet was a big win, but still not good enough. Indeed, many tiny files are written, and most of the time is spent in I/O. Although I’m nothing like a performance guru, I guessed that running on an average laptop, with ext3 and its default commit interval of 5 seconds, might not be helping, so I quickly tried remounting with -o remount,commit=60, and that seemed to help.

Even though there are probably other tricks to find in that area (which hopefully won’t require root privileges…), there’s already a patch which landed in prophet’s master branch, replacing File::Spec->catfile with an optimized version: that function alone was eating 10% of runtime…
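The patch itself isn’t reproduced here, but the general idea can be sketched as follows. This is an illustration under loud assumptions (Unix-style paths, components that are already clean), and fast_catfile is a made-up name, not the function from prophet’s master branch.

```perl
use strict;
use warnings;
use File::Spec;

# Assumption: Unix-style paths, and components that need no cleanup
# (no doubled slashes, no "." segments). Under those assumptions a plain
# join gives the same result as File::Spec->catfile without the
# per-call normalization work.
sub fast_catfile {
    return join '/', @_;
}

# Same inputs through both routes, for comparison.
my @parts     = ('replica', 'records', 'ticket', '42');
my $portable  = File::Spec->catfile(@parts);
my $optimized = fast_catfile(@parts);
```

The trade-off is portability: File::Spec exists precisely to hide platform differences, so a shortcut like this only makes sense where the path layout is fully under the program’s control. The core Benchmark module can be used to compare the two on a given machine.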

In the sqlite case, disabling the auto-commit feature helped reduce the I/O load, but a proper patch is still lacking for now (running into locked database issues, or into missing tables right after having created them, isn’t fun, so I postponed debugging that).

Since performance issues looked like they could be solved eventually, I switched back to implementing missing features.

Handling more than comments

Currently, 3 types of data are fetched from the bugzilla server:

  • Bug status: plenty of properties.
  • Bug comments: comments that are linked to bugs.
  • Bug history: changes that impacted bugs.

(Yes, that means that attachments are totally ignored for now.)

Until now, only bug comments were considered. The first comment was used to determine a pseudo-title (using its first line), the reporter, and the creation date. This approach was chosen to try and get a basic sync working quickly, so as to get:

  • A list of bugs matching the query.
  • All comments for each of these bugs.

Now, the algorithm is the following: from the bug status, determine a set of properties of interest; then walk the history backwards, and update the properties incrementally until the (presumed) “initial state” is reached. Then create the ticket using this “initial state”. Adding the incremental property changes to that initial ticket makes it possible to represent the bug’s life as a list of Prophet::ChangeSet objects.
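The backward walk can be sketched in a few lines of Perl. This uses plain hashes rather than Prophet::ChangeSet objects, and the data layout (a history made of entries, each holding changes with field, removed and added keys) is an assumption for illustration, as are the function names.

```perl
use strict;
use warnings;

# Walk the history from newest to oldest, undoing each change, until the
# (presumed) initial state is reached.
sub initial_state {
    my ($current, $history) = @_;
    my %state = %$current;
    for my $entry (reverse @$history) {
        $state{ $_->{field} } = $_->{removed} for @{ $entry->{changes} };
    }
    return \%state;
}

# Replay the history forwards from the initial state, keeping one snapshot
# per history entry: the bug's life as a list of successive states.
sub replay {
    my ($initial, $history) = @_;
    my %state     = %$initial;
    my @snapshots = (+{ %state });
    for my $entry (@$history) {
        $state{ $_->{field} } = $_->{added} for @{ $entry->{changes} };
        push @snapshots, +{ %state };
    }
    return \@snapshots;
}
```

A handy sanity check is that the last snapshot produced by replay matches the status fetched from the server; if it doesn’t, some property isn’t being tracked properly.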

That’s where the fun begins, since properties in the bug status may not match properties in the bug history, so one needs to establish a correspondence between them. Also, some properties can be multivalued, for added fun. I believe that’s where most of the time is going to be spent while developing a new replica type in SD: once one knows how to get hold of the needed info on the remote server, the main question is what to do with it. For now, I decided to ignore many fields to make it possible to do a “big” sync like Server/general; property support will be improved later on.
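To make the correspondence problem concrete, here’s a small sketch. Everything in it is hypothetical: the field names in the mapping table, the choice of cc as a multivalued property, and the comma-separated encoding of its changes are assumptions for illustration, not what bugzilla is guaranteed to send.

```perl
use strict;
use warnings;

# Hypothetical mapping from history field names to status property names.
# History fields missing from this table are simply ignored.
my %HISTORY_TO_STATUS = (
    bug_status => 'status',
    short_desc => 'summary',
    cc         => 'cc',
);

# Hypothetical set of multivalued properties.
my %MULTIVALUED = (cc => 1);

# Undo one history change on a state hash (modified in place).
sub undo_change {
    my ($state, $change) = @_;
    my $prop = $HISTORY_TO_STATUS{ $change->{field_name} }
        or return;    # unmapped field: ignore it for now
    if ($MULTIVALUED{$prop}) {
        # Going back in time: drop what was added, restore what was removed.
        my %values = map { $_ => 1 } @{ $state->{$prop} || [] };
        delete $values{$_} for split /\s*,\s*/, $change->{added};
        $values{$_} = 1    for split /\s*,\s*/, $change->{removed};
        $state->{$prop} = [ sort keys %values ];
    }
    else {
        $state->{$prop} = $change->{removed};
    }
    return;
}
```

Keeping the mapping in a single table means that supporting one more field is a one-line change, which fits the “ignore many fields now, improve later” approach.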

Screenshots

Instead of pasting lengthy terminal excerpts, let’s use some screenshots (sorry, I’m not sure how to present such things in an accessible way, suggestions welcome).

Cloning Julien’s bugs, listing all open tickets, listing all tickets, searching using a regular expression:

[Screenshot: Cloning, listing, searching]

Displaying bug 42 (that’s the local ID):

[Screenshot: Displaying]

Now, let’s start the embedded web server through sd server --port 1234 and point the browser there.

List of RESOLVED bugs:

[Screenshot: List of RESOLVED bugs]

Status and comments for bug 42:

[Screenshot: Status and comments for bug 42]

History for bug 42:

[Screenshot: History for bug 42]

Compared to the original bugzilla page:

[Screenshot: Same bug on FreeDesktop.org]

Next time

Some items which need work:

  • Tweak properties to address the issues raised above.
  • Start fetching attachments as well.
  • Support further syncs. Currently, a big sync is done once, and there’s no way to tell sd to sync new changes since last time, if any. This will probably lead to rewriting how fetching is currently done, which is: discover all bugs, then fetch all comments and all history items, for all of them. Properties like last_change_time will probably be of some help here.
  • Have a look at what happens with other bugzilla instances, like Gnome’s.
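As a rough idea of how last_change_time could help with further syncs, here’s a sketch. It’s only a guess at an approach, not something implemented in SD: changed_since is a made-up name, and the code assumes all timestamps share one fixed-width format in the same time zone, so that plain string comparison orders them chronologically.

```perl
use strict;
use warnings;

# Assumption: every timestamp uses the same fixed-width format
# (e.g. 20090824T12:34:56) in the same time zone, so comparing the
# strings is enough to compare the dates.
sub changed_since {
    my ($bugs, $last_sync) = @_;
    return [ grep { $_->{last_change_time} gt $last_sync } @$bugs ];
}
```

Only the bugs returned by such a filter would need their comments and history fetched again, instead of everything for all bugs.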