User:Tim Starling/Weekly reports/2008-W05

Summary:

  • In progress: new preprocessor, DumpHTML
  • Completed: File_Ogg, image scaling

New preprocessor

It's been a welcome break this week from parser work. I fixed a few bugs and helped a few editors with migration issues, but I've also got a bit of variety back into my work. I haven't switched on the new preprocessor on the other wikis as promised last week, due to outstanding problems with message mode that I've been putting off dealing with.

DumpHTML

My main project this week, I suppose, was static HTML dumps (DumpHTML). This project was detailed in 2008-W02. I ran a few experimental dumps and fixed the bugs that came out of them. I think I've finally got a potential release candidate running now.

File_Ogg

Brion committed some live patches of mine in the media player (OggHandler) project, which reminded me that I still hadn't submitted my work on that upstream to PEAR. PEAR is an open source repository for libraries written in the PHP language. We use the libraries occasionally, but we've never directly contributed to one before now. During my media player work last year, I did some work on an otherwise abandoned PEAR project called File_Ogg.

This week, I did the necessary administrative work to take over File_Ogg in PEAR, so I now have an @php.net email address and a CVS account. I copied the work I had done so far up to PEAR, and released it as version 0.3.0.

Putting our reusable "library" code in PEAR means it will reach a wider audience. We will get more programmers using it, and so we'll get more feedback on its quality, and perhaps even some collaboration from parties outside Wikimedia.

Image scaling

I'm sharing hardware for my DumpHTML project with an image thumbnail server (storage1). So, in the interests of making DumpHTML go faster, I did some performance analysis on our image system, in particular image scaling.

I made the backend scaling system a bit more cache-friendly by having it respect the "If-Modified-Since" header. I didn't notice any immediate performance improvement, but then I wasn't looking very hard. The change was theoretically a good one, and will become more important in the future with planned changes to our image storage system.
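For readers unfamiliar with the mechanism, here is a minimal Python sketch of what respecting "If-Modified-Since" involves. It is an illustration only, not the actual scaler code: the function name conditional_status and its parameters are hypothetical, and it assumes the modification time of the source image is known as a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_status(if_modified_since: Optional[str],
                       last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still current, else 200.

    Hypothetical helper: `last_modified` is assumed to be timezone-aware.
    """
    if not if_modified_since:
        return 200  # no conditional header: send the full response
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: ignore it
    if client_time.tzinfo is None:
        # Treat a date without a usable zone as UTC.
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    if last_modified.replace(microsecond=0) <= client_time:
        return 304  # not modified: the scaling work can be skipped
    return 200
```

When the result is 304, the server can reply with headers only and skip both the scaling and the transfer of the image body, which is where any saving would come from.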

I also noticed a lot of expensive requests coming from Exalead. I contacted them about it, and they put me in touch with the development team responsible. It looks like we'll be able to solve the problem amicably, and make some useful contacts along the way.