Reuters Source Processing

I worked at Reuters and its subsidiary Factiva for several years, in a team ranging from 7 people up to 18 at the height of pre-Millennium testing. This was how I came to be based in Devon. Our general brief, known as "Source Processing", was to write software to process all the various "dialects" of file format in which news providers sent stories. There was a massive machinery in place, crafted over many years, to control the workflow and accommodate the eccentricities of each news source. Over the years we moved from VAX Pascal to Java and Antlr on Windows.

Feeds from each source would arrive in a continuous stream or in files, and each would have its own conventions about the contents - how to find the date, the headline, the break from each story to the next, and so on. In addition we catered for about 15 languages (including Chinese and Russian) and several character sets in each case (e.g. Windows, DOS, IBM or Apple Mac).
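To give a flavour of what "conventions" means here, the following is a minimal sketch in Java, not the original Reuters code. Everything in it is assumed for illustration: the class names, the "====" story break, and the "DATE:" and "HEADLINE:" prefixes are invented, and real feeds were far less tidy. It decodes the raw bytes using the source's character set, then splits the text into stories using that source's markers.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: the markers and charset used here are
// invented, not the conventions of any real news feed.
class FeedSketch {

    /** One parsed story, holding only the fields mentioned in the text. */
    static final class Story {
        final String date;
        final String headline;
        final String body;

        Story(String date, String headline, String body) {
            this.date = date;
            this.headline = headline;
            this.body = body;
        }
    }

    /** The per-source conventions: a character set plus invented markers. */
    static final class SourceConventions {
        final Charset charset;
        final String storyBreak;     // line that separates one story from the next
        final String datePrefix;     // prefix of the line carrying the date
        final String headlinePrefix; // prefix of the line carrying the headline

        SourceConventions(Charset charset, String storyBreak,
                          String datePrefix, String headlinePrefix) {
            this.charset = charset;
            this.storyBreak = storyBreak;
            this.datePrefix = datePrefix;
            this.headlinePrefix = headlinePrefix;
        }
    }

    /** Decodes the raw bytes and splits them into stories for one source. */
    static List<Story> parse(byte[] raw, SourceConventions c) {
        String text = new String(raw, c.charset);
        List<Story> stories = new ArrayList<>();
        String date = null;
        String headline = null;
        StringBuilder body = new StringBuilder();
        boolean sawContent = false;

        for (String line : text.split("\\r?\\n")) {
            if (line.equals(c.storyBreak)) {
                // End of a story: emit it (if non-empty) and start afresh.
                if (sawContent) {
                    stories.add(new Story(date, headline, body.toString().trim()));
                }
                date = null;
                headline = null;
                body.setLength(0);
                sawContent = false;
            } else if (line.startsWith(c.datePrefix)) {
                date = line.substring(c.datePrefix.length()).trim();
                sawContent = true;
            } else if (line.startsWith(c.headlinePrefix)) {
                headline = line.substring(c.headlinePrefix.length()).trim();
                sawContent = true;
            } else {
                if (!line.trim().isEmpty()) {
                    sawContent = true;
                }
                body.append(line).append('\n');
            }
        }
        // The last story may not be followed by a break line.
        if (sawContent) {
            stories.add(new Story(date, headline, body.toString().trim()));
        }
        return stories;
    }

    public static void main(String[] args) {
        SourceConventions c = new SourceConventions(
                StandardCharsets.ISO_8859_1, "====", "DATE:", "HEADLINE:");
        String sample = "DATE: 1999-12-31\nHEADLINE: Caf\u00e9 opens\nBody one.\n"
                + "====\nDATE: 2000-01-01\nHEADLINE: Second\nBody two.\n";
        for (Story s : parse(sample.getBytes(c.charset), c)) {
            System.out.println(s.date + " | " + s.headline + " | " + s.body);
        }
    }
}
```

Keeping the conventions in a small data object, separate from the parsing loop, is one way to let a single piece of code serve many sources; whether the production system was organised like this is not something the sketch claims.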

Feeds came from all over the world; our software effort was largely based in Devon, with the production machines running in London.

In the latter phase, we took on a massive, express 18-month effort to convert the software for about 420 feeds from the legacy VAX Pascal system to a new architecture based on Windows 2000 and Java. We had some mind-wrenching experiences with "Antlr", an obscure but useful parsing system. To our credit (and occasionally to our surprise) we achieved all of the intermediate milestones plus the final one, on time and within budget.

In that phase, we knew we had to keep careful track of progress, given 420 work packages spread out among about 14 people, with some of them always held up for weeks at a time by dialogue with customers and providers. I developed a small but useful MS-Access database to track it all - so at a glance I could see how many feeds were completed, not started, or suspended, who was working on them, and any snags. This data fed directly into our monthly reports.
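The kind of at-a-glance summary described above can be sketched as follows. This is an in-memory Java illustration rather than the MS-Access database itself, and all names in it (FeedTracker, WorkPackage, the IN_PROGRESS status, the sample feed names and owners) are assumptions made up for the example.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: an in-memory stand-in for the tracking database.
class FeedTracker {

    /** Statuses named in the text, plus IN_PROGRESS, which is an assumption. */
    enum Status { NOT_STARTED, IN_PROGRESS, SUSPENDED, COMPLETED }

    /** One feed conversion: who owns it, where it stands, and any snag. */
    static final class WorkPackage {
        final String feed;
        final String owner;
        final Status status;
        final String snag; // empty when there is nothing to report

        WorkPackage(String feed, String owner, Status status, String snag) {
            this.feed = feed;
            this.owner = owner;
            this.status = status;
            this.snag = snag;
        }
    }

    /** Counts work packages per status, including statuses with no entries. */
    static Map<Status, Integer> countByStatus(List<WorkPackage> packages) {
        Map<Status, Integer> counts = new EnumMap<>(Status.class);
        for (Status s : Status.values()) {
            counts.put(s, 0);
        }
        for (WorkPackage p : packages) {
            counts.merge(p.status, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<WorkPackage> packages = Arrays.asList(
                new WorkPackage("feed-001", "alice", Status.COMPLETED, ""),
                new WorkPackage("feed-002", "bob", Status.SUSPENDED, "awaiting provider reply"),
                new WorkPackage("feed-003", "alice", Status.NOT_STARTED, ""));
        // One summary line per status, as might be pasted into a monthly report.
        countByStatus(packages).forEach(
                (status, n) -> System.out.println(status + ": " + n));
    }
}
```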

VMS, Pascal, Java, Antlr
Date: 1995-2003
