Xerox copiers might alter numbers, text in documents

If you think your photocopier is producing exact duplicates of your documents, you might want to double-check: some popular Xerox scanners and photocopiers change text and numbers in documents scanned and copied under the "normal" quality setting.

Xerox recommends using "higher" instead of "normal" quality to avoid problem

Xerox has confirmed that its scanners and photocopiers could substitute some characters for others when using the 'normal' quality setting. (Paul Sakuma/Associated Press)

The "character substitution issue" might occur with "lower quality and resolution settings," which are labelled "normal" quality on Xerox machines, Francis Tse, principal engineer for Xerox, confirmed in a blog post Tuesday. His post came several days after German computer science student David Kriesel first described the problem in a blog post that spread quickly around the internet.

"For data integrity purposes, we recommend the use of factory defaults with the quality level set to 'higher,'" Tse added. Kriesel wrote in a blog post early Tuesday that based on his experiments, using a "higher" quality setting did reduce the errors. However, counterintuitively, it reduced the readability of scanned documents, prompting many people to choose the "normal" setting.

Kriesel first posted about the document alteration problem after noticing that a construction plan scanned with a Xerox WorkCentre 7535 had been changed so that the numbers showing the sizes of three rooms, which had previously been 14.13 square metres, 21.11 square metres and 17.42 square metres, all became 14.13 square metres. At the time, he had already switched off the machine's optical character recognition (OCR) feature, which automatically detects text characters in images and converts them into text data, suggesting that the problem had nothing to do with OCR.

"Patches of the pixel data are randomly replaced in a very subtle and dangerous way: The scanned images look correct at first glance, even though numbers may actually be incorrect," Kriesel wrote.

He noted that this could cause errors in invoices, or even life-threatening errors if the altered numbers appeared in construction plans for structures such as bridges or in doses of medicine.

The post soon spread around the internet, and tests by other people showed that the problem affected many models of Xerox printers. Kriesel reported that both he and others contacted Xerox support staff, who were unable to say what setting could be changed to fix the problem.

Xerox aware of problem

In a statement Tuesday, Xerox said it had been aware of the problem, and that its user interface alerts users to the possibility of "text quality degradation and character substitution errors" when they select the "normal" quality setting. Tse said the problem is that Xerox machines use the "recognized industry standard JBIG2 compressor" which creates very small file sizes with good image quality, "but with inherent tradeoffs."
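To illustrate the kind of tradeoff Tse describes, here is a toy Python sketch of symbol-matching compression, the general approach associated with JBIG2's lossy mode: patches that look nearly alike are stored once and reused. It is not Xerox's implementation or the JBIG2 algorithm itself. The 3x5 glyphs, the function names and the `max_distance` threshold are all hypothetical, chosen only to show how a loose match threshold can swap one digit for another.

```python
# Toy 3x5 glyph bitmaps (hypothetical); real scans use far larger patches.
GLYPHS = {
    "6": ("###", "#..", "###", "#.#", "###"),
    "8": ("###", "#.#", "###", "#.#", "###"),
    "1": (".#.", "##.", ".#.", ".#.", "###"),
}


def pixel_distance(a, b):
    """Count the pixels at which two equally sized bitmaps differ."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def compress(patches, max_distance):
    """Store each patch as an index into a dictionary of symbols.

    A patch within max_distance pixels of an already stored symbol reuses
    that symbol instead of being stored itself. This is the lossy step.
    """
    symbols, indices = [], []
    for patch in patches:
        for i, symbol in enumerate(symbols):
            if pixel_distance(patch, symbol) <= max_distance:
                indices.append(i)
                break
        else:
            # No stored symbol was close enough, so keep this patch as new.
            symbols.append(patch)
            indices.append(len(symbols) - 1)
    return symbols, indices


def decompress(symbols, indices):
    """Rebuild the page by looking up each stored index."""
    return [symbols[i] for i in indices]


def read_back(patches):
    """Map each bitmap back to the character it exactly matches."""
    lookup = {bitmap: char for char, bitmap in GLYPHS.items()}
    return "".join(lookup[p] for p in patches)


scanned = [GLYPHS[c] for c in "8616"]
strict = read_back(decompress(*compress(scanned, max_distance=0)))
loose = read_back(decompress(*compress(scanned, max_distance=1)))
print(strict)  # 8616
print(loose)   # 8818
```

In this sketch the "6" and "8" glyphs differ by a single pixel, so allowing a one-pixel mismatch makes the compressor reuse the stored "8" wherever a "6" was scanned. The output still looks like clean, plausible digits, which is why such errors are hard to spot.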

In a meeting with Kriesel Tuesday, Xerox explained that other types of compression were used for the "higher" and "high" quality settings.

Following the meeting, Kriesel wrote that while Xerox had been aware of the document alteration problem, what it "seemingly didn't know at all, was the apparently vast number of customers using the 'normal' setting … not really knowing about the implications."