Scanning

I scan all the paper documents I receive and file them digitally. This lets me get at them from anywhere and makes searching for a particular bit of info much easier, especially once OCR has been applied.

This page holds my notes on getting things going, in case I have to reinstall.

Scanning drivers for EPSON SX400

The EPSON SX400 is one of those all-in-one printer/scanners that lets you photocopy and print directly from memory cards (e.g. from a camera). Ubuntu 9.04 seems to support printing on it out of the box, but not scanning.

I contacted EPSON through their UK website's “chat now” feature and was given this helpful link: http://www.avasys.jp/lx-bin2/linux_e/spc/DL1.do

I was able to install the Debian 64-bit package without any problems, but it didn't seem to have any effect until I rebooted. I suspect all that I really had to do was refresh the udev file-system, but I couldn't be bothered to work out how to do that. Rebooting got things going: XSANE et al were able to detect the SX400 as a scanner and away I went!
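For next time, here is a minimal sketch of the udev refresh that might make the reboot unnecessary. It assumes that stale udev rules really were the problem, which is only a suspicion, and that udevadm is installed. The function name refresh_udev and its dry_run flag are made up for this example. With dry_run left on, nothing is executed and the commands are only printed.

```python
import subprocess

# Assumption: these udevadm subcommands re-read the rule files and replay
# device events. Whether that is enough for the scanner is untested here.
UDEV_REFRESH_COMMANDS = [
    ["udevadm", "control", "--reload-rules"],
    ["udevadm", "trigger"],
]


def refresh_udev(dry_run=True):
    """Return the refresh commands, executing each one unless dry_run is set."""
    for command in UDEV_REFRESH_COMMANDS:
        if not dry_run:
            # Needs root privileges; raises CalledProcessError on failure.
            subprocess.run(command, check=True)
    return UDEV_REFRESH_COMMANDS


if __name__ == "__main__":
    # Dry run only: show what would be executed.
    for command in refresh_udev(dry_run=True):
        print(" ".join(command))
```

To actually run the commands, call refresh_udev(dry_run=False) from a script started with sudo.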

2010-09-12: It would seem that the device file (i.e. /dev/bus/usb/005/002) was not readable by my user account.
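A quick way to confirm this kind of permission problem is to ask the operating system whether the current user can open the device node. The sketch below only diagnoses and changes nothing. The helper name describe_access is hypothetical, and the default path is just the example from the note above (the bus and device numbers change when the scanner is re-plugged).

```python
import os
import stat
import sys


def describe_access(path):
    """Report ownership, mode and the current user's access to `path`."""
    info = os.stat(path)
    return {
        "path": path,
        "mode": stat.filemode(info.st_mode),
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Example device node from the note above; pass another path as argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "/dev/bus/usb/005/002"
    if os.path.exists(target):
        print(describe_access(target))
    else:
        print("no such device node: " + target)
```

If "readable" or "writable" comes back False, the owner, group and mode in the same report show which account or group would need access.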

OCR (Optical Character Recognition)

The current favourite seems to be tesseract. It only works well when the image you feed it is bitonal (i.e. pure black and white, _not_ greyscale), and it doesn't cope with page layout. So ideally you'd touch up your image to remove all decoration and lay the page out again so that all the text runs linearly, but who can be bothered?
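To make "bitonal" concrete, here is a minimal sketch of the simplest possible conversion: a fixed global threshold applied to greyscale values. It works on a plain list of rows rather than a real image file, the function name to_bitonal is made up for this example, and the cut-off of 128 is an arbitrary midpoint that a real scan would probably need tuned.

```python
def to_bitonal(pixels, threshold=128):
    """Map greyscale values (0-255) to pure black (0) or pure white (255)."""
    return [
        [255 if value >= threshold else 0 for value in row]
        for row in pixels
    ]


if __name__ == "__main__":
    # A tiny 2x3 "image" of greyscale values.
    greyscale = [
        [12, 200, 130],
        [127, 128, 90],
    ]
    for row in to_bitonal(greyscale):
        print(row)
    # prints [0, 255, 255] then [0, 255, 0]
```

Every value at or above the threshold becomes white and everything below becomes black, so no grey is left.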

Ubuntu 9.04 offers tesseract through its default packages: sudo apt-get install tesseract-ocr-eng (which will pull in tesseract-ocr)
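Once the package is installed, a scan can be run through tesseract from a script. The sketch below assumes the usual command-line form, "tesseract IMAGE OUTPUTBASE -l LANG", where the recognised text ends up in OUTPUTBASE.txt. The function names and the file names scan.tif and scan are hypothetical.

```python
import subprocess


def tesseract_command(image_path, output_base, language="eng"):
    """Build the argument list; tesseract writes to <output_base>.txt."""
    return ["tesseract", image_path, output_base, "-l", language]


def run_ocr(image_path, output_base, language="eng"):
    """Run tesseract and return the path of the text file it should produce."""
    command = tesseract_command(image_path, output_base, language)
    # Raises CalledProcessError if tesseract exits with an error.
    subprocess.run(command, check=True)
    return output_base + ".txt"


if __name__ == "__main__":
    # Only show the command here; call run_ocr() to actually execute it.
    print(" ".join(tesseract_command("scan.tif", "scan")))
```

Keeping the command builder separate from the function that runs it makes it easy to check what will be executed before pointing it at a pile of scans.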

One-button archiving

Ultimately I want to make use of the START button on the printer/scanner itself, but for now I'm settling for a button on my keyboard. If anyone knows how to get events from the printer, please drop me a line.

… in progress …