A few months ago, an IndieGoGo campaign by some folks developing a new scanner designed for books came to my attention, and I jumped on it. The perk was one of these scanners at what, if memory serves, was expected to be half price.
It took longer than anticipated, but I just received this scanner. This is a very early release of everything, so it’s not surprising that there are some rough edges. I can see that, a couple of iterations down the road, and especially if they get more help with their English translations, this could become a very cool product.
As things stand, I’m pretty happy with it. But there are some issues that I should warn you about.
First, I misunderstood the claim that this device also serves as a web projector. (They’ve revised their wording about this.) It’s not a big issue for me because I was primarily interested in the scanning function. Basically, since they’ve already got a camera and lighting pointed downwards at a flat surface, they’ve enabled the device to serve as an accessory to a projector. Connect it via HDMI to a video display device and it will display whatever the camera is looking at. There’s actually very little mystery to this; you can see roughly what it would do whenever you’re in scan mode.
Second, and most unfortunately, you really need Microsoft Windows for this device. Your alternative is to connect to their cloud server via WiFi, but this will only work over completely open WiFi with no password or captive gateway. I don’t have this, so I can’t tell you how well it works if you do.
Third, you also need relatively powerful hardware. There’s some major processing going on in the software, which flattens curved pages and does optical character recognition (OCR). This will take ages on an inadequate system; it basically failed on my old Windows laptop that I mostly keep on top of a bookshelf. Even on a relatively powerful system, such as the one I needed to finish my dissertation, the OCR will take some time.
Fourth, absolutely do set all this up on a good sturdy table. The folding tables I had available were a disaster, in part because the scanner was on a different table from the material I was scanning. This is by far the fastest scanner I’ve ever seen, but the exposure time is still long and any movement, especially movement of the material being scanned relative to the scanner, is very much your enemy.
In this vein, in scanning books, you will become painfully familiar with all the ways they can slip while scanning. The binding of a book resists the pages being held open and relatively flat. Pages will shift even while you’re trying to hold them still and will often have to be re-scanned. All too often, this is really awful, with page after page requiring re-scanning.
What seems to work best is to use their supplied stretchy thumb thimble thingies to pull the adjoining pages apart. This is complicated when page margins are too narrow. It’s also particularly problematic near the beginning and end of a book when most of the weight of the pages adds to the resistance.
If you notice anything going amiss while scanning a page, scan it again. This is a process that depends on a lot of things going right. It’s easier to delete unneeded redundant scans than it is to re-scan.
Fifth, for most books, you probably want to choose black and white mode (not color and not grayscale). This drops any highlighting (which I do a lot of) and so keeps it from confusing the OCR. It also seems to allow the OCR process to run faster. Unfortunately, the default setting is color and your settings (except the product serial number) are not retained between runs. If you scan with the wrong mode, you can change it with the ‘switch’ option under the ‘bulk’ tab.
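To see why highlighting tends to vanish in black and white mode, here is a minimal sketch of a generic fixed-threshold conversion. This is only an illustration: I don’t know how the scanner software actually does it, and the pixel values, the function names, and the threshold of 128 are all made up for the example.

```python
# Illustrative only: a generic brightness threshold, not the scanner's actual method.
# The pixel values below are hypothetical examples, not measurements.

def luminance(rgb):
    """Approximate perceived brightness (0-255) using the Rec. 601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_black_and_white(rgb, threshold=128):
    """Return 0 (black) for dark pixels and 255 (white) for everything else."""
    return 0 if luminance(rgb) < threshold else 255


paper = (245, 245, 240)             # off-white page
yellow_highlight = (255, 240, 80)   # yellow highlighter over blank paper
printed_text = (30, 30, 30)         # dark ink

for name, pixel in [("paper", paper),
                    ("highlight", yellow_highlight),
                    ("text", printed_text)]:
    print(name, to_black_and_white(pixel))
```

A bright yellow highlight is nearly as light as the paper itself, so it lands on the white side of the threshold along with the page, while the ink stays black. Darker highlighter colors might not drop out so cleanly.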
The OCR, by the way, is very, very good. I don’t know what they’ve done, but this isn’t Tesseract (an OCR program available as open source). It does a far better job of recognizing characters correctly than anything I’ve seen before. I wish I could put this software to work on some PDFs I get. It recognizes columns and blocks of text (which Tesseract does poorly, to the extent that it does it at all).
Sixth, sometimes tables that are too wide to fit in the ‘portrait’ mode of normal book pages are instead presented rotated 90 degrees counter-clockwise. Sometimes, you will be confronted with situations where one page is in the normal ‘portrait’ mode and the adjoining page has a rotated table. For this situation, I suggest taking two scans. Then, as you review the scanned pages, crop so each page contains only one orientation and rotate as needed so text is consistently oriented correctly.
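The crop-and-rotate step amounts to something like the following sketch, which treats a scan as a small grid of cells. The scanner software does this through its own review screen; the `crop` and `rotate_clockwise` helpers and the toy grid here are hypothetical, just to show which way the rotation has to go. For brevity, the same grid stands in for both scans of the spread.

```python
# Illustrative only: a toy grid stands in for a scanned two-page spread.
# The helper names are hypothetical and not part of the scanner software.

def crop(image, top, left, height, width):
    """Return the height x width sub-grid whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def rotate_clockwise(image):
    """Rotate a grid 90 degrees clockwise (undoes a counter-clockwise rotation)."""
    return [list(row) for row in zip(*image[::-1])]


# Left half: a normal portrait page. Right half: a table printed rotated
# 90 degrees counter-clockwise (upright it would read 1 2 / 3 4).
spread = [
    ["a", "b", "2", "4"],
    ["c", "d", "1", "3"],
]

left_page = crop(spread, 0, 0, 2, 2)                     # first scan: keep the portrait page
right_page = rotate_clockwise(crop(spread, 0, 2, 2, 2))  # second scan: keep the table, then turn it

print(left_page)
print(right_page)
```

Because the table is printed turned counter-clockwise, the correction is a clockwise turn.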
Seventh, be aware that there are two brightness settings for the scanner light that illuminates the material to scan. The light itself has a switch, one of the four buttons on the base of the unit. I think the light heats up; it seems to shut itself off at irregular intervals, even in the middle of scanning documents, so even though it’s an LED light, you’ll want to switch it off when you’re not using it.
Eighth, the operating manual that comes with the scanner is extremely minimal. Install the software before connecting up the scanner; this can be downloaded from here. A lot of this you just have to figure out by doing because their user interface was designed by people who do not speak English well and, on top of that, there are some glitches. But I would encourage you to be patient with it. What they’ve done with this is actually pretty amazing.