Notes on the Czur scanner

A few months ago, an IndieGoGo campaign by some folks developing a new scanner designed for books came to my attention and I jumped on it. The perk was one of these scanners at what, if memory serves, was expected to be about half the retail price.

It took longer than anticipated, but I just received this scanner. This is a very early release of everything, so it’s not surprising that there are some rough edges. I can see that, a couple iterations down the road, and especially if they get more help with their English translations, this could become a very cool product.

As things stand, I’m pretty happy with it. But there are some issues that I should warn about.

First, I misunderstood about this device also serving as a web projector. (They’ve revised their wording about this.) It’s not a big issue for me because I was primarily interested in the scanning function, but basically what they’ve done, since they’ve already got a camera and lighting pointed downwards at a flat surface, is enable their device to serve as an accessory to a projector. Connect it via HDMI to a video display device and it will display whatever the camera is looking at. There’s actually very little mystery to this; you can see roughly what it would do whenever you’re in scan mode.

Second, and most unfortunately, you really need Microsoft Windows for this device. Your alternative is to connect to their cloud server via WiFi, but this will only work over completely open WiFi with no password or captive gateway. I don’t have this, so I can’t tell you how well it works if you do.

Third, you also need relatively powerful hardware. There’s some major processing going on in the software, which flattens curved pages and does optical character recognition (OCR). This will take ages on an inadequate system; it basically failed on my old Windows laptop that I mostly keep on top of a bookshelf. Even on a relatively powerful system, such as the one I needed to finish my dissertation, OCR will take some time.

Fourth, absolutely do set all this up on a good sturdy table. The folding tables I had available were a disaster, in part because the scanner was on a different table from the material I was scanning. This is by far the fastest scanner I’ve ever seen, but the exposure time is still long and any movement, especially movement of the material being scanned relative to the scanner, is very much your enemy.

In this vein, in scanning books, you will become painfully familiar with all the ways they can slip while scanning. The binding of a book introduces its own resistance to pages being held open and relatively flat. Pages will shift even while you’re trying to hold them still and often will have to be re-scanned. All too often, this is really awful, with page after page requiring re-scanning.

What seems to work best is to use their supplied stretchy thumb thimble thingies to pull the adjoining pages apart. This is complicated when page margins are too narrow. It’s also particularly problematic near the beginning and end of a book when most of the weight of the pages adds to the resistance.

If you notice anything going amiss while scanning a page, scan it again. This is a process that depends on a lot of things going right. It’s easier to delete unneeded redundant scans than it is to re-scan.

Fifth, for most books, you probably want to choose black and white mode (not color and not grayscale). This will lose any highlighting (which I do a lot of) and prevent the OCR from being confused. It also seems to allow the OCR process to run faster. Unfortunately, the default setting is color and your settings (except the product serial number) are not retained between runs. If you scan with the wrong mode, you can change it with the ‘switch’ option under the ‘bulk’ tab.
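To illustrate why black and white mode drops highlighting, here is a minimal Python sketch of simple threshold binarization. I don’t know what the Czur software actually does internally, so treat this as an assumption-laden toy: the `to_black_and_white` name, the cutoff of 128, and the sample pixel values are all made up for illustration.

```python
# Hypothetical illustration only; this is NOT Czur's actual algorithm.
# Pixels are grayscale values from 0 (black) to 255 (white).

def to_black_and_white(rows, cutoff=128):
    """Map each grayscale pixel to pure black (0) or pure white (255)."""
    return [[0 if pixel < cutoff else 255 for pixel in row] for row in rows]

# Assumed sample values: 20 = printed ink, 200 = highlighted paper, 245 = plain paper.
scan = [
    [245, 200, 20, 200, 245],
    [245, 200, 200, 200, 245],
]

bw = to_black_and_white(scan)
# Highlighted paper (200) and plain paper (245) both become white;
# only the ink (20) stays black, so the highlighting disappears.
```

In grayscale or color mode, the highlighted pixels keep their intermediate values, which is presumably what gives the OCR something extra to trip over.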

The OCR, by the way, is very, very good. I don’t know what they’ve done, but this isn’t tesseract (an OCR program available as open source). It does a far better job of recognizing characters correctly than anything I’ve seen before. I wish I could put this software to work on some PDFs I get. It recognizes columns and blocks of text (which tesseract does poorly to the extent that it does it at all).

Sixth, sometimes tables that are too wide to fit in the ‘portrait’ mode of normal book pages are instead presented rotated 90 degrees counter-clockwise. Sometimes, you will be confronted with situations where one page is in the normal ‘portrait’ mode and the adjoining page has a rotated table. For this situation, I suggest taking two scans. Then, as you review the scanned pages, crop so each page contains only one orientation and rotate as needed so text is consistently oriented correctly.
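The crop-then-rotate step is easier to see on a toy example. The following Python sketch is not how the Czur software works (you do this with its review tools); it only shows the geometry, with a made-up two-page "spread" stored as a grid and hypothetical helper names `crop` and `rotate_clockwise`. A table printed 90 degrees counter-clockwise needs a 90 degree clockwise rotation to read normally.

```python
# Hypothetical helpers for illustration; not part of any Czur software.

def crop(rows, left, top, right, bottom):
    """Keep the cells with left <= x < right and top <= y < bottom."""
    return [row[left:right] for row in rows[top:bottom]]

def rotate_clockwise(rows):
    """Rotate a grid 90 degrees clockwise."""
    return [list(column) for column in zip(*rows[::-1])]

# A toy spread: columns 0-1 are the left (portrait) page,
# columns 2-3 are the right page holding the rotated table.
spread = [
    ["a", "b", "1", "2"],
    ["c", "d", "3", "4"],
]

# From the first scan, keep only the portrait page.
left_page = crop(spread, 0, 0, 2, 2)

# From the second scan, keep only the table page and turn it upright.
right_page = rotate_clockwise(crop(spread, 2, 0, 4, 2))
```

Each output then contains a single orientation, which is the point of taking two scans of the same spread.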

Seventh, be aware that there are two brightness settings for the scanner light that illuminates the material to scan. The light itself has a switch, one of the four buttons on the base of the unit. I think the light heats up; it seems to shut itself off at irregular intervals, even while scanning documents, so even though it’s an LED light, you’ll want to switch it off when you’re not using it.

Eighth, the operating manual that comes with the scanner is extremely minimal. Install the software before connecting up the scanner; this can be downloaded from here. A lot of this you just have to figure out by doing because their user interface was designed by people who do not speak English well and, on top of that, there are some glitches. But I would encourage you to be patient with it. What they’ve done with this is actually pretty amazing.

12 thoughts on “Notes on the Czur scanner”

    • March 7, 2017 at 4:18 am

      Hi David,

      Thanks for this really good review.
      May I ask a question?
      Have you ever tried to scan sheet music? If so, did the flattening still work?

      Thanks,

      Thomas

      • March 13, 2017 at 1:36 am

        No, I haven’t, so I don’t know for sure. But I think it will. The scanner emits three lines of laser light that I’m pretty sure are what it uses to gauge the curvature of the pages.

        And wow, yeah. Of course. I can definitely imagine how converting sheet music in a book to flat sheets might be a real win.

  • January 26, 2017 at 4:26 am

    David,
    Thank you very much for your thoughts on the Czur Scanner.
    Did you upgrade the firmware and software that is available through the website of the company (http://www.czur.com/tech_support/product?pid=ET16)?
    I did so and now the scanner is not working at all. The lights are not switching on when I switch on the scanner.
    Do you have any experience with this?
    Sorry for asking. I do this only because I think another user might have an idea about it. I’m currently waiting for the company’s reply; they are on seasonal holiday (New Year celebration in China).
    I would appreciate your ideas.
    Thank you!
    Bernd

    • February 3, 2017 at 1:32 pm

      I have now been through two such upgrades successfully. It sounds like you interrupted an upgrade, effectively ‘bricking’ the device.

      • February 13, 2017 at 2:54 pm

        Finally, I’m happy with the software after the last update (and firmware). Now the workflow is right.
        But the splitting bug is still there. The pages split, e.g., in the center of the second page for no reason.

        • February 14, 2017 at 3:06 am

          Yes, for me, the workflow improved, both with better speed and with the controls over color, etc., and over curved (bound) versus flat material moving into the window in which I’m actually scanning (previously I had to exit that screen to get to another one where I could reconfigure those controls).

          I have no idea about the splitting bug.

  • March 13, 2017 at 9:39 am

    The OCR software the Czur uses is ABBYY. I have an old copy of it and it is brilliant, but it is now expensive.

  • April 3, 2017 at 10:39 pm

    I have worked with the scanner for 4 months. Software and firmware have been updated.
    My experience is very mixed.
    Sometimes the book comes out as it ought (flattened and cut).
    But other times it messes it all up: for example, 4 pages are OK, then it continues with simple pictures (no flattening, fingertips visible) or distorts some pages. Then it may return to OK pictures, without me interrupting the process in any way.
    I must say that I’m less than satisfied with the book scanning part.

    Flat pages work just fine. So does OCR.

    • April 4, 2017 at 12:27 pm

      All of what Kaj Ahlburg says is true. It’s definitely not perfect. And I find the Word docs that the OCR produces to be useless. The PDFs are okay (I only bother with the two-layer PDFs that incorporate both the original images and the OCR text that can be selected, copied, and pasted).

      What I would compare it to, however, is what else is available. So far as I know, Czur stands alone, and this is still a relatively new product.

  • April 19, 2017 at 1:37 pm

    Hi, I’m using the scanner for the first time and have come across a small anomaly. I’m scanning pages from a veterinary pharmacology tome. There are a lot of diagrams that the scanner picks up perfectly but when I use the OCR function for the text, many of the diagrams do not convert completely. Any ideas what I might do to correct this?

    • April 28, 2017 at 3:37 am

      Not really. There are serious limitations to the OCR functionality and, in my experience, this is worse with the updated software. Yes, you still have the ability to produce a Word document, but no, the Word document is much less useful than before, with line breaks inserted between every word.

      My recommendation is to create a two-layer PDF (offhand, I’m failing to remember what it’s called in the updated software) that preserves the text in the output PDF such that it can be copied and pasted. This is kludgy, to be sure. But it produces a readable document with the original images preserved (so you can correct errors in the OCR text when you paste it elsewhere). Then use a screenshot to copy and extract figures and other images.
