Search Inside the Book

Amazon.com launched its “Search Inside the Book” feature today.

According to Charles, one of the principal devs on the project, they started with approximately 50TB of JPEG images from a scanning sweatshop in an unnamed country, spent three months processing the data on over 120 ultra-fast boxes, and wound up with 20TB of images and 300GB of searchable text. They believe that is over half as much text as was ever stored in the Library of Alexandria.

Today the text is only for searching. But I hope that at some point in the future, humanity will be enlightened enough to allow itself the pleasure of direct access. I’m not holding my breath.