Download beta 1.0

Fully Functional
Works under WINE on Linux
Expires October 15th, 2010

The Problem it Solves

OCR (Optical Character Recognition) software does poorly at recognizing text set against complex, multi-colored backgrounds. Pretext addresses this issue.

What Pretext does

Pretext is designed to do just one thing, and do it well: Find words of text (*) in any image, be it computer-generated, scanned in, or from a photograph, and output a black-and-white image with just those words. Imagine a picture of a road. Pretext finds words in car number plates and signposts, but ignores the road, the sky and the trees.

Existing OCR software combines finding the text with the recognition of text. Pretext separates the tasks, so that by doing the first stage better, the actual conversion of the words is more accurate.
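
The separation described above can be pictured as two independent steps joined by a plain black-and-white image. The following is only a minimal sketch of that idea, not Pretext's actual method: `binarize`, `run_pipeline` and `fake_ocr` are hypothetical names invented for illustration, and the fixed global threshold is an assumption that stands in for the far harder job of finding words in a busy background.

```python
from typing import Callable, List

# Grayscale image as rows of pixel values, 0 (black) to 255 (white).
Image = List[List[int]]


def binarize(image: Image, threshold: int = 128) -> Image:
    """Stand-in for stage one: force every pixel to black (0) or white (255).

    Assumption: a global threshold is used only to keep the sketch short.
    It is not Pretext's method and would not cope with complex backgrounds.
    """
    return [[0 if pixel < threshold else 255 for pixel in row] for row in image]


def run_pipeline(
    image: Image,
    find_text: Callable[[Image], Image],
    recognize: Callable[[Image], str],
) -> str:
    """Run the two stages separately: clean the image, then recognize it."""
    cleaned = find_text(image)  # stage one: keep only the words
    return recognize(cleaned)   # stage two: any OCR engine


def fake_ocr(image: Image) -> str:
    """Hypothetical recognizer for the demo; reports the dark-pixel count."""
    dark = sum(pixel == 0 for row in image for pixel in row)
    return f"{dark} dark pixels"


if __name__ == "__main__":
    sample = [[10, 200], [130, 90]]
    print(run_pipeline(sample, binarize, fake_ocr))
```

Because the two stages only share an image, either one can be swapped out without touching the other, which is the point of keeping text finding apart from text recognition.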

Further, Pretext is designed to find more text than traditional OCR methods do.

(*) Pretext is designed to work with the Latin alphabet ("abc..."). It will have some success with similar alphabets such as Greek and Cyrillic. It is not designed to work with languages such as Farsi, Japanese or Chinese.

What Pretext doesn't do

Pretext is not an OCR program. Instead it removes the background "noise" to let an OCR program give more accurate results.

What OCR software is and does

OCR software turns pictures of text into actual text that you can copy and paste, edit in a word processor, email, or put on a webpage.

Documentation

User's Guide

Frequently Asked Questions

Example

Here is an unedited snippet from a printed magazine page. Move the mouse over the image to see what Pretext does.

[Image: Crash magazine cover. Rollover shows the result of using Pretext.]

Here is the result using FreeOCR (which uses Google Tesseract) on the un-Pretexted image:

Here is the result using FreeOCR on the Pretexted image:

¤I\rI8l475 PTA! ANEWSFIELD PUBLICAHON
nu as Juan msn
  W. MAGAZINE
AND TWO CASSETTES
E2 99
SINCLAIR SPECTRUM GAMES

Email San Fran Systems at sanfransys@decompiler.org.

San Fran Systems and Pretext logos and text (c) 2006-2010 San Fran Systems