NAME

ocrdesktop - Read and interact with content on your GUI desktop via OCR

SYNOPSIS

ocrdesktop [options...]

DESCRIPTION

OCRdesktop is a useful accessibility tool to grab content from the screen as text via OCR technology.
It takes an image of the current window or workspace, prepares it for better results, and uses tesseract to recognize the text in it. The result is presented in a caret-enabled text area or in a detailed list with coordinates. OCRdesktop can emulate clicks on the recognized text, or the results can be sent to the clipboard. There are two main views.

1. The browse mode view:
This is where the OCR results are displayed in a caret browsable text area.
Actions such as performing mouse clicks and changing views are available in a traditional menu layout opened with F10.

2. The detailed view:
In this view you can see metadata for the displayed items.
Each output text string is shown in the first column of a table.
The 2nd column gives the font size used.
The 3rd column shows color information when available.
The 4th column has the X coordinate of the item.
The 5th column has its Y coordinate.
Finally, the last column shows the accuracy confidence as a percentage.
You can also see and change the font used to display the output using the Tab key.
The shortcut to toggle between these two views is Ctrl+v.

Setup

Assign the command ocrdesktop to a keyboard shortcut in your desktop environment.
For example, in GNOME or Unity you can do this in the GNOME Control Center, in the Shortcuts tab of the Keyboard window.
For languages other than English, set your language code with ocrdesktop -l <languagecode>. Use the tesseract language codes for <languagecode>.
You can also pass other options when starting OCRdesktop to extend what it does; see OPTIONS.
The program can also be run from your desktop's run box or dash. Running it from a terminal may not be desirable, because the terminal window will appear in ocrdesktop's screenshot.
Tip: Make at least two shortcuts, one without options to capture only the active window, and another with -d to take a screenshot of the whole desktop.

Usage

When run with no command-line options, ocrdesktop shows the processed output of the active window.
Run ocrdesktop with the -d switch to process the whole desktop.
If the OCR quality of the first pass over a screenshot is poor, there are transformation options that may improve character recognition accuracy.
To use them, take a new screenshot with one or more transformation flags. The available transformation options are:
-i inverts foreground and background color or tone.
-g uses gray scale instead of the original colors.
-b uses a barrier black-and-white method. It converts the image to gray scale and sets a barrier: all shades darker than the barrier are treated as black, and everything lighter as white. The default barrier is 200, but it can be set on the command line with -t. This method is useful when working with very bright colors or blurry text.
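The three transformations above can be illustrated with a small, self-contained sketch. It is not OCRdesktop's own code: the function names are hypothetical, the gray-scale weights (0.299, 0.587, 0.114) are a common convention assumed here, and treating a shade exactly equal to the barrier as white is also an assumption. Only the default barrier of 200 comes from the text above.

```python
DEFAULT_BARRIER = 200  # default barrier value stated in the text above


def invert(pixel):
    """Invert an (r, g, b) pixel, as -i is described to do."""
    r, g, b = pixel
    return (255 - r, 255 - g, 255 - b)


def to_gray(pixel):
    """Reduce an (r, g, b) pixel to one 0-255 gray level.

    The weights are an assumption (a widely used luma formula),
    not necessarily what OCRdesktop uses for -g.
    """
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def black_white(pixel, barrier=DEFAULT_BARRIER):
    """Barrier method as described for -b: gray scale first, then a cut-off.

    Shades darker than the barrier become black (0); everything else
    becomes white (255). Counting 'equal to the barrier' as white is
    an assumption.
    """
    return 0 if to_gray(pixel) < barrier else 255


if __name__ == "__main__":
    yellow, white, dark = (255, 255, 0), (255, 255, 255), (40, 40, 40)
    for name, px in (("yellow", yellow), ("white", white), ("dark", dark)):
        print(name, to_gray(px), black_white(px), black_white(px, barrier=240))
```

In this sketch bright yellow has a gray level of about 226, so with the default barrier it is classed as white and would vanish against a white background, while a barrier of 240 turns it black. This is the kind of case where adjusting the barrier may help with very bright colors.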
Open the menus to see all available options.
They are divided into three categories: ocrdesktop, interact, and macros. There are keyboard shortcuts for all ocrdesktop options. The ocrdesktop menu contains toggle view, send to clipboard, and quit. The interact menu has controls for simulated mouse clicks and similar actions. The macros menu lets you load, unload, and save macros.

Macros

In most desktop environments, global shortcuts don't work while a menu is open, for example the File menu in the menu bar at the top of most programs. OCRdesktop can use preclicks, macros that run before the screenshot is taken. This allows you to close all menus and let OCRdesktop click on the menu before it recognizes the window.

To use preclick macros, select Preclick in the Interact menu or use the Ctrl+p shortcut. Then choose a mouse click to perform before OCRdesktop starts the next time. After you press the button that starts a mouse click, nothing happens immediately. The next time you run OCRdesktop, it asks what to do. Press Run to execute all stored clicks. OCRdesktop then takes its screenshot of the screen as it is after the click actions have been performed. To store a second click, for example to open a submenu, use the Save As option, then Preclick, then Save As again.
You can also add keyboard shortcuts to preclick macros. To enter shortcut recording mode, press Ctrl+k or select the Send Key entry in the Interact menu. Every keystroke you type is then appended to the currently active preclick macro. Pressing F4 leaves recording mode. Leaving shortcut recording mode may take up to 2 seconds.
You can save as many mouse operations as you want.
Choose Unload in the macro window to erase the macro permanently. If you press Cancel, no mouse clicks are performed, but the main window opens. The macro is not deleted, and the next time you start OCRdesktop you will be asked again whether to run your stored clicks.
Tip: Macros are stored in ~/.activeOCRMacro.ocrm. The Save As option lets you store macros in another location if you choose. It opens a standard file chooser window where you select a name and location for your macro file.

OPTIONS

-b Convert to black and white before doing OCR; use the optional -t to set the barrier point
-c Send text output to clipboard
-d Perform OCR on the whole desktop, not just the active window
-g Convert colors to gray scale before doing OCR
-h Show a short help message
-i Invert foreground and background colors before doing OCR
-m Execute macro: path to macro-file is required with -m
-n No GUI is shown: run with -c or -m
-p Print debug messages
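
How the options above combine can be sketched with a small helper that assembles a command line for a keyboard shortcut. The helper name build_argv and its parameters are hypothetical and not part of OCRdesktop. Passing the barrier value as a separate argument after -t, and rejecting -t without -b, are assumptions based on the descriptions above. The -l option is taken from the Setup section. The sketch only builds the argument list; it does not run ocrdesktop.

```python
import shlex


def build_argv(desktop=False, language=None, clipboard=False, no_gui=False,
               invert=False, gray=False, black_white=False, barrier=None,
               macro=None):
    """Return an ocrdesktop argument list built from the documented flags."""
    # -n is documented as "run with -c or -m", so require one of them.
    if no_gui and not (clipboard or macro):
        raise ValueError("-n needs -c or -m")
    # Assumption: -t is only meaningful together with -b.
    if barrier is not None and not black_white:
        raise ValueError("-t only applies together with -b")

    argv = ["ocrdesktop"]
    if desktop:
        argv.append("-d")
    if language:
        argv += ["-l", language]
    if invert:
        argv.append("-i")
    if gray:
        argv.append("-g")
    if black_white:
        argv.append("-b")
        if barrier is not None:
            # Assumed syntax: value passed as a separate argument.
            argv += ["-t", str(barrier)]
    if macro:
        argv += ["-m", macro]
    if clipboard:
        argv.append("-c")
    if no_gui:
        argv.append("-n")
    return argv


if __name__ == "__main__":
    # Two shortcuts as suggested in the Setup tip, plus a clipboard-only run.
    print(shlex.join(build_argv()))
    print(shlex.join(build_argv(desktop=True)))
    print(shlex.join(build_argv(desktop=True, clipboard=True, no_gui=True)))
```

The printed strings could be pasted into a desktop environment's shortcut settings. Checking the -n constraint up front avoids configuring a shortcut that shows no window and produces no output.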

EXIT CODES

FILES

/usr/bin/ocrdesktop

The executable

/usr/share/man/man1/ocrdesktop.1.gz

This manpage

BUGS

Leaving the shortcut recording mode may have a delay of up to 2 seconds.
Please report bugs and/or submit fixes or code enhancements to chrys87@web.de

AUTHOR

ocrdesktop was written by Chrys <chrys87@web.de>
B.H. wrote this manpage.

WEBSITE

https://aur.archlinux.org/packages/ocrdesktop/
https://wiki.archlinux.org/index.php?title=Ocrdesktop&redirect=no

Copyright Notice

You may distribute this work under the GNU General Public License version 3 or, optionally, any later version of the GPL if required by your project. http://fsf.org/

27/09/2015

manpage template generated by scriptst