Developing software in the Real World

Automatic OCR with Hazel and PDFPen

I have a useful scanner as part of my networked HP printer that will scan directly to a shared directory on my computer. Once there, I want the file to be renamed to the current date and the document OCR’d so that I can search it.

To do this, I use Hazel and PDFPen and this is a note to ensure that I can remember to do it again if I ever need to!

Firstly, rename the file. My scanner names each file with the prefix scan, so the Hazel rule is quite simple:

If all the following conditions are met:
	Name starts with scan

Do the following to the matched file or folder:
	Rename with pattern: [date created][extension]

This is the screenshot:

Hazel1

Having renamed the file, we can use PDFPen’s AppleScript support to perform an OCR of the document:

If all the following conditions are met:
	Extension is pdf
	Date Last Modified is after Date Last Matched

Do the following to the matched file or folder:
	Run AppleScript embedded script

The embedded AppleScript is:

tell application "PDFpen"
	open theFile as alias
	tell document 1
		ocr
		repeat while performing ocr
			delay 1
		end repeat
		delay 1
		close with saving
	end tell
	quit
end tell

This is the screenshot of it in Hazel:

Hazel2

That’s it. Scanning a document now results in a dated, OCR’d PDF file in my Scans folder.

6 thoughts on “Automatic OCR with Hazel and PDFPen

  1. Or in Hazel do a check to see if the document contents contain one of the following a, e, i, o u = if the document has an OCR layer then hazel should find a vowel (how good/accurate the OCR has been is a different matter)

  2. Hello Rob,

    well, I am going to built up my own automation with Hazel, Devonthink Pro etc. Right now I have a small working solution which use PDFPenPro for OCRing. But if you import documents directly by Devonthink, Devonthink starts for OCR the Abbyy FineReader, which belongs to Devonthink.

    Do you know, which application is the best for OCR? And if Abby FineReader is the best, how can I call this application in the AppleScript to do the job?

    Kind regards,

    Michael

Thoughts? Leave a reply

Your email address will not be published. Required fields are marked *