Automatic OCR with Hazel and PDFpen
My networked HP printer includes a scanner that saves scans directly to a shared directory on my computer. Once a scan arrives there, I want the file renamed to the current date and the document OCR’d so that I can search it.
First, rename the file. My scanner gives each file a name starting with "scan", so the Hazel rule is quite simple:
If all the following conditions are met:
    Name starts with scan
Do the following to the matched file or folder:
    Rename with pattern: [date created][extension]
[Screenshot: the rename rule in Hazel]
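To make the rule's logic concrete, here is a rough Python sketch of the same idea. It is not how Hazel works internally. The function names (dated_name, creation_time, rename_scans) are made up for illustration, the YYYY-MM-DD date format is my assumption (Hazel lets you format the [date created] token differently), and the prefix match here is case-sensitive, which may not mirror Hazel exactly.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Optional


def dated_name(filename: str, created: datetime, prefix: str = "scan") -> Optional[str]:
    """Return the new name for a scanner file, or None if the rule does not match."""
    if not filename.startswith(prefix):
        return None
    # Assumed date format; the original extension is kept unchanged.
    return created.strftime("%Y-%m-%d") + Path(filename).suffix


def creation_time(path: Path) -> datetime:
    """Best-effort creation time: st_birthtime where available (macOS), else mtime."""
    stat = path.stat()
    return datetime.fromtimestamp(getattr(stat, "st_birthtime", stat.st_mtime))


def rename_scans(folder: Path) -> List[Path]:
    """Rename every matching file in folder and return the new paths."""
    renamed = []
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        new_name = dated_name(path.name, creation_time(path))
        if new_name is None:
            continue
        target = path.with_name(new_name)
        if target.exists():
            continue  # skip rather than overwrite an earlier scan from the same day
        path.rename(target)
        renamed.append(target)
    return renamed
```

With a date-only pattern, two scans made on the same day would collide on the same name. This sketch simply skips the second file; adding a time component to the pattern would avoid the collision altogether.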
Having renamed the file, we can use PDFpen’s AppleScript support to OCR the document:
If all the following conditions are met:
    Extension is pdf
    Date Last Modified is after Date Last Matched
Do the following to the matched file or folder:
    Run AppleScript embedded script
The embedded AppleScript is:
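    -- theFile is supplied by Hazel to the embedded script
    tell application "PDFpen"
        open theFile as alias
        tell document 1
            ocr
            -- wait until PDFpen reports that OCR has finished
            repeat while performing ocr
                delay 1
            end repeat
            -- brief pause before saving
            delay 1
            close with saving
        end tell
        quit
    end tell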
[Screenshot: the embedded AppleScript in Hazel]
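If you want to try the OCR script on a single file without going through Hazel, one option is to feed it to macOS's osascript command. The Python sketch below builds the same script with theFile bound to a path you supply, since Hazel is not there to provide it. The helper names (applescript_string, build_ocr_script, run_ocr) are hypothetical, and run_ocr assumes you are on macOS with PDFpen installed.

```python
import subprocess


def applescript_string(text: str) -> str:
    """Quote text as an AppleScript string literal (escapes backslashes and quotes)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_ocr_script(pdf_path: str) -> str:
    """Build the OCR script with theFile set from pdf_path instead of by Hazel."""
    return "\n".join([
        # Resolve the alias outside the tell block, then hand it to PDFpen.
        f"set theFile to POSIX file {applescript_string(pdf_path)} as alias",
        'tell application "PDFpen"',
        "    open theFile",
        "    tell document 1",
        "        ocr",
        "        repeat while performing ocr",
        "            delay 1",
        "        end repeat",
        "        delay 1",
        "        close with saving",
        "    end tell",
        "    quit",
        "end tell",
    ])


def run_ocr(pdf_path: str) -> None:
    """Run the script via osascript; raises CalledProcessError if it fails."""
    subprocess.run(["osascript", "-e", build_ocr_script(pdf_path)], check=True)
```

Calling run_ocr("/path/to/a/scan.pdf") should then open the file in PDFpen, OCR it, save it and quit, just as the Hazel rule does. Note that the file must already exist, because the script resolves it as an alias.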
That’s it. Scanning a document now results in a dated, OCR’d PDF file in my Scans folder.