Category: Computing

Automatic OCR with Hazel and PDFPen

I have a useful scanner as part of my networked HP printer that will scan directly to a shared directory on my computer. Once there, I want the file to be renamed to the current date and the document OCR'd so that I can search it.

To do this, I use Hazel and PDFPen and this is a note to ensure that I can remember to do it again if I ever need to!

Firstly, rename the file. My scanner names each file with the prefix scan, so the Hazel rule is quite simple:

If all the following conditions are met:
	Name starts with scan

Do the following to the matched file or folder:
	Rename with pattern: [date created][extension]

[Screenshot: the rename rule in Hazel]

Having renamed the file, we can use PDFPen's AppleScript support to perform an OCR of the document:

If all the following conditions are met:
	Extension is pdf
	Date Last Modified is after Date Last Matched

Do the following to the matched file or folder:
	Run AppleScript embedded script

The embedded AppleScript is:

tell application "PDFpen"
	open theFile as alias
	tell document 1
		ocr
		repeat while performing ocr
			delay 1
		end repeat
		delay 1
		close with saving
	end tell
	quit
end tell

[Screenshot: the OCR rule in Hazel]

That's it. Scanning a document now results in a dated, OCR'd PDF file in my Scans folder.

Using Phive to manage PHPUnit

I recently came across the Phive project and have had a play with it. Phive is part of phar.io and is intended to manage development tools such as PHPUnit in preference to using Composer's dev dependencies. The main advantages of Phive are that it uses the phar file of the tool and only keeps one copy of each version rather than downloading a new copy into each project.

How it works

Phive stores one copy of each phar file within ~/.phive/ and then for each project, it creates a symlink. For example, to install PHPUnit into a project, you simply change directory to the root of your project and type:

$ phive install phpunit

This will download the phpunit phar file (if it's not already downloaded) to your ~/.phive directory. It will then create a tools directory in your project containing a phpunit symlink to the phar file.

$ ls -l tools
total 8
lrwxrwxr-x  1 rob  staff  42  3 Jan 07:34 phpunit -> /Users/rob/.phive/phars/phpunit-5.7.5.phar

You can now run PHPUnit using ./tools/phpunit. If you would rather install to a different directory, e.g. bin, then use the --target switch:

$ phive install --target bin/ phpunit

The phive.xml file

Phive also creates a phive.xml file to keep track of what it has installed. You should add this file to your git repository and ignore the installed files in tools.

$ echo -e "phpunit" >> tools/.gitignore
$ git add phive.xml tools/.gitignore
$ git commit -m "Add phpunit via Phive"

Other users of the project can now install all the Phive tools using:

$ phive install

Installing for global use

Phive also supports global installation with the -g switch:

$ phive install -g phpunit
Phive 0.6.2 - Copyright (C) 2015-2017 by Arne Blankerts, Sebastian Heuer and Contributors
Downloading https://phar.phpunit.de/phive.xml
Copying phpunit-5.7.5.phar to /usr/local/Cellar/php71/7.1.0_11/bin/phpunit

You may need sudo privileges, but for my Homebrew installation, this wasn't necessary.

GPG signatures required

Note that Phive only works with projects that also release a GPG signature for their tools. This is good security, but currently a significant limitation if you use anything other than the few tools listed on phar.io.

The most significant missing tool for me is PHP_CodeSniffer. There's an open issue on their bug tracker, but as it's been open for 6 months now, I assume that it isn't important for that project, which is frustrating.

As such, this limits the usefulness of Phive for me today, but it's certainly a project to watch.

SSH keys in macOS Sierra

Now that I've upgraded to macOS 10.12 Sierra, I noticed that SSH required me to enter the passphrase for my keys every time I used them. This was a surprise as it's not how 10.11 El Capitan worked.

This is how to fix it.

Firstly, add your SSH key's passphrase to the keychain using ssh-add -K ~/.ssh/id_rsa (or any other key file). You can now use your SSH key without re-typing the passphrase all the time, which is very handy for use with GitHub/GitLab/Bitbucket/etc.

You can add as many keys as you like and ssh-add -l will show you which keys are registered.

When you reboot, you'll notice that ssh-add -l is empty. This differs from macOS 10.11 and earlier, which automatically re-added the keys they knew about. In Sierra, Apple has changed this so that you need to explicitly add the known identities to the SSH agent with ssh-add -A, which you have to run after every reboot.

To save having to do this, you can either add ssh-add -A to your ~/.bash_profile file or update your SSH config by editing ~/.ssh/config and adding:
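A minimal sketch of the config route, assuming the key is ~/.ssh/id_rsa and that the OpenSSH bundled with macOS is in use (UseKeychain is an Apple-specific option, so other builds of ssh may reject it). The helper name add_keychain_config is illustrative only; it appends a Host * block unless the file already mentions UseKeychain.

```shell
# Append a "Host *" block that loads keys into the agent on first use and
# reads their passphrases from the macOS keychain.
# add_keychain_config is a hypothetical helper; the key path is an assumption.
add_keychain_config() {
    config_file="${1:-$HOME/.ssh/config}"
    key_file='~/.ssh/id_rsa'    # left unexpanded on purpose; ssh expands the ~
    if [ -n "${2:-}" ]; then
        key_file="$2"
    fi

    mkdir -p "$(dirname "$config_file")"

    # Do nothing if the option is already configured.
    if [ -f "$config_file" ] && grep -q 'UseKeychain' "$config_file"; then
        return 0
    fi

    cat >> "$config_file" <<EOF

Host *
    AddKeysToAgent yes
    UseKeychain yes
    IdentityFile $key_file
EOF
}
```

Calling add_keychain_config with no arguments edits ~/.ssh/config; pass a different file as the first argument to try it out safely first.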

SSH will now work as expected and you'll never need to reenter your passphrase once it has been added to the system keychain.

Using Charles Proxy's root SSL certificate with Homebrew curl

Once I installed Homebrew's curl for HTTP/2 usage, I discovered that I couldn't automatically proxy SSL through Charles Proxy any more.

$ export HTTPS_PROXY=https://localhost:8888
$ curl https://api.joind.in/v2.1/
curl: (60) SSL certificate problem: self signed certificate in certificate chain
More details here: https://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

This is a nuisance.

As I've noted previously, you need to install Charles' root certificate to use it with SSL. On OS X, you do Help -> SSL Proxying -> Install Charles Root Certificate which installs it into the system keychain.

However, this doesn't work with the Homebrew curl or with the curl functions inside PHP. To fix this, we need to add the Charles root certificate to OpenSSL's default_cert_file.

I've talked about this file before. The quickest way to find it is to run:

$ php -r "print_r(openssl_get_cert_locations());"

on the command line. The output should be similar to:

Array
(
    [default_cert_file] => /usr/local/etc/openssl/cert.pem
    [default_cert_file_env] => SSL_CERT_FILE
    [default_cert_dir] => /usr/local/etc/openssl/certs
    [default_cert_dir_env] => SSL_CERT_DIR
    [default_private_dir] => /usr/local/etc/openssl/private
    [default_default_cert_area] => /usr/local/etc/openssl
    [ini_cafile] =>
    [ini_capath] =>
)

As you can see, the file I need is /usr/local/etc/openssl/cert.pem.

Grab the root certificate from the Charles app. On Mac, that's the Help -> SSL Proxying -> Save Charles Root Certificate menu item.

You can then append the root certificate to the default cert file:

$ cat charles_root.crt >> /usr/local/etc/openssl/cert.pem

Now, everything works:

$ curl https://api.joind.in/v2.1/
{"events":"https:\/\/api.joind.in\/v2.1\/events","hot-events":"https:\/\/api.joind.in\/v2.1\/events?filter=hot","upcoming-events":"https:\/\/api.joind.in\/v2.1\/events?filter=upcoming","past-events":"https:\/\/api.joind.in\/v2.1\/events?filter=past","open-cfps":"https:\/\/api.joind.in\/v2.1\/events?filter=cfp","docs":"http:\/\/joindin.github.io\/joindin-api\/"}

(It also works in PHP as that's linked against the same curl if you followed my post on enabling HTTP/2 in PHP.)

Notes on keyboard only use of OS X

It's been a while since I could use a trackpad without pain and even longer since I could use a mouse or trackball. Currently I use a Wacom trackpad as my mouse which works really well as long as I don't use it too much. Fortunately, keyboard usage doesn't seem to be a problem (yet?), so I try to use my Mac using only the keyboard as much as possible. These are some notes for others who may be in the same boat.

This is a potpourri of items as I think of them. Hopefully, over time, I'll come back to this article and organise it better!

Alfred

A launcher application is invaluable. Spotlight is built in to OS X and bound to cmd+space. However, I use Alfred with the keyboard shortcut set to the same cmd+space as it's more powerful. This lets me open applications, folders and run scripts very easily.

Shortcat

Shortcat allows me to click on any control in a native application (known as Cocoa applications) via the magic of OS X's built in accessibility system. This is fantastic and has to be seen to be believed. I've bound it to alt+space and once activated, I type in the first letters of the thing I'm trying to click and it will highlight it. I can then press enter to click it, ctrl+enter to right click it, etc. Even more usefully, typing . into the Shortcat box will highlight every control in the window, so I can click on items that do not have any associated text.

It is this tool, more than anything else, which has made Safari my main browser.

Keyboard access to the menu

One of the nice things about well-written Mac native apps is that pretty much all operations are available as a menu item. You can access the menu via the keyboard by pressing shift+cmd+?. This opens the Help menu with the search box focussed. You can now type the first few letters of the menu item you're looking for and easily find it. Alternatively, use the arrow keys to navigate the menu.

Note that on OS X, some menu items have alternatives. Hold down the option key to see those.

For right clicking, I use Shortcat. This is rarely needed on OS X as the right click menu's items are usually available directly from the main menu.

Moving and resizing windows

I use Mercury Mover which I really like. I'm not sure that it's still offered for sale though. Alternatives include Moom, SizeUp, Phoenix and Spectacle.

Browsers

As I've said, Safari is my main browser as it supports Shortcat and the cross-platform ones don't. However, I have a link selector extension for all three main browsers as it's quicker on Safari for well written HTML and the only option in Firefox and Chrome:

Other resources

See also:

Fin

Every so often, I will get "stuck" in a control or app and can't get out. This is really frustrating and invariably, the only solution is to use the mouse. The web in particular is the most problematic, including apps that are essentially websites in a window. I've also noted that apps that write their own versions of the native OS X controls are generally inaccessible as invariably the developer doesn't hook into the Cocoa accessibility framework with their own control.

OS X has a really good accessibility framework and it seems that all native apps get it for free. As a result, it's certainly possible to use OS X without a mouse with the right tools and knowledge.

Installing 32 bit packages on Ubuntu 14.04

This had me stumped for a bit, so I've written it down. If you have a 64 bit version of Ubuntu and want to install a 32-bit package, you simply add :i386 to the end of the package name like this:
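As a sketch, with install_i386 as an illustrative helper name and the package name left as a parameter rather than any particular package:

```shell
# Install the 32-bit (i386) build of a package on a 64-bit Ubuntu system
# by appending the :i386 architecture qualifier to its name.
# install_i386 is a hypothetical helper name.
install_i386() {
    pkg="$1"
    sudo apt-get install "${pkg}:i386"
}
```

For example, install_i386 libc6 runs sudo apt-get install libc6:i386 (libc6 is only an example package).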

However, this didn't initially work for me as apt-get couldn't find the package:

It turned out that my installation only had the 64 bit architecture configured which you can tell by running:
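One way to check, sketched with dpkg's two architecture queries (the helper name show_architectures is illustrative):

```shell
# Print the native architecture and any foreign architectures dpkg knows about.
# show_architectures is a hypothetical helper name.
show_architectures() {
    echo "native: $(dpkg --print-architecture)"
    echo "foreign: $(dpkg --print-foreign-architectures)"
}
```

On a system with only the 64-bit architecture configured, the foreign line comes back empty.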

Note that there are no foreign architectures, which is the problem.

The solution is to add the i386 architecture first:
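A minimal sketch of this step, assuming a dpkg recent enough to support --add-architecture (the one in Ubuntu 14.04 is). The helper name add_i386_arch is illustrative, and it skips the change when i386 is already registered:

```shell
# Register i386 as a foreign architecture unless it is already configured.
# add_i386_arch is a hypothetical helper name.
add_i386_arch() {
    if dpkg --print-foreign-architectures | grep -qx 'i386'; then
        echo "i386 already enabled"
    else
        sudo dpkg --add-architecture i386
    fi
}
```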

That's better! Now we need to run an update:

Don't forget this update! I did and wondered why I still had the problem…
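A sketch of the update step, followed by a quick check that apt can now see the i386 build of a package. The helper name refresh_and_check is illustrative, and the check assumes apt-cache accepts the :i386 qualifier, which it should on a multiarch system:

```shell
# Refresh the package lists so apt learns about i386 packages, then confirm
# that the given package is available for that architecture.
# refresh_and_check is a hypothetical helper name.
refresh_and_check() {
    pkg="$1"
    sudo apt-get update
    # Succeeds only if apt knows about the i386 build of the package.
    apt-cache show "${pkg}:i386" > /dev/null
}
```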

Now the installation of the 32-bit package works:

All done!

Git submodules cheat sheet

Note: run these from the top level of your repo.

Clone a repo with submodules:

View status of all submodules:

Update submodules after switching branches:
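The three entries above can be sketched as small shell functions. The function names are illustrative, and the repository URL is whatever you pass in; the git options themselves are standard.

```shell
# Clone a repository together with all of its submodules.
# Arguments are passed straight to git clone (URL, optional directory).
clone_with_submodules() {
    git clone --recursive "$@"
}

# Show the checked-out commit of every submodule, including nested ones.
submodule_status() {
    git submodule status --recursive
}

# After switching branches, check out the submodule commits that the
# branch records, initialising any submodule that is new on that branch.
sync_submodules() {
    git submodule update --init --recursive
}
```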

Add a submodule:

Update all submodules to latest remote version

Update a specific submodule to the latest version (explicit method):
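A sketch of the add and update entries above. The function names, the default branch name master and the commit message are assumptions, and --remote needs a Git version that supports it; older versions can pull inside each submodule instead.

```shell
# Add a submodule at the given path.
add_submodule() {
    git submodule add "$1" "$2"
}

# Move every submodule to the latest commit on its tracked remote branch.
update_all_submodules() {
    git submodule update --remote
}

# Explicit method for one submodule: pull inside it, then record the new
# commit in the parent repository. The branch defaults to master (assumption).
update_one_submodule() {
    sub_path="$1"
    branch="${2:-master}"
    (cd "$sub_path" && git pull origin "$branch") &&
        git add "$sub_path" &&
        git commit -m "Update submodule $sub_path"
}
```

After update_all_submodules, the parent repository still needs a commit to record the new submodule commits.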

Remove a submodule:
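One removal route, sketched under two assumptions: Git is new enough to have submodule deinit, and the submodule's name is the same as its path (the default), so its cached data lives in .git/modules/<path>. The helper name remove_submodule is illustrative.

```shell
# Remove a submodule: unregister it, delete its cached repository data,
# then remove it from the index and the working tree.
# remove_submodule is a hypothetical helper name.
remove_submodule() {
    sub_path="${1:-}"
    # Refuse an empty path so the rm below can never touch all of .git/modules.
    if [ -z "$sub_path" ]; then
        echo "usage: remove_submodule <path>" >&2
        return 1
    fi
    git submodule deinit -f "$sub_path" &&
        rm -rf ".git/modules/$sub_path" &&
        git rm -f "$sub_path"
}
```

Commit the result afterwards so that the removal is recorded in the parent repository.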

or

Docs:

Routing specific traffic to the VPN on OS X

I have a client that requires me to use a VPN when connecting to their servers. I use OS X's built in L2TP VPN to connect, but don't want all my traffic going that way.

To do this, I unchecked the Advanced VPN setting "Send all traffic over VPN connection" in the Network preferences and then created the file /etc/ppp/ip-up like this:

The file itself is a bash script that runs various /sbin/route commands and looks similar to this:

/etc/ppp/ip-up:
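A minimal sketch of such a script, written here as a function that generates the file so it can be inspected before being installed. It relies on pppd passing the interface name (for example ppp0) as the first argument to ip-up; the helper name write_ip_up and the 192.168.1.0/24 default are assumptions based on the 192.168.1.x range mentioned in this post.

```shell
# Write a pppd ip-up script that routes one subnet through the VPN interface.
# write_ip_up is a hypothetical helper; the default subnet is an assumption.
write_ip_up() {
    target="$1"
    subnet="${2:-192.168.1.0/24}"
    cat > "$target" <<EOF
#!/bin/sh
# Send traffic for $subnet over the VPN interface that pppd just brought up.
/sbin/route add -net $subnet -interface "\$1"
EOF
    # pppd only runs the script if it is executable.
    chmod 755 "$target"
}
```

For example, write_ip_up ./ip-up creates the script locally; copying it to /etc/ppp/ip-up needs root, and the file must stay executable.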

Now, whenever I connect to the VPN, only traffic for hosts on 192.168.1.x is sent to the client's VPN and we're both happy.

Provisioning with Ansible within the Vagrant guest

I've been setting up a Vagrant VM for use with some client projects and picked Ansible to do this. Firstly, I played with the Ansible provisioner, but found it a little slow and then I realised that Ansible doesn't run on Windows.

Rather than migrate what I'd done to Puppet, Evan recommended that I look into running Ansible on the guest instead and provided some hints. This turned out to be quite easy.

These are the steps when starting from the ubuntu/trusty64 base box.

Use the Shell provisioner

As we're running Ansible on the guest, we use the shell provisioner, so my Vagrantfile contains:

This simply tells Vagrant to run init.sh which is stored in the provisioning directory of my project.

I immediately noticed a warning when running vagrant up: a "stdin: is not a tty" error. This is due to the way Ubuntu tries to echo a message to a shell that isn't interactive. To get rid of this, we need to configure Vagrant's ssh shell to be a non-login one in the Vagrantfile:

init.sh

Our shell provisioner needs to do two things:

  1. Install Ansible
  2. Run our playbook

So, init.sh looks like this:
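A minimal sketch of an init.sh along these lines, assuming the project is mounted at Vagrant's default /vagrant path, that the provisioner runs as root, and that the playbook targets localhost. The function names and the playbook path are illustrative.

```shell
#!/usr/bin/env bash
# Sketch of provisioning/init.sh, run as root by Vagrant's shell provisioner.

# True if the ansible package is already installed.
ansible_installed() {
    dpkg-query -W -f='${Status}' ansible 2>/dev/null | grep -q 'ok installed'
}

# Install Ansible quietly from the Ansible PPA.
install_ansible() {
    apt-get -qq update
    apt-get -qq install -y software-properties-common
    add-apt-repository -y ppa:ansible/ansible
    apt-get -qq update
    apt-get -qq install -y ansible
}

# Install Ansible only when it is missing, then run the playbook against
# the guest itself using a local connection and an inline inventory.
provision() {
    playbook="${1:-/vagrant/provisioning/setup.yml}"   # path is an assumption
    ansible_installed || install_ansible
    ansible-playbook -i 'localhost,' --connection=local "$playbook"
}
```

To use it as the provisioning script, end the file with a call to provision.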

Obviously, we only want to install Ansible once, so we check the output of dpkg-query and only install if it's not already installed. Installation is easy enough: we just use apt-get to install what we need (quietly!) from the ansible ppa as per the docs.

Once Ansible is installed, we can run ansible-playbook with the --connection=local switch to run our playbook, setup.yml in my case.

At this point, it's all standard Ansible all the way for your provisioning. It's just faster and works with Windows.

Git push to multiple repositories

I have a couple of projects where I need to push to more than one repo all the time.

I have been using this command line to do so:
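The idea is simply one git push per remote. As a sketch, with an illustrative helper name and the remote names passed in as arguments:

```shell
# Push the same branch to several remotes, stopping at the first failure.
# push_to_each is a hypothetical helper; remote names come from the caller.
push_to_each() {
    branch="$1"
    shift
    for remote in "$@"; do
        git push "$remote" "$branch" || return 1
    done
}
```

For example, push_to_each master origin backup pushes master to origin and then to backup, where both remote names are placeholders.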

However, I recently discovered that I can create a remote that points to more than one repository using these commands:
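A sketch of the commands involved, wrapped in an illustrative helper that takes the two repository URLs as parameters:

```shell
# Create a remote named "all" whose push URLs cover both repositories.
# add_all_remote is a hypothetical helper; both URLs come from the caller.
add_all_remote() {
    first_url="$1"
    second_url="$2"
    # The first repository doubles as the fetch URL.
    git remote add all "$first_url"
    # An explicit push URL replaces the default one, so the first repository
    # has to be listed as a push URL too.
    git remote set-url --add --push all "$first_url"
    git remote set-url --add --push all "$second_url"
}
```

Afterwards, git push all master sends the branch to both URLs.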

I now have a remote called all that will push to both repositories!

There's no automatic way to go the other way and fetch from multiple repositories though as apparently it makes less sense to fetch the same branch identifier from multiple places.

Further details in this email by Linus in 2006.

If you want to script the creation of the all remote, then you could use this script which manipulates the remote configuration settings directly:
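A minimal sketch of such a script, assuming every existing remote has a single URL. It gives the all remote one url entry per existing remote, and Git pushes to every url entry when no separate push URL is set. The function name is illustrative.

```shell
#!/bin/sh
# Sketch of git-add-push-all.sh: build an "all" remote whose URL list holds
# the URL of every other remote configured in the current repository.
add_push_all() {
    # Start from a clean slate so that re-running does not duplicate URLs.
    git config --remove-section remote.all 2>/dev/null || true
    for remote in $(git remote); do
        git config --add remote.all.url "$(git config --get "remote.$remote.url")"
    done
}
```

When saving this as a standalone script, end the file with a call to add_push_all so that running the script does the work.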

Create this as /usr/local/bin/git-add-push-all.sh and then you can just run it in the root of your project.