Categories
Programming Technology Tools

Automated Playlist Backup With Swift

I mentioned in my post about scripting with Swift that I’d been working on something that inspired this. Well, here’s what it was: a rewrite of my automated playlist backup AppleScript in Swift. That version ran every hour… ish. Partly that scheduling issue is because launchd doesn’t actually guarantee scheduling, just ‘roughly every n seconds’, and partly it’s because the AppleScript was slow.1
Then I found the iTunesLibrary API docs, such as they are, and thought “well, that’d be a much nicer way to do it.”
And then I remembered that Swift can be used as a scripting language, cracked my knuckles, and got to work. (I also had some lovely reference: I wrote up my very basic intro post, but this post goes further in depth on some of the concepts I touched on.)

https://gist.github.com/grey280/0126ac93df1d52d91e78f52d97805246

Not the best API I’ve ever written, but not bad for something I threw together in a few hours. And I had fun doing it, more so than I did with the AppleScript one.
Oh, and it’s much faster than the AppleScript equivalent: this runs through my ~100 playlists in under a minute. So now I have it run every 15 minutes.2
(The configuration for launchd is about the same, you just replace the /usr/bin/osascript with the path to the Swift file, and make the second argument the full path to the directory where you want your backups going. See the original post for the details.)
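For anyone who can’t load the gist, here’s a rough sketch of the general shape of such a script. It is not the script itself. The TrackLine type, both function names, and the Markdown layout are made up for illustration, it writes a flat folder rather than mirroring the playlist hierarchy, and I haven’t checked it against current macOS releases. The iTunesLibrary calls are guarded so the pure formatting part still compiles elsewhere.

```swift
import Foundation
#if canImport(iTunesLibrary)
import iTunesLibrary
#endif

/// The three fields recorded for each track (hypothetical helper type).
struct TrackLine {
    let title: String
    let artist: String
    let album: String
}

/// Renders a playlist as Markdown: a heading, then one bullet per track.
/// The layout is an assumption, not necessarily what the gist produces.
func renderMarkdown(playlistName: String, tracks: [TrackLine]) -> String {
    var lines = ["# \(playlistName)", ""]
    for track in tracks {
        lines.append("- \(track.title) by \(track.artist) (\(track.album))")
    }
    return lines.joined(separator: "\n") + "\n"
}

#if canImport(iTunesLibrary)
/// Writes one Markdown file per playlist into `directory` (flat, no folders).
func backupPlaylists(to directory: URL) throws {
    let library = try ITLibrary(apiVersion: "1.0")
    try FileManager.default.createDirectory(at: directory,
                                            withIntermediateDirectories: true)
    for playlist in library.allPlaylists {
        let tracks = playlist.items.map { item in
            TrackLine(title: item.title,
                      artist: item.artist?.name ?? "Unknown Artist",
                      album: item.album.title ?? "Unknown Album")
        }
        // A slash in a playlist name would be read as a path separator.
        let safeName = playlist.name.replacingOccurrences(of: "/", with: "-")
        let file = directory
            .appendingPathComponent(safeName)
            .appendingPathExtension("md")
        try renderMarkdown(playlistName: playlist.name, tracks: tracks)
            .write(to: file, atomically: true, encoding: .utf8)
    }
}
#endif
```

To run it from launchd as described, you’d put #!/usr/bin/swift on the first line, make the file executable, and end it with a call along the lines of try backupPlaylists(to: URL(fileURLWithPath: CommandLine.arguments[1])).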
I’m a bit tempted to turn this into a macOS app, just so I can play around with SwiftUI on macOS, and make it a bit easier to use. Of course, by ‘a bit tempted’ I mean ‘I already started tinkering,’ but I doubt I’ll have anything to show for a while — near as I can tell, SwiftUI has no equivalent to NSOutlineView as of yet, which makes properly showing the list a challenge. Still, it’s been fun to play with.


  1. I was going to cite this lovely resource, but since that website was built by someone who doesn’t understand the concept of a URL, I can’t link to the relevant section. Click ‘Configuration,’ then the ‘Content’ thing that’s inexplicably sideways on the left side of the screen, and ‘StartInterval’ under ‘When to Start’. 
  2. I’m also looking at the FSEvents API to see how hard it would be to set it up to run whenever Music (née iTunes) updates a playlist, but that… probably won’t happen anytime soon. 
Categories
Technology Tools

Automatic OCR with Hazel: The Easy Way

I have previously written about how to run OCR (Optical Character Recognition) on a PDF using Hazel and… a complicated pile of Python scripts and other software. Since I wrote that post, several of those pieces of software have been updated, and the core component has been, apparently, entirely abandoned.
Recently, while I was waiting for yet another keyboard replacement on my MacBook, I took another look at the OCR thing and found that there’s a much easier way available: OCRmyPDF.
It’s easy to install, assuming you’ve already got brew: brew install ocrmypdf
From there, it’s just a single action in Hazel: “Run embedded shell script,” with ocrmypdf "$1" "$1" as the script. (ocrmypdf expects both an input and an output path; giving it the same file twice OCRs that file in place.)
Admittedly, you can use some of their many settings to get something a bit nicer than just OCR; personally, I’m using --rotate-pages --deskew --mask-barcodes – the first two to help with variations in the input because I sometimes use a bed scanner, and the latter to help Tesseract, which can have issues with barcodes.
I’ve also paired it with a couple additional actions, just to keep everything organized:

I also took the time to stop using Dropbox as the go-between for my scanner and the Mac running Hazel; I’d forgotten that the scanner has a USB port. Plug in a cheap flash drive, and it’s available as a (very slow) file server. Mount the drive, add it as a Login Item so it’ll auto-mount on boot, and you can set Hazel automations to run right there. I’m not OCRing them there, though — like I said, it’s a very slow server, so it tags them ‘for OCR’ and moves them to my desktop.1


  1. With iCloud Drive handling my desktop, I’ve found it to be a pretty great ‘intake’ folder for all of my Hazel automations. It’s quite nice to be able to save a PDF from my phone, add a tag, and watch it disappear again as it’s auto-sorted, or throw a PDF on my desktop with a tag and see it pop in and out as the OCR runs. 
Categories
Technology Tools

Automatic Playlist Backup

You may have seen my monthly playlist posts on here; I put those together with a Shortcut that grabs the playlist, runs through all the songs, and makes a spirited attempt to fill in all the links off the iTunes Store Search API without hitting their mysterious rate limits.1
It’s not the be-all end-all, though — I’ve been wanting more and more lately to start making more and smaller playlists, things to match different moods. Y’know, the way normal people do playlists.
But, of course, I’m me, and I want to have the history of my music tastes, because, hey, sometimes you feel like reminiscing.
So, what to do? Well, I’ve done some work with the iTunes Library XML file, and while it’s sorta true that just wrapping that in, like, Git or something for version control could work, there are three problems with that:
1. iTunes is a weird, weird piece of software, and I don’t want to mess with its files too much.
2. The result is not at all human-readable.
3. It isn’t an excuse to learn something new.

So, what else can I do? Well, I’ve done a very light bit of tinkering with AppleScript,2 so I know it can interact with iTunes pretty well; there’s gotta be a way to do it there, right?
There is! I’ll share the script in a moment, but the functionality I wanted was “clear out the folder I give you, replicate my playlist hierarchy as directories, and spit out each playlist as a markdown file listing the title, artist, and album for each track, then commit the changes to a git repository.”
It took a while to get working — I’ve learned that AppleScript’s repeat with in loop is hilariously slow, unless you change it to repeat with in (get). I’ve also found out that the way it works with paths is super annoying, and that while it can write to a file, it can’t conceptualize creating a directory. There’s some great workarounds for that.
Now, here’s the script: I’ve left a couple {replace me} type things where you should fill in variables – namely, the path to your home directory (or wherever else you want it), and your own username, to fix some permission issues that can crop up.3

https://gist.github.com/grey280/9b95fdc8c16ec544a214f159bd008bbc

But wait, there’s a caveat: it’ll fail if the folder you gave it isn’t a git repository. Considering that I wanted this as a ‘set it and forget it’ sort of thing, I figured it wouldn’t be worth the effort to write a bunch of conditional code to do the setup. Do it yourself: git init && touch temp.txt && git add temp.txt && git commit -m "Initial commit" takes care of all you need.4
Oh, and if you want it to be pushing the changes somewhere, because you’re paranoid and want everything in someone’s cloud, at least, add the remote and set it as the default upstream: git remote add origin {remote URL} && git push --set-upstream origin master

Set It and Forget It

So that’s pretty neat, but it isn’t really “set it and forget it,” now, is it? You’ve gotta open up Script Editor, pull up the script, and run it every time you want it to back up your playlists. Possibly workable for some people, but I don’t have a home server for nothing. Let’s make this truly automated.
From my prior experience with AppleScript, I know that you can set it off through a shell script by way of /usr/bin/osascript, so my first thought was to add a cron job. After a bit of research, though, I found out that Apple would prefer we use launchd instead, so I set about figuring out how to do that.
Now, if this wasn’t all an excuse to learn how to do something, I’d probably have just bought one of the GUI clients for launchd; Lingon looks pretty nice, and seems to work well.5
The process for writing your own launchd process is actually pretty simple: create a .plist file containing some XML, add it to the launchd queue with launchctl, and you’re off to the races!6
(Hint: if you want an easier way to see if your script runs than waiting and checking git log, you can add a line to the start of the AppleScript: display notification "Running playlist export".)
So, creating the XML: you want it to live in ~/Library/LaunchAgents/, and the convention is the usual reverse-TLD. (You can also use local.{your username}.{your script name}, but I’m so used to using net.twoeighty. in bundle identifiers that I just went with that.)
The important parts are the ProgramArguments array and the StartInterval integer. For ProgramArguments, give it the path to osascript,7 and as your second argument, the full path to the .scpt of the AppleScript.
Then, set the StartInterval to the number of seconds between runs; I’m using 3600, because hourly change tracking seems frequent enough for my purposes.
The result:

https://gist.github.com/grey280/f643a159a426ae25eb57139afd4f3cd5

(You can skip the StandardErrorPath and StandardOutPath – they help a little with debugging, more so if you’re running a full shell script and not a wrapper on an AppleScript.)
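If you’d rather not hand-write the XML, Foundation can generate it for you. This is a sketch that covers only the three keys discussed above (Label, ProgramArguments, StartInterval); the function name is mine, and the optional log-path keys are left out.

```swift
import Foundation

/// Builds the XML property list for a launchd agent that runs
/// `programArguments` roughly every `interval` seconds.
/// (Hypothetical helper; only the keys discussed in the post are included.)
func launchAgentPlist(label: String,
                      programArguments: [String],
                      interval: Int) throws -> Data {
    let job: [String: Any] = [
        "Label": label,
        "ProgramArguments": programArguments,
        "StartInterval": interval
    ]
    return try PropertyListSerialization.data(fromPropertyList: job,
                                              format: .xml,
                                              options: 0)
}
```

Calling it with the label net.twoeighty.backupPlaylists, the arguments /usr/bin/osascript plus the full path to your .scpt, and an interval of 3600 should give you data you can write straight to a .plist file in ~/Library/LaunchAgents/.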
Finally, add it to the queue:

launchctl load ~/Library/LaunchAgents/net.twoeighty.backupPlaylists.plist

And there you go – every hour, your iTunes playlists will get backed up to your Git repo, and you’ll have a nice history of your music tastes over time.


  1. iTunes Search is a really fun API to use, because via Shortcuts you only get a single input to it, and it is really bad at finding anything. Seriously — try to find anything off the top charts. As far as iTunes Search is aware, Billie Eilish doesn’t exist. 
  2. In lieu of Shortcuts having a way to set the volume on a HomePod, I’ve achieved a similar result with “run SSH script: osascript -e tell iTunes ...”. 
  3. Related: don’t put this anywhere with weird macOS access control things. Y’know, places like “Documents”, “Desktop”, anywhere in iCloud Drive or Dropbox, or even “Downloads”, which apparently is a much worse work directory than I thought it was. I eventually configured it to run out of and into my Public directory, because I figured that’d be easier than trying to mess around with the permissions somewhere else. 
  4. Without a file there, the git rm -rf . && git clean -fxd bit at the beginning is unhappy. 
  5. I used the ‘free trial’ version as a viewer for my works in progress; I figured if I’d done something really wrong, it’d complain about it being an invalid file or something. 
  6. He said, glossing over the couple hours of “fight me, macOS, why isn’t this working” 
  7. Probably /usr/bin/osascript, but you can use which osascript in Terminal to check. 
Categories
Programming Technology

Wrapping UserDefaults

UserDefaults, formerly NSUserDefaults, is a pretty handy thing. Simply put, it’s a lightweight way of storing a little bit of data: things on the order of user preferences, though it’s not recommended to throw anything big in there. Think “settings screen,” not “the image cache” or “the database.” It’s all based on the Defaults system built into macOS and iOS,1 and it’s a delightfully efficient thing. From the docs:

UserDefaults caches the information to avoid having to open the user’s defaults database each time you need a default value. When you set a default value, it’s changed synchronously within your process, and asynchronously to persistent storage and other processes.

How handy is that! All the work of writing to disk, abstracted away just like that. Neat!
Now for the downside: it’s got a very limited range of types it accepts.2 Admittedly, one of these is NSData, but it can be a bit annoying to do all that archiving and unarchiving all the time.
One solution I use is writing a wrapper on UserDefaults. Swift’s computed properties are a very neat way to do it, and any code you write elsewhere in your project will feel neater for it.
The basic idea is this:
[gist https://gist.github.com/grey280/82b91e70ef49e087a0aefe3e9374d2b7 /]
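In case the embed doesn’t load, a minimal sketch of that idea looks something like this. The Bool type and the key string are assumptions made for the example; the gist may well store something else.

```swift
import Foundation

// A computed property that reads and writes UserDefaults directly.
// The "storedSetting" key and the Bool type are illustrative choices.
var storedSetting: Bool {
    get { UserDefaults.standard.bool(forKey: "storedSetting") }
    set { UserDefaults.standard.set(newValue, forKey: "storedSetting") }
}
```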
There you go: you’ve got an easy accessor for your stored setting.
Of course, we can make this a lot neater; we’ll start by wrapping it up in a class, and make a couple tweaks while we do that:
[gist https://gist.github.com/grey280/ff61d2f31a0c9f3fc2e3595a55ae9de5 /]
First, we made a variable to point at UserDefaults.standard instead of doing it directly. This isn’t strictly necessary, but it makes things a lot easier to change if you want to switch to a custom UserDefaults suite later.3
Secondly, we pulled the string literal out and put in a variable instead. Again, this is more about code maintainability than anything else, but that’s certainly a good thing to be working for. Personally, I tend to wrap all my keys up in a single struct, so my code looks more like this:
[gist https://gist.github.com/grey280/77bccd85bb1843dbd7360f7e9eecc38a /]
That’s a matter of personal taste, though.
You might also have noticed that I made both the keys and the UserDefaults.standard private — I’ve set myself a policy that any access of UserDefaults that I do should be via this Settings class, and I make it a rule that I’m not allowed to type UserDefaults anywhere else in the app. As an extension of that policy, anything I want to do through UserDefaults should have a wrapper in my Settings class, and so private it is: any time I need a new setting, I write the wrapper.
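Put together, a class along those lines might look like the sketch below. It assumes the same hypothetical Bool setting as before, and the exact names inside Keys are mine rather than the gist’s.

```swift
import Foundation

final class Settings {
    // One place to swap in a custom suite later.
    private static let defaults = UserDefaults.standard

    // All key strings live together, and nothing outside Settings can see them.
    private struct Keys {
        static let storedSetting = "storedSetting"
    }

    // Static accessor, used as Settings.storedSetting.
    static var storedSetting: Bool {
        get { defaults.bool(forKey: Keys.storedSetting) }
        set { defaults.set(newValue, forKey: Keys.storedSetting) }
    }
}
```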
There are a few more implementation details you can choose, though; in the example above, I made the accessors static, so you can grab them with Settings.storedSetting. That’s a pretty nice and easy way to do it, but there’s a case to be made for requiring Settings to be initialized: that’s a great place to put in proper default values.4
[gist https://gist.github.com/grey280/c067ec6f165e498f4a8e0a01164a74eb /]
In that case, accessing settings could be Settings().storedSetting, or
[gist https://gist.github.com/grey280/0080b5a0cfeb3d56e1140c81eec1edb4 /]
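As a sketch of what the initializer buys you: register(defaults:) supplies fallback values that apply until something is actually stored. The refreshInterval setting, its default of 15, and the injectable defaults parameter are all invented for this example.

```swift
import Foundation

final class Settings {
    private let defaults: UserDefaults

    private struct Keys {
        static let refreshInterval = "refreshInterval"
    }

    // Passing in the UserDefaults instance makes a custom suite easy to use.
    init(defaults: UserDefaults = .standard) {
        self.defaults = defaults
        // Registered values are fallbacks only; they are never written to disk.
        defaults.register(defaults: [Keys.refreshInterval: 15])
    }

    var refreshInterval: Int {
        get { defaults.integer(forKey: Keys.refreshInterval) }
        set { defaults.set(newValue, forKey: Keys.refreshInterval) }
    }
}
```

With this, Settings().refreshInterval reads 15 until you assign something else, instead of the 0 that integer(forKey:) would otherwise return.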
You could also give yourself a Settings singleton, if you like:
[gist https://gist.github.com/grey280/e393bc00557a0c24e407d25dc2b4cecb /]
I don’t have a strong feeling either way; singletons can be quite useful, depending on context. Go with whichever works best for your project.
And finally, the nicest thing about writing this wrapper: you can save yourself a great deal of repeated code.
[gist https://gist.github.com/grey280/2395469f039d96ba1e4d3558a74a839f /]
Or, if you don’t want to have a default return, make it optional; it’s not much of a change:
[gist https://gist.github.com/grey280/c03eb527f9ec5d7fb15d519563085875 /]
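Here is one way both variants could look side by side. This is a sketch with made-up settings (a Theme enum and a lastSync date), not the contents of the gists: the first accessor hides the string conversion and falls back to a default, while the second returns nil when nothing has been stored.

```swift
import Foundation

// Hypothetical setting type, stored as its raw string value.
enum Theme: String {
    case light
    case dark
}

final class Settings {
    private static let defaults = UserDefaults.standard

    private struct Keys {
        static let theme = "theme"
        static let lastSync = "lastSync"
    }

    /// Default-returning variant: falls back to .light when nothing valid is stored.
    static var theme: Theme {
        get { defaults.string(forKey: Keys.theme).flatMap(Theme.init(rawValue:)) ?? .light }
        set { defaults.set(newValue.rawValue, forKey: Keys.theme) }
    }

    /// Optional variant: nil means "never set"; assigning nil removes the value.
    static var lastSync: Date? {
        get { defaults.object(forKey: Keys.lastSync) as? Date }
        set {
            if let date = newValue {
                defaults.set(date, forKey: Keys.lastSync)
            } else {
                defaults.removeObject(forKey: Keys.lastSync)
            }
        }
    }
}
```

Callers just write Settings.theme = .dark and never see the raw string or the key.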
You can also do similar things with constructing custom classes from multiple stored values, or whatever else you need; mix and match to fit your project.
(Thoughts? Leave a comment!)


  1. If you’ve ever run defaults write from the Terminal, that’s what we’re talking about. 
  2. If it matters, it’s also not synced; the defaults database gets backed up via iCloud, but if you want syncing, Apple recommends you take a look at NSUbiquitousKeyValueStore.
  3. If you want your preferences shared between your app and its widget(s), or between multiple apps, you need to create a custom suite; each app has its own sandboxed set of defaults, which is what UserDefaults.standard connects to. 
  4. UserDefaults provides default values, depending on types, but they may not be the same defaults that you want. If you want a stored NSNumber to default to something other than 0, you’ll need to do that initial setup somewhere. 
Categories
Technology Tools

Automatic OCR with Hazel

I recently got a copy of Hazel and have been doing a bit of tinkering around with various ways to automate my file management. Because, y’know, I can do it by hand, but why would I when I can make a computer do it for me? That’s the whole point of computers, after all.
I have a great deal of PDFs — something about scanning every paper, handout, receipt, or bit of mail I’ve received in the past six years or so does that. And if you have a commercial-grade scanner, it can be pretty easy to automate that stuff with Hazel, as the scanner will run everything it scans through Optical Character Recognition, and the PDF you’ll get will be nicely searchable.1
Unfortunately, the scanner I’ve got, while a pretty good one, is in a different price tier than the ones that’ll do the automatic OCR, so I needed a way of doing that after the fact.
There are some guides to doing that, such as this one,2 but they tend to require either Acrobat Pro or PDFPen Pro, which both have price tags above the “a couple hours of tinkering and no money” that I was hoping to spend on this project.
Throw a few computer science keywords on what you’re Googling, though, and you’ll find stuff that’s more in that vein.3 So, compiled here after I used Chase as a guinea pig, a guide to putting together automated OCR for free.4

Prerequisites

Before we can automate OCR, we need a few things installed. Open up Terminal, and let’s go.
sudo easy_install pip
(For those of you who didn’t put a few years into classes on computer science, I’ll try to explain as I go along. That first word, sudo, means “super user do”, basically; it’s the Admin Override for terminal commands. Be careful with it, you can make quite a mess tinkering with it. The next bit, easy_install, is part of the version of Python that comes default with macOS. pip is what we’re telling easy_install to install; ironically, pip is the modern version of easy_install.5)
The first time you use sudo in a Terminal session, you’ll be prompted for your password; if you’re not an administrator on the Mac you’re using, you’ll need an administrator password. That’s a good opportunity to check with the administrator if this is something you should be doing at all.
Once pip is done installing, we’re going to get another installation helper, Homebrew:
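/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
(No sudo on this one: the Homebrew installer refuses to run as root, and asks for your password itself if it needs it.)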
Again, this is just installing a piece of software, Homebrew.

Components

Now that we’ve got the infrastructure built, we’re going to install the components that the OCR system uses.
brew install tesseract
brew install ghostscript
brew install poppler
brew install imagemagick
(If any of those fail, don’t reach for sudo, since Homebrew refuses to run as root. Run brew doctor instead and follow whatever it suggests, then try the install again.)
For reference: Tesseract is the actual OCR engine, Ghostscript makes it easier to interact with the PDF format,6 Poppler is similarly PDF-related, and ImageMagick handles conversion between basically any types of images.
Finally, we’ll use pip to install a specific version of one more library:
sudo pip install reportlab==3.4.0
ReportLab is yet another PDF-related library, but version 3.5.0 has some compatibility issues with the OCR system.

Installation

Finally, we’ll get the actual thing that ties these all together:
sudo pip install pypdfocr
PyPDFOCR is a lovely open-source project that ties all these components together into a single thing. Once it’s installed, you can use it from the terminal:
pypdfocr {filename}, where you replace {filename} with the non-OCR’d version of the file you want in OCR’d form.7 It’ll take a bit to run, but once it’s done, you’ll have a file (named {filename}_ocr.pdf) that contains, hopefully, the text of the document you scanned.89
Go ahead and test it; if you get an error about the file not being found, see if the file name or directory structure included a space. If it did, tweak the command a bit: instead of pypdfocr {filename} you’ll need to do pypdfocr "{filename}".
You may also get an error that mentions File "/Library/Python/2.7/site-packages/pypdfocr/pypdfocr_pdf.py", line 190… and a bit more after that. If it’s AttributeError: IndirectObject…, then you’ll need to tweak part of the code.10
cd /Library/Python/2.7/site-packages/pypdfocr
sudo nano pypdfocr_pdf.py
That’ll open up nano, a very lightweight text editor. Press control+W, type in orig_rotation_angle = int(original_page.get and hit return; this will take you to the line we want to edit. It’ll read orig_rotation_angle = int(original_page.get('/Rotate', 0)) — we want to change it to orig_rotation_angle = int(original_page.get('/Rotate', 0).getObject()) by adding .getObject() before the last close-paren.
Once you’ve done that, press control+X, then hit return again. Try OCRing something again; it should work this time.

Using Hazel

Now all you need to do to have Hazel automatically OCR a PDF is, in the actions, add a “Run shell script” action, use “embedded script”, and in the ‘edit script’ bit, put in pypdfocr "$1".
Keep in mind, this doesn’t replace the PDF in place, it’ll create a copy with _ocr added to the end of the name. If you’d like the original to be deleted once it’s done, rather than having Hazel do it, just add a second line to the embedded script: rm "$1"
You’ll probably want another rule to move the OCR’d versions somewhere else; while you’re building that, you can also use the ‘rename’ action to remove the _ocr bit, just tell it to replace “_ocr” with “”.
Have fun automating!


  1. And, as a result, useable for Hazel sorting by way of the ‘contents’ filter. 
  2. I was hoping to link to Katie Floyd’s original post about it, but her website is down at the moment, so I guess I won’t be doing that. 
  3. Technically speaking, I think all I added was “site:github.com”, but that did the trick. 
  4. This assumes you have a Mac, since you’re working with Hazel, and that you’re willing to do a bit of tinkering in the terminal, which I also kinda assumed, since you’re working with Hazel. 
  5. I think that’s irony; I was a computer science major, not an English major. 
  6. “the Portable Document Format format” 
  7. Tip: you can type pypdfocr  (including the trailing space) and then drag-and-drop the PDF from Finder into the Terminal, and it’ll automatically fill in the filename. If any part of the path includes a space, though, it’ll fail, so for filenames or folders that contain spaces, do pypdfocr "{filename}" – type pypdfocr ", drop in the file, and then ", and then hit enter. 
  8. Caveat: Tesseract isn’t perfect, especially with regard to the formatting, so don’t expect this to give you a perfectly-formatted version of whatever you scanned. That said, the process is lossless: {filename}_ocr.pdf is built by taking the original PDF file and then adding an invisible text layer over the analyzed text, so you won’t lose any information by doing this, you just might not gain anything useful. 
  9. Note that it’ll spit {filename}_ocr.pdf out not necessarily where the original file was, but wherever the Terminal session currently is; if you’re unsure about where that is, you can use pwd to have it displayed, or just open . to open it in Finder. 
  10. Don’t ask me why this is all “you might have to do this”, because I genuinely don’t know why this problem only pops up some of the time.