26 1 / 2012

My UTF-8 Byte Order Mark (BOM) Adventure

Another random issue that I resolved earlier with the help of the Internet…

For quite some time now, I’ve had one particular JavaScript file in Extensible that, after every run through my build script, would come out the other side with a garbage character at the top of the file. When running it in the browser the page would choke with a strange “ILLEGAL CHARACTER” error showing the following arcane string: ï»¿

I never even knew where to begin with this, and frankly just assumed it was a bug in the Java-based build tool. I’ve been manually removing that character after my builds for, well, way too long. Tonight I finally decided to fix it, and a brilliant idea struck me: how about Googling “ï»¿”? Well, duh, of course that led me immediately to many explanations of the good old UTF-8 Byte Order Mark, which is:

“Character code U+FEFF at the beginning of a data stream, where it can be used as a signature defining the byte order and encoding form, primarily of unmarked plaintext files. A BOM is useful at the beginning of files that are typed as text, but for which it is not known whether they are in big or little endian format—it can also serve as a hint indicating that the file is in Unicode, as opposed to in a legacy encoding.”
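To make that definition a bit more concrete, here is a small Python sketch (purely an illustration, not anything from the build script). It shows that U+FEFF encoded as UTF-8 is the three bytes EF BB BF, and that those same bytes read as Windows-1252 come out as three junk characters, which is why a BOM handled with the wrong encoding tends to show up as garbage at the top of a file. The helper name `has_utf8_bom` is made up for this example.

```python
import codecs

# U+FEFF encoded as UTF-8 is the three-byte sequence EF BB BF.
bom_bytes = "\ufeff".encode("utf-8")
assert bom_bytes == codecs.BOM_UTF8 == b"\xef\xbb\xbf"

# Misreading those bytes as Windows-1252 turns them into visible characters.
mojibake = bom_bytes.decode("cp1252")


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the raw bytes start with a UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


print(repr(mojibake))
print(has_utf8_bom(b"\xef\xbb\xbfvar x = 1;"))
print(has_utf8_bom(b"var x = 1;"))
```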

OK, awesome. All of my files are already encoded as UTF-8, apparently without byte order marks, so I have no clue how this one file got one inserted. How to get rid of it?

I use Aptana as my JavaScript editor, so I tried converting the encoding, but it turns out there is no explicit option to encode without a BOM, so Aptana silently ignores the existing BOM regardless of the encoding (apparently it’s a general Eclipse issue).

A little more Googling led me to a pointer that in Notepad++ you can explicitly encode to UTF-8 without a BOM. Although I work on OS X I use Windows VMs for testing, so I cranked up Parallels, copied my script into Windows 7, downloaded Notepad++, converted, copied back, and presto, problem solved! There’s undoubtedly a way to do the conversion in OS X, but this was easy enough and it worked.
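For what it’s worth, the same conversion can be done without an editor at all. The following is a minimal Python sketch, assuming the file really is UTF-8 and the only problem is the three BOM bytes at the very start. The function name `strip_utf8_bom` and the file name in the usage line are hypothetical, not part of Extensible or its build.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(path) -> bool:
    """Rewrite the file without a leading UTF-8 BOM.

    Returns True if a BOM was found and removed, False if the file
    was left untouched.
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    # Drop only the three BOM bytes; the rest of the file is unchanged.
    file_path.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Calling something like `strip_utf8_bom("my-script.js")` on the offending file should have the same effect as the Notepad++ round trip. It works on raw bytes rather than decoded text, so nothing else in the file gets re-encoded along the way.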

Yet another completely useless factoid that was hard-won, and that I’ll almost certainly forget by tomorrow morning.

09 12 / 2011

Fixing a Stuck VM

My Parallels virtual machine got stuck in a “stopping” state earlier (after an aborted reboot of my physical machine) but it would never stop, and could not be restarted. Luckily this worked like a charm.