Personal Audio Deep Dive

  • Background
  • Chain of custody
  • Where things stand
  • Possible directions


Ah, the memories

I still remember the very first time I wore earphones. I was with my dad and we were browsing the various random electronics stores in Sham Shui Po in Hong Kong. I had just purchased my very first cassette tape (the Ghostbusters soundtrack), he was giving me his old Walkman, and we were looking for a good pair of earphones to go with it. We settled on a set of unassuming Sony buds that came with a carry case which included a handy device for winding the cable up into a loop. On the street, just outside the shop, we put it all together – Walkman, cassette, battery, and finally the earbuds. We put them in my 10-year-old ears and pressed play.

Hard to believe, but this was once the pinnacle of personal portable audio

The sensation was like no other I had felt up until that point. It’s one of those experiences you could describe as “magical”. My ears, up until that point, had only heard music through a crackly radio, or on the TV. Thinking back, the most high-quality audio I would have been exposed to at that point in my life was probably at the cinema. However, this was different; without the distraction of a screen, I was able to focus completely on exactly one of my senses, and even though the equipment was very rudimentary, even by the standards of the day, the combination of blocking out ambient sounds, as well as dramatically shortening the distance between the source of the sound and my ear canal combined to make a listening experience unlike any I had had before.

Not long after my initial foray into audio that was more personal than the living room radio, I was introduced to the idea of studio monitors. One summer I was sent to a fun summer music course in which a bunch of kids were sat around a room with keyboards in front of us, and got to play around (as opposed to actually ‘play’) for an hour or so. To preserve his sanity, the instructor had all of us wear these huge headphones which allowed each of us to hear our own keyboard and ONLY our own keyboard. It was also around this time that the CD started to become the primary means by which people purchased music and, without realising it, ordinary consumers now had easy access to extremely high quality audio, and with it, the justification for better listening devices.

For the longest time, HiFi stereo systems were the best way to listen to music. Fundamentally, you would have a bunch of speakers, each with their most accurate frequency response over a certain range, and the ‘system’ would distribute sounds from different frequencies to the different speakers to take advantage of those speakers’ strengths. Speaker construction itself also had a bearing on sound, since a speaker which was responsible for low frequencies would sound better in a certain sized (and shaped) container in a similar way to how a violin, cello, and double bass are different sizes to better accommodate the different frequency ranges that those instruments produce.

Of course, there’s also physics – the closer you can get the speakers to your ears, the better you can control the sound that reaches them. Anyone who has owned a large HiFi speaker system will know that as you move around a room, the nature of the sound can change dramatically; I once set up a small home cinema system with an amplifier which included a small microphone which you would place in the center of the room so that the amp could ‘calibrate’ the signal timing to the various speakers. Headphones can theoretically circumvent this problem without going to all the trouble. Therefore my next set of ‘phones was by overpriced Danish brand Bang and Olufsen – the B&O Beoplay A8. The speakers (drivers, in audiophile-speak) were relatively unsophisticated, but one of the things that B&O is known for is very carefully tuning the frequency response to be “neutral”.

I bought these with money from my first ever paycheck from a summer job I did once, over 20 years ago

These ‘phones served me for a very long time literally travelling the world with me, outlasting several generations of iPod, and even more laptops. It was finally in 2007, when I got my (and the) first iPhone, back at a time when iPhones came with earphones, that I finally gave them up as my primary connection to my music. It’s not that they didn’t work with my iPhone, it was simply because the iPhone headphones also had a microphone which allowed you to talk on the phone (back then, it was something that I still did a lot). I was also chuffed to be able to use my phone as an iPod and save the trouble of having to carry two devices around.

Weirdly enough, my next set of headphones seemed like a step back in that it was both an “additional device” as well as being genuinely less functional as a music player compared to my iPhone. However, the lesson I learned here was the power of convenience. For reasons I cannot recall, I bought a set of Monster iFreePlay headphones and an iPod shuffle. Possibly the worst quality headphones I’ve owned in a long time, they paired with a device with no screen, and that couldn’t hold many songs. But they folded quite small, and allowed me to listen without having to fuss around with any wires. They say the best camera is the one you have with you, and in a similar way, the best listening device is the one you can most conveniently put on your ears, press play, and be listening to music.

My next headphones were motivated primarily by a use case which people who know me are probably wondering why it hasn’t come up sooner – plane trips. The truth is that I was in a period of my life where I was travelling a lot (more than normal, even for me) and I was on planes a lot. One day I showed up at the gate to fly to Calgary, and was unexpectedly upgraded to business class because all the other seats were full. Air Canada’s business class featured seats which folded down to completely horizontal and headphones which were made by Bose and had active noise cancelling technology. I originally thought ANC tech was a bit gimmicky, but after that flight, I was hooked and almost immediately purchased a set of Sennheiser MM450 headphones.

Primarily targeted at travellers, they were wireless, folded small, came with all kinds of adapters for airplane headphone jacks, and had a very decent active noise-cancelling system built in. Since the noise cancelling technology uses microphones to pick up background noise (in order to ‘cancel’ it with opposing-phased soundwaves), you could also use it to take calls. You could also charge the battery while you were using them, which is great for long flights, and when the battery went flat, you could plug a cable into them and use them old-school (this particular feature is why I chose them over the Bose, which become unusable when the battery is flat). I loved these headphones so much, I wrote a favourable review of them on this website, and despite some of the tech in them being quite old, I would still recommend these for anyone looking for a nice set of compact on-ear headphones (update: the model is discontinued).

These days, the market is split into either larger over-ear headphones (technically, “circumaural”, as opposed to “supra-aural” like the Sennheiser mm450) with similar features and functionality, or much smaller, what are sometimes called “true wireless” headphones – so called because not only do they lack wires to connect them to the source of their audio signal, but they also don’t have anything to connect the left to the right earbud. Despite me initially believing that my Sennheiser mm450s were the “end of history” as far as I was concerned in the personal audio space, there was one more use case which would make demands of technology which have only recently begun to be met – sports.

I like to skate. I also like to run. Try as I might (and I did), doing so with over-ear or on-ear headphones simply didn’t work; there was too much movement and the ‘phones would shake off my sweaty head. I used to ride my bike with the mm450s, but that only worked with my city bike (because I didn’t have to wear a helmet) and even then it was maybe a little dangerous because I couldn’t hear what was going on around me very well. For a long time, if I wanted to listen to music while I ran or skated, I had to go back to my old B&O A8 (I still have them), but eventually Apple dropped the headphone jack from their phones, and I have long since lost my iPod shuffle. So what was I to do?

Having to exercise with earphones presents a whole new set of challenges. My only portable source of audio was my phone, which had no headphone jack, rendering my B&O A8 unsuitable for purpose. I considered getting Apple AirPods, but when I tried them out at the shop, they unfortunately did not fit well in my ears and would no doubt come loose or fall out during as rigorous an activity as running. Turning to the internet, I learned about a company called Snugs who could make custom ear pieces which conformed to the shape of your ears while also attaching to your earphones. In the process of investigating this option for AirPods, I saw that they offered the service for other brands of headphones, among them my old friends Bang and Olufsen. Following the thread, I learned that B&O also made a true wireless earphone which had all the functionality of Apple’s AirPods, better sound, and (and this is what really piqued my interest) foam tips. Foam tips which were basically like earplugs; foam tips which would effectively block out most ambient sounds. I immediately bought a pair.

These ‘phones were a revelation. No fussing around with wires, excellent sound, great isolation, secure fit thanks to the foam tips, and a feature I hadn’t previously considered but now regard as indispensable – transparency. Curiously, my old Sennheiser mm450s also had a similar function but it wasn’t very good so I hardly used it. However B&O’s implementation worked very well and saved me the trouble of having to take them out whenever I had to talk to someone. In the end, the only reason I replaced them was because of their battery life (they once went flat when I was in the final 5k of a marathon), and I went back to another old friend: Mr Sennheiser.

When I first bought my Sennheiser Momentum True Wireless 2, I honestly thought I was buying more or less the same product as the B&O, except with slightly more waterproofing (sweat while working out had been a problem in the past) and significantly longer battery life (officially 7 hours, up from 4), but there were other ways in which these were an upgrade. The microphone array in the Sennheisers is more sophisticated, which means better call quality, better noise cancellation, and better transparency. Where previously transparency allowed me to easily converse with others and have a general idea of what was going on, now there was not only improved sensitivity and nuance in the microphones, but also a directional element. To my great surprise, when transparency mode was activated, not only could I hear everything around me, but I could quite accurately determine where sounds were coming from. This was one of those feature improvements that I didn’t know I needed.

In the world of personal audio there are, of course, higher quality earphones. Foam tips, while good, exert a certain amount of pressure on your ears (that’s how they stay in) which can become annoying or painful over time. As my previous research discovered, there is a world out there of custom-made tips and earphones which can produce an even better seal, with less pressure, against your ears, potentially blocking out even more ambient noise, as well as allowing the headphones to more accurately control how the sound enters your ears. The other rate-limiting factor is signal bandwidth – there is a limit to how much data can be transmitted over a Bluetooth connection, and that data rate is still nowhere near what can be sent via wires. The world of audio devices is ever changing and developing, and although we might not be there yet, there is every indication that we aren’t too far off that theoretical perfect set of headphones.

Right now, the Sennheisers are fairly close to how good audio quality can get in a true wireless set of earbuds (I’m talking about accurate sound reproduction here, not bass-heavy equaliser presets, which are more a matter of personal opinion). There are a handful which can unambiguously be described as ‘better’ now – Final Audio ZE8000, Noble Audio Fokus Mystique, Bose QuietComfort 2, Sennheiser’s own Momentum TW 3, and maybe the Bowers and Wilkins PI7 and NuraTrue Pro. However, there are other considerations (quality of the ANC, transparency mode, and call quality), and the rate-limiting factor in audio quality is no longer the quality of the drivers and the seal with the ear canal, but the Bluetooth codec. (None of these are custom-fitted either, although there are companies like Snugs and The Custom Art which make custom ear tips for many models of headphones. Also, ADV make a custom true wireless, the M5-TWS, but it has no transparency or ANC. So we’re tantalisingly close, but not quite there yet.)

Sennheisers with additional aftermarket custom tips made by Snugs. The fit and sound-isolation are heaven, but they do obstruct the charging contacts, and have to be removed in order for the headphones to be charged.

Chain of Custody

The Bluetooth bottleneck is a very interesting topic. (Before anyone asks, Wi-Fi obviously has a much higher data throughput, but consumes too much energy to be practical for portable headphones.) It all starts with analog-digital conversion, that is, how we convert analog sound waves into the ones and zeroes inside a computer. For example, a CD is digital audio at 16 bits and 44.1kHz. 16 bits means the distance between the quietest sound and loudest sound can be represented by up to 2^16 numbers, or 65,536, which in real-world terms translates to a dynamic range of about 96 decibels, and 44.1kHz simply means that the temporal resolution, or sampling rate, is 44,100 times per second. Why were these numbers chosen? 96dB was probably because listening to sounds that are louder than that for a long time can damage your hearing, and 44,100 was chosen because the upper limit of human hearing is considered to be about 20,000Hz (it gets less as you age) and Nyquist’s Theorem tells us that you need a sampling rate of at least twice the highest expected frequency to prevent loss of information. An important corollary of all of this is that more bits only improves your theoretical maximum dynamic range (into hearing-damaging ranges), and the extra frequencies captured by a higher sampling rate are literally imperceptible. Many audio engineers insist that a higher sampling rate improves the sound, but it turns out that the reason this happens is because the additional overhead does good things for the recording equipment (or rather, prevents weird/bad things from happening), and has nothing to do with sound reproduction. Further to that, very high sampling rates (e.g. 192kHz), although theoretically imperceptible, will often cause audible distortion, again because of the effect of sampling very high frequencies on recording equipment. All this is to say that CD-quality 16-bit 44.1kHz audio has a bitrate of 44,100 x 16 x 2 (stereo) = 1,411,200 bits per second (1411kbps) – remember this number.
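
The arithmetic above is easy to check. Here is a minimal Python sketch that reproduces the CD bitrate, the roughly 96 dB figure for 16 bits, and the highest frequency that a 44.1kHz sampling rate can represent. The function names are made up for illustration and do not come from any audio library.

```python
import math


def pcm_bitrate(sample_rate_hz, bit_depth, channels=2):
    """Uncompressed PCM bitrate in bits per second."""
    return sample_rate_hz * bit_depth * channels


def dynamic_range_db(bit_depth):
    """Ratio between the largest and smallest representable level, in dB."""
    return 20 * math.log10(2 ** bit_depth)


def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency that can be captured without loss of information."""
    return sample_rate_hz / 2


cd_bitrate = pcm_bitrate(44_100, 16)
print(cd_bitrate)                      # 1411200
print(round(dynamic_range_db(16), 1))  # 96.3
print(nyquist_limit_hz(44_100))        # 22050.0
```

The 22,050Hz limit sits just above the 20,000Hz usually quoted as the ceiling of human hearing, which is the point of the 44.1kHz choice.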

Bluetooth was originally conceived of for mobile phone headsets. The audio demands of those are certainly not in audiophile territory, but the Bluetooth standard has evolved over the years for all kinds of short range data transfers from heart rate monitor chest straps, to smart watches, to portable temperature sensors and coffee cup warmers. The audio codecs have also slowly evolved to carry more information. As a starting point there are the SBC and AAC codecs which are standard on most Bluetooth listening devices. They support bitrates of between 256 and 345kbps. You’ve probably noticed that 345kbps is significantly lower than our CD-quality bitrate mentioned above. This necessarily means that some compression is needed, and for that amount of compression, with current technology, it is necessarily “lossy”, which simply means that after a round of compression and decompression (and then reconversion to analog sound waves) there is a loss of information. More recently there have been codecs like aptX (384kbps), aptX HD (576kbps), LDAC (990kbps), and very recently aptX Lossless (1100-1200kbps), and as you can see, the bitrates are getting very close to that magical 1411 number, with (as the name suggests) aptX Lossless claiming CD-quality with lossless compression, which seems entirely possible with that high a bitrate. (There are a lot of moving parts, but there is a known tradeoff between computational power and bandwidth when talking about information throughput – if you’re trying to squeeze 1411kb through a 1411kb ‘pipe’ then you don’t need any computational power, you can just send it, but with a very high amount of computational power at either end, you could theoretically squeeze that same packet through a much smaller pipe. You can reasonably assume the tiny chips in earphones not only have a limited amount of computational power, but are also limited by power-use considerations.)
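
To put those codec numbers next to the CD figure, here is a rough Python sketch that works out how much each codec would have to shrink a 1411kbps stream to fit. The bitrates are simply the ones quoted above, taken at face value (real-world figures may differ), and the dictionary and function names are hypothetical.

```python
# CD-quality stereo PCM, in kilobits per second
CD_KBPS = 44_100 * 16 * 2 / 1000

# Bitrates as quoted in the text; upper figures are used where a range was given
CODEC_KBPS = {
    "SBC/AAC": 345,
    "aptX": 384,
    "aptX HD": 576,
    "LDAC": 990,
    "aptX Lossless": 1200,
}


def required_compression_ratio(codec_kbps, source_kbps=CD_KBPS):
    """How many times smaller the stream must become to fit the codec."""
    return source_kbps / codec_kbps


for name, kbps in CODEC_KBPS.items():
    ratio = required_compression_ratio(kbps)
    print(f"{name:14s} {kbps:5d} kbps  needs about {ratio:.2f}:1 compression")
```

The closer the ratio gets to 1:1, the less work the compression has to do, which is why a lossless claim looks far more plausible at 1200kbps than it would at 345kbps.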

This sounds like game over, but it really isn’t. Unfortunately these higher bitrate codecs appear in a vanishingly small number of earphones. LDAC, for example, was invented by Sony and mostly appears in Sony headphones and (with a licensing fee) in a handful of other brands’ products. aptX is a product of chip-maker Qualcomm (your laptop’s wifi and Bluetooth chips are probably made by them) and even though some of the older aptX codecs are fairly widely adopted, at the time of writing only 1 set of mainstream earbuds (and 3 sets from brands I’ve never seen or heard of) has the most recent one (Lossless) installed. Also of concern is that only a handful of phones have the technology to transmit the codec, meaning that your options to combine transmitter and receiver (necessary for actually experiencing lossless audio over Bluetooth) are extremely limited. The vast majority of available audio sources (phones, mostly) are still using SBC or AAC. Infuriatingly, ALL portable Apple devices are still using AAC, which makes no sense since Apple Music now offers high quality lossless music files through its streaming service. One can only hope that they’ve been working on their own Bluetooth codec to compete with Qualcomm, but even then you can bet that for the first few years at least, the codec will only be supported by Apple’s own headphones, which as I mentioned above, fall out of my ears.

So what’s an audiophile to do? Go back to wires? Of course, in researching the space, I also came across custom in-ear monitors, which are considered the pinnacle of audio quality, and are the go-to for professional musicians either working in the studio or performing live, because of both their accurate sound reproduction and ability to block out ambient sound. Because these were developed with a very different use case in mind, none were designed with transparency, the ability to take calls, or any kind of active noise cancelling. They also all have wires. Although this avenue of investigation initially seemed like a dead end, there are now companies out there who make wireless ‘dongles’ which are not so unlike the electronics-containing part of hearing aids, and many of these devices are designed specifically with wired monitors in mind; some even come with features like transparency, the ability to take calls, and active noise cancellation.

A set of dongles – the gold tips are mmcx connectors, a very common standard for the cables of high-end in-ear monitors.

Where Things Stand

So while we’re not quite there, we are surprisingly close. Not only that, we have two distinct avenues of development which could both produce our ultimate personal audio device. On the one hand, we have true wireless earbuds which are not considered true audiophile quality just yet, but are very quickly improving (the Fokus Mystique have two balanced armatures and a dynamic driver FFS – which is similar hardware to what you find in many professional IEMs), and on the other you have the option to pair an existing custom IEM with a detachable Bluetooth dongle. The former has the appeal of everything existing in a unified device and being designed to work together, while the latter pairs a mature, proven technology with a fast-developing one (which is, in truth, already very good), and has the additional appeal that you can choose to unplug the wireless dongle and plug the wire back in, for example when you’re sitting at home at your desk, or maybe when you’re on a plane. The downside, however, is that it is very difficult to effectively weatherproof (or even sweat-proof) a device with a connection exposed to the elements.

ADV Audio make custom true wireless in-ear buds, but the charging system is awkward, and the charging ports are very exposed

I must say that I find both paths very appealing. I already use my Sennheisers with custom silicone tips, and I can’t recommend these highly enough, for the improvement in sound quality, the isolation from outside noise, and the comfort. However, the tradeoff is that I must remove the tips in order to be able to fit them into their charging case, and to expose the charging contacts. The custom in-ear true wireless buds manufactured by ADV don’t even have a charging case, and implement a charging solution which I can charitably describe as “awkward”. We are, however, not doomed in the space of custom in-ear true wireless buds – Ultimate Ears UE Fits (which have light-activated, mouldable ear tips, but curiously, poor sound quality), and now Final Audio with their recent ZE8000, have a form factor where the charging contacts are not in a position to be encumbered by the addition of custom tips. The only thing that makes me hesitant is that, at this level of features and audio quality (and price), I’d like my buds to last a long time, and at the rate that technology like Bluetooth advances, coupled with the rate at which batteries degrade, I’m not sure these buds would have a particularly long life.

Due to the design of these ‘phones, a large custom-fitted ear tip wouldn’t obstruct the charging contacts.

With that in mind, maybe the better direction to go is pairing a custom IEM with a bluetooth dongle. The space for these dongles is quickly evolving, and we can expect bluetooth to gradually increase its rate of data throughput, perhaps eventually equalling that of wires or at least being able to match the information throughput of human hearing resolution. That is just as well, because good custom IEMs can cost an order of magnitude above what we’ve become used to paying for true wireless in-ear phones. Being able to hold on to the “expensive bit” and swap out the bluetooth bit every few years, as technology improves, sounds quite appealing. So why haven’t I bought a nice set of custom IEMs? Well, first of all, there is the cost; while even the most expensive true wireless earphones can be had for less than $500, you can easily find your bill in the thousands for good custom IEMs. The other reason is somewhat less obvious if you (like me) have been focussing hard on the hardware of the earphones – the quality of your source audio.

Dongle paired with an IEM

Anyone who’s ever messed around with HiFi audio before knows that it’s a gadget-lover’s dream or nightmare depending on your budget. There are all manner of little widgets which stand between a musician performing their thing, and you hearing it. I could write an entire article on this, but to summarise, the primary way that we listen to music these days is through computers (mostly smartphones, but even modern high-end dedicated music players are essentially small computers), and the vast majority of music files out there are not high quality enough to reveal the subtle differences between good and very good earphones. Things are changing though. Apple Music’s streaming service for example is starting to offer high quality lossless-compression files and Spotify is rumoured to have a similar feature in the works. What this theoretically means is that the file you’re getting is the same quality as the original recording. Is that all? Of course not. Anyone with even a basic understanding of physics knows that sound is a bunch of pressure waves moving through the air and not, oddly enough, a bunch of ones and zeros. What I’m getting at is that the information in these digital files must be converted into the analog signal which is “sound”, and that requires a good digital-analog converter (DAC). Computers all come with their own DACs, but despite the computing power available, these are very lacklustre. External, specialised DACs are generally superior because not only are they built specifically for purpose, but they are also not subject to the kind of EM interference that you might come across inside a computer. A digital signal retains its fidelity in transmission because it implements error-correction. Without going into too much detail, when sending digital packets of information, a computer ‘pads out’ these packets with spare bits so that when they are received the decoding computer knows exactly how many bits there should be, and if a one or zero is flipped here and there, it can be ‘corrected’. An analog signal has no such facility so is heavily dependent on its environment being free of anything which might introduce noise (seriously, even the material of your headphone cord has a slight but often noticeable impact on sound quality).
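
To make the ‘spare bits’ idea concrete, here is a minimal Python sketch of one classic error-correcting scheme, the Hamming(7,4) code, which pads 4 data bits with 3 parity bits and can repair any single flipped bit. This is an illustrative choice only: it is not a claim about which scheme Bluetooth or any particular audio link actually uses, and the function names are hypothetical.

```python
def encode(data):
    """Pad 4 data bits with 3 parity bits to form a 7-bit codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # checks codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # checks codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(codeword):
    """Recover the 4 data bits, repairing a single flipped bit if present."""
    c = list(codeword)
    # Each syndrome bit re-runs one parity check on the received bits.
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # Read as a binary number, the syndrome is the 1-based position
    # of the flipped bit (0 means every check passed).
    error_position = s1 + 2 * s2 + 4 * s3
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


original = [1, 0, 1, 1]
sent = encode(original)
received = list(sent)
received[4] ^= 1  # simulate interference flipping one bit in transit

print(sent)                          # [0, 1, 1, 0, 0, 1, 1]
print(decode(received) == original)  # True
```

The cost is bandwidth: 7 bits travel for every 4 bits of payload. An analog signal has no equivalent of the parity checks, so noise picked up along the way simply becomes part of the sound.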

So in other words, apart from laziness, I’m probably going to pause for the time being on my Sennheiser Momentum TW2 paired with custom-molded silicone tips (which almost doubled the cost of the entire setup, in a similar way in which my road bike’s power meter nearly doubled the cost of my road cycling setup). Since those buds were released there have been a handful of significantly better-sounding true wireless earphones (including Sennheiser’s own Momentum TW3), but the improvement is slight and, to me, not worth the extra cost. The quality of the drivers in the earphones, coupled with the perfect fit of custom tips means that the real rate-limiting factor is the data throughput of the Bluetooth standard, and the quality of the original audio file (90% of my music files are 320kbps mp3s which is easily covered by the base SBC/AAC codecs). I’m sure that in a few years, some combination of advances in bandwidth, lossless data compression, and the computational power to put it all together will make it possible to wirelessly transmit sound to a pair of earphones at a high enough quality to faithfully reproduce the quality of the original master recordings. By then, who knows if it will all be packed into a true wireless set of earbuds with decent battery life, good transparency function, and well-implemented active noise cancellation technology, or whether I’ll have to fork out thousands for a set of custom in-ear monitors to attach to a separate bluetooth module. Maybe there will be something completely different to fill the space.

It’s hard to know what the endgame is here. Perhaps an earbud with an unobtrusive form factor, which at the touch of a button can let all of the outside sounds through, as if you weren’t wearing earbuds at all, or at the touch of another button, let in a pre-defined percentage of outside sounds (e.g. for being in an orchestra pit, where you need to hear the sound of the instruments around you, but don’t want to damage your hearing), or at the touch of yet another button, completely block out all outside sounds. And in addition to all of that, take perfectly clear calls and reproduce sound exactly as it was recorded. One can only hope.

Unexplored Paths

A significant avenue I haven’t explored is the rapidly developing technology of bone conduction headphones and microphones. The principle here is that a part of your headphone presses against some bones in your skull and vibrates them in such a way that you ‘hear’ the sounds in your ears even though no part of the mechanism is doing the ‘normal’ thing of vibrating a membrane at certain frequencies, causing vibrations to propagate through the air, which is how we usually receive sounds. This method has the advantage of not fatiguing your eardrums or ear canals (the pressure from sound vibrations in your ears over long periods of time can be fatiguing, and over very long periods of time or at very high volumes, can damage your hearing). At present, the technology doesn’t seem to be capable of reproducing sound at nearly the same fidelity as standard speakers, but who knows what’s possible in the future. Bone conducting microphones are also a fascinating technology as they reproduce sound by sensing the vibrations in your jawbone, and they’ve been very popular not so much in contexts requiring sensitive reference microphones but in situations where there is a lot of background noise (which obviously doesn’t vibrate your jawbone in quite the same way that your own voice does). Perhaps some combination of all of these technologies will eventually lead to the perfect earphone.
