In this first article in a new category of entries ‘Design it better’, I’m going to take things, usually websites or applications, I’ve used or seen that have potential, but could be designed in a better way.
I’m also going to provide an insight into some of the thinking that went into the design enhancement so you can see why it was necessary. While the example is trivial and the solution obvious, you’ll see there are other solutions that could have been used, and why the selected one is better.
The first thing I’m going to cover is a new feature found in iTunes 9.1 called ‘Convert higher bit rate songs to 128 kbps’.
Converting higher bit rate songs to 128 kbps
The first question is why is this available at all? You can see it in the image below:
I imagine the primary answer is that the iTunes Store now sells songs at 256kbps (the figure refers to the file’s ‘bit rate’), but the iPod Touches, which sell like hotcakes, have limited storage capacities (32GB / 64GB) compared to the hard-drive based iPods (160GB). The smaller iPods, the Nano and Shuffle, have even smaller capacities, limiting the amount of music to a few tens of CDs.
At first glance, it seems reasonable, but let’s look at some other circumstances in which it would be useful, some of the flaws in the implementation and therefore why it needs some enhancement to make it truly useful for a variety of situations.
Ripping CDs for listening and / or for archiving
In my own situation, when I first used iPods and iTunes, I ripped my CDs at 160kbps (instead of the default 128kbps) to give the ripped file a bit more quality. I’d like to think I’m a bit of an audiophile in that I like high quality music and equipment, but I’ve got enough self control not to blow a small fortune!! A few years later, I then re-ripped them at 256kbps VBR, again to increase quality as I upgraded my iPod from a 40GB to 160GB version.
However, more recently, I’ve been re-ripping my CDs using ALAC (Apple Lossless Audio Codec) to provide both an archive copy of my CDs, as well as play back my music through my AV system in CD quality, which is why, of course, people buy CDs instead of tapes. I play them back through an AppleTV, which feeds my receiver through an optical digital cable. This gives me CD-quality sound – the same as the original.
ALAC results in files that are about half the size of the original CD (e.g., 350MB per CD), while encoding at Apple’s standard rate of 256kbps results in files that are about 1/6th of the CD size (e.g., 100MB per CD). The bit rate of an ALAC file ranges from a little over 300kbps to over 1,000kbps, depending on the complexity of the source.
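The arithmetic behind these sizes can be sketched in a few lines of Python. This is only a rough illustration: the 60-minute album length is an assumption, and the function name `size_mb` is made up for the example. The CD figure of 1,411.2kbps comes from 44,100 samples per second, 16 bits per sample and 2 channels.

```python
# Approximate size of an audio stream from its bit rate and duration.
CD_BIT_RATE_KBPS = 1411.2  # 44,100 samples/s x 16 bits x 2 channels


def size_mb(bit_rate_kbps: float, minutes: float) -> float:
    """Return the approximate size in megabytes (1 MB = 1,000,000 bytes)."""
    bits = bit_rate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


ALBUM_MINUTES = 60  # assumed album length, for illustration only

cd = size_mb(CD_BIT_RATE_KBPS, ALBUM_MINUTES)
aac_256 = size_mb(256, ALBUM_MINUTES)
aac_128 = size_mb(128, ALBUM_MINUTES)

print(f"CD: {cd:.0f} MB, 256kbps: {aac_256:.0f} MB, 128kbps: {aac_128:.0f} MB")
print(f"256kbps is about 1/{cd / aac_256:.1f} of the CD size")
```

Under these assumptions the 256kbps copy comes out at a little under a fifth of the uncompressed size, which is in the same ballpark as the rough ‘1/6th’ figure above.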
For a while, I maintained two copies of my CDs: one using ALAC and the other transcoded at 128kbps or 256kbps, depending on the type of music (e.g., general pop / rock music would be at 128kbps, while my renaissance polyphony and other classical music would be at 256kbps). This meant creating special playlists to separate out all the duplicates. As a consequence, it would be fair to say that my library became unmanageable. I then bit the bullet, deleted all the lesser quality copies and began selectively re-ripping my CDs using ALAC, meaning that I could still get most of my really good music on my iPod.
Buying music at 256kbps and transcoding to 128kbps
The iTunes Store sells music at 256kbps. It’s common knowledge amongst audio fans that transcoding from one lossy format to another exacerbates quality problems and results in a poor quality experience. This is in contrast to transcoding from one lossless format to another (e.g., ALAC to FLAC). Lossless compression is similar to the popular ZIP compression method, which can be used to compress large documents for emailing and then uncompress them to access the original document. That is, the decompressed version is identical to the original version. Lossless encoding seems to allow around a 50% reduction in file size, but no better.
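The ‘identical after decompression’ property is easy to check. The sketch below uses Python’s standard `zlib` module, which implements the DEFLATE method commonly used inside ZIP files. The sample text is made up, and because it is highly repetitive it compresses far better than real audio would.

```python
import zlib

# A made-up, highly repetitive document (real audio compresses far less well).
document = ("The quick brown fox jumps over the lazy dog. " * 200).encode("utf-8")

compressed = zlib.compress(document)
restored = zlib.decompress(compressed)

print(len(document), "bytes before,", len(compressed), "bytes after")
print("Identical after round trip:", restored == document)
```

A lossy codec offers no such guarantee: what comes back is only an approximation of what went in.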
When compressing audio files using a lossy method, in order to make the file smaller than the original, an algorithm is used to remove some of the audio information that we can’t perceive. For example, a loud drum sound masks other quieter sounds. When these quieter sounds are removed, the file size becomes progressively a little bit smaller.
Imagine if the ZIP compression format removed information from your documents, such as all the 1, 2 and 3 letter words, leaving behind only the bigger (and more meaningful) words? It simply wouldn’t work since it would take a lot of effort for us to read the document and try and figure out what the missing words are.
For audio compression, this is not quite as serious, since we can’t perceive the missing sounds anyway and their removal doesn’t generally affect sound quality. However, it does become serious when transcoding the compressed file to another lossy format. This is similar to taking a photocopy of a photocopy. The second copy is not as good as the first, and is obviously so, though we can generally tell what the picture in the second copy is.
The transcoding effects become more serious when the source file is smaller due to a lower bit rate. For example, transcoding from a 320kbps file to a 128kbps file is OK, while transcoding a 160kbps file to a 128kbps file can be worse. This is because the smaller file was created using a lower bit rate, which means more information was removed from the source to make the file smaller.
One of the ways to prevent the transcoding issues is to create the second lower bit rate version from the original source.
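The photocopy effect can be illustrated with a toy model. To be clear, this is an analogy and not how MP3 or AAC actually work: each ‘encoder’ here simply rounds numbers to the nearest multiple of a step size, and the step sizes (3 and 5) are arbitrary choices for the example.

```python
def quantize(value: int, step: int) -> int:
    """Round a non-negative integer to the nearest multiple of step.

    The steps used below are odd, so there are never any ties.
    """
    return ((value + step // 2) // step) * step


FIRST_STEP = 3   # hypothetical finer encoder (the "higher bit rate" copy)
SECOND_STEP = 5  # hypothetical coarser encoder (the "lower bit rate" copy)

samples = range(15)  # stand-in for the original, uncompressed signal

# Encode the low quality copy directly from the original...
direct = [quantize(s, SECOND_STEP) for s in samples]
# ...or from a copy that has already been through the first encoder.
transcoded = [quantize(quantize(s, FIRST_STEP), SECOND_STEP) for s in samples]

direct_error = sum(abs(s - d) for s, d in zip(samples, direct))
transcoded_error = sum(abs(s - t) for s, t in zip(samples, transcoded))

print("Error when encoding from the original:", direct_error)
print("Error when transcoding a lossy copy:  ", transcoded_error)
```

In this toy model the copy made from the already-lossy version ends up further from the original than the copy made directly from the source, which is the same pattern as the photocopy of a photocopy.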
Transcoding other music formats, variable bit encoding and managing the threshold
In the last section, I discussed the transcoding issues and how the quality of the end file depends on the quality of the source: a low quality source results in a low quality second file.
This is important because many people, like myself, have collected music over time, some of which is in the older formats, like MP3. In contrast, Apple’s format, called AAC, is generally considered to have better algorithms for compressing the music, so an MP3 file at 128kbps does not sound quite as good as an AAC file at the same bit rate. Transcoding such a lower quality source file to the specified target of 128kbps will therefore cause a further loss in quality, one that is likely to become perceptible.
With Apple’s implementation of the transcoding feature, it is not clear where the threshold is, but it can be assumed that any file greater than 128kbps will be transcoded.
In my collection, I have a number of MP3 files that are encoded using the Variable Bit Rate encoding method (VBR).
VBR encoding is one of three methods, with the other important one being constant bit rate (CBR) encoding. With CBR encoding, each file ends up at exactly the target bit rate (e.g., 128kbps). Without going into the detail, VBR encoding usually results in better quality sound at the same bit rate, but the final bit rate can vary quite a bit. For example, with CBR each file from a CD will be 128kbps, while with VBR each file can fall anywhere between 100kbps and 130kbps, as the algorithm optimises the final bit rate based on the complexity of the music.
The effect this has is that some files that are just above the threshold of 128kbps will be transcoded, while others will not. If you’re listening to an album you know well, you’re very likely to hear the difference in quality from one track to the next.
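To make this concrete, here is a small sketch of a fixed threshold applied to a VBR album. The track names and bit rates are invented, and the rule ‘convert anything above 128kbps’ is only my assumption about how the feature behaves.

```python
THRESHOLD_KBPS = 128  # assumed: anything above this gets converted

# Hypothetical average bit rates for the tracks of one VBR-encoded album.
album = {
    "Track 1": 112,
    "Track 2": 130,
    "Track 3": 126,
    "Track 4": 129,
    "Track 5": 104,
}

converted = [title for title, kbps in album.items() if kbps > THRESHOLD_KBPS]
untouched = [title for title, kbps in album.items() if kbps <= THRESHOLD_KBPS]

print("Transcoded:", converted)
print("Left alone:", untouched)
```

Two tracks from the same album are re-encoded and three are not, even though they differ by only a few kbps.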
Summary of the issues
Knowing a bit about the domain (what determines audio quality and how the various formats differ from each other) and about people’s music libraries (different formats, different bit rates) allows you to think about the different problems that could occur with a simple implementation.
These can be distilled into the following three core issues:
- No ability to change the source file bit rate threshold (e.g., all files over 128kbps, 160kbps, or other)
- No ability to specify the source file format (AAC, MP3, both)
- No ability to specify the target file bit rate (e.g. 128kbps or 256kbps, or other)
Now, Apple prides itself on delivering a simple solution that works for most circumstances, but doesn’t necessarily work for all. This is especially important in a mass-market product where many customers are not overly tech savvy, nor care to be so. The solutions to the three core issues need to take into consideration the nature of the product and the target market.
So that brings us to thinking about solutions.
In determining solutions to the issues, knowledge of the domain is very important. If you didn’t know that there were lossy and lossless encoding formats, and that transcoding from one lossy format to another causes a greater loss of quality than encoding to the smaller file size direct from the lossless source, you would design a solution that would be ignorant of these factors, letting people possibly have a low quality experience with their music.
I’m going to approach the solutions by addressing each issue individually, rather than simultaneously. You’ll recall that the original implementation was like this:
Selecting the source bit rate threshold
In its simplest form, this is solved as follows:
The user can now select the minimum threshold before transcoding takes place. This satisfies a need to only convert files of a sufficient quality so that the transcoding effects on quality are less perceptible. This solution has a number of issues.
Going back to the domain of audio encoding, it turns out that the highest bit rate MP3 files (320kbps) overlap with the lowest bit rate ALAC files (just above 300kbps). This means that users cannot use the bit rate setting as a proxy for only selecting ALAC files for transcoding where the transcoding quality artefacts will not be as obvious.
That is, users cannot simply select all files over 320kbps, since some ALAC files will not be included. Therefore, users either need to know what bit rate to select, or simply accept the fact that not all the files will be transcoded, possibly affecting how much music is stored on their iPod / iPhone.
The second issue is that it uses a drop down menu to select the source bit rate. The menu would contain the popular bit rates, such as 128, 160, 256 and 320kbps. This approach can be faster and more usable, but it limits user specificity.
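The overlap problem can be sketched with a tiny, made-up library. The song titles and the exact ALAC bit rates are hypothetical; only the 320kbps MP3 ceiling and the ‘just above 300kbps’ ALAC floor come from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class Song:
    title: str
    fmt: str
    kbps: int


# Hypothetical library: two lossless files and one high bit rate MP3.
library = [
    Song("Quiet solo piece", "ALAC", 310),
    Song("Dense orchestral piece", "ALAC", 950),
    Song("High bit rate MP3", "MP3", 320),
]


def over(threshold_kbps: int) -> list[str]:
    """Titles of songs whose bit rate is above the threshold."""
    return [song.title for song in library if song.kbps > threshold_kbps]


print("Over 320kbps:", over(320))  # misses the quiet ALAC file
print("Over 300kbps:", over(300))  # picks up the MP3 as well
```

Neither threshold selects ‘all the ALAC files and nothing else’, which is why bit rate alone is a poor proxy for format.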
Selecting the source file format
This can be solved as follows:
This allows users to transcode only files of a specific format. In this case, only ALAC (lossless) files will be transcoded. This means that lossy file formats will not be transcoded.
However, this has the issue that anyone using ALAC is likely to be sensitive to file quality and will probably want to select a different target bit rate.
Selecting the target bit rate
This can be solved as follows:
This allows users to select higher bit rates for the target files, which is especially important for those users who want higher quality music on their iPod. By implication, all files over 256kbps would be transcoded (since it would not make sense to transcode a 128kbps MP3 file to a 256kbps AAC file!!).
However, it doesn’t allow users to select the source file format, or a threshold at which they will be transcoded, especially when using 256kbps VBR encoding, where the file sizes fluctuate, but not enough to make it worth transcoding a 270kbps file to a 256kbps file.
Combining the individual solutions
It’s easy to provide individual solutions to the three core issues. Combining them into something that is usable, understandable and aligned with Apple’s design philosophy is something else.
There are two approaches that can be used:
- Combine the individual solutions to make a single selection of source and target settings
- Use a method similar to smart playlists where multiple expressions can be created to select source and target settings.
Combining the individual solutions into a single selection
From the domain knowledge we gathered, it would seem that the most critical determinant of the final quality of the audio would be the source format, rather than the bit rate. If you didn’t know this you could have simply used a target bit rate instead.
Therefore, one of the solutions is as follows:
This allows users to select only lossless files for transcoding, and to select the bit rate they want. Therefore, if they have bought music from the iTunes Store, they can choose a matching bit rate so that all their better music ends up at the same quality. Further, lower bit rate files remain untouched and therefore there will be no transcoding artefacts affecting quality.
However, this has the following issues:
- It only makes sense to include the lossless formats (WAV, AIFF and ALAC) in the drop down menu, because you can’t (or shouldn’t) select all MP3s for conversion, especially lower bit rate ones, due to transcoding quality issues
- 256 / 320kbps files (MP3 or AAC) cannot be selected to be transcoded to 128kbps files
Consider instead this solution:
This allows users to select to convert only ALAC files to AAC, or to convert high bit rate lossy files to a lower bit rate. However, it has the following issues:
- Users cannot select both high bit rate files (e.g. 320kbps MP3 files) and ALAC files and convert them to a lower bit rate AAC
- It is a somewhat complex expression and not quite as simple as the previous one, which allowed selection of just the file format.
Consider instead this solution:
This allows control over the source bit rate. However, it can’t easily be used to select only ALAC files (since the lowest bit rate ALAC files overlap with high bit rate lossy files).
Using multiple expressions to select different formats
A number of the single solution approaches, while richer, don’t offer enough flexibility to include songs of a range of different formats. They don’t offer enough value to consider implementation at this stage.
In iTunes, the smart playlists editor provides an approach where multiple rules for selection can be included. This screenshot shows the smart playlist editor:
We can adopt this approach for the selection of file formats:
Note that the ALAC format has no bit rate selection because the format is lossless and the bit rate is a function of the song complexity, not the level of compression. Therefore, only if the format is lossy, then a selection of the source bit rate is presented.
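The rule-based selection described here could be modelled along the following lines. This is only a sketch of the idea, not how iTunes works: the `Rule` class, the ‘match any rule’ behaviour and the 256kbps minimums are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

LOSSLESS = {"ALAC", "WAV", "AIFF"}


@dataclass
class Rule:
    fmt: str
    min_kbps: Optional[int] = None  # only meaningful for lossy formats

    def matches(self, fmt: str, kbps: int) -> bool:
        if fmt != self.fmt:
            return False
        # Lossless formats have no bit rate condition.
        if self.fmt in LOSSLESS or self.min_kbps is None:
            return True
        return kbps >= self.min_kbps


# Hypothetical rule set: all ALAC files, plus high bit rate MP3 and AAC files.
rules = [Rule("ALAC"), Rule("MP3", 256), Rule("AAC", 256)]


def should_convert(fmt: str, kbps: int) -> bool:
    """A song is selected if it matches any one of the rules."""
    return any(rule.matches(fmt, kbps) for rule in rules)


print(should_convert("ALAC", 310))  # lossless, selected regardless of bit rate
print(should_convert("MP3", 320))   # lossy and above the minimum
print(should_convert("MP3", 128))   # lossy but below the minimum
```

This gets around the overlap between low bit rate ALAC files and high bit rate MP3 files, because the format is tested first and the bit rate only matters for lossy formats.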
When the ‘simple’ version is selected, then this display could look like this:
The background processing and the user interface design
One thing we haven’t considered is the behind-the-scenes aspect of the solution, namely, what are the selection criteria? At what bit rate will the files be selected?
Given the issues regarding transcoding and audio quality, it makes sense to transcode only files that are sufficiently greater than the target bit rate. For example, if the target bit rate is 128 kbps, then only files greater than 256 kbps would be transcoded.
In this case, consider the following design:
This solution borrows from the design that selects a target bit rate, but uses supporting text to decrease the ambiguity over which files will be included in the conversion.
The supporting text would be updated based on the target bit rate selection. For example, if the target is 128 kbps, then the text would read ‘only songs greater than 256 kbps and / or encoded in a lossless format will be converted’.
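A minimal sketch of this behaviour is shown below. The only data point given above is that a 128 kbps target implies a 256 kbps cut-off, so the ‘twice the target’ factor used for other targets is purely an assumption, as are the function names.

```python
LOSSLESS = {"ALAC", "WAV", "AIFF"}
FACTOR = 2  # assumed: the source must exceed twice the target bit rate


def supporting_text(target_kbps: int) -> str:
    """Text shown under the target bit rate menu."""
    cutoff = target_kbps * FACTOR
    return (
        f"only songs greater than {cutoff} kbps and / or "
        "encoded in a lossless format will be converted"
    )


def is_selected(fmt: str, kbps: int, target_kbps: int) -> bool:
    """Background rule matching the supporting text."""
    if fmt in LOSSLESS:
        return True
    return kbps > target_kbps * FACTOR


print(supporting_text(128))
print(is_selected("MP3", 320, 128))  # well above the cut-off
print(is_selected("AAC", 256, 128))  # not greater than the cut-off
```

Keeping the text and the rule driven by the same factor means what the user reads always matches what actually gets converted.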
The final solution
You can see that there are many solutions, each with pros and cons. In general, attempting to add a little bit of flexibility did not result in a strong design, compared to adding a lot more complexity (i.e., the multiple expression interface).
There are two solutions. The first is likely to be in alignment with Apple’s design philosophy and offers just enough flexibility and clarity for people who are sensitive to audio quality. It also ensures that songs with a bit rate that is too close to the target rate are not converted, as there is no material gain but a good chance of a degradation in quality.
Solution 1 – Simple selection of target bitrate with supporting text
If you select 128 kbps as the target bit rate, then, say, only songs 256 kbps or greater would be selected.
Solution 2 – Selection of source formats and bit rates
The second solution builds on the first and allows for multiple selections. The rationale is that anyone making such selections is likely to be savvy enough to make sensible selections and understand the impact.
When expanded, the following is shown in the page:
The design is simpler than allowing for a target bit rate for each of the song formats. This extension could be adopted if necessary, but I don’t think it is.
I hope you’ve enjoyed reading this article and considering the thinking that went into determining the final solution. While the final solution may be obvious, quite a bit of thinking went into testing out the consequences before settling on the one that worked best. Much of this was based on understanding a bit about the audio domain and therefore not making decisions that would have a detrimental effect on audio quality.