Features

Improving the quality of mobile ringtones, quality of life?

by Thomas Dolby Robertson | posted on 12 April 2002


"They annoy the hell out of people. They sound like crap. You may think you're the coolest guy on the block but one night you're out at the movies and just as the cancer-riddled heroine is saying her last goodbyes to her tearful children your phone suddenly blurts out the Match of the Day theme!"

This was the response I got from my 21-year-old nephew in South London after telling him my company Beatnik was now focused solely on ring tones. He didn't get it. All things considered, why would I bother with bleepy mobile phone ring tones when I could be getting ready for my big comeback on Top of the Pops with Dave Hill of Sweet behind me on the bass?

Well, there are three reasons I find the mobile audio market interesting. Firstly, downloadable ring tones have been the surprise smash hit of the wireless data world, generating over $1 billion in sales in the last twelve months. Secondly, I believe that advances in mobile audio technology will eventually bring about a convergence between three recent cultural phenomena: the ring tone fad, instant messaging, and peer-to-peer file sharing. And thirdly, my company Beatnik is in a unique position to dominate this emerging marketplace by virtue of its unchallenged leadership in software synthesis on handheld devices.

If you think this all sounds very mercenary, you'd be mostly correct. But I was also intrigued by the altruistic opportunity: maybe I could make life a little better for all of us by improving the musical and sonic quality of the inevitable noise pollution that mobile phones have brought upon us!

The mobile phone industry, barely ten years old, has touched close to a billion people worldwide. It's now entering an awkward adolescence as giant companies, bemused by their own overspending in the race for spectrum and market share, scratch their heads and wonder how they can possibly sustain the astonishing growth of the Nineties. Yet it was those years (almost accidentally) that spawned SMS. In the heavily-hyped Wireless Data world, SMS is one of a very small number of bona fide success stories. Even though SMS fees average out around a tenth of a typical user's phone bill, SMS accounts for nearly half the annual profits of several large European operators. And what's the most lucrative single category of wireless data after personal texting? Personal ring tones.

The personal ring tone dates back to the early 1990s when a senior Nokia marketing executive, walking past the handset lab and hearing the ringer being tuned and tested for maximum penetration, commented that it almost sounded like real music. It was decided to risk putting musical ring tones on a mass-market handset. At the last minute, one of several tones had to be picked as the default; without much thought, a tone containing a phrase from an obscure waltz called “Gran Valse” was selected. This gave birth to what many people consider a Nokia-branded jingle, and it has become so ubiquitous that thrushes in Copenhagen trees are starting to imitate it with their beaks.

But the role of the telephone ring dates from many years earlier. The first commercial telephones became available around 1880; and given that only affluent city people could afford them, they had to ring loudly enough to be heard from every room of a three-storey New York brownstone. (These days, a phone's owner is rarely more than a few feet away, yet its ability to annoy a whole building full of people seems to remain unchanged.)

The first function of a ring tone, then, is to alert the user that there's a phone call waiting to be answered. In a crowded office or pub, it's certainly helpful to have your phone set to play a recognizable tone to distinguish it from a hundred others within earshot. But consumers have made custom ring tones popular for a different reason altogether: they personalize your phone, just like a custom colored faceplate, or a fistful of branded wrist straps, as is commonly seen in Japan. This is especially relevant to younger (teenage and college age) users, who often rely on parents to buy their phone and pay their bills, but who still want to express their musical tastes, fashion allegiances and rebellious individualism.

Considering that the technology used to generate ring tones (typically a monophonic square wave played through a small transducer or “buzzer”, allowing for a simplistic electronic rendition of a known tune) is so primitive, it's quite astonishing how popular downloadable ring tones have become. Last year in Japan, for example, where DoCoMo's I-Mode service has over thirty million subscribers, ring tones shared the top spot with games (about 22% each) as the most in-demand wireless data service on the menu. Downloadable ring tones in Japan in 2001 generated half a billion dollars in revenue.
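To make "monophonic square wave" concrete, here is a minimal sketch of how such a tone could be rendered in software. The sample rate, the amplitude and the three-note phrase are assumptions chosen for illustration; they do not describe any particular handset.

```python
SAMPLE_RATE = 8000  # assumed sample rate in Hz, for illustration only


def square_wave(freq_hz, duration_s, amplitude=0.5, sample_rate=SAMPLE_RATE):
    """Return the samples of one square-wave note as a list of floats."""
    count = int(duration_s * sample_rate)
    samples = []
    for i in range(count):
        # Position within the current cycle, in the range [0, 1)
        phase = (i * freq_hz / sample_rate) % 1.0
        # High for the first half of each cycle, low for the second half
        samples.append(amplitude if phase < 0.5 else -amplitude)
    return samples


def render_melody(notes):
    """Render (frequency_hz, seconds) pairs one after another (monophonic)."""
    out = []
    for freq_hz, duration_s in notes:
        out.extend(square_wave(freq_hz, duration_s))
    return out


# Hypothetical phrase: E5, D5, F#4 at equal-tempered frequencies
melody = [(659.26, 0.125), (587.33, 0.125), (369.99, 0.25)]
samples = render_melody(melody)
```

Only one note sounds at a time and every sample is either fully high or fully low, which is why the result has the familiar bleepy character.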

In Europe, where the majority of ring tones are illegally distributed, music publishers are suddenly waking up to a new possible revenue stream from ring tones if they can solve the piracy issue. The entire global music publishing industry has been estimated at about $8bn per year, yet downloadable ring tones added up to ten percent to that number in the very first year of their existence. And most of that didn't make it into the hands of the song copyright owners. Instead the profits went to a brand new breed of “ma-and-pa” ring tone vendor, taking full page ads in the Daily Mirror and charging a couple of pounds for connection minutes while users navigated through an automated tree to find their favourite Top 10 tone-du-jour.

Two problems limiting the further growth of downloadable ring tones are poor audio quality and lack of copyright control. How can these be addressed so that end users are happy and everyone comes out on top? In Japan, phone manufacturers have gone to silicon chip makers for the answer. Most Japanese phones integrate a dedicated audio chip supplied by Yamaha or ROHM. These are capable of multi-timbre, multi-part ring tones, and they are available with different numbers of polyphonic voices. 32 voices is currently the magic number.

But these dedicated audio chips add several dollars to the per-unit manufacturing price, which dissuaded European and American manufacturers from following Japan's cue. What was required was a way to achieve richer and more diverse sounds on the phone without adding to the cost of memory, battery power, and speaker components. In addition, Japanese phones tended to sound too, well, too Japanese.

In 1999, Beatnik Inc saw these as problems it knew how to tackle. After several years in the game sound, web sound and set-top box arenas, Beatnik's engineers were expert at putting powerful sound engines into small spaces, and without specific hardware dependencies. This was clearly a lucrative market to go hunt down, and so Beatnik's core technology was re-designed and improved on many fronts, such as a smaller memory footprint, more voices per CPU cycle, more instantaneous interactive functions, a greater range of API calls, and a more compact sound bank. Research pointed to certain new mobile processor designs from companies like ARM, Intel, and Texas Instruments as the ones most likely to be in mass-market handsets circa 2002. The new “mini” Beatnik Audio Engine (mBAE) was optimized for top performance on these chip sets, with strong engineering and marketing support from the parent companies.

The first major company to license mBAE was Nokia, the global handset leader with close to 40% of the market. Often the innovator in the mobile field, Nokia decided to up the ante for ring tones by incorporating Beatnik into its designs. The first mass-market handsets to feature the mBAE will be Nokia's 3510, 3585, 3590, and 7210 models, due out in the middle and later parts of 2002. As well as their enhanced ring tone ability, these phones have richer game soundtracks, taking advantage of the same Beatnik engine. The mBAE can be invoked by a game or application, allowing sounds like gunshots and explosions to be triggered in real time. To cap this, rich messaging applications that use the new MMS (Multimedia Messaging) standard will now be able to include high-quality, low-bandwidth SP-MIDI content, and the mBAE will handle playback. (SP-MIDI has been approved by the 3GPP for use in MMS and EMS messaging.)

The key buzzword for 2002 is “polyphonic”, as these Nokia phones will ship with 20 or more multi-note, multi-part ring tones. These include pop, hip-hop, classical and abstract styles. Many people are also talking about “MIDI” ring tones. MIDI is a professional audio specification for essentially listing musical note descriptions and sending musical data between devices, without needing to send all the waveform data. The Nokia phones support a new industry-standard flavour called Scalable Polyphony MIDI (SP-MIDI) which has now been ratified by the 3GPP and the MIDI Manufacturers Association.
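A small sketch shows why note descriptions are so much more compact than waveform data. The Note On and Note Off byte layouts below follow the standard MIDI channel message format; the 8000 Hz, 8-bit mono audio figure is an assumption used only for comparison, and the timing information a real MIDI file also stores is ignored here.

```python
def note_on(channel, note, velocity):
    """Build a 3-byte MIDI Note On message.

    channel is 0-15; note and velocity are 0-127.
    """
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])


def note_off(channel, note):
    """Build a 3-byte MIDI Note Off message with release velocity 0."""
    return bytes([0x80 | (channel & 0x0F), note & 0x7F, 0])


# Middle C (note number 60) on the first channel, at moderate velocity
message = note_on(0, 60, 100)

# Starting and stopping one note costs six bytes, however long it sounds
bytes_per_note = len(note_on(0, 60, 100)) + len(note_off(0, 60))

# One second of raw audio at an assumed 8000 Hz, 8 bits, mono
raw_audio_bytes_per_second = 8000 * 1
```

A held one-second note costs six bytes as MIDI data against roughly eight thousand as raw audio under these assumptions, because the phone's own synthesizer supplies the sound.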

The SP-MIDI spec was co-authored by Nokia and Beatnik under the umbrella of the MIDI Manufacturers Association. The goal was to enable a single ring tone file to play acceptably across a wide range of devices. In an SP-MIDI file, musical parts are assigned a priority so that depending on the available polyphony of the device, the most relevant parts (lead line melody, drums, piano, strings etc) will always get the correct voice allocation. The SP-MIDI standard is non-proprietary and is openly available, so anyone can start composing files, or for that matter, designing audio engines like Beatnik's. Our strategy is to be first and best with those engines, and to continue to drive the standards bodies to help us stay a step ahead of the competition. Beatnik provides tools for developers, composers, and ring tone vendors to build libraries of SP-MIDI content, in addition to licensing our engine to handset makers.
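The priority idea can be sketched in a few lines. This is a simplified illustration, not the SP-MIDI file format itself: the tuple layout, the part names and the voice counts are all hypothetical, and the function name `select_parts` is invented for this example.

```python
def select_parts(parts, device_voices):
    """Choose which musical parts a device can play.

    parts: list of (name, priority, voices_needed) tuples, where a lower
    priority number means a more important part.
    device_voices: total polyphony the device offers.
    """
    chosen = []
    used = 0
    for name, _priority, voices in sorted(parts, key=lambda p: p[1]):
        if used + voices > device_voices:
            # This part and every less important one are left out
            break
        chosen.append(name)
        used += voices
    return chosen


# Hypothetical ring tone with four parts
song = [
    ("strings", 4, 8),
    ("melody", 1, 1),
    ("piano", 3, 4),
    ("drums", 2, 3),
]

basic_phone = select_parts(song, 4)
mid_phone = select_parts(song, 8)
high_end_phone = select_parts(song, 16)
```

With these made-up numbers a 4-voice phone plays melody and drums, an 8-voice phone adds the piano, and a 16-voice phone plays everything, all from the same file.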

While mass-market phones in 2002 will start to support polyphonic ring tones and game soundtracks, there is a next-generation technology already in the pipeline. This is XMF (eXtensible Music Format). Whereas SP-MIDI files are limited to a static sound bank, an XMF file can contain its own custom samples, which can be any type of recording: a dog bark, a motorbike revving up, or a line from a classic movie. The advantage here is that now “real” songs can be used as ring tones, as in the kind you hear on the radio. Samples can be looped and sequenced in a way that closely replicates the original performance, complete with all instruments, vocals and drums. Yet unlike WAV or MP3 files, which typically occupy 3 MB or more of storage space and therefore take a long time to download over slow connections, an XMF file might top out at 100 KB. A file this size will load in mere seconds on 2.5G and 3G networks.
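The size difference translates directly into waiting time. The arithmetic below is a rough sketch: the two data rates are assumed round figures rather than measurements, file sizes are taken in decimal kilobytes, and protocol overhead is ignored.

```python
def download_seconds(size_kb, rate_kbit_per_s):
    """Time to transfer size_kb kilobytes at rate_kbit_per_s kilobits per second."""
    return size_kb * 8 / rate_kbit_per_s


# Assumed effective data rates in kbit/s, for illustration only
ASSUMED_RATES = {"2.5G": 40, "3G": 384}

# File sizes from the text, in kilobytes
FILES = {"WAV or MP3 (3 MB)": 3000, "XMF (100 KB)": 100}

for file_name, size_kb in FILES.items():
    for network, rate in ASSUMED_RATES.items():
        seconds = download_seconds(size_kb, rate)
        print(f"{file_name} over {network}: {seconds:.1f} s")
```

Under these assumptions the 100 KB file arrives in about 20 seconds on the slower link and about 2 seconds on the faster one, while the 3 MB file needs around ten minutes on the slower link.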

The XMF spec was also co-authored by Beatnik, Sun Microsystems and Nokia, with input from a host of professional audio companies including Roland, Yamaha, Korg and Creative Labs, and with the blessing of IBM and Microsoft. It is expected to be supported in some high-end handsets by the end of 2002, and to roll out into mass-market devices the following year. What XMF will enable, above and beyond the polyphonic ring tones in this year's devices, is a much richer and more interactive style of ring tone, game sound or MMS audio clip.

For example, imagine this scenario: you're waiting for a bus and you power on your phone. There are 6 MMS clips waiting for you featuring new music releases from your favorite artists, labels or genres. Each one includes a graphic of the artist, a 10-second audio clip, statistical data, and a personal text message. Of course, you can elect to save the audio from a clip as a ring tone; add the song to your playlist that exists elsewhere in your home, car or office; sign up for the digital download subscription service; order the CD from Amazon; or simply send the clip on to your friends as a kind of e-card. In effect these are musical trading cards, informative but highly desirable, and fun to trade among family and friends, without invoking the wrath of the copyright owners. Beatnik has even developed a way to “lock” a clip to an individual device, to prevent unauthorized copying and redistribution.

This kind of highly personalized, localized, rich media browsing and messaging is likely to receive very strong support from the Mobile and Entertainment Industries. The rewards are clear: fun, exciting services, offering the latest in desirable fresh content, will help sell new phones, sign up users to premium service plans, increase loyalty and reduce “churn” for carriers, and generate additional revenues via affiliate deals with media and e-commerce companies. Audio plays a very strong role in this, because screen displays will always be small, while audio quality will improve exponentially.