Something Good

While I like Taylor Swift, I prefer Olivia Rodrigo. It was really the spectacular “Vampire” that tipped the balance for me, and since then I’ve also come to really like “Drivers license” and “Traitor”. I love the balance she strikes between really softly spoken sections and full-on passionate belting. Now, I realise that as a 50-something bloke I’m not exactly her target demographic, but frankly, I don’t care, she’s great! I’m also a big fan of Aaron Francis’ positive outlook, especially his “you can just do stuff” viewpoint. Long story short, I thought I’d try writing an Olivia Rodrigo-style song, and here it is, “Something Good”.

Like the best of such songs, it’s a breakup song, with a strong contrast between quiet, introverted self-blame, and a massive, triumphant sense of escape. As can be ascertained from most of my other songs, there’s no chance I’m going to sing like a teenage girl, so I thought I’d give my usual synthetic vocalist a workout. I also discovered a feature of this software that I’d somehow missed before – it’s possible to automate voice parameters over time. I’d previously done such changes by using multiple vocal tracks with different parameters, but that’s a bit clunky; now I can do smooth changes, and it’s totally cool. It’s how I do the rising passion before the first chorus and the quiet back-off at the end of each chorus; it’s all one track with automation.
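To show the difference between swapping tracks and automating, here is a minimal Python sketch of what an automation curve does: it interpolates a parameter smoothly between breakpoints rather than jumping between fixed settings. This is purely illustrative and is not how SV exposes automation; the `automate` function, the "tension" curve, and its breakpoint values are all made up for the example.

```python
from bisect import bisect_right


def automate(points, t):
    """Linearly interpolate a parameter value at time t (in seconds).

    points is a list of (time, value) breakpoints sorted by time.
    Before the first breakpoint and after the last, the value is held.
    """
    times = [time for time, _ in points]
    if t <= times[0]:
        return points[0][1]
    if t >= times[-1]:
        return points[-1][1]
    i = bisect_right(times, t)  # index of the first breakpoint after t
    (t0, v0), (t1, v1) = points[i - 1], points[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical "tension" curve: relaxed at first, rising over 4 s, then held.
tension = [(0.0, 0.0), (4.0, 1.0), (8.0, 1.0)]

for second in (0, 2, 4, 6):
    print(second, automate(tension, second))
```

With two separate tracks the value would snap from one setting to the other at the track boundary; with a curve like this it glides, which is what makes the build-up sound natural.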

Putting myself in the shoes of this notional teenager who’s dropping a toxic ex (I mean, that’s not me, but empathy, right?), I nurtured a little crop of somewhat stereotypical lyrics; it’s a song, not high literature! I’m quite pleased with them overall: decent density, some good rhymes, and no problems with awkward timing or orphaned syllables.

It’s a proper ballad, at a really slow 50bpm, but has a double-tempo 100bpm section, an idea I straightforwardly stole from “Vampire”. I don’t know if it’s just the specific voice I’m using, but the last line of that section (“make up with a kiss”) is pure Katy Perry.
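The tempo arithmetic is simple enough to sketch in a few lines of Python. The function names are mine, and the four beats per bar is an assumption for illustration rather than something stated above.

```python
def beat_seconds(bpm):
    """Duration of one beat in seconds at the given tempo."""
    return 60.0 / bpm


def bar_seconds(bpm, beats_per_bar=4):
    """Duration of one bar in seconds (4 beats per bar is an assumption)."""
    return beats_per_bar * beat_seconds(bpm)


slow, fast = 50, 100
print(beat_seconds(slow), beat_seconds(fast))
print(bar_seconds(slow) / bar_seconds(fast))
```

At 50 bpm a beat lasts 1.2 seconds, and at 100 bpm it lasts 0.6 seconds, so two bars of the fast section fit exactly into the time of one slow bar. That is why the double-tempo section slots in without the underlying pulse feeling like it has changed.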

The backing track uses a fairly standard set of instruments. I’m using drum sounds from Klevgrand’s “Slammer” industrial drum kit. The filter-sweepy bass is courtesy of Logic’s Alchemy and lots of automation, and the backing pad is a simple Retro Synth patch. The piano is Logic’s “vintage upright”. I played the guitar part on my Squier Strat.

The distortion on the vocals in the fast section is by Logic’s ChromaGlow, and the overall reverb is by ChromaVerb.

Creating the vocal tracks was quite tricky: there are a lot of lyrics, and the timing is really tight in places. It’s hard to make things sound natural when dragging blocks around a grid, so I did lots of singing to myself while I was out skiing to practise the timings. I love the “breath” sound that’s available in SV; it’s just so believable!

A screen shot of Synthesizer V's grid editor showing the notes, lyrics, phonemes, and pitch curves from the fast break section. Note the first "note" is actually a breath, not a sung note.

I wrote the melody first, and then got some chord progression suggestions from Mistral Chat, in particular the occasional A♭sus4 and Esus4, which I would not otherwise have thought of. Apart from those, there’s nothing complicated in here – all basic triads and 5ths.
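For anyone wondering what a sus4 actually is: it swaps the third of the triad for a fourth, so the chord is built from the root plus 5 and 7 semitones. Here is a small Python sketch that spells out the notes. The table and function names are my own, and for simplicity it names every black key as a flat, so some major and minor chords will come out with the wrong enharmonic spelling.

```python
# Pitch classes named with flats only (a simplification for this sketch).
NOTE_NAMES = ["C", "D♭", "D", "E♭", "E", "F", "G♭", "G", "A♭", "A", "B♭", "B"]

# Semitone offsets from the root for each chord type.
INTERVALS = {
    "major": (0, 4, 7),
    "minor": (0, 3, 7),
    "sus4": (0, 5, 7),  # the third is replaced by a perfect fourth
    "5": (0, 7),        # bare fifth, no third at all
}


def chord_notes(root, quality):
    """Return the note names in a chord, given its root and quality."""
    start = NOTE_NAMES.index(root)
    return [NOTE_NAMES[(start + step) % 12] for step in INTERVALS[quality]]


print(chord_notes("A♭", "sus4"))  # ['A♭', 'D♭', 'E♭']
print(chord_notes("E", "sus4"))   # ['E', 'A', 'B']
```

Because neither the sus4 nor the bare 5th has a third, both are neither major nor minor, which is a big part of why they sit so comfortably next to plain triads.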

There are some very strong contrasts in the song, from the harsh synthetic bass sweep in the intro, through the cut-back piano and super-dry vocal bridge (“There’s no space for fantasy…”), clean simplicity in the verses, minimalist distorted guitars and vocals on the second bridge, to the symphonic, operatic chorus. I’m pretty pleased with this one.

[Verse]
I’d like to go one day alone
without you bringing me down.
A single day without having to atone
for all the things that you said.
Something clean, so elegant and simple
that won’t break my heart again.

[Bridge]
There’s no space for fantasy,
no room for promises.
Just go with what seems right;
I’m gonna leave without a fight

[Chorus]
Something good is sure to come to me.
I’m on my way to something so good.
I’ve had enough of picking up the pieces
of the something good that we used to have.
It’s time to stop all this pretending,
to fool myself that it wasn’t so bad.
I’m moving on to something new,
hoping it turns into something good.

[Verse]
You used to say you’d be better off without me;
now you get to find out.
I’m not used to being alone;
it’s harder than I remember.
But then I look back at the things I won’t miss
and look forward to new things to come
and I realise it will work out just fine,
something new that will be all mine.

[Break]
I know every little word I said is gonna come back and bite me, and
you’re gonna pick up on every last thing and use it just to spite me
but there’s no way I can let you keep on treating me like this;
it’s not the kind of thing you can make up with a kiss.

[Chorus]
Something good is sure to come to me.
I’m on my way to something so good.
I’ve had enough of picking up the pieces
of the something good that we used to have.
It’s time to stop all this pretending,
to fool myself that it wasn’t so bad.
I’m moving on to something new,
hoping it turns into something good.

If you like this song, please consider supporting me by buying my album, “Developer Music”, on Bandcamp, and sharing links to my song posts on your socials.

My synthetic vocalist: Dreamtonics Synthesizer V

I find it strange to be able to say that I’ve now created several songs that use a synthetic vocalist. This is a somewhat weird concept, but it’s right at the bleeding edge of music technology. We’ve had voice synthesis for years – I remember using a Texas Instruments “Speak & Spell” when I was small in the 1970s, and it’s gradually got better ever since. The first time I ever heard a computer trying to sing (I’m not counting HAL singing “Daisy, Daisy” in “2001”) was in a Mac OS app called VocalWriter, released in 1998, which automated the parameter tweaking abilities of Apple’s stock voice synthesis engine to be able to alter pitch and time well enough for it to be able to sing arbitrary songs from text input. It still sounded like a computer though. A much better “robot singer”, released in 2004, was Vocaloid, but even then, it still sounded like a computer. A Japanese software singer called UTAU, created in 2008, was released under an open source license, and this (apparently) formed the basis of Dreamtonics’ Synthesizer V (SV), which is what I’ve been using. SV finally crosses the threshold of having people believe it’s a real singer.

The entry of my song in the 2024 Fedivision song contest sparked quite a bit of interest. I posted a thread about it on Mastodon, and I wanted to preserve that here too. One commenter said “I thought it was a real person 😅” – which is of course the whole point of the exercise!

SV works standalone, or as a plugin for digital audio workstations (DAWs) such as Apple’s Logic Pro or Steinberg’s Cubase, and is used much like any other software instrument. It doesn’t sing automatically; you have to input pitch, timing, and words. Words are split into phonemes via a dictionary, and you can split or extend them across notes, all manually.

Synthesizer V’s piano roll editor

In this “piano roll” editor you can see the original words inside each green note block, the phonemes they have mapped to appear above each note, an audio waveform display below, and the white pitch curve (which can be redrawn manually) that SV has generated from the note and word inputs. You would never guess that’s what singing pitch looks like!

For each note, you have control over emphasis and duration of each phoneme within a word, as well as vibrato on the whole note. This shot shows the controls for the three phonemes in the first word, “we’re”, which are “w”, “iy”, “r”:

The SV parameters available for an individual note, here made up of three separate phonemes
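As a rough mental model only (this is not SV’s actual implementation or API), the dictionary lookup and per-phoneme duration control can be sketched in Python like this. The one-entry dictionary uses the “we’re” example above; the function name, the weights, and the note length are hypothetical.

```python
# Tiny stand-in for a pronunciation dictionary; the single entry is the
# "we're" example from the text. A real dictionary would be far larger.
PHONEMES = {"we're": ["w", "iy", "r"]}


def phoneme_timings(word, note_seconds, weights=None):
    """Share a note's duration between a word's phonemes.

    weights are relative durations, one per phoneme; if omitted, the
    phonemes share the note equally. Returns (phoneme, seconds) pairs.
    """
    phonemes = PHONEMES[word.lower()]
    if weights is None:
        weights = [1.0] * len(phonemes)
    total = sum(weights)
    return [(p, note_seconds * w / total) for p, w in zip(phonemes, weights)]


# Hypothetical settings: a 1.2 s note with the vowel held twice as long
# as each consonant.
for phoneme, seconds in phoneme_timings("we're", 1.2, [1, 2, 1]):
    print(phoneme, round(seconds, 2))
```

Lengthening the vowel relative to the consonants, as in this example, is the kind of tweak that stops a held note from sounding clipped.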

This note information is then passed on to the voice itself. The voice is loaded into SV as an external database resource (Dreamtonics sells numerous voice databases); I have the one called “Solaria”. Solaria is modelled on a real person, singer Emma Rowley; it’s not an invented female voice that some faceless LLM might create from stolen resources. You have a great deal of control over the voice, with lots of style options (here showing the “soft” and “airy” modes activated). Different voice databases can have different axes of variation like these; for example, a male voice might have a “growly” slider:

Synthesizer V’s voice parameters panel

There are lots of other parameters, the most interesting being tension (how stressed it sounds, from harsh and scratchy to soft and smooth) and breathiness (literally air and breath noise). The gender slider (how woke is that??) is more of a harmonic bias between chipmunk and Groot tones, but the Solaria voice sounds a bit childish at 0, so I’ve biased it in the “male” direction.

I originally thought the voice parameters couldn’t be varied over time (they can, via automation), so I used multiple subtracks within the SV editor, each with different settings, including level and pan, all of which turn up pre-mixed as a single (stereo) channel in your DAW’s track:

Multiple tracks in the SV editor

In my Fedivision song, I used one subtrack for verses, and another for chorus, the chorus one using less breathiness and trading “soft” mode for some “passionate” to make it sound sharper and clearer.

This is still all quite manually controlled though – just like a piano doesn’t play things by itself, you need to drive this vocalist in the right way to make it sound right.

Since the AI boom, numerous other ways of getting synthetic singing have appeared. For example, complete song generation by Udio is very impressive, but it’s hard to make it do exactly what you intended; a bit like using ChatGPT. Audimee has a much more useful offering – re-singing existing vocal lines in a different voice. This is great for making harmonies and shifting styles, but only really works well if you already have a good vocal line to start with – and that happens to be something that SV is very good at creating. I’ve only played a little with Audimee; it’s very impressive, but lacks the expressive abilities of SV. Its voices have little variation in style, emotion, and emphasis, and as a result seem a little flat when used for more than a couple of bars at a time. Dreamtonics have a new product called Vocoflex that promises to do the same kind of thing as Audimee, but in real time.

All this is just progress; we will no doubt see incremental improvements and occasional revolutions, and I look forward to being able to play with it all!