Dependency – the song

A screenshot of Apple Logic Pro's arrange window showing the "Dependency" song

There are all kinds of dependencies in life. We software developers have a few extras that “regular” people don’t, specifically software dependencies.

I was reviewing talk proposals for ConFoo and ran into this video by Darcy Clarke, which covers the terrifying differences in dependency resolution between JavaScript package managers. This gave me the initial idea for the song, in particular the line “All of my friends bring all of their friends and all of yours bring theirs too”. Inspiration comes from the oddest places, and Darcy might be a bit surprised by this!

Of course we also have the same dependencies as everyone else – friends, family, lovers, colleagues, medications, recreational habits (chemical or otherwise), income, a civilised society, a functioning economy, breathable air, and so on. This song is mostly about the software thing, but the other stuff can’t help leaking in, as reality is wont to do.

In software, a sometimes tricky thing to deal with is a circular dependency, where one thing depends on another that in turn depends back on the thing you started with, and resolution gets stuck going in circles. However, in a non-software context, you might actually like the idea of a mutual, trusting dependency.
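To make the software sense concrete, here’s a minimal sketch (in Python, with invented package names) of how a tool might detect such a cycle in a dependency graph using a depth-first search:

```python
# Toy dependency graph: each package lists its direct dependencies.
# Package names are invented for illustration.
deps = {
    "app":       ["framework", "logger"],
    "framework": ["logger"],
    "logger":    ["app"],   # oops: depends back on "app" -> a cycle
}

def find_cycle(graph):
    """Return one dependency cycle as a list of names, or None if acyclic."""
    visiting, visited = set(), set()

    def dfs(node, path):
        visiting.add(node)
        path.append(node)
        for dep in graph.get(node, []):
            if dep in visiting:  # back-edge: we've looped around
                return path[path.index(dep):] + [dep]
            if dep not in visited:
                cycle = dfs(dep, path)
                if cycle:
                    return cycle
        visiting.discard(node)
        visited.add(node)
        path.pop()
        return None

    for name in graph:
        if name not in visited:
            cycle = dfs(name, [])
            if cycle:
                return cycle
    return None

print(find_cycle(deps))  # prints ['app', 'framework', 'logger', 'app']
```

Real package managers do far more than this, of course, but the stuck-in-circles problem is exactly this back-edge in the graph.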

I did write a verse that made a play on dependency injection, but I couldn’t find anywhere to stick it in.

This is probably the quickest I’ve ever written a song: it took just a couple of days to get the bulk of it in shape (and to remind myself how to play my guitar). It’s a relatively big mix for me, though, with more vocals than usual, which took a while to nail down.

I decided to forego my favoured synthetic vocalist for the main vocals (though I’m still using it for backing) and have a crack at singing it myself, which I’ve not dared do since Tailwind. It came out OK, partly thanks to the wonders of Logic’s Flex Pitch editor. Right from the start – as is obvious from the lyrics – this song was intended to be a duet, so I enlisted the help of a friend’s daughter who happens to like singing – thank you Greta! Greta had never done any proper recording before, and learned some mic technique, the joys of loop recording, comping, and using the pitch editor to try out harmonies.

As in my previous song, Uncomfortable, I gave Logic’s virtual players a workout, playing drums, bass, and a couple of keyboard parts. I played the guitar parts, apart from the epic lead solo, which was played by Wassim Rahmani from Fiverr, who also played on Uncomfortable. I also used Claude again to help out with chord progressions; I would never have guessed at using Em like that, nor the final, dangling Am7(9).

Claude’s idea of an E7 chord

Amongst all this, Claude created the first musical hallucination I’ve seen from an LLM; I asked it to draw a chord box for the E7 chord, and it generated this:

While beautifully symmetrical, this is not an E7 chord, nor (apparently) any other chord (try playing it in either of the two interpretations of this diagram; both sound horrible!), and my guitar does not have 7 strings. It also stated that the “o” above the box indicates that these strings should not be played; that’s not correct – an “o” denotes an open, unfretted string, while an unplayed string is usually marked with an “x”. It also provided a text description of how to play the chord, but this had no correlation with the diagram and was also completely wrong. Further evidence that AI is not about to conquer the universe.
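For reference, the notes an E7 chord should actually contain can be derived from the standard dominant-seventh formula (root, major third, perfect fifth, minor seventh – 0, 4, 7, and 10 semitones above the root). A minimal sketch, using sharp-only note names and ignoring proper enharmonic spelling:

```python
# Derive the tones of a dominant-seventh chord from its interval formula.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def dominant_seventh(root):
    """Return the four chord tones of <root>7, as sharp-spelled note names."""
    start = NOTES.index(root)
    return [NOTES[(start + interval) % 12] for interval in (0, 4, 7, 10)]

print(dominant_seventh("E"))  # prints ['E', 'G#', 'B', 'D']
```

So any genuine E7 voicing must be built from E, G#, B, and D – which Claude’s seven-string artwork is not.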

The instruments are Logic’s “Smash” drum kit, Logic’s “Rock” modelled bass, a sampled marimba, an analogue synth patch in “RetroSyn”, and “Disco Strings” from the Studio Strings instrument. Logic’s built-in instruments are really good.

[Verse]
There’s too many things
that I don’t control;
I really don’t know which way to turn.
Got an overwhelming urge
to get rid of it all;
have the feeling I might crash and burn

[Verse]
I asked my friend for advice,
she said
“Well, it depends.
Do you really want to go that way?
I don’t think you’ll like how it ends”

[Chorus]
So many things
that depend on me
so many that depend on you
If we get it all together
we can cut them down,
until it’s just me and you

They say to never trust a stranger,
yet here we all are,
handing over keys to the kingdom.
Do we really need to go that far?

[Verse]
All of my friends bring
all of their friends
and all of yours bring theirs too.
(there’s just too many people here)
They’re in such a mess
that nobody’s sure
exactly what they’re meant to do

[Break]

[Chorus]
So many things
that depend on me
so many that depend on you
If we get it all together
we can cut them down,
until it’s just me and you

They say to never trust a stranger,
yet here we all are,
handing over keys to the kingdom.
Do we really want to go that far?

[Outro]
So many things
that depend on me
so many that depend on you
When all I want is
a circular dependency
with you

As always, I really appreciate reposts of links to my songs on here or on BandCamp.

My synthetic vocalist: Dreamtonics Synthesizer V

I find it strange to be able to say that I’ve now created several songs that use a synthetic vocalist. This is a somewhat weird concept, but it’s right at the bleeding edge of music technology. We’ve had voice synthesis for years – I remember using a Texas Instruments “Speak & Spell” when I was small in the 1970s, and voice synthesis has gradually got better ever since. The first time I ever heard a computer trying to sing (I’m not counting HAL singing “Daisy, Daisy” in “2001”) was in a Mac OS app called VocalWriter, released in 1998, which automated the parameter tweaking abilities of Apple’s stock voice synthesis engine to be able to alter pitch and time well enough for it to be able to sing arbitrary songs from text input. It still sounded like a computer though. A much better “robot singer”, released in 2004, was Vocaloid, but even then, it still sounded like a computer. A Japanese software singer called UTAU, created in 2008, was released under an open source license, and this (apparently) formed the basis of Dreamtonics’ Synthesizer V (SV), which is what I’ve been using. SV finally crosses the threshold of having people believe it’s a real singer.

The entry of my song in the 2024 Fedivision song contest sparked quite a bit of interest. I posted a thread about it on Mastodon, and I wanted to preserve that here too. One commenter said “I thought it was a real person 😅” – which is of course the whole point of the exercise!

SV works standalone, or as a plugin for digital audio workstations (DAWs) such as Apple’s Logic Pro, or Steinberg’s Cubase, and is used much like using any other software instrument. It doesn’t sing automatically; you have to input pitch, timing, and words. Words are split into phonemes via a dictionary, and you can split or extend them across notes, all manually.
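As a rough illustration of that dictionary step, here’s a toy sketch of word-to-phoneme lookup. The mini dictionary below is invented for illustration (using ARPAbet-style phoneme symbols); SV’s real dictionary and its assignment of phonemes to notes are far more sophisticated:

```python
# Toy grapheme-to-phoneme lookup, as a sketch of the dictionary step.
# The entries are hand-made for illustration, ARPAbet-style.
PHONEME_DICT = {
    "we're":   ["w", "iy", "r"],
    "all":     ["ao", "l"],
    "of":      ["ah", "v"],
    "my":      ["m", "ay"],
    "friends": ["f", "r", "eh", "n", "d", "z"],
}

def phonemes_for(lyric):
    """Split a lyric line into (word, phoneme-list) pairs via dictionary lookup."""
    # Unknown words get a "?" placeholder; a real system would fall back
    # to letter-to-sound rules instead.
    return [(word, PHONEME_DICT.get(word, ["?"])) for word in lyric.lower().split()]

for word, phones in phonemes_for("All of my friends"):
    print(word, "->", " ".join(phones))
```

In SV the resulting phonemes are then distributed across the notes you’ve drawn, and you can nudge that mapping by hand.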

Synthesizer V’s piano roll editor

In this “piano roll” editor you can see the original words inside each green note block, the phonemes they have mapped to appear above each note, an audio waveform display below, and the white pitch curve (which can be redrawn manually) that SV has generated from the note and word inputs. You would never guess that’s what singing pitch looks like!

For each note, you have control over emphasis and duration of each phoneme within a word, as well as vibrato on the whole note. This shot shows the controls for the three phonemes in the first word, “we’re”, which are “w”, “iy”, “r”:

The SV parameters available for an individual note, here made up of three separate phonemes

This note information is then passed on to the voice itself. The voice is loaded into SV as an external database resource (Dreamtonics sells numerous voice databases); I have the one called “Solaria”. Solaria is modelled on a real person: singer Emma Rowley; it’s not an invented female voice that some faceless LLM might create from stolen resources. You have a great deal of control over the voice, with lots of style options (here showing the “soft” and “airy” modes activated). Different voice databases can have different axes of variation like these; for example a male voice might have a “growly” slider:

SV voice parameters panel
Synthesizer V’s voice parameters panel

There are lots of other parameters, but most interestingly tension (how stressed it sounds, from harsh and scratchy, to soft and smooth), and breathiness (literally air and breath noise). The gender slider (how woke is that??) is more of a harmonic bias between chipmunk and Groot tones, but the Solaria voice sounds a bit childish at 0, so I’ve biased it in the “male” direction.

The voice parameters can’t be varied over time, but you can have multiple subtracks within the SV editor, each with different settings, including level and pan, all of which turn up pre-mixed as a single (stereo) channel in your DAW’s track:

Multiple tracks in the SV editor
Multiple tracks in the SV editor

In my Fedivision song, I used one subtrack for verses, and another for chorus, the chorus one using less breathiness and trading “soft” mode for some “passionate” to make it sound sharper and clearer.

This is still all quite manually controlled though – just like a piano doesn’t play things by itself, you need to drive this vocalist in the right way to make it sound right.

Since the AI boom, numerous other ways of generating synthetic singing have appeared. Complete song generation by Udio, for example, is very impressive, but it’s hard to make it do exactly what you intended; a bit like using ChatGPT. Audimee has a much more useful offering – re-singing existing vocal lines in a different voice. This is great for making harmonies and shifting styles, but it only really works well if you already have a good vocal line to start with – and that happens to be something that SV is very good at creating. I’ve only played a little with Audimee; it’s very impressive, but lacks the expressive abilities of SV: voices have little variation in style, emotion, and emphasis, and as a result seem a little flat when used for more than a couple of bars at a time. Dreamtonics have a new product called VocoFlex that promises to do the same kind of thing as Audimee, but in real time.

All this is just progress; we will no doubt see incremental improvements and occasional revolutions, and I look forward to being able to play with it all!