An open source mini-adventure

I’m using Spatie’s Media Library Pro in a project for dgen.net, and ran into a problem: when I tried to use a TIFF-format image, it failed to show a thumbnail of the image:

Drag and drop works, but no TIFF image preview.

So I set about tracking down why this image didn’t work, since the project this was being used for has lots of TIFF images. This turned into quite the can of worms, but all worked out beautifully in the end.

Most web browsers do not support TIFF images, as TIFF is not a typical “web format”, though it is very common in print and archiving contexts. It doesn’t help that Safari is about the only browser that will display them at all, but here the aim is to display a thumbnail, not the actual image, and the thumbnail doesn’t have to use the same format.

Media Library Pro is a set of user interface widgets providing access to Spatie’s Laravel Media Library package, so it depends on that package for all the underlying file management and thumbnail generation. Thumbnails are handled by a more general mechanism for creating “conversions” of underlying file types. This is especially useful for files that are not images – for example it’s possible to create thumbnails for audio files using a package I wrote – but being able to do something similar for otherwise undisplayable image types is useful too.

It turns out that Media Library’s image support is handled by yet another Spatie package called (imaginatively) Image. So I started looking there, and found that it did not actually take responsibility for performing image processing operations either, but used yet another package called Glide by The PHP League. In searching for info about using TIFF files with Glide, I found this issue, which told me that Glide already supported TIFF, so long as you were using the ImageMagick PHP extension, imagick (as opposed to the slower, less capable, but more common GD), as the image processing driver, which I already was. But as I’d seen, this didn’t seem to work.

So I set up a simple test script to convert a JPEG image into TIFF using spatie/image (I needed it to convert in both directions), and found that it did indeed create an output file. However, the apps I tried could not open it, saying that it was not a TIFF-format file. The file command line utility showed me why: it was in fact a JPEG-format image saved with a .tiff extension:

file conversion.tiff
conversion.tiff: JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment length 16, baseline, precision 8, 340x280, components 3
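The file utility works this out from the first few bytes of the file rather than from its name. As a rough illustration of the same check, here is a minimal Python sketch that compares a file's extension with its leading "magic" bytes. The function names are mine, purely for illustration, and it only knows about the two formats involved here:

```python
from pathlib import Path

# Leading "magic" bytes for the two formats in this story.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",  # JPEG start-of-image marker
    b"II*\x00": "tiff",       # little-endian TIFF header
    b"MM\x00*": "tiff",       # big-endian TIFF header
}

# Map common extension spellings onto one canonical name.
EXTENSION_ALIASES = {"jpg": "jpeg", "tif": "tiff"}


def sniff_format(data: bytes) -> str:
    """Return 'jpeg', 'tiff', or 'unknown' based on the leading bytes."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"


def extension_matches_content(path: str) -> bool:
    """True when the file's extension agrees with what its bytes say."""
    p = Path(path)
    ext = p.suffix.lower().lstrip(".")
    ext = EXTENSION_ALIASES.get(ext, ext)
    with p.open("rb") as fh:
        return sniff_format(fh.read(4)) == ext
```

Run against a file like the one above, extension_matches_content would return False, because the bytes say JPEG while the name says TIFF.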

This was not helpful! So this was a bug in Glide. I tracked down the cause of that and submitted a PR to resolve it.

One general problem with open source projects is that you never know when maintainers are going to get around to merging (or rejecting) PRs, or, having merged them, when they will be tagged for release. I know this because I have been guilty of this myself! Here I struck lucky – a maintainer merged it the same day, and also tagged it for release.

Now I had a different problem. This fix was several layers down in my stack of dependencies, and those projects didn’t know about this change in Glide, so if I wanted spatie/image to gain TIFF support, I needed to bump its dependencies to force it to use the new version. It also turned out that while Glide now had TIFF support, Image did not pass that support through to its consumers, so I needed to let it know that TIFF was also a supported format. All that happened in another PR. Spatie has a very good reputation for supporting its open source packages, not least because they constantly dogfood them, and have a great track record of merging PRs quickly and tagging them for release, and this was no exception – my PR was merged and released very quickly.

Now I was nearly there – but not quite! I discovered two almost identical problems in spatie/laravel-medialibrary and spatie/image: despite delegating image processing functions to their dependencies (i.e. having Image say “I support whatever image formats Glide supports”), they both had their own hard-coded list of supported formats. I had already updated this in Image in my previous PR, but now I needed to do the same thing (and something similar for tests) for Media Library. Cue PR number 3! True to form, Spatie merged and tagged this release quickly, and my chain was complete! I followed this up with another PR to port my changes to their later version 10 branch (supporting Laravel 9), most of which involved a switch to the Pest testing framework.

Finally, back in my app, I bumped my dependency version constraints (so my app picked up the latest versions of these packages), and then I got this:

The fruits of all that effort!

I observed that there’s more that could be done in these packages, in particular that knowing what image formats and MIME types you can support should be limited only at the lowest-level – all higher dependencies should defer to the lower-level packages. This would mean that there is less code to maintain in those packages, and new formats would automatically start working without PR chains like this. So if you have time on your hands… This is of course how a lot of open source software comes into being – there’s always another yak that wants shaving!
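To make that idea concrete, here is a small Python sketch of the delegation I have in mind. The classes are hypothetical stand-ins for the three layers, not the real APIs of Glide, Image, or Media Library (which are PHP packages); the point is only that the format list lives in one place:

```python
class LowLevelProcessor:
    """Stand-in for the lowest-level image library (hypothetical)."""

    def __init__(self):
        # The only place in the stack where formats are listed.
        self.formats = {"jpeg", "png", "gif", "tiff"}

    def supported_formats(self):
        return frozenset(self.formats)


class ImageWrapper:
    """Middle layer: keeps no list of its own, it just asks the layer below."""

    def __init__(self, processor):
        self._processor = processor

    def supports(self, fmt):
        return fmt.lower() in self._processor.supported_formats()


class MediaLibrary:
    """Top layer: defers to the wrapper in the same way."""

    def __init__(self, image):
        self._image = image

    def can_make_thumbnail(self, fmt):
        return self._image.supports(fmt)
```

With this shape, adding a format to the lowest layer makes it available everywhere above without touching the other two classes.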

This might seem like a lot of effort for a very small feature, but this is how open source works, on its good days! Every package you use is an accumulation of effort by original authors, maintainers, contributors, and reporters, all of whom want to solve one problem or another, and share their efforts so that others can avoid having to solve the same problems all over again.

This particular chain is the longest nested set of PRs I’ve ever done, it was fun to do, was about the first thing I’ve ever “live tweeted”, it resulted in a solution to the specific problem I had, and that solution is now available to all. This is how open source is meant to work, but it’s not always this (remarkably!) smooth. Some package creators can’t be bothered to maintain their packages, others are on holiday, have just had a baby, or have died; raging flamewars erupt over the most trivial things; discrimination (racial, sexual, religious) is unfortunately common; bug reporters often fail to describe their problems well, or make excessive, unrealistic, entitled demands of maintainers. Sometimes this proves to be too much, resulting in great people stopping (or never starting) their participation in the open source ecosystem, which is a terrible shame.

The web would not exist without open source, and if you want to continue to reap the benefits of this beautiful thing we have collectively created, the best way is to support the maintainers. Whether it’s individual developers like me, package creators like Spatie and The PHP League, or open-source juggernauts like Laravel and SensioLabs (Symfony), we can all benefit from support. There are many different ways you can provide support (not just financially): making developer time (or other resources) available; paying for products and services sold by companies that back open source projects; or paying maintainers, either directly through things like GitHub sponsorship and Patreon, or through broader programmes such as Tidelift that might be more acceptable to accounting departments. I’m tooting my own trumpet here (my blog!), but there are literally millions of open source developers out there, and if you’re reading this, you’re using software that we have all created together.

The Stack Overflow Antipattern, part 2

I enjoyed Riggraz’s observation of “The Stack Overflow Antipattern”, and it made me think of another very similar pattern that I see a lot on Stack Overflow, but it occurs after the pattern that Riccardo describes, and I thought I’d outline that here.

Image by wal_172619 from Pixabay

I answer a lot of questions on Stack Overflow. I ask very few, but I’ve still fallen into this trap myself.

Once you’ve been through Riccardo’s antipattern (ignoring the other antipattern of those that don’t even make it to step 1), you are here:

  1. You’ve searched and found some random results
  2. You’ve read some SO questions that were in those results
  3. You’ve still not found a solution

If you’ve got this far, the breadth of the question you want answering has probably been narrowed a little (which helps in its own right: searching is a mild form of rubber ducking), and probably contains the basis of a worthwhile Stack Overflow question.

So you focus on the problem, write it up, and (assuming this rubber-duck exercise didn’t lead you to a solution) post the question, and answers and comments appear reasonably quickly (hey, Stack Overflow rocks!). But often these responses are bogus, half-answers, or raise further questions. It’s at this point that we see the same “not taking the time to think” that Riccardo observed. You are so focused on the original question, you become incapable of solving a far simpler follow-on question.

Here’s a small example I often see:

“I’ve seen docs and code referring to autoload.php, but I can’t find that file”

Searching for autoload.php will find a zillion irrelevant results, because pretty much every PHP project in existence has one. So the question is posted on SO. There is a simple answer to this question, which is:

“install composer, run `composer install`, and it will create the autoload.php file for you”

This inevitably leads to the follow-on question:

“how do I install composer?”

This is a new sub-problem, but one that is instantly solvable by searching because it’s far less ambiguous. However, this is where the abdication of thinking kicks in, and rather than actually doing that search, you ask in the SO question comments, and sit around waiting for an answer from Someone Who Knows™ that posted an answer to the original question. This is frustrating for the answerer, who knows that the asker could find the answer to this question far faster by searching for themselves, but they choose not to because their thinking is turned off. There’s also an element of panic – the asker has obtained the attention of someone capable of understanding their problem, and doesn’t want them to escape before they have addressed the full recursive stack of sub-problems. This has led to the existence of passive-aggressive responses like LMGTFY, which are demeaning and condescending, but reflect the frustrations of those who answer questions.

What’s weirder is that I have observed myself doing this very thing, and I’ve sometimes had to stop myself posting trivial follow-up questions without thinking. Avoiding having to think is evidently a compelling driver.

I emailed Riccardo about his article with some of the thoughts that led me to write this, and he came back with another interesting observation: This loss of confidence that leads one to post trivial follow-up questions is very much like imposter syndrome. Having had to ask a question in the first place can provoke feelings of embarrassment or inadequacy, and anyone that responds in a positive way will appear to be in some way superior, which is fully expected, but at the same time provokes feelings of “we’re not worthy”, further reducing one’s confidence to be able to deal with even simple problems.

We’re not worthy

I know that Stack Overflow (and GitHub issues) can sometimes be harsh on new users, and old hands (like me) can forget what it’s like to be a beginner. It can be very frustrating to answer questions that have been answered many times before (“my PHP script just gives a white screen”), and I’ve occasionally found myself editing my initial reaction to avoid unkindness. In those situations I usually try to overcompensate by offering more general advice about how to avoid getting stuck in dead-end situations like that, rather than just answering the precise question asked.

In PHPMailer I have tried to head off support questions before they arise by adding links to documentation in error messages, but it doesn’t stop people posting questions like this:

I have this error:
> 2020-05-16 07:28:11 SMTP connect() failed.
> https://github.com/PHPMailer/PHPMailer/wiki/Troubleshooting

I have been stuck on this for 2 weeks, and I searched the entire internet three times
Where can I find out how to fix it? You must help me urgently!

I don’t really know what to do when faced with this. Posting a substantive answer is probably pointless – if they have not read what’s right in front of them despite their evident frustration, chances are they will not read any answer you post either, especially since it will only contain exactly what’s in the link provided anyway. Sometimes the best thing to do is vote to close the question, usually as an inevitable duplicate. I see very similar things happening in GitHub issue templates – askers delete the boilerplate text, removing something which would usually help them solve their exact problem in a few seconds, but they go out of their way to make the process take longer and involve others unnecessarily, because apparently not having to think is a more attractive proposition.

I’ve also considered pushing in the other direction, such as by adding “delete this line from the debug output before posting questions about it, or your question will be ignored” as a way of enforcing reading the error messages, but that’s unkind.

I’m not sure how to address the abdication of thinking issue though. Perhaps offer up search results derived from comments or answers, much in the same way that Stack Overflow does when you first post a question? There are probably extensive psychological studies on this pattern of behaviour, and it may well have a name, but that’s a question for another Stack Exchange site.

Using a Behringer DSP8024 for Room EQ

I have a Behringer DSP8024 Ultra-Curve Pro audio processor on the output of my computer.

Behringer DSP8024 audio processor

I picked up this relatively ancient unit for £50 about 15 years ago (it cost about $500 back in 2001!), and they can still be found on eBay, along with later models like the DEQ2496, and related hardware like Focusrite’s (discontinued) VRMBox. It provides many different audio processing functions, including:

  • Stereo 31-band ⅓-octave graphic equaliser
  • Real-time stereo 31-band spectrum analyser
  • Stereo 6-band parametric equaliser
  • Delay up to 2.5 sec
  • Noise gate
  • Automatic “feedback destroyer”
  • Accurate level meter with selectable scales
  • “Brick wall” limiter for output protection
  • Automatic room equalisation using microphone input and internal noise generators

It’s this last feature that is the most useful, combining the analyser with the graphic equaliser. Room equalisation (EQ) can correct a lot of acoustic deficiencies in a room. The shape, composition, and contents of a room, and the uneven frequency response of your speakers and audio interface, all contribute to how audio sounds within it. Ideally you want to minimise these effects so as to hear as true a signal as possible. It’s a good idea to apply corrective EQ after adding simple physical acoustic controls (e.g. absorber panels, diffusers, and bass traps, or just old duvets and cushions). Room EQ gets some criticism from audiophiles because it can be very hit & miss and can’t address bigger issues, but it can work very well if you listen from a single location in your room (e.g. in front of your desk).

To measure the room equalisation accurately, you need a microphone with a flat (or at least well-documented) frequency response; I use a t.bone MM-1 for this.

The t.bone MM-1 measurement microphone

The equalisation process works like this, starting from flat EQ (no alteration):

  • Output pink noise from the unit through the speakers
  • Analyse what it sounds like through the microphone, from your usual listening position
  • Alter the equalisation towards a flat response
  • Iterate over this process until the overall response is as flat as possible
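I don’t know what algorithm the unit actually runs internally, but the loop described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the room is modelled as a fixed per-band deviation in dB, the target is 0 dB in every band, and the step size and ±12 dB gain limit are made-up values rather than the DSP8024’s real ones:

```python
def auto_eq(room_db, step=0.5, max_gain_db=12.0, iterations=20):
    """Nudge per-band EQ gains until the measured response is nearly flat.

    room_db: how far each band deviates from flat, in dB (a toy model of
             what the microphone would pick up with no EQ applied).
    Returns the list of EQ gains in dB, one per band.
    """
    eq = [0.0] * len(room_db)
    for _ in range(iterations):
        # "Measure": what the microphone hears is the room plus the current EQ.
        measured = [r + g for r, g in zip(room_db, eq)]
        # Move each band part of the way towards flat, within the gain limit.
        eq = [
            max(-max_gain_db, min(max_gain_db, g - step * m))
            for g, m in zip(eq, measured)
        ]
    return eq


# Example: a room with a bass hump and a dip in the mids.
room = [6.0, -3.0, 0.0, 9.0]
gains = auto_eq(room)
residual = [r + g for r, g in zip(room, gains)]
```

Taking partial steps rather than jumping straight to the inverse curve mirrors the way you can hear the noise change gradually during the real process. The clamp is a reminder that EQ can only do so much: a deviation bigger than the available gain range is left partly uncorrected.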

This process is loud and quite unpleasant, so leave the room or stick on some closed headphones while it’s busy! It takes a minute or so, and you can hear the change in characteristic of the noise playing through the speakers, and see the changes in EQ on the screen of the unit during the process. After it’s done you can save the EQ curves, and switch the EQ in and out to A/B the config. The difference is pretty noticeable, particularly at the low end where most room-related acoustic problems tend to be; overall it’s like having a major speaker upgrade! One benefit I really notice is when switching between my corrected speakers and a decent pair of monitoring headphones – the audio really doesn’t change in character; there’s no significant tonal shift between the two.

Some people have noted problems with “digital noise” when using this unit, particularly at low volume levels. I suffered from this for a long time, but then realised what caused it and solved the problem. If you have a volume control that is before the processor, you will end up with a small signal going through the analogue-to-digital converters (ADCs), effectively throwing away much of their available resolution, and you’ll get a lot of quantization noise as a result. The best way to hear this deliberately is to turn the input level down and the output level up, then play something smooth and quiet. It will sound horrible, gritty and noisy; you can really “hear the bits”. This isn’t a problem unique to this unit – any ADC provided with insufficient signal will suffer the same problem.
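To put rough numbers on this, the textbook figure for an ideal N-bit converter fed a full-scale sine wave is a signal-to-quantization-noise ratio of about 6.02 × N + 1.76 dB, so every 6 dB or so of attenuation ahead of the ADC costs roughly one bit. This Python sketch applies that rule of thumb; it describes an ideal converter, not measurements of this unit, and real converters fall short of their nominal bit depth:

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def ideal_snr_db(bits):
    """Signal-to-quantization-noise ratio of an ideal converter, full-scale sine."""
    return 6.02 * bits + 1.76


def effective_bits(bits, attenuation_db):
    """Bits still in use when the input sits attenuation_db below full scale."""
    return bits - attenuation_db / DB_PER_BIT


def snr_after_attenuation_db(bits, attenuation_db):
    """The quantization noise floor stays put, so SNR drops by the attenuation."""
    return ideal_snr_db(bits) - attenuation_db


# Example: turning the source down by 40 dB before a 24-bit converter.
bits_left = effective_bits(24, 40.0)
```

In that example bits_left comes out a little over 17, so about a quarter of the converter’s nominal resolution goes unused before the signal is even digitised.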

You want to maximise the use of ADC resolution by giving it a full-range signal to convert. So if you have an audio interface before it, make sure it’s turned up full, and if you have any software level control (e.g. macOS system volume), make sure that’s turned up full too, so you’re always sending a full-volume signal. This way the converters will always use their full 24-bit resolution and the quantization noise will be so small you won’t hear it (it’s impossible to remove completely). However, you still want to control your output level. There are two ways to do this: alter the level on your monitors (which can be inconvenient, as volume controls on active monitors are often on each speaker separately, and often hidden around the back), or use a passive volume control between the equaliser and the speakers. I use a Mackie Big Knob Passive for this.

A Mackie “Big Knob” passive volume controller.

Passive volume controls have no power supply (so no noise or extra cables), and can only turn a signal down, not up. They are analogue, so there are no DACs or ADCs, just simple passive components. Ideally, when one is turned up full, it should be indistinguishable (acoustically speaking) from a length of cable.

Controlling level directly on the speakers (or on a separate amplifier if you have one) is possibly better than this approach, but usually less convenient. If you want to be able to run your speakers at full volume via the passive volume control, you need to have the monitors turned up full, and this often means you’ll get significant analogue noise (hiss) from the speakers when you’re listening at lower volume. However, that’s generally less unpleasant, and it’s not the problem we’re addressing here.

All of this attention to correct signal levels throughout an audio signal path is part of a wider concept known as gain staging, and occurs in many other places in audio recording, processing, mixing, mastering, etc.

It is possible to do all this processing in software using systems like REW or RoomEq, or even to go further and emulate other listening environments, famous studios or speakers, but I quite like having all this externalised and independent of software, and it also means that it can be applied to external inputs too, if you’re playing an instrument directly through a mixer. The “big knob” also provides a very convenient single control for output level, along with other features such as mute/dim and speaker and input switching.

…and after the MVP?

You may be familiar with this triangle representation of a “Minimum Viable Product”, or MVP. The idea is that you have a product that contains just enough of all its critical components to be actually sellable, and the pink shaded area represents the amount of work or resources required to bring it to fruition, out of the “full” range of possibilities if budget/resources were not an issue. This diagram is usually presented side-by-side with this one that shows a contrasting “bottom up” approach:

The same amount of resources just fills the bottom layer, giving you lots of functionality but no way of using it, making it unsellable.

That’s where the analogies generally stop. I’ve encountered several misunderstandings of what happens next. Firstly, “the same, but bigger”, where more budget arrives, accompanied by a matching expansion of the spec:

Sure, you get “more” built, but it’s still the same proportion of what you’re aiming for, so it has not really progressed beyond the MVP.

Next we have the “expanded spec”, where the intended implementation is increased, but the MVP implementation stays where it is (i.e. no additional outlay). While management might want the MVP proportion to scale with the spec, of course that doesn’t happen – you have just made your MVP proportionally smaller rather than bigger, likely breaking its “viable” status:

Next up is “do the same again, because scalable”!

This is clearly a management disaster, and might be illustrated by jumping straight into a basic project using Kubernetes and microservices, or perhaps writing the first page of all the chapters in a book. You will drown in complexity while achieving less than the intended MVP.

The missing image is very simple, I’ve just never seen it actually drawn; it’s this:

Stick to your plan, and implement more of it as resources and budget permit. This may involve rebuilding some of the things you did earlier but bigger/better/differently, not just adding. There is another common set of these diagrams that uses a skateboard / scooter / bike / motorbike / car progression analogy that does work a bit better to convey that (e.g. both skateboards and cars have wheels, but skateboard wheels are not good enough for a car). That’s all.