Many criminal trials feature forensic evidence in the form of audio recordings, typically from bugging houses or cars, or intercepting phone calls.
Unfortunately, the audio is often of very poor quality, making it hard for the jury to discern what is said.
Here’s a quick example (you might like to jot down what you hear before reading on).
When indistinct audio is admitted as evidence, Australian (and other) courts allow the jury to be given an “enhanced” version to assist their hearing.
You might now be eager to hear the enhanced version of the audio you just listened to. Sorry to disappoint, but that actually was the enhanced version.
This example highlights how misunderstanding of what enhancing can do compounds an existing problem: inaccurate transcripts influencing juries’ perception of indistinct audio.
What enhancing can and can’t do
There are no general techniques that can reliably and objectively make unintelligible audio intelligible. But this does not mean enhancing is ineffectual.
What enhancing can do is make audio sound “clearer”, in the sense of “less noisy”. Making it “clearer” in the sense of “more intelligible” requires a transcript.
A segment from the 2016 film The Case of: JonBenét Ramsey shows how a transcript and enhancing work together. The film revisits the unsolved 1996 murder of a six-year-old beauty queen in the USA. The audio you listened to is one of several pieces of evidence purporting to show that the child’s family was implicated in her murder.
The video below begins after 12 minutes, and the enhancing segment ends at 14 minutes and 37 seconds.
Judging from public reaction, many viewers accept the four phrases were “revealed” by the “enhancing” – but is that really what happened?
A recently published experiment suggests not.
At Step 1 of the experiment, the audio was played “cold” – with no contextual information – to 78 participants. Half listened to the film’s original and half to its enhanced version. No one in either group heard anything remotely like any of the phrases. Most didn’t even hear human speech (did you?).
So how did the movie persuade so many viewers the enhancing had “revealed” the phrases?
It presented the enhancing with a transcript that “primes” listeners to hear these particular phrases.
This effect is demonstrated by Step 2 of the experiment, where participants were given a transcript. After failing to hear any of the four phrases while listening cold, nearly half now agreed they could hear at least one of them.
Here’s what’s important
Participants who were primed by the transcript while listening to the enhanced audio were more likely (63% vs 24%) than those listening to the original to accept more of the phrases, and to do so more confidently.
That would show a good effect of enhancing if the transcript were a reliable account of what was actually said. But is it? To answer that, consider where the phrases came from.
The movie portrays the investigators spontaneously hearing the phrases as the audio is enhanced. But that is disingenuous.
There is good evidence the phrases originate from police in the 1990s listening to noises at the end of a cassette copy of the 911 call in which the child’s disappearance was reported.
So what are those noises?
Listening to the whole call (start at 6 minutes 34 seconds in the movie above), it seems likely they are the sound of the agent typing up information provided by the caller. Interestingly, some commentators provide evidence (not tested in court) suggesting that, when the audio was transferred to the cassette during the investigation, it was processed in ways that make the typing sound more like speech.
So the movie’s “original” may not actually be the real original.
Be that as it may, Step 1 of the experiment makes clear that the movie’s “enhancing” has no effect whatsoever in revealing the phrases. That effect is entirely the work of priming by their (misleading) transcript.
The same thing happens in real trials
The movie’s flashy visuals and sensational tone seem far removed from a courtroom. Yet the way the movie presents the audio is very similar to how audio is presented in a trial.
In trials, as in the movie, listeners hear an enhanced version of indistinct audio with the “assistance” of a police transcript.
The problem with this can be explained via an analogy from forensic image enhancement. Consider the very indistinct number plate below, and an enhancement that looks “clearer”. Does it help you see DUN 150J?
Knowing the truth makes a difference
In this case, we know what the number plate actually was. Click here to see a clear image of the actual number plate.
Knowing “ground truth” – the absolute, undisputed truth – about the real number plate makes it easy to see that, while the enhancing may have made the indistinct image look “clearer”, it has not thereby made it closer to reality.
The problem, of course, is that in a trial, ground truth is not known. The court has only an indistinct original and a “clearer” enhancement.
With no access to ground truth, it is impossible for the jury to discover that the apparently clearer enhancement is no closer to reality than the blurry original.
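The gap between “clearer” and “closer to reality” can be illustrated with a toy calculation. The sketch below is purely hypothetical: it invents a known signal, adds noise to make an “original”, and applies a simple smoothing filter as a stand-in “enhancer”. None of this reflects the processing used in the film or in any real case; the function names (`roughness`, `error`, `smooth`) and all the numbers are assumptions chosen for illustration.

```python
import math
import random

def roughness(signal):
    """Average jump between neighbouring samples: a crude stand-in for 'sounds noisy'."""
    return sum(abs(b - a) for a, b in zip(signal, signal[1:])) / (len(signal) - 1)

def error(signal, truth):
    """Root-mean-square distance from ground truth: how far the signal is from reality."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(signal, truth)) / len(truth))

def smooth(signal, width):
    """Hypothetical 'enhancer': replace each sample with the mean of its neighbours."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

rng = random.Random(0)  # fixed seed so the run is repeatable

# Invented ground truth: a fast-changing pattern (in a trial this is unknown).
truth = [math.sin(2 * math.pi * i / 8) for i in range(400)]

# The indistinct "original": the truth plus random noise.
original = [t + rng.gauss(0, 0.3) for t in truth]

# The "enhanced" version: heavily smoothed, so it is much less noisy.
enhanced = smooth(original, 9)

print("roughness  original:", round(roughness(original), 3),
      " enhanced:", round(roughness(enhanced), 3))
print("error vs truth  original:", round(error(original, truth), 3),
      " enhanced:", round(error(enhanced, truth), 3))
```

In this made-up setup the smoothed version scores as far less noisy, yet it sits further from the known signal than the noisy original does, because the smoothing wipes out the fast-changing pattern along with the noise. The second comparison is only possible here because the ground truth was invented in advance. A jury has only the first.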
And all this is exactly true of audio
Does that mean enhancing is never effective?
Audio enhancing can sometimes be useful. It can also be ineffective – or even misleading. In the present case, for example, it misleadingly made typing sound like speech, at least to some listeners.
The point is that, in the absence of ground truth, the effectiveness of enhancing cannot be reliably determined simply by asking listeners whether the audio sounds clearer.
Yet that is the sole criterion used in our courts.
According to our legal system, evaluating the effectiveness of enhancing is a matter for the jury, who are invited to listen to the enhancement and use it if it sounds clearer to them.
But the experiment shows that making audio “clearer” can have the opposite effect to the one intended. That’s because an unreliable transcript seems more believable when it accompanies less noisy audio than when it accompanies the original.
This exacerbates the already serious problem of inaccurate police transcripts providing misleading evidence to juries.
But wait, it gets worse
Lax admission of enhanced audio is a serious problem. Even more serious, however, is the prevalence of false beliefs among the judiciary about the capabilities of enhancing in general. These false beliefs may lead to erroneous rulings on important matters.
This is one of several concerns that have prompted Australian linguists to raise a Call to Action, asking the judiciary to review and reform the handling of indistinct covert recordings used as evidence in criminal trials.