What’s a decoder, and why is it so powerful?

Put simply, a decoder decodes encoded messages. Okay, okay, I’ll elaborate with a quick example.

Suppose you want to send a message to your upstairs neighbor, who’s being outrageously loud for mysterious reasons (it’s like they’re elephants with pogo sticks, seriously). In this purely hypothetical situation, you only have one real channel available to you: the ol’ broom against the ceiling technique. How do you translate your message, say, “Please be quiet and practice the gong before 2 AM”, into a series of bumps? Well, you and your upstairs neighbor would agree upon a code beforehand. Then you would encode your message by running some sort of algorithm (simply a recipe to do something) that translates it into a coded message, which you can then send across your ceiling “channel”. Any device that translates input messages to coded messages is appropriately called an encoder. The algorithm used by the encoder may be very simple (it may just say “replace each letter with the Morse code equivalent”) or it may be more sophisticated. It depends on what features we want our code to have and how tricky the input data are.

On the other side, your neighbor will receive and write down the bumps that he hears, then pass them to a decoder, which, you guessed it, takes in coded messages and uses an algorithm to turn them into intelligible messages. If everything goes as planned, your neighbor should be able to recreate your message to some degree of accuracy. All is well.
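To make the encoder/decoder pair concrete, here is a minimal Python sketch of the broom-and-ceiling scheme using Morse code. The function names (encode, decode) and the conventions of separating letters with a space and words with " / " are my own assumptions for illustration. Punctuation is simply dropped, and anything the decoder doesn't recognize comes out as "?".

```python
# International Morse Code for letters and digits.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}
# The decoder needs the same table, read in the other direction.
REVERSE = {code: char for char, code in MORSE.items()}


def encode(message: str) -> str:
    """Encoder: turn a plain message into a string of 'bumps'."""
    words = message.upper().split()
    return " / ".join(
        " ".join(MORSE[ch] for ch in word if ch in MORSE) for word in words
    )


def decode(bumps: str) -> str:
    """Decoder: turn a string of 'bumps' back into a readable message."""
    words = bumps.split(" / ")
    return " ".join(
        "".join(REVERSE.get(symbol, "?") for symbol in word.split())
        for word in words
    )


message = "Please be quiet and practice the gong before 2 AM"
bumps = encode(message)
print(bumps)          # what goes through the ceiling
print(decode(bumps))  # what the neighbor reads (upper case only)
```

Notice that the round trip only works because both sides share the same table. That is the "previously agreed upon code" from the story.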

Now imagine that my ceiling fan starts to make random noises that strongly resemble broom bumps. My upstairs neighbor is now confused: sometimes, when decoded, the bumps are rich with information (I’m sending a message), and other times, the bumps are just noise. Without a decoder, the upstairs neighbor does not have an obvious way of measuring the information content of the bumps. He could only talk about correlations between bumps and time of day, or bumps and his gong playing. These correlations don’t necessarily prove that there’s a coherent message in the bumps. What’s more, the correlations are limited in that they can only provide evidence for (or against) his prior hypothesis about the message content; they do not produce an estimated message by themselves. So he has to have some guess a priori about the message I’m sending, then measure correlations between the bumps and this guess to decide whether they match. If I say “Hello good buddy, let’s get some pizza”, he could have a very tough time guessing, because he’s most likely testing similarity to “C’mon man it’s too loud” or similar statements.

But if he can decode the bumps and output a meaningful sentence, then that information must be in the bumps. He didn’t have to guess what the bumps were saying; he just tossed them into his decoder and watched what it spat out. This is the power of the decoder: you can confidently argue that some random process (here, a sequence of bumps) is in some way encoding information because you can extract that information with a proper decoding scheme. (It is worth mentioning that finding such a proper decoding scheme can be extremely difficult, depending on what you’re doing. While decoders are awesome, they’re definitely not cure-alls.)
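For the mathematically inclined, here is one hedged way to formalize that argument. The notation is mine, not part of the story so far: call the intended message M, the bumps B, and the decoder output \hat{M} = g(B). Because the decoder only ever sees the bumps, M → B → \hat{M} forms a Markov chain, and the data processing inequality from information theory says the decoder cannot create information about the message that the bumps did not already carry:

```latex
I(M; \hat{M}) \le I(M; B), \qquad \hat{M} = g(B)
```

Read right to left: whatever information the decoder output shares with the true message is a lower bound on what the bumps contain. A bad decoder makes the left side small without telling you anything about the right side.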

This last point is why decoders have so much power in neuroscience. Suppose I think that some network…oh, I don’t know, a place cell network, for example…is encoding spatial location. Evidencing this claim can be highly nontrivial, as neuronal networks are noisy, highly complex, and often involved in more than the main task at hand. If I have some inkling as to how these neurons are encoding space (even if my understanding of their encoding scheme is incomplete), I can build a decoder that inputs these neurons’ activities (the random process) and estimates that spatial location. And if it works—that’s powerful. It allows us to bypass the problems presented by evaluating correlations between behavior and neural activity and definitively establish a lower bound on the amount of spatial information contained in the network. I don’t have to argue that this is representing location to some degree—I can just say, hey look, my decoder is outputting spatial location, so the information must be somewhere inside that network. Once that estimate is created, we can evaluate performance by comparing it to some recorded location—if they’re close, we’re in the money, so we can now estimate location using only the neuronal network and the decoder. Note that decoders cannot be used to prove absence of information—I mean, what if our decoder uses a totally incorrect scheme to guess location? I could design a decoder that outputs the same location every time it receives any input; if it incorrectly estimated location, that’s not the network’s fault.
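Here is a toy sketch of that idea in Python with NumPy. Everything in it is an assumption made for illustration, not a real dataset or any lab's actual method: a 1D track, simulated place cells with Gaussian place fields, Poisson spike counts, and a maximum-likelihood decoder (one common choice among many). It also includes the "same location every time" decoder and a shuffled control for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulated place cell network (all numbers are made up) ---
track_length = 100.0                               # cm
n_cells = 30
centers = np.linspace(0, track_length, n_cells)    # place-field centers (cm)
width = 8.0                                        # place-field width (cm)
peak_rate = 20.0                                   # peak firing rate (Hz)
baseline = 0.5                                     # background rate (Hz)
dt = 0.25                                          # decoding window (s)


def tuning(position):
    """Expected rate (Hz) of each cell; shape (n_positions, n_cells)."""
    position = np.atleast_1d(position)[:, None]
    bump = np.exp(-0.5 * ((position - centers) / width) ** 2)
    return peak_rate * bump + baseline


# "Recorded" locations and the noisy spike counts they produce.
true_pos = rng.uniform(0, track_length, size=500)
counts = rng.poisson(tuning(true_pos) * dt)        # shape (500, n_cells)

# --- Decoder: maximum likelihood, assuming independent Poisson cells ---
grid = np.linspace(0, track_length, 201)           # candidate locations
expected = tuning(grid) * dt                       # shape (201, n_cells)
log_expected = np.log(expected)


def decode_position(spike_counts):
    # log P(counts | x) = sum_i [n_i * log(lam_i(x)) - lam_i(x)] + const
    log_like = spike_counts @ log_expected.T - expected.sum(axis=1)
    return grid[np.argmax(log_like, axis=1)]


estimate = decode_position(counts)
ml_error = np.median(np.abs(estimate - true_pos))

# A deliberately useless decoder: always guess the middle of the track.
constant_error = np.median(np.abs(track_length / 2 - true_pos))

# Control: break the link between activity and location by shuffling trials.
shuffled = decode_position(rng.permutation(counts))
shuffled_error = np.median(np.abs(shuffled - true_pos))

print(f"ML decoder median error:       {ml_error:.1f} cm")
print(f"Constant decoder median error: {constant_error:.1f} cm")
print(f"Shuffled control median error: {shuffled_error:.1f} cm")
```

The constant decoder does badly on exactly the same network that the maximum-likelihood decoder reads well. So a decoder that fails says something about the decoder, not about the network, while a decoder that succeeds says the information has to be in there.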

TL;DR Decoders are cool, and a really powerful way of putting lower bounds on the information content of a random process. This is especially useful in neuroscience, where other methods of estimating the information content of networks are really difficult.

Coffee: A mocha and a pourover from one of my favorite California roasters, Verve. Happy National Coffee Day!
