It wasn’t until 1920 that the question “how do we quantify information” was well articulated. This video introduces a key idea of Nyquist and Hartley, who laid the groundwork for Claude Shannon’s historic equation (Information Entropy) two decades later. In these early papers, the idea of using a logarithmic function appears, something which isn’t immediately obvious to most students fresh to this subject. If one ‘takes this for granted’ they will forever miss the deeper insights which come later. So, the goal of this video is to provide intuition behind why the logarithm was the ‘natural’ choice…

The follow three video mini-series is a bit of an Engineering detour in the story of information theory. In order to easily grasp the ideas of Hartley and Shannon, I felt it would be beneficial to lay some groundwork. It began with my own selfish interest in wanting to relive some famous experiments & technologies from the 19th Century. Specifically, why did the Information Age arise? When and how did electricity play a role in communication? Why was magnetism involved? Why did Morse code become so popular compared to the European designs? How was information understood before words (and concepts) such as “bit” existed? What’s the difference between static electricity and current?

All of these questions are answered as we slowly uncover a more modern approach to sending differences over a distance…

It’s powerful to understand how conditional probability can be visualized using decision trees. I wanted to create an alternative to most explanations which often start with many abstractions. I was drawn to the idea of looking at the back pages of a choose-your-own-adventure book, and deciding how you could have arrived there. Here I present a visual method using a story involving coins… allowing you to decide how to formalize. Once we grow tired of growing trees, we may ask the key questions: how can we speed up this process?:

This is followed by a game I designed (built by Peter Collingridge) which introduces how branches can be weighted instead of counted.

In order to understand the subtle conceptual shifts leading to the insights behind Information Theory, I felt a historical foundation was needed. First I decided to present the viewer with a practical problem which future mathematical concepts will be applied to. Ideally this will allow the viewer to independently develop key intuitions, and most importantly, begin asking the right kind of questions:

I noticed the viewer ideas for how to compress information (reduce plucks) fell into two general camps. The first are ways of using differentials in time to reduce the number of plucks. The second are ways of making different kind of plucks to increase the expressive capability of a single pluck. Also, hiding in the background is the problem of what to do about character spaces. Next I thought it would be beneficial to pause and follow a historical narrative (case study) exploring this problem. My goal here is two congratulate the viewer for independently realizing a previously ‘revolutionary’ idea, and at the same time, reinforcing some conceptual mechanics we will need later. It was also important to connect this video to previous lessons on the origins of our alphabet (a key technology in our story), providing a bridge from proto-aphabets we previously explored….

This is followed by a simulation which nails down the point that each state is really a decision path

Before jumping into Information Theory proper I decided to go back and explore the history of the Alphabet. This reminds us that communication, no matter how fluid it seems, is really just a series of selections. I’m using both Shannon and Harold Innis as inspiration for this series which is why I’m clarifying medium vs. message as well as information transmission over space vs. time – ideas which are popularized by Marshall McLuhan years later. By starting this way I’m able to carefully move away from the semantic issues of information and towards what Shannon called the “engineering problem”. This analogy will carry through the rest of the series so it’s important to lay the groundwork early on.

This video is an attempt to explain the Prime Number Theorem in a way that gives you a tactile intuition regarding the density of primes. It’s an idea Gauss is famous for having at the age of 16 while studying tables of prime numbers < size (x). The idea for this video came to me while walking in the forest and noting the gradual shift in leaf density as I moved away from the trees. I thought it could be a nice way to introduce density gradient.

More importantly, check out the amazing visualization that Khan Academy user Peter Collingridge made to follow up the video: