It wasn’t until the 1920s that the question “how do we quantify information?” was well articulated. This video introduces a key idea of Nyquist and Hartley, who laid the groundwork for Claude Shannon’s historic equation (Information Entropy) two decades later. In these early papers, the idea of using a logarithmic function appears, something which isn’t immediately obvious to most students new to this subject. Anyone who takes this for granted will miss the deeper insights which come later. So, the goal of this video is to provide intuition for why the logarithm was the ‘natural’ choice…
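One way to see why the logarithm fits: the number of possible messages multiplies as a message gets longer, yet we would like the "amount of information" to simply add up. Below is a minimal sketch of Hartley's measure, H = n log s (n selections from s symbols). The function names and the choice of base 2 are my own assumptions for illustration, not something taken from the video.

```python
import math


def message_count(symbols: int, length: int) -> int:
    """Number of distinct messages of `length` selections from `symbols` choices."""
    return symbols ** length


def hartley_information(symbols: int, length: int, base: float = 2.0) -> float:
    """Hartley's measure H = n * log(s); base 2 is an assumed choice of unit."""
    return length * math.log(symbols, base)


if __name__ == "__main__":
    # Message counts multiply when two messages are joined...
    print(message_count(26, 2) * message_count(26, 3), message_count(26, 5))
    # ...while the logarithmic measure adds.
    print(hartley_information(26, 2) + hartley_information(26, 3),
          hartley_information(26, 5))
```

Joining a 2-letter message to a 3-letter message multiplies the counts but adds the logarithmic measures, which is the behaviour we intuitively expect from a quantity of information.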

The following three-video mini-series is a bit of an Engineering detour in the story of information theory. In order to easily grasp the ideas of Hartley and Shannon, I felt it would be beneficial to lay some groundwork. It began with my own selfish interest in wanting to relive some famous experiments & technologies from the 19th century. Specifically, why did the Information Age arise? When and how did electricity play a role in communication? Why was magnetism involved? Why did Morse code become so popular compared to the European designs? How was information understood before words (and concepts) such as “bit” existed? What’s the difference between static electricity and current?

All of these questions are answered as we slowly uncover a more modern approach to sending differences over a distance…

It’s powerful to understand how conditional probability can be visualized using decision trees. I wanted to create an alternative to most explanations, which often start with many abstractions. I was drawn to the idea of looking at the back pages of a choose-your-own-adventure book and deciding how you could have arrived there. Here I present a visual method using a story involving coins… allowing you to decide how to formalize it. Once we grow tired of growing trees, we may ask the key question: how can we speed up this process?
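To make the tree-growing idea concrete, here is a small sketch that grows every branch and then counts leaves. The story is a hypothetical one of my own (a bag holding one fair coin and one two-headed coin), and the names `COINS`, `grow_tree` and `probability_fair_given_all_heads` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Hypothetical story: pick one of two coins at random, then flip it.
# Each coin is listed by its two faces, so every leaf is equally likely.
COINS = {"fair": ["H", "T"], "two-headed": ["H", "H"]}


def grow_tree(flips: int):
    """Return every leaf of the tree as (coin, sequence of faces shown)."""
    leaves = []
    for coin, faces in COINS.items():
        for outcome in product(faces, repeat=flips):
            leaves.append((coin, outcome))
    return leaves


def probability_fair_given_all_heads(flips: int) -> Fraction:
    """Work backwards: of the leaves showing only heads, how many are fair?"""
    leaves = grow_tree(flips)
    consistent = [leaf for leaf in leaves if all(f == "H" for f in leaf[1])]
    fair = [leaf for leaf in consistent if leaf[0] == "fair"]
    return Fraction(len(fair), len(consistent))


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, probability_fair_given_all_heads(n))
```

Counting only works here because every leaf is equally likely. The tree also doubles in size with each flip, which is exactly why we start looking for a shortcut.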

This is followed by a game I designed (built by Peter Collingridge) which introduces how branches can be weighted instead of counted.
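The weighting idea can be sketched in a few lines: instead of drawing every leaf, multiply the weights along each path and compare the totals. This is not the game itself, only a rough illustration under an assumed story (a bag with three fair coins and one two-headed coin). The function name `weigh_branches` is hypothetical.

```python
from fractions import Fraction


def weigh_branches(priors: dict, likelihoods: dict) -> dict:
    """Weight each path (prior * likelihood), then normalise by the total weight."""
    weights = {name: priors[name] * likelihoods[name] for name in priors}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Assumed story: 3 fair coins and 1 two-headed coin in the bag; we see heads.
    priors = {"fair": Fraction(3, 4), "two-headed": Fraction(1, 4)}
    chance_of_heads = {"fair": Fraction(1, 2), "two-headed": Fraction(1)}
    print(weigh_branches(priors, chance_of_heads))
```

The result is the same as counting leaves on a fully grown tree, but the work no longer grows with the size of the tree.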

In order to understand the subtle conceptual shifts leading to the insights behind Information Theory, I felt a historical foundation was needed. First I decided to present the viewer with a practical problem which future mathematical concepts will be applied to. Ideally this will allow the viewer to independently develop key intuitions, and most importantly, begin asking the right kind of questions:

I noticed the viewer ideas for how to compress information (reduce plucks) fell into two general camps. The first are ways of using differentials in time to reduce the number of plucks. The second are ways of making different kinds of plucks to increase the expressive capability of a single pluck. Also, hiding in the background is the problem of what to do about character spaces. Next I thought it would be beneficial to pause and follow a historical narrative (case study) exploring this problem. My goal here is to congratulate the viewer for independently realizing a previously ‘revolutionary’ idea and, at the same time, to reinforce some conceptual mechanics we will need later. It was also important to connect this video to previous lessons on the origins of our alphabet (a key technology in our story), providing a bridge from the proto-alphabets we previously explored…

This is followed by a simulation which nails down the point that each state is really a decision path.

Before jumping into Information Theory proper I decided to go back and explore the history of the Alphabet. This reminds us that communication, no matter how fluid it seems, is really just a series of selections. I’m using both Shannon and Harold Innis as inspiration for this series, which is why I’m clarifying medium vs. message as well as information transmission over space vs. time – ideas which were popularized by Marshall McLuhan years later. By starting this way I’m able to carefully move away from the semantic issues of information and towards what Shannon called the “engineering problem”. This analogy will carry through the rest of the series so it’s important to lay the groundwork early on.

This video is an attempt to explain the Prime Number Theorem in a way that gives you a tactile intuition for the density of primes. It’s an idea Gauss is famous for having at the age of 16 while studying tables of prime numbers below a given size x. The idea for this video came to me while walking in the forest and noting the gradual shift in leaf density as I moved away from the trees. I thought it could be a nice way to introduce the idea of a density gradient.
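For anyone who wants to poke at the numbers, here is a minimal sketch comparing an actual count of primes below x with the Prime Number Theorem estimate x / ln(x). The sieve and the function names are my own choices for illustration.

```python
import math


def primes_up_to(limit: int) -> list:
    """Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for n in range(2, math.isqrt(limit) + 1):
        if is_prime[n]:
            # Cross out every multiple of n, starting from n * n.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def prime_count(limit: int) -> int:
    """The number of primes <= limit."""
    return len(primes_up_to(limit))


def pnt_estimate(x: float) -> float:
    """Prime Number Theorem estimate of the prime count: x / ln(x)."""
    return x / math.log(x)


if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5):
        actual = prime_count(x)
        print(x, actual, round(pnt_estimate(x)), round(actual / x, 4))
```

The last column is the average density of primes below x. It keeps thinning out as x grows, roughly like 1 / ln(x), much like the leaves getting sparser as you walk away from the trees.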

More importantly, check out the amazing visualization that Khan Academy user Peter Collingridge made to follow up the video:

I’ll never forget the first time I was introduced to Information Theory. My TA Mike Burrel began a lecture by writing a string of 0’s and 1’s on the board and asked us to think about what it meant. It was followed by a trance-like state of excitement… how had I not heard of this before? Three years later I’m thrilled to be launching an entire episode on the topic. It was a true joy to go back to square one and relearn the topic with a childlike curiosity… My goal is to create a Myst-inspired adventure which includes various puzzles along the way.

Figuring out a brief way to explain how & why the RSA encryption algorithm works was a daunting task. My goal was to find a balance between a rigorous 2+ hour technical explanation (for this I’d suggest Dan Boneh’s crypto course) and a simplified intuitive example. I came up…

Check out my interactive exploration of random walks on Khan Academy Labs.

When someone rolls dice or selects a card from a shuffled deck, the best possible strategy for predicting the outcome can’t beat a blind guess. This is because each outcome is equally likely. When we apply random shifts to our messages, the result is a ciphertext which is indistinguishable from any other message: it contains no information. The problem with this method of encryption (the one-time pad) is that we must share all the random shifts in advance. What happens when we apply pseudorandom shifts instead? We can relax our definition of perfect secrecy and achieve practical security…
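Here is a rough sketch of the random-shift idea, restricted to uppercase letters A to Z. The helper names (`random_shifts`, `shifts_explaining`, and so on) are hypothetical, and Python's `secrets` module stands in for a source of truly random shifts, which is an assumption on my part.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # assumed message alphabet: A-Z only


def random_shifts(length: int) -> list:
    """One independent, uniformly chosen shift (0-25) per letter."""
    return [secrets.randbelow(26) for _ in range(length)]


def shift_encrypt(message: str, shifts: list) -> str:
    """Shift each letter forward by its own entry in the pad."""
    if len(message) != len(shifts):
        raise ValueError("need exactly one shift per letter")
    return "".join(ALPHABET[(ALPHABET.index(ch) + s) % 26]
                   for ch, s in zip(message, shifts))


def shift_decrypt(ciphertext: str, shifts: list) -> str:
    """Undo the shifts; only someone holding the same pad can do this."""
    if len(ciphertext) != len(shifts):
        raise ValueError("need exactly one shift per letter")
    return "".join(ALPHABET[(ALPHABET.index(ch) - s) % 26]
                   for ch, s in zip(ciphertext, shifts))


def shifts_explaining(ciphertext: str, candidate: str) -> list:
    """The pad that would turn `candidate` into `ciphertext`."""
    return [(ALPHABET.index(c) - ALPHABET.index(p)) % 26
            for c, p in zip(ciphertext, candidate)]


if __name__ == "__main__":
    pad = random_shifts(5)
    ciphertext = shift_encrypt("HELLO", pad)
    print(ciphertext, shift_decrypt(ciphertext, pad))
    # Any other five-letter message is an equally valid explanation.
    print(shift_decrypt(ciphertext, shifts_explaining(ciphertext, "WORLD")))
```

Because some pad turns every possible five-letter message into the same ciphertext, and every pad is equally likely, the ciphertext alone tells an eavesdropper nothing. The cost is visible too: the pad is as long as the message and must be shared ahead of time.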

This video covers the history of Cryptography through the lens of Cryptanalysis. It takes us from the Caesar Cipher to the one-time pad… a daunting task in 8 minutes. My strategy for this was based on much reflection after completing a course in Cryptography last year… I was seeking out the kernel of what is required to understand the big picture. This brought me to analogies which connect the idea of a fingerprint to a unique frequency distribution, and information leak to differentials in the distribution. Hopefully this lends some intuitive weight when I explain the strength of the one-time pad… which is a lovely concept.
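The fingerprint analogy can be sketched in code. A Caesar shift relabels the letters but leaves the shape of the frequency distribution untouched, so the most common ciphertext letter gives the shift away. The sample message, the function names and the assumption that "E" is the most common plaintext letter are all mine, chosen for illustration.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase


def caesar_encrypt(message: str, shift: int) -> str:
    """Shift every letter by the same amount; leave other characters alone."""
    out = []
    for ch in message.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def letter_counts(text: str) -> Counter:
    """How often each letter appears: the 'fingerprint' of the text."""
    return Counter(ch for ch in text.upper() if ch in ALPHABET)


def guess_shift(ciphertext: str, assumed_top_letter: str = "E") -> int:
    """Assume the most common ciphertext letter stands for `assumed_top_letter`."""
    top_letter, _ = letter_counts(ciphertext).most_common(1)[0]
    return (ALPHABET.index(top_letter) - ALPHABET.index(assumed_top_letter)) % 26


if __name__ == "__main__":
    secret = caesar_encrypt("MEET ME HERE BEFORE SEVEN", 3)
    print(secret)
    print(letter_counts(secret).most_common(3))
    print(guess_shift(secret))
```

With a one-time pad every letter gets its own random shift, so the ciphertext counts flatten out and there is no fingerprint left to match.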

I recently launched Art of the Problem on Kickstarter.com, an interesting new crowdsourced funding network for creative projects. A total of 88 people pledged to the project and my $4,000 funding goal was reached. As of today (May 13th, 2011) production work has commenced on the pilot episode titled ‘Gambling with Secrets’, soon to be available at www.youtube.com/artoftheproblem.