Modelling the Human Auditory System

By modelling the human auditory system in neuromorphic hardware, we not only better understand the complex pathways through which humans process sound, but also create auditory processors for our mobile devices that outperform state-of-the-art auditory systems in real-world environments.

The videos below show the output of a real-time software simulation of a model of the human cochlea (your inner ear). This model, called CAR-FAC, was developed by Dick Lyon and is described in his book, Human and Machine Hearing. Dick has made the PDF of this book available online. We also developed some Jupyter notebooks to better understand this model. You can find these here and play with them yourself. We have also created a real-time hardware implementation of this model.

For the videos below, we simulated the CAR-FAC model with 100 sections spaced between 50 Hz (bottom) and 12,000 Hz (top). The yellow line on the right shows the simulated vibration of the basilar membrane in response to sound. We then added a very simple model of the auditory nerve fibres (spiral ganglion cells), modelling each with a leaky integrate-and-fire neuron. These neurons generate nerve pulses (spikes), which we count per time window (23.22 ms) and display, with a higher count giving a brighter colour. The most recent time window is displayed on the right, next to the yellow line, and older time windows scroll to the left. In this way, the video displays recent activity on the auditory nerve, simulating what yours is doing when you listen to music. Note that a singing voice typically has two to four very active frequencies, with the lowest one representing the note being sung and the higher ones changing with the lyrics. Different instruments produce very different responses across the frequency spectrum.
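The spike-counting step described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the videos: the function name `lif_spike_counts`, the time constant, the threshold and the reset-to-zero rule are all assumptions, and the input `drive` stands in for a rectified per-section cochlear output. The 1024-sample window is also an assumption; at a 44.1 kHz sample rate it works out to roughly the 23.22 ms window mentioned above.

```python
import numpy as np

def lif_spike_counts(drive, fs=44100, window=1024, tau=0.001, threshold=1.0):
    """Count leaky integrate-and-fire spikes per time window, per channel.

    drive: array of shape (channels, samples) holding a non-negative input
           for each cochlear section (a hypothetical stand-in for the
           model's output).
    Returns an integer array of shape (channels, samples // window).
    All parameter values are illustrative assumptions.
    """
    n_channels, n_samples = drive.shape
    n_windows = n_samples // window
    decay = np.exp(-1.0 / (fs * tau))       # per-sample membrane leak
    v = np.zeros(n_channels)                # membrane potential per neuron
    counts = np.zeros((n_channels, n_windows), dtype=int)
    for t in range(n_windows * window):
        v = decay * v + drive[:, t]         # leak, then integrate the input
        fired = v >= threshold              # neurons that reach threshold
        counts[fired, t // window] += 1     # tally the spike in its window
        v[fired] = 0.0                      # reset the neurons that fired
    return counts

# Example: one silent channel and one channel driven by a rectified 440 Hz tone.
t = np.arange(4 * 1024) / 44100
drive = np.vstack([np.zeros_like(t), np.maximum(np.sin(2 * np.pi * 440 * t), 0)])
counts = lif_spike_counts(drive)            # one column per time window
```

Each column of `counts` corresponds to one time window; mapping the counts to brightness and appending new columns on the right would give a scrolling display like the one in the videos.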

https://www.youtube.com/embed/sRHC9C25CNs

This video shows the output of the CAR-FAC model in response to "54-46" by Toots and the Maytals. This band is said to be the first to use the term Reggae.

In the intro, compare the response generated by the voice and by the keyboard. One of our favourite bits starts at 1:07 in the video.

https://www.youtube.com/embed/I_CnLw7Nv18

This video is a shout-out to Lloyd Watts, who was the first to use "They can't take that away from me" by Ella Fitzgerald and Louis Armstrong in a live demo of his cochlea simulation. The vibrato in her voice and the way she bends up to hit her notes are nicely visible in the wiggles and bends of the frequency components of her voice, and the trumpet is clear in the higher frequencies.

https://www.youtube.com/embed/i8-T0dFiC_E

Finally, a classic. This video was inspired by the animated score of Vivaldi's "Winter" from the "Four Seasons". We used the same performance by the US Air Force Band.
