Beat Induction and Analysis with Robotic Percussive Improvisation Systems

This report explores past and current research in beat induction, rhythm recognition and percussive improvisation. First we must ask: why? Most current robotic systems focus simply on sound production (mechanically reproducing a set of static instructions) and rarely address perceptual aspects of musicianship such as listening, analysis, improvisation and interaction with human input. Some breakthrough work on improvisation and robotic percussion has been done in the last few years, and it is explored here. We first introduce the problem of beat induction and the various methods by which it can be achieved. We then examine how this problem applies to our robotic system. Finally, we show how state-of-the-art research into similar robotic systems can serve as a guide and stepping stone for our project.


To stay aligned with the scope of our mechatronic system, I must clarify how this research will be applied to our project. For our working prototype, the initial goal is to extract basic tempo information (the downbeat) from a microphone input by first performing a form of pulse code modulation on the input signal (converting the audio input into a binary bit stream representing note onset times). This information will be analyzed and used to actuate a drum stick against a ride cymbal at a beat which can vary dynamically. I will review past and current research on the beat induction problem in order to understand the theory and concepts involved. To proceed, we must develop a firm grasp of concepts such as beat, tempo, meter, density and the process of beat induction.
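As a minimal sketch of the prototype goal described above, the following Python turns a binary onset stream into a tempo estimate using the median inter-onset interval. The function names, the tick length and the assumption that every hit falls on a beat are illustrative choices of mine, not part of the planned circuit or firmware.

```python
from statistics import median


def onset_times(bits, tick_seconds):
    """Return the time in seconds of every 1 in a binary onset stream."""
    return [i * tick_seconds for i, bit in enumerate(bits) if bit]


def estimate_bpm(bits, tick_seconds):
    """Estimate tempo from the median inter-onset interval.

    Assumes (for illustration only) that each onset is one beat.
    Returns None when fewer than two onsets are present.
    """
    times = onset_times(bits, tick_seconds)
    if len(times) < 2:
        return None
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    return 60.0 / median(intervals)


# Hypothetical input: one tick is 125 ms and the drummer hits every 4 ticks,
# that is, once every 0.5 s.
stream = [1, 0, 0, 0] * 4
bpm = estimate_bpm(stream, 0.125)
print(bpm)  # 120.0
```

The median is used rather than the mean so that a single missed or doubled hit does not pull the estimate far from the underlying beat.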

Introduction

Beat can be defined as: “the sequence of equally spaced phenomenal impulses which define a tempo for the music…characterized by a period and a phase, that is, the distance between two beats and the temporal location of the first beat” [Gouyon Herrera 2004]

Dr. Honing, a leading researcher in AI programming, music and human-computer interaction, defines beat and beat induction as:

“Beat induction is the process in which a regular isochronous pattern (the beat) is activated while listening to music. This beat, often tapped along by musicians, is a central issue in time keeping in music performance. But also for non-experts the process seems to be fundamental to the processing, coding and appreciation of temporal patterns. The induced beat carries the perception of tempo and is the basis of temporal coding of temporal patterns. Furthermore, it determines the relative importance of notes in, for example, the melodic and harmonic structure.” [P. Desain H. Honing 2007]

Masataka Goto and Yoichi Muraoka explain not only the importance of beat tracking but also the need to properly evaluate the various methods which have been applied:

“Furthermore, beat tracking (also called beat induction, foot-tapping, or rhythm-tracking) has applications in such various fields as human-computer improvisation, video/audio editing, stage lighting control, and the synchronization of computer graphics with music.” [Masataka Goto 1997]

To summarize simply, have a look at the following pattern of lines and dots:

|..||…..|.|..||…..|.|..||…..|.|.|.|…|…|

“Do you see any emergent structure? Probably not. When you would listen to it, though, (e.g., the pattern being played from left to right, with every line being a 16th note and every dot a 16th rest) you would quickly hear a regular pattern -the beat-, and could probably easily tap your foot along with it. This relatively simple cognitive task is called beat induction or foot-tapping.” [Desain Honing 2004]

We must also define these related terms:

Rhythm is a temporal pattern with relationships between duration, accents and possibly structural interpretations [Dowling & Harwood, 1986].

Metre involves a ratio relationship between multiple time levels. One is the referent time level, the beat period, and the other is based on a static number of periods, the measure. It imposes an accent structure on beats and is fundamental to how we score music [Yeston 1976].

Tempo defines the rate at which beats occur, usually expressed in beats per minute.

A temporal pattern is a series of time intervals, without any interpretation or structure.

Density refers to the average rate of events taken across different durations (events per second) [Dowling & Harwood 1986].

Research and Discussion

Beat Induction

To date, many different methods have been applied to the beat induction problem, reflecting the variety of its applications: neural nets, rule-based search models and even coupled oscillators. One such method uses coupled relaxation oscillators to perform mechanically based beat induction. This research is fundamental in modeling networks of oscillators to find important temporal patterns, although it is not directly related to our problem [Eck 2001]. All of the related methods I will now look at are mathematically based.

In a recent paper, Christopher Raphael takes a set of note onset times and estimates the most likely global tempo using graphical methods. An important point is that his model accounts for expressive timing, which is the way actual note times and onsets deviate from a literal interpretation when performed by musicians [Raphael 2001]. There are two main components of expressive timing which we must be aware of. First, when a musician plays at a certain tempo it is never exact, but varies constantly throughout the performance. Second, there are always local variations (within single notes) which can be attributed to interpretive choices or, more simply, to mistakes. It is therefore very important to understand how a system responds to these problems.

Raphael explains how the time sequences forming the input, estimated from an audio signal, can be expressed as a sequence of time-tagged musical events (in our case these would be note onset times). This information is then graphed as real time vs. beat onset location (musical time); the slope of the resulting graph gives us an estimate of global tempo [Raphael 2001] (see Figure 1, left).
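The slope idea can be illustrated with an ordinary least-squares line fitted through (musical time, real time) points. This is only a sketch of the graph described above, not Raphael's hybrid graphical model; the function name and the sample onset data are hypothetical.

```python
def fit_tempo(beat_positions, onset_seconds):
    """Fit real time against musical time with a least-squares line.

    Returns (seconds_per_beat, beats_per_minute). The slope is the beat
    period in seconds, so the tempo in BPM is 60 divided by the slope.
    """
    n = len(beat_positions)
    if n < 2 or n != len(onset_seconds):
        raise ValueError("need two or more matched points")
    mean_x = sum(beat_positions) / n
    mean_y = sum(onset_seconds) / n
    sxx = sum((x - mean_x) ** 2 for x in beat_positions)
    if sxx == 0:
        raise ValueError("beat positions must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(beat_positions, onset_seconds))
    slope = sxy / sxx
    return slope, 60.0 / slope


# Hypothetical performance: five beats played with small timing deviations
# around a nominal 0.5 s beat period.
beats = [0, 1, 2, 3, 4]
seconds = [0.00, 0.52, 0.98, 1.51, 2.00]
period, bpm = fit_tempo(beats, seconds)
print(round(period, 3), round(bpm, 2))  # 0.499 120.24
```

Because the fit uses every onset at once, the small local deviations largely cancel out, which is one simple way of coping with the expressive timing discussed earlier.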


Figure 2 – Gouyon’s Algorithm Flow Diagram

However, we must recall that our system will detect audio only from a single snare drum, which can be described directly as a binary timing sequence, rather than from a symphony of subtle intervals and onsets. We are also aware of the problem of feedback: noise emitted by the actuator (the ride cymbal) returning into our input microphone. Nevertheless, we can simplify this process greatly by understanding our constraints. For instance, looking at Figure 2, we would have no need to do low-level feature extraction in our specific application.
Simon Dixon explores similar approaches to beat induction in his paper ‘Beat Induction and Rhythm Recognition’. He explains that the first stage of any rhythm recognition process is the detection of the beginnings of notes (onsets). Offsets are essentially ignored because they have no impact on rhythm or tempo [Dixon 1998] and, in the case of percussion, cannot even be specified.

“We use a sliding window short-time Fourier transform to create a frequency-time representation of the music, and search each frequency band for sudden increases in amplitude which may correspond to note onsets…we are able to achieve a high degree of time resolution, at the expense of a corresponding loss in frequency resolution… a window size of 16ms (128 samples), which gives a sufficiently high time resolution (and certainly finer than human perception),” [Dixon 1998]

In our case we will need to tune our electrical circuit to respond only above certain thresholds, because our input sound source is monophonic.
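Since our input is a single drum, a software analogue of this thresholding can be sketched without the per-band Fourier analysis Dixon describes. The example below swaps in a simpler time-domain technique: it measures the energy of consecutive 128-sample windows and reports an onset when the energy rises sharply. The 8 kHz sample rate is inferred from the quoted 16 ms / 128 samples; the rise ratio, noise floor and synthetic drum hit are assumptions chosen purely for illustration.

```python
import math

SAMPLE_RATE = 8000  # assumed: 128 samples per 16 ms window
WINDOW = 128


def frame_energies(samples, window=WINDOW):
    """Mean squared amplitude of each consecutive, non-overlapping window."""
    return [sum(s * s for s in samples[i:i + window]) / window
            for i in range(0, len(samples) - window + 1, window)]


def detect_onsets(samples, rise_ratio=4.0, floor=1e-4,
                  window=WINDOW, sample_rate=SAMPLE_RATE):
    """Return window start times (seconds) where energy jumps sharply.

    A window counts as an onset when its energy exceeds rise_ratio times
    the previous window's energy (or the noise floor, if that is larger).
    """
    energies = frame_energies(samples, window)
    onsets = []
    for k in range(1, len(energies)):
        reference = max(energies[k - 1], floor)
        if energies[k] > rise_ratio * reference:
            onsets.append(k * window / sample_rate)
    return onsets


def synthetic_hits(hit_samples, length, sample_rate=SAMPLE_RATE):
    """Build a test signal: a 0.1 s decaying 200 Hz burst at each hit."""
    signal = [0.0] * length
    for start in hit_samples:
        for n in range(start, min(length, start + sample_rate // 10)):
            t = (n - start) / sample_rate
            signal[n] += 0.8 * math.exp(-40 * t) * math.sin(2 * math.pi * 200 * t)
    return signal


# Two hypothetical drum hits, at 0.5 s and 1.0 s.
signal = synthetic_hits([4000, 8000], 12000)
onsets = detect_onsets(signal)
print(onsets)  # two onsets, each within one window (16 ms) of the true hit
```

The reported times are quantized to window boundaries, so the timing error is at most one window length, which matches the resolution argument in the quotation.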

Artificial Improvisation

An extension of our prototype will be research into more advanced forms of rhythm recognition and parsing. Rhythm parsing is the foundation of more complex musical analysis and understanding [Raphael 2001]. With this in mind, we plan to propose (both conceptually and through simulation) an approach to basic improvisational percussion algorithms. The interesting problem with improvisation is that although it must eventually be expressed through low-level mathematical relationships and patterns, it must first be interpreted through a higher-level appreciation of the art of music. Without understanding why something sounds appealing, we cannot possibly describe how to produce it. Again, recall that we are mapping an entire field of music onto a one-dimensional drum beat input. This simplification allows us to deal with only part of the problem, since we have no interest in melodic intervals (as we would with piano).

Instead of detailing the exact process by which I am using the research in this emerging field, as I did with beat induction, I will summarize the most important reference sources through which I am exploring the problem. (Research into computing improvisation began roughly two decades ago [Pressing 1987].)

A detailed analysis of improvisation by Jeff Pressing attempts to illuminate the process of musical improvisation by examining the modeling tools available from a number of different disciplines (drawing on psychology, neuropsychology, musicology, cognitive science, AI and the study of speech) [Pressing 1987]. The paper begins by explaining the psychology of improvisation in terms such as negative feedback and error correction. It then defines the importance of ‘skill’ and creativity in improvisation and how they are developed and measured. Finally, it models how this knowledge can be used to design computer programs which can improvise. At the time of publication (1987) this work was only just being proposed as a new problem in computer science.

The latest research (2007) into robotic systems using our approach was found at Georgia Tech, with a robotic system named ‘HAILE’, which is still an active project in their department [Weinberg 2006]. It has been identified as the most relevant solution to our particular problem definition. Learning from the research and experimentation done with HAILE, we can immediately understand the issues involved in beat detection and human-computer interaction. For example, the initial beat detection method was built on analyzing a beat from pre-recorded audio samples. This presented problems which would have been almost impossible to predict:
“However human players tend to adjust to the robot’s tempo, which leads to an unsteady input beat that is difficult for the software to follow…Haile therefore listens for a short period (5-10 seconds) and then locks the tempo before joining in”. [Weinberg 2007a]

The most up-to-date research paper written on HAILE is entitled ‘A Real-Time Genetic Algorithm in Human-Robot Musical Improvisation’ [Weinberg 2007b]. Genetic algorithms mimic the Darwinian logic of fitness evaluation and parent/child models (the details of genetic algorithms are defined in a pioneering paper by J. Grefenstette [Grefenstette 1986]). This is a fascinating attempt to mimic improvisation, using an initial base population of pre-programmed beats (they used 40 reference beats) evaluated by their fitness (similarity) to the input beat. They also apply some random noise (mutation) to the copying process. The method will be explored in depth in our final reports and a model will be applied to our system.
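To make the genetic approach concrete, here is a minimal sketch of the loop described above: a base population of 40 beats, fitness measured as similarity to the input beat, and mutation applied while copying. It is not HAILE's implementation. The 16-step binary beat encoding, the step-matching fitness function, the single-point crossover and all parameter values are assumptions made for illustration.

```python
import random

STEPS = 16  # assumed encoding: one bar of 16th notes, 1 = hit, 0 = rest


def fitness(beat, target):
    """Similarity to the input beat: the number of matching steps."""
    return sum(1 for a, b in zip(beat, target) if a == b)


def mutate(beat, rate, rng):
    """Flip each step with probability `rate` (the random copying noise)."""
    return [1 - bit if rng.random() < rate else bit for bit in beat]


def crossover(mother, father, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    cut = rng.randrange(1, len(mother))
    return mother[:cut] + father[cut:]


def evolve(population, target, generations=20, mutation_rate=0.05, seed=0):
    """Return the final population sorted from fittest to least fit.

    The fitter half survives unchanged each generation, so the best
    fitness never decreases; the other half is replaced by mutated
    children of randomly chosen survivors.
    """
    rng = random.Random(seed)
    population = [list(beat) for beat in population]
    for _ in range(generations):
        ranked = sorted(population, key=lambda b: fitness(b, target), reverse=True)
        parents = ranked[:len(ranked) // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   mutation_rate, rng)
            for _ in range(len(population) - len(parents))
        ]
        population = parents + children
    return sorted(population, key=lambda b: fitness(b, target), reverse=True)


# Hypothetical data: 40 random reference beats and one input beat.
setup_rng = random.Random(1)
base_population = [[setup_rng.randint(0, 1) for _ in range(STEPS)] for _ in range(40)]
input_beat = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
evolved = evolve(base_population, input_beat)
best = evolved[0]
print(fitness(best, input_beat), "of", STEPS, "steps match the input")
```

For improvisation one would probably run only a few generations, so that the response resembles the input without copying it exactly; that stopping rule is a design guess on my part rather than something taken from the HAILE paper.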

Conclusions and Recommendations

The Music Mind and Machine group clearly summarized the different beat induction problems we must now tackle at the International Computer Music Conference (ICMC) in Aarhus, Denmark [Desain 1994]:

1. The system must work in real time (efficient enough to deal with only incremental inputs of data).
2. The system must deal with the expressive timing inherent in real performance (as previously defined).
3. The system must be robust, recovering gracefully from errors.
4. The system must deal with response delay and must therefore be tuned according to careful temporal planning.

This research allows us to clearly understand the problems we face and how they have been solved in different applications. It also allows us to follow current and future research in the field of robotic percussive improvisation and beat induction applications. Not only has our problem been defined as state of the art, but we have been able to understand how the constraints of our application can greatly simplify the final implementation.

References

P. Gouyon and P. Herrera. (2004). A Beat Induction Method for Musical Audio Signals. Music Technology Group. University Pompeu (1), 1-7.

Peter Desain and Henkjan Honing. (1991). Tempo curves considered harmful (part 1). Array, 11(3), 8-9.

Peter Desain and Henkjan Honing. (2004). “The Beat Finding Shoe”. Available: Last accessed 1 Nov 2007.

Peter Desain. (1994). General beat/meter induction. Available: Last accessed Oct 23 2007.

Peter Desain and Henkjan Honing. (2007). Computational Models of Beat Induction: The Rule-Based Approach. Canadian Research Knowledge Network. 17 (14), 1-10.

Simon Dixon. (1998). Beat Induction and Rhythm Recognition. Department of Computer Science, University of South Australia. (1), 1-8.

W. J. Dowling and D. L. Harwood. (1986). Music Cognition. London: Academic Press.

Douglas Eck. (2001). A Network of Relaxation. Institute Dalle Molle di Studi Sulli Intelligenza Artificiale. IDSIA-06-01 (1), 1-10.

Masataka Goto and Yoichi Muraoka. (1997). Issues in Evaluating Beat Tracking Systems. Workshop on Issues in AI and Music. 17 (1), 1-16.

J. Grefenstette. (1986). Optimization of control parameters for genetic algorithms. IEEE Transactions on Systems, Man and Cybernetics. 16 (1), 122-128.

M. Yeston. (1976). The stratification of musical rhythm. New Haven CT: Yale University Press.

Jeff Pressing. (1987). Improvisation: Methods and Models. Available: Last accessed 28 Oct 2007.

Christopher Raphael. (2001). A hybrid graphical model for rhythmic parsing. Department of Mathematics and Statistics, University of Massachusetts. IDSIA-06-01 (1), 1-10.

P. Trilsbeek and H. van Thienen. (1999). Quantization for notation: Methods used in commercial music software, in 106th Audio Engineering Society Conference, Munich.

Gil Weinberg and Scott Driscoll. (2006). Toward Robotic Musicianships. Georgia Institute of Technology. 1-28.

Gil Weinberg, Scott Driscoll and Mitchell Parry. (2007a). HAILE – An Interactive Robotic Percussionist. Available: Last accessed Oct 25 2007.

G. Weinberg, M. Godfrey, A. Rae and J. Rhoads. (2007b). A Real-Time Genetic Algorithm in Human-Robot Musical Improvisation. Georgia Tech. 1 (1), 108. (Also online at:
