Why Study Rhythm?

Analyzing and modelling the perception of musical rhythm provides insights into non-verbal knowledge representations, quantification of musicological theories, and intelligent tools for music performance and composition. (L. M. Smith)
Understanding the workings of the human mind is one of the great scientific frontiers of our time. One of the few paths into the brain is the auditory system, and discovering the boundaries between auditory cognition, perception, and the signals that arrive at our ears is a way to probe at the edges of our understanding. Building models that try to mimic particular human abilities is a great way to proceed: when the models are successful they lead to better algorithms and to new applications. When the models fail they point to places where deeper understanding is needed. Studying the rhythmic aspects of music is one piece of this larger puzzle.

Three important aspects of rhythmic phenomena are their nonverbal nature, their relationship with motor activity, and their relationship with time. First, rhythmic knowledge is nonverbal, yet it operates in a hierarchical, multi-tiered fashion analogous to language, with "notes" instead of "phonemes" and "musical phrases" instead of "sentences." Rhythmic phenomena express a kind of meaning that is difficult to put into words, just as words express a kind of meaning that is difficult to express in rhythm.

Second, rhythmic activities are closely tied into the motor system, and there is an interplay between kinesthetic "meaning" and "memory" and other kinds of meaning and memory. From the work song to the dance floor, the synchronization of activities is a common theme in human interactions that can help to solidify group relationships.

Third, rhythmic activities are one of the few ways that humans interact with time. We sense light with our eyes and sound with our ears. But what organ senses the passage of time? There is none, yet we clearly do know that it is passing. Gibson concludes that time is an intellectual achievement, not a perceptual category. By observing how time appears to pass, Kramer explores the interactions between musical and absolute time, and shows how musical compositions can interrupt or reorder time as experienced. Indeed, Chapter 10 shows very concretely how such reorderings can be exploited as compositional elements. In arguing that music and time reveal each other, Langer states elegantly that music "makes time audible."

How do we learn about time? Children playing with blocks are learning about space and spatial relationships. Talking, singing, and listening to speech and music teach about time and temporal relationships. Jody Diamond's comments about gamelan music apply equally well to the study of rhythm in general:

The gamelan as a learning environment is well suited to some important educational goals: cooperative group interaction, accommodation of individual learning styles and strengths, development of self-confidence, creativity...
Rhythm and Transforms focuses on a few of the simplest low-level features of musical rhythms, such as the beat, the pulse, and the short phrase, and attempts to create algorithms that can emulate the ability of listeners to identify these features. We take a strictly pragmatic viewpoint in trying to relate things we can measure to things we can perceive; these correlations demonstrate neither cause nor effect. The models are essentially mathematical tricks that may be applied to sound waveforms, and the signal processing techniques emphasize properties inherent in the signal prior to perceptual processing.

Nonetheless, as the discussion throughout this chapter suggests, the models are often inspired by the operation of the perceptual mechanisms (or, more accurately, guesses as to how the perceptual mechanisms might operate). For example, Chapters 5, 6, and 7 explore mathematical models of periodicity detection. To make these applicable to musical signals, a kind of perceptual preprocessing is applied that extracts certain elementary features from the waveform. These derived quantities (like the feature vector of Fig. 9) feed the periodicity detection. Similarly, Chapter 7 describes an un-biological method of beat extraction from musical signals based on a Bayesian model. These methods function in concert with perceptually inspired features that are extracted from the musical signal.
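To give a concrete flavor of what such periodicity detection looks like, here is a minimal sketch in Python. The function name, the autocorrelation approach, and the tempo limits are illustrative assumptions, not the algorithms of Chapters 5-7; the sketch only shows the basic shape of the idea, estimating the dominant period of a precomputed feature vector (such as an onset-strength envelope).

```python
import numpy as np

def estimate_tempo(feature, rate, min_bpm=40, max_bpm=240):
    """Estimate the dominant periodicity of a 1-D feature vector
    (e.g., an onset-strength envelope sampled at `rate` frames per
    second) by autocorrelation. Hypothetical illustration only."""
    x = feature - feature.mean()          # remove the mean so lag 0 does not dominate
    ac = np.correlate(x, x, mode="full")  # full autocorrelation of the feature
    ac = ac[len(ac) // 2:]                # keep the non-negative lags
    # Restrict the search to lags corresponding to plausible tempos.
    lo = int(rate * 60.0 / max_bpm)
    hi = int(rate * 60.0 / min_bpm)
    lag = lo + int(np.argmax(ac[lo:hi]))  # lag with the strongest self-similarity
    return 60.0 * rate / lag              # period expressed in beats per minute
```

Real musical signals rarely yield a single clean peak; the methods of Chapters 5-7 address exactly the ambiguities (multiple metrical levels, noisy features, changing tempo) that this sketch ignores.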

Several new and exciting applications open up once the foot-tapping machine of Fig. 2 can reliably locate the beats and basic periodicities of a musical performance:

Musical Editing:
Identification of beat boundaries allows easy cut-and-paste operations when editing musical signals.
An Intelligent Drum Machine:
Typical drum machines are preprogrammed to play rhythms at predefined speeds and the performers must synchronize themselves to the machine. A better idea is to build a drum machine that can "listen" to the music and follow the beat laid down by the musicians.
External Synchronization:
Beat identification enables automated synchronization of the music with light effects, video clips, or any kind of computer controlled system. This may be especially useful in the synchronization of audio to video in film scoring.
A Tool for Disc Jockeys:
Any identified levels of metrical information (as fast as the tatum or as slow as the phrase) can be used to mark the boundaries of a rhythmic loop or to synchronize two or more audio tracks.
Music Transcription:
Meter estimation is required for time quantization, an indispensable subtask of transcribing a musical performance into a musical score.
Beat-Based Signal Processing:
Beats provide natural boundaries in a musical signal which can be used to align a variety of signal processing techniques with the music. For example, filters, delays, echoes, and vibratos (as well as other operations) may exploit beat boundaries in their processing; a small sketch of one such beat-synchronized effect follows this list. This is discussed in Chapter 9, where appropriate algorithms are derived.
Beat-Based Musical Recomposition:
Automatic identification of beat boundaries allows composers to easily work at the beat level, an underexplored compositional level. Several surprising techniques are discussed and explored in Chapter 10.
Information Retrieval:
The standard way to search for music (on the web, for instance) is to search metadata such as file names, ID3 tags, and keywords. It would be better to be able to search using melodic or rhythmic features, and techniques such as beat tracking may help to make this possible.
Score Following:
In order for a computer program to follow a live performer and act as a responsive accompanist, it needs to sense and anticipate the location of musically significant points such as beat boundaries and measures.
Personal Conducting:
By combining beat tracking with an input device (such as a wand that senses position and/or acceleration) and a method of slowing or speeding the sound (such as a phase vocoder, see Chapter 5), the listener can "conduct" the music at a desired tempo and with the desired expressive timing.
Speech Processing:
Rhythm plays an important role in speech comprehension because it can help to segment connected speech into individual phrases and syllables.
Visualization Software:
Designed to augment the musical experience by presenting appropriate visuals on a screen, visualization software is a popular adjunct to computer-based music players. Many of these relate the visuals to the music using the amplitude of the audio signal (so that, for instance, louder passages move faster), the shape of the waveform, or various transforms. It would clearly be preferable for the visuals to also synchronize with the beat of the piece.
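As promised under "Beat-Based Signal Processing," here is a minimal sketch of a beat-synchronized effect in Python. The function name, parameters, and normalization are illustrative assumptions, not the algorithms derived in Chapter 9: the idea is simply that once a beat tracker has supplied the beat times, an echo can use the local beat period as its delay so that the repeats land on the beat.

```python
import numpy as np

def beat_synchronous_echo(audio, sr, beat_times, mix=0.5):
    """Add an echo whose delay equals the local beat period.
    `audio` is a 1-D array at sample rate `sr`; `beat_times` are
    beat locations in seconds (e.g., from a beat tracker).
    Hypothetical illustration, not the book's algorithm."""
    out = audio.astype(float).copy()
    for t0, t1 in zip(beat_times[:-1], beat_times[1:]):
        delay = int((t1 - t0) * sr)       # one beat period, in samples
        start = int(t0 * sr)
        end = min(start + delay, len(out) - delay)
        # Echo the current beat one beat later, scaled by `mix`,
        # so each repeat falls on the following beat.
        out[start + delay:end + delay] += mix * audio[start:end]
    peak = np.max(np.abs(out))
    return out / peak if peak > 1 else out  # avoid clipping
```

Because the delay is recomputed from the tracked beats rather than fixed in milliseconds, the effect stays locked to the music even as the tempo drifts, which is precisely what beat tracking makes possible.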