Overview of Rhythm and Transforms

Rhythm and Transforms has three parts: chapters about music theory, practice, and composition; chapters about the psychology and makeup of listeners; and chapters about the technologies involved in finding rhythms.

Music: Chapter 2 discusses some of the many ways people think about and notate rhythmic patterns. Chapter 3 surveys the musics of the world and shows many different ways of conceptualizing the use of rhythmic sound.

Perception: The primary difficulty with the automated detection of rhythms is that the beat is not directly present in the musical signal; it is in the mind of the listener. Hence it is necessary to understand and model the basic perceptual apparatus of the listener. Chapter 4 describes some of the basic perceptual laws that underlie rhythmic sound.

There are three approaches to the beat-finding problem: transforms, adaptive oscillators, and statistical methods. Each makes a different set of assumptions about the nature of the problem, uses a different kind of mathematics, and has different strengths, weaknesses, and areas of applicability. Despite the diversity of the approaches, there are some common themes: the identification of the period and the phase of the rhythmic phenomena, and the use of certain kinds of optimization strategies.

Transforms: The transforms of Chapter 5 model a signal as a collection of waveforms of a particular form. The Fourier transform presumes that the signal can be modelled as a sum of sinusoidal oscillations. Wavelet transforms operate under the assumption that the signal can be decomposed into a collection of shifted and stretched copies of a single mother wavelet. The periodicity transforms presume that the signal contains a strong periodic component and decompose it under this assumption. When these assumptions hold, the methods have a good chance of working well in the search for repetitive phenomena. When the assumptions fail, so do the methods.
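As a small illustration of the Fourier assumption, the sketch below recovers the period and phase of a signal that really is a single sinusoid plus a constant. The names dft and dominant_period, and the test signal itself, are assumptions made for this example and are not taken from Chapter 5. The period is read from the strongest nonzero frequency bin and the phase from the angle of that bin.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def dominant_period(x):
    """Return (period in samples, phase in radians) of the strongest bin.

    The DC bin (k = 0) is skipped, and only bins up to the Nyquist
    frequency are searched.
    """
    n = len(x)
    spectrum = dft(x)
    k_best = max(range(1, n // 2 + 1), key=lambda k: abs(spectrum[k]))
    return n / k_best, cmath.phase(spectrum[k_best])

# Hypothetical test signal: an offset cosine that repeats every 8 samples.
N = 64
TRUE_PERIOD = 8
TRUE_PHASE = 0.5
signal = [1.0 + math.cos(2 * math.pi * t / TRUE_PERIOD + TRUE_PHASE)
          for t in range(N)]

period, phase = dominant_period(signal)
print(period, phase)
```

The example works cleanly because the signal matches the model: it is sinusoidal and its period divides the window length. A signal that violates either assumption spreads its energy over many bins, which is one concrete sense in which the method fails when its assumptions do.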

Adaptive Oscillators: The dynamical system approach of Chapter 6 views a musical signal (or a feature vector derived from that signal) as a kind of clock. The system contains one or more oscillators, which are also a kind of clock. The trick is to find a way of coupling the music-clock to the oscillator-clock so that they synchronize. Once synchronization is achieved, the beats can be read directly from the output of the synchronized oscillator. Many such coupled-oscillator systems are in common use. Phase-locked loops are dynamic oscillators that synchronize the carrier signal at a receiver to the carrier signal at a transmitter. The "seek" button on a radio engages an adaptive system that scans through a large number of possible stations and locks onto one that is powerful enough for clear reception. Timing recovery is a standard trick used in cell phones to align the received bits into sensible packets. Clever system design within the power grid automatically synchronizes the outputs of electrical generators (rotating machines that are again modelled as oscillators) even though they may be thousands of miles apart. Thus synchronization technologies are well developed in certain fields, and there is hope that insights from these may be useful in the rhythm-finding problem.
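The coupling idea can be sketched with a toy discrete-time loop, loosely in the spirit of a phase-locked loop. This is an illustrative assumption, not the oscillator model of Chapter 6: the function track_beats, its two gains, and the idealized onset list are all hypothetical. The oscillator predicts the next beat, compares the prediction with the observed onset, and nudges both its phase and its period toward agreement.

```python
def track_beats(onsets, period, phase_gain=0.5, period_gain=0.2):
    """Synchronize a simple oscillator to a list of onset times.

    onsets: observed event times in seconds, one per beat (an idealization).
    period: initial guess at the beat period in seconds.
    Returns the adapted period and the predicted time of the next beat.
    """
    next_beat = onsets[0] + period  # first prediction, made from the guess
    for onset in onsets[1:]:
        error = onset - next_beat           # timing error of the prediction
        period += period_gain * error       # adapt the oscillator's rate
        next_beat += phase_gain * error + period  # correct phase, step ahead
    return period, next_beat

# Hypothetical input: onsets every 0.5 s (120 beats per minute),
# while the oscillator starts out too slow at 0.6 s per beat.
onsets = [0.5 * k for k in range(40)]
period, next_beat = track_beats(onsets, period=0.6)
print(period, next_beat)
```

With these gains the timing error decays geometrically, so the period settles near 0.5 s and the next predicted beat near 20.0 s. Real performances have missing and extra onsets, which this sketch does not handle.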

Statistical Methods: The models of Chapter 7 relate various characteristics of a musical signal to the probability of occurrence of features of interest. For example, a repetitive pulse of energy at equidistant times is a characteristic of a signal that is likely to represent the presence of a beat; a collection of harmonically related overtones is a characteristic that likely represents the presence of a musical instrument playing a particular note. Once a probabilistic (or generative) model is chosen, techniques such as Kalman filters and Bayesian particle filters can be used to estimate the parameters within the models, for instance, the times between successive beats.
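To make the estimation step concrete, here is a minimal scalar Kalman filter that tracks the time between successive beats. The generative model is an assumption chosen for this sketch (the beat period drifts as a random walk and each measured interval equals the period plus noise); it is not claimed to be the model of Chapter 7. The function name kalman_period, the variances, and the measured intervals are likewise hypothetical.

```python
def kalman_period(intervals, initial_period, initial_var=1.0,
                  process_var=1e-4, meas_var=1e-2):
    """Estimate a slowly drifting beat period from noisy intervals.

    Assumed model (hypothetical):
        period[n+1] = period[n] + process noise (variance process_var)
        interval[n] = period[n] + measurement noise (variance meas_var)
    Returns the sequence of period estimates, one per measurement.
    """
    estimate, variance = initial_period, initial_var
    history = []
    for z in intervals:
        variance += process_var                 # predict: uncertainty grows
        gain = variance / (variance + meas_var) # weight given to new data
        estimate += gain * (z - estimate)       # update toward measurement
        variance *= (1.0 - gain)                # uncertainty shrinks
        history.append(estimate)
    return history

# Hypothetical measured inter-onset intervals scattered around 0.5 s.
intervals = [0.52, 0.48, 0.51, 0.49, 0.50, 0.53,
             0.47, 0.50, 0.51, 0.49, 0.50, 0.50]
estimates = kalman_period(intervals, initial_period=0.6)
print(estimates[-1])
```

The large initial variance tells the filter to distrust the starting guess of 0.6 s, so the first measurement pulls the estimate almost all the way to 0.52 s. Later measurements are averaged with progressively smaller gains.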

Beat Tracking: Chapter 8 applies the three technologies for locating rhythmic patterns (transforms, adaptive oscillators, and statistical methods) to three levels of processing: to symbolic patterns where the underlying pulse is fixed (e.g., a musical score), to symbolic patterns where the underlying pulse may vary (e.g., MIDI data), and to time series data where the pulse may be both unknown and time varying (e.g., feature vectors derived from audio). The result is a tool that tracks the beat of a musical performance.

Beat-Based Signal Processing: The beat timepoints are used in Chapter 9 as a way to intelligently segment the musical signal. Signal processing techniques can then be applied on a beat-by-beat basis: beat-synchronized filters, delay lines, and special effects; beat-based spectral mappings with harmonic and/or inharmonic destinations; and beat-synchronized transforms. This chapter introduces several new kinds of beat-oriented sound manipulations.
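The segmentation step might look like the following sketch, which cuts a list of samples at the beat timepoints and applies an effect to each beat separately. The helpers segment_by_beats and per_beat, the tiny sample rate, and the reversal effect are assumptions chosen to keep the example readable; they are not the processing methods of Chapter 9.

```python
def segment_by_beats(samples, beat_times, sample_rate):
    """Split samples into one segment per interval between beat times."""
    bounds = [round(t * sample_rate) for t in beat_times]
    return [samples[a:b] for a, b in zip(bounds, bounds[1:])]

def per_beat(segments, effect):
    """Apply the same effect independently to every beat-long segment."""
    return [effect(segment) for segment in segments]

# Hypothetical data: 16 samples at 8 samples per second, beats every 0.5 s.
samples = list(range(16))
beat_times = [0.0, 0.5, 1.0, 1.5, 2.0]
segments = segment_by_beats(samples, beat_times, sample_rate=8)

# Example effect: play each beat backwards.
reversed_beats = per_beat(segments, lambda segment: segment[::-1])
print(segments)
print(reversed_beats)
```

Any function that maps one segment to another (a filter, a delay, a spectral mapping) could take the place of the reversal.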

Beat-Based Musical Recomposition: Chapter 10 shows how the beats of a single piece may be rearranged and reorganized to create new structures and rhythmic patterns, including beat-based "variations on a theme." Beats from different pieces can also be combined in a cross-performance synthesis.
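Once a piece has been cut into beats, rearrangement reduces to reordering a list. The sketch below is a hypothetical illustration: the functions recompose and interleave and the placeholder beat contents are assumptions for this example, not procedures from Chapter 10.

```python
def recompose(beats, order):
    """Concatenate beats in the given order; indices may repeat or skip."""
    output = []
    for index in order:
        output.extend(beats[index])
    return output

def interleave(beats_a, beats_b):
    """Alternate beats from two pieces, stopping at the shorter one."""
    output = []
    for a, b in zip(beats_a, beats_b):
        output.extend(a)
        output.extend(b)
    return output

# Placeholder "audio": each beat is a short list of labeled samples.
piece_a = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"], ["d1", "d2"]]
piece_b = [["w1", "w2"], ["x1", "x2"]]

variation = recompose(piece_a, [0, 0, 2, 1, 3])  # repeat beat 0, swap 1 and 2
hybrid = interleave(piece_a, piece_b)
print(variation)
print(hybrid)
```

In practice the segments would hold audio samples, and the splice points would usually need crossfading to avoid clicks.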

Beat-Based Rhythmic Analysis: Traditional musical analysis often focuses on the use of note-based musical scores. Because scores exist for only a small subset of the world's music, it is helpful to be able to analyze performances directly, to probe both the literal and the symbolic levels. Chapter 11 creates skeletal rhythm scores that capture some of the salient aspects of the rhythm. By conducting analyses in a beat-synchronous manner, it is possible to track changes in a number of psychoacoustically significant musical variables.