The accounts of musical pitch - of melody and harmony - that these writings provide often appear as prescriptive rules and definitions which sanction the forms that musical pitch structure may take. They will describe: (i) what collections of notes belong together (in defining major and minor scales); (ii) which notes from these collections are more important than others and are thus likely to occur at the beginning or end of successions of notes, or melodies (in defining the tonal functions - tonic, dominant etc. - of the different scale notes); (iii) which notes from the collections may occur simultaneously with others (in defining triadic chord configurations); (iv) which simultaneities are functionally similar (in defining inversional and substitutive equivalence of chords); (v) which keys are more closely, and which are more remotely, related; (vi) the ways in which the tonal function of notes in a melody determines which notes will succeed one another (as in the rules of voice-leading); and (vii) the ways in which principles of combining notes in chords and melodies will interact.
It would seem logical that the highly elaborated theories about pitch organisation expounded in these writings should have played a major role in explorations of the psychology of music from the outset. However, the prevailing view of musical pitch within the psychology of music through the first half of this century appears highly reductionist, and can be summarised in Seashore's (1938/1967) statements that (p17) "The terms `frequency'... `cycles' and `waves' are synonymous, and may be used interchangeably to designate frequency and pitch." [my italics] and that (p384) "When we have measured the sense of pitch, that is, pitch discrimination, in the laboratory with high reliability and we know that pitch was isolated from all other factors, no scientist will question but that we have measured pitch."[1]. In other words, pitch as a psychological phenomenon was to be equated with the acoustical phenomenon of frequency; while phenomena not easily explicable in terms of acoustics (such as the tendency of performers to flatten or sharpen particular notes of the scale in particular melodic contexts) were recognised and investigated (see, e.g., Francès, 1958/1988), the rationales that were advanced for their existence generally remained unsystematic.
Nevertheless, music theory describes relations between sets of pitches (such as transposition, inversion, and even reduction) which can lead to two different sets being regarded as more or less similar even though their constituent frequencies are quite different. Musical pitch is commonly discussed in terms of melodies, themes, motifs and keys; many of these concepts prove to be remarkably intractable when an attempt is made to account for them in terms of relations between frequencies. Thus, if an understanding of the psychology of musical pitch is simply equated with an understanding of frequency so that, e.g., two pitches are different only if they correspond to different frequencies, it becomes difficult - if not impossible - to account for many of the notions that are taken for granted in music theory (although it may be questioned whether these notions have any "reality" outside music theory, a possibility considered later in this chapter). However, everyday musical experience, together with the results of many experimental studies, indicates that pitches or patterns of pitches may be perceived as similar even when a frequency-based account would indicate that they should be experienced as being quite different, and any scientific account of the experience of musical pitch must confront this issue.
With the development of cognitive psychology in the 1950s and 1960s a new concern to account for the psychological mechanisms underlying the cognition of musical pitch became evident. Over the last twenty years a vast body of research has grown that seeks to reconcile an understanding of the experience of musical pitch with its rich description in the musicological literature. Much of this research has focused on the development and empirical testing of a range of functionalist models of musical pitch cognition that can account for the diverse capacities exhibited by listeners and performers in respect of a wide variety of musical materials and circumstances. While most of these models are symbolic or sub-symbolic, being intended to reproduce cognitive capacities with no particular reference to specific neurophysiological structures, a few are more closely based on determinate properties of the auditory system. This chapter will outline the course of this recent research and will describe the different strands that comprise it in an attempt to provide a clear picture of the development of current views of musical pitch in cognition.
A particular configuration of Deutsch's (1969) system for abstracting equivalence information for musical intervals and chords. Individual pitches are represented on level 1; intervals and chords formed between specific pitches are represented on level 2; and classes of intervals and chords are represented on level 3.
An alternative and more elaborate account is provided by Christopher Longuet-Higgins, who in a series of papers (Longuet-Higgins, 1962a and b; 1976; 1979) proposed a formal model of "tonal space" intended to elucidate the perception of pitch relations in music. Longuet-Higgins' model represents pitches as points on an infinite two-dimensional plane, single steps being perfect fifths within one dimension and major thirds in the other. Diatonic major scales thus constitute asymmetrical and uninvertible L-shaped blocks formed by groups of seven proximate pitches (see Figure 2). In this way movement between notes within a key and movement between keys can be represented as shifts within and between regions of the two-dimensional plane.
Longuet-Higgins' model representing pitches as points on an infinite two-dimensional plane: single steps are perfect fifths in the y-dimension and major thirds in the x-dimension. The diatonic major scale on C is shown as an asymmetrical and uninvertible L-shaped region.
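The layout of the plane can be sketched in a few lines. This is only an illustration, assuming that a step of a perfect fifth multiplies frequency by 3/2 and a step of a major third by 5/4 (just-intonation ratios); the function name `ratio` and the coordinate table `C_MAJOR` are hypothetical and are not taken from Longuet-Higgins' papers.

```python
from fractions import Fraction

def ratio(fifths, thirds):
    """Frequency ratio to the origin for a point in the plane, folded into one octave.

    Assumes just-intonation steps: 3/2 per perfect fifth, 5/4 per major third.
    """
    r = Fraction(3, 2) ** fifths * Fraction(5, 4) ** thirds
    while r >= 2:
        r /= 2
    while r < 1:
        r *= 2
    return r

# C major as an L-shaped block: four points in one row, three in the row above.
# Coordinates are (steps of a fifth, steps of a major third) relative to C.
C_MAJOR = {
    "F": (-1, 0), "C": (0, 0), "G": (1, 0), "D": (2, 0),
    "A": (-1, 1), "E": (0, 1), "B": (1, 1),
}

# List the seven notes in ascending order of their ratio to C.
for name, (fifths, thirds) in sorted(C_MAJOR.items(), key=lambda kv: ratio(*kv[1])):
    print(name, ratio(fifths, thirds))
```

Under these assumptions the seven proximate points yield the ratios of a justly tuned major scale, and adding the same offset to every coordinate moves the whole block to another region of the plane.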
Longuet-Higgins intends his model to constitute a component of a formal, computational, "competence theory" of tonal musical pitch: that is, a theory that makes explicit the rules and processes that underlie a listener's experience of tonal musical works. As such he has not sought to test the model experimentally; however, as we shall see below, his model has many similarities to that of Balzano (1980; 1982) which has been the focus of extensive experimental investigation.
A different model is suggested by Dowling (1982), whose concern with generalisability has led him to propose a system within which the regularities of musical pitch in cognition can be represented in a way that may be applied to music of many different cultures. Dowling suggests four "levels of analysis" of musical scales: the first, the psychophysical pitch function, maps pitches to frequencies; the second, the level of tonal material, incorporates all the available pitches within a culture's music; the third, the tuning system, is a subset of the pitches from the tonal material that is employed within sets of melodies; whilst at the fourth level, the modal scale, pitches of the tuning system are hierarchically organised to reflect the way in which pitches are used in an actual melody, with certain pitches being in some way "more important" than others (see Figure 3).
Dowling's four levels of analysis of musical pitch in cognition.
Each level within Dowling's model is formed by selecting pitches out of the next higher level, or by conferring different types of properties on them. Dowling intends this model to be capable of accounting for a range of sensitivities to pitch organisation while being sufficiently general to be applicable to the cognition of pitch structure in different types of music and in the music of a wide range of cultures.
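The nesting of the four levels can be illustrated with one Western instance. The sketch below assumes twelve-tone equal temperament with A4 at 440 Hz and MIDI-style note numbers; every name and every weight in it is a placeholder chosen for illustration, not part of Dowling's account.

```python
# Hypothetical Western instance of four nested levels; names and numbers are illustrative.

def psychophysical(midi_note, a4=440.0):
    """Level 1: relate a pitch to a frequency (12-tone equal temperament assumed)."""
    return a4 * 2 ** ((midi_note - 69) / 12)

# Level 2: tonal material, here every chromatic pitch in one octave from middle C.
TONAL_MATERIAL = list(range(60, 72))

# Level 3: tuning system, a subset of the tonal material (the white notes).
WHITE_NOTE_CLASSES = {0, 2, 4, 5, 7, 9, 11}
TUNING_SYSTEM = [n for n in TONAL_MATERIAL if n % 12 in WHITE_NOTE_CLASSES]

def modal_scale(tuning_system, tonic_class, weights):
    """Level 4: confer relative importance on the pitches of the tuning system.

    `weights` maps semitones above the tonic to an importance value;
    unlisted degrees default to 1. The values used below are placeholders.
    """
    return {n: weights.get((n - tonic_class) % 12, 1) for n in tuning_system}

C_MAJOR_MODE = modal_scale(TUNING_SYSTEM, 0, {0: 3, 7: 2, 4: 2})
A_MINOR_MODE = modal_scale(TUNING_SYSTEM, 9, {0: 3, 7: 2, 3: 2})
```

The two modal scales draw on exactly the same tuning system and differ only in the importance conferred on each pitch, which is the distinction between the third and fourth levels.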
Shepard proposed that this schema for musical pitch is best conceived of in terms of some multi-dimensional spatial representation. An early version of this representation (Shepard, 1964) took the form of a simple spiral (enabling representation of octave-equivalence). Shepard's more sophisticated 1982 version of the cognitive-structural model aims to account for many different types of perceived musical pitch relations in terms of the proximity of pitches within a complex spatial representation; spatial proximity is intended to model the degree to which pitches might be perceived as similar within a variety of musical contexts. This model makes use of three components, one being unidimensional (and correlating with differences in adjudged pitch height) and two being two-dimensional (taking the forms of the circle of fifths and the chroma circle). The circle of fifths simply consists of the notes of the equal-tempered chromatic scale laid out in a circle so that each pitch forms the enharmonic musical interval of a perfect fifth with the notes on either side of it, while the chroma circle is produced by treating octave-related notes as functionally identical, and consists of the notes of the equal-tempered chromatic scale laid out in a circle so that each pitch forms the musical interval of a semitone with the notes on either side of it.
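The two circular components can be generated directly from the twelve pitch classes. The following is a minimal sketch, assuming equal temperament with pitch classes numbered 0 to 11 from C and spelled with sharps only; the helper names are illustrative.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def circle(step):
    """Order the 12 pitch classes by repeatedly adding `step` semitones (mod 12)."""
    return [(i * step) % 12 for i in range(12)]

chroma_circle = circle(1)      # neighbours are a semitone apart
circle_of_fifths = circle(7)   # neighbours are a perfect fifth (7 semitones) apart

def circular_distance(order, a, b):
    """Number of steps between pitch classes a and b around a given circle."""
    d = abs(order.index(a) - order.index(b))
    return min(d, 12 - d)

print([NOTE_NAMES[pc] for pc in circle_of_fifths])
```

C and G are five steps apart on the chroma circle but adjacent on the circle of fifths, while C and C# are adjacent on the chroma circle and five steps apart on the circle of fifths; a representation that uses both circles can therefore treat either pair as close.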
Shepard points out that as well as representing octave equivalence, his model has other properties that appear musically significant; these are most easily described in terms of the three-dimensional structure formed by the pitch height and circle of fifths components, which takes the form of a double helix on the surface of a regular cylinder (see Figure 4a). In this representation the notes of any major diatonic key can be divided from the notes not in that key by passing a plane through the central axis of the double helix. Moreover, transposition into the most closely-related keys is achieved by the smallest angles of rotation of the dividing plane about the central axis. A further aspect of the representation, obtained by combining the (two-dimensional) pitch chroma component with the circle of fifths component to produce a four-dimensional torus, is shown in Figure 4(b). Shepard (1982) states that the five-dimensional structure that results from adding both two-dimensional components and the unidimensional pitch height component can account for perceived similarity of pitches which are close in pitch height, heightened similarity of pitches at the octave and the perfect fifth, as well as accounting for the separability of the pitches within the major diatonic key from non-key pitches and the rotational proximity of closely-related keys.
Regular cylinder produced by combining pitch height and circle of fifths dimensions.
Torus produced by combining the chroma circle and circle of fifths dimensions.
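The separability of key notes from non-key notes, and the link between rotation and key relatedness, can be checked on the circle-of-fifths projection alone. This sketch ignores the pitch height axis, so a dividing plane through the axis of the helix becomes a line across the circle; that simplification and all function names are assumptions made for illustration.

```python
FIFTHS = [(i * 7) % 12 for i in range(12)]   # circle of fifths as pitch classes
MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)          # semitones above the tonic

def major_key(tonic):
    return {(tonic + s) % 12 for s in MAJOR_STEPS}

def is_contiguous_arc(pcs):
    """True if the pitch classes fill one unbroken arc of the circle of fifths."""
    positions = sorted(FIFTHS.index(pc) for pc in pcs)
    gaps = [(positions[(i + 1) % len(positions)] - positions[i]) % 12
            for i in range(len(positions))]
    # An unbroken arc has every gap equal to 1 except the one gap that closes the circle.
    return sorted(gaps)[:-1] == [1] * (len(positions) - 1)

def rotation_between(tonic_a, tonic_b):
    """Steps the dividing line must turn around the circle to get from key a to key b."""
    d = abs(FIFTHS.index(tonic_a) - FIFTHS.index(tonic_b))
    return min(d, 12 - d)

def shared_notes(tonic_a, tonic_b):
    return len(major_key(tonic_a) & major_key(tonic_b))
```

For example, C major and G major are one step of rotation apart and share six notes, whereas C major and F# major are six steps apart and share only two.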
In this way Shepard provides an imagistic structural representation of musical pitch. It is intended to provide an account of the ways in which our cognitive mechanisms may function when we experience music; it is being proposed that the ways in which we represent musical pitch relations in our cognitions - our mental models, or tonal schemata - have properties similar to the model that he proposes. It integrates many of the features of perceived relations between pitches that experiments have shown to be significant; it would seem to be capable of explaining many of the observed judgmental and memory abilities of Western musical listeners. The model provides a coherent and logical basis for cross-octave chroma identity, for the "privileged" status that notes within a scale or key have in respect of one another, and for the idea of key-distance relations. Moreover, Shepard suggests that the model could be adjusted while retaining its fundamental attributes if the experimental evidence so required, e.g. by weighting different dimensions to account for the perception of tones and semitones as being equivalent in certain musical contexts (as experiments such as those of Dowling (1978) appear to indicate).
A hypothesis central to the approach is the idea that listeners make use of a cognitive representation of pitch organisation within which different pitches can be said to be differently stable. Stability is accounted for in terms of the degree to which a pitch can be said to be representative of a class or category of pitches. The idea of categorisation is an instance of a general principle that has been shown to operate in many cognitive domains (Rosch, 1978) and which contributes significantly to efficiency of cognitive functioning.
Krumhansl (1990, p19) suggests that "the relative stability of a tone will depend to some degree on its treatment within a particular compositional context...However, it is presumed that there is a more abstract, invariant hierarchy of stability that is typical of a musical style more generally, and that this more abstract hierarchy is an important characteristic contributing to the perceived stability of each tone within a complex musical sequence." Listeners brought up within a particular musical culture will progressively form this "tonal hierarchy" as they are exposed more and more to the music of their culture, as well as in the course of any formal musical training they undergo. The formation of the "tonal hierarchy" is thus held to depend both on processes of enculturation - most likely non-conscious - and on explicit and conscious learning.
A range of empirical studies conducted by Krumhansl and others supports the idea of a hierarchy of stability for pitches in cognition. Following major and minor scalic passages as well as triadic arpeggios and major and minor cadences as contexts, subjects were asked to give a numerical rating of how well subsequent single pitches - "probe tones" - fitted the contexts. It was found that subjects were highly consistent in the ratings that they accorded to particular pitches following particular contexts; the note that could be construed as the context's tonic would be likely to be rated most highly, next highest-rated were the dominant or mediant (depending on whether the context was, respectively, major or minor), then the other notes diatonic in respect of the context, and finally (and rated lowest) the non-diatonic notes (see Figure 5). Most significantly, these findings were replicated in a number of studies (for summary accounts see Krumhansl, 1990).
The "tonal hierarchies" evident in subjects' ratings of probe notes following major or minor contexts.
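The ordering reported for the probe-tone ratings can be summarised as four coarse tiers. The sketch below is an ordinal summary only: it does not reproduce any measured ratings, it groups the mediant and dominant in a single tier, and it uses the natural minor scale as a stand-in for the minor contexts, all of which are simplifying assumptions.

```python
MAJOR = (0, 2, 4, 5, 7, 9, 11)
NATURAL_MINOR = (0, 2, 3, 5, 7, 8, 10)   # assumption: natural minor stands in for "minor"

def tier(probe, tonic, mode="major"):
    """Rank a probe tone: 0 = tonic, 1 = mediant or dominant, 2 = other diatonic, 3 = non-diatonic."""
    scale = MAJOR if mode == "major" else NATURAL_MINOR
    degree = (probe - tonic) % 12
    if degree == 0:
        return 0
    if degree in (scale[2], scale[4]):   # third and fifth scale degrees
        return 1
    if degree in scale:
        return 2
    return 3

# Tiers for all twelve probe tones after a C major context.
print([tier(pc, 0) for pc in range(12)])
```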
Krumhansl (1990, Ch. 3) suggests two possible factors as giving rise to the tonal hierarchy: consonance and statistical distribution of tones. The former is taken as deriving from the degree to which two or more tones interact in the peripheral auditory sense-organs and hence give rise to a greater or lesser sensation of "roughness" or "sensory dissonance", while the latter derives from statistical analyses of pitch distribution in various corpuses of tonal music. It was found that tonal hierarchy rating profiles correlated strongly with statistical predictions based on frequency of pitch occurrence in tonal music, and (less strongly) with predictions based on sensory dissonance calculations; this finding implies that learning - whether formal, or by enculturation - is a stronger determinant of the tonal hierarchy than is the nature of the sensory-transductive mechanisms. Briefly, the cognitive-structuralist programme would suggest that the tonal hierarchy owes more to nurture than to nature (although nature's paternity cannot be disproved nor its contribution discounted).
A slightly different form of experimental study (Krumhansl, 1990, Chapter 5) sheds further light on the nature of the cognitive representation of the tonal hierarchy. In this, subjects heard two pitches preceded by a variety of contexts and judged how similar the first pitch was to the second. This afforded the possibility of providing a geometric model of listeners' cognitive representations of tonal relations between pitches, wherein pitches that were adjudged to be highly similar were represented by points close together within a multi-dimensional space, dissimilar pitches being far apart. Krumhansl (1990, p 119) acknowledges that an imagistic model (such as that proposed by Shepard, above) might be problematic in the strong assumptions that it makes about the satisfaction of metric axioms (see Tversky, 1977). She provides an alternative propositional version (1990, p 130), which enables her to account for the finding that the order of presentation of a pair of pitches that differ in their stability in respect of the preceding context affected the similarity judgments made. If the first pitch of a pair was less stable than the second (e.g., the pitches B then C in the context of a C major scale), it was judged to be more similar to the second pitch than if the order of presentation of the pitch-pair was reversed.
Krumhansl and her collaborators have extended the idea of a tonal hierarchy for pitches in cognition to chords and to keys, altering the probe-tone paradigm by changing the elements to be rated to triadic chords. In these experiments it should be noted that the tones used were synthesised so as to contain only octave-related partials and were spectrally scaled so that if a single ascending chromatic octave were to be played repeatedly, the effect of an infinite ascent would be perceived (Shepard-tones). This was done so as to avoid any putative effects of voice-leading between chords of a context and probe chords on adjudged fit, or, in Krumhansl's own words, to "minimise the effect of pitch height differences between the context and probe tones" (1990, p 26) and to "minimise the effects of melodic motion" (ibid., p 170). A hierarchical representation of harmonic function of single chords in tonal contexts was again found, but one with a structure different from that found for single notes (see Figure 6). Chords that function in the same key were found to be perceived as more closely related than chords that function in different keys even in the absence of contexts. Within the resulting "harmonic hierarchy", chords on the tonic, dominant and subdominant were found to be most highly rated, although Krumhansl suggests that these results show strong interdependencies between the three levels of musical structure: tones, chords and keys, an interdependence of the sort that Lerdahl's (1988) model seeks to express in formal terms.
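The construction of such tones can be sketched as follows. The envelope shape (a Gaussian over log-frequency) and every default parameter value are assumptions made for illustration; they are not the synthesis settings used in the experiments described above.

```python
import math

def shepard_partials(pitch_class, f_low=20.0, n_octaves=10, centre_octave=5.0, width=1.5):
    """Return (frequency, amplitude) pairs for one Shepard tone.

    Partials sit only at octave multiples of a base frequency. Amplitudes follow a
    fixed bell-shaped envelope over log-frequency, so the envelope stays put while
    the partials move under it. All default parameter values are illustrative.
    """
    partials = []
    for k in range(n_octaves):
        octaves_above_low = k + pitch_class / 12.0
        freq = f_low * 2 ** octaves_above_low
        amp = math.exp(-0.5 * ((octaves_above_low - centre_octave) / width) ** 2)
        partials.append((freq, amp))
    return partials

c_tone = shepard_partials(0)
after_twelve_semitones = shepard_partials(12)
```

After twelve ascending semitone steps the partials coincide with those of the starting tone shifted up by one place, and the partials at the extremes of the range carry very little amplitude, which is why a repeated chromatic ascent has no audible point of return.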
"Harmony-space" derived from subjects' ratings of chords in harmonic contexts.
Further studies conducted by Krumhansl have again modified the probe-tone technique, employing more complex contexts such as modulatory, bitonal or serial sequences, or short pieces of North Indian music. The results of the experiments on modulatory chord sequences were shown to be predictable from the results of earlier studies on the harmonic hierarchy in cognition, as they demonstrated that subjects developed a sense of key by integrating the possible harmonic functions of the individual chords over time. The results of the experiment on the perception of tonal structure in a bitonal context were more complex, indicating that subjects might well have employed aspects of the octatonic set (see van den Toorn, 1983) in their judgments. The experiment employing serial contexts (Krumhansl, Sandell and Sergeant, 1987) again found that listeners adapted the bases of their judgments of the "fittingness" of notes to suit the contexts. However, the ratings of a group of listeners familiar with serial music correlated negatively with local key implications, indicating that listeners inverted the normal tonal hierarchy to reflect the denial of key structures that they expected to be associated with 12-tone serial music. Interestingly, the responses of a group of listeners with very limited experience of atonal music did appear to show that these listeners were making use of some tonal hierarchy in their judgments; these listeners had apparently acquired an interpretive strategy and continued to apply it even when it might not be wholly appropriate.
The experiment employing North Indian contexts (Castellano, Bharucha and Krumhansl, 1984) used listeners who were placed into two matched groups according to whether they had extensive experience of both Indian and Western music or of Western music only. This experiment employed short pieces of music as contexts, and examined the relation between subjects' ratings of subsequent pitches and the theoretical hierarchies of pitches that characterise the heptatonic modal structures called thaats. Thaats can be grouped in a circle so that adjacent thaats differ in their make-up by one single pitch. The context pieces were pre-existing compositions that had been categorised as belonging to different thaats. It was found that the relative durations with which the various tones were sounded in the context pieces strongly affected listeners' responses. However, only the listeners with extensive prior exposure to Indian music showed any sensitivity to thaat structure in their responses; this sensitivity was structured in such a way that their responses to different pitches in the context of different compositions could be mathematically scaled so as to demonstrate the "circle of thaats" as underlying those responses.
Bharucha has further elucidated the processes that underlie listeners' sensitivity to pitch organisation through a series of experiments (Bharucha and Stoeckig, 1986, 1987) and by the development of a connectionist account of the perception of tonal structure (Bharucha 1987, 1991; Bharucha and Todd, 1991; Bharucha, 1994). Bharucha and Stoeckig's experiments demonstrated that even Western listeners with no formal training in music theory are sensitive to the tonal context of pitches; they showed that the speed and accuracy of processing a chord are greater when it is related to the preceding context than when it is unrelated, and explain this in terms of a process whereby the context generates expectancies, or "primes" the subject to expect that a related chord will follow.
Bharucha explains this phenomenon, and the fact that it appears to arise through processes of acculturation ("passive exposure"), in terms of the capacities of a distributed or connectionist representation of relations between pitches in cognition. He suggests (1987, 1991) that a neural net model (of which Deutsch's 1969 model can be seen as an early instance) should be able to exhibit the sensitivities to pitch structure shown by listeners through a process of unsupervised learning, and outlines just such a system, MUSACT. This model presupposes multiple levels of pitch representation; these range from the "spectral" level (reflecting many of the characteristics of the acoustical signal) to the "invariant pitch-class" level (a highly abstract level of representation in which pitch-classes are differentiated by tonal function). In this respect Bharucha addresses some of the issues that prompted Dowling's account (see above).
Units and links in Bharucha's MUSACT system.
The MUSACT network as described in Bharucha (1991) consists of units representing pitch classes, chords and keys (see Figure 7) that acquire their functions through a process of competitive self-organisation during exposure to musical pitches, whether presented as single pitches or as chords. The resulting network will then exhibit patterns of activation of its constituent units that reflect many of the characteristics evident in the perception of musical pitch. For example, a given chord unit may be activated by the activation of either all or just some of its component pitch class units; this latter case may derive from "top-down" activation by a parent key unit that has itself been activated by a preceding context, thus modelling the way in which inferences about tonal identity and function can be made on the basis of partial evidence. Bharucha (1994) sketches a further refinement of this model that builds on the work of Gjerdingen (1994) in order to account for the temporal dimension inherent in the idea of dissonance and resolution.
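The way in which partial or indirect evidence can activate a chord unit may be illustrated with a small hand-wired network. This is a toy sketch, not MUSACT: its links are fixed rather than learned by self-organisation, it contains major chords and major keys only, and the update rule and the `top_down` weight are assumptions chosen for clarity.

```python
MAJOR_TRIAD = (0, 4, 7)

# Hand-wired toy network: 12 tone units, 12 major-chord units, 12 major-key units.
# Each key unit is linked to its tonic, subdominant and dominant chord units.
CHORD_TONES = {c: {(c + i) % 12 for i in MAJOR_TRIAD} for c in range(12)}
KEY_CHORDS = {k: {k, (k + 5) % 12, (k + 7) % 12} for k in range(12)}
CHORD_KEYS = {c: [k for k in range(12) if c in KEY_CHORDS[k]] for c in range(12)}

def run(sounded_tones, cycles=3, top_down=0.5):
    """Spread activation from tones up to chords and keys, then back down to chords.

    The update rule and the `top_down` weight are assumptions, not values from MUSACT.
    """
    tones = {t: 1.0 if t in sounded_tones else 0.0 for t in range(12)}
    chords = {c: 0.0 for c in range(12)}
    keys = {k: 0.0 for k in range(12)}
    for _ in range(cycles):
        chords = {
            c: sum(tones[t] for t in CHORD_TONES[c]) / 3
               + top_down * sum(keys[k] for k in CHORD_KEYS[c]) / 3
            for c in range(12)
        }
        keys = {k: sum(chords[c] for c in KEY_CHORDS[k]) / 3 for k in range(12)}
    return chords, keys

chords, keys = run({0, 4, 7})   # context: a C major triad
```

In this toy network neither the D major nor the F# major chord unit receives any activation from the sounded tones, yet the D major unit ends up more active because it is fed from key units that the C major triad has activated.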
However, two further sets of experiments using the probe-tone technique appear to contradict these findings. Using contexts of complete ascending and descending major diatonic scales, Speer & Meeks (1985) found that children of ages 8 and 11 were able to make all the distinctions that were made by musically experienced adults in earlier studies such as Krumhansl and Shepard (1979). They concluded that tonal features that are salient to children may differ from those which are salient to musically inexperienced adults, a conclusion echoed by the findings of a similar study (Cuddy & Badertscher, 1987) which again conflict with Krumhansl and Keil's earlier results. Using contexts of an arpeggiated major triad, a complete ascending major scale, and a diminished triad, Cuddy and Badertscher found no apparent developmental change in patterns of judgments. The factor of context-type in this instance appeared to be more influential; the major triad context evoked a profile corresponding to that of a key of which the root of the triad could be assumed to be the tonic, the major scale produced a less defined profile (with the tonic the only note strongly preferred) and the diminished triad an essentially flat profile. Cuddy and Badertscher concluded that the triad is the most effective determinant of Western tonality.
It may be recalled that Shepard's model can depict the notes of a diatonic scale as a "connected region" of his multi-dimensional space. This property arises because Shepard's model is directly analogous in some of its characteristics to another method of representing musical pitch relations: Balzano's group theoretic model of musical pitch relations (Balzano, 1982). Balzano treats the notes of the chromatic scale as analogous to the cyclic group of order 12 (i.e., the set of numbers 1 to 12, or 0 to 11). Within this approach, all octave-related notes are treated as functionally identical. This enables the notes of the chromatic scale - or chromatic set - to be depicted in two different circular, or cyclical, forms; these are directly analogous to the chroma circle and to the circle of fifths (see Figure 8). This approach also relies on the idea that sets of notes can be regarded as functionally identical at the level of pitch class set if one set can be transformed into another by means of an operation such as transposition (and, in Forte's (1973) formulation, transposition together with inversion). Pitch class sets are unordered, being regarded as identical if they have the same membership irrespective of the order in which the members of the pitch class set are laid out. To give an example, the set [2,0,7] - or D, C, G - would be regarded as identical to the set [0,2,7], and to the set [2,4,9]. That is, within the group-theoretic representation of musical pitch, two sets of notes or pitch classes are identical when the same structural relations can be shown to hold for each set.
Upper two circles show the chroma circle as note-names and as integers; lower two circles show the circle of fifths as note-names and as integers.
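The equivalence of pitch class sets under transposition, and optionally under inversion, can be written out directly. This is a minimal sketch using integers mod 12; the function names and the `allow_inversion` flag are illustrative and are not drawn from Balzano or Forte.

```python
def transpose(pcs, n):
    return frozenset((p + n) % 12 for p in pcs)

def invert(pcs):
    return frozenset((-p) % 12 for p in pcs)

def equivalent(a, b, allow_inversion=False):
    """True if unordered pitch-class set b is a transposition (optionally of the inversion) of a."""
    a, b = frozenset(a), frozenset(b)
    forms = {transpose(a, n) for n in range(12)}
    if allow_inversion:
        forms |= {transpose(invert(a), n) for n in range(12)}
    return b in forms

print(equivalent([2, 0, 7], [0, 2, 7]))   # same membership, different order
print(equivalent([0, 2, 7], [2, 4, 9]))   # related by transposition
```

Representing each set as a `frozenset` makes order irrelevant, so [2,0,7] and [0,2,7] are the same object, and [2,4,9] is found among the twelve transpositions of [0,2,7].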
Balzano points out that within the group-theoretic representation, the pitch class set - or set of notes - corresponding to the notes of the diatonic scale has a number of special properties: it has the property of uniqueness, in that each pitch class can be differentiated from every other pitch class by the set of intervals that it forms with all other pitch classes of the set; it has the property of simplicity, in that each diatonic scale or diatonic set may be transformed into another such set by changing only one pitch class, the relation between the initial set and the transforming element being the same for all equivalent forms of the diatonic set; and it has the property of coherence in that the sum of any two intervals formed between three adjacent pitch classes or notes of the set will be greater than any single interval occurring between adjacent pitch classes. Coherence and uniqueness can be thought of as conferring possible advantages in perception, in that the notes of a diatonic melody will be differentiable by the intervals that they form with other notes irrespective of their order of occurrence and any major or minor interval within the melody will correspond to a given number of diatonic scale steps enabling a listener to judge the size of movement that it constitutes within the underlying scale (with the sole exception of the tritone). The diatonic set is the only set within the group-theoretic representation of musical pitch that exhibits all these particular properties.
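Each of the three properties can be turned into a simple check on a pitch class set. The definitions below are simplified readings of the prose above, written for illustration; they are enough to show the diatonic set passing each check, but they are not Balzano's formal definitions and do not by themselves establish that the diatonic set is the only set with these properties.

```python
DIATONIC = (0, 2, 4, 5, 7, 9, 11)   # C major as pitch classes

def interval_profile(pc, pcs):
    """Sorted intervals (in semitones, upwards mod 12) from pc to every other member."""
    return tuple(sorted((other - pc) % 12 for other in pcs if other != pc))

def has_uniqueness(pcs):
    """Every member forms a different set of intervals with the rest of the set."""
    profiles = [interval_profile(pc, pcs) for pc in pcs]
    return len(set(profiles)) == len(profiles)

def changed_by_fifth_transposition(pcs):
    """How many pitch classes change when the set is moved up a perfect fifth."""
    moved = {(pc + 7) % 12 for pc in pcs}
    return len(moved - set(pcs))

def has_coherence(pcs):
    """Every two-step span is larger than every one-step span between adjacent members."""
    ordered = sorted(pcs)
    n = len(ordered)
    one_step = [(ordered[(i + 1) % n] - ordered[i]) % 12 for i in range(n)]
    two_step = [(ordered[(i + 2) % n] - ordered[i]) % 12 for i in range(n)]
    return min(two_step) > max(one_step)
```

By way of contrast, the whole-tone set (0, 2, 4, 6, 8, 10) fails the uniqueness check, because every one of its members forms the same intervals with the others.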
Balzano (1980) presents a further, more complex two-dimensional model based on intervals of major and minor thirds that is essentially analogous to that proposed by Longuet-Higgins (see above), except that Balzano's model requires enharmonic pitch identity (i.e. for the purposes of the model Bb is functionally identical to A#). Within this more complex model, based on the direct-product group representation of the cyclic group of order 12, diatonic scales are compact entities formed by overlapping major and minor triads (see Figure 9), with closely related keys lying adjacent within the space (cf Figure 6, Krumhansl's "harmony space"). In fact, all the unique properties of the diatonic set that are evident in the chroma and fifths circles are evident in the direct-product representation, with the addition that scales can also be seen as agglomerations of triadic structures, affording the possibility of easily representing both melodic and harmonic relations.
Balzano's direct product group. Single steps are minor thirds in the y-dimension and major thirds in the x-dimension. A diatonic major set is shown based around C as an asymmetrical region. (Note that this two-dimensional representation constitutes an "unrolling" of a toroidal structure on the surface of which all points of the same integer are coincident.)
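The correspondence between the twelve pitch classes and a grid of major and minor thirds can be made explicit. The sketch assumes coordinates (a, b) chosen so that a pitch class equals 4a + 3b (mod 12), with a counting major thirds and b counting minor thirds; this particular labelling and the function names are illustrative choices rather than Balzano's notation.

```python
def to_grid(pc):
    """Coordinates (a, b) with pc = 4*a + 3*b (mod 12): a major thirds, b minor thirds."""
    return (pc % 3, (-pc) % 4)

def from_grid(a, b):
    return (4 * a + 3 * b) % 12

def major_triad(root):
    return [root % 12, (root + 4) % 12, (root + 7) % 12]

# Every pitch class gets its own cell, so the 3 x 4 grid relabels the 12 pitch classes.
GRID = {to_grid(pc): pc for pc in range(12)}

print([to_grid(pc) for pc in major_triad(0)])
```

Because the grid wraps round in both directions (three steps of a major third or four steps of a minor third return to the starting pitch class), it is the unrolled surface of a torus, and a major triad occupies a compact cluster of three neighbouring cells.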
Possible implications of these properties of the diatonic set for perception have been further explored by Browne (1981). He suggests that diatonic set structure uniquely facilitates cognitive activities which he describes as position finding and pattern matching. Position finding involves questions concerning orientation and reference, such as "where are we?" (in respect of tonal space) and of "how long should I hear what happened in terms of note x?". So the identification of the tonic (or other structural notes) within a passage, or the tracking of modulations between tonal regions, could both be described as position finding processes. Pattern matching involves questions of similarity and belongingness, such as "is this the same as that?", or "does this belong to the same thing as that?".
Browne points out that within the diatonic set, intervals and subsets of the diatonic set (a diatonic subset is a set formable between any members of the diatonic set) can be regarded as either more rare or more common (i.e., can have a low or a high multiplicity). Thus the interval of a semitone occurs in only two positions within the set, while the interval of a perfect fourth occurs in six positions. Similarly the pitch class set [5,7,11] - equivalent to the set of notes F, G and B - can only occur in the one position within the set (low multiplicity), the pitch class set [0,5,7] - C, F, G - can occur in five positions (see Figure 10), while the pitch class set [0,2,5] - equivalent to C, D and F, or, (if inverted around the central pitch class) to the notes E, D and B - can occur in eight different positions (high multiplicity). Browne suggests that rare intervals (such as the tritone and semitone) or rare diatonic subsets such as [0,2,6] may fulfil position finding functions in cognition involving differentiation between notes within the diatonic set so as to specify a tonal hierarchy. On the other hand, common intervals or high multiplicity subsets such as [0,2,5] may help to effect pattern matching functions in cognition involving association and integration of notes as members of the set. Thus Browne proposes that the structure of the diatonic set - which, in Shepard's model, seems quite static, and rather "grid-like" - may play a dynamic role in the cognition of musical pitch and pitch relations.
Figure 10. Rareness and ubiquity of pitch-class sets within the diatonic set: upper circle shows that sets of the type (5, x, 11) can be formed in only a few positions within the diatonic set (that shown is the set (5, 2, 11)); lower circle shows that sets of the type (5, 0, 7) occur in five positions within the diatonic set.
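The multiplicities quoted above can be checked by brute force. The following Python sketch is illustrative only: it assumes the C major diatonic set written as pitch classes 0-11, and the function names are invented for this example rather than taken from Browne. It counts how many distinct transpositions (and, optionally, inversions) of a pitch-class pattern fall entirely within the diatonic set.

```python
# C major diatonic set as pitch classes (C = 0); an assumption of this sketch.
DIATONIC = frozenset({0, 2, 4, 5, 7, 9, 11})


def transpositions_within(pattern, scale=DIATONIC):
    """Return the distinct transpositions of `pattern` contained in `scale`."""
    found = set()
    for t in range(12):
        candidate = frozenset((p + t) % 12 for p in pattern)
        if candidate <= scale:
            found.add(candidate)
    return found


def multiplicity(pattern, scale=DIATONIC, with_inversion=False):
    """Count positions of `pattern` in `scale`, optionally including inversions."""
    found = transpositions_within(pattern, scale)
    if with_inversion:
        inverted = [(-p) % 12 for p in pattern]
        found |= transpositions_within(inverted, scale)
    return len(found)


def interval_multiplicity(semitones, scale=DIATONIC):
    """Count unordered pairs of scale notes separated by `semitones`."""
    return multiplicity([0, semitones], scale)


if __name__ == "__main__":
    print("semitone:", interval_multiplicity(1))        # 2
    print("perfect fourth:", interval_multiplicity(5))  # 6
    print("[5,7,11]:", multiplicity([5, 7, 11]))        # 1
    print("[0,5,7]:", multiplicity([0, 5, 7]))          # 5
    print("[0,2,5] with inversion:",
          multiplicity([0, 2, 5], with_inversion=True))  # 8
```

Under these assumptions the sketch reproduces the figures given in the text: two semitones, six perfect fourths, one position for [5,7,11], five for [0,5,7], and eight for [0,2,5] once inversions are admitted (four without them).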
A subsequent series of experiments (Howell, West and Cross, 1984) tested the bases for subjects' judgments about the degree to which pitches fitted in melodic sequences by asking subjects to identify wrong (i.e., "out-of-scale") notes in ongoing sequences. Their findings indicated that subjects were more likely to identify notes as wrong when the preceding pitches constituted a pitch-class set that was common within the diatonic set (i.e., constituted a set of the type that Browne implicates in pattern-matching processes) than when the preceding pitches constituted a more scale-specific (position-finding) set. A further series of experiments (West, Cross and Howell, 1991) clarified these results by asking subjects to rate the fit of "out-of-scale" probe notes following three-note sequences that varied according to their diatonic commonness or multiplicity; it was found that high multiplicity sets such as the set (0,2,5) led to lower ratings for the probes than did low multiplicity sets, indicating that diatonically common sets were evoking a stronger sense of diatonic structure than were diatonically rare (but scale-specific) sets. A feature of these results was that even the tonally-referential triad set (0,3,7) was less efficacious than higher multiplicity sets in evoking a sense of diatonic structure. Overall, these findings suggest that Western listeners, whether musically trained or not, employ a schematic representation of musical pitch in listening within which diatonic structure has a privileged position and in which sets of pitches that are common within the diatonic set constitute the most potent activators of a "diatonic schema", fulfilling a pattern-matching function of the type suggested by Browne.
Exploration of Browne's concept of position-finding, together with a dissatisfaction with aspects of the cognitive-structuralist programme, has led Butler and Brown to put forward an account of musical pitch in cognition that differs significantly from the cognitive-structuralist version. Butler (1989) suggests that cognitive-structuralist findings can be accounted for largely by local pattern implications or by short-term memory processes. For example, the finding that diatonic notes are rated more highly than non-diatonic notes following a context of an ascending diatonic major scale might derive from a bias towards rating highly those notes of which traces persist in short-term memory rather than from the application of any schematic representation within longer-term memory. Similarly, a tendency to rate the tonic most highly following an ascending scalar context spanning a complete octave might derive from the saliency accorded to the tonic because it starts and ends the context (cf. Divenyi and Hirsh, 1978) rather than from operations based on a cognitive representation of the tonal hierarchy. Or it could be proposed that a tendency to rate the tonic highly following presentation of an incomplete ascending scale (i.e., rising from the tonic to the leading-note) could derive from local pattern implications of good continuation (see Bregman, 1990) rather than from global considerations of tonal hierarchy. In other words, Brown and Butler are proposing that the tonal hierarchy is an experimental artefact; they suggest that the perception of musical pitch is mediated by processes that are both more dynamic than those implicated in the cognitive-structuralist theories and more closely tied to structural potentials of patterns of pitches.
Brown and Butler (1981) showed that listeners exhibited a high degree of accuracy (87%) in producing a feasible tonic when presented with a three-note context consisting of two notes separated by a tritone plus one other note, irrespective of the order of the notes in the context. This finding sits well with Browne's (1981) suggestion that rare diatonic intervals (such as the tritone) are well-suited to play a position-finding role. However, a further study involving the identification of tonal centres (Butler, 1983) found that the efficacy of four-note diatonic subsets containing a tritone between two members (e.g., the set [0,1,5,6]) in acting as contexts that unequivocally specified a tonic for listeners was dependent on the order in which the notes of the context were presented. While the set is a good cue to the identity of a tonic for listeners when presented in the order [0,1,5,6], when presented in the order [0,5,1,6] - an order in which the diatonically-rare intervals in the set (semitones and a tritone) are not explicitly present - the efficacy of the context in cueing a tonic decreases.
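Two points in the preceding paragraph lend themselves to a small worked check: how strongly a tritone plus one further note constrains the possible diatonic sets, and how re-ordering a context changes the intervals formed between adjacent notes. The Python sketch below is an illustration under stated assumptions, not a reconstruction of the experimental materials: it considers major-key diatonic sets only, treats notes as pitch classes 0-11, and uses function names invented for this example.

```python
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # scale steps above the tonic, in semitones


def diatonic_set(tonic):
    """Pitch classes of the major scale built on `tonic`."""
    return frozenset((tonic + step) % 12 for step in MAJOR_SCALE)


def compatible_tonics(pitch_classes):
    """Major-key tonics whose diatonic set contains every given pitch class."""
    notes = set(pitch_classes)
    return [t for t in range(12) if notes <= diatonic_set(t)]


def successive_interval_classes(ordering):
    """Interval classes (0-6) between adjacent notes of an ordered context."""
    classes = []
    for a, b in zip(ordering, ordering[1:]):
        d = (b - a) % 12
        classes.append(min(d, 12 - d))
    return classes


if __name__ == "__main__":
    # The tritone F-B alone fits two major keys; adding any third note leaves one.
    print(compatible_tonics([5, 11]))      # [0, 6]
    print(compatible_tonics([5, 11, 0]))   # [0]
    # Same pitch content, different adjacent intervals.
    print(successive_interval_classes([0, 1, 5, 6]))  # [1, 4, 1]
    print(successive_interval_classes([0, 5, 1, 6]))  # [5, 4, 5]
```

In this simplified setting any third note added to a tritone narrows the candidates to a single major-key diatonic set, which is consistent with the position-finding role that Browne assigns to rare intervals. The second pair of calls shows that the ordering [0,1,5,6] places semitones between adjacent notes, whereas [0,5,1,6] yields only fourths and a third between adjacent notes.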
Building on these findings, a study by Brown (1988) used nine categories of tonal content and four categories of tonal context, and required musically-trained listeners to sing appropriate tonics following brief contexts. The variable tonal content ordered the context sets according to the degree to which their structure unambiguously specified a tonic, while the variable tonal context classified context sets on the basis of the salience of diatonically-rare intervals within each context. She found that both variables as well as their interaction affected the performance of her subjects in identifying a tonic, and concluded that her subjects were sensitive to both time-independent and time-dependent functional relationships between pitches, these relationships being most easily perceived in the presence of rare intervals in "optimal" temporal orderings.
Recent papers by Butler (1989) and by Brown, Butler and Jones (1994) have extended and clarified these findings in presenting and testing the intervallic rivalry model, which is intended to counter many of the problematic aspects of the cognitive-structuralist account. The intervallic rivalry model rests on three related hypotheses: (a) that listeners assume the first pitch of a sequence is the tonal centre until a better candidate arrives (the primacy hypothesis); (b) that listeners rely upon rare intervals more than common intervals in deriving a sense of tonal centre, as these provide more reliable key information by unambiguously correlating with a single diatonic set; and (c) that listeners are more accurate in determining key when a rare interval appears in a temporal order implying goal-oriented harmonic motion of a type common in tonal music. They suggest (p377) that "the intervallic rivalry model assumes that enculturated listeners respond to incoming context by engaging in an active discovery process oriented toward identifying a tonal centre over the course of the music".
They tested the intervallic rivalry model in two "probe-tone" experiments, using a range of different types of context and employing listeners grouped according to level of musical training. Their first experiment set out to replicate the findings of Cuddy & Badertscher (1987) using adult listeners and to account for these findings in terms of the degree of ambiguity in the presented contexts when interpreted according to the intervallic rivalry model. It used three different contexts to precede probes (an ascending major diatonic scale, and arpeggiations of the major triad and diminished triad), predicting that the particular pitches used would act together with their temporal ordering so as to elicit the tonal hierarchy in subjects' responses (except that the responses to the diminished triad were expected to indicate no advantage for any particular pitch). The results of this experiment were closely in line with those of Cuddy and Badertscher, except that the tonic received a higher than expected rating following the diminished triad. While noting that this latter aspect of their results did not unambiguously favour either the intervallic rivalry or the tonal hierarchy theory, Brown, Butler and Jones claim that the intervallic rivalry theory is better able to account for most of their findings.
Their second experiment was designed explicitly to test the degree to which temporal and structural features of the contexts could be manipulated according to the intervallic rivalry theory so as to steer listeners towards particular interpretations of the contexts. Contexts with the same pitch content as in the previous experiment were employed; however, these were internally re-ordered so as to cue different responses. The major triad context was ordered so as to "favour" the subdominant (being presented as tonic, dominant, mediant, tonic) and the diminished triad was intended to favour the tonic (being ordered as leading-note, subdominant, supertonic, leading-note). The scalar contexts were presented in random order, the likelihood being that subjects would produce essentially flat responses as no particular note was being accorded primacy. The results of this experiment did not unambiguously support the intervallic rivalry theory; despite the randomisation the tonal hierarchy emerged in the responses following the scalar context. However, in accordance with the predictions of their theory the reordered major triad favoured the putative subdominant, while the diminished context led to responses favouring the tonic as well as the submediant and the leading-note.
Butler and Brown's research indicates that temporal and structural factors that are generally neglected in the cognitive-structuralist approach can play a major role in the perception of musical pitch, even for listeners with little musical training; as Brown, Butler and Jones state (1994, p. 406), "key discovery and responsiveness to tonal implications is not the preserve of specially trained musicians". The role that Browne's theory allows for diatonically rare intervals and sets in pitch perception is confirmed, and they are identified as one component of the processes involved in key identification.
It is notable that both cognitive-structural and intervallic rivalry theories incorporate schemata that embody diatonic structure. For the cognitive-structuralists, such a schema presumably arises through exposure to diatonic music within which the divergent frequency distributions of pitches that are differentiated by their positions within the diatonic scale - their scale-step aspect - give rise to different degrees of stability inhering in the cognitive representations of those pitches. Here, diatonic structure constitutes a given, and appears to be grid-like and static. On the other hand, within intervallic rivalry theory the schema embodies information about the structural characteristics of intervals and sets of pitches within diatonicism that can be employed to identify location within the scale. In this theory, diatonic structure is a dynamic feature of the listening process, and could be construed as an emergent property of the intervals and pitch-sets encountered in listening to music (cf. Cross, West and Howell, 1991).
The differences between the two theories can be interpreted as a difference in focus rather than in fact. Both theories acknowledge the role of order information in determining the "stability" of pitches in perception, and as we have seen, both rely on some representation of diatonicism. However, while the cognitive-structuralist theory focuses on the differential frequency distributions of pitches within music as an operational characteristic in perception, the intervallic rivalry theorists emphasise the role of diatonic multiplicity. This divergence between the two models can be interpreted as deriving from their focus on different perceptual stages and strategies in the perception of musical pitch. Brown, Butler and Jones (1994) themselves suggest as much in stating that the two models accentuate different aspects of tacit knowledge about tonality; the intervallic rivalry model centres on the process of key discovery, the cognitive-structuralist account on reinforcement of tonal function, but both processes are necessary for a listener to follow tonal music in real time.
This suggestion, that the two theories reflect different stages or strategies in listening, is given force by the findings of both Castellano, Bharucha and Krumhansl (1984) (in which Western listeners responded to the event structures of the unfamiliar, North Indian, contexts while the responses of Indian listeners reflected "tonal hierarchical" relations appropriate to the music) and a recent study by Cuddy (1994). The latter conducted probe-tone experiments where diatonic contexts were specially composed so as to have frequency distributions of pitches (a component of the event structure) that differed from the tonal "norm". She found that subjects' responses to this unfamiliar music directly reflected the frequency-distribution aspect of the contexts' event-structures. This could be taken to imply that when the detail of the flow of events in a piece is unfamiliar it is likely that listeners will respond to global statistical characteristics such as the frequency distribution of events, while in listening to pieces that conform more to cultural norms prior knowledge concerning functional characteristics of intervals and pitch sets can be employed in interpreting the music's structure. In other words, when presented with music in an unfamiliar but "followable" idiom in which structural relations between notes are employed in unexpected ways, expectations based on abstractions of structure (prior tonal knowledge) will be abrogated and attention directed towards the holistic properties of the music rather than to the detailed "argument" of its unfolding.
Certain aspects of Cook's charge are undoubtedly worth considering. His contention that studies of music cognition, in relying on music theory, have undermined the validity of their results as accounts of the listening experience might well be applied as a criticism to some of Krumhansl's work; after all, in describing her study with Kessler (Krumhansl and Kessler, 1982) in her 1990 book, she states (p.26) that "...in order to decrease the chance that non-musical response strategies would be adopted, listeners had to have an average of 5 years of formal instruction in music", appearing to fall into the trap of identifying "musical" response strategies with those prescribed by music theory.
However, Cook's refutation of the utility of all studies of music cognition - or even of all Krumhansl's studies - as accounts of the listening experience does not stand up to scrutiny. It can scarcely be argued that those many studies that employ musically untrained listeners as subjects, and require them to make judgments that do not rely on any overt use of music-theoretic concepts or labels in their responses, fall into the error that Cook condemns. Indeed, it is evident that one of the studies that he criticises most harshly[4] - that of Castellano, Bharucha and Krumhansl (1984) - employs just such a group of listeners among its subjects (Western subjects with no prior experience of North Indian music), and that their results greatly illuminate the processes involved in listening to music in an unfamiliar but "followable" idiom.
In general, Cook's claims stand as a warning to (rather than a proscription of) research in music cognition. His criticisms may indeed stand as a valid reproach to many studies of ostensibly "musical" listening; however, there are many studies that do not make the assumption that music-theoretic entities and elements accurately encapsulate the varieties of musical experience, but use such entities simply because they offer ways of controlling and manipulating musical materials (much in the way that they are used in practical music within a literate culture: see Cross, 1985). Moreover, when the results of an experimental study suggest that aspects of these music-theoretic entities are operational in the cognitions of untrained listeners, this surely indicates either (i) that the experimental procedures involved were so over-determining that their potential findings must necessarily reflect an "artefactual" effect of these music-theoretic entities (which problem should become evident through careful examination of the operational definitions employed, and of the experimental design) or (ii) that these music-theoretic entities actually do capture something about the music as experienced (as one presumes many music-theoretic entities and principles must for music theory to be anything other than an idiosyncratically deconstructionist fantasia). Indeed, as suggested at the outset of this Chapter, the concepts of music theory can themselves be regarded as phenomena that the empirical study of music cognition must seek to investigate.
The most developed of these accounts, that of Parncutt (1989), which partly derives from that of Terhardt (see Terhardt, Stoll and Seewan, 1982), provides an explanation for the way that chords and chordal successions are heard that derives both from the processes whereby the ear analyses out components of complex waveforms and from the processes whereby these components are integrated into some unitary percept. Parncutt has produced an algorithmic version of his theory that is intended as a formal model of the ways that our perceptual systems function in musical perception; it is not intended to be a model of the actual neural processes involved.
Parncutt proposes that chords are heard as having unitary identities or roots because listeners have learned to hear complex periodic waveforms as having unitary pitches (generally correlatable with the fundamental frequency of the complex waveform). When a complex periodic sound is encountered, it is generally experienced as one pitch (see Figure 11a); this occurs through a process of analysis of the waveform and its re-integration as a percept by the auditory system. Parncutt suggests that a similar process governs the perception of chords in music. Thus, a sounding triad - major or minor - can be heard as being one thing rather than three notes sounding together. However, Parncutt states that chords may be heard either synthetically or analytically, i.e., that individual chord components may be "heard out", on a similar basis to the processes involved in "hearing out" the components of a complex periodic waveform (see Figure 11b). He suggests that the same factors constrain perception of single pitches and of chords. These factors derive from, among other things, the frequency-resolving power of the inner ear, the latency-period of the auditory nerve, and the establishment of a system of pattern-recognition based on "best fit" to the harmonic series. Thus he holds that the sensory processes involved in perception of single pitches govern the perception of chords.
Figure 11. (a) Unitary pitch percept corresponding to fundamental frequency evoked by harmonic complex tone.
(b) Pitch evoked by harmonic complex tone constituting either a unitary percept ("virtual" pitch, shown as solid bar) or a cluster of "spectral" pitches corresponding to higher spectral components (shown as dotted bars).
(c) Possible "roots" of a triadic chord.
The input to Parncutt's algorithmic model of the auditory system may be either a complex periodic waveform or a chord. The output, in response to the former, would not be one single "perceived pitch", but a set of probable perceived pitches, weighted according to the likelihood of their being "heard" in response to the input (see Figure 11b). The output in response to a chord would similarly be either a set of possible weighted chord roots as in Figure 11c or a set of chord components that are weighted according to their likelihood of being noticed. The set of chord roots would, Parncutt suggests, provide an index of the chord's perceptual stability, and hence of the likelihood of its being used in some referential way; if one root is much more highly weighted than the others the chord is probably stable - a major triad would be likely to have a fairly unambiguous root - while if several different roots are given the same weighting, then the chord is likely to be perceived as unstable (e.g. the Tristan chord, which generates several equally likely roots).
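The notion of a weighted set of candidate roots can be illustrated with a toy calculation. This is not Parncutt's algorithm, which works from spectral components of the signal; it is a deliberately minimal pitch-class sketch in which a candidate root gains support whenever a chord note lies at an interval resembling a low member of the harmonic series above it. The support weights, the function names and the "clarity" index are all assumptions chosen for illustration and are not taken from the text.

```python
# Hypothetical support weights: interval above the candidate root (in semitones)
# mapped to the support it lends. Unison/octave, perfect fifth, major third,
# minor seventh and major second are the only intervals credited here.
SUPPORT = {0: 10, 7: 5, 4: 3, 10: 2, 2: 1}


def root_weights(chord):
    """Return a dict mapping each of the 12 candidate roots to its total support."""
    notes = set(chord)
    return {
        root: sum(SUPPORT.get((note - root) % 12, 0) for note in notes)
        for root in range(12)
    }


def ranked_roots(chord):
    """Candidate roots sorted from most to least supported."""
    weights = root_weights(chord)
    return sorted(weights.items(), key=lambda item: (-item[1], item[0]))


def root_clarity(chord):
    """Ratio of best to second-best root weight; values near 1 suggest ambiguity."""
    ranked = ranked_roots(chord)
    return ranked[0][1] / ranked[1][1]


if __name__ == "__main__":
    c_major = [0, 4, 7]       # C, E, G
    tristan = [5, 11, 3, 8]   # F, B, D sharp, G sharp
    print(ranked_roots(c_major)[:3], root_clarity(c_major))
    print(ranked_roots(tristan)[:3], root_clarity(tristan))
```

With these illustrative weights the major triad yields one candidate root that clearly outweighs the rest, while the Tristan chord yields several candidates of comparable weight. The toy numbers should not be read as predictions of the model itself; they only show how a weighted root set can serve as an index of stability.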
In this way, this algorithmic account provides a rationale for harmonic stability or instability - functionally, consonance and dissonance - that is rooted in the nature of the sensory processes involved in hearing. Hence, Parncutt's theory can provide an explanation of why certain sets of notes should sound well together as chords and why others should sound unpleasant, as well as providing reasons for why chords such as major or minor triads should end a passage (their roots being relatively unambiguous, and the chords therefore stable) while others such as the Tristan chord sound "incomplete" (their roots being ambiguous, and the chords therefore unstable).
This account can be seen as complementary rather than antithetical to the cognitive approaches outlined above, in that it addresses an issue - that of the relation between physical signal and the experience of musical sound - that they do not, and hence fills what could have been a theoretical lacuna. Indeed, Parncutt admits of the need for a theory at the pitch-class level (such as is embodied in most of these cognitive approaches) in order to account for aspects of musical perception that remain outside his theory. In fact, the bases of Parncutt's account have been disputed by some psychoacousticians who consider that the experimental evidence points to the operation of periodicity-sensitive mechanisms (see Moore, 1989) rather than to the spectral-abstraction processes upon which Parncutt's (and Terhardt's) theory relies. Patterson (1986) has proposed an account of musical pitch in cognition that relies on detection and abstraction of periodicities in the physical signal to structure our perceptions of pitch. He puts forward a complex "spectro-temporal" model for the perception of musical pitch, and suggests that the structure of the scales employed within Western tonal music, and hence their efficacy in music cognition, can be accounted for on the basis of properties of this model. However, his model relies on the occurrence of low-integer relations between scale notes, leading him to accord a pre-eminent role to the "Just" scale form that cannot be supported by the historical record (see, e.g., Lloyd and Boyle, 1963).
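The "low-integer relations" at issue can be made concrete by comparing just-intonation ratios with their equal-tempered counterparts, measured in cents. The ratio table in the sketch below is one common form of the just major scale and is supplied here as an assumption; it is not drawn from Patterson's paper, and the function names are invented for this example.

```python
import math

# One common just major scale: name -> (numerator, denominator, semitones in
# 12-tone equal temperament). The table is an assumption of this sketch.
JUST_MAJOR = {
    "unison": (1, 1, 0),
    "major second": (9, 8, 2),
    "major third": (5, 4, 4),
    "perfect fourth": (4, 3, 5),
    "perfect fifth": (3, 2, 7),
    "major sixth": (5, 3, 9),
    "major seventh": (15, 8, 11),
}


def cents(numerator, denominator):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(numerator / denominator)


def deviation_from_equal_temperament(name):
    """Just interval minus its equal-tempered counterpart, in cents."""
    numerator, denominator, semitones = JUST_MAJOR[name]
    return cents(numerator, denominator) - 100 * semitones


if __name__ == "__main__":
    for name in JUST_MAJOR:
        print(f"{name:15s} {deviation_from_equal_temperament(name):+7.2f} cents")
```

On this table the just fifth lies about 2 cents above the equal-tempered fifth, while the just major third lies nearly 14 cents below the equal-tempered major third, which indicates the scale of the differences involved when a theory privileges one tuning over another.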
Notwithstanding these caveats, it is evident that any theory of the perception of musical pitch must consider the processes that link the physical signal and the experience of musical sound as operational factors in determining the nature of that experience. It may be that such factors play a more potent role in shaping the tuning systems upon which Western music relies than in determining our sensitivities to structure in musical pitch (as suggested in Burns and Ward, 1982); it may be that they play quite different roles in musical perception within different cultures (as hinted at in Kubik, 1985). In the limit, they constitute constraints on our experience of musical pitch that cannot be ignored, and Parncutt's theory represents the most comprehensive attempt to date to explore this issue.
Given the primacy accorded to pitch in music theory and in casual discourse about music it is unsurprising that the cognition of musical pitch has been the focus of so much research interest. This research has resulted in the formation of coherent and robust accounts of the factors and processes that determine our experience of musical pitch. Nevertheless, pitch is but one aspect of music; however coherent a cognitive theory of musical pitch may be, it remains at best a partial account of the experience of music. Recent experimental work has begun to explore the perception of music by treating pitch as one amongst several interacting musical dimensions (see, for example, Monahan and Carterette, 1985; Boltz, 1993; Schmuckler and Boltz, 1994; Thompson, 1993, 1994; Thompson and Cuddy, 1992). Other research has focused on the global experience of music, in the process seeking to elucidate the role of pitch within the whole by examining the impact on subjects' responses of different types of pitch organisation (see Bigand, 1993; Deliège, 1987; Deliège and El Ahmadi, 1990).
In the last fifteen years the complexities of musical experience have been assailed by a number of researchers, notably Lerdahl and Jackendoff (1983) and Narmour (1989, 1992), who have put forward theories that purport to account for the global experience of music. Theories of pitch that are more or less compatible with those outlined above constitute significant components of these accounts (see, e.g., Lerdahl, 1988), and the theories that are outlined in this Chapter can be expected to play a significant role in future research that is aimed at explicating the richness of the global experience of music.
Balzano, G. (1980). The group-theoretic representation of 12-fold and microtonal pitch systems. Computer Music Journal, 4, 66-84.
Balzano, G. (1982). The pitch set as a level of description for studying musical perception. In M. Clynes, (Ed), Music Mind and Brain. New York: Plenum.
Bent, I. (1980). Analysis. Entry in S. Sadie (Ed), Grove's Dictionary of Music. London: Macmillan.
Bharucha, J. J. (1984). Event hierarchies, tonal hierarchies, and assimilation: a reply to Deutsch and Dowling. Journal of Experimental Psychology: General, 113, 421-25.
Bharucha, J. J. (1987). Music Cognition and Perceptual Facilitation: a Connectionist Framework. Music Perception, 5, 1-30.
Bharucha, J. J. (1991). Pitch, harmony and neural nets: a psychological perspective. In P. Todd and D. G. Loy (Eds.), Music and connectionism. Cambridge, Mass.: MIT Press.
Bharucha, J. J. (1994). Mental tonal structures. In I. Deliège (Ed.), Proceedings of the 3rd International Conference on Music Perception and Cognition, Liège, July 1994.
Bharucha, J. J. and Stoeckig, K. (1986). Priming of Chords: Spreading Activation or Overlapping Frequency Spectra? Perception & Psychophysics, 41, 519-524.
Bharucha, J. J. and Stoeckig, K. (1987). Priming of chords: spreading activation or overlapping frequency spectra? Perception & Psychophysics, 41, 519-524.
Bharucha, J. J. and Todd, P. (1991). Modelling the perception of tonal structure with neural nets. In P. Todd and D. G. Loy (Eds.), Music and connectionism. Cambridge, Mass.: MIT Press.
Bigand, E. (1993). Contributions of music to research on human auditory cognition. In S. McAdams and E. Bigand (eds). Thinking in sound: the cognitive psychology of human audition. Oxford: Oxford University Press.
Boltz, M. G. (1993). The generation of temporal and melodic expectancies during musical listening. Perception & Psychophysics, 53 (6), 585-600.
Bregman, A.S. (1990). Auditory Scene Analysis: the Perceptual Organisation of Sound. Cambridge, Mass: M.I.T. Press.
Brown, H. (1988). The interplay of set content and temporal context in a functional theory of tonality perception. Music Perception, 5, 219-250.
Brown, H. (1994). Theories of perception of tonal harmony - musical psychoacoustical and cognitive: what do we really know about what we hear? In I. Deliège (Ed.), Proceedings of the 3rd International Conference on Music Perception and Cognition, Liège, July 1994.
Brown, H. and Butler, D. (1981). Diatonic trichords as minimal cue-cells. In Theory Only, 5, 39-55.
Brown, H., Butler, D. and Jones, M. R. (1994). Musical and temporal influences on key discovery. Music Perception, 11 (4), 371-407.
Browne, R. (1981). Tonal implications of the diatonic set. In Theory Only, 5, 3-21.
Burns, E. M., and Ward, W. D. (1982). Intervals, scales and tuning. In D. Deutsch (Ed.), The psychology of music. London: Academic Press.
Butler, D. (1983). The initial identification of tonal centers in music. In J. Sloboda and D. Rogers (Eds), The acquisition of symbolic skills. New York: Plenum Press.
Butler, D. (1989). Describing the perception of tonality in music: a critique of the tonal hierarchy theory and a proposal for a theory of intervallic rivalry. Music Perception, 6, 219-242.
Castellano, M. A., Bharucha, J. J. and Krumhansl, C. L. (1984). Tonal hierarchies in the music of North India. Journal of Experimental Psychology: General, 113, 394-412.
Cook, N. (1987). The perception of large-scale tonal closure. Music Perception, 5 (2), 197-205.
Cook, N. (1994). Perception: a perspective from music theory. In R. Aiello with J. Sloboda, (Eds.), Musical perceptions. Oxford: Oxford University Press.
Cross, I. (1985). Music and Change. In P. Howell, I. Cross, and R. West (Eds.) Musical Structure and Cognition. London: Academic Press.
Cross, I., Howell, P. & West, R. (1983). Preferences for scale structure in melodic sequences. Journal of Experimental Psychology: Human Perception and Performance, 9 (3), 444-460.
Cross, I., West, R. & Howell, P. (1991). Cognitive correlates of tonality. In P. Howell, R. West & I. Cross, (Eds), Representing musical structure. London: Academic Press.
Cuddy, L. L. (1994). Tone distributions in melody: influence on judgments of salience and similarity. In I. Deliège (Ed.), Proceedings of the 3rd International Conference on Music Perception and Cognition, Liège, July 1994.
Cuddy, L. L. & Badertscher, B. (1987). Recovery of the Tonal Hierarchy. Perception & Psychophysics, 41, 609-620.
Deliège, I. (1987). Grouping conditions in listening to music: An approach to Lerdahl & Jackendoff's grouping preference rules. Music Perception, 4 (4), 325-360.
Deliège, I., and El Ahmadi, A. (1990). Mechanisms of cue extraction in musical groupings: A study of perception on Sequenza VI for viola solo by L. Berio. Psychology of Music, 18 (1), 18-44.
Deutsch, D. (1969). Music recognition. Psychological Review, 76, 300-307.
Deutsch, D. (1981). The processing of structured and unstructured tonal sequences. Perception & Psychophysics, 28, 381-389.
Deutsch (1982). In D. Deutsch (Ed.), The Psychology of Music. London: Academic Press.
Deutsch, D. & Feroe, J. (1981). The Internal Representation of Pitch Sequences in Tonal Music. Psychological Review, 88, 503-522.
Divenyi, P. L. and Hirsh, I. J. (1978). Some figural properties of auditory patterns. Journal of the Acoustical Society of America, 64, 1369-1386.
Dowling, W. J. (1978). Scale and contour: two components of a theory of memory for melodies. Psychological Review, 85, 341-354.
Dowling, W. J. (1982). Musical scales and psychophysical scales: their psychological reality. In T. Rice and R. Falck (Eds.) Cross-cultural perspectives on music. Toronto: University of Toronto Press.
Forte, A. (1973). The structure of Atonal Music. New Haven: Yale University Press.
Francès, R. (1958/1988). The perception of music. (trans. W. J. Dowling). London: Erlbaum.
Gjerdingen, R. O. (1994). Apparent motion in music. Music Perception, 11 (4), 335-370.
Howell, P., West, R. & Cross, I. (1984). The detection of notes incompatible with scalar structure. Journal of the Acoustical Society of America, 76 (6), 1682-1689.
Krumhansl, C. L. (1988). Tonal and harmonic hierarchies. In J. Sundberg (Ed.), Harmony and Tonality. Stockholm: Royal Swedish Academy of Music.
Krumhansl, C. L. (1990). The cognitive foundations of musical pitch. Oxford: Oxford University Press.
Krumhansl, C. L. and Keil, F. C. (1982). Acquisition of the hierarchy of tonal functions in music. Memory and Cognition, 10, 243-51.
Krumhansl, C. L. & Kessler, E. J. (1982). Tracing the Dynamic Changes in Perceived Tonal Organization in a Spatial Representation of Musical Keys. Psychological Review, 89, 334-368.
Krumhansl, C. L., Sandell, G. J. and Sergeant, D. C. (1987). The perception of tone hierarchies and mirror forms in twelve-tone serial music. Music Perception, 5, 31-78.
Krumhansl, C. L. and Shepard, R. N. (1979). Quantification of the hierarchy of tonal functions within a diatonic context. Journal of Experimental Psychology: Human Perception and Performance, 5, 579-94.
Kubik, G. (1985). African tone-systems: a re-assessment. Yearbook for traditional music, 17, 31-63.
Lakatos, I. (1970). Falsification and the methodology of scientific research programmes. In I. Lakatos and A. Musgrave (Eds.), Criticism and the growth of knowledge. Cambridge: Cambridge University Press.
Lerdahl, F. (1988). Tonal pitch space. Music Perception, 5, 315-350.
Lerdahl, F. and Jackendoff, R. (1983). A generative theory of tonal music. Cambridge, Mass: MIT Press.
Lloyd, Ll. S., and Boyle, H. (1963). Intervals, scales and temperaments. London: Macmillan.
Longuet-Higgins, C. H. (1962a and b). Two letters to a musical friend. The Music Review, 244-248, 271-280.
Longuet-Higgins, C. H. (1976). The perception of melodies. Nature, 263, 646-653.
Longuet-Higgins, C. H. (1979). The perception of music. Proceedings of the Royal Society, London, Series B, 205, 307-322.
Monahan, C. B., and Carterette, E. (1985). Pitch and duration as determinants of musical space. Music Perception, 3 (1), 1-32.
Moore, B. C. J. (1989). Introduction to the Psychology of Hearing. (3rd edn). London: Academic Press.
Narmour, E. (1989). The analysis and cognition of basic melodic structures. London: University of Chicago Press.
Narmour, E. (1992). The analysis and cognition of melodic complexity. London: University of Chicago Press.
Neisser, U. (1976). Cognition and Reality. New York: W. H. Freeman & Co.
Palisca, C. V. (1980). Theory. Entry in S. Sadie (Ed.), Grove's Dictionary of Music. London: Macmillan.
Parncutt, R. (1989). Harmony: a psychoacoustical approach. London: Springer-Verlag.
Patterson, R. D. (1986). Spiral detection of periodicity and the spiral form of musical scales. Psychology of Music, 14 (1), 44-61.
Rosch, E. (1978). Principles of categorization. In E. Rosch and B. B. Lloyd (Eds.), Cognition and categorization. Hillsdale, NJ: Erlbaum.
Schmuckler, M. A. and Boltz, M. G. (1994). Harmonic and rhythmic influences on musical expectancy. Perception & Psychophysics, 56 (3), 313-325.
Seashore, C. E. (1938/1967). Psychology of Music. New York: Dover.
Shepard, R. N. (1964). Circularity in judgments of relative pitch. Journal of the Acoustical Society of America, 36, 2346-2353.
Shepard, R. N. (1982). Structural representations of musical pitch. In D. Deutsch (Ed.), The Psychology of Music. London: Academic Press.
Simon, H. A. and Sumner, R. K. (1968). Pattern in music. In B. Kleinmuntz (Ed.), Formal representations of human judgment. New York: Wiley.
Speer, J. R. & Meeks, P. U. (1985). School Children's Perception of Pitch in Music. Psychomusicology, 5, 49-56.
Terhardt, E., Stoll, G. and Seewan, M. (1982). Algorithm for extraction of pitch and pitch salience from complex tonal signals. Journal of the Acoustical Society of America, 71, 679-688.
Thompson, W. F. (1993). Modelling perceived relationships between melody, harmony, and key. Perception & Psychophysics, 53 (1), 13-24.
Thompson, W. F. (1994). Sensitivity to combinations of musical parameters - pitch with duration, and pitch pattern with durational pattern. Perception & Psychophysics, 56 (3), 363-374.
Thompson, W. F. and Cuddy, L. L. (1992). Perceived key movement in 4-voice harmony and single voices. Music Perception, 9 (4), 427-438.
Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327-352.
van den Toorn, P. C. (1983). The music of Igor Stravinsky. London: Yale University Press.
West, R., Cross, I. & Howell, P. (1991). Activation of schemas in perception - the case of musical scale conformance. Psychologica Belgica, 31 (2), 197-216.
[1] There were always exceptions to this equation of musical pitch with physical frequency. Indeed, Helmholtz, who, in laying much of the groundwork for an acoustical and psychoacoustical understanding of music, might be thought to have been a proto-reductionist, explicitly defended his inclusion of "room...for the action of artistic invention and esthetic inclination" in his "Theory of Music" in the preface to the third German edition of his "Die Lehre von den Tonempfindungen" (1870).
[2] Although Shepard (1982) and Krumhansl (1988, 1990) both agree that these particularities are likely to play a role in shaping the organisation of musical pitch in cognition.
[3] Although Brown (1994) provides a concise and convincing reinterpretation of this study that renders this aspect of his case at best "not proven".
[4] This criticism appears to arise from a misinterpretation of the concepts of "event hierarchy" and "tonal hierarchy" as evident in the combined and separate analyses of the responses of Indian and western subjects.