Are Feature Hierarchies Autosegmental Hierarchies?*

Colin Phillips, Massachusetts Institute of Technology

1. Introduction: On Division of Labour

Phonological representations are blamed for too much. They are taken to task for missing generalizations which can be captured elsewhere. In this paper I review some recent criticisms of the view that feature hierarchies are hierarchies of autosegmental tiers, especially the arguments of Hayes (1990) and Halle (1993), and try to show that although the considerations they raise are important ones, they do not force us to give up the standard view of feature structure. The criticisms can be more easily accommodated by making changes somewhere other than the theory of phonological representations.

Phonological theories capture generalizations about phonological processes in two ways: first, in the theory of representations; second, in the theory of transformations or rules, that is, the theory of how different stages in a derivation take place. Generalizations about linguistic sound patterns can also be captured by phonetic theories of articulation and perception. Clearly, no subdomain has priority for accounting for any set of facts. I explore the trade-off between these different components of grammar in this paper.

1.1 Autosegmental Hierarchies

A great deal of work in phonology has attempted to formalize the intuition that some groupings of features are natural classes and other groupings are not. Only the natural classes may participate in phonological processes. The idea that certain sets of features behave as natural classes is an old one (Trubetzkoy 1939), but closer examination of the idea was made vastly more feasible by the definition of the problem as one of finding a universal feature tree or feature hierarchy (Clements 1985; Mascaró 1983; Mohanan 1983; Sagey 1986).
The assumptions that (i) there are restrictions on the possible sets of features which may participate in phonological rules, and (ii) these groupings are best expressed in terms of a universal hierarchical organization of features, are common to all approaches that I consider here.

1. Hierarchical Organization
   a. Only certain sets of features may participate in phonological processes.
   b. Hierarchical subgroupings of features define the possible classes of features.
   c. The hierarchical organization of features is universally determined.

(2) illustrates the feature hierarchy which I assume in this paper. However, my main aim here is not to justify the specifics of this hierarchy, although some details will turn out to be crucial in specific cases.

A more contentious issue is what hierarchies like (2) represent. The most common assumption is that each node in the hierarchy represents a unit on an autosegmental tier (3). Such autosegmental theories of feature structure generally share the following assumptions:

3. Autosegmental Structure
   All nodes in feature hierarchies are units on autosegmental tiers.

If we also assume that feature hierarchies are hierarchies of autosegmental tiers (4), then (5) follows as an automatic consequence.

4. Autosegmental Hierarchies
   Feature groupings are determined by the hierarchical associations between different autosegmental tiers.

5. Node Sharing
   Sets of features (i.e. subtrees) may be shared by two segments by means of double linking.

Assumptions (3-5) are the core representational assumptions of autosegmental theories of feature structure, and (6) is the basic transformational assumption of these approaches. Section 2 discusses a problem for this set of assumptions, which can be resolved either by modifying (3-5) or by modifying (6).

6. Constituent Spreading
   a. Phonological rules operate on single nodes only.
   b. Phonological rules perform single operations only.
   (Clements & Hume, to appear, 7)

1.2 Bottlebrush Theories

While the assumptions in (3) to (6) have been the dominant ones, a growing number of phonologists have reached the conclusion that feature hierarchies should be regarded not as autosegmental hierarchies, but simply as representations of category membership, which could equally well be expressed as labelled bracketings. In these approaches, only the terminal features of (2) retain autosegmental properties, so that what autosegmental structure is left has the appearance of what Hayes (1990) endearingly refers to as a 'bottlebrush', in which all terminal features are directly associated to the prosodic tier (7).

Hierarchical groupings of features can be superimposed on bottlebrush structures as in (8).

We should ask: what does the autosegmental view of feature structure commit us to that the bottlebrush view, augmented with feature groupings as in (8), does not? The primary difference is given as assumption (5), that class nodes can be shared by two or more segments. Class nodes cannot be shared in bottlebrush representations. Section 3 discusses evidence for class node sharing.

Second, class nodes do not encode timing information in an augmented Bottlebrush structure like (8), whereas they do under the autosegmental view. Association lines between autosegmental tiers have been taken to encode temporal overlap or simultaneity of the associated nodes (cf. Sagey 1988, Hammond 1988 for discussion). In autosegmental representations like (2), the temporal relationship between the skeletal tier and terminal features is mediated by temporal relations between class nodes. In representations like (8), on the other hand, temporal coordination of terminal features with the skeleton is unmediated.
A consequence of (8) is that subgroups of features are not expected to be temporally coordinated: for example, if different subgroups of the features of a segment are realized at different times, the phonological representations do not predict that these subgroupings will correspond to class nodes in the feature hierarchy.

1.3 An Asymmetry in Tier Content

In presenting the received autosegmental view of feature hierarchies, so far I have been highlighting the parallels which this approach claims to hold between feature structure and the classic autosegmental domain, tone. I have concentrated on the interpretation of the association lines between autosegmental tiers. When we look at standard assumptions about the content of these autosegmental tiers, however, we find striking differences between what is assumed about tonal tiers and what is assumed about the tiers of feature structure.

A typical assumption about the tonal tier is that it consists of sequences of units representing high and low tones. These may be associated to one or many skeletal units, or to none at all (9).

The Obligatory Contour Principle (10) imposes restrictions on autosegmental tiers containing nodes of more than one type. Its requirement that adjacent nodes be different amounts to the prediction that if, say, two adjacent syllables bear high tones, they are in fact realizing the same doubly linked unit on the tonal tier.

10. Obligatory Contour Principle (Goldsmith 1976)
    Adjacent identical elements are prohibited.

In the formulation given in (10), the OCP is irrelevant to most of the tiers in an autosegmental feature hierarchy, because they only contain one type of unit. Only the terminal tiers representing + and - values of binary features have the content necessary for a condition like the OCP to apply. On the tiers containing just one node type, like [place] or [coronal], the OCP is either uniformly violated or inapplicable.
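The effect of (10) on a tonal tier can be illustrated with a minimal Python sketch. It is an illustration only, under assumptions of my own: the tier is modelled as a list of values, and the names `violates_ocp` and `fuse_adjacent` are hypothetical helpers, not part of any published formalism. The sketch checks a tier for adjacent identical elements and builds the OCP-respecting alternative, in which adjacent identical tones are one multiply linked unit.

```python
from typing import List, Tuple

# A tier unit is (value, indices of the skeletal slots it is linked to).
Tier = List[Tuple[str, List[int]]]


def violates_ocp(values: List[str]) -> bool:
    """Return True if two adjacent elements on the tier are identical."""
    return any(a == b for a, b in zip(values, values[1:]))


def fuse_adjacent(surface_tones: List[str]) -> Tier:
    """Build a tier in which adjacent identical tones are a single unit
    linked to several skeletal slots (one tone per slot on input)."""
    tier: Tier = []
    for slot, tone in enumerate(surface_tones):
        if tier and tier[-1][0] == tone:
            tier[-1][1].append(slot)    # extend the links of the existing unit
        else:
            tier.append((tone, [slot]))  # start a new unit on the tier
    return tier
```

On this modelling, a sequence of two high-toned syllables followed by a low-toned one yields one doubly linked H unit followed by L, while a tier whose units are all of one type, such as a sequence of [place] nodes, fails the check whenever it contains more than one unit.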
Nevertheless, phonologists have found evidence for OCP-like effects in feature structure at terminal and non-terminal levels. McCarthy (1988) cites restrictions on Ponapean consonant clusters (Itô 1986) as an OCP effect on [place], and suggests that co-occurrence restrictions on consonants in root morphemes of Proto-Indo-European be analyzed as an OCP effect on [laryngeal]. By attributing these phenomena to the effects of the OCP on the [place] tier, it is implicitly assumed that the content of the [place] tier is more like the multi-valued content of the tonal tier than a series of identical units. In other words, it is assumed that it is possible to refer to variables picking out values of place, i.e. [αPlace], just as it is possible to refer to variables picking out values on terminal tiers, such as [αback]. By allowing well-formedness conditions to refer to values of both terminal and non-terminal nodes, all tiers are taken to have non-uniform content, like the tonal tier.

Meanwhile, theories of phonological transformations have typically assumed a distinction between the tonal tier and terminal feature tiers on the one hand, where rules may refer to values, and class nodes on the other hand, where rules may not refer to values. If we eliminate this asymmetry, and allow rules to refer to variables at any level in feature hierarchies, we can account for some harmony processes which are problematic under standard assumptions about possible harmony rules, as I show in section 2. This modification also allows us to handle in an autosegmental framework phenomena which Halle argues require a Bottlebrush-style analysis. A harmony rule referring to values of a non-terminal tier can be expressed as in (11).

By writing rules referring to value variables like [αN] in (11) we allow two kinds of harmony processes which are unexpected under theories which do not allow reference to values at all levels.
Marked Values

First, we provide a new way of referring to the structurally more complex members of a natural class of features. For example, some coronal consonants are assumed to have extra structure below the coronal tier, whereas plain coronals are often assumed to have a [coronal] node with no dependent structure. Rules which pick out only those coronals with structure below the coronal tier may be construed as rules referring to [αCoronal]. In theories which do not allow reference to values at all levels, though, the only way to pick out the structurally more complex members of a natural class of segments is either (i) to assume underspecification of the less complex segments (for the case of the coronals, this would mean that plain coronals would lack a [coronal] node); or (ii) to allow rules to simply ignore segments which meet the structural description of the rule but are unmarked/structurally simple (Calabrese 1993; Carnie, this volume). In section 2.1 I present an example from Sanskrit which I claim requires the variable spreading account.

Multiple Values

In cases where a tier has more than one dependent in the same segment, the variable in a rule like (11) can range over more than one value, with the effect that a harmony rule will be implemented as more than one spreading operation, as illustrated in (12).

This implementation of single harmony rules as multiple operations entails weakening the Constituent Spreading hypothesis in (6) above, so that (6a) still holds, and rules still refer to single nodes, but (6b) is dropped, allowing rules to be implemented as multiple operations. When rules refer to value variables, we make different predictions about rule blocking from Constituent Spreading approaches. The variable spreading approach predicts that spreading of each value must be blocked individually and, therefore, that harmony rules may be partially blocked. I present a case from Scots Gaelic which motivates this assumption in section 2.2.
The rest of the paper is organized as follows: section 2 presents evidence for harmony rules which refer to value variables, and argues that this approach can accommodate one class of Halle's objections to the autosegmental view of feature hierarchies. Section 3 addresses Halle's second objection to autosegmental feature hierarchies, the claim that they are unnecessary, and argues that a variety of assimilation and dissimilation processes motivate the class node sharing that autosegmental theories predict. Section 4 turns to Hayes' arguments that autosegmental feature hierarchies are inadequate for accounting for diphthongization phenomena. I argue that the range of attested diphthongizations does not justify Hayes' conclusions, with the result that diphthongization does not constitute an argument against autosegmental feature structures. Section 4 includes discussions of Icelandic preaspiration and of how contour segments should be represented.

2. Spreading Values

2.1 Sanskrit Nati

Assimilation processes are typically assumed to propagate a node rightwards or leftwards until another node of the same type is encountered. Normally, the blocking node occupies the same autosegmental tier as the node which spreads, and so the classes of spreaders and blockers are identical. The well-known Nati rule of Sanskrit (Whitney 1889; Schein & Steriade 1986; Cho 1991; Halle 1993) departs from this pattern: the class of spreading segments is a subset of the class of blocking segments. This leads to difficulties which appear to have gone unnoticed in previous analyses of Nati. (13) shows the consonantal inventory of Sanskrit.

13.              Labial  Dental  Retr  Palatal  Velar  Laryngeal
    Stops        p       t       t.    c        k
                 ph      th      t.h   ch       kh
                 b       d       d     j        g
                 bh      dh      dh    jh       gh
    Nasals       m       n       n     **
    Glides       v       l       r     y
    Fricatives           s       s.    s'              h

The Nati rule spreads retroflexion from a continuant /s., r/ onto all following coronal nasals (14a-d), unless any other coronal intervenes, in which case no assimilation takes place (14e-h).

14. a. is. - na         'seek'
    b. pr - na          'fill'
    c. vrk - na         'cut up'
    d. ks.obh - ana     'quake'
    e. bhug - na        'bend'
    f. mrd - na         'be gracious'
    g. marj - ana       'wiping'
    h. ks.ved - ana     'hum'

I assume that the 3 series of coronals in Sanskrit are distinguished by differences in their sub-coronal structure. The different series are characterized by either monovalent dependents of the coronal node, such as [retroflex] or [palatal], or by differing values of binary features like [anterior] and [distributed].

The difficulty with Nati is this: the class of harmonizing segments is picked out by their specific sub-coronal structure, so the Nati rule must refer to this in its structural description. However, what characterizes the class of potential blockers of harmony is the presence of a [coronal] node.

(15) shows the formulation of Nati proposed by Schein & Steriade (1986) and Cho (1991). Since Nati assimilates two dependents of [coronal], the [coronal] node itself is assumed to spread rightwards. Therefore, propagation is blocked by any segment with a coronal node, due to the Line Crossing Constraint. This formulation successfully captures the range of blocking segments, but the way in which the class of spreading segments is picked out in (15) is problematic. (15) correctly states that [coronal] spreads only in certain contexts, but the context that it specifies is an unusual one.

Typically, the node that harmonizes, and any extra context which must be satisfied for harmony to apply, are structurally disjoint, in the sense that neither node dominates the other. The [+continuant] and [+nasal] features specified as contexts in (15) are typical in this regard. But in addition to these, part of the context, [anterior] and [-distributed], is dominated by the spreading [coronal] node. If this kind of contextual specification is available, we run the risk of giving up one of the original motivations for postulating feature hierarchies, the prediction of what sets of features are natural classes.
This is because by choosing an organizing node, e.g. [root], and a subset of its dependents, it is possible to pick out any set of features as a natural class.

If on the other hand we assume that monovalent features like [retroflex] and [palatal] distinguish the different series of coronals, it becomes straightforward to pick out the harmonizing segments /r/ and /s./. In this case, Nati spreads the [retroflex] node rightwards in continuant segments (16).

Now, however, it becomes more difficult to capture the blocking of Nati. It is certainly not the case that all coronal segments are linked to the [retroflex] tier, so blocking cannot be a result of the Line Crossing Constraint. It is not clear that a locality condition like Archangeli & Pulleyblank's (1994) Precedence Principle helps either. This constraint rules out instances of gapped configurations like (17) in general.

While this successfully prohibits assimilation in forms like (14e-h), it would also appear to prohibit well-formed instances of Nati like (14a-d), because consonant spreading across a vowel violates the Precedence Principle.

I suggest that these problems with the formulation of Nati can be avoided by using the variable spreading device introduced above. The Nati rule in (18) states that any values of continuant coronal segments are spread onto a following coronal nasal.

Let us see how this formulation manages to pick out the appropriate classes of harmonizing and blocking nodes. First, I assume that the 3 series of coronals have the representations in (19).

Plain coronals do not spread, since they lack a value of [coronal]. To see why the palatal coronals /y/ and /s'/ do not induce harmony, notice the gap marked by stars in the palatal column of the inventory in (13): Sanskrit lacks a palatal nasal. If a [palatal] feature were to spread onto a coronal nasal, it would be blocked by Inventory Preservation.
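The division of labour just described (values below [coronal] spread, while any [coronal] node counts for blocking) can be made concrete with a small Python sketch. This is a deliberately simplified illustration under my own assumptions, not a formalization of (18): segments are flat records rather than feature trees, the class `Segment`, the function `apply_nati` and the constant `NASAL_VALUES_IN_INVENTORY` are hypothetical names, and a trigger is taken to affect only the next segment bearing a [coronal] node.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    symbol: str
    coronal: bool = False
    value: Optional[str] = None   # dependent of [coronal]: "retroflex", "palatal" or None
    continuant: bool = False
    nasal: bool = False


# Assumption standing in for Inventory Preservation: the inventory has
# plain and retroflex coronal nasals, but no palatal nasal.
NASAL_VALUES_IN_INVENTORY = {None, "retroflex"}


def apply_nati(word: List[Segment]) -> List[Segment]:
    out = list(word)
    for i, trigger in enumerate(out):
        # Triggers: continuant coronals that have a value below [coronal].
        if not (trigger.coronal and trigger.continuant and trigger.value):
            continue
        for j in range(i + 1, len(out)):
            target = out[j]
            if not target.coronal:
                continue              # non-coronals are invisible on the [coronal] tier
            if (target.nasal and target.value is None
                    and trigger.value in NASAL_VALUES_IN_INVENTORY):
                out[j] = replace(target, value=trigger.value)
            break                     # the first coronal met either undergoes or blocks
    return out
```

Under these assumptions, a plain coronal nasal after /s./ receives the value "retroflex" across intervening vowels and non-coronals, an intervening plain coronal such as /d/ stops the scan, a plain coronal continuant never triggers anything, and a palatal trigger leaves the nasal unchanged because the resulting segment is not in the assumed inventory.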
The fact that all coronals block Nati is not surprising, given that the [coronal] tier is the one specified in the rule. A locality condition scans the tier specified by the rule, preventing targeting of the nasal in sequences like (20), since it is not adjacent to the trigger on the specified [coronal] tier.

For Sanskrit, then, I exploit the variable spreading format to claim that spreading applies from the tier below [coronal] in Nati, whereas blocking applies on the [coronal] tier itself. The next section tries to show how the innovations introduced in this analysis of Nati help to fend off some of Halle's arguments against the autosegmental view of feature hierarchies.

2.2 Barra Vowel Copy & Terminal Feature Spreading

By allowing variable spreading rules such as the one I have proposed for Sanskrit Nati, I am giving up the strong version of the Constituent Spreading hypothesis in (6a-b) above. Any constituent of a feature hierarchy may be spread by a rule, as in Clements (1985), but I have dropped the assumption that rules may spread only single constituents. The weaker requirement that I impose is that when a set of nodes spreads, the nodes are exhaustively and immediately dominated by a single node in the feature hierarchy. In this section I compare this approach with an approach to the interpretation of feature structures recently proposed by Halle (1993) (see also Halle & Vaux 1994). Halle's alternative takes a very different view of what feature structures represent, but allows for very similar analyses of a number of phenomena.

Halle (1993): Autosegmental hierarchies are (i) unnecessary, (ii) sometimes inadequate.

Halle (1993) assumes that organizing nodes in feature hierarchies do nothing more than represent groupings of features; they are not elements on autosegmental tiers. Individual terminal features, and these alone, exhibit autosegmental properties like multiple linking in harmony processes.
Halle's view appears closer to Clements' proposal than it actually is, because Halle retains feature tree representations. However, he assumes that the trees only represent category membership information. Although not explicitly formulated as such, this view amounts to a version of the Bottlebrush theory (cf. (7) & (8) above), in which each feature is directly associated to the root. Nevertheless, Halle assumes that harmony rules may be stated as we have become accustomed to, and may refer to organizing nodes, such as [place]. The only difference is in the implementation of rules: when a rule states that an organizing node must be spread, this is actually implemented as the spreading of all terminal nodes dominated by that organizing node, as in the complete vowel copy rule in (21).

This is a stronger modification of the Constituent Spreading hypothesis than I have proposed. Whereas my variable spreading rules require that sets of spreading nodes be exhaustively and immediately dominated by an organizing node, Halle merely requires that sets of spreading nodes be exhaustively dominated by an organizing node; there is no immediate domination requirement. In addition, non-terminal constituents cannot spread for Halle.

One of the clearest predictions of Halle's approach is that, since harmony takes place on a feature-by-feature basis, the blocking of harmony processes can only occur in a similar feature-by-feature fashion. So, for example, the vowel feature [+high] should never be blocked from spreading by a segment which lacks specification for [αhigh]. The fact that a pair of segments share an organizing node should never have a bearing on how they interact in harmony processes: only terminal features are relevant for blocking. On the other hand, the view I have proposed does allow segments to share organizing nodes.
In the next few sections I present some cases which motivate the terminal feature spreading assumption and show that they are equally consistent with the variable spreading view I have suggested. I also present examples of assimilation and dissimilation which seem to require analyses involving organizing node sharing.

Halle's approach and my modified version of the autosegmental view both allow a single harmony process to be implemented as a number of distinct association operations, of which each may be blocked individually, with the result that partial harmony effects are predicted to arise. The paradigm example of this is the vowel copy process found in the Barra dialect of Scots Gaelic (Borgstrøm 1937, 1940; Clements 1986; Sagey 1987; Halle 1993). The inventories in (22-23) show that among the features distinguished by segments in Barra, [back] is distinctive for both vowels and consonants. This leads to an interesting interaction when it comes to determining the quality of an epenthetic vowel. Epenthetic vowels are identical in quality to the preceding vowel (24), unless the intervening consonant conflicts in backness, in which case the vowel has the backness value of the intervening consonant (25).

22. i u e o a
    + + + - - - - - - High
    - - - + + + Low
    - + + - + + - + + Back
    - + - + - + Round

23. Plain Palatal ([-back]) p t k - c& k b f g - j& g f s - s& v - j m n, N - -, N r, R r, - (l), L l, L

24. a. /u/ duNux\ 'Duncan'
    b. // n\s 'Angus'
    c. // ms&ir 'time'

25. a. u i bulik 'bellows (gen. sg.)'
    b. i dri 'fishing line (gen. sg.)'
    c. a s&rav 'bitter'

As Sagey (1987) points out, the fact that all vowel features are copied to the epenthetic vowel in (24) would normally lead us to assume a rule spreading a node which dominates all vowel features, e.g. [place] or [V-Place]. But when a conflicting value of [back] intervenes between the harmonizing vowels, as in (25), a Constituent Spreading analysis does not predict any vowel features to harmonize.
The fact that only [back] is blocked is expected, though, if vowel copy is implemented as multiple operations, due either to a variable spreading rule (26a) or to a terminal spreading interpretation of feature hierarchies, as in (26b).

We should counter a couple of alternative analyses, which could claim that the Barra facts are consistent with maintaining a strict version of the constituent spreading hypothesis. The first possible objection is that feature-by-feature blocking of harmony is an illusion: the one feature which is claimed to be blocked is [back], which is in fact always supplied by the intervening consonant. We need to find examples where the intervening consonant cannot supply a value for [back]. (24c) shows that backness values can in principle spread from the preceding vowel: /m/ is non-contrastive for backness, so the [-back] feature on the epenthetic vowel in (24c) must be due to the preceding vowel. This supports the claim that the feature is actually blocked from spreading when a non-labial consonant intervenes.

Another possible reanalysis of the Barra facts consistent with Constituent Spreading would be to claim that when [back] is blocked from spreading, the remaining features [round], [low], and [high] do still spread as a constituent. This requires a revision of the feature hierarchy in (2) such that [round] and the height features form a constituent to the exclusion of [back], as in (27).

In this case Barra vowel copy might be explained as spreading a node [V-Place] which dominates all 4 features when no Line Crossing Constraint violation results, and spreading the lower node [HLR] dominating [high], [low], and [round] otherwise. This approach makes very strong predictions about the range of partial vowel harmony processes to be found across languages: the structure in (27) predicts that whenever [back] and at least one other vowel feature harmonizes, all other vowel features must harmonize.
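The multiple-operation view of Barra vowel copy discussed above can be sketched in Python as follows. The sketch is only an illustration under stated assumptions: vowels are flat dictionaries of the four features [high], [low], [back] and [round] rather than feature trees, the function name `copy_vowel` and the `consonant` argument are hypothetical, and the feature bundles used to exercise it are abstract rather than transcriptions of the forms in (24-25).

```python
from typing import Dict

VOWEL_FEATURES = ("high", "low", "back", "round")


def copy_vowel(source: Dict[str, str], consonant: Dict[str, str]) -> Dict[str, str]:
    """Return the features of the epenthetic vowel.

    Each vowel feature spreads as a separate operation. A feature for which
    the intervening consonant is itself specified is blocked individually,
    and the consonant's value surfaces on the epenthetic vowel instead.
    """
    result: Dict[str, str] = {}
    for feature in VOWEL_FEATURES:
        if feature in consonant:
            result[feature] = consonant[feature]   # spreading of this feature is blocked
        else:
            result[feature] = source[feature]      # this feature spreads from the vowel
    return result
```

With a consonant that carries no [back] specification (passed as an empty dictionary, as assumed here for a consonant like /m/), the epenthetic vowel is a full copy of the preceding vowel. With a consonant specified [-back] after a [+back] vowel, only [back] differs, which is the partial blocking pattern that a single constituent-spreading operation does not produce.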
On the other hand, the analysis of Barra which claims that each feature spreads individually predicts that other subsets of vowel features may be assimilated, provided that appropriate blockers are present. Evidence to distinguish these two positions is readily available. Odden (1991) uses data from Wikchamni (Gamble 1978; Archangeli 1985) as evidence for a node [Back-Round] in the feature hierarchy which exhaustively dominates [back] and [round]. Wikchamni has the 5-vowel system in (28). This language harmonizes [back] and [round] from left to right between vowels in consecutive syllables, provided that they have identical values for the feature [high].

28.        i u o a
    high   + + + - -
    back   - + + + +
    round  - - + + -

29. a. pins&i → pins&i 'stung'
    b. huts&i → huts&u 'knew'
    c. tiss&i → t֕ss& 'made'
    d. thans&i → thans&i 'went'
    e. tawthat → tawthat 'might run'
    f. toyxat → toyxot 'might doctor'
    g. hukyat → hukyat 'might mix'

Halle (1993) shows that the spreading of [back] and [round] alone can be readily analyzed in his terminal-feature spreading framework, thereby maintaining a feature hierarchy in which vowel features are dominated by the articulator nodes [labial] and [dorsal]. He assumes that height features fuse if they are identical (following Cole & Trigo 1988), so that the height harmony required for back-round harmony amounts to a requirement that the harmonizing vowels must share a [high] node. Then a rule harmonizing [place] (i.e. full vowel harmony) has the effect of spreading [back] and [round], since values of [high] are already identical.

More important than the specifics of the analysis, however, is the fact that Wikchamni shows precisely what a Constituent Spreading analysis of Barra predicts to be impossible: it spreads [back] together with another vowel feature, but not all other vowel features.
The fact that it is impossible to analyze vowel harmony in both Barra and Wikchamni as single processes in the Constituent Spreading framework constitutes a strong argument in favour of the modification of assumptions (6a-b) above, as Halle and I propose.

Halle's conclusion from Barra vowel copy is that the constituent spreading hypothesis must be abandoned in favour of a Bottlebrush approach in which only terminal features spread. I have tried to show that Barra vowel copy and Sanskrit Nati can be handled by a theory which drops the strict constituent spreading hypothesis, but which maintains an autosegmental view of feature hierarchies.

Halle (1993) also presents terminal feature spreading analyses of a wide variety of examples used by Odden (1991) to argue for perceptually motivated subgroupings of vowel features. Halle claims that Odden's arguments for the partial feature hierarchy in (30) do not go through once a terminal feature spreading approach is adopted.

Although I cannot give a full treatment of all of Odden's examples here, it should be pointed out that if Odden's vowel geometry in (30) is correct, my variable-spreading analysis of Barra vowel copy is impossible, since my account relies on the assumption that there is a node which immediately dominates all terminal vowel features. My analysis therefore becomes untenable if there are subgroupings of vowel features in the feature hierarchy. Fortunately, Halle's reanalyses of Odden's examples using terminal feature spreading can be readily reinterpreted as variable spreading analyses, so it does not seem necessary to introduce sub-groupings of vowel features.

This section has focussed on cases which Halle (1993) claims can be analyzed in his version of the Bottlebrush theory, but which present difficulties for autosegmental approaches. I have tried to show how the autosegmental approach can be modified to handle these problems.
In the next section I turn to Halle's second, implicit, argument for the Bottlebrush theory: there are no phenomena which demand that class nodes in feature hierarchies be treated as autosegmental nodes.

3. Class Node Sharing

3.1 On Some Apparently Disjunctive Rules

Sagey (1987) recognized the difficulty of the Barra vowel copy facts for theories which assume Constituent Spreading. Sagey's analysis is essentially the same as Halle's, but she stops short of concluding that terminal feature spreading is how assimilation is implemented in all cases, due to other examples which seem to demand an analysis in terms of Constituent Spreading.

Ainu

Sagey presents vowel harmony in Ainu (Itô 1984) as a minimal contrast with Barra vowel copy. In this language the transitivizing verbal suffix is usually a copy of the root-final vowel (32). Vowel copy crosses any consonant unimpeded, with the exception of the glides /w, y/, in which case the suffix vowel is always /e/ (33). (31) shows the phonemic inventory of Ainu, taken from Maddieson (1984).

31. Consonants:  p t c* k h
                 s
                 m n
                 rr
    Vowels:      i u e o a
          high   + + - - -
          low    - - - - +
          back   - + - +

32. a. mak-a 'to open'
    b. ker-e 'to touch'
    c. pis-i 'to ask'
    d. pop-o 'to boil'
    e. tus-u 'to shake'

33. a. ray-e 'to kill'
    b. chaw-e 'to solve'
    c. hew-e 'to slant'
    d. piw-e 'to cause to run'
    e. poy-e 'to mix'
    f. huy-e 'to observe'

Sagey's analysis of the blocking effect of glides runs as follows. Vowel copy must spread the features [high], [low], and [back]; all of these features are blocked from spreading by the presence of intervening /w, y/; but since the glides do not have all features of vowels, a terminal feature spreading analysis like the one given to Barra vowel copy incorrectly predicts that glides only partially block vowel copy in Ainu. Sagey therefore concludes that Ainu vowel copy must be analyzed as spreading of the constituent dominating all vowel features ([2-Place]), in order to exclude the possibility of partial blocking.
Recall that Halle's approach only predicts full blocking if there is an intervening segment which matches the spread nodes feature-by-feature. Thus, Halle must assume that Ainu glides are marked for [high], [low] and [back]. This does not follow from any version of the distinctive specification approach to feature marking which Halle assumes, since the two glides contrast only in backness. Therefore he necessarily assumes that the glides are positional variants of vowels, which ensures that they block spreading of all vowel features.

Even if Halle's conjecture about the underlying forms of glides in Ainu is correct, there is still reason to prefer the constituent spreading analysis over one in which features are spread individually. The argument is based on the analysis of the additional forms in (34). All of the suffixes in (34) are [+high], and bear the opposite value of [back] to the root vowel. Itô argues that whereas the suffixes in (32-33) are underlyingly floating Vs with no features, the suffixes in (34) are underlyingly specified [+high]. In these cases the rule in (36) determines the backness value of the suffix:

34. a. hum-i 'to chop up'
    b. pok-i 'to lower'
    c. pir-u 'to wipe'
    d. ket-u 'to rub'

35. a. kar-i 'to rotate'
    b. ram-u 'to think'

36. Melodic Dissimilation Rule (MDR) (Itô 1984, 508)
    [+high] → [-αback] / [αback] ____

At first glance, we would not want this rule to apply to all representations, as it would exclude all of the cases of full vowel harmony in (32). Itô assumes that the rule applies unless vowel identity results from autosegmental spreading of a single melodic unit (1984, 512). In terms of more recent approaches to segmental structure we can say that dissimilation occurs unless the two vowels share all of their features: in this case the environment of the rule is not met, since there is no sequence of two nodes linked to [αback] which could be subject to the MDR.
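The interplay between vowel copy and the MDR can be sketched in Python. This is a minimal illustration, assuming the [back] values of the vowel chart in (31), where /a/ carries no [back] value; the function names `floating_suffix` and `high_suffix`, and the choice to return `None` when the rule's environment is not met, are my own hypothetical devices rather than part of the analyses discussed.

```python
from typing import Optional

# [back] values as in the vowel chart in (31); /a/ has no [back] value there.
BACK = {"i": "-", "u": "+", "e": "-", "o": "+"}
HIGH_VOWEL_BY_BACK = {"-": "i", "+": "u"}


def floating_suffix(root_vowel: str, glide_intervenes: bool = False) -> str:
    """Featureless suffix V: copies the root vowel by sharing its features,
    so no sequence of two [back]-bearing nodes arises and the MDR is not
    triggered. After a glide the suffix is always /e/, as in (33)."""
    return "e" if glide_intervenes else root_vowel


def high_suffix(root_vowel: str) -> Optional[str]:
    """Suffix specified [+high]: the MDR assigns it the opposite value of
    [back] to the root vowel. Returns None when the root vowel has no
    [back] value, in which case the rule's environment is not met."""
    alpha = BACK.get(root_vowel)
    if alpha is None:
        return None
    opposite = "+" if alpha == "-" else "-"
    return HIGH_VOWEL_BY_BACK[opposite]
```

Applied to the root vowels of (34), `high_suffix` yields /i/ after /u, o/ and /u/ after /i, e/, while `floating_suffix` reproduces the full copies in (32) and the invariant /e/ of (33).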
In terms of hierarchical representations of features, the MDR is an OCP constraint requiring contrasting values of [back] when adjacent nodes immediately dominate [back]. When two segments share a node on the tier immediately dominating [back] (e.g. [V-Place] or [Dorsal]) the MDR does not apply. (37a) shows an environment in which the MDR applies, and (37b) shows an environment where the MDR fails to be triggered.

Notice that this analysis of where the MDR does and does not apply crucially relies on the availability of class node sharing, which is unavailable in a Bottlebrush approach like Halle's.

Ngbaka

The scenario we have just observed in Ainu, in which adjacent vowels must bear contrasting values for a particular feature unless they are identical in all features, is found in a number of other unrelated languages. In Ngbaka (Westcott 1965; Chomsky & Halle 1968) there is a constraint on the distribution of vowels in disyllabic words to the effect that the two vowels must either contrast in the value of [αhigh], or be identical in all features. The vowel inventory of Ngbaka is shown in (38).

38. [+high]         i  u
    [-high][-low]   e  o
    [-high][+low]   a

This means that, for example, /i/ cannot appear in a disyllabic word with the other [+high] vowel /u/; /o/ cannot appear with [-high] /e, , /.

Tzeltal

Tzeltal (Slocum 1948; Kaufman 1971) has a number of VC suffixes in which V is either identical to the root vowel, or has the opposite value for [αback]. Tzeltal has the five-vowel system in (39).

39. [-back]  [+back]
    i        u
    e        o
             a

40. -Vl - 'place where thing abounds'
    a. c&umil 'squash-patch'
    b. pahc&ul 'pineapple plants'
    c. c&enkul 'bean-patch'
    d. mayul 'tobacco plants'
    e. ic&il 'chile-patch'

41. -Vl - possessor (inanimate)
    a. swicul 'its hill'
    b. slewul 'its fat'
    c. spakul 'its cloth'
    d. slumil 'its ground'
    e. spos&il 'its medicine'
    f. sc&ayul 'its fish'
    g. siul 'its firewood'
    h. steel 'its wood'
    i.
stiil 'its edge'

Guere

Finally, Guere (Paradis & Prunet 1989) has a restriction on the height of adjacent vowels, which is a slightly more complicated version of what we have already seen in Ngbaka. Ngbaka restricts the appearance of sequences of [αhigh][αhigh] vowels to instances of complete identity. In Guere, there are no sequences of [-high][-high] unless the vowels are (i) fully identical and (ii) separated by either a coronal or nothing at all. (42) shows the vowel inventory of Guere.

42. [+ATR]  [-ATR]
    i u e o a

(43-44) show that sequences of [αhigh][-αhigh] and [+high][+high] are permissible.

43. High, nonhigh
    a. ko 'rat'
    b. nm 'bird'
    c. kla 'hand'
    d. uo 'round'
    Nonhigh, high
    e. ai 'robe'
    f. gw 'burn!'
    g. gbau 'box'
    h. jreu 'monkey'

44. High, high
    a. nimi 'animal'
    b. zg 'chameleon'
    c. duu 'chest'
    d. bui 'ashes'
    Nonhigh, nonhigh
    e. ----

(45-46) show how [-high][-high] sequences are avoided: when a [+high] object pronoun suffix attaches to a stem ending in a [-high] vowel, the stem vowel remains constant (45). However, when a [-high] object pronoun suffix is attached to a stem ending in [-high], the stem vowel changes from [-high] to [+high] (46).

45. a. n + n 'stick it!' (*n)
    b. gblee + gble 'welcome it!' (*gbl)
    c. w(l) + w(l) 'wash it!' (*w(l))

46. a. n + n 'stick it!' (*n)
    b. gblee + gbl 'welcome it!' (*gble)
    c. w(l) + w(l) 'wash it!' (*w(l))

Paradis & Prunet assume that the alternations in (45-46) are due to an OCP constraint which rules out adjacent [-high] features. The exceptions to the height constraint are shown in (47-48): they occur when the two [-high] vowels are identical in all features, and either string-adjacent (47) or separated by a coronal (48).

47. a. baa 'manioc'
    b. y 'to dry'
    c. 'week'

48. a. w 'to wash'
    b. bee 'to hang'
    c. s 'to lose weight'

The apparently disjunctive conditions on vowel features in Ainu, Ngbaka, Tzeltal and Guere can all be straightforwardly expressed in a theory which assumes that class nodes occupy autosegmental tiers.
In each language, there is a ban on adjacent identical values of a given feature, but the ban can be evaded if two vowels share a class node like [V-Place], because in this situation the two vowels have only one set of terminal vowel features. (50) illustrates this using the ban on [αhigh][αhigh] in Ngbaka. The other languages work in essentially the same way.

49. Ainu      *[αback][αback]   except under identity
    Tzeltal   *[αback][αback]   except under identity
    Ngbaka    *[αhigh][αhigh]   except under identity
    Guere     *[-high][-high]   except under identity

These apparently disjunctive conditions are not expected given just the formal properties of a bottlebrush view of feature structures, because only terminal features can be shared under this approach. This does not automatically favour the autosegmental view of feature hierarchies, though. We must consider the possibility of accounts of the disjunctive rules which do not rely on class-node sharing. It would be particularly interesting if the kinds of facts discussed in this section could be shown to follow from independent factors, so that they do not need to be captured by the properties of the formal representations chosen. Steriade (p.c.) suggests just such a possibility, whereby the facts of Ngbaka, Ainu, Guere and Tzeltal might follow from phonetic-perceptual factors.

Steriade suggests that the generalizations we have reviewed should be viewed rather differently. Let us take Ngbaka as an example: as I presented the facts, adjacent vowels must contrast in height, unless they can escape this requirement by being identical and linked at a non-terminal level. Steriade proposes to express the generalization in almost the opposite fashion: adjacent vowels cannot contrast in features other than height (e.g. [back]) unless they also contrast in height.
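For the simple [αhigh] case, the two statements of the generalization pick out the same set of vowel pairs, which a small enumeration makes explicit. The Python sketch below is illustrative only: the five-vowel feature table is a simplified, hypothetical inventory (it is not the full Ngbaka system in (38)), the [back] value given to /a/ is an arbitrary assumption, and the function names are mine.

```python
from itertools import product

# Hypothetical, simplified inventory for illustration only; this is NOT the
# full Ngbaka system. The [back] value of /a/ is an arbitrary assumption.
FEATURES = {
    "i": {"high": "+", "low": "-", "back": "-"},
    "u": {"high": "+", "low": "-", "back": "+"},
    "e": {"high": "-", "low": "-", "back": "-"},
    "o": {"high": "-", "low": "-", "back": "+"},
    "a": {"high": "-", "low": "+", "back": "+"},
}


def height_contrast(v1: str, v2: str) -> bool:
    """True if the two vowels differ in [high]."""
    return FEATURES[v1]["high"] != FEATURES[v2]["high"]


def ocp_with_identity_escape(v1: str, v2: str) -> bool:
    """*[ahigh][ahigh] unless the two vowels are identical in all features."""
    return height_contrast(v1, v2) or FEATURES[v1] == FEATURES[v2]


def contrast_licensing(v1: str, v2: str) -> bool:
    """No contrast in a non-height feature unless [high] also contrasts."""
    other_contrast = any(
        FEATURES[v1][f] != FEATURES[v2][f] for f in ("low", "back")
    )
    return height_contrast(v1, v2) or not other_contrast


PAIRS = list(product(FEATURES, repeat=2))
allowed_ocp = {p for p in PAIRS if ocp_with_identity_escape(*p)}
allowed_licensing = {p for p in PAIRS if contrast_licensing(*p)}
banned = sorted(set(PAIRS) - allowed_ocp)

print(allowed_ocp == allowed_licensing)  # do the two statements agree?
print(banned)                            # pairs excluded by both
```

Since the two formulations are extensionally equivalent here, any choice between them has to rest on cases that go beyond this simple pattern and on the perceptual grounding claimed for the licensing view.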
Steriade suggests that this licensing of contrasts is plausibly grounded in perceptual facts: the presence of a height contrast between vowels, which is easily perceived, might improve the likelihood of perceiving other contrasts between the two vowels. (51a) is a schematic plot of how contrast in height (F1) between two vowels might aid the perception of contrast in backness (F2). (51b) illustrates how a contrast licensing rule could be expressed as a well-formedness condition on feature structures.

This amounts to an extension of the domain of what Steriade has called Positional Neutralization (Steriade 1993b, 1994) from feature identification to feature discrimination. With respect to the identification of features, Steriade has argued that the correct identification of features by the listener is made easier in two ways. First, contrastive features may be neutralized in positions in which they are difficult to perceive; second, contrastive features which are hard to perceive may be extended in duration (assimilation). Extending the spirit of this proposal to the discrimination of two segments, Steriade suggests that contrasts between two segments which are hard to perceive are only permitted in environments which aid their correct discrimination.

The two perspectives on these facts are quite different. The view I presented focusses on the prohibition of adjacent non-contrasting features, whereas Steriade's view emphasizes the role of some contrasts in making other contrasts more salient. The identical vowels which I claim escape the OCP constraint are for Steriade precisely the vowels which are subject to Contrast Licensing constraints like (51b).

Steriade's approach has the attraction of being both restrictive and phonetically grounded, but it faces a couple of difficulties. First, Steriade's approach predicts that only contrasts license other contrasts.
We do not expect that contrast in [back], [round] or [ATR] should depend on sequences with either contrasting height or adjacent [+high][+high], which is what we find in Guere. Under the OCP view suggested here, which emphasizes the licensing of identity, the Guere situation, in which only adjacent [-high] segments prevent contrasts in other features, is far less surprising: it is just a more specific variant of the constraint in (50), in which the OCP only applies to [high].

Second, Steriade's view makes extremely strong predictions about the connection between phonological contrasts and the perception of contrasts. Classic findings about categorical perception show that listeners' ability to perceive contrasts between consonants closely matches the phonological significance of the contrast: for example, 5 ms differences in voice onset time (VOT) between stops are very much more likely to be perceived when the 5 ms spans a category boundary. Eimas (1963) found that vowel identification is very similar to consonant identification insofar as vowels with small acoustic differences are fairly easily mapped onto different categories. However, vowel discrimination is rather different from consonant discrimination: acoustic differences between vowels can be equally well discriminated whether they span a category boundary or not, whereas discrimination among consonants within a category is extremely difficult.

Recall that Steriade's hypothesis is that discrimination of a contrast along one dimension is aided by the additional presence of a category boundary along a second dimension. Since Ngbaka distinguishes three levels of vowel height, and only the difference between the highest vowels and all other heights is relevant to allowing contrast in [back] and [round], it would have to be the category boundary for [high], rather than direct acoustic differences, which licenses other contrasts.
However, suppose we assume, based on Eimas' findings, that just as category boundaries do not enhance the perception of contrasts along their own acoustic dimension, they also have no special status in facilitating the discrimination of contrasts on another dimension. Then we expect the low-mid contrast (both [-high]) in Ngbaka to have the same facilitatory effect on [back] or [round] discrimination as the mid-high contrast (across the [high] boundary), since the acoustic contrast is equivalent. This is, however, not the case.

3.2 Consonantal Transparency Effects

Assimilation processes have guided our thinking about feature hierarchies in two different ways. The bulk of research has focussed on the problem of classifying the groups of features that are affected together in assimilation processes. These groups are used to define the organizing nodes in feature hierarchies. Any results of this line of inquiry are equally applicable to an autosegmental theory of feature hierarchies or to a bottlebrush theory incorporating an independent notion of hierarchical grouping of features, such as Halle's or Hayes'.

Another kind of work on feature hierarchies focusses on what kinds of long-distance assimilation rules exist, and what kinds of elements they are blocked by. Examining structural similarities between spreading and blocking segments provides clues to what nodes are being assimilated. This kind of approach is more contingent on the specific assumptions of autosegmental feature hierarchy theories. These approaches assume that since autosegmental spreading can take place on non-terminal planes, it is sufficient for a pair of segments to have a class node in common for one of them to block spreading from the other. Only where blocking can be reduced to blocking by terminal features can this kind of effect be accommodated within a bottlebrush theory. This section shows cases where non-terminal nodes need to be invoked as blockers of assimilation.
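The logic of blocking on a non-terminal tier can be made concrete with a toy model. The Python sketch below is an illustration under stated assumptions, not an implementation of any cited proposal: segments are reduced to a label plus the set of tiers on which they bear a node, the tier names and segment specifications are hypothetical, and blocking is modelled simply as the presence of an intervening node on the spreading tier.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Segment:
    """A segment reduced to the tiers on which it bears a node (hypothetical)."""
    label: str
    tiers: Set[str] = field(default_factory=set)


def can_spread(segments: List[Segment], source: int, target: int, tier: str) -> bool:
    """Spreading on `tier` from source to target succeeds only if no
    intervening segment has a node on that tier (no line crossing)."""
    lo, hi = sorted((source, target))
    return all(tier not in seg.tiers for seg in segments[lo + 1:hi])


# Hypothetical specifications: the vowel has a Place class node dominating
# terminal features; one consonant has a Place node but none of the vowel's
# terminal features; the other consonant lacks a Place node altogether.
vowel = Segment("V1", {"Place", "high", "back"})
opaque_c = Segment("C-with-Place", {"Place", "labial"})
transparent_c = Segment("C-without-Place", set())
empty_v = Segment("V2", set())

word_opaque = [vowel, opaque_c, empty_v]
word_transparent = [vowel, transparent_c, empty_v]

# Class-node spreading: blocked by the consonant that shares the Place tier.
print(can_spread(word_opaque, 0, 2, "Place"))
print(can_spread(word_transparent, 0, 2, "Place"))

# Terminal-feature spreading: the same consonant does not block [high],
# because it bears no node on that terminal tier.
print(can_spread(word_opaque, 0, 2, "high"))
```

In the toy model the opaque consonant blocks only when the spreading node is the shared class node; if spreading is restricted to terminal features, nothing distinguishes it from the transparent consonant.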
3.2.1 Coronal Transparency

First let us review some examples from Fula studied by Paradis & Prunet (1989) which indicate that vowels may spread across coronals, but not across labials and velars. Paradis & Prunet have taken such facts to argue for deep underspecification of coronals, in contrast to labials and velars.

As in Guere, where vowels may be fused across coronal consonants but not across consonants with other places of articulation, Fula has vowel harmony processes which apply only across coronals. There are voice and aspect morphemes which are suffixed as empty X-slots, and derive their quality from the preceding vowel, provided the intervening consonant is coronal (52).

52. Imperfect Paradigm
    a. at - x    ata      Active voice
    b. ot - x    oto(o)   Middle voice
    c. et - x    ete(e)   Passive voice

In the pronominal system, /a/ (and sometimes /e/) assimilates to the following vowel across the coronals // or /n/ (53).

53. Pronouns
    a. a - en    een    'we (incl.)'
    b. a - on    oon    'you (pl.)'
    c. on - en   onon   'you (independent pl.)'

Futankoore Fula has two epenthetic vowels, /i/ and /u/, which are used to break up branching onsets and codas. /i/ and /u/ can be inserted in different, but overlapping, contexts. Normally, when the environment for epenthesis is met and either /i/ or /u/ is possible, /u/ is preferred. However, when the underlying form contains a cluster which requires insertion of two epenthetic vowels, as in (54), and the choice is between inserting /i/ on both sides of a coronal or inserting /i/ and /u/, the normal preference for /u/ insertion is overridden and a second /i/ is inserted. Paradis & Prunet suggest that this surprising reversal of preference arises because the two /i/s seen in the surface form are the result of just one instance of epenthesis, followed by spreading to the position of the second /i/.

54. Epenthetic vowels
    a.
/utt - - t/   uttiit

Notice that the coronal transparency effect in Fula involves high, low, and mid vowels, and both stop and liquid coronals.

The point of these examples is to show that there are harmony processes which are blocked by just a subset of the consonants of a language. A terminal feature spreading approach like Halle's demands that the blocking consonants, the labials and velars, bear all the terminal features of vowels, while the consonants that are transparent to vowel harmony, the coronals, bear none of them. It is very unlikely that this is what distinguishes the two classes. In order to capture the blocking of harmony we need to make reference to non-terminal nodes which the harmonizing vowels have in common with the labials and velars.

Different explanations have been offered for the contrast between coronal and non-coronal consonants. One popular approach is to assume that coronals temporarily lack a [Place] node during derivations, so that if vowel spreading is instantiated as spreading of the [Place] node, labial and velar consonants will block harmony due to violation of the Line Crossing Constraint, but coronals will not cause similar violations, as (55) shows (from Paradis & Prunet 1990).

Another approach to the special status of coronals is to assume that there is a node in the feature hierarchy which exhaustively dominates the place features of labials, velars and vowels. Such a node, known as [Peripheral], has been suggested by Rice & Avery (1991). Given this extra structure, a process which spreads the [Peripheral] node of vowels will amount to full vowel harmony, which will be blocked by labials and velars, but not by coronals.

There is a growing literature which raises difficulties for both of these approaches.
For example, a number of people have pointed out that phenomena which treat coronals as somehow special, including transparency effects, tend to single out just a subset of coronals, while other coronals pattern with labial and velar consonants (cf. McCarthy & Taub 1992; Kaun 1993; Steriade 1993a). Kaun (1993) reports that a survey of coronal transparency effects shows that only /l, r, n/ are transparent to assimilation processes. Coronal obstruents and other continuants are no more transparent than labials and velars.

Notice, however, that these criticisms of particular analyses of the coronals do not affect the main point I am drawing from the transparency effects, namely the need to invoke blocking by non-terminal nodes. The theories of underspecification may be inadequate, or the feature hierarchies may be incorrect, but we still expect that the appropriate characterization of the blocking and transparency effects will involve autosegmental blocking at the level of non-terminal nodes.

3.2.2 Translaryngeal Transparency

Another class of examples where vowel harmony is blocked by some of the consonants of a language but not others is the translaryngeal vowel harmony found in many languages and discussed by Steriade (1987c). In these languages vowels can assimilate in all of their oral features when they are string-adjacent or separated by just /h, ʔ/. All oral consonants, including coronals, block assimilation.

One of the most interesting cases of translaryngeal harmony discussed by Steriade is Acoma (Miller 1965), in which vowels can be glottalized or devoiced. In VV sequences in this language vowel harmony only matches the oral features of the two vowels, leaving the laryngeal features free to contrast. (56) shows VV sequences in which the vowels contrast in glottalization; (57) shows VV sequences in which the vowels contrast in voicing.

56. a. kaausiustya 'he is tied up'
    b. puuukaca 'come out and look at the two of them'

57. a. ziyuuceee 'they took him'
    b.
senaaasi 'my arch'

This example shows that it would not be sufficient to claim that laryngeal consonants are simply neutral to vowel spreading. The node that spreads must be [place], so that all oral consonants, which have [place] nodes, block vowel spreading, while the glottalization and voicing features do not interfere with it.

4. Diphthongization & Contours

4.1 The diphthongization paradox

In this section we turn to a discussion of an earlier challenge to the autosegmental theory of feature structure, raised by Hayes (1990). Hayes regards diphthongization processes as insurmountable problems for the standard interpretation of feature hierarchies. His argument runs as follows. Diphthongization processes convert sequences of two identical segments (long segments) into two non-identical segments. Autosegmental theories of feature structure represent long segments as a single feature tree which is doubly linked to the skeletal tier, as in the representation of /pp/ in (58).

Since each feature is represented only once for the two timing slots, any alteration to part of the feature tree will affect both halves of the long segment. Therefore, it is predicted to be impossible for a rule to change just one half of a long segment, leaving the other half unaffected. But this is precisely what diphthongization rules do. If the characterization of diphthongization is correct, then the standard autosegmental representation of long segments is in need of revision. This is what Hayes proceeds to do.

This section reviews the diphthongization facts which led Hayes to propose an alternative theory of feature structures, and suggests that the facts do not entail the fatal consequences for autosegmental hierarchies that Hayes claims they do.

Hayes (1990): Autosegmental approaches lead to the Diphthongization Paradox, which a Bottlebrush theory can avoid.

Hayes was not the first person to worry about how to reconcile feature hierarchies with diphthongization rules.
Clements (1985) and Steriade (1987a) had already recognized exactly the concern that Hayes raises, and had suggested modifications to the feature geometry accordingly. Both Clements and Steriade attribute the diphthongization problem to the sharing of root nodes, and so suggest analyses of diphthongization rules in which root nodes are not shared. For example, Clements treats the Icelandic preaspiration rule, which converts underlying /pp/ to [hp] (59: Thráinsson 1978), as an instance where the laryngeal and supralaryngeal tiers are separately shared by each half of /pp/. This makes it possible to delete the supralaryngeal features of the first half of the segment without affecting the laryngeal features, changing /p/ to /h/ (60).

59. a. kappi [khahpI] 'hero'
    b. hattur [hahtYr] 'hat'
    c. þakka [θahka] 'thank'

Steriade (1987a) takes a more radical position than this: she abandons altogether the notion of a single root node, and proposes that there are three different root nodes, each of which supports a separate subset of features and is directly associated to the prosodic tier. By separating three classes of features as in (61), it becomes possible to capture three kinds of diphthongization processes: (a) rules affecting laryngeal features, while leaving all else intact, as in Southern Paiute, where /mm/ is realized as /mm