Order and Structure

by Colin Phillips

B.A., Modern Languages, University of Oxford, 1990

Submitted to the Department of Linguistics and Philosophy in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology, August 1996

© 1996 Colin Phillips. All rights reserved.

The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part.

Signature of Author: Department of Linguistics and Philosophy, August 31st 1996
Certified by: Alec Marantz, Professor of Linguistics, Thesis Supervisor
Accepted by: Wayne O'Neil, Professor of Linguistics, Head, Department of Linguistics and Philosophy

Order and Structure
by Colin Phillips

Submitted to the Department of Linguistics and Philosophy on August 31st, 1996 in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Linguistics.

Abstract

The aim of this thesis is to argue for the following two main points. First, that grammars of natural language construct sentences in a strictly left-to-right fashion, i.e. starting at the beginning of the sentence and ending at the end. Second, that there is no distinction between the grammar and the parser. In the area of phrase structure, I show that the left-to-right derivations forced by the principle Merge Right can account for the apparent contradictions that different tests of constituency show, and that they also provide an explanation for why the different tests yield the results that they do. Phenomena discussed include coordination, movement, ellipsis, binding, right node raising and scope. I present a preliminary account of the interface of phonology and morphology with syntax based on left-to-right derivations.
I show that this approach to morphosyntax allows for a uniform account of locality in head movement and clitic placement, explains certain directional asymmetries in phonology-syntax mismatches and head movement, and allows for a tighter connection between syntactic and phonological phrases than commonly assumed. In parsing I argue that a wide range of structural biases in ambiguity resolution can be accounted for by the single principle Branch Right, which favors building right-branching structures wherever possible. Evidence from novel and existing experimental work is presented which shows that Branch Right has broader empirical coverage than other proposed structural parsing principles. Moreover, Branch Right is not a parsing-specific principle: it is independently motivated as an economy principle of syntax in the chapters on syntax. The combination of these results from syntax and parsing makes it possible to claim that the parser and the grammar are identical. The possibility that the parser and the grammar are identical or extremely similar was explored in the early 1960s, but is widely considered to have been discredited by the end of that decade. I show that arguments against this model which were once valid no longer apply given left-to-right syntax and the view of the parser proposed here.

Thesis Supervisor: Alec Marantz
Title: Professor of Linguistics

Acknowledgments

I left this part to the very end, thinking it might be easy, but I hadn't reckoned with the enormous number of people who deserve credit for what follows, many of whom probably don't even get a mention here.

Alec Marantz has been far more than my advisor and committee chair. It goes without saying that this thesis has greatly improved as a result of many hours of discussion with Alec and by his constant written feedback. We have argued and argued about this, and it was always valuable. But Alec has been more than my thesis advisor.
We have collaborated for three years as novice neurolinguists, with Alec as our intrepid leader. Alec has been my running training partner for a number of years, and he played a large part in my other big project of this year, which was to qualify for the Boston Marathon. He has always been willing to give help and advice, ranging from more regular academic matters to feeding my cat and helping me move house, with a whole lot of other things in between. He is a good friend.

I have learned a lot about doing linguistics from David Pesetsky. He seemed to know where this thesis was headed well before I did. His infectious enthusiasm made a great difference to the enjoyment I got from working on this, and he never failed to find the fatal flaws in my logic, which led to a good many matters being clarified, in my head as much as on the page. Beyond this thesis, his co-directorship (with Ken Wexler) of the joint program in linguistics and cognitive science has had an enormous impact on what I have got out of my time at MIT.

Noam Chomsky's immense influence on this work should be clear, and my discussions with him always led me to rethink whatever I had been thinking about. Perhaps this isn't so surprising, since what I'm trying to do here is something that Noam had already done long before I was born. I only wish that I had begun to meet regularly with Noam at an earlier point.

I can say for sure that I would not have written this thesis had it not been for Ted Gibson. Three years ago, before Ted came to MIT, parsing was the last thing that I was expecting to write my dissertation about. This thesis had its beginnings in a (very late) term paper I wrote for a class of Ted's, and somehow it developed from there, with Ted as my main advisor in this area for the first year. Thankfully, none of the original paper has survived to this version. Working with Ted has been extremely rewarding, and a good deal of fun, too.
The material in Chapter 3 in particular has benefitted from countless hours of discussion with Ted. Ted is the only professor with whom I have left an appointment running, so that we would not miss the final subway of the night, at 12:15am!

Beyond my committee a number of people have provided large amounts of feedback, information and other assistance with this thesis. I am amazed at how helpful people are around here; it is a wonderful community of linguists. Heidi Harley, Jonathan Bobaljik and Andrea Zukowski all read the entire document and provided detailed comments. Danny Fox, Uli Sauerland and Carson Schütze all read sizeable chunks and returned sizeable comments with amazing speed. Many other people have directly contributed to the thesis, in the form of discussions, correspondence, judgements, criticisms, technical assistance, and miscellaneous wisdom: Carl Alphonce, Albert Alvarez, Kevin Broihier, Andrew Carnie, Paul Crook, Janet Fodor, Danny Fox, Bob Frank, Lyn Frazier, Paul Gorrell, Paul Hagstrom, Ken Hale, Koji Hoshi, Dick Hudson, Edith Kaan, Masa Koizumi, Howard Lasnik, Martha McGinnis, Don Mitchell, Shigeru Miyagawa, Renate Musan, Orin Percus, Norvin Richards, Uli Sauerland, Julie Sedivy, Mike Tanenhaus, Myriam Uribe-Echevarria, Ken Wexler, Kazuko Yatsushiro and Elron Yellin.

The experimental work reported in Section 2 of Chapter 3 was conducted in collaboration with Ted Gibson. Thanks to Neal Pearlmutter, Ariel Salomon, and San Tunstall for their invaluable assistance with these experiments. Work on this thesis was supported in part by the NSF funded program Language: Acquisition and Computation (#DIR9113607) awarded to MIT.

... and that's just the dissertation. I also owe a great debt to the many other people who have helped me through my first six years of being a linguist and my first six years of living in the United States.
My introduction to linguistics and to the USA came when I was a visiting student at the University of Rochester in 1990-1991. I had only intended to be in this country and in linguistics for a year, but the warm welcome and the exciting intellectual environment that I found there quickly made me change my plans. Itziar Laka inspired me, because she cared about finding the right answer more than anybody I had encountered in my earlier life as a medievalist. Sandro Zucchi and Roberto Zamparelli provided many many hours of discussion and instruction. Many many other people in Rochester provided a warm welcome. The absence of any distinction between theoretical linguistics and psycholinguistics which I encountered in Rochester has had clear influences on the document that follows. I benefitted immensely from living with Hamida Demirdache, Tom Green and Kumiko Murasugi in my first year at MIT. From remedial phonology help to forming a band, this was an extremely happy time. I was lucky enough to spend the first 4 years of my time at MIT with a wonderful group of classmates: Masha Babyonyshev, Andrew Carnie, Heidi Harley, Shawn Kern, Masa Koizumi, Renate Musan, Orin Percus, Hubert Truckenbrodt, and adoptively Jonathan Bobaljik. Since most of them graduated and left a year ago, I have missed their company and lively discussions very much this year. There are many many other people who have helped to make my stay at MIT incredibly rewarding. 
Apart from the people already mentioned and the people who I will kick myself for forgetting right after I hand this in (it is, after all, very late as I write, or is it early?), there are Morris Halle, Tony Harris, Irene Heim, Maya Honda, Dianne Jonas, Michael Kenstowicz, LING8895 (that's just about everybody), Gary Marcus, all of the MITWPLers past and present, Wayne O'Neil, David Poeppel, lots of great people in the BU Applied Linguistics Program (you know who you are), Linda Thomas, my companion in Broca's Area for much of the writing of this thesis, and Leo, my furry companion at home for the rest of the writing.

Derek Gross was one of the first linguists I met in Rochester, and a friend in Boston for a number of years. He showed me this year that writing a thesis is really not such a big deal. His ability to deal with cancer in a cheerful and always positive manner put everything in clear perspective for me. I wish I could thank him for this. Many people will miss him very much.

I owe an immense debt of love and thanks to Andrea Zukowski, my best friend and wife. Of all people she has had to put up with the most through the writing of this thesis. She has listened to the most unformed ideas at the most inappropriate times. She has read drafts of the thesis in great detail, raised many important questions and clarified many others. She has been a constant source of encouragement and support, and as writing took up more and more time she insisted on staying up into the small hours to keep me company. She has taught me most of what I know about bread-baking and statistics, and an enormous amount about life.

Finally, I thank my parents Kathleen and Derrick Phillips for their constant support and many years of sacrifices on my behalf. I owe them everything.

Table of Contents

1. Introduction
1.1 General Architecture
1.2 Outline
1.3 Some reminders
2. Constituency
2.1 Introduction
2.2 The Problem of Contradictory Constituency
2.3 Constituency in Structure Building
2.3.1 A left-to-right derivation
2.3.2 Prediction I: Uniform C-command
2.3.3 Prediction II: Left-edge constituency
2.3.4 Prediction III: Parallelism
2.4 C-Command Tests
2.4.1 Binding
2.4.2 Scope
2.5 Linear Order and Left-edge Constituency
2.5.1 VP-Fronting
2.5.2 PP-movement
2.5.3 Right Node Raising
2.5.3.1 Disguised Clausal Coordination
2.5.3.2 Right Node Non-Raising
2.5.3.3 Factorization and Ordering
2.5.4 Summary
2.6 Constituency Conflicts and Parallelism
2.6.1 An Asymmetry between VP-Fronting and VP-Ellipsis
2.6.2 Scope and Ellipsis in Japanese
2.6.3 Is Branch Right Violable?
2.6.4 Comparative Ellipsis
2.6.5 Verb-preposition units
2.7 On the Necessity of Right-Branching Structures
2.7.1 Arguments for Right-Branching VPs
2.7.2 Arguments for PP-splitting
2.8 Alternative Approaches to Contradictory Constituency
2.8.1 Points of Agreement and Disagreement
2.8.2 Categorial Grammar
2.8.3 Pesetsky 1995: Cascade and Layered Syntax
2.9 Conclusion
Appendix 1: Constraints on Right Node Raising
A1.1 Case and Right Node Raising
A1.2 Constraints and Non-Constraints on the Coordinates
Appendix 2: Heavy NP Shift & Parasitic Gaps
3. Parsing
3.1 Issues in Parsing
3.1.1 Overview
3.1.2 Representations and transitions
3.1.2.1 Endpoints
3.1.2.2 Intermediate representations
3.1.3 Transitions: ambiguity resolution
3.1.4 Structural complexity metrics
3.1.5 Branch Right
3.2 On the Strength of the Local Attachment Preference
3.2.1 The Locality Puzzle
3.2.2 Competing Biases
3.2.3 Experimental Evidence
3.2.4 Resolving the Locality Puzzle
3.2.5 Interim Conclusions
3.3 Ambiguity resolution in English
3.3.1 Two types of structural ambiguity
3.3.2 Association ambiguities I
3.3.3 Association ambiguities II
3.3.3.1 Multiple complement verbs
3.3.3.2 PP-attachment: argument-by-category interaction
3.3.4 (In)dependence ambiguities
3.4 Some attachment preferences in head-final languages
3.4.1 Head position and PP-attachment in German
3.4.2 Japanese
3.4.2.1 How left-branching is Japanese?
3.4.2.2 How ambiguous is Japanese?
3.4.2.3 Japanese syntactic ambiguity resolution
3.4.2.4 Recovery from error
3.5 Some residual questions
3.5.1 Fragility of the predictions of Branch Right
3.5.2 One more anti-locality effect: complex NPs
3.6 Modularity
3.7 Conclusion
Appendix: Experimental Materials
4. Morphosyntax
4.1 Introduction: Composition and Decomposition
4.2 Movement and Projection of Heads
4.2.1 The C-command Condition on Movement
4.2.2 Head Projection
4.3 (Non-)Locality in Head Movement
4.3.1 Movement and Decomposition inside VP
4.3.1.1 Two restrictions on causer arguments
4.3.1.2 Heavy PP Shift voids the restrictions
4.3.2 (Non-)locality in Participle Movement
4.3.2.1 Long Head Movement
4.3.2.2 Long-Distance Participle Fronting Crosslinguistically
4.3.2.3 Filled Comp blocks LHM in Serbo-Croatian
4.3.3 Local Head Movement
4.4 Effects of Ordering
4.4.1 Serbo-Croatian Clitics
4.4.1.1 Interfacing Phonological and Syntactic Structures
4.4.1.2 Determinants of Clitic Placement
4.4.2 A Linear Asymmetry in Clitic Placement
4.4.3 The Adjacency Penalty on Head Lowering
4.5 The Syntax of Intonation in O'odham
4.5.1 Basic Facts
4.5.2 Clitic Auxiliaries and Intonation Phrases
4.5.2.1 XP and X0 as First Position
4.5.2.2 XP-internal Auxiliaries in Wh-Questions
4.5.3 Comparison with Hale & Selkirk 1987
4.5.4 Clitic (un-)placement again
4.6 Conclusion
Appendix: More on Participle Fronting
A.1 Polish
A.2 Old Spanish
A.3 Breton
5. The Parser and the Grammar
5.1 Putting the Pieces Together
5.2 The Parser is the Grammar
5.3 Objections to the PIG Model
5.3.1 The Grammar couldn't be a Parser
5.3.1.1 Analysis-by-synthesis
5.3.1.2 Analysis-by-analysis
5.3.2 The Derivational Theory of Complexity
5.3.2.1 The Received View of the DTC
5.3.2.2 Results Supporting DTC
5.3.2.3 Some Concerns about the DTC Experiments
5.3.2.4 Transformations that Reduce Perceptual Complexity
5.3.2.5 Other reevaluations of the DTC experiments
5.3.2.6 Conclusion
5.3.3 There are Non-Grammatical Parsing Strategies
5.4 More Objections to Parser-Grammar Identity
5.6 On the Competence-Performance Distinction
5.7 Conclusion
Appendix: Reevaluating Criticisms of the DTC
Miller & McKean 1964
Passivization and Reversibility (Slobin 1966)
Transformations and Memory Capacity
Studies of other transformations
References

Chapter 1
Introduction

The main aim of this thesis is to argue for the following two claims. First, natural language grammars construct sentences in a strictly left-to-right fashion, i.e. starting at the beginning of the sentence and ending at the end. Second, there is no distinction between the grammar and the parser.
In other words, we perceive sentences by generating them for ourselves. These claims are both rather mundane, but they are anything but standard assumptions about grammar and parsing. Most work on syntax does not assume that sentences are constructed from left-to-right, and most work on parsing assumes a parser-grammar distinction.

Since most of the thesis will be taken up with discussion of specific issues internal to the traditional areas of syntax and parsing, the role of this introductory chapter is to explain the overall plot of the thesis, so that the reader may know where I am headed. The main arguments about syntax and parsing will just be presented in outline here, and discussion of general issues of the architecture of language will be kept brief, as I will return to a more detailed discussion of these issues in Chapter 5.

1.1 General Architecture

The general view of the architecture of language that I will be arguing for in the chapters that follow is sketched in (1). It has two components: a grammar and a finite set of resources which the grammar uses. This view is very similar to the model proposed by Miller & Chomsky in 1963, in which there was little or no role for a parsing system distinct from the grammar. For want of a better name, I will refer to this as the PIG model of language (Parser Is Grammar).

(1) [Diagram not reproduced.]

I should clarify at the outset what I mean by the term parser. I take this to refer specifically to the structure building system that is used in sentence recognition, and not to the many other psychological processes involved in understanding sentences. Sentence parsing should not be confused with sentence comprehension, which is a complex cognitive act involving the integration of many different sources of information (language, world-knowledge, expectations, attentional state etc.). The parser is just one of the systems involved in language comprehension, and in fact might not always be involved in comprehension of linguistic acts.
In the PIG model the distinction between sentences which are grammatical and sentences which are parsable is just the distinction between those sentences which the grammar could generate given potentially unbounded resources and those sentences which the grammar can generate given a certain limited set of resources. In other words, grammaticality is parsability in the limit.

The steps of parsing a sentence can be seen as proceeding as follows. Parsing is an active process, in which the grammar tries to generate a sentence whose phonetic form matches the incoming sentence, using the normal structures and operations of the grammar. If the grammar can find a structural description and meaning to pair with the sound input, then the incoming sentence is successfully recognized. If, on the other hand, the grammar fails to generate a matching sentence, either because it does not generate a match in principle, or because generating a match exceeds the available resources, then recognition fails. This is what is known as an analysis-by-synthesis model of sentence recognition. Therefore, sentences are not inherently parsable or unparsable; rather, they are parsable (or not) given a certain set of resources, where the resources can include short- and long-term memory and expectations, among other things. Meanwhile grammaticality is just the name given to the special case of parsability in which resources are unbounded. It is in this sense and only this sense that grammaticality represents an abstraction from the steps of generating and comprehending sentences. Apart from the idealization of unbounded resources, grammaticality and parsability are just the same.

Nevertheless, the view that the parser is the grammar conflicts with a broad consensus of opinion since the late 1960s, which holds that the parser is not the same as the grammar. Instead there are two distinct but related structure building systems in the language faculty.
The main reasons for this view are the following, all of which are discussed in greater detail in Chapter 5, and which received their classic formulation in Fodor, Bever & Garrett 1974:

Standard models of grammar cannot be implemented as a sentence recognition device which can successfully recognize sentences in finite time. This is because the grammar is standardly viewed either as a mapping from underlying structures to surface (phonetic) representations (transformational theories), or as a constraint set which applies to fully-built representations (non-transformational theories). Given these models of grammar, the steps of an incremental parser cannot correspond exactly to the steps of a grammatical derivation, since the grammar either defines different steps or does not define derivational steps at all. An analysis-by-synthesis implementation of such grammars can therefore only recognize sentences by randomly generating enormous sets of sentences in search of a match, which is obviously quite inefficient and unrealistic. The only way of narrowing down the search is to add an extra preprocessor to the model in (1), which performs preliminary analysis of incoming sentences. Once spelled out, though, this preprocessor tends to take over most of the work of processing the sentence, and effectively reduces the role of the grammar in sentence recognition. Note that while this argument was initially formulated in the context of an Aspects-style transformational grammar, it applies to the vast majority of other theories of grammar that have been proposed, whether or not they assume transformations, or even phrase structure. Any grammar that does not specify an incremental left-right mapping from surface strings to structural descriptions will face the same difficulties if used as an analysis-by-synthesis sentence recognition device.

Something very similar to the PIG model is widely considered to have been experimentally disconfirmed in the 1960s. From the early 1960s onwards a number of experimental studies were undertaken to test whether the operations proposed in transformational grammars of the time had a measurable effect on sentence comprehension or recall. The received view of the outcome of these studies is that they disconfirmed the view that the operations of the parsing device and transformational grammars were the same (a.k.a. Derivational Theory of Complexity: cf. Fillenbaum 1971, Fodor et al. 1974, Levelt 1974, vol. III, Berwick & Weinberg 1983, Bever 1988, Wanner 1988 for reviews).

Properties of the parser can be observed which cannot be reduced to properties of the grammar. The first two considerations made it seem necessary that there be a sentence processor of some kind in addition to the grammar. This supposition received further support from investigations of sentence processing which pointed to the existence of a number of sentence processing principles which are not obviously related to the grammar, such as ambiguity resolution strategies and phrase-boundary location heuristics.

Considerations such as these, which will be discussed in rather more detail in Chapter 5, led to a different conception of human linguistic capacities, which included at least the components in (2).

(2) [Diagram not reproduced.]

The relevance of the grammar to the operations of the parser varies greatly from theory to theory, as do the amount of internal structure that is attributed to the parser and the way in which the parser accesses grammatical knowledge, but the basic picture appears to have been generally agreed upon since the mid-1960s. One effect of this has been that the study of the grammar and the study of the parser have fractionated into separate disciplines, with the result that the issue of whether the simpler model in (1) is possible has effectively been closed, more by boundaries between disciplines than by actual argumentation.
I will not take up any further space at this point to discuss reasons for adopting the view of language in (1). Instead I will first try to demonstrate that the parsing-compatible features of the grammar that I adopt are well-motivated based on considerations of grammar alone, and that the grammar-compatible features of the parser that I adopt are well-motivated based on considerations of parsing alone. Having shown that the parser and the grammar look very much alike even in advance of considerations of how they interact, it will make much more sense to return to the issues raised here about the parser-grammar relation.

There are two main components to this argument, both of which involve making non-standard assumptions about the form of the grammar and the parser, and both of which take away the force of the classic arguments against viewing the parser and the grammar as the same. First, I reexamine standard views of how the grammar builds syntactic structures (either not at all, or from bottom-to-top), and argue that structures are built from left-to-right, based on evidence from constituency tests and ordering asymmetries. If this conclusion is correct, then the main argument against using the grammar as an analysis-by-synthesis recognition device goes away, because the grammar specifies an incremental left-to-right mapping from surface strings to structural descriptions. Second, I look at ambiguity resolution strategies, an area which is generally viewed in terms of an independent parser. I argue that structural complexity metrics in parsing, which contribute to what is easy to understand and what leads to garden paths, reduce to an independently motivated economy principle of the grammar, which favors the building of right-branching structures where possible. These arguments address the first and the third objections to the PIG model given above, and discussing them takes up the greater part of the thesis.
However, I also address the second argument against the PIG model, surrounding the so-called psychological reality of transformational grammars, and the claim that there is no evidence for their operations in parsing. I show that the evidence for this argument was never particularly strong, and is even weaker now than it was in the 1960s. Moreover, it was never the most important argument against the PIG model; the other two arguments were always more important, although they received less attention.

1.2 Outline

The thesis is organized as follows. Chapter 2 argues that syntactic structures are built in a strictly left-to-right fashion. The evidence for this comes from a study of apparent contradictions between the results of different constituency tests. These diagnostic tests include coordination, movement, ellipsis, coreference, disjoint reference and the licensing of bound variables and polarity items. Some of these diagnostics make sentences (or parts of sentences, e.g. VP) appear to have a left-branching structure (3a), others make sentences appear to have a right-branching structure (3b), and yet others yield both results.

(3) a. [Left-branching tree diagram not reproduced.] b. [Right-branching tree diagram not reproduced.]

The only existing approaches to these problems have required the adoption of either dual representations for all sentences (Pesetsky 1995, Brody 1994) or the flexible constituency of some versions of categorial grammar which effectively allows multiple constituent structures for any sentence (cf. Ades & Steedman 1982, Steedman 1996, Dowty 1988, Pickering & Barry 1993 for approaches and applications). I show that the problem of contradictory constituency does not arise if it is assumed that syntactic structures are constructed incrementally from left-to-right, as dictated by the condition Merge Right, and if structure building is subject to the economy condition Branch Right.

(4) Merge Right
New items must be introduced at the right edge of a structure.
(5) Branch Right
Metric: select the attachment that uses the shortest path(s) from the last item in the input to the current input item.
Reference set: all attachments of a new item that are compatible with a given interpretation.

The effect of Merge Right is that a structure like (3b) has the derivational steps in (6).

(6) [Derivation diagram not reproduced.]

All of the strings listed in (7) are constituents at some point in the derivation in (6). Notice that although (6) builds a right-branching structure, all of the strings that are constituents in the left-branching structure in (3a) are constituents at some point in the derivation in (6).

(7) AB BC ABC BCD ABCD CD

Motivation for Merge Right and Branch Right is drawn from evidence for intermediate constituents of derivations like (6), and from evidence for the restricted distribution of contradictory constituency effects, which is predicted by the left-to-right theory, but not by theories which invoke multiple parallel representations. Furthermore, I show that the Merge Right/Branch Right approach to structure building provides more than a restatement of the effects of multiple structure theories, because it explains why different constituency tests yield the results they do, and where constituency contradictions should and should not be found. The arguments in this chapter are drawn primarily from English.

In Chapter 3 I focus on parsing, in particular on the topic of ambiguity resolution, which has been the focus of most work in sentence processing over the last 20 years. I argue that the syntactic component of ambiguity resolution can be reduced to the principle Branch Right, which favors the construction of maximally right-branching structures, all other things being equal. Branch Right is closely related to the local attachment preference that almost all models of parsing incorporate (e.g. Right Association, Kimball 1973; Late Closure, Frazier 1978).
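The way Merge Right and Branch Right jointly produce the intermediate constituents listed in (7) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the formalism of the thesis: trees are encoded as nested pairs, the structure in (3b) is assumed to be the right-branching tree over the four items A, B, C, D that (7) suggests, path length is assumed to be a simple count of tree edges, and the reference set is simplified to all right-edge attachments, ignoring interpretation. All function names are hypothetical.

```python
# Toy sketch (assumptions: nested-pair trees, edge-counting path metric).
# A tree is either a word (str) or a binary node (left, right).

def right_spine(tree):
    """Return the nodes on the right edge, from the root down to the last word."""
    nodes = [tree]
    while isinstance(nodes[-1], tuple):
        nodes.append(nodes[-1][1])
    return nodes


def attach(tree, depth, item):
    """Merge Right: make `item` the right sister of the right-spine node at `depth`."""
    if depth == 0:
        return (tree, item)
    left, right = tree
    return (left, attach(right, depth - 1, item))


def path_length(tree, depth):
    """Edges between the last word of `tree` and an item attached at `depth`.

    Assumed metric: climb from the last word to the newly created node,
    then take one edge down to the new item.
    """
    last_word_depth = len(right_spine(tree)) - 1
    return (last_word_depth - depth) + 2


def merge_right(tree, item):
    """Branch Right: among right-edge attachments, pick the shortest path."""
    depths = range(len(right_spine(tree)))
    best = min(depths, key=lambda d: path_length(tree, d))
    return attach(tree, best, item)


def words(tree):
    """Concatenate the terminal string of a (sub)tree."""
    return tree if isinstance(tree, str) else words(tree[0]) + words(tree[1])


def constituents(tree):
    """Strings of all multi-word constituents in a tree."""
    if isinstance(tree, str):
        return set()
    return {words(tree)} | constituents(tree[0]) | constituents(tree[1])


def derive(items):
    """Build a structure left to right, recording constituents at every step."""
    tree, seen = items[0], set()
    for item in items[1:]:
        tree = merge_right(tree, item)
        seen |= constituents(tree)
    return tree, seen


tree, seen = derive("ABCD")
print(tree)
print(sorted(seen))
```

Under these assumptions, the final tree is right-branching, while the set of strings that were constituents at some step contains exactly the six strings in (7), including the three constituents (AB, ABC, ABCD) of a left-branching tree over the same items. Because each new item is attached as sister to the last word, earlier constituents such as AB are destroyed at the next step, which is why they are visible only in the derivation and not in the final structure.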
My claim is therefore that all structural biases in parsing can be reduced to the same principle that causes a bias to interpret the adverbial yesterday inside the lower clause in well-known ambiguities like (8).

(8) John said Bill left yesterday.

I show that Branch Right can account for a wide range of parsing biases which are normally attributed to other strategies, and present results from an experiment on a novel structural ambiguity, which pits the choices of Branch Right against the choices of well-known parsing strategies such as Minimal Attachment (Frazier & Fodor 1978, Gorrell 1995), and Attach Arguments (Ford et al. 1982, Abney 1989, Crocker 1996, Schütze & Gibson 1996). The experimental results show that given the choice between a local attachment which is structurally more complex, not supported by discourse and not required by syntax or semantics, and a non-local attachment which is structurally simpler and involves an obligatory syntactic constituent, the parser opts for the more local attachment, as predicted by Branch Right. The evidence in this chapter is again drawn primarily from English, but includes some discussion of ambiguity resolution in German and Japanese.

Chapter 4 returns to issues in the traditional domain of grammar, and extends the discussion of the left-right grammar to issues at the interface of syntax with morphology and phonology. In this model morphophonological representations are mapped onto surface syntactic structures, which in turn are mapped onto underlying syntactic structures. This ordering is the opposite of what is assumed in most theories, and is forced by the fact that surface positions of words are generally to the left of their underlying positions, and are therefore built earlier in a left-to-right derivation. I show how it is possible in this approach to give a uniform treatment of local and non-local head movement and clitic placement operations.
I show how certain left-right asymmetries in both head movement and clitic placement are predicted by the theory. I also discuss some issues involving the relationship between phonological phrasing and syntactic constituency, and show how it might be possible to draw a closer connection between phonological and syntactic constituency than is commonly assumed. This chapter draws more on cross-linguistic evidence than the earlier chapters. Topics covered include long head movement in Slavic, Romance and Breton, clitic placement in Serbo-Croatian and phonological phrasing in Tohono O'odham (Papago), among others.

Chapter 5 draws together the arguments developed independently in Chapters 2-4 and returns to the issues raised in this chapter involving the general architecture of language. I show that the best objections to the PIG model of language no longer apply, given the view of the parser and the grammar developed in Chapters 2-4. This chapter also discusses other issues concerning the parser-grammar relation, including the competence-performance distinction, and some further arguments against the PIG model.

1.3 Some reminders

Before proceeding, I should emphasize at the outset a couple of things that I am not trying to do here, and one thing that I am trying to do.

First, my agenda here should not be mistaken for an attempt to give a parsing explanation for grammatical phenomena. The literature contains a number of arguments which run something like: phenomenon X is generally considered to be a property of grammar, but it in fact is better explained in terms of properties of the parser (for examples, see Fodor 1978, Hankamer 1973, Kuno 1973, Dryer 1980, Hawkins 1995, Berwick & Weinberg 1984, Alphonce & Davis 1992, 1996, Pritchett 1991, Fox 1996). What I am trying to do here is quite different.
Given that I am claiming that the parser and the grammar are the same system, the traditional kind of reassignment of work from the grammar to the parser is mostly unavailable to me. Of course, given the distinction between language and resources it is still possible to distinguish between unacceptability of a sentence due to the grammar and unacceptability due to resource limitations.

Second, readers who are at all familiar with the recent experimental literature on sentence processing will be aware that an enormous amount of the research done on parsing is directed to exploring questions of modularity; in other words, to what extent different sources of linguistic and non-linguistic information are used in sentence comprehension, and what their relative importance is. Note that my focus in this thesis on the direct implementation of the grammar as a parsing device entails no commitment whatsoever regarding the modularity issue. Adopting the PIG model commits me to the claim that sentence parsing involves building representations that are sanctioned by the grammar, but this says nothing about how ambiguities are resolved in situations which are lexically or pragmatically biased, which is where much of the action has been in the modularity debate. In Chapter 3 I discuss briefly the current status of the debate on the informational encapsulation of the parser, but this question is logically quite independent of the issue of whether there are distinct syntactic systems used in parsing and grammar.

Thirdly, I am trying to do more than show that an incremental parser can be built whose operations are more or less transparently related to the operations of the grammar. This possibility has been amply demonstrated for a variety of grammatical formalisms in the computational literature on parsing.
In incremental parsers based on standard grammatical models, the intermediate stages of a parse are not grammatically defined objects, and are not expected to play any role in grammatical phenomena. One of my main aims here is to show that the intermediate structures built by a left-to-right grammar play a crucial role in certain grammatical phenomena, and that grammatical derivations can therefore only proceed from left to right.

Finally, this should go without saying, but the overall aim of this thesis is to address issues in parsing and grammaticality together. The chapters on syntax and parsing are mostly written so that they may be read independently, but to overlook the similarity between the issues that arise in parsing and grammar would be to miss the main point, which is that these are not separate lines of inquiry.

Chapter 2 Constituency

2.1 Introduction

Diagnostics of constituency typically test for what strings of words can be coordinated, moved or elided, and which pairs of phrases can enter into relationships of binding, disjoint reference or other dependencies. A problem that often arises in syntactic research is that faithful application of the constituency tests leads to situations where the results of one test contradict the results of another. Some diagnostics make sentences appear to have a relatively left-branching structure of the kind shown schematically in (1a). Other diagnostics, meanwhile, make the same sentences appear to have a much more right-branching structure, as in (1b).

(1) a. [tree diagram: left-branching structure]
    b. [tree diagram: right-branching structure]

Contradictory constituency poses a serious problem for one of the leading ideas of phrase structure grammar, which I will call the Single Structure Hypothesis. This is the hypothesis that a wide range of otherwise unrelated syntactic processes all refer to the pieces of a single constituent structure or derivation for any sentence. Existing responses to the problem of contradictory constituency have taken one of two approaches.
On the one hand, some have attempted to dismiss the problem by arguing that some of the diagnostics of constituency are either misleading or have been misinterpreted. On the other hand, a number of people have recently argued that the conflicts between different constituency tests provide evidence for multiple parallel phrase structures (Pesetsky 1995, Brody 1994) or for the flexible constituency allowed by enriched categorial grammars (Steedman 1985, 1988, in press; Dowty 1988; Pickering & Barry 1993).

The aim of this chapter is to suggest a different kind of solution to the contradictory constituency problem. I argue that the problem of contradictory constituency does not arise in a system in which phrase markers are derived by building from left to right, i.e. starting at the beginning and ending at the end. The requirement that new material is always added at the right of the phrase marker is imposed by a principle which I call Merge Right (2).

(2) Merge Right
    New items must be attached at the right edge of a structure.

Although the theory of phrase structure which I assume is in other respects rather standard, the effects of Merge Right are far-reaching. The main consequence of left-to-right derivations, which I focus on in this chapter, is that the strings that form constituents at intermediate stages in the derivation are different from the constituents of more orthodox bottom-to-top derivations. This fact makes it possible to derive the effects of contradictory constituency without assuming parallel representations or flexible constituency. As I show in §2.2 and §2.3, the appearance of constituency conflicts is just the consequence of how structure changes over the course of a left-to-right derivation of a single right-branching phrase marker.
More importantly, beyond describing in a single derivation what had previously appeared to be contradictions between the results of different constituency tests, the account based on left-to-right derivations begins to explain why each constituency test yields the results it does, and makes novel predictions about which kinds of tests will be able to diagnose which kinds of constituents. §§2.4-2.6 test these predictions and demonstrate a number of correlations between how a constituency test probes for structure and the kinds of results it produces. In §§2.7-2.8 I compare the results of the Merge Right approach to other existing approaches to the problem of contradictory constituency, including attempts to deny that there really is a problem at all.

A further consequence of Merge Right is that, in almost reversing the order in which syntactic derivations are standardly assumed to occur (because of its left-to-right nature), the grammar proposed here effectively computes from relations which traditionally hold at S-structure to relations which traditionally hold at D-structure, rather than vice versa. This means that movement operations are generally rightward and downward rather than leftward and upward, as they are in most transformational grammars. This ordering of derivations also opens the possibility of far greater similarity between the operations of the parser and the grammar.

2.2 The Problem of Contradictory Constituency

As an illustration of the problem of Contradictory Constituency, consider first the sentence in (3), and the constituency tests that have been applied to it in (4) and (5). The tests of negative polarity item licensing and coordination in (4-5) point to an extremely right-branching VP-structure, such as in (6), which corresponds to the Cascade structures proposed in Pesetsky 1995.

(3) John gives candy to children in libraries on weekends
(4) a. John gave nothing to any of my children in the library on his birthday.
    b.
John gave candy to none of my children in any library on his birthday.
    c. John gave candy to children in no library on any public holiday.
    d. * John gave anything to none of my children in the library on his birthday.
    e. * John gave candy to any of my children in no library on his birthday.
(5) a. John gives [candy to children on weekends] and [money to homeless people on weekdays.]
    b. John gives money [to children on weekends] and [to homeless people on weekdays.]
    c. John gives candy to [children on weekends] and [homeless people on weekdays.]
(6) [tree diagram: right-branching VP structure]

The facts in (4-5) motivate the structure in (6) based upon the assumption that negative polarity item licensing requires c-command or m-command, and that coordinability is an indicator of constituenthood (these assumptions are standard, though by no means necessary: see §2.7 for further discussion). Using this reasoning we are led to the conclusion that the complex VP in (3) is right-branching to such an extent that the complement of a preposition forms a constituent with the following PP, to the exclusion of the preposition that selects it. The evidence for this is that an NP can c-command an element outside of the PP that it is generally thought to be inside, such that it can license a polarity item in the immediately following PP, as in (4b-c). Similarly, an NP can form a conjunct which includes the PP that follows it, but excludes the preposition that selects it, as in (5c). Based on these tests, then, rightwards roughly equals downwards in the phrase structure tree. A number of other structural diagnostics yield the same pattern of results, including anaphor binding, disjointness (Condition C effects), weak crossover and bound variable anaphora (cf. Barss & Lasnik 1986, Stroik 1990, 1996, Pesetsky 1995).
Contrasting with the evidence for right-branching structures, meanwhile, certain kinds of movement tests point to a left-branching structure for the very same VP, as can be seen from the examples of VP-fronting in (7). The basic generalization in this case is that any string of phrases starting from the left edge of VP can be fronted (7a-d). Strings of phrases that do not include the left edge of VP cannot be fronted (7e-f). If we assume that the strings that can front are constituents, then the results of this test point to a left-branching structure like (8), which is the kind of structure traditionally assumed for VPs containing multiple modifiers.

(7) a. John intended to give candy to children in libraries on weekends, ... and [give candy to children in libraries on weekends] he did ___.
    b. John intended to give candy to children in libraries, ... and [give candy to children in libraries] he did ___ on weekends.
    c. John intended to give candy to children, ... and [give candy to children] he did ___ in libraries on weekends.
    d. ... and [give candy] he did ___ to children in libraries on weekends.
    e. * ... and [to children in libraries] he did ___ give candy on weekends.
    f. * ... and [in libraries on weekends] he did ___ give candy to children.
(8) [tree diagram: left-branching VP structure]

There therefore appears to be a conflict between the results of the polarity item licensing and coordination tests in (4-5) and the results of the movement test in (7). This kind of conflict is the basis of the contradictory constituency problem.
In fact, this conflict is sharpened by the fact that we find diagnostics for both left- and right-branching structures satisfied in a single sentence, as in the sentences in (9), taken from Pesetsky 1995 (p. 230). In these sentences, sequences of phrases starting at the left edge of VP have been fronted, implying the kind of structure in (8), but the fronted portion of VP contains an NP which binds a reciprocal in the stranded portion of VP, implying a right-branching VP structure more along the lines of (6).

(9) a. ... and [give the books to them in the garden] he did ___ on each other's birthdays.
    b. ... and [give the books to them] he did ___ in the garden on each other's birthdays.

Notice, however, an important step in the reasoning that leads to the constituency conflict. What the results of the movement test in (7) show is that give candy is a constituent, that give candy to children is a constituent, that give candy to children in libraries is a constituent, and so on. The standard way of representing the fact that each of these strings is a constituent is to assign them the nested, left-branching structure in (8). But this inference is by no means necessary, particularly if we assume left-to-right structure building, as the next section shows. On the other hand, the binding and coordination tests in (3-5) provide convergent evidence for the right-branching structure in (6). The right-branching structure is motivated by evidence from both constituency tests (i.e. tests that ask: is this string a unit?) and c-command tests (i.e. tests which ask what the relative hierarchical relation of two units is).

2.3 Constituency in Structure Building

2.3.1 A left-to-right derivation

(10) shows a very much simplified version of how the sentence The man saw Mary is built up in the theory outlined in Chomsky 1995a. The relevant property of this kind of derivation is that it proceeds largely from bottom to top, as dictated by the Strict Cycle Condition (Chomsky 1993).
When new items are added at the top of the tree new constituents are created, but existing constituents are preserved from each step of the derivation to the next. Inflectional material and functional projections have been omitted from this derivation for the sake of simplicity, but I assume that they are added to the structure in the same way as lexical material.

(10) [tree diagrams: bottom-to-top derivation of the man saw Mary]

The strings that are constituents at some point in the derivation in (10) are listed in (11). Note that these are exactly the constituents of the final structure in (10).

(11) the, the man, man, saw Mary, saw, the man saw Mary, Mary

But now consider what happens if instead of always adding new material at the top of the tree, structures are built in a strictly left-to-right fashion, so that new material is always added at the right-hand edge of a tree. Let us assume that this requirement is imposed by the condition Merge Right, given in (12).

(12) Merge Right
     New items must be attached at the right edge of a structure.

A simplified derivation of the man saw Mary in this left-to-right manner is shown in (13). As in (10) inflectional material has been omitted for ease of exposition. The important thing to notice here is the difference between the third and fourth steps in the derivation. In the second step, at which point the verb is the rightmost element in the structure, the subject and the verb form a constituent. But once the object is added to the structure, the subject and the verb no longer form a constituent. At this point the verb and the object form a constituent, as in the structure traditionally assumed for English SVO sentences.

(13) [tree diagrams: left-to-right derivation of the man saw Mary]

(14) lists the strings that are constituents at some point in the derivation in (13). The final structure is identical to the one built in (10), but the list in (14) includes one string which is not a constituent in the final structure in (13), namely the man saw.
(14) the, the man, man, the man saw, saw, saw Mary, Mary, the man saw Mary

Therefore, two unusual properties of derivations that respect Merge Right are the following. First, in the construction of a right-branching structure some constituents are created during the derivation which are not constituents in the completed (final) phrase marker. This fact is the key to being able to describe contradictory constituency effects without recourse to multiple parallel structures or flexible constituency. Second, the creation of new constituents in left-to-right derivations sometimes has the effect of destroying existing constituents, such as when the addition of the direct object Mary to the derivation in (13) created the new constituent saw Mary, but destroyed the existing constituent the man saw. This property of left-to-right derivations plays an important role in the explanation of why different structural diagnostics yield different results.

Before running through the effects of Merge Right for some more involved examples, I should first spell out some additional assumptions that I will be making. First, I assume that structure building is constrained by the condition Branch Right, which forces structures to be as right-branching as possible.

(15) Branch Right
     Metric: select the most right-branching available attachment of an incoming item.
     Reference set: all attachments of a new item that are compatible with a given interpretation.

I assume that a structure is right-branching to the extent that there is a match between precedence relations among terminal elements and c-command relations among terminal elements. While complete correspondence between precedence and c-command relations is the extreme situation, we can talk about one structuring of a given set of terminals as being more right-branching than another structuring of the same set of elements if there is greater correspondence between precedence and c-command relations among terminals.
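As a rough illustration of the constituent lists in (11) and (14), the following Python sketch collects every string that is a constituent at some step of a derivation. It is only a sketch under stated assumptions: trees are encoded as nested pairs, the individual steps of the bottom-to-top and left-to-right derivations are my own simplified rendering of (10) and (13), and all function and variable names are hypothetical.

```python
def words(tree):
    """Return the terminal string of a tree as a tuple of words."""
    if isinstance(tree, str):
        return (tree,)
    left, right = tree
    return words(left) + words(right)


def constituents(tree):
    """Return the set of strings spanned by some node of the tree."""
    found = {" ".join(words(tree))}
    if not isinstance(tree, str):
        for child in tree:
            found |= constituents(child)
    return found


def constituents_ever(derivation):
    """Union of the constituents found at any step of a derivation.

    A derivation is a list of steps; each step is a list of the trees
    that exist at that point (an assumed, simplified encoding).
    """
    seen = set()
    for step in derivation:
        for tree in step:
            seen |= constituents(tree)
    return seen


final_tree = (("the", "man"), ("saw", "Mary"))

# Assumed rendering of the bottom-to-top derivation in (10).
bottom_up = [
    [("the", "man")],
    [("the", "man"), ("saw", "Mary")],
    [final_tree],
]

# Assumed rendering of the left-to-right derivation in (13):
# each new word is attached at the right edge.
left_to_right = [
    ["the"],
    [("the", "man")],
    [(("the", "man"), "saw")],
    [final_tree],
]

list_11 = constituents_ever(bottom_up)
list_14 = constituents_ever(left_to_right)

# Strings that were constituents at some step but are not in the final tree.
transient = list_14 - constituents(final_tree)
print(sorted(transient))
```

Under this encoding the bottom-to-top derivation yields exactly the constituents of the final structure, while the left-to-right derivation additionally yields the transient constituent the man saw.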
As an illustration, imagine a derivation that has reached the point in (16a), where A and B form a constituent, and C is yet to be added to the structure. Let us assume that C could be attached at the right of the existing structure in two ways without affecting the interpretation of the structure. The two alternatives are shown in (16b) and (16c).

(16) a. [tree diagram: A and B form a constituent]
     b. [tree diagram: more left-branching attachment of C]
     c. [tree diagram: more right-branching attachment of C]

Given the alternatives in (16b) and (16c), Branch Right chooses (16c), because B c-commands C in the more right-branching (16c), but not in the more left-branching (16b). I assume that Branch Right locally determines what is the most right-branching attachment of an incoming item by choosing the attachment that creates the shortest path through the phrase marker from the preceding item to the incoming item. The details of this local way of finding the most right-branching structure will not be important in this chapter, but they are discussed at length in the treatment of parsing in Chapter 3.

I assume that the condition in (17) applies to arguments and predicates. (17) requires that thematic relations be satisfied under sisterhood. It does not require that the thematic relations be satisfied at any specific point in a derivation, and it also does not require that the sisterhood relation be preserved once established.

(17) Configuration for Arguments and Predication
     A head X may discharge a thematic role to a position Y or take position Y as a predicate iff Y is the sister of a head containing X or the sister of a projection of a head containing X.

Finally, I follow Chomsky 1995a,b in assuming that all non-terminal nodes in a phrase marker are branching nodes. In other words, there is no vacuous projection of phrase structure nodes in order to conform to an X-bar template.

With these preliminaries in mind, we are now in a position to see how Merge Right accounts for the appearance of contradictory constituency in complex VPs.
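The comparison between (16b) and (16c) can be sketched in Python by counting the pairs of terminals for which precedence and c-command coincide. This is a sketch of the global measure of right-branchingness only, not of the local shortest-path procedure mentioned above. It assumes that (16b) is [[A B] C] and (16c) is [A [B C]], that a terminal c-commands its sister and everything the sister contains, and that terminal labels are unique; all names are hypothetical.

```python
def leaves(tree):
    """Terminals of a binary tree (nested pairs of strings), left to right."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def c_command_pairs(tree):
    """Set of (x, y) such that terminal x c-commands terminal y.

    Assumption: x c-commands y iff y is, or is contained in, the sister of x.
    """
    if isinstance(tree, str):
        return set()
    left, right = tree
    pairs = c_command_pairs(left) | c_command_pairs(right)
    if isinstance(left, str):
        pairs |= {(left, y) for y in leaves(right)}
    if isinstance(right, str):
        pairs |= {(right, y) for y in leaves(left)}
    return pairs


def right_branching_score(tree):
    """Count terminal pairs (x, y) where x precedes y and x c-commands y."""
    order = {word: i for i, word in enumerate(leaves(tree))}
    return sum(1 for x, y in c_command_pairs(tree) if order[x] < order[y])


def branch_right(candidates):
    """Pick the candidate attachment with the highest score."""
    return max(candidates, key=right_branching_score)


tree_16b = (("A", "B"), "C")  # assumed shape of (16b): C attached above [A B]
tree_16c = ("A", ("B", "C"))  # assumed shape of (16c): C attached as sister of B

chosen = branch_right([tree_16b, tree_16c])
print(right_branching_score(tree_16b), right_branching_score(tree_16c))
```

In this encoding B c-commands C only in the second tree, so the second tree has the higher score and is the one selected, matching the choice attributed to Branch Right in the text.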
(18) shows the steps involved in building the complex VP from the sentence in (3) from left to right. The derivation of the VP begins with the verb give in (18a). The verb does not project until the noun phrase candy is merged to the right of the verb as its sister. At this point in the derivation the verb may discharge one of its theta roles to the NP.

(18) a., b. [tree diagrams: give, then give merged with candy]

The next step in the derivation involves the addition of the PP to children, and is shown in (18c-e). The PP could in principle be merged with the constituent give candy in (18b) to form the structure [[give candy] to children]. However, there is an alternative way of adding the PP to the structure which receives the same interpretation and satisfies the condition in (17) above, and is more right-branching and therefore preferred by Branch Right. First a copy of the verb give is generated, which merges at the right of the phrase marker as the sister of the NP candy (18c). This copy of give is then projected to create an attachment site for the preposition to (18d). Then the preposition to is projected to allow attachment of the NP children as its sister (18e). At this point in the derivation merger satisfies the thematic relation between the P and the NP.

(18) c., d., e. [tree diagrams: addition of the PP to children]

Notice that the structure in (18e) is a right-branching structure very much like the VP Shell structures proposed by Larson (1988, 1990). However, right-branching VP-structures have a different motivation here than in other theories. Here they are just a consequence of the economy condition Branch Right. In the current theory, left-branching VPs are syntactically well-formed, except when they are blocked by a more economical right-branching VP-structure.

The steps involved in the addition of further adverbial PPs are shown in abbreviated form in (18f-g). As in (18c-e) the adverbial PPs are merged with the existing phrase marker in such a way as to maximize precedence/c-command correspondences, in accordance with Branch Right.

(18) f. [tree diagram: addition of the PP in libraries]
     g. [tree diagram: addition of the PP on weekends]

An important property of (18f-g) to notice is what happens to PPs when additional PPs are added to their right in the phrase marker. For example, when the PP in libraries is added in (18f), the existing PP to children is split up, such that the NP children forms a constituent with the following PP, to the exclusion of the preposition that selects it.

The structure that is ultimately built is very similar to the radically right-branching Cascade structures proposed by Pesetsky 1995. However, by building right-branching phrase markers from left to right, the system proposed here differs from Pesetsky's system in two important respects. First, derivations like (18) combine properties of traditional phrase structure theories and Pesetsky's Cascade structures. Complements of prepositions, for example, enter the derivation as the sister of the preposition, as in traditional theories, but wind up in the specifier position of a lower projection, as in Pesetsky's Cascade structures. The second, more interesting difference between this system and Pesetsky's system is that there is no need under the current system to represent left-branching and right-branching structures in parallel. The reason for this is that all the strings that are constituents in Layered Syntax structures are also constituents at some point in the construction of the right-branching structure, although these strings are not always constituents in the final structure. For example, give candy is a constituent in (18b), but is no longer a constituent from (18c) onwards.

In the light of this derivation, it is useful to reconsider the evidence for left-branching constituency presented in (7) above. The VP-fronting test showed that give candy is a constituent, that give candy to children is a constituent, that give candy to children in libraries is a constituent, and so on.
It would be normal to infer from these facts that the VP must have the left-branching structure in (8), but the derivation in (18) shows that this conclusion is not necessary, because all of the strings that can undergo VP-fronting are constituents at some point in the derivation of the right-branching VP.

Therefore, the existence of contradictory constituency effects as described in §2.2 does not force us to assume any kind of parallel structure or flexible constituency theory. These effects may therefore be explainable in terms of the derivation of a single structure for any sentence, as we shall see in what follows. More interestingly, what have traditionally been taken to be the constituents of left-branching structures are in this theory transient stages in the construction of right-branching structures. This generates a series of novel predictions about the relation between structural diagnostics and their results, which are tested in the sections that follow.

2.3.2 Prediction I: Uniform C-command

The first prediction derives from the fact that although constituency can change over the course of a derivation, asymmetric c-command relations are never destroyed once they have been created. This means that we should not expect to find conflicts among structural diagnostics which probe for c-command relations. The only conflicts should be between c-command tests and constituency tests, and among different constituency tests. Also, given the effects of Branch Right, we expect that c-command tests will predominantly diagnose right-branching structures.

(19) Prediction I
     Constituency changes during the course of a derivation, asymmetric c-command relations do not. Therefore, tests involving c-command relations should not conflict with one another.
The only exceptions to this generalization should be situations in which a less right-branching arrangement of a set of terminals is permitted because it receives a different interpretation from a more right-branching arrangement of the same set of terminals. These predictions about c-command diagnostics are tested in §2.4.

2.3.3 Prediction II: Left-edge constituency

The second prediction relates to the fact that although the constituents of a left-branching structure are also constituents during the derivation of a right-branching structure, these constituents are often destroyed once material is added on their right. The prediction is quite straightforward: once a constituent has been destroyed, it should be impossible to refer to it at any subsequent point in the derivation. Put another way, the only structural diagnostics that should be able to pick out the constituents of left-branching structures (which I shall refer to as left-edge constituents) are those diagnostics based on syntactic relations established before the constituency-destroying material is added on the right. For example, diagnostics of the constituenthood of the man saw should involve syntactic relations established prior to the addition of the object NP Mary to the structure. Meanwhile, tests that diagnose right-branching structures should not be subject to the same restriction. This prediction is verified in §2.5.

(20) Prediction II
     Left-edge constituents are destroyed when material is added on their right. Therefore, evidence for left-edge constituents should be restricted to relations established before their constituenthood is destroyed by the addition of new material to their right.

2.3.4 Prediction III: Parallelism

The third prediction is an extension of the prediction that once a constituent has been destroyed it cannot be referred to later in the derivation. Consider what this means for constructions which require parallelism between two conjuncts.
If we assume strict left-to-right structure building, this ensures that the first conjunct will be entirely built before the second is begun. Therefore, any intermediate properties of the first conjunct which might give rise to contradictory constituency effects will no longer be available when the second conjunct is being constructed. Parallelism requirements should therefore only be able to apply to the final properties of the first conjunct, and should not be able to access any properties of the first conjunct which were destroyed in the course of its derivation. A consequence of this is that contradictory constituency effects should be blocked in constructions requiring parallelism across two conjuncts.

(21) Prediction III
     a. Parallelism requirements across two conjuncts should only be able to refer to properties of the final structure of the first conjunct.
     b. Parallelism requirements between conjuncts should block contradictory constituency effects which would be possible in either of the conjuncts individually.

§2.6 argues that this prediction is correct, based on some differences in the distribution of contradictory constituency effects between movement and ellipsis constructions.

2.4 C-Command Tests

This section tests the first prediction, that different c-command tests should not contradict one another's results, and should diagnose right-branching structures, except where an alternative structure is forced by interpretive requirements.

(22) Prediction I
     Constituency changes during the course of a derivation, c-command relations do not. Therefore, tests involving c-command relations should not conflict with one another.

2.4.1 Binding

(23-27) are familiar examples from the literature on double object and complex VP constructions (cf. Barss & Lasnik 1986) which show that c-command tests like anaphor binding, negative polarity item licensing and weak crossover all diagnose right-branching structures in double object and dative constructions, as we would expect.
In all of the examples an element towards the left of VP behaves as if it c-commands an element on its right, and not vice versa.

(23) Reflexive Binding
     a. I showed John himself in the mirror.
     b. * I showed himself John in the mirror.
     c. I showed the children_i to each other_i in the mirror.
     d. * I showed each other_i to the children_i in the mirror.
(24) Bound Variable Anaphora
     a. I denied each worker_i his_i paycheck.
     b. * I denied its_i owner every paycheck_i.
     c. I gave every paycheck_i to its_i owner.
     d. * I gave his_i paycheck to every worker_i.
(25) Negative Polarity Item Licensing (Klima 1964)
     a. I gave no one anything.
     b. * I gave anyone nothing.
     c. I gave nothing to anyone.
     d. * I gave anything to nobody.
(26) Weak Crossover (Postal 1971; Wasow 1972)
     a. Who_i did you show his_i reflection in the mirror?
     b. * Which lion_i did you show its_i trainer?
(27) Superiority (Chomsky 1973)
     a. Who did you give which book?
     b. * Which book did you give who?

Therefore, these diagnostics provide promising initial support for the part of Prediction I which states that c-command tests should uniformly point to right-branching structures.

2.4.2 Scope

There is, however, one case of a c-command test which appears to contradict both parts of Prediction I. This test uses scope relations as a probe for c-command relations (wide scope is assumed to imply c-command), and the relevant cases involve the relative scope of sequences of postverbal adverbial modifiers. These phrases have been claimed to motivate a left-branching structure, based on the scope relations they exhibit (Andrews 1983, Ernst 1994, Pesetsky 1995), in violation of the prediction that c-command tests should diagnose right-branching structures except in cases of ambiguity. The evidence comes from pairs of sentences like those in (28-30), in which the first adverbial and the rest of the VP is preferentially interpreted as taking narrow scope with respect to the second adverbial.
Also, reversing the order of the modifiers reverses the scope relations. For example, (28a) is most naturally understood as meaning that the frequency of the hitting was purposeful, whereas (28b) is most naturally understood as meaning that what was purposeful was the hitting, but we don't know whether the frequency of the hitting was purposeful. Similarly, (30a) is most naturally understood as restricting concerto playing in foreign countries to weekends, whereas (30b) restricts concerto playing on weekends to foreign countries. Facts like this, then, are taken to motivate left-branching VP structures like (31).

(28) a. Joe hit him frequently on purpose.
     b. Joe hit him on purpose frequently. (Ernst 1994)
(29) a. She kissed him many times willingly.
     b. She kissed him willingly many times.
(30) a. Kremer plays concertos in foreign countries on weekends.
     b. Kremer plays concertos on weekends in foreign countries. (Pesetsky 1995)
(31) [tree diagram: left-branching VP structure]

If the argument for the structure in (31) based on the examples in (28-30) goes through, then Prediction I clearly faces a problem. I should emphasize that it is not the mere existence of a left-branching structure that poses a problem for the Merge Right/Branch Right system I am proposing: I assume that left-branching structures are tolerated where they are necessary. Nor is it problematic that evidence for a left-branching structure should come from a c-command test: I predicted that c-command tests should not conflict in their results, not that they should always diagnose right-branching structures. What is problematic is the claim, if true, that in sequences of postverbal modifiers the rightmost modifiers must take widest scope, and that therefore this must be represented as a left-branching structure.
This is unexpected in the current system, first because there should be nothing to block phrases on the left taking widest scope, as in a right-branching structure; second, because deviations from right-branching structures are predicted to be possible only when it makes a difference to interpretation, precisely what cannot be the case if (28-30) are unambiguous.

In addition, the kinds of scope readings among adverbials which are used to argue for left-branching structures are available even when there is also a variable binding dependency between the adverbials of the kind that has been used to motivate right-branching structures (Ernst 1994, Phillips 1995), in apparent violation of the prediction that there should be no conflicts between the results of different c-command tests. The examples in (32-33) are based on the examples in (28-30), except that a left-to-right quantifier-variable dependency has been added. Adding the quantifier-variable dependency seems to make no difference to the relative scope of the two adverbials, which is the same as in (28-30).

(32) a. I misled everyone_i on purpose the day before his_i briefing.
     b. She kissed everyone_i willingly on his_i cheek. (Ernst 1994)
(33) a. Kremer plays quartets in foreign countries_i on their_i national holidays.
     b. Kremer plays quartets on new federal holidays_i in their_i first 5 years of existence. (Phillips 1995)

At this point it seems that scope facts both contradict the results of other c-command tests and motivate a left-branching VP-structure. However, these facts do not pose problems for Prediction I, because the scope generalization breaks down under closer scrutiny. We must control for the fact that sentence-final focal stress has an independent effect on interpretation, which makes it tend to be associated with widest scope. This can be controlled for by adding a third adverbial, as in the examples in (34).
While ensuring that the third adverbial is receiving focal stress, we can ask whether the first two adverbials show the same scopal biases that they showed when they were the only two adverbials. My informants share the intuition that any forced scope nesting among the first two adverbials that might have been present in (28–30) goes away when an extra phrase is added that takes away the focal stress.

(34) a. Sue kissed him willingly many times in front of the boss.
     b. Kremer plays concertos in foreign countries on weekends at the height of the season.

In (34a) it is much easier than it was in (29b) to obtain a reading in which it is kissing many times that Sue did willingly (left-to-right scope), although the reading in which there were many individual willing kisses (right-to-left scope) is also still available. The loss of the requirement for right-to-left scope readings is even clearer in (34b). Recall that (30a) was most naturally taken to mean that it is on weekends that Kremer plays concertos in foreign countries. If this is the result of obligatory right-to-left scope then (34b) should be interpreted as 'it is at the height of the season that it is on weekends that Kremer plays concertos in foreign countries'. This double restriction implies that when it is not the height of the season Kremer plays concertos in foreign countries at times other than on weekends. This reading is certainly not the required reading for (34b), and for many speakers it is not even available. The fact that the scopal relations among adverbials are not fixed by their linear order, as the examples in (34) seem to indicate, is more consistent with the system proposed here. It suggests that the facts in (28–30) probably do not reflect obligatory right-to-left c-command among multiple adverbial phrases, but instead reflect some independent property of focal stress assignment.
Furthermore, if the scope readings in (28–30) are not indicative of c-command relations, then the examples in (32–33) also should not be taken to show a contradiction between the results of two different c-command tests. I should stress again that I am not trying to claim that scope relations among adverbials are never structurally represented. This will become evident when we consider the interaction of adverb scope with VP-ellipsis in §2.6 below. What I am challenging is the claim that the scope readings in sequences of adverbials motivate obligatory right-to-left scope and hence obligatory right-to-left c-command relations. Therefore the first prediction holds up: that c-command tests should not conflict, and should uniformly diagnose right-branching structures except where forced by interpretation.

2.5 Linear Order and Left-edge Constituency

Prediction II from §2.3 is repeated below as (35).

(35) Prediction II
     Left-edge constituents are destroyed when material is added on their right. Therefore, evidence for left-edge constituents should be restricted to relations established before their constituenthood is destroyed by the addition of new material to their right.

Prediction II points out a key prediction of the Merge Right approach to constituency. If apparent contradictions between constituency tests are a reflection of the stages of left-to-right derivations, in which certain constituents are destroyed when other constituents are created, then we expect some constituents to be available to syntactic processes for only part of a derivation. This section focuses on one aspect of this prediction. I show that when there is evidence for both left- and right-edge constituency in a given sentence, those syntactic relations which motivate the existence of left-edge constituents are always established before the addition of the material that motivates the existence of right-edge constituency.
The constructions discussed here involve leftward and rightward movement and Right Node Raising.

2.5.1 VP-Fronting

VP-fronting constructions appear to support the existence of a left-branching structure for VP, because strings starting at the left-edge of VP can be fronted, stranding material on the right-hand side of VP. The relevant examples were already presented in §2.2, and are repeated below as (36).

(36) a. ... and [give candy to children in libraries on weekends] he did.
     b. ... and [give candy to children in libraries] he did on weekends.
     c. ... and [give candy to children] he did in libraries on weekends.
     d. ... and [give candy] he did to children in libraries on weekends.
     e. * ... and [to children in libraries] he did give candy on weekends.
     f. * ... and [in libraries on weekends] he did give candy to children.

The examples in (37) are similar to the examples in (36), except that they contain a reciprocal binding relation between a pronoun in the fronted portion of VP and a reciprocal in the portion of VP that is stranded. This kind of binding relation is what we would expect to find if the fronted portion of VP were in its unfronted position, and if the entire VP were right-branching. However, the highlighted pronouns in (37a–b) do not c-command the reciprocal from their fronted position, nor would they even c-command the reciprocal if they were in their unfronted position in a left-branching VP structure. In other words, assuming reconstruction of the fronted phrases into a left-branching VP is insufficient to account for the binding facts.

(37) a. ...and [give the book to them in the garden] he did ___ on each other's birthdays.
     b. ...and [give the book to them] he did ___ in the garden on each other's birthdays. (Pesetsky 1995, p.230)

The fact that the material at the left-edge of VP can form a constituent to the exclusion of material at the right of VP would normally provide a straightforward argument for a left-branching VP structure.
In combination with the binding evidence motivating a right-branching structure in (37), then, the VP-fronting facts appear to contradict one another. However, since in a left-to-right derivation the movement chain is completed before the anaphor is added, the system proposed here allows for the fronted portion of VP to be an incomplete right-branching VP-structure, rather than a piece of a left-branching VP structure. (38) shows how the facts in (36–37) are expected in a left-to-right derivation of a right-branching structure. In (38a) the fronted portion of VP is first built, in its fronted position. This partial VP is internally right-branching, and is the result of a derivation like (38a–e) above.

(38) a. ... and [give [the book [to them]]]

Then in (38b) and (38c) the subject and do are added to the structure, and then a copy of the fronted VP is inserted as the complement of Infl. I assume that the movement chain is licensed at this point in the derivation, and not later.

(38) b. ... and [give [the book [to them]]] he did
     c. ... and [give [the book [to them]]] he did [give [the book [to them]]]

Subsequently, in (38d) extra material is added to the right of VP, inserting the temporal modifier on each other's birthdays. This adverbial is inserted at the bottom of a right-branching VP, which (i) creates the c-command relation necessary to license the reciprocal binding relationship, and (ii) destroys the constituency of the string give the book to them. The loss of this constituent does not matter, though, because the movement chain involving this constituent had already been established and licensed before the modifier was added.

(38) d. ... and [give [the book [to them]]] he did [give [the book [to [them [on each other's birthdays]]]]]

In this way the apparent constituency conflict can be captured in the derivation of a right-branching structure.
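The steps in (38) can be made concrete with a small sketch. It is only an illustration under simplifying assumptions that are not part of the proposal itself: trees are nested Python pairs, multi-word phrases such as the book are treated as single leaves, and the hypothetical helper `merge_right` always attaches new material at the very bottom of the right edge (the system described here also tolerates higher attachment where interpretation requires it). The function names are invented for the example.

```python
# Toy model: a tree is a leaf (str) or a binary node (left, right).

def merge_right(tree, item):
    """Attach `item` as sister of the rightmost leaf of `tree` (assumed
    simplification: always attach at the bottom of the right edge)."""
    if isinstance(tree, str):
        return (tree, item)
    left, right = tree
    return (left, merge_right(right, item))

def leaves(tree):
    """Left-to-right terminal string of `tree`, as a tuple."""
    if isinstance(tree, str):
        return (tree,)
    left, right = tree
    return leaves(left) + leaves(right)

def constituents(tree):
    """Set of terminal strings spanned by some subtree of `tree`."""
    found = {leaves(tree)}
    if not isinstance(tree, str):
        for child in tree:
            found |= constituents(child)
    return found

def derive(words):
    """Build a structure left to right, one merge_right step per word."""
    tree = words[0]
    for word in words[1:]:
        tree = merge_right(tree, word)
    return tree

fronted = ("give", "the book", "to", "them")
partial_vp = derive(fronted)                                   # as in (38a)
full_vp = merge_right(partial_vp, "on each other's birthdays") # as in (38d)

print(fronted in constituents(partial_vp))  # constituent before the modifier
print(fronted in constituents(full_vp))     # no longer a constituent after it
print(("them", "on each other's birthdays") in constituents(full_vp))
```

In this toy model the string give the book to them is a constituent at the stage where the movement chain is licensed, and stops being one once the modifier is merged; at that later stage them and the modifier are sisters, which is the configuration relied on for reciprocal binding.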
[See §2.6 for discussion of a contrast between the VP-fronting facts presented here and related facts involving VP-ellipsis.] It is important to note that this way of achieving the effects of contradictory constituency from a single derivation depends on the left-to-right ordering of the derivations that I am assuming here. To see this, imagine what would happen if we tried to capture the same effects in a bottom-to-top derivation. In this kind of a derivation, the entire right-branching VP would be built before the VP-fronting operation could apply. But this would mean that by the time the VP-fronting operation could apply, the portion of VP that is fronted in (38) would no longer be a constituent, and therefore could not be a candidate for movement. Alternatively, it might be objected that the constituency conflict shown by (36–37) is only apparent, because the binding relations are entirely consistent with a left-branching VP-structure, and that the problem is an artifact of assuming that the binding relations motivate a right-branching structure. This analysis could account for the facts in this subsection, but arguments to be presented in §2.6–7 involving contrasts between VP-fronting and VP-ellipsis show that this analysis leads to the loss of important generalizations about the distribution of constituency conflicts. While the account given here makes it possible to account for the apparent conflict between binding and movement diagnostics, there is an additional restriction on VP-fronting which does not follow automatically from the phrase structure theory presented here. VP-fronting does not allow the two complements of a double object construction to be separated by movement (39), and neither can an argument or adverbial phrase be split up by movement (40).

(39) * ...and [give the children] he did candy in libraries on weekends.
(40) a. * ...and [give candy to] he did the children in libraries on weekends.
     b.
* ...and [give candy to children in] he did libraries on weekends.

In Pesetsky's left-branching Layered structures this restriction follows immediately, because the bracketed strings in (39–40) are not constituents. Under the Merge Right approach, on the other hand, in which partial VP-fronting is just fronting of intermediate stages in the construction of right-branching VP structures, the bracketed strings in (39–40) are constituents, and so we might expect them to be allowed to front. The additional requirement seems to be that the fronted portion of VP be a potential complete VP. None of the fronted strings in (39–40) are possible as complete VPs. I assume that the requirement that a potential complete VP fronts is a construction-specific semantic requirement, and that the restriction does not undermine the claim that the bracketed strings in (39–40) are potential constituents. As we will see later in this section, Right Node Raising allows coordination of many of the constituents that cannot undergo VP-fronting, because it is not subject to the same semantic requirement. Note that the restriction cannot be that only adverbial phrases can be stranded by VP-fronting. The two complements of the verb in a dative construction may sometimes be separated by VP-fronting. When the goal argument is optional, as with the verb give, the verb and the theme may be fronted (41a); when the goal argument is obligatory, as with the verb hand, the verb and the theme cannot be fronted (41b).

(41) a. (?) ...and [give candy] he did to the children in libraries on weekends.
     b. * ...and [hand candy] he did to the children in libraries on weekends.

The contrast between (41a–b) is consistent with the generalization that only potential complete VPs may be fronted, and shows that this notion must be relativized to individual lexical items. I leave the reason for the semantic restriction on VP-fronting as an open question for the time being.
However, I note one reason why the restriction is not surprising. The fronted portion of VP in VP-fronting constructions is the entire VP in the first conjunct of these constructions. The initial conjuncts for sentences like (39–40) would have to be like those in (42), which are clearly impossible.

(42) a. * John intended to give the children, and ...
     b. * John intended to give candy to, and ...
     c. * John intended to give candy to children in, and ...

Next we consider constructions which show constituency conflicts in only a limited range of situations, in a manner predicted by their ordering properties.

2.5.2 PP-movement

The interaction of binding and movement processes involving PPs shows a constituency conflict similar to what we have seen with VP-fronting in (36–38), but with an additional twist which enables us to test the predictions of Merge Right more closely. As we have already seen above, the binding and coordination properties of noun phrases inside PPs motivate right-branching structures in which the NP is not the sister of the preposition that selects it, and instead forms a constituent with the category that follows it. (43) gives some examples of the kinds of binding phenomena which have led to this conclusion, and (44) shows the split PP structure that these motivate.

(43) a. Mrs. McGarrick sent a card to every child_i on his_i birthday.
     b. The urban-hygiene inspectors departed from every city_i during its_i rush hour.
     c. The chef told the guests about every dish_i as it_i was served.
     d. Mrs. McGarrick gave a card to none of the children on any of their birthdays.
     e. Mrs. Murray gave money to her children on each other's birthdays.
(44) [tree diagram not reproduced]

A P-NP combination that has been split up in the manner shown in (44) is not a constituent, and therefore should not be able to undergo movement. Clearly, though, leftward movement of PPs presents no problems, as the examples in (45) indicate. This implies that the P-NP combination is a constituent after all.

(45) a.
To each of the girls John gave a package ___ wrapped in brown paper.
     b. To which city in Connecticut did Mary take the train ___ every day of the week?

Moreover, the kind of binding out of a PP which motivated the PP-splitting structures is still possible when the PP containing the binder is fronted, as the examples in (46–47) show. The examples in (47a–b) are taken from Pesetsky 1995 (p.228).

(46) a. To each of the girls John gave money for her college fees.
     b. To which pair of boys did John accidentally give money on each other's birthdays?
(47) a. To none of the officials did Sue send her money ___ on any of these days.
     b. On which table did Tom put the book ___ during its construction?

The interaction of movement and binding with PPs thus gives rise to a constituency contradiction similar to the one we saw with VP-fronting, since the movement properties support a structure in which the PP is a constituent, whereas the binding properties support a structure in which the PP is not a constituent. The contradiction can be accounted for in exactly the same way that the VP-fronting facts were explained, because both links of the movement chain are built prior to the addition of the adverbial phrase that creates the c-command relation required for binding and destroys the constituency of the PP. The relevant steps of the derivation of sentence (46a) are given in (48). First the fronted PP is built sentence-initially (48a). At this point the PP to each of the girls is a constituent. Next the material intervening between the head and the tail of the PP-movement chain is added to the structure (48b), and then a copy of the fronted PP is inserted at the appropriate position in VP for the goal argument of give (48c). At this point both ends of the PP-movement chain are constituents.
It is only when the additional phrase for her college fees is added on the right that the PP to each of the girls is split, such that the NP each of the girls forms a constituent with the following PP and is able to license the bound variable pronoun her (48d).

(48) a. [to [each of the girls]]
     b. [to [each of the girls]] John gave money
     c. [to [each of the girls]] John gave money [to [each of the girls]]
     d. [to [each of the girls]] John gave money [to [[each of the girls] for her college fees ]]

Thus far the PP movement facts are entirely parallel to the VP-fronting facts in (36–38). In both cases we have observed what under standard assumptions would be a straightforward constituency contradiction, and shown that the Merge Right approach makes it possible to account for such facts in terms of how constituency changes over the course of a derivation. Until now, though, we have not directly tested the prediction that once a constituent is destroyed it cannot be referred to again later in the derivation. This prediction can be tested with PPs, since PPs can be moved both leftwards and rightwards. If the Merge Right approach to contradictory constituency involving PPs is correct, then we should expect to find differences between leftward and rightward movement of PPs with respect to how they interact with binding. I assume here that rightward movement is identical to leftward movement insofar as it involves a series of copies of a given phrase, just one of which is overtly realized. The only difference between leftward and rightward movement, therefore, will be in whether the overt copy is on the left or on the right of the unpronounced copies. I assume in addition that heavy shift operations involve a lowering operation which copies a phrase in its base position inside VP to a position lower in a right-branching VP structure. Now consider the structure in (44), repeated below with category labels as (49a).

(49) a., b. [tree diagrams not reproduced]
As we have already seen, structure (49a) is consistent with leftward movement of the PP to every child, because PP-splitting occurs only after the movement chain has been completed. On the other hand the structure in (49a) should be incompatible with rightward movement (i.e. heavy shift) of the PP to every child across the PP on his birthday. This is because a left-to-right derivation does not allow the rightward movement chain to be completed before the addition of the following PP, which would normally be the trigger for PP-splitting. If, on the other hand, the PP fails to undergo PP-splitting and remains a constituent when a subsequent PP enters the derivation, yielding the slightly less right-branching VP-structure in (49b), then the PP should be fully capable of participating in a rightward movement chain. The price of failing to undergo PP-splitting, though, is that the NP every child should no longer be able to act as a binder, because it cannot c-command out of PP. This prediction appears to be correct, as the following examples show. First we need to show that rightward movement does in principle allow reconstruction effects for the purposes of binding. (50) demonstrates this for Heavy NP Shift using an example from Baltin & Postal 1996.

(50) a. I described [the victim whose sight had been impaired by the explosion] to himself.
     b. I described ___ to himself [the victim whose sight had been impaired by the explosion].

(51) shows that Heavy PP Shift is a possible operation. The crucial examples in (52) and (53) show that when a PP that allows binding when in-situ (52a, 53a) undergoes Heavy PP Shift, the binding is no longer possible (52b, 53b).

(51) I gave money in an envelope to every boy who had helped me clean the yard.
(52) a. I gave money to the boys for themselves.
     b. * I gave money ___ for themselves to the boys who had helped me clean the yard.
(53) a. I gave money to every boy on his birthday.
     b.
* I gave money ___ on his birthday to every boy who had helped me clean the yard.

The contrast between the possibility of reconstruction and binding when a PP is moved leftwards and the impossibility of reconstruction and binding when the same PP is moved rightwards is a straightforward consequence of a theory which assumes left-to-right derivations and splitting of PPs. As far as I can tell the contrast is not expected under accounts in which structure is built from bottom-to-top or in which there are no derivations.

2.5.3 Right Node Raising

Right Node Raising gives rise to constituency puzzles similar to the ones discussed in §2.5.1 and §2.5.2, in that different properties of a single sentence appear to provide evidence for two different structural analyses of that sentence. But Right Node Raising provides the most extreme case yet. Whereas in §2.5.1 and §2.5.2 we were concerned with conflicting structural analyses for PPs or VPs, Right Node Raising creates conflicts in the analysis of entire sentences. The classic form of Right Node Raising (RNR) involves coordination of subject-verb sequences, with the remaining clausal material effectively shared between the two conjuncts, as in (54).

(54) [John sold] and [Mary bought] the stack of books that was required for linguistics 101.

If we adopt the logic standardly applied to coordination, that strings that can be coordinated are constituents, then the fact that the strings John sold and Mary bought can be coordinated in (54) provides evidence that the subject and the verb can form a constituent to the exclusion of the object. The primary aim of this section is to show that RNR motivates the existence of non-standard constituents like [subject verb], which most accounts of RNR have attempted to deny. The secondary aim of the section is to show that the existence of constituents like [subject verb] does not entail the existence of structures like (55) in which the subject fails to c-command the object.
(55) [tree diagram not reproduced]

The structure in (55) predicts that the subject should not be able to bind an object in RNR examples similar to (54). (56) shows that this prediction is clearly false. The shared object in an RNR sentence can be bound by the subject. Further evidence against the structure in (55) is given below.

(56) a. John sold and Mary bought each other's textbooks.
     b. Everyone_i suspected but nobody_i really believed that he_i was being investigated by the FBI.

I will show that the apparent conflict between the constituency motivated by the coordination and the binding in (56) can be resolved in a left-to-right approach to structure building.

2.5.3.1 Disguised Clausal Coordination

Since almost all phrase structure analyses have assumed that the subject and the verb in English uncontroversially do not form a constituent to the exclusion of the object, RNR has typically been analyzed as one form or another of disguised coordination of clauses. There have been two basic approaches to treating RNR as clausal coordination. The first approach, illustrated in (57), is to assume that RNR involves clausal coordination followed by across-the-board (ATB) rightward extraction of the shared material (cf. Ross 1967/1986, Maling 1972, Postal 1974 and many others). In other words, the shared material is part of both conjuncts, but it is not in-situ in either conjunct.

(57) [tree diagram not reproduced]

The second kind of clausal coordination approach to RNR encompasses a number of theories which modify standard phrase structure theories in such a way that the shared material in RNR can be in-situ and shared between both conjuncts without ATB extraction. Versions of this approach have been advanced by Williams 1978 and Erteschik-Shir 1987 under the heading of clausal factorization, by Goodall 1987 in terms of phrase marker union, and by Muadz 1991 and Moltmann 1992 under the heading of three-dimensional phrase markers.
What these approaches have in common is that they assume that RNR is the result of the superimposing of two partially identical sentences or factors upon one another. Where the two sentences are identical, there is just one representation for both occurrences. Only where the factors differ do the representations of the two factors diverge, as (58) shows. This separation of the two sentences is marked by a conjunction such as and, which is quite crucially not a part of either of the independent factors.

(58) a. I know that John sold a large stack of linguistics books.
        and
        I know that Mary bought a large stack of linguistics books.
     b. [diagram not reproduced]

Both of the clausal coordination approaches to RNR manage to avoid positing non-standard constituents (e.g. subject-verb) by assuming that the shared material is somehow a part of each conjunct, either in-situ or extracted. However, I will argue that neither of these approaches can rescue a clausal coordination analysis of RNR (regardless of their other merits). If it can be shown that the shared material in RNR sentences cannot have been moved out of both of the conjuncts, and cannot be in-situ in both conjuncts, then we must conclude that the shared material in RNR is not a part of both conjuncts, and therefore RNR must involve the coordination of units smaller than a clause. If this is the case, then the characterization of the puzzle in (54–56) stands.

2.5.3.2 Right Node Non-Raising

There are a number of arguments in the literature against the ATB extraction analysis of RNR. The logic of these arguments is typically to show, based on some diagnostic or other, that the shared material behaves as if it has not undergone movement. This could involve either evidence that the shared material in RNR fails to induce movement violations in situations in which the ATB extraction analysis would predict a movement violation, or evidence that binding relations are possible which are unexpected if the shared material has been displaced.
(59) shows that RNR does not induce wh-island violations (Wexler & Culicover 1980). (59a–b) show that leftward movement across who leads to ungrammaticality; (59c) shows that no such violations are incurred in RNR, suggesting that movement has not occurred.

(59) a. * What does Mary know a man who buys and Bill know a man who sells?
     b. * It is pictures of Fred that Mary knows a man who buys and Bill knows a man who sells.
     c. Mary knows a man who buys and Bill knows a man who sells pictures of Fred.

(60) shows that in languages in which preposition stranding is strongly ungrammatical the complement of a preposition can be shared in RNR (McCloskey 1986), leaving the preposition stranded at the right-hand edge of each conjunct. This suggests that extraction from PP has not occurred. Example (60) is taken from Irish, but identical arguments can be made with Spanish, French or Polish, as McCloskey shows.

(60) Níl sé in aghaidh an dlí a thuilleadh a bheith ag éisteacht le
     is-not it against the law anymore be listen(prog) with
     nó ag breathnú ar ráidió agus teilifís an Iarthair.
     or look(prog) on radio and television the West(gen)
     'It is no longer against the law to listen to, or to watch, Western radio and television.'

Next, although the simplest cases of RNR involve the coordination of subject-verb sequences and the sharing of a direct object, the examples in (61) show that more than just subject-verb sequences can be coordinated and more than just direct objects can be shared in RNR constructions.

(61) a. [John will] and [Mary already has] mailed the conference program to all of the presenters.
     b. [John will post] and [Mary is about to e-mail] a copy of the conference program to all of the presenters.
     c. [John will mail the abstracts] and [Mary is about to e-mail the program] to anybody who registered in advance.
The relevance of the examples in (61) to ATB accounts of RNR is that they show that a wide range of different categories can serve as the shared material, including categories for which there is no independent evidence that they can undergo movement. For example, neither the VP headed by a participial in (61a) nor the two objects of the double complement construction in (61b) can undergo leftward movement in English, as the examples in (62) show. Nor can they undergo rightward movement, as (63) shows.

(62) a. * (and) [mailed the conference program to all of the presenters] Mary already has.
     b. * [A copy of the conference program to all of the presenters] Mary is about to e-mail.
(63) a. * Mary already has ___ from her local post office [mailed the conference program to all of the presenters].
     b. * Mary is about to email ___ from her company account [a copy of the conference program to all of the presenters].

I do not claim to have an explanation of why the movements shown in (62) are impossible. The relevance of the examples in (62–63) is just that since we know that the shared phrases in (61a–b) cannot be moved leftwards or rightwards, it would be surprising if these phrases are allowed to move only when the movement is string vacuous. But this assumption would be the only way of accommodating (61a–b) under an ATB analysis of RNR. As a further argument against the ATB account of RNR, (64) shows that the shared material in RNR behaves as if it is in-situ for the purposes of a variety of tests of binding and coreference (Levine 1985). The subject can bind a variable inside the object in (64a), it can license a negative polarity item (64b), and it induces a Condition C violation in (64c). These facts again suggest that movement has not taken place.

(64) a. [Everyone_i liked] and [at least one person_i loved] the paper he_i had been asked to review.
     b. [Nobody enjoyed] and [few people even liked] any of the talks on Right Node Raising.
     c.
* [I know that she_i said] and [I think we all agree] that Mary_i needs a new car.

Finally, if RNR does not involve movement, then the shared constituent should always fill the final position of the coordinated constituents (cf. Oehrle 1991, Wilder 1994). If, on the other hand, RNR involves ATB extraction (Williams 1990; Postal 1994), then it ought to be possible to share a phrase that has been extracted from the middle of both conjuncts. Distinguishing between these alternatives requires some care, because RNR may interact with heavy NP shift in such a way that it appears that the shared material has been extracted. For example, we could derive (65) either by directly moving the clause-final NP out of each of the underlined gaps, or by first applying HNPS in each conjunct, and then sharing the final NP without movement, as in the derivation sketched in (67).

(65) [Patty sent ___ to Greenland] and [Susie sent ___ to her rich Uncle Ben] a list of all the things she wanted for Christmas.
(66) a. [Patty sent ___ to Greenland ___] and [Susie sent ___ to her rich Uncle Ben ___] a list of all the things she wanted for Christmas.
(67) a. [... V NP PP] and [... V NP PP]          basic order
     b. [... V __ PP NP] and [... V __ PP NP]    heavy NP shift
     c. [... V __ PP] and [... V __ PP] NP       right node raising

We can test for whether (65) is the result of ATB extraction from the middle of each conjunct or the result of heavy shift feeding RNR by constructing examples in which the shared material cannot undergo heavy NP shift. Once we do this, as (68–70) show, RNR becomes impossible. (68) shows that stranding prepositions in RNR, where the stranded preposition is the final word of the first conjunct, is acceptable. (69) shows that P-stranding does however cause problems for heavy NP shift. (70) is like (68), except that it contains the impossible HNPS environment from (69).

(68) Patty wrote to and Susie sent email to the person she hoped would bring her wonderful Christmas gifts.
(69) * Patty wrote to after breakfast the person she hoped would bring her wonderful Christmas gifts.
(70) * Patty wrote to after breakfast and Susie sent email to just before lunch the person she hoped would bring her wonderful Christmas gifts.

The fact that (70) is also bad therefore implies that RNR cannot share material from the middle of the conjuncts. Therefore, the impression that this is possible that we might draw from (65–66) is just due to the fact that heavy shift feeds RNR. For reasons like these, it has generally been concluded that the ATB movement approach to Right Node Raising is not viable. But this does not necessarily entail that Right Node Raising cannot be clausal coordination, because all of the facts in (59–70) are consistent with the clausal factorization approach to RNR. This is because the shared material is in-situ in both conjuncts in clausal factorization theories.

2.5.3.3 Factorization and Ordering

In what follows I do not try to argue against three-dimensional or factorization approaches to coordination in general. In fact, I think that there are a number of good reasons to adopt such an approach. My criticism is targeted specifically at the use of these approaches to give a clausal coordination analysis of RNR, and thereby avoid the need to posit non-standard constituents like [subject verb]. I assume that the final representations of RNR sentences involve in-situ shared phrases, as is the case in factorization theories, but I assume that coordination occurs at an earlier point in the left-to-right derivation, when the shared material has not yet been added. The example of RNR in (71a) is derived by first building a subject-verb constituent, at which point it can be coordinated with another subject-verb constituent. I assume that this conjunction receives the kind of interpretation that Moltmann 1992 proposes for parallel structures in her theory of three-dimensional structures.
It is only after this coordination has been licensed that the shared object is added on the right, destroying the constituency that had licensed the coordination, and creating a configuration in which the subject c-commands the object.

(71) a. John sold and Mary bought the stack of books that were required for linguistics 101.
     b.

Therefore, RNR only gives the appearance of coordinating the pieces of left-branching structures because the coordinated phrases are constructed prior to the addition of the shared material. Given that the final representation in (71b) looks very much like what is assumed in factorization approaches, a natural question to ask is why there is any need to assume that non-standard subject-verb constituents are coordinated. The argument for this comes from some facts involving the relative ordering of the conjuncts and the shared material. Any account of RNR must explain the impossibility of examples like (72), in which the shared material occurs at the end of the first rather than the second conjunct.

(72) * John saw Mary and Bill likes.

According to the left-to-right theory, (72) is impossible because by the point in the derivation at which the complete sentence John saw Mary has been built, the string John saw is no longer a constituent, and so there is no longer a subject-verb constituent available to coordinate with Bill likes. In other approaches to RNR it is also rather straightforward to account for the ill-formedness of (72), by invoking some additional ordering requirement. This additional mechanism either deletes the copy of the shared material in the first conjunct (Wexler & Culicover 1980, Kayne 1994), or aligns the phrase that is in-situ in both conjuncts with the right-hand edge of the second conjunct (McCawley 1982, Moltmann 1992). However, we can show that such additional mechanisms fall short when faced with some additional ordering facts.
Thus far in this section we have only considered examples of RNR in which the two conjuncts are connected by a standard coordinator, such as and. However, it is possible to construct examples of non-coordinate RNR, as pointed out by Richard Hudson in a 1976 paper and largely overlooked in most treatments of RNR since then. (73) repeats some of Hudson's examples. The examples in (74) are from Postal 1994: the highlighted strings take the role that coordinators take in simpler cases of RNR above. As the examples show, it is even possible (73a) for a subject and an object to behave as the two conjuncts in this variety of RNR.

(73) a. Of the people questioned, [those who liked] outnumbered by two to one [those who disliked] the way in which the devaluation of the pound had been handled.
     b. I'd have said he was sitting [on the edge of] rather than [in the middle of] the puddle.
     c. It's interesting to compare [the people who like] with [the people who dislike] the power of the big unions.
(74) a. [Politicians who have fought for] may well snub [those who have fought against] chimpanzee rights.
     b. [People who are learning to speak (in)] may hate [those who already can speak (in)] that little-known language.
     c. [People who believe there may soon be on Venus] tend to distrust [those who believe there already are on Mars] extraterrestrials capable of understanding parasitic gaps.
     d. [Spies who learn when] can be more valuable than [those able to learn where] major troop movements are going to occur.

The examples in (75) extend Hudson's examples and show that in non-coordinate RNR, where the Coordinate Structure Constraint presumably does not apply, one of the conjuncts can undergo movement independently of the other. (75b–c) show raising of the first conjunct, (75d) shows passivization, (75e) shows an instance of possible unaccusative raising, and (75f) shows wh-movement.

(75) a. The people who liked easily outnumbered the people who disliked the movie.
     b.
The people who liked must ___ have easily outnumbered the people who disliked the movie.
     c. The people who liked seemed ___ to have easily outnumbered the people who disliked the movie.
     d. The people who like are easily outnumbered ___ by the people who dislike the movie.
     e. The people who liked arrived ___ much earlier than the people who disliked the movie.
     f. Which voter group that liked ___ outnumbered which voter group that disliked the info-mercial?

Now consider the examples in (76–77). In each of the examples the shared material appears to the right of both conjuncts. But whereas in the (a) examples the shared material also appears to the right of the underlying position of both conjuncts, in the ungrammatical (b) examples the shared material appears to the left of the underlying position of the moved conjunct.

(76) a. The people who liked seemed ___ to have offended the people who disliked the movie about Reagan's childhood.
     b. * The people who liked seemed to the people who disliked the movie about Reagan's childhood ___ to be complete fools.
(77) a. Which voter group that liked ___ outnumbered which voter group that disliked the info-mercial?
     b. * Which voter group that disliked did which voter group that liked the info-mercial outnumber ___?

Therefore, there appears to be a requirement that the shared material appear to the right of both the surface and the underlying positions of both conjuncts. This fact does not follow from an approach to ordering in RNR which assumes that the ordering restrictions are the result of a surface linearization rule or surface filter. On the other hand, the restriction does follow from the account that I have given, in which coordination takes place before the shared material is added to the derivation. (78) shows the range of possible and impossible movements in non-coordinate RNR, as predicted by the account given here.

(78) a. Conj1 Conj2 shared-material
     b. Conj1 Conj2 shared-material
     c.
Conj1 shared-material Conj2

If the first conjunct moves to a position on the left of the second conjunct (78a), both constituents are available to be coordinated before the shared material is added. Problems arise, however, if the first conjunct must move to an underlying position to the right of the second conjunct and the shared material, as in (76b) and (77b). The reason for this is that once the shared material has been added, neither the first conjunct alone nor the first conjunct plus the shared material forms a constituent. Thus movement is blocked, despite the fact that the surface ordering of the conjuncts and the shared material is the same as in well-formed instances of RNR. A theory which derived ordering restrictions on RNR from a surface linearization filter would therefore fail to exclude (76b) and (77b). Meanwhile, a rule which simply required the underlying position of the shared material to be to the left of both conjuncts would fail to exclude situations like (78c), in which the surface position of the shared material precedes the second conjunct, but its underlying position does not. Such cases are clearly bad, as (79) shows.

(79) a. * The people who liked the movie about Reagan's childhood seemed to the people who disliked ___ to be complete fools.
     b. * Which voter group that disliked the info-mercial did which voter group that liked outnumber ___?

The examples in (79) are ruled out in the current approach to RNR because the combination of the first conjunct with the shared material prior to the building of the second conjunct rules out the possibility of coordinating the two conjuncts. What I hope to have shown with this argument is that the conjuncts in RNR have properties before the shared material is added which they do not have after the shared material is added, e.g., they can move.
This distinction is straightforwardly expressed in a left-to-right approach in which RNR involves coordination of non-final constituents, but it is not easily captured in a more standard version of factorization theories in which RNR involves coordination of units which include the shared material.

2.5.4 Summary

To summarize the results of this section briefly, I have provided evidence for two aspects of the left-to-right approach to structure building. First, in both VP-fronting and Right Node Raising constructions I gave evidence for the participation in grammatical processes of the pieces of incomplete phrase markers. Second, I have shown evidence for an account of constituency conflicts which attributes conflicting results of different constituency diagnostics to the different stages of a left-to-right derivation. We have observed a series of constructions in which evidence for right-branching structures appears to coexist with evidence for non-right-branching structures. In each case, though, the syntactic relations which motivated the non-right-branching structures were shown to be established to the left of the syntactic relations which motivated the right-branching structures. In the one case where this ordering generalization was violated (Heavy PP Shift), contradictory constituency effects were not observed. This ordering generalization receives a straightforward explanation in the Merge Right approach to syntactic structure building, but is hard to capture otherwise.

2.6 Constituency Conflicts and Parallelism

This section demonstrates a contrast in the distribution of contradictory constituency effects between VP-fronting constructions of the kind already discussed in §2.5 and VP-ellipsis constructions. The two constructions are superficially similar, in that they involve replacement of a VP by do and allow stranding of adverbial phrases.

(80) a. Mary read the book on Monday and John did on Thursday. (VP ellipsis)
     b.
John had to finish the paper, and finish the paper he did on Thursday. (VP fronting)

The contrast that I focus on here is that while the VP-fronting construction (VPF) exhibits contradictory constituency effects, as §2.5 showed, the VP-ellipsis (VPE) construction does not. This contrast provides support for the left-to-right approach to structure building proposed here, as we shall see. Both VPE and VPF involve coordination, and like all coordinate structures they are subject to parallelism requirements. (81) repeats the prediction from §2.3 above about the interaction of parallelism requirements and contradictory constituency effects.

(81) Prediction III
     a. Parallelism requirements across two conjuncts should only be able to refer to properties of the final constituent structure of the first conjunct.
     b. Parallelism requirements between conjuncts should block contradictory constituency effects which would be possible in either of the conjuncts individually.

The reasoning behind this prediction is as follows. In a left-to-right derivation the first of a pair of conjuncts will be fully assembled before the second conjunct is built. Therefore, as the second conjunct is being constructed it should only be possible to access the properties of the completed first conjunct, and not properties of intermediate stages in the derivation of the first conjunct. Since contradictory constituency effects in this theory are explained with reference to intermediate stages in the derivation of clauses, the conditions for contradictory constituency effects should not be available when parallelism constraints apply.

2.6.1 An Asymmetry between VP-Fronting and VP-Ellipsis

Both VPF and VPE allow fronting/ellipsis of strings of phrases starting at the left edge of VP, and stranding of material from the right edge of VP. Examples are shown in (82–83). These are the kinds of facts which in the past have led people to assume that complex VPs have a left-branching structure.

(82) a. ...
and [give candy to children in libraries on weekends] he did.
     b. ... and [give candy to children in libraries] he did on weekends.
     c. ... and [give candy to children] he did in libraries on weekends.
     d. ... and [give candy] he did to children in libraries on weekends.
     e. * ... and [to children in libraries] he did give candy on weekends.
     f. * ... and [in libraries on weekends] he did give candy to children.
(83) a. John gives candy to children in libraries on weekends, and Mary does (too).
     b. John gives candy to children in libraries on weekends and Mary does on federal holidays.
     c. John gives candy to children in libraries on weekends and Mary does in urban parks on federal holidays.

Both VPF and VPE show evidence for right-branching structure within the fronted/elided portion of VP based on binding evidence, as shown in (84–85), in which no VP-material is stranded.

(84) a. ... and [introduce the children to each other] the teacher proceeded to do.
     b. ... and [congratulate everybody on his birthday] he did.
(85) a. The principal introduced the children to each other, and then the teacher did (too).
     b. The boss congratulated everybody on his birthday, and the receptionist did (too).

Up to this point VPF and VPE are entirely alike. However, a contrast emerges when we look at the relations that are possible between the fronted/elided portion of VP and the stranded portion of VP. The examples in (86) parallel examples from §2.5 above which show that the fronted portion of VP has the binding properties that it would have if it were in-situ in a right-branching VP. The evidence for this is that material in the fronted portion of VP is able to bind reciprocals or bound variable pronouns in the stranded portion of VP (86a–b), and a quantificational direct object is able to take wide scope with respect to a stranded adverbial, as demonstrated by the availability of a distributive reading for (86c), according to which each individual book-reading was fast.

(86) a.
John said he would give books to them, ... and give books to them he did [on each other's birthdays].
     b. Mary said she would congratulate every boy, ... and congratulate every boy she did [at his graduation].
     c. John said he would read every book, ... and read every book he did [at breakneck speed].

In corresponding examples involving VPE, on the other hand, we do not find corresponding evidence for right-branching structure. Material inside the elided portion of VP is not able to license anaphors or bound variables in the stranded portion of VP, as (87a–b) show.

(87) a. * John gave books to them on each other's birthdays, and Mary did [on each other's first day of school].
     b. * Mary congratulated every boy at his graduation, and Sue did [at his 21st birthday party].

Stranded VP-material takes wide scope with respect to material in the elided portion of VP. This can be seen by comparing the possible interpretations of the single clause in (88) with corresponding examples involving VP-ellipsis.

(88) Mary finished every book quickly. (ambiguous)

(88) allows both a collective reading, in which it is the reading of all of the books which took place quickly, and a distributive reading, in which the reading of each individual book was fast. Speakers tend to report a preference for the distributive reading, which I take to be a reading in which the object NP has wide scope with respect to the adverbial, as in the tree in (89a). Both the collective reading and the distributive reading are available, however.

(89) a. b.

When sentences like (88) are embedded in a VP-ellipsis context, though, ambiguities disappear. Given the two readings of the single conjunct in (88) there are up to four potential readings for the two conjuncts of a VP-ellipsis sentence (i.e. collective–collective, distributive–distributive, collective–distributive, distributive–collective).
Only one of these four possibilities is actually available, namely the collective–collective reading in which what was quick (or slow) was the reading of the entire set of books, and not individual book-readings. The unavailability of the two readings in which the conjuncts have differing scopes may be ruled out by appeal to parallelism constraints, but we need an explanation for the absence of the distributive–distributive reading.

(90) Mary finished every book quickly, and John did slowly. (collective reading only)

This loss of ambiguity is particularly striking because it involves the loss of the reading that is generally preferred in the simple sentence in (88), with the consequence that many speakers experience a garden-path kind of misanalysis when they first read through examples like (90). To my knowledge, all of the examples in the literature showing loss of a scopal reading in ellipsis contexts involve the loss of the reading that is the marked or dispreferred scope reading in simple sentences. This makes the loss of the preferred reading in sentences like (90) all the more striking. Consistent with the loss of the distributive reading in (90), if we replace the quantifier every in (88) and (90) with a quantifier like each, which only allows a distributive reading in the simple sentence (91a), we find that VPE becomes impossible (91b).

(91) a. Mary finished each book quickly. (distributive reading only)
     b. * Mary finished each book quickly, and John did slowly.

Therefore the examples in (87–91) show the following contrast between VP-fronting and VP-ellipsis. When a partial VP is fronted, it has the binding properties that it would have if it were in-situ in its underlying position and formed part of a right-branching VP structure. When a partial VP is elided, on the other hand, it has the binding properties that it would have if it were replaced in its underlying position and formed part of a more left-branching VP.
Further confirmation of this contrast between VPF and VPE is provided by constructions which require a right-branching VP structure. By hypothesis, resultative constructions require a complement structure in which the object and the result-phrase form a single constituent, as they do in the right-branching structure in (92) (cf. Kayne 1985, Van Voorst 1986, Hoekstra 1988, 1992; but cf. Carrier & Randall 1992, Levin & Rappaport Hovav 1995 for dissenting opinion).

(92)

If VPF but not VPE allows reconstruction into a right-branching VP, then we expect that VPF will allow fronting of the verb and the direct object, stranding the result-phrase, but that VPE will not tolerate similar stranding of the result-phrase. This prediction is correct, as the examples in (93–94) show.

(93) On Saturday Mary resolved to paint her garage door, ... and paint her garage door she did all the colors of the rainbow.
(94) * Mary painted her garage door black, and John did all the colors of the rainbow.

Therefore, it seems to be a reliable fact that VPE does not allow reconstruction into a right-branching VP. We can also rule out the possibility that the absence of effects of reconstruction into a right-branching VP is an artifact of semantic or discourse properties of VPE. VPE does allow the scope relations of right-branching VP-structures when the entire VP is elided (95), so it cannot be a property of ellipsis per se that blocks the distributive reading in (90) above.

(95) Mary read all the books quickly, and John did too. (collective & distributive readings both ok)

Example (96) shows that the VP-deaccenting construction (VPD), which has been shown to be very similar to VPE in a number of respects (cf. Tancredi 1992), does not show the loss of the distributive reading that we saw in (90). (96) is most felicitous when the adverbs are read with contrastive stress.
(96) Mary read all the books quickly, and John read all the books slowly. (collective and distributive readings both ok)

Since VPE and VPD imply exactly the same kind of parallels and contrasts between the two conjuncts, we can rule out the possibility that the loss of the distributive reading in (90) is due to the semantic parallelism that has to hold between the two conjuncts in VPE. We can therefore be confident that the loss of the distributive reading in (90) and the parallel unavailability of right-branching binding relations in (87) are due to some syntactic property of VPE.

The Merge Right theory provides an account of the contrast between VPE and VPF as follows. In §2.5 I showed how contradictory constituency effects are made possible in VPF constructions. This derivation is repeated in (97). Building as usual in a strictly left-to-right fashion, first the fronted portion of VP is built, presumably in a left-adjoined position (97a). The fronted portion of VP is internally right-branching. Next the subject and do are added (97b), and then a copy of the fronted VP is inserted in the normal position of VP (97c). At this point the movement chain can be licensed. Subsequent to this the stranded VP material is added at the right of VP, and the structure of the VP can be altered in the now familiar fashion to allow the continuation of a right-branching VP to be built (97d).

(97) a. ... and [give [the book [to them]]]
     b. ... and [give [the book [to them]]] he did
     c. ... and [give [the book [to them]]] he did [give [the book [to them]]]
     d. ... and [give [the book [to them]]] he did [give [the book [to [them [on each other's birthdays]]]]]

In this way we can resolve the apparent contradiction between the kinds of partial VPs that can be fronted, which lead to the appearance of a left-branching structure, and the possibility of the scope and binding relations of a right-branching structure.
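The way in which adding material at the right edge destroys an earlier constituent, as in steps (97c) and (97d), can be illustrated with a small sketch. The Python below is a toy model under simplifying assumptions and is not part of the analysis itself: it assumes that every incoming item is merged as the sister of the rightmost terminal (the most right-branching option), it treats multi-word phrases such as "the book" as unanalyzed atoms, and it models only the in-situ copy of the VP. The function names merge_right, terminal_yield, constituents and derive are hypothetical labels introduced purely for illustration.

```python
# Toy model of strictly left-to-right, right-branching structure building.
# Assumption (for illustration only): each new item is always attached as
# the sister of the rightmost terminal; phrases like "the book" are atoms.

def merge_right(tree, item):
    """Attach item at the right edge, as sister of the rightmost terminal."""
    if isinstance(tree, str):          # a single terminal: form a new pair
        return (tree, item)
    left, right = tree
    return (left, merge_right(right, item))


def terminal_yield(tree):
    """Return the string of terminals dominated by a node."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    return terminal_yield(left) + " " + terminal_yield(right)


def constituents(tree):
    """Return the set of strings that are constituents of the tree."""
    found = {terminal_yield(tree)}
    if not isinstance(tree, str):
        left, right = tree
        found |= constituents(left) | constituents(right)
    return found


def derive(items):
    """Return the structure after each item is added, left to right."""
    stages = [items[0]]
    for item in items[1:]:
        stages.append(merge_right(stages[-1], item))
    return stages


stages = derive(["give", "the book", "to", "them", "on each other's birthdays"])
before, after = stages[3], stages[4]

print(before)  # bracketing of the VP copy in (97c)
print(after)   # bracketing once the stranded adverbial is added, as in (97d)
print("give the book to them" in constituents(before))
print("give the book to them" in constituents(after))
```

The two membership checks print True and then False: the string that was fronted is a constituent at the stage at which the movement chain is licensed, but it is no longer a constituent once the stranded adverbial has been added on the right.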
Next consider what happens if we try to derive similar effects in a VP-ellipsis construction. I focus here on the loss of the distributive scope reading shown in (90), but the analysis applies equally to the impossibility of binding relations shown in (87). In a strictly left-to-right derivation the first conjunct of the VPE construction will be built in its entirety before the second conjunct is built. Let us suppose that there are two possible ways of deriving the first conjunct, one of which yields a left-branching VP structure, in which the adverbial takes wide scope with respect to the object NP (collective reading), and the other of which yields a right-branching structure (distributive reading). These are the alternatives shown in (89) above, and repeated in (98).

(98) a. b.

Just as I assumed that only constituents of VP may be fronted (although they need not be final constituents), I adopt the standard assumption that only constituents may undergo ellipsis, and that they must also be identical to a constituent of VP in the first conjunct. If a left-branching VP like (98b) is formed in the first conjunct then the verb and the direct object form a constituent in the final structure for that conjunct. Therefore, ellipsis of the verb and the direct object is possible in the second conjunct when it is built, allowing for collective scope readings. Additionally, the semantic parallelism constraint that the two conjuncts in ellipsis are subject to forces the adverbial to stand in the same relation to the rest of the VP as the adverbial in the first conjunct, i.e. it must c-command the rest of the VP.

If, on the other hand, a right-branching VP is formed in the first conjunct (98a), then the verb and the direct object will not form a constituent in the final structure of the first conjunct, and therefore they will not be a candidate for ellipsis in the second conjunct.
The fact that the verb and the direct object in the first conjunct had been a constituent at an earlier point in the derivation is irrelevant, because this stage in the derivation is invisible at the stage in the building of the second conjunct where the constituency condition on ellipsis applies. Therefore, only the left-branching VP (98b) licenses VP-ellipsis, and this is why left-to-right binding relations are impossible between an elided VP and stranded material (cf. 87) and why in object-adverbial sequences like (90) only the collective reading is available.

This completes the account of why VP-ellipsis constructions do not show properties of right-branching structures, whereas superficially similar VP-fronting constructions do. This analysis relies crucially on the properties of left-to-right structure building. In more standard non-derivational or bottom-up accounts of phrase structure it is not difficult to find accounts of either the VP-fronting facts or the VP-ellipsis facts presented so far. However, all such accounts that I am aware of fail to capture the contrast between VPE and VPF, and predict that the two constructions should show identical results on constituency tests.

I should stress that the account of the loss of right-branching effects in VPE depends on the presence in both conjuncts of the adverbial that destroys the verb-object constituent, and does not depend in particular on the fact that the elided partial VP is in the second conjunct rather than the first. This point is developed further in the discussion of comparative ellipsis in §2.6.4 below. One consequence of this is that I predict the same loss of right-branching effects to be found in ellipsis constructions in which material is elided from the first conjunct, as in (99).

(99) Because John did, Bill read all the books.
In first-conjunct ellipsis the stranding of adverbials is only marginally acceptable, but modulo this concern, the example in (100) shows exactly the same scopal properties as the VPE sentence in (90), with just the collective reading being available.

(100) Because John did quickly, Bill read all the books slowly. (collective reading only)

I should also point out that the Merge Right account of VPE automatically rules out distributive–distributive readings in VPE, as we have seen, but it does not automatically rule out certain situations in which the two conjuncts have differing scope readings. For example, a collective–distributive reading could be generated by building a left-branching VP in the first conjunct, in which the verb and the object form a constituent, and then a right-branching VP in the second conjunct. In this derivation there is a stage at which the verb and the object form a constituent in both conjuncts, which satisfies the constituency condition on ellipsis. As already mentioned above, I assume that such mismatching readings are independently excluded by a parallelism condition on ellipsis. It is for this reason that I have devoted most attention to explaining the absence of the distributive–distributive reading, which is not excluded by parallelism constraints.

2.6.2 Scope and Ellipsis in Japanese

This section considers the interaction of scope and ellipsis in Japanese VPs, which are verb-final. I show that the same account that I gave for the loss of scope readings in English holds for Japanese, despite the fact that left-branching structures are not available. In Japanese, both orderings of a quantificational NP and an adverbial are possible. When the adverb precedes the NP, both scope readings are possible (101a), but when the NP precedes the adverb, only the surface scope reading is possible (101b).

(101) a. John-ga isoide dono hon-mo yonda.
         -nom quickly all books-acc read
         'John read all the books quickly.'
(collective & distributive readings available)
      b. John-ga dono hon-mo isoide yonda.
         -nom all books-acc quickly read
         'John read all the books quickly.' (distributive reading only)

The fact that one ordering is scopally ambiguous and the other is unambiguous in (101) is unsurprising, given well-known existing facts about scope judgements in Japanese. In basic transitive sentences the order subject–object–verb is scopally unambiguous, with the subject obligatorily taking wide scope, and the order object–subject–verb is scopally ambiguous (Kuroda 1970, Kuno 1973, Hoji 1985). If we assume by extension of these facts that the lack of ambiguity implies underlying order and the presence of ambiguity implies a derived order, then we reach the conclusion that the underlying order of objects and adverbials in Japanese is object–adverb–verb. The structures for the VPs in (101a–b) are shown in (102a–b) respectively. I assume that both orderings of the object and adverbial may take scope in their surface position (leftmost takes widest scope), and that additionally the scrambled adverbial in (102a) may move to its underlying position and take scope there.

(102) a. b.

Japanese has a construction in which a VP is replaced by soo su, roughly equivalent to English do so. This construction allows stranding of adverbials, just as in English. The one important contrast with English, not surprisingly, is that the pro-VP occurs clause-finally, and therefore follows the stranded adverbial. As in English, the scope readings avail