Derivations and trees
You may recall from the last class that there was something I wanted to say, but it may not have been very clear once I actually got a chance to say it, in the last -2 minutes of class. So, let me put it in writing for your consideration.
There will be quite a bit of tree-drawing and derivation-writing from now until the end of the semester, so I wanted to say a couple of things about how you represent derivations. On p. 144 of the Adger textbook is a pretty good example of how a derivation looks as it proceeds step by step. Except for the typos, anyway, which unfortunately have left a number of “V”s where “VP”s should be. See an adjacent blog entry about that.
So, there are at least two things I wanted to mention here. One concerns the “bar-level” of the labels you get once you Merge or Adjoin. Let me first say this: conceptually, this doesn’t matter one whit. When you take two syntactic objects and Merge them together to form a composite object, the features of one of the two objects “project” and serve as the features of the composite object as a whole. What I’m about to talk about here are just conventions for labeling these on the page as we draw the derivation.
The label of a terminal node, one of the original items that you take from your lexicon and that sits on the frontier of the tree, is usually the pronunciation of the word itself. In cases where there isn’t really a pronunciation per se (for example, if the node is a “little v”, or if we’re just talking about something in the abstract, like on p. 135), the category is used as the label. Sometimes a terminal node might be written with a superscript zero as well, like V⁰ or P⁰.
In situations where you are concentrating on the feature-checking that happens as the tree is built, you will also want to write the relevant features at the bottom of the tree, under or next to the head whose features they are.
The labels of the nonterminal nodes can be determined in a couple of different ways. One of the ways, the one I introduced in class initially, determines the “bar-level” of a node based on two things: whether its features project higher, and whether its features were projected to that node from below. Simply put, if the features do not project higher, the node is a maximal projection (as the name itself suggests). Maximal projections are labeled with “P” after the category (e.g., VP, vP, NP, …). Otherwise, if the features do not project from below, then the node must be a head, drawn from the lexicon, and written as just the category (V, N, etc.), or the category with a superscript zero (V⁰, N⁰, and so forth). Otherwise, if the features are both projected from below and project higher, you have an intermediate projection, written with the category followed by a prime (or written with a bar over it) (V′, N′, etc.).
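If it helps to see the in-class convention as a procedure, here is a minimal sketch in Python. The function name bar_level and its two yes/no arguments are my own inventions for illustration, not anything from Adger; the checks are applied in the same order as in the paragraph above.

```python
def bar_level(category, projected_from_below, projects_higher):
    """Label a node under the in-class convention (hypothetical helper).

    category             -- the category of the projecting head, e.g. "V" or "v"
    projected_from_below -- True if the node's features came up from a daughter
    projects_higher      -- True if the node's features go on to label its mother
    """
    if not projects_higher:
        # Features stop here: maximal projection.
        return category + "P"
    if not projected_from_below:
        # Features start here: a head drawn from the lexicon.
        return category
    # Features both arrive from below and continue upward: intermediate.
    return category + "′"
```

For example, bar_level("V", False, True) gives "V", bar_level("V", True, True) gives "V′", and bar_level("V", True, False) gives "VP". Notice that under this convention you cannot label a node until you know what happens at the next step of the derivation.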
The algorithm Adger uses to determine the label of a node is a bit different, and in fact I think I like his way of doing it better. Again, this is not really a difference in the content of the theory, this is just a difference in the convention of how the nodes are written down.
In Adger’s system, the label is determined by the feature content itself, in particular, the unchecked uninterpretable features. Here’s how this works: If an element has no uninterpretable features left to check, it is a maximal projection (NP, VP, etc.). The reason, of course, is that its features can’t project any higher, because your features only project when you Merge with something that checks one of your uninterpretable features. If an element is straight out of the lexicon (at the frontier of the tree), then it is a head (N, V, …, or N⁰, V⁰, …). Otherwise, if it still has uninterpretable features to check, it is an intermediate projection.
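Here is the same idea as a minimal Python sketch. The function adger_label and its arguments are hypothetical names of my own, and the feature name "uN" used in the examples is just a placeholder. I apply the three checks in the order given above, which means that a lexical item with nothing left to check comes out as a maximal projection; that ordering is my reading of the convention, not something the textbook spells out.

```python
def adger_label(category, from_lexicon, unchecked_features):
    """Label a node from its feature content (hypothetical helper).

    category           -- the category of the projecting head, e.g. "V" or "v"
    from_lexicon       -- True if the node is a terminal taken from the lexicon
    unchecked_features -- the uninterpretable features not yet checked
    """
    if not unchecked_features:
        # Nothing left to check, so the features cannot project further.
        return category + "P"
    if from_lexicon:
        # A terminal node that still has features to check: a head.
        return category
    # Built by Merge and still has something to check: intermediate.
    return category + "′"
```

For example, adger_label("V", True, ["uN"]) gives "V", adger_label("v", False, ["uN"]) gives "v′", and adger_label("v", False, []) gives "vP". Unlike the in-class convention, the label depends only on the node itself, not on what gets Merged next.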
One advantage to using Adger’s labeling conventions is that it is less ambiguous about what the node labels should look like when you adjoin one thing to another. Think it through, and you’ll probably see what I mean.
One difference between the way I’d proposed doing it in class and the way Adger does it can be seen, for example, in the trees on p. 133: v and VP are Merged, and the resulting object is listed as being a v′. Why is it a v′? Easy: It’s not a head (it didn’t come out of the lexicon, it exists because Merge put it together), and it has at least one unchecked uninterpretable feature. So, we know that the next Merge is going to have to check that feature and that the features of v are going to project further. (The way I introduced this in class, that same object would have been a vP, and not a v′, until the next step, at which point its features would project higher.)
I think for clarity and consistency, I would prefer that you go ahead and use Adger’s system as outlined here.
Now, on to another convention about how trees are drawn: the annotation of nodes with their feature content. Adger tends to write all of the features under the heads, and then write the uninterpretable features left to check next to the label of the combined units, although he is not really consistent about doing that. Doing it that way might help your accounting of what’s left to check. As far as I can see, the system Adger uses when drawing his trees is this: a node is annotated with the uninterpretable feature that is checked when the node is Merged with its sister (and if the node does not yet have a sister, the feature isn’t yet checked). See p. 137 or p. 144 for examples.
I, on the other hand, would just write all of the features under the head, not annotate any nonterminal nodes, and cross out the features as they are checked, under the head. What I like about this is that it makes it clearer than, e.g., the trees on p. 137 which features the lexical item came with in the first place.
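To make the bookkeeping concrete, here is a minimal Python sketch of my way of doing it. Everything in it is made up for illustration: the Head class, the example word and features, and the ~...~ notation standing in for a strike-through. The point is only that checked features stay listed under the head, so the original feature content is always recoverable.

```python
from dataclasses import dataclass, field


@dataclass
class Head:
    """A lexical item with its features written under it (hypothetical)."""
    word: str
    features: list                              # everything the item came with
    checked: set = field(default_factory=set)   # features crossed out so far

    def check(self, feature):
        """Cross out a feature without removing it from the list."""
        if feature not in self.features:
            raise ValueError(f"{self.word} has no feature {feature}")
        self.checked.add(feature)

    def annotation(self):
        """Show all features, with checked ones wrapped in ~...~."""
        shown = [f"~{f}~" if f in self.checked else f for f in self.features]
        return f"{self.word} [{', '.join(shown)}]"


verb = Head("kiss", ["V", "uN"])
verb.check("uN")
print(verb.annotation())  # prints: kiss [V, ~uN~]
```

Nothing is ever written on a nonterminal node here; each head carries its own record of what it started with and what has been checked.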
So, I’d say this is a good policy to follow: Use Adger’s conventions for node labeling, based on whether a node still has unchecked uninterpretable features. Use my conventions for annotating nodes with features, by putting the features down by each head, without any annotations of the feature content of nonterminal nodes.