B

Bayes' rule
This statistical rule relates the conditional probability Pr(A | B) to Pr(B | A) for two events A and B. The rule states that

Pr(A | B) = Pr(B | A) × Pr(A) / Pr(B)
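
As an illustration, here is a minimal Python sketch of applying the rule; the probability values used are made-up for illustration, not figures from the course notes.

  # Pr(A | B) = Pr(B | A) * Pr(A) / Pr(B)
  def bayes(pr_b_given_a, pr_a, pr_b):
      return pr_b_given_a * pr_a / pr_b

  # Illustrative values only: Pr(B | A) = 0.5, Pr(A) = 0.2, Pr(B) = 0.25
  print(bayes(0.5, 0.2, 0.25))  # prints 0.4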

BELIEVE
BELIEVE is a modal operator in the language for representing logical forms. BELIEVE and other operators like it have some unexpected properties such as failure of substitutivity. For more details, read page 237 in Allen. Page 542 ff. provides yet more on belief in NLP (but this material is well beyond the needs of COMP9414).
bigram
A bigram is a pair of things, but usually a pair of lexical categories. Suppose that we are concerned with two lexical categories L1 and L2. The term bigram is used in statistical NLP in connection with the conditional probability that a word will belong to L2 given that the preceding word was in L1. This probability is written Pr(L2 | L1), or more fully Prob(w[i] in L2 | w[i-1] in L1). For example, in the phrase "The flies", given that The is tagged with ART, and that flies can be tagged with N or V, we would be concerned with the conditional probabilities Pr(N | ART) and Pr(V | ART).
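
A minimal Python sketch of how such bigram probabilities might be estimated from a tagged corpus follows; the tiny hand-tagged corpus is an invented example, not course data.

  from collections import Counter

  # A tiny hand-tagged corpus of (word, lexical category) pairs.
  tagged = [("the", "ART"), ("flies", "N"), ("like", "V"),
            ("a", "ART"), ("flower", "N")]

  tags = [tag for _, tag in tagged]
  unigram_counts = Counter(tags)
  bigram_counts = Counter(zip(tags, tags[1:]))

  def pr(l2, l1):
      # Estimate Pr(L2 | L1) as Count(L1 followed by L2) / Count(L1).
      return bigram_counts[(l1, l2)] / unigram_counts[l1]

  print(pr("N", "ART"))  # 2/2 = 1.0 on this tiny corpus
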
bottom-up parser
A parsing method that proceeds by assembling words into phrases, and phrases into higher level phrases, until a complete sentence has been found. Contrast top-down.

The chart parser described in lectures is a bottom-up parser, and can parse sentences, using any context-free grammar, in cubic time: i.e., in time proportional to the cube of the number of words in the sentence.

bound morpheme
A bound morpheme is a prefix or suffix which cannot stand as a word in its own right, but which can be attached to a free morpheme and modify its meaning. For example, "happy" is a free morpheme; attaching the prefix "un-" and the suffix "-ly", both bound morphemes, produces "unhappily".

C

cardinal
Number words like one, two, four, twenty, fifty, hundred, million. Contrast ordinal.

case
The term case is used in two different (though related) senses in NLP and linguistics. Originally it referred to what is now termed syntactic case. Syntactic case essentially depends on the relationship between a noun (or noun phrase) and the verb that governs it. For example, in "Mary ate the pizza", "Mary" is in the nominative or subject case, and "the pizza" is in the accusative or object case. Other languages may have a wider range of cases. English has remnants of a couple more cases - genitive (relating to possession, as with the pronoun "his") and dative (only with ditransitive verbs - the indirect object of the verb is said to be in the dative case).

Notice that in "The pizza was eaten by Mary", "the pizza" becomes the syntactic subject, whereas it was the syntactic object in the equivalent sentence "Mary ate the pizza".

With semantic case, which is the primary sense in which we are concerned with the term case in COMP9414, the focus is on the meaning-relationship between the verb and the noun or noun phrase. Since this does not change between "Mary ate the pizza" and "The pizza was eaten by Mary", we want to use the same semantic case for "the pizza" in both sentences. The term used for the semantic case of "the pizza" is theme. Similarly, the semantic case of "Mary" in both versions of the sentence is agent. Other cases frequently used include instrument, co-agent, experiencer, at-loc, from-loc, and to-loc, at-poss, from-poss, and to-poss, at-value, from-value, and to-value, at-time, from-time, and to-time, and beneficiary.

Semantic cases are also referred to as thematic roles.
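
As a minimal sketch (the dictionary representation below is an assumption made for illustration, not course notation), both versions of the sentence could be given the same case frame:

  # "Mary ate the pizza" / "The pizza was eaten by Mary"
  frame = {
      "action": "eat",
      "agent": "Mary",       # the doer of the action
      "theme": "the pizza",  # the thing acted upon
  }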

cataphor
Opposite of anaphor, and much rarer in actual language use. A cataphor is a phrase that is explained by text that comes after the phrase. Example: "Although he loved fishing, Paul went skating with his girlfriend." Here he is a cataphoric reference to Paul.
CFG
= context-free grammar
chart
A chart is a data structure used in parsing. It consists of a collection of active arcs (sometimes also called edges), together with a collection of constituents (sometimes also called inactive arcs or inactive edges).

See also chart parsing.

chart parsing
A chart parser is a variety of parsing algorithm that maintains a table of well-formed substrings found so far in the sentence being parsed. While the chart techniques can be incorporated into a range of parsing algorithms, they were studied in lectures in the context of a particular bottom-up parsing algorithm.

That algorithm will now be summarized:

to parse a sentence S using a grammar G and lexicon L:

  1. Initially there are no constituents or active arcs.
  2. Scan the next word w of the sentence, which lies between positions i and i+1 in the sentence.
  3. Look up the word w in the lexicon L. For each lexical category C to which w belongs, create a new constituent of type C, from i to i+1.
  4. Consult the grammar G. For each category C found in the step just performed, and each grammar rule R whose right-hand side begins with C, create a new active arc whose rule is R, with the dot in the rule immediately after the first category on the right-hand side, and spanning the same positions as the constituent.
  5. If any of the active arcs can have their dots advanced (this is only possible if the arc was created in a previous cycle of this algorithm), then advance them.
  6. If any active arcs are now completed (that is, the dot is now after the last category on the right-hand side of the active arc's rule), then convert each such active arc to a constituent (or inactive arc), and go to step 4.
  7. If there are any more words in the sentence, go to step 2.

to check if an active arc can have its dot advanced

  1. Let the active arc be ARCx: C → C[1] ... C[j] . C[j+1] ... C[k] from m to n.
  2. If there is a constituent of type C[j+1] from n to p, then the dot can be advanced.
    The resulting new active arc will be:
    ARCy: C → C[1] ... C[j+1] . C[j+2] ... C[k] from m to p
    where y is a natural number that has not yet been used in an arc-name.

Example: consider the active arc ARC2: NP → ART1 . ADJ N from 2 to 3, and suppose there is a constituent ADJ2: ADJ → "green" from 3 to 4. The to position of the active arc (3) matches the from position of the constituent (3), and the category immediately after the dot (ADJ) matches the constituent's type (ADJ), so the active arc ARC2 can be extended, i.e. have its dot advanced, creating a new active arc, say ARC3: NP → ART1 ADJ2 . N from 2 to 4.
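
The following Python sketch implements the algorithm above for a toy grammar and lexicon (both invented for illustration). It omits bookkeeping such as duplicate checking, so it is a sketch rather than a full implementation.

  # Toy grammar and lexicon, invented for illustration.
  GRAMMAR = [
      ("NP", ("ART", "ADJ", "N")),
      ("NP", ("ART", "N")),
  ]
  LEXICON = {"the": ["ART"], "green": ["ADJ"], "dog": ["N"]}

  def chart_parse(words):
      constituents = []  # completed constituents: (category, from, to)
      active = []        # active arcs: (lhs, rhs, dot, from, to)

      def extend(arc):
          lhs, rhs, dot, m, n = arc
          if dot == len(rhs):            # step 6: the arc is completed
              add_constituent(lhs, m, n)
          else:
              active.append(arc)

      def add_constituent(cat, i, j):
          constituents.append((cat, i, j))
          # Step 4: a new active arc for each rule beginning with cat,
          # spanning the same positions as the constituent.
          for lhs, rhs in GRAMMAR:
              if rhs[0] == cat:
                  extend((lhs, rhs, 1, i, j))
          # Step 5: advance the dot of any arc ending where cat begins.
          for lhs, rhs, dot, m, n in list(active):
              if n == i and rhs[dot] == cat:
                  extend((lhs, rhs, dot + 1, m, j))

      # Steps 2-3: scan each word and look it up in the lexicon.
      for i, w in enumerate(words):
          for cat in LEXICON[w]:
              add_constituent(cat, i, i + 1)
      return constituents

  print(chart_parse(["the", "green", "dog"]))
  # The output includes ('NP', 0, 3): a noun phrase spanning the sentence.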

Chomsky hierarchy
The Chomsky hierarchy is an ordering of types of grammar according to generality. The classification in fact only depends on the type of grammar rule or production used. The grammar types described in COMP9414 included:
  • unrestricted grammars (rules of the form a → b with no restrictions on the strings a and b)
  • context sensitive grammars (rules of the form a → b with the restriction length(a) <= length(b))
  • context free grammars (rules of the form X → b where X is a single non-terminal symbol)
  • regular grammars (rules of the form X → a and X → aN, where X and N are non-terminal symbols, and a is a terminal symbol).

Named after the linguist Noam Chomsky.
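
As a rough Python sketch of the classification (the rule representation and the convention that upper-case symbols are non-terminals are assumptions made here), a rule can be assigned the most restrictive type whose form it matches:

  def is_nonterminal(symbol):
      return symbol.isupper()  # assumed convention for this sketch

  def rule_type(lhs, rhs):
      # lhs and rhs are tuples of symbol strings, read as lhs -> rhs.
      if len(lhs) == 1 and is_nonterminal(lhs[0]):
          if len(rhs) == 1 and not is_nonterminal(rhs[0]):
              return "regular"            # X -> a
          if (len(rhs) == 2 and not is_nonterminal(rhs[0])
                  and is_nonterminal(rhs[1])):
              return "regular"            # X -> aN
          return "context-free"           # X -> b
      if len(lhs) <= len(rhs):
          return "context-sensitive"      # length(a) <= length(b)
      return "unrestricted"

  print(rule_type(("NP",), ("ART", "N")))  # context-free
  print(rule_type(("X",), ("a", "N")))     # regular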

CNP
symbol used in grammar rules for a common noun phrase.
co-agent
You really need to know what an agent is before proceeding. A co-agent is someone who acts with the agent in a sentence. In a sentence with a prepositional phrase introduced by the preposition with, an animate object of that preposition is likely to be a co-agent. In "Jane ate the pizza with her mother", her mother is the co-agent.
co-refer
Two items (an anaphor and its antecedent) that describe the same thing are said to co-refer.
common noun
A common noun is a noun that describes a type, for example woman or philosophy, rather than an individual, such as Amelia Earhart. Contrast proper noun.
common noun phrase
A common noun phrase is a phrasal grammatical category of chiefly technical significance. Examples include "man", "big man", and "man with the pizza", but not these same phrases with "the" or "a" in front - that is, "the man with the pizza", etc., are NPs, not CNPs. The need for the category CNP as a separate named object arises from the way articles like "the" act on a CNP. The word "the", regarded as a natural language quantifier, acts on the whole of the CNP that it precedes: it's "the[man with the pizza]", not "the[man] with the pizza". For this reason, it makes sense to make phrases like "man with the pizza" into syntactic objects in their own right, so that the semantic interpretation phase does not need to reorganize the structural description of the sentence in order to interpret it.
complement
A complement is a grammatical structure required in a sentence, typically to complete the meaning of a verb or adjective. For example, the verb "believe" can take a sentential complement, that is, be followed by a sentence, as in "I believe you are standing on my foot."

There is a wide variety of complement structures. Some are illustrated in the entry for subcategorization.

An example of an adjective with a complement is "thirsty for blood", as in "The football crowd was thirsty for blood after the home team was defeated." This is a PP-complement. Another would be "keen to get out of the stadium", a TO-INF complement, as in "The away-team supporters were keen to get out of the stadium."

compositional semantics
Compositional semantics signifies a system of constructing logical forms for sentences or parts of sentences in such a way that the meanings of the components of the sentence (or phrase) are used to construct the meanings of the whole sentence (or whole phrase). For example, in "three brown dogs", the meaning of the phrase is constructed in an obvious way from the meanings of three, brown, and dogs. By way of contrast, a phrase like "kick the bucket" (when read as meaning "die") does not have compositional semantics, as the meaning of the whole ("die") is unrelated to the meanings of the component words.

The semantic system described in COMP9414 assumes compositional semantics.
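
A minimal Python sketch of the idea follows; the representation of word meanings here is an invented illustration, not the logical-form language used in the course. The meaning of "three brown dogs" is computed purely from the meanings of its parts.

  def dogs(x):
      return x.get("kind") == "dog"

  def brown(noun):
      # Intersective adjective: brown N = things that are N and brown.
      return lambda x: noun(x) and x.get("colour") == "brown"

  def three(pred):
      # Numeral, read here as "exactly three members of the domain".
      return lambda domain: sum(1 for e in domain if pred(e)) == 3

  # The meaning of the phrase is built from the meanings of its parts:
  three_brown_dogs = three(brown(dogs))

  domain = [{"kind": "dog", "colour": "brown"}] * 3 + \
           [{"kind": "cat", "colour": "brown"}]
  print(three_brown_dogs(domain))  # True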

concrete noun
A concrete noun is a noun that describes a physical object, for example apple. Contrast abstract noun.
conditional probability
The conditional probability of event B given event A is the probability that B will occur given that we know that A has occurred. The example used in lecture notes was that of a horse Harry that won 20 races out of 100 starts; of the 30 of these races that were run in the rain, Harry won 15. So while the probability that Harry would win a race (in general) would be estimated as 20/100, the conditional probability Pr(Win | Rain) would be estimated as 15/30 = 0.5. The formal definition of Pr(B | A) is Pr(B & A) / Pr(A). In the case of B = Win and A = Rain, Pr(B & A) is the probability that it will be raining and Harry will win (which on the data given above is 15/100), while Pr(A) is the probability that it will be raining, or 30/100. So again Pr(B | A) = 0.15/0.30 = 0.5.
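
The arithmetic in the Harry example can be checked directly; a minimal Python sketch:

  races, rain_races, rain_wins = 100, 30, 15

  pr_win_and_rain = rain_wins / races    # Pr(Win & Rain) = 15/100
  pr_rain = rain_races / races           # Pr(Rain) = 30/100
  print(pr_win_and_rain / pr_rain)       # Pr(Win | Rain) = 0.5
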
CONJ
symbol used in grammar rules for a conjunction.
