17.11.08
Word Mangling, Verb Conjugation, Concepts
I've been busy with mundane parts of the NLP system recently. In the past week, I rewrote about 90 percent of the word recognizer and grammar parser. The new word recognition allows extended forms of words (e.g., run -> runner -> runneresque) and irregular verbs (e.g., go, went, gone). Of course, just because the system can recognize words like filelike doesn't mean it can assign meaning to them, aside from the rather vague "has some properties a file has." Then again, that's all you or I get as well; it's an evocative word, used to convey a starting point rather than to nail down specific properties. Computationally, perhaps it could be used for something like linking a new concept to a node in an existing conceptual graph. I'll see how that can work as I continue to research and/or reinvent and/or borrow wheels.
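To make the extended-form idea concrete, here is a minimal sketch of suffix stripping plus an irregular-verb table. It is not the actual recognizer: the lexicon, the suffix list, and the `analyze` function are all made up for illustration, and the only spelling rule handled is undoing a doubled consonant (runn -> run).

```python
# Hypothetical toy lexicon; the real system's vocabulary is not shown here.
BASE_WORDS = {"run", "file", "go", "eat"}

# Irregular forms mapped to (base word, description of the form).
IRREGULAR = {"went": ("go", "past"), "gone": ("go", "past participle")}

# Illustrative suffixes, tried in this order.
SUFFIXES = ["esque", "like", "er", "ing", "s"]


def analyze(word):
    """Return (base, [tags...]) for a recognized word, or None otherwise."""
    if word in IRREGULAR:
        base, form = IRREGULAR[word]
        return base, [form]
    if word in BASE_WORDS:
        return word, []
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[: -len(suffix)]
            # Try the stem as-is, then with a doubled final consonant undone.
            candidates = [stem]
            if len(stem) >= 2 and stem[-1] == stem[-2]:
                candidates.append(stem[:-1])
            for candidate in candidates:
                result = analyze(candidate)  # suffixes can stack
                if result is not None:
                    base, tags = result
                    return base, tags + [suffix]
    return None


print(analyze("runneresque"))
print(analyze("filelike"))
print(analyze("went"))
```

Under these assumptions, "runneresque" comes back as the base "run" with the suffixes "er" and "esque" stacked on it. That is still recognition only; the meaning of "esque" or "like" stays as vague as described above.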
The upgraded grammar parser now recognizes as many different tenses, aspects, voices, and persons as I've found to exist so far, including some I'm somewhat dubious about. Is future perfect continuous passive ("will have been being eaten") for real? It may just be PageRank, but Google shows only grammar sites for at least the first 50 hits for "will have been being".
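The dubious form falls out mechanically once tense, perfect, continuous, and passive are treated as independent switches. The sketch below generates third-person-singular verb phrases that way. It goes in the opposite direction from the parser (generation, not recognition), and the `FORMS` table and `verb_phrase` function are assumptions for illustration only.

```python
# Hypothetical form table: just enough entries for the example.
FORMS = {
    "be": {"base": "be", "present": "is", "past": "was",
           "past_part": "been", "pres_part": "being"},
    "have": {"base": "have", "present": "has", "past": "had",
             "past_part": "had", "pres_part": "having"},
    "eat": {"base": "eat", "present": "eats", "past": "ate",
            "past_part": "eaten", "pres_part": "eating"},
}


def verb_phrase(verb, tense, perfect=False, continuous=False, passive=False):
    """Build a third-person-singular verb phrase from the four switches."""
    # Each auxiliary dictates the form of the verb that follows it.
    chain = []
    if perfect:
        chain.append(("have", "past_part"))
    if continuous:
        chain.append(("be", "pres_part"))
    if passive:
        chain.append(("be", "past_part"))

    lemmas = [lemma for lemma, _ in chain] + [verb]
    words = []
    if tense == "future":
        words.append("will")
        first_form = "base"  # "will" is followed by the bare form
    else:
        first_form = tense   # "present" or "past"
    forms = [first_form] + [required for _, required in chain]

    for lemma, form in zip(lemmas, forms):
        words.append(FORMS[lemma][form])
    return " ".join(words)


print(verb_phrase("eat", "future", perfect=True, continuous=True, passive=True))
print(verb_phrase("eat", "past", perfect=True, passive=True))
```

With three tenses and three on/off switches this yields 24 combinations, and "will have been being eaten" is simply the one with everything turned on. Whether anyone outside a grammar site ever uses it is a separate question.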
So now I'm back to where I thought I was two weeks ago, ready to start working on a new knowledge representation. And that brings me to this sentence:
"Alice believed Bob lied."
Pulling this apart, there are at least six concepts introduced here, each pronounable in the next sentence:
C1: the act of lying. "It's something we've all done."
C2: Bob. "He's been known to prevaricate."
C3: the event of Bob lying, in the past. "It's happened before."
C4: the act of (or quality of) belief in C3. "It's true of Eve as well."
C5: Alice. "She's not quick to trust."
C6: Alice holding a belief in C3. "It's because of their past."
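One way to picture the six concepts is as nodes that refer to each other. The sketch below is only one possible encoding, not a settled knowledge representation: the `Concept` class, the `kind` values, and the choice of which concepts count as parts of which are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    label: str
    kind: str          # "entity", "act", or "event" (illustrative categories)
    gloss: str
    parts: tuple = ()  # labels of the concepts this one is built from


CONCEPTS = {c.label: c for c in [
    Concept("C1", "act", "lying"),
    Concept("C2", "entity", "Bob"),
    Concept("C3", "event", "Bob lied", parts=("C1", "C2")),
    Concept("C4", "act", "believing C3", parts=("C3",)),
    Concept("C5", "entity", "Alice"),
    Concept("C6", "event", "Alice believed C3", parts=("C4", "C5")),
]}


def depends_on(label):
    """Return every concept reachable through `parts`, sorted by label."""
    seen = []
    stack = list(CONCEPTS[label].parts)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(CONCEPTS[current].parts)
    return sorted(seen)


print(depends_on("C6"))
print(depends_on("C3"))
```

In this encoding C6 depends on all five of the others, while C1, C2, and C5 depend on nothing. A pronoun in the next sentence could then pick any node, whether or not the reader had built it on first reading.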
This has been bugging me for the past week and a half or so. How many of those are conceived on first reading? C1 and C4 specifically: they're the hardest to pronounize in a non-awkward way. Does a pronoun force a reevaluation of the sentence when none of the other four concepts fits it?
I've been cowardly and haven't called shots for a while, so there's nothing to report there outside of the progress mentioned above. I've got just one called shot for the next week: research in the area of knowledge representation.
Labels: adam, calling shots, language, nlp