Hack Jam Log Book is a log of progress made in and around a weekly hack session. Topics include natural language processing, high energy electronics, linguistics, interface design, &c. Enjoy.


14.10.08

 

Natural Language Processing - Part I

I went to see Iron Man back in May when it came out. It was a fun movie and it definitely satisfied my inner comic nerd but the first words out of my mouth when we left the theater were, "I have some software to write." Since then, I've been working on a small portion of that software: a natural language interface for a computer, a sort of natural language shell.

The first stab was a naive one. It worked in a subset of English where every word had only one meaning and every sentence had only one possible parse. I thought I knew how severe a limitation I had imposed on myself, but I wanted to start building an approximate solution and get experience in the problem space. It turned out that I had underestimated the limitations I was working under and had to abandon that approach sooner than I had hoped. Before I stopped, however, I had a system that I could make statements to, ask questions of, and even tell to launch programs, provided I was careful in my wording.

One sentence would be read in at a time, parsed according to a BNF grammar, and classified as declarative, imperative, or interrogative. Declarative sentences would be added to the knowledge base (KB). There was no validation on what was added; the system could be fed two inconsistent statements and it'd take both as gospel. Answering questions involved a search of the KB for an answer. Yes/no questions did a backward substitution search for satisfiability, and "what is" questions searched for facts in the KB that answered the question. So given "All men are mortal" and "Socrates was a man", it could answer the question "Was Socrates a mortal?" with "Yes." and "Who was Socrates?" with "A man." Imperative sentences were recognized, and if the verb of the sentence was "run" and the object a constant (names like "emacs" that weren't in the dictionary were treated as constants), it would try to run a program with that constant's name in the background.
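To make the backward substitution search concrete, here's a minimal sketch in Haskell (the types and names are mine, not the actual system's): a KB holding one fact and one implication rule, and a `prove` function that answers a yes/no question by unifying the goal against rule heads and recursively proving the bodies.

```haskell
module Main where

-- Terms are constants ("socrates") or variables ("X").
data Term = Const String | Var String deriving (Eq, Show)

-- An atom like "man socrates" or a rule head like "mortal X".
data Atom = Atom String Term deriving (Eq, Show)

-- A rule's head holds if all its body atoms hold; facts have empty bodies.
data Rule = Rule Atom [Atom]

-- "Socrates was a man"; "All men are mortal" (mortal X :- man X).
kb :: [Rule]
kb = [ Rule (Atom "man" (Const "socrates")) []
     , Rule (Atom "mortal" (Var "X")) [Atom "man" (Var "X")]
     ]

-- Substitute a constant for the variable in an atom, if any.
subst :: String -> Atom -> Atom
subst c (Atom p (Var _)) = Atom p (Const c)
subst _ a                = a

-- Unify a ground goal against a rule head.
-- Just Nothing  = exact match, nothing bound.
-- Just (Just c) = the head's variable was bound to constant c.
unify :: Atom -> Atom -> Maybe (Maybe String)
unify (Atom p (Const c)) (Atom q (Const d))
  | p == q && c == d = Just Nothing
unify (Atom p (Const c)) (Atom q (Var _))
  | p == q           = Just (Just c)
unify _ _            = Nothing

-- Backward search: the goal holds if some rule's head unifies with it
-- and every atom in that rule's body can be proved in turn.
prove :: Atom -> Bool
prove goal = any tryRule kb
  where
    tryRule (Rule hd body) = case unify goal hd of
      Nothing      -> False
      Just binding -> all (prove . maybe id subst binding) body

main :: IO ()
main = do
  print (prove (Atom "mortal" (Const "socrates")))  -- True
  print (prove (Atom "mortal" (Const "plato")))     -- False: no such fact
```

The real system would of course need multi-argument predicates and proper substitution maps, but the shape of the search — unify the question against a head, then prove the body — is the same.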

Just over two months ago, I started from scratch. (I had been learning Haskell as I wrote the first system, so even the code with reusable functionality was a mess.) Through the magic of monads, the new parsing code is about 40% smaller and returns the set of all interpretations it can come up with for a sentence. This set gets much larger than I expected. Just introducing the ability to interpret any word as a name explodes the number of possible responses. The program shows the sentence "Emacs is a program that opens text files" as having 23 unique parses. Many of those 23 can be discarded by checking some form of subject-verb agreement. What's left can even be ignored in favor of a parse that assumes fewer names, so it looks like there's still a workable solution there. In the worst cases, the system will have to deal with a nondeterministic KB until either more context is available to make the sentence clearer, or the system can ask questions.
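The way the list monad makes a parser return every interpretation at once can be sketched like this (a toy grammar of my own, not the post's actual code): each ambiguous word lookup is a multi-element list, `(>>=)` explores every combination, and a `guard` prunes parses that fail a crude agreement check.

```haskell
module Main where

import Control.Monad (guard)

data Word' = Noun String | Verb String | Name String deriving (Eq, Show)

-- A tiny dictionary. Crucially, every word can *also* be read as a
-- name, which is exactly what explodes the number of parses.
interpret :: String -> [Word']
interpret w = dict w ++ [Name w]
  where
    dict "is"      = [Verb "is"]
    dict "program" = [Noun "program"]
    dict _         = []          -- unknown words (e.g. "emacs"): name only

-- Parse a three-word "X is Y" sentence; the list monad collects every
-- combination of word readings that survives the verb check.
parses :: [String] -> [(Word', Word', Word')]
parses [s, v, o] = do
  subj <- interpret s
  verb <- interpret v
  obj  <- interpret o
  guard (isVerb verb)            -- discard readings with no verb
  return (subj, verb, obj)
  where
    isVerb (Verb _) = True
    isVerb _        = False
parses _ = []

main :: IO ()
main = mapM_ print (parses ["emacs", "is", "program"])
```

Here "is" has two readings and "program" has two, so without the guard there would be four parses; the guard removes the two where "is" is read as a name, leaving two. Scale that up to a real grammar with relative clauses and you can see where the 23 parses come from.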

This is an interesting problem in a lot of ways. More next week.

Comments:
I'd love to hear more about your exact techniques --- how you represent each possible parsing of a sentence, etc.

Looking forward to part II.
 




