28.11.08
Write or Die: Now on Emacs!
For those of you that write, or wish you wrote, or otherwise procrastinate while performing text-based activities, try out write-or-die. It is a nice cattle prod; if you don't write for several seconds, it will punish you. I find that the Kamikaze mode is most effective --- the punishment is that it will delete words, bit by bit, until you begin to write again.
However, I use emacs; and switching from my carefully-tuned text-editor extraordinaire to a javascript-based text-box is a bit of a disappointment. That's why I threw together this emacs version of write-or-die.
Just paste the following into your init.el, and when the time comes, M-x write-or-die. When you have accomplished your goal, M-x no-more-die, and you're set.
Possible improvements --- flashing colors, like the web-version has, to warn you of the time limit; word-by-word deletion, rather than 10 characters at a time; variable time limit; etc. But those can wait. Right now I need to write another 10,000 words.
However, I use emacs; and switching from my carefully-tuned text-editor extraordinaire to a javascript-based text-box is a bit of a disappointment. That's why I threw together this emacs version of write-or-die.
Just paste the following into your init.el, and when the time comes, M-x write-or-die. When you have accomplished your goal, M-x no-more-die, and you're set.
(let ( (write-or-die-time ())
(write-or-die-init ()) )
(defun write-or-die ()
(interactive)
(setf write-or-die-init (run-with-idle-timer 3 t 'w-or-die)))
(defun w-or-die ()
(interactive)
(kill-region (point-max) (- (point-max) 10))
(setf write-or-die-time (run-with-idle-timer
(time-add (current-idle-time)
(seconds-to-time 3))
()
'w-or-die)))
(defun no-more-die ()
(interactive)
(cancel-timer write-or-die-init)
(cancel-timer write-or-die-time))
)
Possible improvements --- flashing colors, like the web-version has, to warn you of the time limit; word-by-word deletion, rather than 10 characters at a time; variable time limit; etc. But those can wait. Right now I need to write another 10,000 words.
26.11.08
MythTV
For a quick breather away from the piano, I decided to set up a MythTV box.
For those of you who are unfamiliar, MythTV is the Linux DVR project. It also has a lot of optional plug-ins (installed seperately) for things like Weather, News, NetFlix queuing, etc. Not having a digital capture card makes my success thus far limited, but allow me to tell you what I have learned.
First of all, for those who are interested in different types of cards out there, here are the different terms that get thrown around:
* NTSC [in terms of broadcast] = analog/"your grandmother's television". Think the TV with two knobs, the first one you turn to "U" to make the second knob work.
* ATSC = new, hip digital broadcast TV [officially taking off Feb 2009]
* Clear QAM = The "free" broadcast channels your local cable provider gives out. Usually includes the ATSC channels in your area. For those who are interested, my research shows that Cox Cable has a whopping 6 Clear QAM channels available in Norman, OK.
* DVB = Digital Video Broadcast. DVB-T is terrestial, DVB-C is cable, DVB-S is satellite.
With the basic terminology out of the way, have fun looking for good cards. Reviews are limited. You might enjoy these handy posts for tutorials on installing MythTV and selecting cards:
http://www.mythtv.org/docs/mythtv-HOWTO.html#toc5 [Installing MythTV]
http://www.mythtv.org/wiki/index.php/Category:Video_capture_cards [Capture Cards]
There are a lot of pre-requisites, including some that aren't listed. The big one is of "Qt3MySQL" [why they don't list it as an actual pre-req, I don't know]. The ones listed in the tutorial include:
mysql, gcc, freetype2-devel, xorg-xserver-devel, qt-devel and lame.
Also required:
qmake, Xv, QT3MySQL, mplayer, ffmpeg, transcode, many others
Pro Tip!
If while making a program you get lines that show...
/usr/bin/ld: cannot find -lMERPMERP
collect2: ld returned 1 exit status
Do a search in your package manager for MERPMERP. Many times, that will resolve it and you can resume MAKEing your program.
Anyway, when you go to MAKE the program, assuming you have everything, it will take about 30+ minutes to compile with a pretty decent CPU, so I recommend a good lunch.
Once everything is installed, run mythfilldatabase to populate your SQL backend with channel data from your card. If you don't like something you've done during the initial setup, don't worry... you can always run mythtv-setup again.
As a last note, MythPlugins is worth installing at least for MythArchive. Unless you want to purchase a nice JBOD system for harddrives, install it and play around with the "Archive" [burns recorded shows to DVD]!
As I hear more about cards, plug-ins and more, I'll keep you posted.
Regards,
Brandon
For those of you who are unfamiliar, MythTV is the Linux DVR project. It also has a lot of optional plug-ins (installed seperately) for things like Weather, News, NetFlix queuing, etc. Not having a digital capture card makes my success thus far limited, but allow me to tell you what I have learned.
First of all, for those who are interested in different types of cards out there, here are the different terms that get thrown around:
* NTSC [in terms of broadcast] = analog/"your grandmother's television". Think the TV with two knobs, the first one you turn to "U" to make the second knob work.
* ATSC = new, hip digital broadcast TV [officially taking off Feb 2009]
* Clear QAM = The "free" broadcast channels your local cable provider gives out. Usually includes the ATSC channels in your area. For those who are interested, my research shows that Cox Cable has a whopping 6 Clear QAM channels available in Norman, OK.
* DVB = Digital Video Broadcast. DVB-T is terrestial, DVB-C is cable, DVB-S is satellite.
With the basic terminology out of the way, have fun looking for good cards. Reviews are limited. You might enjoy these handy posts for tutorials on installing MythTV and selecting cards:
http://www.mythtv.org/docs/mythtv-HOWTO.html#toc5 [Installing MythTV]
http://www.mythtv.org/wiki/index.php/Category:Video_capture_cards [Capture Cards]
There are a lot of pre-requisites, including some that aren't listed. The big one is of "Qt3MySQL" [why they don't list it as an actual pre-req, I don't know]. The ones listed in the tutorial include:
mysql, gcc, freetype2-devel, xorg-xserver-devel, qt-devel and lame.
Also required:
qmake, Xv, QT3MySQL, mplayer, ffmpeg, transcode, many others
Pro Tip!
If while making a program you get lines that show...
/usr/bin/ld: cannot find -lMERPMERP
collect2: ld returned 1 exit status
Do a search in your package manager for MERPMERP. Many times, that will resolve it and you can resume MAKEing your program.
Anyway, when you go to MAKE the program, assuming you have everything, it will take about 30+ minutes to compile with a pretty decent CPU, so I recommend a good lunch.
Once everything is installed, run mythfilldatabase to populate your SQL backend with channel data from your card. If you don't like something you've done during the initial setup, don't worry... you can always run mythtv-setup again.
As a last note, MythPlugins is worth installing at least for MythArchive. Unless you want to purchase a nice JBOD system for harddrives, install it and play around with the "Archive" [burns recorded shows to DVD]!
As I hear more about cards, plug-ins and more, I'll keep you posted.
Regards,
Brandon
Labels: brandon mythtv dvr linux
21.11.08
Postscript Posters
One of the more exciting things you can do with Postscript is to make a poster -- vector graphics are sexy and you can easily make them as large or small as you want while the plotter or viewing device faithfully retains the original image's sharpness and definition. And bigger is always better: Why make a Rhode Island-sized poster when you can have Texas for the cost of a scaling factor?
I set out to make a poster and found that although it was ultimately a straightforward task, I had to figure a few things out to get it to work.
First, it was necessary for me to write the whole poster on a single 8.5x11 sheet. Apparently it is very hard to change the page size in postscript, so sticking with the default and specifying an appropriate bounding box worked best for me. The bounding box is done up by writing near the top of the file
%%BoundingBox: x1 y1 x2 y2
So, you specify the lower-left and upper-right corners of the smallest rectangle that includes everything you want to be in the image.
I set out to make a poster and found that although it was ultimately a straightforward task, I had to figure a few things out to get it to work.
First, it was necessary for me to write the whole poster on a single 8.5x11 sheet. Apparently it is very hard to change the page size in postscript, so sticking with the default and specifying an appropriate bounding box worked best for me. The bounding box is done up by writing near the top of the file
%%BoundingBox: x1 y1 x2 y2
So, you specify the lower-left and upper-right corners of the smallest rectangle that includes everything you want to be in the image.
On my first try I simply drew my graphics at the intended poster size on the page -- all that was visible to me was the bottom corner of the page, while the majority of the graphics were plotted at coordinates not within the page size.
Once I had everything looking the way I wanted it to I just had to issue the scale commands at the beginning of the file:
.15 .15 scale
caused the graphics to follow to be drawn at the fifteen percent required to get them on the page.
At that point I used the "poster" utility with a scaling factor of the inverse of the one I supplied to postscript -- here, 6.667 -- and piped the output to a new ps file. The new ps file went to ps2pdf, and the final product was the poster at the size I needed split over 40 8.5x11 pages. Cool beans!
Ultimately I decided that what I did want was a whole poster and they are not that expensive so I will take the downscaled pdf to Kinko's and have them upscale it to poster size and print it for me.
Making posters is pretty cool and I think you should try it too.
Labels: jws, postscript
18.11.08
Livescribe Pulse Hacking: The buisness already!
Blah, blah, blah. Those of you who were up to date on your Fourier transform are now bored out of your mind --- and those of you who weren't are still wondering why I'm talking so much about it rather than about the pen.
So now, it happens! Data is sent from the pen to my laptop, running Linux, via vibrating air! Doesn't it just put you at the edge of your seat?
The good news: "Hello, World!" did get from my pen to my desktop. The bad news: it did so at something like 8bps --- that's 8 bits per second, mind you. So it took something like a quarter of a minute just to send those two words.
Are improvements possible? Absolutely. I can increase my sampling rate (currently 8kHz, could easily be 16). I can increase the number of different frequencies to increase the per-unit-bandwidth.
But what is currently killing me is (I believe) the period between tones; not the tones themselves. There are two dangerous things that happen during the transition. One is that the Fourier analysis for time-bins which contain a frequency change have two peaks; this makes classifying those bins far more challenging, even classifying them as worthless and ignoring them.
Second, and a little more worrying, is on the pen: the amount of time that passes between the current sound clip ending after I request that the next sound play, and the next sound actually playing. It is, uh ... not zero. And that of course makes the the whole in-between-tone period even HARDER to identify and cope with.
Short note. Hopefully have more time next week to discuss possible fixes for all these faults.
So now, it happens! Data is sent from the pen to my laptop, running Linux, via vibrating air! Doesn't it just put you at the edge of your seat?
The good news: "Hello, World!" did get from my pen to my desktop. The bad news: it did so at something like 8bps --- that's 8 bits per second, mind you. So it took something like a quarter of a minute just to send those two words.
Are improvements possible? Absolutely. I can increase my sampling rate (currently 8kHz, could easily be 16). I can increase the number of different frequencies to increase the per-unit-bandwidth.
But what is currently killing me is (I believe) the period between tones; not the tones themselves. There are two dangerous things that happen during the transition. One is that the Fourier analysis for time-bins which contain a frequency change have two peaks; this makes classifying those bins far more challenging, even classifying them as worthless and ignoring them.
Second, and a little more worrying, is on the pen: the amount of time that passes between the current sound clip ending after I request that the next sound play, and the next sound actually playing. It is, uh ... not zero. And that of course makes the the whole in-between-tone period even HARDER to identify and cope with.
Short note. Hopefully have more time next week to discuss possible fixes for all these faults.
Labels: livescribe, pulse, sam
17.11.08
Word Mangling, Verb Conjugation, Concepts
I've been busy with mundane parts of the nlp system recently. In the past week, I rewrote about 90 percent of the word recognizer and grammar parser. The new word recognition allows extended forms of words (eg. run -> runner -> runneresque) and irregular verbs (eg. go, went, gone). Of course, just because the system can recognize words like filelike doesn't mean it can assign meaning to them, aside from the rather vague "has some properties a file has." Of course, that's all you or I get as well; it's an evocative word, used to convey a starting point rather than to nail down specific properties. Computationally, perhaps it could be used for something like linking a new concept to nod in an existing conceptual graph. I'll see how that can work as I continue to research and/or reinvent and/or borrow wheels.
The upgrade to the grammar parser now recognizes as many different tenses, aspects, voices, and persons as I've found to exist so far, including some I'm somewhat dubious about. Is future perfect continuous passive ("will have been being eaten") for real? It may just be PageRank, but google's only showing grammar sites for at least the first 50 hits for "will have been being".
So now I'm back to where I thought I was two weeks ago, ready to start working on a new knowledge representation. And that brings me to this sentence:
"Alice believed Bob lied."
Pulling this apart, there's at least 6 concepts introduced here, each pronounable in the next sentence:
C1: the act of lying. "It's something we've all done."
C2: Bob. "He's been known to prevaricate."
C3: the event of Bob lying, in the past. "It's happened before."
C4: the act of (or quality of) belief in C3. "It's true of Eve as well."
C5: Alice. "She's not quick to trust."
C6: Alice holding a belief in C3. "It's because of their past."
This has been bugging me for the past week and a half or so. How many of those are conceived on first reading? C1 and C4 specifically: they're the hardest to pronounize in a non awkward way. Do the pronouns force a reevaluation of the sentence when the other four don't fit the pronoun?
I've been cowardly and not calling shots for a while, so nothing to report there outside of the progress mentioned above. I've got just one called shot for the next week: research in the area of knowledge representation.
The upgrade to the grammar parser now recognizes as many different tenses, aspects, voices, and persons as I've found to exist so far, including some I'm somewhat dubious about. Is future perfect continuous passive ("will have been being eaten") for real? It may just be PageRank, but google's only showing grammar sites for at least the first 50 hits for "will have been being".
So now I'm back to where I thought I was two weeks ago, ready to start working on a new knowledge representation. And that brings me to this sentence:
"Alice believed Bob lied."
Pulling this apart, there's at least 6 concepts introduced here, each pronounable in the next sentence:
C1: the act of lying. "It's something we've all done."
C2: Bob. "He's been known to prevaricate."
C3: the event of Bob lying, in the past. "It's happened before."
C4: the act of (or quality of) belief in C3. "It's true of Eve as well."
C5: Alice. "She's not quick to trust."
C6: Alice holding a belief in C3. "It's because of their past."
This has been bugging me for the past week and a half or so. How many of those are conceived on first reading? C1 and C4 specifically: they're the hardest to pronounize in a non awkward way. Do the pronouns force a reevaluation of the sentence when the other four don't fit the pronoun?
I've been cowardly and not calling shots for a while, so nothing to report there outside of the progress mentioned above. I've got just one called shot for the next week: research in the area of knowledge representation.
Labels: adam, calling shots, language, nlp
13.11.08
Makefile , Blink , and Software Controlled Buttons
As promised, another update on the glorious piano project.
This week has mostly been about getting a good solid foundation for development. After some time, I managed to make my first Makefile specifically for the Atmega16 and my avr-toolchain. I also managed to test it both on my JTAG programmer, as well as Sam's STK500v2 which I'm still in possession of.
Additionally, I spent a good deal of time trying to get a software controlled switch to work. Getting the code to compile and upload to the Atmega was simplified with my Makefile, for sure, but for some reason the silly microcontroller was resetting on me. The data sheet didn't seem to provide much help [I went through all of the RESETs listed, to no avail]. Sam suggested I research the internal pull-up resistors, but they seemed to hold the current around +3V, and also produce the same intermittent reset problem. I finally said enough was enough and decided to do external pull-up resistors. If you are interested, here is a site which shows the basic diagram of how to put a switch into a Microcontroller: http://www.micahcarrick.com/05-15-2006/avr-tutorial-switch-debounce.html .
I've uploaded my sample code onto my website for now. You can find all of the available warez under http://brandon.bagwellonline.com/piano . Currently, you will find a simple "blink" program, a "swbutton" [which has a software-controlled button listed], the Makefile, and eventually an image of my JTAG programming breadboard. Hopefully, I will add more interesting pics as they become available, but with were the project is at the moment, all the fun will exist in the world of C, Assembly, and my head.
Updates will be coming, though they will be slower since the end of the year is a busy time for me at work [everyone else is burning their vacation time]. Besides, I'm still catching up on Doctor Who and Business Week.
More next time,
Brandon
This week has mostly been about getting a good solid foundation for development. After some time, I managed to make my first Makefile specifically for the Atmega16 and my avr-toolchain. I also managed to test it both on my JTAG programmer, as well as Sam's STK500v2 which I'm still in possession of.
Additionally, I spent a good deal of time trying to get a software controlled switch to work. Getting the code to compile and upload to the Atmega was simplified with my Makefile, for sure, but for some reason the silly microcontroller was resetting on me. The data sheet didn't seem to provide much help [I went through all of the RESETs listed, to no avail]. Sam suggested I research the internal pull-up resistors, but they seemed to hold the current around +3V, and also produce the same intermittent reset problem. I finally said enough was enough and decided to do external pull-up resistors. If you are interested, here is a site which shows the basic diagram of how to put a switch into a Microcontroller: http://www.micahcarrick.com/05-15-2006/avr-tutorial-switch-debounce.html .
I've uploaded my sample code onto my website for now. You can find all of the available warez under http://brandon.bagwellonline.com/piano . Currently, you will find a simple "blink" program, a "swbutton" [which has a software-controlled button listed], the Makefile, and eventually an image of my JTAG programming breadboard. Hopefully, I will add more interesting pics as they become available, but with were the project is at the moment, all the fun will exist in the world of C, Assembly, and my head.
Updates will be coming, though they will be slower since the end of the year is a busy time for me at work [everyone else is burning their vacation time]. Besides, I'm still catching up on Doctor Who and Business Week.
More next time,
Brandon
Labels: avr piano gcc atmega brandon
12.11.08
Livescribe Pulse Modem Hack: The Fourier Transform
So we have a rough idea what the frequency domain looks like; now we need some way of actually transforming data in the time domain (raw from the microphone) into frequency domain data. We'll do that through the creative application of interference.
For this trick to work, we must remember that the average value of a sinusoid is zero --- half the wave is positive, half the wave is negative, and it all cancels out nicely.
The other thing we need to know is how sinusoids multiply. So imagine a sinusoid, plodding along; a nice, slow wave with a long wavelength. Now imagine a fast, short wave sinusoid right on top of it. For each point along the x-axis of our imaginary graph, multiply the value of the slow wave with the value of the fast wave; and take care --- I said multiply, not add.
If your imagination is in smooth working order, you might piece together that the result is a periodic, oscillating wave. If your imagination is exceptionally good, you might even be able to notice that this curve is made up of two summed sinusoidal components. Truth be told, my imagination didn't hold up to the strain --- now I've got to find a new one. I recommend this for when your function imagination breaks down.
(In the meantime, we can examine a little math and discover that the two components are, in fact, sine waves whose frequencies are the sum of our original two sine waves and their difference, respectively.)
But regardless of what the components are, they are sinusoidal --- and we already said that the average value of a sine is zero.
What happens, though, if we take two identical sine waves, and multiply them? Every place that one is positive, the other is positive as well; a positive times a positive is another positive... and every place one is negative, the other is as well; negative times negative is also positive.
All of a sudden the average value is no longer zero!
More exciting still, the trick still works even when we use a sum of multiple sinusoids for our first wave --- if the sinusoid we're multiplying by is exactly the same frequency as one of the components of our first wave, then the average value of the product is non-zero --- and that value is proportional to the amplitude of the component wave. We've found a way to pick a single sine wave out of a sum of them! Our transform! Right?
There's an insidious evil lurking behind this joy.
Notice what happens when our two waves fall ninety degrees out of phase, so that the peaks of one wave occur simultaneous to the zeros of the other. When we multiply, BAM. another simple sine wave. That's no help --- its average is still zero. As it turns out, the average value of the product of same-frequency sines is not simply proportional to the amplitude of the original wave; it's proportional to the amplitude and phase shift. Shoot.
There is one hope, however: cosine. Cosine's already ninety degrees out of phase with sine, right? So whenever multiplying by a sine wave results in a false negative, multiplying by a cosine will get a full-strength output.
And sure enough, in combination, multiplying by sine, separately multiplying by cosine, and then adding the two results, negates the phase-based part of the new wave. Loverly. Here, we've invented the full form of the Fourier Transform (for that is indeed what it is): The amplitude of a given frequency component of a wave is equal to the average of the sum of a unit cosine at the given frequency times the original wave times a unit sine at the given frequency times the original wave.
That's it --- easy as pie, right?
Well, no, actually. Nowadays, no one uses that kind of notation --- we transform everything into the complex domain before we even begin. Even worse, there are huge pitfalls as soon as you make the whole thing digital --- when you no longer have smooth curves but individual samples the whole mess gets much more involved, plus there's the issue of efficiency; the method we used is about as fast as a mole on a paddleboat. It can't even reach the pedals, poor thing.
The practical application of all this comes right back to the Livescribe pulse. You remember that, right? All this has been about that. What we have now is a way to determine which frequency dominates a given audio sample. By making the pen play individual, pre-chosen frequencies, we can send data from the pen's speaker, through the PC microphone, and then decode the sounds back into labeled frequencies. Peachy. Now all we need is a way to encode data as a list of frequencies, and we're all set!
That will be the topic of my next post, where, with luck, I will be able to declare success in sending data from the pen to my PC without the use of Livescribe's desktop software.
So stay tuned.
For this trick to work, we must remember that the average value of a sinusoid is zero --- half the wave is positive, half the wave is negative, and it all cancels out nicely.
The other thing we need to know is how sinusoids multiply. So imagine a sinusoid, plodding along; a nice, slow wave with a long wavelength. Now imagine a fast, short wave sinusoid right on top of it. For each point along the x-axis of our imaginary graph, multiply the value of the slow wave with the value of the fast wave; and take care --- I said multiply, not add.
If your imagination is in smooth working order, you might piece together that the result is a periodic, oscillating wave. If your imagination is exceptionally good, you might even be able to notice that this curve is made up of two summed sinusoidal components. Truth be told, my imagination didn't hold up to the strain --- now I've got to find a new one. I recommend this for when your function imagination breaks down.
(In the meantime, we can examine a little math and discover that the two components are, in fact, sine waves whose frequencies are the sum of our original two sine waves and their difference, respectively.)
But regardless of what the components are, they are sinusoidal --- and we already said that the average value of a sine is zero.
What happens, though, if we take two identical sine waves, and multiply them? Every place that one is positive, the other is positive as well; a positive times a positive is another positive... and every place one is negative, the other is as well; negative times negative is also positive.
All of a sudden the average value is no longer zero!
More exciting still, the trick still works even when we use a sum of multiple sinusoids for our first wave --- if the sinusoid we're multiplying by is exactly the same frequency as one of the components of our first wave, then the average value of the product is non-zero --- and that value is proportional to the amplitude of the component wave. We've found a way to pick a single sine wave out of a sum of them! Our transform! Right?
There's an insidious evil lurking behind this joy.
Notice what happens when our two waves fall ninety degrees out of phase, so that the peaks of one wave occur simultaneous to the zeros of the other. When we multiply, BAM. another simple sine wave. That's no help --- its average is still zero. As it turns out, the average value of the product of same-frequency sines is not simply proportional to the amplitude of the original wave; it's proportional to the amplitude and phase shift. Shoot.
There is one hope, however: cosine. Cosine's already ninety degrees out of phase with sine, right? So whenever multiplying by a sine wave results in a false negative, multiplying by a cosine will get a full-strength output.
And sure enough, in combination, multiplying by sine, separately multiplying by cosine, and then adding the two results, negates the phase-based part of the new wave. Loverly. Here, we've invented the full form of the Fourier Transform (for that is indeed what it is): The amplitude of a given frequency component of a wave is equal to the average of the sum of a unit cosine at the given frequency times the original wave times a unit sine at the given frequency times the original wave.
That's it --- easy as pie, right?
Well, no, actually. Nowadays, no one uses that kind of notation --- we transform everything into the complex domain before we even begin. Even worse, there are huge pitfalls as soon as you make the whole thing digital --- when you no longer have smooth curves but individual samples the whole mess gets much more involved, plus there's the issue of efficiency; the method we used is about as fast as a mole on a paddleboat. It can't even reach the pedals, poor thing.
The practical application of all this comes right back to the Livescribe pulse. You remember that, right? All this has been about that. What we have now is a way to determine which frequency dominates a given audio sample. By making the pen play individual, pre-chosen frequencies, we can send data from the pen's speaker, through the PC microphone, and then decode the sounds back into labeled frequencies. Peachy. Now all we need is a way to encode data as a list of frequencies, and we're all set!
That will be the topic of my next post, where, with luck, I will be able to declare success in sending data from the pen to my PC without the use of Livescribe's desktop software.
So stay tuned.
Labels: livescribe, modem, pulse, sam
6.11.08
No Hackjam 11.5.08
No 11.5 hack jam; physical conditions were unfavorable (torrential rain).
I succeeded in all my calls for last week.
This week:
I succeeded in all my calls for last week.
This week:
- a post explaining the fourier transform, and how to use it.
- the first bit actually transmitted from pen to computer by sound
- another 10000 words of novel
3.11.08
NLP: Extracting Meaning from Noun Phrases
Today we'll look at one way to build up a representation of noun phrases as logical statements (see the first post for an introduction to the problem space we're working in). Once again, I'll be using Haskell code derived from my current project to demonstrate exactly what I mean. One thing that has become more and more apparent to me as I progress is that this kind of representation of the meaning of sentences is the wrong way to go. As I describe the structure I've been using, we'll take a look at some concepts that are difficult to represent in ways that don't depend on the english meaning of predicates themselves.
First, we need to revisit and expand our grammar somewhat. This is still a simplified grammar relative to English taken as a whole.
A quick grammar refresher. Relative clauses are a kind of extra, descriptive, verb phrase. In the sentence, "He who smelt it, dealt it," the relative clause "who smelt it" modifies he.
We're going to classify the words who what where why when and how as query words. In this naive system, the presense of a query word is going to set off the whole sentence as a question. So this system will flag "That's not how you do it," as a query. This is not a fatal flaw, but requires a bit of analysis to get around.
Now let's update our logical notation.
From the definitions we used last time, we're adding some elements of higher order logic: sets and higher order terms. (Sets were in the last post but not used.) A number of concepts are much more natural to express in sets, and higher order terms are used sparingly to store meta information to pass on to a higher module, rather than passing around an extra structure to store intent, etc.
One more thing before we get into the meat of this post. Let's define a couple shorthand notations for patterns that will show up a couple of times.
aSetWhere defines a set where all it's members satisfy a given sentence.
Throughout convertNounPhrase, which we are about to (re)define, the variable "x" will represent the current noun phrase. We can avoid clashes by being carefull to substitute variables whenever we add two noun phrases together. We'll do that in combineNVSents, another function from last time that we'll expand in just a bit, and we'll use that function in a general way anytime we're combining sentences.
The first important change is that our convertNounPhrase will now return a list of LogicSentences, each representing a /possible/ interpretation of the given grammar parse.
The conversion of Names is unchanged from last time, except for wrapping the result inside a list. Something I didn't point out last time was the awkwardness here in mapping, say, "Bob" to "Ex called-Bob(x)". What this does is put information inside the name of the object /inside/ the name of the predicate. Unfortunately, there's no where else to put it. My first system used a Const type of Term, but that doesn't scale to any system where two different objects have the same name.
Here's the big reason why Higher order terms were added; it's much more elegant to state the meaning of a question to be "speaker-asking(q)" where q is a sentence describing what the question should answer, Jeopardy style, than to avoid higher oder terms and pass another parameter around all the functions.
The case of a plural noun brings up our first ambiguous meaning. So what does a word like "potatoes" refer to? All potates, or some? Compare the sentences "Potatoes come from the ground," and "Bob had potatoes for breakfast." It's impossible to tell the meaning of "potatoes" without context, so all we can do here is shrug and let something downstream try to disambiguate the meanings.
The last case is a bit odd. I haven't been able to come up with any sentence where it makes sense to use a sigular noun without an article outside of gems like "I can has cheeseburger?" Hopefully our parser throws out parses with this form when more correct parses are available, but if we are asked to convert the NounPhrase, we should do our best. Unfortunately in this case, our best is to throw all the possibilities out there.
Pronouns are the same as last time, just wrapped in lists. I'm still truncating this for brevity, but you can see here a good case for adding the meaning of at least some words into the Word type itself. We could define a "meaningOf" field for the Word type and use that instead of all these Atom clauses. This also allows an agent to modify the results it gets from this module by modifying its dictionary, rather than having to rewrite meanings built in here.
And if we extend our definition of words to include the occasional phrase like "on fire", we can easily have the meanings of "on fire" and "aflame" be the same without any special cases.
I've broken an adjective phrase into six different cases based on the article, if any, it begins with and the plurality of the noun. You can see that we ignore phrases like "a potatoes." Above I argued that we should try to find a meaning for anything we get, but here we've got contradictory indicators about the plurality of the term rather than some grammar abuse.
One other part that was suprising to me was the seemingly contradictory interpretations of singular nouns preceeded by a or an. The first case is a sentence like "A book is written by an author," where the intent is to refer to all books through the idea of a book-ness. The second case is the intuitive one: "A book is misssing."
You can also see that adjectives are treated the same way as nouns: as atomic sentences. "A big ant" becomes "big(x) & ant(x)," which is problematic. Through And-Elimination, we get "big(x)"; this ant is no longer big for an ant, it is just /big/. We could introduce big-relative-to-ants(x), or possibly "Ey average-ant(y) & big-relative-to(x, y)," neither of which is particularly appealing. It gets worse however. From the Akhmatova paper I mentioned in my last progress posting: take the sentence "The gastronomic capital of France is Lyon." And-Elimination (and a proper interpretation of the rest of the sentence) will leave you with "The capital of France is Lyon." So adjectives are difficult to shoehorn in to logic like this, regardless of whether they are limiting or modifying the base noun.
Moving on, the conversion of relative clauses is fairly straightforward, funny notation notwithstanding. This funny looking structure is the one bit of Haskell I decided to sneak in here. It's a list comprehension, and it's best explained via a contrived example. Let's say you wanted a list of strings that consisted of all possible combinations of a set of prefixes and suffixes. In Haskell, you could say:
This creates a list of nine strings. Now compare that with definition for relative clauses above, and you can see that we are making all possible combinations of noun and verb phrases we can muster. (The definition of convertVerbPhrase from the last post returned only a single LogicSentence; here we're using one modified to return a list of Sentences.)
That's it for this post. There are, of course, more types of noun phrases, but the pattern is the same for them: determine the proper representation in your scheme and add a case to convertNounPhrase to handle it. Verb phrases are trickier, and I hope to get to them soon.
I'm including below the handful of defintions that were reused from last time, in case you want to refer to them (it also makes this post compile as well!). This includes the modified convertVerbPhrase and a new convertSentence that deals with convertNounPhrase and convertVerbPhrase returning lists. It should look familiar after the last example.
=====
> module HackJamNLP2 where
First, we need to revisit and expand our grammar somewhat. This is still a simplified grammar relative to English taken as a whole.
> data NounPhrase = ANoun Word
> | APronoun Word
> | Name Word
> | AdjectivesNoun [Word] Word
> | NPRelClause NounPhrase RelativeClause
> data RelativeClause = RelativeClause Word VerbPhrase
A quick grammar refresher. Relative clauses are a kind of extra, descriptive, verb phrase. In the sentence, "He who smelt it, dealt it," the relative clause "who smelt it" modifies he.
> data PartOfSpeech = Noun | Verb | Adjective | Article | Pronoun
> | Adverb | QueryWord | Unknown
> deriving Eq
We're going to classify the words who what where why when and how as query words. In this naive system, the presense of a query word is going to set off the whole sentence as a question. So this system will flag "That's not how you do it," as a query. This is not a fatal flaw, but requires a bit of analysis to get around.
Now let's update our logical notation.
> data LogicSentence = Atom ID [Term]
> | And LogicSentence LogicSentence
> | Implies LogicSentence LogicSentence
> | ForAll Term LogicSentence
> | Exists Term LogicSentence
> | MemberOf Term Term
> deriving Show
>
> data Term = Variable ID
> | Set [Term]
> | HigherOrder LogicSentence
> deriving Show
From the definitions we used last time, we're adding some elements of higher order logic: sets and higher order terms. (Sets were in the last post but not used.) A number of concepts are much more natural to express in sets, and higher order terms are used sparingly to store meta information to pass on to a higher module, rather than passing around an extra structure to store intent, etc.
One more thing before we get into the meat of this post. Let's define a couple shorthand notations for patterns that will show up a couple of times.
> x = Variable "x"
> inContext = Atom "in-context" [x]
> aSetWhere sent = let s = Variable "s" in
> Exists s (ForAll x ((x `MemberOf` s) `Implies` sent))
aSetWhere defines a set where all it's members satisfy a given sentence.
Throughout convertNounPhrase, which we are about to (re)define, the variable "x" will represent the current noun phrase. We can avoid clashes by being carefull to substitute variables whenever we add two noun phrases together. We'll do that in combineNVSents, another function from last time that we'll expand in just a bit, and we'll use that function in a general way anytime we're combining sentences.
The first important change is that our convertNounPhrase will now return a list of LogicSentences, each representing a /possible/ interpretation of the given grammar parse.
> convertNounPhrase :: NounPhrase -> [LogicSentence]
> convertNounPhrase (Name name)
> = [Exists x (Atom ("called-" ++ wordString name) [x])]
The conversion of Names is unchanged from last time, except for wrapping the result inside a list. Something I didn't point out last time was the awkwardness here in mapping, say, "Bob" to "Ex called-Bob(x)". What this does is put information inside the name of the object /inside/ the name of the predicate. Unfortunately, there's no where else to put it. My first system used a Const type of Term, but that doesn't scale to any system where two different objects have the same name.
> convertNounPhrase (ANoun n)
> | partOfSpeech n == QueryWord
> = [Atom "speaker-asking"
> [HigherOrder (ForAll x baseSentence)]]
> | isPlural n = [ForAll x baseSentence, aSetWhere baseSentence]
> | otherwise = [Exists x baseSentence,
> ForAll x baseSentence,
> aSetWhere baseSentence]
> where baseSentence = Atom (wordString n) [x]
Here's the big reason why Higher order terms were added; it's much more elegant to state the meaning of a question to be "speaker-asking(q)" where q is a sentence describing what the question should answer, Jeopardy style, than to avoid higher oder terms and pass another parameter around all the functions.
The case of a plural noun brings up our first ambiguous meaning. So what does a word like "potatoes" refer to? All potates, or some? Compare the sentences "Potatoes come from the ground," and "Bob had potatoes for breakfast." It's impossible to tell the meaning of "potatoes" without context, so all we can do here is shrug and let something downstream try to disambiguate the meanings.
The last case is a bit odd. I haven't been able to come up with any sentence where it makes sense to use a sigular noun without an article outside of gems like "I can has cheeseburger?" Hopefully our parser throws out parses with this form when more correct parses are available, but if we are asked to convert the NounPhrase, we should do our best. Unfortunately in this case, our best is to throw all the possibilities out there.
> convertNounPhrase (APronoun n)
> = case wordString n of
> "I" -> [Exists x (Atom "speaker" [x])]
> "you" -> [Exists x (Atom "audience" [x])]
> otherwise -> [Exists x inContext]
Pronouns are the same as last time, just wrapped in lists. I'm still truncating this for brevity, but you can see here a good case for adding the meaning of at least some words into the Word type itself. We could define a "meaningOf" field for the Word type and use that instead of all these Atom clauses. This also allows an agent to modify the results it gets from this module by modifying its dictionary, rather than having to rewrite meanings built in here.
And if we extend our definition of words to include the occasional phrase like "on fire", we can easily have the meanings of "on fire" and "aflame" be the same without any special cases.
> convertNounPhrase (AdjectivesNoun (adj:adjs) noun)
> | isThe adj && isPlural noun
> = [ForAll x (inContext `And` (baseSentence adjs noun)),
> aSetWhere (inContext `And` (baseSentence adjs noun))]
> | isA adj && isPlural noun
> = []
> | isPlural noun = [ForAll x (baseSentence (adj:adjs) noun),
> aSetWhere (baseSentence (adj:adjs) noun)]
> | isThe adj = [Exists x (inContext `And` (baseSentence adjs noun))]
> | isA adj = [ForAll x (baseSentence adjs noun),
> Exists x (baseSentence adjs noun)]
> | otherwise = [Exists x (baseSentence (adj:adjs) noun)]
> where
> isThe adj = wordString adj == "the"
> isA adj = wordString adj == "a" || wordString adj == "an"
> baseSentence [] n
> = Atom (wordString n) [x]
> baseSentence (adj:adjs) n
> = Atom (wordString adj) [x] `And` (baseSentence adjs n)
I've broken an adjective phrase into six different cases based on the article, if any, it begins with and the plurality of the noun. You can see that we ignore phrases like "a potatoes." Above I argued that we should try to find a meaning for anything we get, but here we've got contradictory indicators about the plurality of the term rather than some grammar abuse.
One other part that was suprising to me was the seemingly contradictory interpretations of singular nouns preceeded by a or an. The first case is a sentence like "A book is written by an author," where the intent is to refer to all books through the idea of a book-ness. The second case is the intuitive one: "A book is misssing."
You can also see that adjectives are treated the same way as nouns: as atomic sentences. "A big ant" becomes "big(x) & ant(x)," which is problematic. Through And-Elimination, we get "big(x)"; this ant is no longer big for an ant, it is just /big/. We could introduce big-relative-to-ants(x), or possibly "Ey average-ant(y) & big-relative-to(x, y)," neither of which is particularly appealing. It gets worse however. From the Akhmatova paper I mentioned in my last progress posting: take the sentence "The gastronomic capital of France is Lyon." And-Elimination (and a proper interpretation of the rest of the sentence) will leave you with "The capital of France is Lyon." So adjectives are difficult to shoehorn in to logic like this, regardless of whether they are limiting or modifying the base noun.
> convertNounPhrase (NPRelClause np (RelativeClause word vp))
> = [combineNVSents nSent (vSent (subjectOf nSent))
> | nSent <- convertNounPhrase np, > vSent <- convertVerbPhrase vp]
Moving on, the conversion of relative clauses is fairly straightforward, funny notation notwithstanding. This funny looking structure is the one bit of Haskell I decided to sneak in here. It's a list comprehension, and it's best explained via a contrived example. Let's say you wanted a list of strings that consisted of all possible combinations of a set of prefixes and suffixes. In Haskell, you could say:
> contrived = [ prefix ++ suffix | prefix <- ["un", "im", "non"], > suffix <- ["bad", "good", "ugly"] ]
This creates a list of nine strings. Now compare that with definition for relative clauses above, and you can see that we are making all possible combinations of noun and verb phrases we can muster. (The definition of convertVerbPhrase from the last post returned only a single LogicSentence; here we're using one modified to return a list of Sentences.)
That's it for this post. There are, of course, more types of noun phrases, but the pattern is the same for them: determine the proper representation in your scheme and add a case to convertNounPhrase to handle it. Verb phrases are trickier, and I hope to get to them soon.
I'm including below the handful of defintions that were reused from last time, in case you want to refer to them (it also makes this post compile as well!). This includes the modified convertVerbPhrase and a new convertSentence that deals with convertNounPhrase and convertVerbPhrase returning lists. It should look familiar after the last example.
=====
> data VerbPhrase = AVerb Word
> data Word = Word {
> wordString :: String,
> partOfSpeech :: PartOfSpeech,
> isPlural :: Bool
> }
>
> type ID = String
> convertVerbPhrase :: VerbPhrase -> [Term -> LogicSentence]
> convertVerbPhrase (AVerb v) = [\subject -> Atom (wordString v) [subject]]
> subjectOf :: LogicSentence -> Term
> subjectOf (ForAll t _) = t
> subjectOf (Exists t _) = t
> combineNVSents :: LogicSentence -> LogicSentence -> LogicSentence
> combineNVSents (ForAll t npDescription) vpDescription
> = ForAll t (npDescription `Implies` vpDescription)
> combineNVSents (Exists t npDescription) vpDescription
> = Exists t (npDescription `And` vpDescription)