Anyone know much about language filter technology?


rocketdodger
18th October 2007, 01:53 PM
I am very interested in technology that can pick out meaningful sentences among gibberish. Anyone know much about the topic?

Soapy Sam
18th October 2007, 02:21 PM
The NSA.

So they say anyway.

drkitten
18th October 2007, 05:46 PM
Quoting rocketdodger: "I am very interested in technology that can pick out meaningful sentences among gibberish. Anyone know much about the topic?"

Yes, quite a bit. Basically, language has dependency structure: the words you have already seen constrain which words can plausibly come next. If you are an English speaker, you almost certainly know the word that follows "Will you do me a ...", and you probably have a strong preference between "little red fire engine" and "red little fire engine," even if you can't explain it. In a very simple case, whatever follows the word "the" is almost certainly a noun, adverb, or adjective.
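
To make the "Will you do me a ..." point concrete, here is a minimal Python sketch that counts which word follows which in a tiny corpus and then guesses the next word. The corpus and the names build_bigram_counts and most_likely_next are made up for illustration; this is a toy, not how any particular filter is built.

```python
from collections import Counter, defaultdict

# Toy corpus, invented purely for illustration.
CORPUS = [
    "will you do me a favor",
    "could you do me a favor please",
    "do me a favor and close the door",
    "she did me a kindness",
]

def build_bigram_counts(sentences):
    """Count how often each word follows each preceding word."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            followers[prev][nxt] += 1
    return followers

def most_likely_next(followers, word):
    """Return the most frequent follower of `word`, or None if none was seen."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]

followers = build_bigram_counts(CORPUS)
print(most_likely_next(followers, "a"))
```

With a corpus this small the counts are only suggestive, but the same bookkeeping over millions of sentences is what lets a program "know" the likely next word.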

If what you are looking at doesn't obey the dependency structure, then it's gibberish. The closer your gibberish-generator can match real dependency structure, the harder it will be to filter. Hence the arms race between spammers and spam filters....
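
One way to sketch the "does it obey the dependency structure" test is a bigram model with add-one smoothing: score a sentence by its average log-probability per word and flag anything that scores too low. Everything here is an assumption made for illustration: the training sentences, the function names, and especially the threshold, which is an arbitrary number that happens to suit this toy corpus. Real spam filters use far more data and other signals.

```python
import math
from collections import Counter

# Toy training text, invented for illustration; a real filter needs far more.
TRAINING = [
    "the little red fire engine raced down the street",
    "will you do me a favor",
    "the dog chased the little red ball",
    "you will do me a favor",
]

def train(sentences):
    """Count contexts (words that have a follower) and adjacent word pairs."""
    contexts, pairs, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        vocab.update(words[1:])
        contexts.update(words[:-1])
        pairs.update(zip(words, words[1:]))
    return contexts, pairs, len(vocab) + 1  # +1 slot for any unseen word

def avg_log_prob(sentence, model):
    """Average add-one-smoothed bigram log-probability per word."""
    contexts, pairs, vocab_size = model
    words = ["<s>"] + sentence.lower().split()
    steps = list(zip(words, words[1:]))
    if not steps:
        return float("-inf")
    total = sum(
        math.log((pairs[(prev, nxt)] + 1) / (contexts[prev] + vocab_size))
        for prev, nxt in steps
    )
    return total / len(steps)

def looks_like_gibberish(sentence, model, threshold=-2.5):
    # The threshold is a hypothetical value tuned by eye for this toy corpus.
    return avg_log_prob(sentence, model) < threshold

model = train(TRAINING)
for text in ("the little red fire engine", "engine the fire red little"):
    print(text, round(avg_log_prob(text, model), 2), looks_like_gibberish(text, model))
```

The shuffled sentence uses exactly the same words as the real one, so a word-list filter can't tell them apart; only the word-order statistics can. That is also why a gibberish generator that copies those statistics gets harder to catch.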