"Software Learns to Translate by Reading Up"
New Scientist (02/22/05); Knight, Will
Kevin Knight of the University of Southern California's Information Sciences Institute said his new translation software is in line with the new direction of machine learning. Speaking at the American Association for the Advancement of Science meeting in Washington, D.C., Knight said the translation parameters that his statistical machine translation software develops allow computers to generate ideas about the structure of different languages. "Before long a machine will discover something about linguistics that only a machine could, by crunching through billions of words," Knight said. Knight and Daniel Marcu, also with the institute, developed the automated translation tool and formed Language Weaver in Los Angeles to sell the software. The new software is designed to translate dictionaries, patterns, and rules, and build probability rules for words, phrases, and syntactic structures. Current translation software tends to use hand-coded rules when transposing words and phrases. The new software is said to be faster and better able to handle less familiar vocabulary and languages.
Full Article: http://www.newscientist.com/article.ns?id=dn7054
This is similar to a paper that I researched for a class, except that it was about web searching instead of learning to translate words. By looking at a specific website that covers one topic, the program would collect the words on that site and store each word's probability of appearing there. The words were then ranked by priority, based on how often each one is used throughout the site. So at a computer science site, the words "artificial intelligence" might come up several times and would be given a higher score. If you look at the scores of the key words, after filtering out common words like "the", "a", etc., you would have a good idea of the common words and lingo for that type of subject.
My paper and PowerPoint presentation: http://www.cs.sunyit.edu/~andrusw/Project/Search_Engines/
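To make the scoring idea concrete, here is a rough Python sketch of how the word scores for one site might be computed. It is not the code from the paper: the function names (`keyword_scores`, `top_keywords`), the tiny stop-word list, and the sample pages are all made up for illustration, and it scores single words only, so a phrase like "artificial intelligence" shows up as two separate words.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "for", "on"}


def keyword_scores(pages, stop_words=STOP_WORDS):
    """Return {word: relative frequency} over all pages of a single site."""
    counts = Counter()
    for text in pages:
        # Lowercase the text and pull out runs of letters as words.
        words = re.findall(r"[a-z']+", text.lower())
        # Skip the common words so they do not dominate the scores.
        counts.update(w for w in words if w not in stop_words)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Score each word by its share of all remaining words on the site.
    return {word: count / total for word, count in counts.items()}


def top_keywords(pages, n=5):
    """Return the n highest-scoring words, ties broken alphabetically."""
    scores = keyword_scores(pages)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Made-up sample text standing in for the pages of a computer science site.
pages = [
    "Artificial intelligence is the study of intelligent agents.",
    "The course covers artificial intelligence and search.",
]

for word, score in top_keywords(pages, n=3):
    print(f"{word}: {score:.2f}")
```

With the sample pages above, "artificial" and "intelligence" come out on top because each appears on both pages, while "the" and "of" are dropped before scoring. A fuller version would probably also count two-word phrases and compare the scores against a general reference corpus, but that goes beyond what is described here.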