For more information on internal seminars, contact Elena Lofrese.
Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
07 February 2012, 11:00 - Location: C-29
- Speakers:
- Oscar Täckström (SICS / Uppsala University, Sweden)
The ability to predict the linguistic structure of sentences
or documents is central to the study of natural language
processing. While annotated resources for parsing and several
other tasks are available in a number of languages, we cannot
expect to have access to labeled resources for all tasks in all
languages. In this talk I will describe how cross-lingual word
clusters can be used as a way to sidestep this problem, focusing
on the important tasks of syntactic dependency parsing and
named-entity recognition (NER). First, I will show how
monolingual word clusters can be used to improve parsing and NER
for a range of languages across language families. I will then
describe an algorithm for inducing cross-lingual word clusters
using large corpora and word alignments, and how these clusters
can significantly improve the accuracy of cross-lingual structure
prediction. Specifically, I will show how an English dependency
parser and NER system can be transferred to a range of other
languages without any need for target-language training
data.