Structured Relation Discovery using Generative Models, by Limin Yao, Aria Haghighi, Sebastian Riedel, and Andrew McCallum, EMNLP 2011.
This paper exemplifies my problems with LDA and unsupervised learning in general. Basically what they did was tweak LDA to model relations instead of words, with each “topic” having a set of multinomial distributions over features for an observed relation. For example, the entity pair mentioned in “the Gamma Knife, made by the Swedish medical technology firm Elekta” is a “word” in this LDA model, with a set of features used when predicting the relation between the two entities; the relation itself corresponds to the “topic” in LDA. At its heart, this isn’t very different from what Haghighi and Klein did for coreference resolution, or what I am planning on doing with entity resolution in NELL. The models look almost identical if you squint a little bit. But they have relations as their “topic” instead of entities (as in H&K’s coreference model), and different features that are intended to be better suited to relations.
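To make the analogy concrete, here’s a minimal sketch of the generative story as I understand it. The feature types, vocabulary sizes, and variable names are my own illustration, not the paper’s notation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: R relation "topics", one vocabulary per feature type.
R = 10
feature_types = {"dep_path": 500, "entity_types": 50, "trigger": 300}

# Each relation has one multinomial per feature type (Dirichlet-distributed),
# just as each LDA topic has one multinomial over words.
phi = {t: rng.dirichlet(np.ones(V), size=R)          # shape (R, V)
       for t, V in feature_types.items()}

def generate_tuple(theta):
    """Generate one entity-pair mention (the "word" of this model): draw a
    relation "topic" from the document's relation distribution theta, then
    emit one feature per type from that relation's multinomials."""
    r = rng.choice(R, p=theta)
    features = {t: rng.choice(V, p=phi[t][r])
                for t, V in feature_types.items()}
    return r, features

theta = rng.dirichlet(np.ones(R))   # per-document mixture over relations
print(generate_tuple(theta))
```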
Coreference for Learning to Extract Relations: Yes, Virginia, Coreference Matters, by Ryan Gabbard, Marjorie Freedman, and Ralph Weischedel (some folks at Raytheon), ACL 2011 short paper.
This is a simple short paper with one main point: when you build an information extraction system (theirs is semi-supervised), including the output of a coreference resolution system should increase your recall at the cost of some precision. That’s pretty much what you would expect; the result is intuitive.
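Here’s a minimal sketch of where that trade-off comes from, with a toy mention type and classifier interface of my own invention (none of this is the paper’s code): coreference lets the extractor reach relations expressed through pronouns and nominals, but every resolution error feeds the relation classifier a wrong entity.

```python
from dataclasses import dataclass
from itertools import combinations

# Hypothetical mention representation, for illustration only.
@dataclass(frozen=True)
class Mention:
    text: str
    is_name: bool   # named mention vs. pronoun/nominal

def candidate_pairs(mentions, coref=None):
    """Entity pairs within a sentence to feed a relation classifier.

    Without coreference, only named mentions participate, so relations
    expressed through pronouns are unreachable. With coreference, each
    mention resolves to its chain's canonical name, adding candidates
    (recall up) but also propagating resolution errors (precision down)."""
    def canonical(m):
        if coref is not None and m in coref:
            return coref[m]                    # resolve to antecedent
        return m.text if m.is_name else None   # drop unresolved pronouns

    names = [n for n in (canonical(m) for m in mentions) if n is not None]
    names = list(dict.fromkeys(names))         # dedupe, keep order
    return list(combinations(names, 2))

# "It was founded by Lars Leksell" -- "It" refers to Elekta in context.
sent = [Mention("It", False), Mention("Lars Leksell", True)]
print(candidate_pairs(sent))                        # [] -- relation missed
print(candidate_pairs(sent, {sent[0]: "Elekta"}))   # [('Elekta', 'Lars Leksell')]
```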
Collective Cross-Document Relation Extraction Without Labelled Data, by Limin Yao, Sebastian Riedel, and Andrew McCallum, EMNLP 2010.
This paper is very similar to a lot of previous work in weakly supervised information extraction, including several papers I’ve already posted here. They do the typical “find sentences that mention two entities that are related in Freebase and learn a relation classifier from those sentences” business. There were two things in this paper that I thought were worth mentioning.
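For reference, the distant-supervision recipe being alluded to looks roughly like this; the triples and sentences are made up for illustration:

```python
from collections import defaultdict

# Freebase-style relation instances: (subject, relation, object).
kb = {("Elekta", "founded_by", "Lars Leksell"),
      ("Elekta", "headquartered_in", "Stockholm")}
relation_of = {(s, o): r for s, r, o in kb}

def label_sentences(sentences):
    """sentences: iterable of (text, entity1, entity2), entities already
    detected. Any sentence whose entity pair is related in Freebase becomes
    a training example for that relation -- even when the sentence doesn't
    actually express it, which is where the label noise comes from."""
    examples = defaultdict(list)
    for text, e1, e2 in sentences:
        r = relation_of.get((e1, e2)) or relation_of.get((e2, e1))
        if r is not None:
            examples[r].append(text)
    return examples

corpus = [
    ("Lars Leksell founded Elekta in 1972.", "Lars Leksell", "Elekta"),
    ("Elekta's Lars Leksell gave a talk.", "Elekta", "Lars Leksell"),  # noisy hit
]
print(dict(label_sentences(corpus)))   # both land under 'founded_by'
```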
Supervised Noun Phrase Coreference Research: The First Fifteen Years, by Vincent Ng, ACL 2010.
This is an interesting survey paper that gives a pretty good overview for someone who is new to coreference resolution. It focuses entirely on within-document, supervised coreference resolution, though Ng mentions a couple of unsupervised models in passing (he never discusses cross-document coreference). He spends most of his time on the mention-pair model, because that’s what has been used most frequently, though newer models now get better performance. The paper ends with 6 1/2 pages of references.
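Since the survey leans so heavily on it, here’s what the mention-pair model boils down to: a binary classifier over (antecedent, anaphor) pairs plus a linking step. This closest-first sketch and its toy classifier are my own illustration, not Ng’s formulation:

```python
def closest_first_link(mentions, coreferent):
    """mentions: list in document order; coreferent(a, b) -> bool stands in
    for the trained pairwise classifier. Each mention links to the closest
    preceding mention the classifier accepts."""
    links = {}
    for j, anaphor in enumerate(mentions):
        for i in range(j - 1, -1, -1):       # scan antecedents right-to-left
            if coreferent(mentions[i], anaphor):
                links[anaphor] = mentions[i]
                break
    return links

# Toy classifier: exact string match, or one hand-coded pronoun rule.
def toy_classifier(antecedent, anaphor):
    return (antecedent.lower() == anaphor.lower()
            or (anaphor == "he" and antecedent == "Ng"))

print(closest_first_link(["Ng", "the survey", "he"], toy_classifier))
# {'he': 'Ng'}
```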
Stanford's Multi-Pass Sieve Coreference Resolution System at the CoNLL-2011 Shared Task, by Lee, Peirsman, Chang, Chambers, Surdeanu, and Jurafsky.
This system had the highest coreference resolution performance in the CoNLL-2011 shared task, and this paper is a write-up describing it.
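The sieve idea itself is a cascade of deterministic passes ordered from highest to lowest precision, each pass operating on the clusters built by the passes before it. Here is a minimal sketch of that control flow, with placeholder passes of my own (the real system has roughly a dozen, from exact string match down to pronoun resolution):

```python
def exact_match(c1, c2):
    return any(m1.lower() == m2.lower() for m1 in c1 for m2 in c2)

def head_match(c1, c2):
    # Placeholder: the real pass compares syntactic heads, modifiers, etc.
    return any(m1.split()[-1].lower() == m2.split()[-1].lower()
               for m1 in c1 for m2 in c2)

SIEVES = [exact_match, head_match]   # highest precision first

def run_sieves(mentions):
    """Start with singleton clusters; each sieve merges clusters left
    behind by the earlier, more precise sieves."""
    clusters = [{m} for m in mentions]
    for sieve in SIEVES:
        merged = True
        while merged:                # repeat until this pass adds nothing
            merged = False
            for i in range(len(clusters)):
                for j in range(i + 1, len(clusters)):
                    if sieve(clusters[i], clusters[j]):
                        clusters[i] |= clusters.pop(j)
                        merged = True
                        break
                if merged:
                    break
    return clusters

print(run_sieves(["Barack Obama", "Obama", "the president"]))
# -> [{'Barack Obama', 'Obama'}, {'the president'}] (set order may vary)
```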