Effectively Creating Weakly Labeled Training Examples via Approximate Domain Knowledge

Bibliographic Details
Published in Inductive Logic Programming Vol. 9046; pp. 92-107
Main Authors Natarajan, Sriraam; Picado, Jose; Khot, Tushar; Kersting, Kristian; Re, Christopher; Shavlik, Jude
Format Book Chapter
Language English
Published Switzerland, Springer International Publishing AG, 01.01.2015
Series Lecture Notes in Computer Science
ISBN 9783319237077; 3319237071
ISSN 0302-9743; 1611-3349
DOI 10.1007/978-3-319-23708-4_7

Abstract One of the challenges to information extraction is the requirement of human-annotated examples, commonly called gold-standard examples. Many successful approaches alleviate this problem by employing some form of distant supervision, i.e., they look into knowledge bases such as Freebase as a source of supervision to create more examples. While this is perfectly reasonable, most distant supervision methods rely on hand-coded background knowledge that explicitly looks for patterns in text. For example, they assume all sentences containing Person X and Person Y are positive examples of the relation married(X, Y). In this work, we take a different approach: we infer weakly supervised examples for relations from models learned by using knowledge outside the natural language task. We argue that this method creates more robust examples that are particularly useful when learning the entire information-extraction model (the structure and parameters). We demonstrate on three domains that this form of weak supervision yields superior results when learning structure compared to using distant supervision labels or a smaller set of gold-standard labels.
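
The distant supervision heuristic that the abstract criticizes can be illustrated with a small sketch. This is not the authors' method or code; the knowledge-base facts, sentences, and the function name `distant_labels` are all hypothetical toy assumptions, and person mentions are assumed to be already extracted. It only shows how labeling every sentence that mentions a known pair produces noisy positives.

```python
from itertools import permutations

# Hypothetical knowledge-base facts for married(X, Y); toy data, not from Freebase.
KB_MARRIED = {("Ann Lee", "Bob Ray")}

# Toy sentences paired with the person mentions found in them (assumed pre-extracted).
SENTENCES = [
    ("Ann Lee married Bob Ray in 2001.", ["Ann Lee", "Bob Ray"]),
    ("Ann Lee debated Bob Ray on television.", ["Ann Lee", "Bob Ray"]),
    ("Cat Fox met Bob Ray at a conference.", ["Cat Fox", "Bob Ray"]),
]


def distant_labels(sentences, kb_pairs):
    """Label every sentence that mentions a known KB pair as a positive example."""
    examples = []
    for text, people in sentences:
        # Try each ordered pair of mentions against the knowledge base.
        for x, y in permutations(people, 2):
            if (x, y) in kb_pairs:
                examples.append((text, x, y))
    return examples


labeled = distant_labels(SENTENCES, KB_MARRIED)
for text, x, y in labeled:
    print(f"married({x}, {y}) <- {text}")
```

In this toy run the second sentence is labeled positive even though it says nothing about marriage, which is the kind of noise the abstract attributes to pattern-based distant supervision.
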
Author Email natarasr@indiana.edu (Natarajan, Sriraam)
Copyright Springer International Publishing Switzerland 2015
Dewey 005.115
Discipline Mathematics; Computer Science
EISBN 9783319237084; 331923708X
Editors Davis, Jesse; Ramon, Jan
Peer Reviewed Yes
LC Call Number QA8.9-QA10.3; Q334-342
OCLC 933623782
Page Count 16
Publication Place Switzerland; Cham
Series Subtitle Lecture Notes in Artificial Intelligence
Publication Subtitle 24th International Conference, ILP 2014, Nancy, France, September 14-16, 2014, Revised Selected Papers
Related Persons Hutchison, David; Kanade, Takeo; Kittler, Josef; Kleinberg, Jon M.; Mattern, Friedemann; Mitchell, John C.; Naor, Moni; Pandu Rangan, C.; Steffen, Bernhard; Terzopoulos, Demetri; Tygar, Doug; Weikum, Gerhard
Subjects Artificial intelligence; Computer programming / software development; Distant Supervision; Freebase; Gold Standard Examples; Markov Logic Networks (MLN); Mathematical theory of computation; Weak Supervision