Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data

Bibliographic Details
Main Authors	Sakai, Tomoya; Plessis, Marthinus Christoffel du; Niu, Gang; Sugiyama, Masashi
Format	Journal Article
Language	English
Published	23.05.2016
Subjects	Computer Science - Learning
Online Access	https://arxiv.org/abs/1605.06955

Abstract	Most of the semi-supervised classification methods developed so far use unlabeled data for regularization purposes under particular distributional assumptions such as the cluster assumption. In contrast, recently developed methods of classification from positive and unlabeled data (PU classification) use unlabeled data for risk evaluation, i.e., label information is directly extracted from unlabeled data. In this paper, we extend PU classification to also incorporate negative data and propose a novel semi-supervised classification approach. We establish generalization error bounds for our novel methods and show that the bounds decrease with respect to the number of unlabeled data without the distributional assumptions that are required in existing semi-supervised classification methods. Through experiments, we demonstrate the usefulness of the proposed methods.
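
The abstract's statement that PU classification uses unlabeled data "for risk evaluation" can be illustrated with a small sketch. It relies only on the standard mixture identity p(x) = prior * p_P(x) + (1 - prior) * p_N(x), which lets the negative-class part of the risk be rewritten using positive and unlabeled samples. The function names, the sigmoid loss, the assumption that the class prior is known, and the `weight` parameter that mixes the supervised and PU risks are all illustrative assumptions; the paper's exact estimators and combination rule may differ.

```python
import numpy as np

def sigmoid_loss(margin):
    # Illustrative loss choice (an assumption): small when the margin is large.
    return 1.0 / (1.0 + np.exp(margin))

def pn_risk(g_pos, g_neg, prior, loss=sigmoid_loss):
    # Ordinary supervised risk from classifier outputs on positive and negative samples.
    return prior * loss(g_pos).mean() + (1.0 - prior) * loss(-g_neg).mean()

def pu_risk(g_pos, g_unl, prior, loss=sigmoid_loss):
    # The negative-class term is replaced using the mixture identity:
    # (1 - prior) * E_N[loss(-g)] = E_U[loss(-g)] - prior * E_P[loss(-g)].
    return prior * (loss(g_pos).mean() - loss(-g_pos).mean()) + loss(-g_unl).mean()

def combined_risk(g_pos, g_neg, g_unl, prior, weight, loss=sigmoid_loss):
    # Hypothetical convex combination of the two risks, with weight in [0, 1].
    return ((1.0 - weight) * pn_risk(g_pos, g_neg, prior, loss)
            + weight * pu_risk(g_pos, g_unl, prior, loss))

# Toy classifier outputs g(x); the unlabeled set is an exact 50/50 mixture
# of the two labeled sets, so the PU risk should match the PN risk.
g_pos = np.array([1.0, 2.0])
g_neg = np.array([-1.0, -3.0])
g_unl = np.concatenate([g_pos, g_neg])
prior = 0.5

risk_pn = pn_risk(g_pos, g_neg, prior)
risk_pu = pu_risk(g_pos, g_unl, prior)
risk_mix = combined_risk(g_pos, g_neg, g_unl, prior, weight=0.3)
print(risk_pn, risk_pu, risk_mix)
```

In this toy setup the unlabeled sample follows the mixture exactly, so the two risks agree up to floating-point error; with real data they would agree only in expectation.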
BackLink	https://doi.org/10.48550/arXiv.1605.06955 (View paper in arXiv)
Copyright	http://arxiv.org/licenses/nonexclusive-distrib/1.0
DOI	10.48550/arxiv.1605.06955
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	false
IsScholarly	false
OpenAccessLink	https://arxiv.org/abs/1605.06955
SecondaryResourceType	preprint
SourceID	arxiv
SourceType	Open Access Repository