Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data
Main Authors | Tomoya Sakai, Marthinus Christoffel du Plessis, Gang Niu, Masashi Sugiyama |
---|---|
Format | Journal Article |
Language | English |
Published | 23.05.2016 |
Subjects | Computer Science - Learning |
Online Access | https://arxiv.org/abs/1605.06955 |
Abstract | Most of the semi-supervised classification methods developed so far use
unlabeled data for regularization purposes under particular distributional
assumptions such as the cluster assumption. In contrast, recently developed
methods of classification from positive and unlabeled data (PU classification)
use unlabeled data for risk evaluation, i.e., label information is directly
extracted from unlabeled data. In this paper, we extend PU classification to
also incorporate negative data and propose a novel semi-supervised
classification approach. We establish generalization error bounds for our novel
methods and show that the bounds decrease with respect to the number of
unlabeled data without the distributional assumptions that are required in
existing semi-supervised classification methods. Through experiments, we
demonstrate the usefulness of the proposed methods. |
---|---|
Author | Sugiyama, Masashi Niu, Gang Plessis, Marthinus Christoffel du Sakai, Tomoya |
Author_xml | – sequence: 1 givenname: Tomoya surname: Sakai fullname: Sakai, Tomoya – sequence: 2 givenname: Marthinus Christoffel du surname: Plessis fullname: Plessis, Marthinus Christoffel du – sequence: 3 givenname: Gang surname: Niu fullname: Niu, Gang – sequence: 4 givenname: Masashi surname: Sugiyama fullname: Sugiyama, Masashi |
BackLink | https://doi.org/10.48550/arXiv.1605.06955 (View paper in arXiv) |
ContentType | Journal Article |
Copyright | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
DBID | AKY GOX |
DOI | 10.48550/arxiv.1605.06955 |
DatabaseName | arXiv Computer Science arXiv.org |
Database_xml | – sequence: 1 dbid: GOX name: arXiv.org url: http://arxiv.org/find sourceTypes: Open Access Repository |
DeliveryMethod | fulltext_linktorsrc |
ExternalDocumentID | 1605_06955 |
GroupedDBID | AKY GOX |
IEDL.DBID | GOX |
IngestDate | Mon Jan 08 05:45:32 EST 2024 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
OpenAccessLink | https://arxiv.org/abs/1605.06955 |
ParticipantIDs | arxiv_primary_1605_06955 |
PublicationCentury | 2000 |
PublicationDate | 2016-05-23 |
PublicationDateYYYYMMDD | 2016-05-23 |
PublicationDate_xml | – month: 05 year: 2016 text: 2016-05-23 day: 23 |
PublicationDecade | 2010 |
PublicationYear | 2016 |
SecondaryResourceType | preprint |
SourceID | arxiv |
SourceType | Open Access Repository |
SubjectTerms | Computer Science - Learning |
Title | Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data |
URI | https://arxiv.org/abs/1605.06955 |
hasFullText | 1 |
inHoldings | 1 |
linkProvider | Cornell University |