scCross: efficient search for rare subpopulations across multiple single-cell samples

Bibliographic Details
Published in	Bioinformatics (Oxford, England) Vol. 40; no. 6
Main Authors	Gerniers, Alexander; Nijssen, Siegfried; Dupont, Pierre
Format	Journal Article
Language	English
Published	England: Oxford University Press; Oxford Publishing Limited (England), 03.06.2024
Subjects	Algorithms; Availability; Cluster Analysis; Clustering; Data integration; Genes; Heterogeneity; Humans; Lung cancer; Lung Neoplasms - genetics; Original Paper; Pancreatic cancer; Sequence Analysis, RNA - methods; Single-Cell Analysis - methods; Software; Subpopulations; Target detection
ISSN	1367-4803; 1367-4811 (electronic)
DOI	10.1093/bioinformatics/btae371

Abstract	Motivation: Identifying rare cell types is an important task for capturing the heterogeneity of single-cell data, such as scRNA-seq. The widespread availability of such data makes it possible to aggregate multiple samples, corresponding for example to different donors, into the same study. Yet such aggregated data is often subject to batch effects between samples. Clustering it therefore generally requires data integration methods, which can lead to overcorrection and make rare cells difficult to identify. We present scCross, a biclustering method that identifies rare subpopulations of cells present across multiple single-cell samples. It jointly identifies a group of cells with specific marker genes by relying on a global sum criterion, computed over the entire subpopulation of cells, rather than on pairwise comparisons between individual cells. This proves robust to the high variability of scRNA-seq data, in particular to batch effects.
Results: We show through several case studies that scCross can identify rare subpopulations across multiple samples without prior data integration. In particular, it identifies a cilium subpopulation with potential new ciliary genes from lung cancer cells, which typical alternatives do not detect. It also highlights rare subpopulations in human pancreas samples sequenced with different protocols, despite visible shifts in expression levels between batches. We further show that scCross outperforms typical alternatives at identifying a target rare cell type in a controlled experiment with artificially created batch effects. Together, these results show that scCross can efficiently identify rare cell subpopulations characterized by specific genes despite the presence of batch effects.
Availability and implementation: The R and Scala implementation of scCross is freely available on GitHub, at https://github.com/agerniers/scCross/. A snapshot of the code and the data underlying this article are available on Zenodo, at https://zenodo.org/doi/10.5281/zenodo.10471063.
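Illustrative note: the abstract contrasts a sum computed over a whole candidate subpopulation with pairwise comparisons between individual cells. The toy sketch below only illustrates that contrast. It is not the scCross objective or its search procedure, which this record does not specify. The function names, the centered toy matrix, and the choice of Euclidean distance are all assumptions made for illustration.

```python
# Illustrative toy only: this is NOT the scCross objective, just a contrast
# between a sum computed over a whole candidate group and pairwise comparisons.
from itertools import combinations
from math import dist


def global_sum_score(matrix, cells, genes):
    """Sum of all entries in the submatrix selected by `cells` x `genes`."""
    return sum(matrix[c][g] for c in cells for g in genes)


def mean_pairwise_distance(matrix, cells, genes):
    """Average Euclidean distance between every pair of selected cells."""
    pairs = list(combinations(cells, 2))
    total = sum(
        dist([matrix[a][g] for g in genes], [matrix[b][g] for g in genes])
        for a, b in pairs
    )
    return total / len(pairs)


# Hypothetical centered expression values: rows are cells, columns are genes.
# Cells 0-1 come from one sample, cells 2-3 from another with a level shift;
# cell 4 does not express the two candidate marker genes (negative values).
expression = [
    [2.0, 3.0, -1.0],
    [3.0, 2.0, -1.0],
    [5.0, 6.0, -1.0],
    [6.0, 5.0, -1.0],
    [-1.0, -1.0, 2.0],
]
markers = [0, 1]

# The group-level sum stays high for all four expressing cells, whatever
# their sample of origin, and drops when a non-expressing cell is added.
score_group = global_sum_score(expression, [0, 1, 2, 3], markers)
score_with_outsider = global_sum_score(expression, [0, 1, 2, 3, 4], markers)

# Pairwise distances, by contrast, are inflated by the shift between samples.
within_sample = mean_pairwise_distance(expression, [0, 1], markers)
across_samples = mean_pairwise_distance(expression, [0, 2], markers)
```

In this toy setting the four expressing cells contribute positively to the group sum even though the second sample is shifted, whereas the distance between cells from different samples is larger than the distance within a sample.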
Author	Nijssen, Siegfried; Gerniers, Alexander; Dupont, Pierre
Author_xml	1. Gerniers, Alexander (ORCID: 0000-0002-7968-6978; email: alexander.gerniers@uclouvain.be); 2. Nijssen, Siegfried; 3. Dupont, Pierre
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/38889273 (MEDLINE/PubMed record)
ContentType	Journal Article
Copyright	The Author(s) 2024. Published by Oxford University Press.
Discipline	Biology
EISSN	1367-4811
ExternalDocumentID	PMC11256925; 38889273; 10.1093/bioinformatics/btae371
Genre	Journal Article
GrantInformation_xml	Funder: UCLouvain, Belgium
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	6
Language	English
License	This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
ORCID	0000-0002-7968-6978
OpenAccessLink	https://dx.doi.org/10.1093/bioinformatics/btae371
PMID	38889273
PublicationCentury	2000
PublicationDate	2024-Jun-03
PublicationDateYYYYMMDD	2024-06-03
PublicationDecade	2020
PublicationPlace	Oxford, England
PublicationTitle	Bioinformatics (Oxford, England)
PublicationTitleAlternate	Bioinformatics
PublicationYear	2024
Publisher	Oxford University Press; Oxford Publishing Limited (England)
URI	https://www.ncbi.nlm.nih.gov/pubmed/38889273 https://www.proquest.com/docview/3124438917 https://www.proquest.com/docview/3070796827 https://pubmed.ncbi.nlm.nih.gov/PMC11256925
Volume	40