Building a predictive model from data in high dimensions with application to analysis of microarray experiments

Bibliographic Details
Published in 2nd International Conference on Dependability of Computer Systems (DepCoS-RELCOMEX '07), pp. 316-323
Main Authors Maciejewski, H., Konarski, L.
Format Conference Proceeding
Language English
Published IEEE 01.06.2007
Abstract This work presents a comparative study of methods for building predictive models from data in high-dimensional spaces, i.e., where the number of features describing the items to be classified is high compared with the number of items available to build the model and test its predictive performance. Applications of such methods may be quite diverse, ranging from data analysis in life sciences (e.g., analysis of data from experiments generating thousands of feature-numbers per tested case, such as microarray or RT-PCR techniques) to analysis of monitoring data from a complex, highly reliable technical system, where a relationship is sought between the monitoring data and the relatively infrequent occurrences of some event (such as a faulty or somewhat untypical state of the system). This latter case may be of special interest in early prediction of abnormal conditions in systems focused on dependability. The challenges of multidimensional data analysis and the generic methods are described in this paper using the very problem-specific language of life sciences, namely classification of samples based on gene expression profiles obtained using DNA microarrays. We concentrate on feature selection methods (which in this context are gene selection methods). We also propose a method to evaluate the performance of feature (gene) selection methods by looking at the predictive power of classifiers based on the selected features.
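The evaluation idea in the last sentence of the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the synthetic data, the t-like ranking score, the nearest-centroid classifier, and all function names below are assumptions chosen only to keep the example self-contained. The sketch ranks features on the training part of each cross-validation fold and then scores a classifier built on the selected features against the held-out part, so that selection never sees the test samples.

```python
import random
import statistics

def make_data(n_samples=40, n_features=200, n_informative=5, shift=2.0, seed=0):
    """Synthetic stand-in for expression data: few samples, many features (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):  # only the first features carry class signal
            row[j] += shift * label
        X.append(row)
        y.append(label)
    return X, y

def t_like_score(X, y, j):
    """Absolute difference of class means divided by the summed class std devs."""
    a = [row[j] for row, lab in zip(X, y) if lab == 0]
    b = [row[j] for row, lab in zip(X, y) if lab == 1]
    spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-9
    return abs(statistics.fmean(a) - statistics.fmean(b)) / spread

def select_features(X, y, k):
    """Return the indices of the k highest-scoring features."""
    scores = [(t_like_score(X, y, j), j) for j in range(len(X[0]))]
    return [j for _, j in sorted(scores, reverse=True)[:k]]

def nearest_centroid_predict(X_train, y_train, x, feats):
    """Assign x to the class whose centroid is closest on the selected features."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroid = [statistics.fmean(r[j] for r in rows) for j in feats]
        dist = sum((x[j] - c) ** 2 for j, c in zip(feats, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def cv_accuracy(X, y, k_features, n_folds=5):
    """Cross-validated accuracy with feature selection redone inside every fold."""
    n = len(X)
    correct = 0
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        feats = select_features(X_tr, y_tr, k_features)  # training data only
        for i in test_idx:
            if nearest_centroid_predict(X_tr, y_tr, X[i], feats) == y[i]:
                correct += 1
    return correct / n

if __name__ == "__main__":
    X, y = make_data()
    print("cross-validated accuracy with 5 selected features:", cv_accuracy(X, y, 5))
```

Comparing the value returned by cv_accuracy for different selection functions or different numbers of selected features gives one plausible way to rank selection methods by predictive power. Selecting features on the full data set before cross-validation would bias the estimate upward, which is why the sketch repeats selection in every fold.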
DOI 10.1109/DEPCOS-RELCOMEX.2007.13
ISBN 0769528503, 9780769528502
Subject Terms Condition monitoring, Data analysis, Diseases, DNA, Gene expression, Multidimensional systems, Predictive models, Risk analysis, Space technology, System testing
URI https://ieeexplore.ieee.org/document/4272925