Causal Inference on Multivariate and Mixed-Type Data

Bibliographic Details
Published in Machine Learning and Knowledge Discovery in Databases, Vol. 11052, pp. 655-671
Main Authors Marx, Alexander; Vreeken, Jilles
Format Book Chapter
Language English
Published Switzerland Springer International Publishing AG 2019
Springer International Publishing
Series Lecture Notes in Computer Science
ISBN 3030109275
9783030109271
ISSN 0302-9743
1611-3349
DOI 10.1007/978-3-030-10928-8_39

Abstract How can we discover whether X causes Y, or vice versa, that Y causes X, when we are only given a sample over their joint distribution? How can we do this such that X and Y can be univariate, multivariate, or of different cardinalities? And, how can we do so regardless of whether X and Y are of the same, or of different data type, be it discrete, numeric, or mixed? These are exactly the questions we answer. We take an information theoretic approach, based on the Minimum Description Length principle, from which it follows that first describing the data over cause and then that of effect given cause is shorter than the reverse direction. Simply put, if Y can be explained more succinctly by a set of classification or regression trees conditioned on X, than in the opposite direction, we conclude that X causes Y. Empirical evaluation on a wide range of data shows that our method, Crack, infers the correct causal direction reliably and with high accuracy on a wide range of settings, outperforming the state of the art by a wide margin. Code related to this paper is available at: http://eda.mmci.uni-saarland.de/crack.
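Illustration (not part of the catalogued record): the comparison described in the abstract, "first describing the data over cause and then that of effect given cause", can be sketched for two numeric variables roughly as follows. This is only a toy under stated assumptions. It swaps in cubic polynomial regression with a Gaussian residual code for the classification and regression trees used by Crack, approximates code lengths by differential entropy, and uses a BIC-style parameter cost. All function names are hypothetical, and this is not the authors' implementation, which is available at the URL given in the abstract.

```python
import math
import random

def standardize(values):
    """Rescale to zero mean and unit variance (assumes the values are not constant)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest order first, via normal equations."""
    size = degree + 1
    # Augmented matrix [A | b] with A[i][j] = sum x^(i+j) and b[i] = sum y * x^i.
    rows = [[sum(x ** (i + j) for x in xs) for j in range(size)]
            + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(size)]
    for col in range(size):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, size), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(size):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][size] / rows[i][i] for i in range(size)]

def gaussian_bits(residuals):
    """Approximate cost in bits of zero-mean residuals under a fitted Gaussian.

    Based on differential entropy, so only differences between costs are meaningful.
    """
    n = len(residuals)
    variance = max(sum(r * r for r in residuals) / n, 1e-12)  # guard against log(0)
    return 0.5 * n * math.log2(2 * math.pi * math.e * variance)

def conditional_bits(cause, effect, degree=3):
    """Cost of effect given cause: parameter cost plus residual cost."""
    coeffs = fit_polynomial(cause, effect, degree)
    residuals = [e - sum(c * x ** k for k, c in enumerate(coeffs))
                 for x, e in zip(cause, effect)]
    model_bits = 0.5 * (degree + 1) * math.log2(len(cause))  # BIC-style assumption
    return model_bits + gaussian_bits(residuals)

def infer_direction(x, y, degree=3):
    """Compare L(X) + L(Y|X) against L(Y) + L(X|Y); the shorter description wins."""
    x, y = standardize(x), standardize(y)
    # After standardizing, the two marginal costs coincide; they are kept
    # to mirror the cause-then-effect structure of the description.
    x_to_y = gaussian_bits(x) + conditional_bits(x, y, degree)
    y_to_x = gaussian_bits(y) + conditional_bits(y, x, degree)
    if x_to_y < y_to_x:
        return "X->Y", x_to_y, y_to_x
    if y_to_x < x_to_y:
        return "Y->X", x_to_y, y_to_x
    return "undecided", x_to_y, y_to_x

# Synthetic data in which X is the cause by construction.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(2000)]
ys = [x ** 3 + random.gauss(0, 0.05) for x in xs]
direction, cost_xy, cost_yx = infer_direction(xs, ys)
print(direction, round(cost_xy), round(cost_yx))
```

The sketch only shows the shape of the decision rule. A purely linear Gaussian model would give identical costs in both directions, which is why a nonlinear fit is used here.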
Author Marx, Alexander (amarx@mpi-inf.mpg.de)
Vreeken, Jilles
ContentType Book Chapter
Copyright Springer Nature Switzerland AG 2019
DEWEY 6.3
Discipline Computer Science
EISBN 9783030109288
3030109283
EISSN 1611-3349
Editor Berlingerio, Michele
Ifrim, Georgiana
Hurley, Neil
Gärtner, Thomas
Bonchi, Francesco
EndPage 671
IsPeerReviewed true
IsScholarly true
LCCallNum Q334-342
Notes Electronic supplementary material: The online version of this chapter (https://doi.org/10.1007/978-3-030-10928-8_39) contains supplementary material, which is available to authorized users.
OCLC 1084467560
PageCount 17
PublicationDate 2019
PublicationPlace Switzerland
Cham
PublicationSeriesSubtitle Lecture Notes in Artificial Intelligence
PublicationSeriesTitle Lecture Notes in Computer Science
PublicationSeriesTitleAlternate Lect. Notes Computer
PublicationSubtitle European Conference, ECML PKDD 2018, Dublin, Ireland, September 10-14, 2018, Proceedings, Part II
Publisher Springer International Publishing AG
Springer International Publishing
RelatedPersons Kleinberg, Jon M.
Mattern, Friedemann
Naor, Moni
Mitchell, John C.
Terzopoulos, Demetri
Steffen, Bernhard
Pandu Rangan, C.
Kanade, Takeo
Kittler, Josef
Hutchison, David
Tygar, Doug
StartPage 655