Big Data Techniques for Public Health: A Case Study
Published in | 2017 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), pp. 222-231 |
---|---|
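Main Authors | Katsis, Yannis; Balac, Natasha; Chapman, Derek; Kapoor, Madhur; Block, Jessica; Griswold, William G.; Huang, Jeannie; Koulouris, Nikos; Menarini, Massimiliano; Nandigam, Viswanath; Ngo, Mandy; Ong, Kian Win; Papakonstantinou, Yannis; Smith, Besa; Zarifis, Konstantinos; Woolf, Steven; Patrick, Kevin |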
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.07.2017 |
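Subjects | Big Data; Data analysis; data exploration; Data visualization; Diseases; machine learning; public health; Public healthcare; Sociology; Statistics |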
Abstract | Public health researchers increasingly recognize that to advance their field they must grapple with the availability of increasingly large (i.e., thousands of variables) traditional population-level datasets (e.g., electronic medical records), while at the same time integrating additional large datasets (e.g., data on genomics, the microbiome, environmental exposures, socioeconomic factors, and health behaviors). Leveraging these multiple forms of data might well provide unique and unexpected discoveries about the determinants of health and wellbeing. However, we are in the very early stages of advancing the techniques required to understand and analyze big population-level data for public health research. To address this problem, this paper describes how we propose that big data can be efficiently used for public health discoveries. We show that data analytics techniques traditionally employed in public health studies are not up to the task of the data we now have in hand. Instead we present techniques adapted from big data visualization and analytics approaches used in other domains that can be used to answer important public health questions utilizing these existing and new datasets. Our findings are based on an exploratory big data case study carried out in San Diego County, California where we analyzed thousands of variables related to health to gain interesting insights on the determinants of several health outcomes, including life expectancy and anxiety disorders. These findings provide a promising early indication that public health research will benefit from the larger set of activities in contemporary big data research. |
---|---|
Author | Papakonstantinou, Yannis; Koulouris, Nikos; Nandigam, Viswanath; Zarifis, Konstantinos; Ngo, Mandy; Woolf, Steven; Smith, Besa; Chapman, Derek; Griswold, William G.; Balac, Natasha; Block, Jessica; Patrick, Kevin; Katsis, Yannis; Kapoor, Madhur; Menarini, Massimiliano; Huang, Jeannie; Ong, Kian Win |
Author_xml | – sequence: 1 givenname: Yannis surname: Katsis fullname: Katsis, Yannis email: ikatsis@cs.ucsd.edu organization: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 2 givenname: Natasha surname: Balac fullname: Balac, Natasha organization: Qualcomm Inst., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 3 givenname: Derek surname: Chapman fullname: Chapman, Derek organization: Center on Soc. & Health, Virginia Commonwealth Univ., Richmond, VA, USA – sequence: 4 givenname: Madhur surname: Kapoor fullname: Kapoor, Madhur organization: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 5 givenname: Jessica surname: Block fullname: Block, Jessica organization: Qualcomm Inst., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 6 givenname: William G. surname: Griswold fullname: Griswold, William G. organization: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 7 givenname: Jeannie surname: Huang fullname: Huang, Jeannie organization: Dept. of Family Med. & Public Health, Univ. of California, San Diego, La Jolla, CA, USA – sequence: 8 givenname: Nikos surname: Koulouris fullname: Koulouris, Nikos organization: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 9 givenname: Massimiliano surname: Menarini fullname: Menarini, Massimiliano organization: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 10 givenname: Viswanath surname: Nandigam fullname: Nandigam, Viswanath organization: San Diego Supercomput. Center, Univ. of California, San Diego, La Jolla, CA, USA – sequence: 11 givenname: Mandy surname: Ngo fullname: Ngo, Mandy organization: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 12 givenname: Kian Win surname: Ong fullname: Ong, Kian Win organization: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 13 givenname: Yannis surname: Papakonstantinou fullname: Papakonstantinou, Yannis organization: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 14 givenname: Besa surname: Smith fullname: Smith, Besa organization: Dept. of Family Med. & Public Health, Univ. of California, San Diego, La Jolla, CA, USA – sequence: 15 givenname: Konstantinos surname: Zarifis fullname: Zarifis, Konstantinos organization: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA – sequence: 16 givenname: Steven surname: Woolf fullname: Woolf, Steven organization: Center on Soc. & Health, Virginia Commonwealth Univ., Richmond, VA, USA – sequence: 17 givenname: Kevin surname: Patrick fullname: Patrick, Kevin organization: Dept. of Family Med. & Public Health, Univ. of California, San Diego, La Jolla, CA, USA |
CODEN | IEEPAD |
CitedBy_id | crossref_primary_10_1155_2021_6628739 crossref_primary_10_3389_fpubh_2020_00022 crossref_primary_10_3390_technologies6030074 crossref_primary_10_1080_03091902_2020_1769758 |
ContentType | Conference Proceeding |
DBID | 6IE 6IL CBEJK RIE RIL |
DOI | 10.1109/CHASE.2017.81 |
DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE/IET Electronic Library; IEEE Proceedings Order Plans (POP All) 1998-Present |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library Online url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/ sourceTypes: Publisher |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Statistics Public Health |
EISBN | 9781509047222 1509047220 |
EndPage | 231 |
ExternalDocumentID | 8010636 |
Genre | orig-research |
GroupedDBID | 6IE 6IL CBEJK RIE RIL |
ID | FETCH-LOGICAL-a287t-56a2276c03a57a3adc5b7a834cb4691d08d0d1cdc13c980e049bac8f7341cf713 |
IEDL.DBID | RIE |
IngestDate | Thu Jun 29 18:37:43 EDT 2023 |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-a287t-56a2276c03a57a3adc5b7a834cb4691d08d0d1cdc13c980e049bac8f7341cf713 |
PageCount | 10 |
ParticipantIDs | ieee_primary_8010636 |
PublicationCentury | 2000 |
PublicationDate | 2017-July |
PublicationDateYYYYMMDD | 2017-07-01 |
PublicationDate_xml | – month: 07 year: 2017 text: 2017-July |
PublicationDecade | 2010 |
PublicationTitle | 2017 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE) |
PublicationTitleAbbrev | CH |
PublicationYear | 2017 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
Score | 1.7470151 |
SourceID | ieee |
SourceType | Publisher |
StartPage | 222 |
SubjectTerms | Big Data; Data analysis; data exploration; Data visualization; Diseases; machine learning; public health; Public healthcare; Sociology; Statistics |
Title | Big Data Techniques for Public Health: A Case Study |
URI | https://ieeexplore.ieee.org/document/8010636 |
linkProvider | IEEE |