Big Data Techniques for Public Health: A Case Study

Bibliographic Details
Published in 2017 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), pp. 222-231
Main Authors Katsis, Yannis; Balac, Natasha; Chapman, Derek; Kapoor, Madhur; Block, Jessica; Griswold, William G.; Huang, Jeannie; Koulouris, Nikos; Menarini, Massimiliano; Nandigam, Viswanath; Ngo, Mandy; Ong, Kian Win; Papakonstantinou, Yannis; Smith, Besa; Zarifis, Konstantinos; Woolf, Steven; Patrick, Kevin
Format Conference Proceeding
Language English
Published IEEE 01.07.2017
Abstract Public health researchers increasingly recognize that to advance their field they must grapple with the availability of increasingly large (i.e., thousands of variables) traditional population-level datasets (e.g., electronic medical records), while at the same time integrating additional large datasets (e.g., data on genomics, the microbiome, environmental exposures, socioeconomic factors, and health behaviors). Leveraging these multiple forms of data might well provide unique and unexpected discoveries about the determinants of health and wellbeing. However, we are in the very early stages of advancing the techniques required to understand and analyze big population-level data for public health research. To address this problem, this paper describes how we propose that big data can be efficiently used for public health discoveries. We show that data analytics techniques traditionally employed in public health studies are not up to the task of the data we now have in hand. Instead we present techniques adapted from big data visualization and analytics approaches used in other domains that can be used to answer important public health questions utilizing these existing and new datasets. Our findings are based on an exploratory big data case study carried out in San Diego County, California where we analyzed thousands of variables related to health to gain interesting insights on the determinants of several health outcomes, including life expectancy and anxiety disorders. These findings provide a promising early indication that public health research will benefit from the larger set of activities in contemporary big data research.
Author affiliations
1. Katsis, Yannis (ikatsis@cs.ucsd.edu): Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
2. Balac, Natasha: Qualcomm Inst., Univ. of California, San Diego, La Jolla, CA, USA
3. Chapman, Derek: Center on Soc. & Health, Virginia Commonwealth Univ., Richmond, VA, USA
4. Kapoor, Madhur: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
5. Block, Jessica: Qualcomm Inst., Univ. of California, San Diego, La Jolla, CA, USA
6. Griswold, William G.: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
7. Huang, Jeannie: Dept. of Family Med. & Public Health, Univ. of California, San Diego, La Jolla, CA, USA
8. Koulouris, Nikos: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
9. Menarini, Massimiliano: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
10. Nandigam, Viswanath: San Diego Supercomput. Center, Univ. of California, San Diego, La Jolla, CA, USA
11. Ngo, Mandy: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
12. Ong, Kian Win: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
13. Papakonstantinou, Yannis: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
14. Smith, Besa: Dept. of Family Med. & Public Health, Univ. of California, San Diego, La Jolla, CA, USA
15. Zarifis, Konstantinos: Dept. of Comput. Sci. & Eng., Univ. of California, San Diego, La Jolla, CA, USA
16. Woolf, Steven: Center on Soc. & Health, Virginia Commonwealth Univ., Richmond, VA, USA
17. Patrick, Kevin: Dept. of Family Med. & Public Health, Univ. of California, San Diego, La Jolla, CA, USA
CODEN IEEPAD
DOI 10.1109/CHASE.2017.81
Discipline Statistics
Public Health
EISBN 9781509047222
1509047220
SubjectTerms Big Data
Data analysis
data exploration
Data visualization
Diseases
machine learning
public health
Public healthcare
Sociology
Statistics
URI https://ieeexplore.ieee.org/document/8010636