Cybercosm: New Foundations for a Converged Science Data Ecosystem

Main Authors Asch, Mark; Bodin, François; Beck, Micah; Moore, Terry; Taufer, Michela; Swany, Martin; Vilotte, Jean-Pierre
Format Journal Article
Language English
Published 22.05.2021
Abstract Scientific communities naturally tend to organize around data ecosystems created by the combination of their observational devices, their data repositories, and the workflows essential to carry their research from observation to discovery. However, these legacy data ecosystems are now breaking down under the pressure of the exponential growth in the volume and velocity of these workflows, which are further complicated by the need to integrate the highly data-intensive methods of the Artificial Intelligence revolution. Enabling groundbreaking science that makes full use of this new, data-saturated research environment will require distributed systems that support dramatically improved resource sharing, workflow portability and composability, and data ecosystem convergence. The Cybercosm vision presented in this white paper describes a radically different approach to the architecture of distributed systems for data-intensive science and its application workflows. As opposed to traditional models that restrict interoperability by hiving off storage, networking, and computing resources in separate technology silos, Cybercosm defines a minimally sufficient hypervisor as a spanning layer for its data plane that virtualizes and converges the local resources of the system's nodes in a fully interoperable manner. By building on a common, universal interface into which the problems that infect today's data-intensive workflows can be decomposed and attacked, Cybercosm aims to support scalable, portable, and composable workflows that span and merge the distributed data ecosystems that characterize leading-edge research communities today.
Copyright http://creativecommons.org/licenses/by-sa/4.0
DOI 10.48550/arxiv.2105.10680
DOI Open Access true
Open Access true
Peer Reviewed false
Scholarly false
OpenAccessLink https://arxiv.org/abs/2105.10680
SecondaryResourceType preprint
SourceID arxiv
SourceType Open Access Repository
Subjects Computer Science - Distributed, Parallel, and Cluster Computing; Computer Science - Networking and Internet Architecture