DMOps: Data Management Operation and Recipes
Main Authors | , |
---|---|
Format | Journal Article |
Language | English |
Published | 02.01.2023 |
Summary: | Data-centric AI has shed light on the significance of data within the machine
learning (ML) pipeline. Recognizing its significance, academia, industry, and
government departments have suggested various NLP data research initiatives.
While the ability to utilize existing data is essential, the ability to build a
dataset has become more critical than ever, especially in the industry. In
consideration of this trend, we propose "Data Management Operations and
Recipes" to guide the industry in optimizing the building of datasets for NLP
products. This paper presents the concept of DMOps, which is derived from
real-world experiences with NLP data management and aims to streamline data
operations by offering a baseline. |
---|---|
DOI: | 10.48550/arxiv.2301.01228 |