Ranking Creative Language Characteristics in Small Data Scenarios
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 23.10.2020 |
Subjects | |
Summary: | The ability to rank creative natural language provides an important general tool for downstream language understanding and generation. However, current deep ranking models require substantial amounts of labeled data that are difficult and expensive to obtain for different domains, languages and creative characteristics. A recent neural approach, the DirectRanker, promises to reduce the amount of training data needed, but its application to text is not fully explored. We therefore adapt the DirectRanker to provide a new deep model for ranking creative language with small data. We compare DirectRanker with a Bayesian approach, Gaussian process preference learning (GPPL), which has previously been shown to work well with sparse data. Our experiments with sparse training data show that while the performance of standard neural ranking approaches collapses with small training datasets, DirectRanker remains effective. We find that combining DirectRanker with GPPL increases performance across different settings by leveraging the complementary benefits of both models. Our combined approach outperforms the previous state-of-the-art on humor and metaphor novelty tasks, increasing Spearman's $\rho$ by 14% and 16% on average. |
---|---|
DOI: | 10.48550/arxiv.2010.12613 |
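
The record itself does not describe the architecture, but the DirectRanker mentioned in the summary is a pairwise approach: two items pass through a shared feature network, and an antisymmetric output decides which of the two should rank higher. The sketch below is only a minimal illustration of that idea under stated assumptions, not the paper's implementation; the use of PyTorch, the class name `PairwiseRanker`, the layer sizes, and the dense input vectors standing in for text representations are all assumptions of this example.

```python
import torch
import torch.nn as nn

class PairwiseRanker(nn.Module):
    """Minimal sketch of a DirectRanker-style pairwise ranking model.

    Both inputs share one feature network; the output neuron acts on the
    difference of their features with no bias term, so o(a, b) = -o(b, a)
    and the sign of the output encodes which item is preferred.
    """

    def __init__(self, input_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )
        # Antisymmetric output: no bias, odd activation (tanh).
        self.out = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.out(self.features(a) - self.features(b)))

# Illustrative usage with random vectors standing in for text features.
model = PairwiseRanker(input_dim=300)
a, b = torch.randn(4, 300), torch.randn(4, 300)
preference = model(a, b)  # > 0 means a ranked above b, < 0 the reverse
```

The antisymmetry (swapping the two inputs flips the sign of the output) is the structural property that distinguishes this design from scoring each item independently and is what the sketch is meant to make concrete.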