Assessing the Answerability of Queries in Retrieval-Augmented Code Generation
Thanks to the unprecedented language understanding and generation capabilities of large language models (LLMs), Retrieval-augmented Code Generation (RaCG) has recently been widely adopted among software developers. While this has increased productivity, incorrect code is still frequently provided. In particular, there are cases where plausible yet incorrect code is generated for user queries that cannot be answered with the given API descriptions. This study proposes a task for evaluating answerability, which assesses whether a valid answer can be generated based on the user's query and the retrieved APIs in RaCG. Additionally, we build a benchmark dataset called Retrieval-augmented Code Generability Evaluation (RaCGEval) to evaluate the performance of models on this task. Experimental results show that this task remains very challenging, with baseline models exhibiting a low performance of 46.7%. Furthermore, this study discusses methods that could significantly improve performance.
Main Authors | Kim, Geonmin; Kim, Jaeyeon; Park, Hancheol; Shin, Wooksu; Kim, Tae-Ho |
---|---|
Format | Journal Article |
Language | English |
Published | 08.11.2024 |
Subjects | Computer Science - Computation and Language |
Online Access | https://arxiv.org/abs/2411.05547 |
Abstract | Thanks to the unprecedented language understanding and generation
capabilities of large language models (LLMs), Retrieval-augmented Code
Generation (RaCG) has recently been widely adopted among software developers.
While this has increased productivity, incorrect code is still frequently
provided. In particular, there are cases where plausible yet incorrect code
is generated for user queries that cannot be answered with the given API
descriptions. This study proposes a task for evaluating answerability, which
assesses whether a valid answer can be generated based on the user's query
and the retrieved APIs in RaCG. Additionally, we build a benchmark dataset
called Retrieval-augmented Code Generability Evaluation (RaCGEval) to
evaluate the performance of models on this task. Experimental results show
that this task remains very challenging, with baseline models exhibiting a
low performance of 46.7%. Furthermore, this study discusses methods that
could significantly improve performance. |
---|---|
Author | Kim, Tae-Ho; Kim, Jaeyeon; Kim, Geonmin; Park, Hancheol; Shin, Wooksu |
Author_xml | – sequence: 1 givenname: Geonmin surname: Kim fullname: Kim, Geonmin – sequence: 2 givenname: Jaeyeon surname: Kim fullname: Kim, Jaeyeon – sequence: 3 givenname: Hancheol surname: Park fullname: Park, Hancheol – sequence: 4 givenname: Wooksu surname: Shin fullname: Shin, Wooksu – sequence: 5 givenname: Tae-Ho surname: Kim fullname: Kim, Tae-Ho |
ContentType | Journal Article |
Copyright | http://creativecommons.org/licenses/by/4.0 |
Copyright_xml | – notice: http://creativecommons.org/licenses/by/4.0 |
DBID | AKY GOX |
DOI | 10.48550/arxiv.2411.05547 |
DatabaseName | arXiv Computer Science arXiv.org |
ExternalDocumentID | 2411_05547 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
OpenAccessLink | https://arxiv.org/abs/2411.05547 |
ParticipantIDs | arxiv_primary_2411_05547 |
PublicationCentury | 2000 |
PublicationDate | 2024-11-08 |
PublicationDateYYYYMMDD | 2024-11-08 |
PublicationDate_xml | – month: 11 year: 2024 text: 2024-11-08 day: 08 |
PublicationDecade | 2020 |
PublicationYear | 2024 |
SecondaryResourceType | preprint |
SourceID | arxiv |
SourceType | Open Access Repository |
SubjectTerms | Computer Science - Computation and Language |
Title | Assessing the Answerability of Queries in Retrieval-Augmented Code Generation |
URI | https://arxiv.org/abs/2411.05547 |
linkProvider | Cornell University |