RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation
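Main Authors | Xuanwang Zhang, Yunze Song, Yidong Wang, Shuyun Tang, Xinfeng Li, Zhengran Zeng, Zhen Wu, Wei Ye, Wenyuan Xu, Yue Zhang, Xinyu Dai, Shikun Zhang, Qingsong Wen |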
---|---|
Format | Journal Article |
Language | English |
Published | 21.08.2024 |
Subjects | Computer Science - Computation and Language |
Online Access | https://arxiv.org/abs/2408.11381 |
Abstract | Large Language Models (LLMs) demonstrate human-level capabilities in
dialogue, reasoning, and knowledge retention. However, even the most advanced
LLMs still hallucinate and cannot update their knowledge in real time.
Current research addresses this bottleneck by equipping LLMs with
external knowledge, a technique known as Retrieval-Augmented Generation (RAG).
However, two key issues have constrained the development of RAG. First,
comprehensive and fair comparisons between novel RAG algorithms are
increasingly lacking. Second, open-source tools such as LlamaIndex and LangChain employ
high-level abstractions, which reduce transparency and limit the
ability to develop novel algorithms and evaluation metrics. To close this gap,
we introduce RAGLAB, a modular and research-oriented open-source library.
RAGLAB reproduces 6 existing algorithms and provides a comprehensive ecosystem
for investigating RAG algorithms. Leveraging RAGLAB, we conduct a fair
comparison of 6 RAG algorithms across 10 benchmarks. With RAGLAB, researchers
can efficiently compare the performance of various algorithms and develop novel
algorithms. |
---|---|
Author | Wang, Yidong; Ye, Wei; Wu, Zhen; Wen, Qingsong; Xu, Wenyuan; Zhang, Shikun; Zhang, Yue; Zhang, Xuanwang; Li, Xinfeng; Tang, Shuyun; Dai, Xinyu; Song, Yunze; Zeng, Zhengran |
Author_xml | – sequence: 1 givenname: Xuanwang surname: Zhang fullname: Zhang, Xuanwang – sequence: 2 givenname: Yunze surname: Song fullname: Song, Yunze – sequence: 3 givenname: Yidong surname: Wang fullname: Wang, Yidong – sequence: 4 givenname: Shuyun surname: Tang fullname: Tang, Shuyun – sequence: 5 givenname: Xinfeng surname: Li fullname: Li, Xinfeng – sequence: 6 givenname: Zhengran surname: Zeng fullname: Zeng, Zhengran – sequence: 7 givenname: Zhen surname: Wu fullname: Wu, Zhen – sequence: 8 givenname: Wei surname: Ye fullname: Ye, Wei – sequence: 9 givenname: Wenyuan surname: Xu fullname: Xu, Wenyuan – sequence: 10 givenname: Yue surname: Zhang fullname: Zhang, Yue – sequence: 11 givenname: Xinyu surname: Dai fullname: Dai, Xinyu – sequence: 12 givenname: Shikun surname: Zhang fullname: Zhang, Shikun – sequence: 13 givenname: Qingsong surname: Wen fullname: Wen, Qingsong |
BackLink | https://doi.org/10.48550/arXiv.2408.11381$$DView paper in arXiv |
ContentType | Journal Article |
Copyright | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
Copyright_xml | – notice: http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
DBID | AKY GOX |
DOI | 10.48550/arxiv.2408.11381 |
DatabaseName | arXiv Computer Science arXiv.org |
DatabaseTitleList | |
Database_xml | – sequence: 1 dbid: GOX name: arXiv.org url: http://arxiv.org/find sourceTypes: Open Access Repository |
DeliveryMethod | fulltext_linktorsrc |
ExternalDocumentID | 2408_11381 |
GroupedDBID | AKY GOX |
ID | FETCH-arxiv_primary_2408_113813 |
IEDL.DBID | GOX |
IngestDate | Wed Sep 11 12:28:31 EDT 2024 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-arxiv_primary_2408_113813 |
OpenAccessLink | https://arxiv.org/abs/2408.11381 |
ParticipantIDs | arxiv_primary_2408_11381 |
PublicationCentury | 2000 |
PublicationDate | 2024-08-21 |
PublicationDateYYYYMMDD | 2024-08-21 |
PublicationDate_xml | – month: 08 year: 2024 text: 2024-08-21 day: 21 |
PublicationDecade | 2020 |
PublicationYear | 2024 |
Score | 3.8574615 |
SecondaryResourceType | preprint |
SourceID | arxiv |
SourceType | Open Access Repository |
SubjectTerms | Computer Science - Computation and Language |
Title | RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation |
URI | https://arxiv.org/abs/2408.11381 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link.rule.ids | 228,230,786,891 |
linkProvider | Cornell University |