Searching for Meaning Rather Than Keywords and Returning Answers Rather Than Links

Bibliographic Details
Published in The code4lib journal no. 57
Main Author Kent Fitch
Format Journal Article
Language English
Published Code4Lib 01.08.2023
ISSN 1940-5758

Abstract Large language models (LLMs) have transformed the largest web search engines: for over ten years, public expectations of being able to search on meaning rather than just keywords have become increasingly realised. Expectations are now moving further: from a search query generating a list of "ten blue links" to producing an answer to a question, complete with citations. This article describes a proof-of-concept that applies the latest search technology to library collections by implementing a semantic search across a collection of 45,000 newspaper articles from the National Library of Australia's Trove repository, and using OpenAI's ChatGPT4 API to generate answers to questions on that collection that include source article citations. It also describes some techniques used to scale semantic search to a collection of 220 million articles.
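The two steps the abstract names, ranking articles by meaning and then asking a model for an answer that cites its sources, can be illustrated with a minimal sketch. This is not the article's implementation: the three-dimensional vectors are hand-made stand-ins for real embeddings, the article IDs and texts are hypothetical, and no model API is called; the code only ranks by cosine similarity and assembles a prompt with numbered sources.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical articles. In a real system "vec" would come from an
# embedding model; these toy vectors are assumptions for illustration.
ARTICLES = [
    {"id": "article-1", "text": "Floods damage the wheat harvest.",
     "vec": [0.9, 0.1, 0.0]},
    {"id": "article-2", "text": "Council opens a new tram line.",
     "vec": [0.0, 0.2, 0.9]},
    {"id": "article-3", "text": "Heavy rain ruins crops across the district.",
     "vec": [0.8, 0.3, 0.1]},
]


def search(query_vec, articles, k=2):
    """Return the k articles whose vectors are closest to the query vector."""
    ranked = sorted(articles,
                    key=lambda a: cosine(query_vec, a["vec"]),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, hits):
    """Assemble a prompt that asks for an answer citing numbered sources."""
    sources = "\n".join(
        f"[{i}] ({h['id']}) {h['text']}" for i, h in enumerate(hits, 1)
    )
    return (
        "Answer the question using only the sources below, "
        "citing them by number.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


# Toy query vector standing in for the embedding of the question.
query_vec = [0.9, 0.15, 0.0]
hits = search(query_vec, ARTICLES)
prompt = build_prompt("How did weather affect farming?", hits)
print(prompt)
```

The second and third articles share no keywords with the question, yet the weather-related one still ranks highly because ranking depends on vector proximity rather than term overlap. A brute-force scan like this suits a small collection; the abstract notes that scaling to hundreds of millions of articles requires further techniques.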
Discipline Library & Information Science
Open Access Link https://doaj.org/article/96ece397d5154bbda3dae56f1e1e6ea3