Multi-step Inference over Unstructured Data


Bibliographic Details
Published in: arXiv.org
Main Authors: Kalyanpur, Aditya; Saravanakumar, Kailash Karthik; Barres, Victor; McFate, CJ; Moon, Lori; Seifu, Nati; Eremeev, Maksim; Barrera, Jose; Bautista-Castillo, Abraham; Brown, Eric; Ferrucci, David
Format: Paper
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 24.07.2024

Summary: The advent of Large Language Models (LLMs) and Generative AI has revolutionized natural language applications across various domains. However, high-stakes decision-making tasks in fields such as medicine, law, and finance require a level of precision, comprehensiveness, and logical consistency that pure LLM or Retrieval-Augmented Generation (RAG) approaches often fail to deliver. At Elemental Cognition (EC), we have developed a neuro-symbolic AI platform to tackle these problems. The platform integrates fine-tuned LLMs for knowledge extraction and alignment with a robust symbolic reasoning engine for logical inference, planning, and interactive constraint solving. We describe Cora, a Collaborative Research Assistant built on this platform, that is designed to perform complex research and discovery tasks in high-stakes domains. This paper discusses the multi-step inference challenges inherent in such domains, critiques the limitations of existing LLM-based methods, and demonstrates how Cora's neuro-symbolic approach effectively addresses these issues. We provide an overview of the system architecture, key algorithms for knowledge extraction and formal reasoning, and present preliminary evaluation results that highlight Cora's superior performance compared to well-known LLM and RAG baselines.
ISSN:2331-8422
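The summary describes a two-stage neuro-symbolic pattern: an LLM extracts structured knowledge from text, and a symbolic engine chains inferences over it. The following is a minimal sketch of that general pattern only, not Cora's actual architecture or API; `extract_facts`, the triple format, and the single hard-coded rule are all hypothetical stand-ins (a real system would call a fine-tuned extraction model rather than return canned triples).

```python
# Sketch of a generic neuro-symbolic multi-step inference loop.
# Everything here is illustrative; no names are taken from the paper.

def extract_facts(text: str) -> set:
    """Stand-in for LLM-based knowledge extraction.
    Returns (subject, relation, object) triples; hard-coded for the demo."""
    return {("aspirin", "inhibits", "cox1"),
            ("cox1", "produces", "thromboxane")}

# Symbolic layer: a Horn-style rule applied by forward chaining.
# If X inhibits Y and Y produces Z, conclude X reduces Z.
def _inhibition_rule(f1, f2):
    if f1[1] == "inhibits" and f2[1] == "produces" and f1[2] == f2[0]:
        return (f1[0], "reduces", f2[2])
    return None

RULES = [_inhibition_rule]

def forward_chain(facts: set) -> set:
    """Apply all rules to all fact pairs until no new facts appear (fixpoint)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in RULES:
            for f1 in list(facts):
                for f2 in list(facts):
                    derived = rule(f1, f2)
                    if derived is not None and derived not in facts:
                        facts.add(derived)
                        changed = True
    return facts

derived = forward_chain(extract_facts("some source document"))
print(("aspirin", "reduces", "thromboxane") in derived)  # True
```

The key property this illustrates is multi-step soundness: the conclusion is reached by an explicit, auditable chain of rule applications over extracted facts, rather than by a single opaque generation step, which is the contrast the abstract draws against pure LLM or RAG baselines.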