Scientific Hypothesis Generation by a Large Language Model: Laboratory Validation in Breast Cancer Treatment
Main Authors | , , , , , , , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 20.05.2024 |
Summary: | Large language models (LLMs) have transformed AI and achieved breakthrough
performance on a wide range of tasks that require human intelligence. In
science, perhaps the most interesting application of LLMs is for hypothesis
formation. A feature of LLMs, which results from their probabilistic structure,
is that the output text is not necessarily a valid inference from the training
text. These are 'hallucinations', and are a serious problem in many
applications. However, in science, hallucinations may be useful: they are novel
hypotheses whose validity may be tested by laboratory experiments. Here we
experimentally test the use of LLMs as a source of scientific hypotheses using
the domain of breast cancer treatment. We applied the LLM GPT4 to hypothesize
novel pairs of FDA-approved non-cancer drugs that target the MCF7 breast cancer
cell line relative to the non-tumorigenic breast cell line MCF10A. In the first
round of laboratory experiments GPT4 succeeded in discovering three drug
combinations (out of 12 tested) with synergy scores above the positive
controls. These combinations were itraconazole + atenolol, disulfiram +
simvastatin and dipyridamole + mebendazole. GPT4 was then asked to generate new
combinations after considering its initial results. It then discovered three
more combinations with positive synergy scores (out of four tested):
disulfiram + fulvestrant, mebendazole + quinacrine and disulfiram + quinacrine.
A limitation of GPT4 as a generator of hypotheses was that its explanations for
them were formulaic and unconvincing. We conclude that LLMs are an exciting
novel source of scientific hypotheses. |
---|---|
DOI: | 10.48550/arxiv.2405.12258 |