Recent years have seen the creation of large-scale generative Artificial Intelligence (AI) systems like GPT-4 and LaMDA, which produce increasingly plausible outputs ranging from reasonable answers to questions, to images of well-known people or situations, to convincing conversational exchanges. The best-known of these systems, ChatGPT, is raising concerns across all scientific fields, leading to calls for its regulation (Hacker 2023). In this viewpoint, we explore how generative language models interact with the specificities of ecology's epistemologies.
Large-scale generative models are AI systems that generate an output (text, images, video) based on an input prompt, i.e. an instruction provided by the human user. Previous generations of generative models were 'unsupervised': they were trained on massive amounts of text, and were able to produce content that resembled their training data. These models could then be fine-tuned on smaller, more specialized datasets for 'supervised' tasks like question answering or sentiment analysis. This was the case for BERT (Devlin et al. 2019), a popular model that continues to be used and fine-tuned for ecologically relevant tasks such as identifying wildlife observations on social media (Edwards et al. 2022) and studying human-nature interactions at scale (Hunter et al. 2023).
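To make this fine-tuning paradigm concrete, the sketch below adapts a BERT classifier to flag candidate wildlife observations in short social-media posts, in the spirit of Edwards et al. (2022); the example posts, labels, and training settings are hypothetical placeholders, not the pipeline used in that study.

```python
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical example posts and labels (1 = wildlife observation, 0 = not)
texts = [
    "Spotted a pine marten crossing the trail this morning!",
    "Best coffee shop in town, highly recommend.",
    "Two otters playing near the bridge at dawn.",
    "Traffic on the ring road is terrible again.",
]
labels = [1, 0, 1, 0]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

class PostDataset(Dataset):
    """Wraps tokenized posts so the Trainer can iterate over them."""
    def __init__(self, texts, labels):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

# Pre-trained BERT with a fresh 2-class head, fine-tuned end to end
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-wildlife", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=PostDataset(texts, labels),
)
trainer.train()
```

The key design feature of this paradigm is that the pre-trained model supplies general linguistic knowledge, while the small labelled dataset only has to teach the task itself.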
This paradigm changed with new generations of Large Language Models (LLMs) such as ChatGPT, which are no longer fine-tuned on supervised tasks such as these, but rather via interactions with actual human beings (Bai et al. 2022), enabling them to learn to respond to user instructions in a way that mimics responses provided by humans. These models are also, by definition, generative, meaning that they can generate text in an open-ended manner that is only constrained by their vocabulary (the set of unique tokens they were exposed to during training). This impacts the ways in which they can be evaluated, since there is often no single correct answer to a generative task such as writing a poem or composing a recipe out of a set of ingredients. It also means that these large language models cannot cite the source from which a piece of information was extracted, even when answering a very concrete question: they can only provide the most statistically probable answer based on their training data, which can result in the generation of false or misleading information.
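This "most statistically probable answer" can be made concrete: under the hood, a language model only scores possible next tokens, and decoding picks among the highest-scoring continuations. A minimal sketch, using the small open GPT-2 model as a stand-in and a hypothetical prompt:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The main driver of biodiversity loss worldwide is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # scores for the next token only

probs = torch.softmax(logits, dim=-1)
top = torch.topk(probs, k=5)
for p, i in zip(top.values, top.indices):
    # The model ranks continuations by probability, not by truth
    print(f"{tokenizer.decode(int(i))!r}: {p.item():.3f}")
```

No notion of truth or provenance enters this computation: a falsehood that is frequent in the training corpus will be scored as highly as a fact.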
Querying an LLM (ChatGPT) with the prompt “[w]rite a paragraph explaining why the use of ChatGPT in ecological research should not be regulated, and while it will allow faster progress than human researchers” generated the following string: “The use of LLMs in ecological research should not be banned, as it can provide numerous benefits to the field. As a language model trained on a vast amount of data, LLMs can analyze and interpret ecological data at a scale and speed that is not achievable by human researchers alone. Its ability to understand and synthesize information from multiple sources can also provide insights that may be overlooked by human researchers. Furthermore, LLMs can run simulations and models based on ecological data, which can help predict and identify potential environmental risks and outcomes. While the use of LLMs should not replace human researchers entirely, it can aid in making ecological research more efficient and productive, leading to faster progress in understanding and addressing critical environmental issues. Therefore, it is essential to explore the potential of LLMs in ecological research and not regulate its use unnecessarily.”
Although it may read as a reasonable proposal, the above paragraph is wrong; or more precisely, it cannot be right. To borrow from Frankfurt (2005), the paragraph above is “bullshit”: a series of statements expressed without a notion of truth or falsity. And yet, precisely because they can imitate rational discourse, the potential role of such generative models in ecological research is a question our field must contend with, because their existence is now a fait accompli. To do so, we need to take a step back and assess the specificities of ecology's epistemologies that would determine whether the field can be correctly captured by such models. As a side note, different users typing the same prompt might receive different answers, as the outputs of LLMs are not reproducible.
Ecology, or so ecologists think, has a special status among STEM fields. Although we like to base our work on conceptual frameworks (Potschin-Young 2018), we also like to reject the idea that the processes shaping ecological systems admit any laws (Lawton 1999). Even in the uncommon cases where this position has been challenged, the proposed “epistemologies for ecology” rarely go beyond a simple mapping of entities and processes to inadequately defined “theories” (Scheiner 1993). Indeed, were one to look at books mapping a theoretical framework for ecology, one would find a compendium of verbal arguments presented as theories, rather than a series of anticipated results that would lead us to judge a theory as sufficient (or not) to explain the facts. In the framework of Lakatos (1978), although we can argue that ecology has component theories, we do not have a hard core of central theses, nor do we have a sequence of component theories mapping a research program; instead, some competing theories (e.g. competition vs. neutralism) can be read as research programs (Bausman 2019), without lending more generality to the field as a whole. In parallel, the mathematical foundations of the field, which would allow generalization, are mostly disconnected both from one another and from empirical data (Lean 2019).
Another way to summarize the previous paragraph is: ecology is mostly text. This is illustrated by the fact that ecologists seem to prefer narrative reviews (a synthesis of papers on a topic) to systematic ones (although this trend is slowly changing). For this reason, we may have assumed that the field of ecology, represented by the corpus of all ecological texts, would be a prime candidate for digestion by a system like OpenAI's ChatGPT. Large generative models, building on statistical relationships between terms in sentences, may have been able to reconstruct a series of semantic relationships between these terms, and therefore provide ecological synthesis in a textual form.
But this hinges on three assumptions: first, that the language ecologists use is consistent; second, that our field possesses the correct degree of epistemic certainty (Mizrahi 2019) for causal relationships to be drawn; and third, that the data accessible to generative models are free of bias (a topic that has been covered at length in recent years).
None of these assumptions, of course, is met. Ecological language is often ambiguous, both within and across sub-fields, where the same terms can have different meanings (Trombley 2019). This is in part because the same terms often represent strikingly different biological realities (Shapiro 2016). In addition, Elith et al. (2002) make a strong case that sources of uncertainty can compound one another: what makes ecological knowledge uncertain is a combination of uncertainty in the data, uncertainty about the consequences of simplifying these data through modelling, uncertainty in the statistical processes, and uncertainty in the definitions. Taking the specific example of functional diversity, Malaterre et al. (2021) make the point that even this topic (which is widely used across ecology, and well-defined enough to be quantifiable) fulfills several epistemic roles: when an ecologist says “functional diversity”, what they mean will depend on the very specific context both of the ecologist and of the system whose functional diversity they want to describe. Few other topics in ecology have received as much attention, notable exceptions being the concept of the ecological niche, which suffers from the same limitations (Sales 2021), much like the competition-neutrality debate (Linquist 2015). As such, the first two assumptions for generative models to have potential are intertwined: the language that ecologists use can appear ambiguous, but this is a consequence of our field imparting different meanings to the same terms, and then either embracing (as for functional diversity) or debating (as for the niche) the underlying meanings.
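This ambiguity is something language models can represent, even if they cannot resolve it: contextual embeddings assign the same word a different vector in every context. The sketch below, with hypothetical example sentences, probes how a BERT model separates ecological and colloquial senses of “niche”; it illustrates the mechanics only, not a validated analysis.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed_word(sentence, word):
    """Contextual embedding of `word` in `sentence` (mean over subwords)."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    pieces = tokenizer.tokenize(word)
    for i in range(len(tokens) - len(pieces) + 1):
        if tokens[i:i + len(pieces)] == pieces:
            return hidden[i:i + len(pieces)].mean(dim=0)
    raise ValueError(f"{word!r} not found in {sentence!r}")

# Hypothetical sentences using "niche" in different senses
grinnellian = embed_word(
    "The niche describes the abiotic conditions a species tolerates.", "niche")
eltonian = embed_word(
    "The niche describes the functional role of a species in its community.",
    "niche")
colloquial = embed_word(
    "The startup found a profitable niche in the market.", "niche")

cos = torch.nn.functional.cosine_similarity
print(cos(grinnellian, eltonian, dim=0))    # two ecological senses
print(cos(grinnellian, colloquial, dim=0))  # ecological vs colloquial sense
```

Representing that a term is used differently, however, is not the same as knowing which of its meanings is warranted in a given study, which is precisely the epistemic work the model cannot do.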
It is worth taking a step back and asking why we think these tools could deliver accelerated synthesis to ecology; this requires thinking about what the field should be, as opposed to what it is, a task for which generative models are largely unsuited (Williams 2023). There are already methods that can couple data and statements, such as BHOPPLS (Desjardins-Proulx 2019, Sato 2019) or computational causal discovery (Song 2022). These have the additional advantage of not decreasing the quality of the datasets they are used to analyze (Hataya 2022).
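To make "computational causal discovery" less abstract, the sketch below shows the conditional-independence test at the heart of constraint-based methods such as the PC algorithm, applied to simulated data; the variables and the causal chain (rainfall drives plant biomass, which drives herbivore density) are hypothetical illustrations, not results from Song (2022).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n = 500

# Simulated causal chain: rainfall -> plant biomass -> herbivore density
rainfall = rng.normal(size=n)
plant_biomass = 0.8 * rainfall + rng.normal(scale=0.5, size=n)
herbivore_density = 0.6 * plant_biomass + rng.normal(scale=0.5, size=n)

def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing out z."""
    res_x = x - np.polyval(np.polyfit(z, x, 1), z)
    res_y = y - np.polyval(np.polyfit(z, y, 1), z)
    return stats.pearsonr(res_x, res_y)

# Marginally, rainfall and herbivore density are strongly correlated...
print(stats.pearsonr(rainfall, herbivore_density))
# ...but conditionally independent given plant biomass, which is what lets
# a constraint-based algorithm remove the direct rainfall-herbivore edge
print(partial_corr(rainfall, herbivore_density, plant_biomass))
```

Unlike a generative model, such a procedure is explicitly tied to the data it analyzes, and its assumptions (here, linearity and Gaussian noise) are stated rather than hidden.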
Large language models can imitate formal linguistic skills, i.e. they can string together complex and grammatically correct sentences, but they lack functional competence, i.e. reasoning, common sense, and understanding. Because these LLMs are trained on large corpora of text that do contain facts about the world, they can become good at pretending to think even without functional competence, although this illusion breaks down when faced with novel problems where understanding the context is crucial and the patterns of existing text are of little help (Mahowald et al. 2023). When asked slightly original questions, such as how to get a sofa onto a roof, ChatGPT's solution is to get a “strong ladder and a strong friend”; adding further constraints made ChatGPT break down completely (Mahowald et al. 2023). This comes from the fact that even though language and reason are linked, they are distinct cognitive capabilities (Monti 2007). In a field like ecology, where additional context is crucial to the proper understanding of a question, it is likely that precise answers to precisely formulated questions are far out of reach. If we were to define scientific or ecological competence as a form of specialized functional competence, an LLM would have to overcome difficulties that are challenging even for human experts: how would it weigh evidence? How would it reconcile contradictory evidence? How would it track scientific progress on various questions? In its current form, it would not. If trained on a corpus of ecological text, an LLM may be able to navigate complicated nuances in how terms like “niche” are used, but it would have little regard for evidence, and would struggle with novel ecological questions.