
Analyzing Discourse with Artificial Intelligence: What is Argumentation Mining?

What do political discourse and large language models have in common? They are fake... or they can be. The true common denominator is that both will make a statement without hesitating about it. In one case, the main objective is to persuade you into doing something: for example, a candidate conveying a political ideology or giving you a reason to vote for their party. In the other, the speaker simply can't differentiate whether what it is telling you is the truth or a randomly generated, yet confidently stated, lie: for example, ChatGPT telling you that a nonexistent case law actually exists [1].

Since before the "non-release" of GPT-2 [2], which in hindsight looks like little more than a marketing campaign, there has been a lengthy discussion about AI and its extreme potential for generating "fake news". After all, the Cambridge Analytica scandal [3] resulted in a fallout that, among other things, shook the core business model of Meta (Facebook at the time) and exposed the harsh reality of the fake news that plagued its flagship product [4].

Advances in AI have made it even harder to distinguish the nuances between human and machine-generated text [5], to the point where there has been active research to differentiate human from machine-generated data [6], a direct consequence of AI models requiring human-generated data for future versions [7].

However, there are other ways to deal with fake news, both human and machine-generated. One way is to analyze the discourse, understand whether what is being said makes sense, is logically consistent, and, most of all, is citing factual data rather than just using rhetoric for persuasion. This is not an area of AI but an area of natural intelligence. This is the science of argumentation: in order to conclude whether a claim being made is valid or not, we need to justify it by supporting or attacking it via premises (e.g., facts, citations, etc.) through logical reasoning.

And what does this study of human arguments have to do with Artificial Intelligence? Well, as you know, AI is a discipline that is not limited to the plagiarism of images and text [8]. There are other areas, perhaps not profitable enough, being studied in AI, mostly in Academia. The use of AI systems for detecting arguments is called Argumentation Mining.

In this article, we will explore what argumentation mining (also known as argument mining) is, along with its purpose and use cases. We will also review a little of the history of this fairly recent field within Artificial Intelligence.

Argumentation with a Chatbot. Source:

What is Argumentation Mining?

Within the vast area of Artificial Intelligence, there are different fields that, at some point or another, converge on common objectives. The two main fields that gave birth to what we currently call Argumentation Mining are Natural Language Processing and Knowledge Representation and Reasoning. In simple terms, Argumentation Mining lies at the intersection of these two fields: it deals with the automatic extraction of arguments, and of the relationships among them, from argumentative texts in natural language, with the final objective of providing machine-processable structured data for computational models of argumentation [9]. The text to be "mined" must, of course, be of an argumentative nature (e.g., debates, scientific articles, case law, essays, etc.), since an Argumentation Mining model cannot analyze other types of text. The field started to attract the attention of the research community around 2014 and is still in constant development. It is currently one of the most promising research areas in Academia for Artificial Intelligence. As a disclaimer, I am currently working on this subject for a postdoc project.

Argumentation Mining draws on several research areas of Artificial Intelligence. Natural Language Processing provides the methods to process the natural language text where the arguments take place, which is necessary to identify the arguments and their components (e.g., premises and claims). Argument identification is complemented by Machine Learning, which also serves to predict the relationships among the argument components (e.g., attack or support). Knowledge Representation and Reasoning contributes the capability to reason over the retrieved arguments and their relationships, which is essential to automatically identify fallacies and inconsistencies in the argumentative structure. Finally, Human-Computer Interaction guides the design of good digital tools that support argumentation.

Argumentation Mining is not the same as the well-known and closely related task of opinion mining (also called sentiment analysis). While opinion mining focuses on understanding what a user thinks about a specific topic or product, argument mining aims to analyze why a user holds that opinion.


The Process of Argumentation Mining

Although argumentation theory provides different approaches on how to represent argumentative structures, and there have been different approaches across this fresh area of research in AI, most of the current frameworks in Argumentation Mining revolve around two main stages for the process: Argument Extraction and Relationship Prediction. These tasks are usually studied in a complementary manner since the output of the first one is needed as input for the second one, especially if trying to work on a pipeline or an end-to-end architecture for the whole process of Argumentation Mining.

Argument Extraction 

The first stage is the identification of arguments within the input natural language text. This step may be further split into two substages: the detection of argument components (e.g., claims and premises) and the identification of their textual boundaries.

Although there are many ways to approach this stage, the current literature primarily frames it as a sequence labeling task very similar to named entity recognition, solved with transformer architectures [10] like BERT [11].
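To make the comparison with named entity recognition more concrete, here is a minimal sketch of what the output of such a model could look like and how it turns into argument components. It assumes a BIO tagging scheme (B- opens a component, I- continues it, O is non-argumentative text), which is a common convention for sequence labeling but not necessarily the one used in the cited works. The tag names, the `bio_to_spans` helper, and the hand-written tags are illustrative assumptions, not real model output.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (component_type, text) spans."""
    spans: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def close_span() -> None:
        # Store the span currently being built, if there is one.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new component starts: close the previous one first.
            close_span()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # The current component continues.
            current_tokens.append(token)
        else:
            # "O" (or an inconsistent I- tag) ends any open component.
            close_span()
            current_type, current_tokens = None, []
    close_span()
    return spans


# Hand-written tags standing in for the predictions of a token classifier.
tokens = ["That", "was", "a", "disaster", ".",
          "They", "lost", "plenty", "of", "money", "on", "that", "one", "."]
tags = ["B-Claim", "I-Claim", "I-Claim", "I-Claim", "O",
        "B-Premise"] + ["I-Premise"] * 7 + ["O"]

spans = bio_to_spans(tokens, tags)
for component_type, text in spans:
    print(f"{component_type}: {text}")
```

In a real system the tags would come from a fine-tuned token classification model; the decoding step that recovers the textual boundaries would look roughly like this.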

Relationship Prediction

The next stage, which requires the components extracted in the previous one, consists of predicting the relationships among the different components identified. This task is highly complex: on the one hand, it carries over any error that comes from the wrongful identification of argumentative components; on the other hand, it involves high-level knowledge representation and reasoning.

The relationships among arguments are of a heterogeneous nature. Although they are usually attack and support relationships between argumentative components, they can be extended to other, more complex or more granular, types of relationships. It is also quite common that two argumentative components do not hold any relationship at all.

Relationships are crucial when building the argumentative graph, in which each relationship corresponds to an edge, while each argumentative component corresponds to a node. The direct result of this stage can be used to find logical inconsistencies in the argument being analyzed.
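The graph described above can be sketched in a few lines. The following is only an illustration under several assumptions: the `Relation` class and the helper names are made up for this article, the component texts are borrowed from the debate excerpt discussed later on, and the check implemented here (one component predicted to both support and attack the same target) is just one very simple notion of inconsistency, chosen because it is easy to show. The two conflicting relations for "p2" are invented to trigger the check.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Relation:
    source: str  # id of the component the edge starts from
    target: str  # id of the component the edge points to
    label: str   # "support" or "attack"


def build_graph(components: Dict[str, str],
                relations: List[Relation]) -> Dict[str, List[Relation]]:
    """Return an adjacency list: component id -> outgoing relations."""
    graph: Dict[str, List[Relation]] = {cid: [] for cid in components}
    for rel in relations:
        if rel.source not in graph or rel.target not in graph:
            raise ValueError(f"unknown component in {rel}")
        graph[rel.source].append(rel)
    return graph


def conflicting_pairs(graph: Dict[str, List[Relation]]) -> List[Tuple[str, str]]:
    """Find (source, target) pairs linked by both a support and an attack."""
    conflicts: List[Tuple[str, str]] = []
    for source, rels in graph.items():
        labels_by_target: Dict[str, Set[str]] = {}
        for rel in rels:
            labels_by_target.setdefault(rel.target, set()).add(rel.label)
        for target, labels in labels_by_target.items():
            if {"support", "attack"} <= labels:
                conflicts.append((source, target))
    return conflicts


# Nodes: one claim and three premises.
components = {
    "c1": "That was a disaster",
    "p1": "They lost plenty of money on that one",
    "p2": "I'm a great believer in all forms of energy",
    "p3": "we're putting a lot of people out of work",
}
# Edges: the last two are hypothetical, contradictory predictions.
relations = [
    Relation("p1", "c1", "support"),
    Relation("p3", "c1", "support"),
    Relation("p2", "c1", "attack"),
    Relation("p2", "c1", "support"),
]

graph = build_graph(components, relations)
conflicts = conflicting_pairs(graph)
print(conflicts)
```

Pairs of components with no relationship simply have no edge between them, which matches the common case mentioned above.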

An Example of Argumentation Mining

To better illustrate the process of argumentation mining, let's look at an example from the political debate between Trump and Clinton in September 2016. The following is the transcription, "as is", of a point made by Trump. The statement between brackets "[]" is a claim, while the statements between parentheses "()" are premises:

She talks about solar panels. We invested in a solar company, our country. [That was a disaster]. (They lost plenty of money on that one). Now, look, (I'm a great believer in all forms of energy), but (we're putting a lot of people out of work).

In this case, the argument being made is that solar energy is not a good policy for the government of the United States. There are two premises that support the claim. The first one is: "They lost plenty of money on that one", which should be based on the experience that the company that made the solar panels lost money (that fact should be provable). The second one is the last premise: "We're putting a lot of people out of work", which is also stated as a fact. The remaining premise is more difficult to classify, as it's more of a personal opinion: "I'm a great believer in all forms of energy". It's also not clear whether it attacks the claim or has no relationship to it at all. Depending on the annotation guidelines, this premise could instead be classified as another claim (although in a minor role). This example reveals one of the main challenges of Argumentation Mining: the difficulty of annotating high-quality data.
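The bracket and parenthesis notation used in the excerpt is easy to turn into structured data. The sketch below does so with a regular expression. Note that this notation is only the ad hoc convention of this article, not a standard annotation format, and the `parse_annotated` helper is a hypothetical name introduced for illustration. It also assumes that annotations are never nested.

```python
import re
from typing import List, Tuple

# "[...]" marks a claim and "(...)" marks a premise (this article's notation).
ANNOTATION = re.compile(r"\[(?P<claim>[^\]]+)\]|\((?P<premise>[^)]+)\)")


def parse_annotated(text: str) -> List[Tuple[str, str]]:
    """Extract (component_type, text) pairs in order of appearance."""
    components: List[Tuple[str, str]] = []
    for match in ANNOTATION.finditer(text):
        if match.group("claim") is not None:
            components.append(("claim", match.group("claim").strip()))
        else:
            components.append(("premise", match.group("premise").strip()))
    return components


excerpt = (
    "She talks about solar panels. We invested in a solar company, our country. "
    "[That was a disaster]. (They lost plenty of money on that one). "
    "Now, look, (I'm a great believer in all forms of energy), "
    "but (we're putting a lot of people out of work)."
)

parsed = parse_annotated(excerpt)
for component_type, text in parsed:
    print(f"{component_type}: {text}")
```

The relationships (which premise supports or attacks the claim) are not encoded in this notation, so they would still have to be annotated separately.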

Applications of Argumentation Mining

Several different fields can benefit from Argumentation Mining. Here, we'll cover some of the most common ones derived from the work of Cabrio and Villata [9]:

  • The analysis of essays, especially the ones written on controversial topics, is a prototypical case where argumentation mining can be applied. These types of texts aim to explain a specific topic and attempt to persuade the reader about the writer's point of view and why it is the most logical or informed one. 

  • Scientific articles (for example, academic papers), where the authors try to prove (or disprove) a hypothesis by discussing the results of experiments carried out around that hypothesis, in relation to related work that contrasts with the authors' stance.

  • Within the scientific article domain, one very popular subdomain is argumentation mining on medical articles, where the analysis can be used to better explain the diagnosis of a particular disease (the claim) with the aid of specific symptoms (the premises).

  • In the realm of web content, the discussion sections of Wikipedia articles provide an attractive target to be analyzed using argument mining. Particularly, IBM has worked on the task of automatically detecting context-dependent claims from Wikipedia articles [12], i.e., a general, concise statement that directly supports or contests the given topic.

  • Other platforms that provide plenty of human-generated data for argumentation mining analysis are microblogging and web debating platforms like X (formerly Twitter) and Reddit, which has subreddits specifically designed for carrying out argumentative debates.

  • Online product reviews, although mainly targeted by opinion mining and sentiment analysis, can also be analyzed with argumentation mining techniques that seek the underlying motivation behind a consumer's review of a product. This helps to get a general idea of why a product is successful or not.

  • In the legal domain, argumentation mining can be used to detect argumentation schemes in court sentences, judgments, etc. The main objective of doing this analysis is to ease the work of judges and law scholars in identifying similarities and differences among different cases.

  • The domain of political or public debates and speeches, deliberative democracy, etc., also benefits from the use of argumentation mining-based analysis. The use of these tools can detect fallacies and contradictory arguments in political debates, for example, which can help democracy by giving citizens better tools to analyze the stance of candidates for elections [13].

Most of the applications of argumentation mining are centered around the idea of analyzing whether or not an argument makes logical sense. These are building blocks to other tasks such as fact-checking and misinformation detection (for example, the contradictory attacks of specific arguments can be used to detect fake news).

Challenges of Argumentation Mining

As a growing field within artificial intelligence, Argumentation Mining sees its objectives and definitions constantly updated. Since it is based on argumentation theory, it has to deal with challenges that arise from the social sciences, such as the ambiguity of the arguments to be analyzed, the limits of expressiveness in human languages, and, above all, the number of different argumentation frameworks, which sometimes vary between the disciplines that study argumentation theory. Although in recent years the argumentation mining research community has been steadily converging on a common framework, it is not necessarily the same framework used in related domains outside that community (for example, social sciences, linguistics, or philosophy).

Besides the framework to be used, which in Argumentation Mining is usually one of the simplest in order to ease the extraction of arguments and their relationships, there's also the crucial problem of the availability of high-quality data. There have been numerous works on developing annotation guidelines, and these differ a lot from domain to domain. For example, the guidelines for annotating a scientific article, in which argumentation relies more on verifying facts through data and experimentation, can be wildly different from the guidelines for annotating a political speech, which is mainly based on dialectics and sometimes rhetoric. Generally speaking, annotation is also carried out by domain experts (e.g., physicians, linguists, political scientists, etc.), who can identify the components and their relationships better than the people working on the machine learning side of argumentation mining (i.e., scientists from the exact sciences, such as computer science or mathematics). This reliance on human annotation could also explain why large companies are overlooking the field.

Final Thoughts

Argumentation Mining is an exciting field within the Artificial Intelligence domain that has been in constant development, especially during the past decade. Its objective is to understand argumentative discourse and provide an automatic analysis of it that can benefit different stakeholders in their respective tasks. It can be applied to tasks such as reviewing scientific literature, analyzing political debates, or countering the spread of misinformation. Although it lacks the resources that some other fields in AI have, it offers plenty of possibilities for the research community to explore.


[1] Neumeister, L. 2023. "Lawyers submitted bogus case law created by ChatGPT. A judge fined them $5,000". Associated Press.

[2] OpenAI. 2019. "Better language models and their implications".

[3] Confessore, N. 2018. "Cambridge Analytica and Facebook: The Scandal and the Fallout So Far". The New York Times.

[4] Thompson, N. Vogelstein, F. 2018. "Inside the Two Years That Shook Facebook—and the World". Wired.

[5] Casal, J. E., & Kessler, M. (2023). Can linguists distinguish between ChatGPT/AI and human writing?: A study of research ethics and academic publishing. Research Methods in Applied Linguistics, 2(3), 100068.

[6] Kirchner, J. Achmad, L. Aaronson, S. Leike, J. 2023. "New AI classifier for indicating AI-written text". OpenAI Blog.

[7] Nixey, P. @peternixey. Twitter. March 26, 2023.

[8] Pilth, A. 2023. "AI's Dreadful December: Lawsuits, plagiarism and child abuse images show the perils of training on data taken without consent". Tom's Hardware.

[9] Cabrio, E., & Villata, S. (2018, July). Five years of argument mining: A data-driven analysis. In IJCAI (Vol. 18, pp. 5427-5433).

[10] Mayer, T., Cabrio, E., & Villata, S. (2019, August). ACTA: A tool for argumentative clinical trial analysis. In IJCAI 2019-Twenty-Eighth International Joint Conference on Artificial Intelligence (pp. 6551-6553).

[11] Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.

[12] Levy, R., Bilu, Y., Hershcovich, D., Aharoni, E., & Slonim, N. (2014, August). Context dependent claim detection. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers (pp. 1489-1500).

[13] Goffredo, P., Cabrio, E., Villata, S., Haddadan, S., & Sanchez, J. T. (2023, June). DISPUTool 2.0: A modular architecture for multi-layer argumentative analysis of political debates. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 13, pp. 16431-16433).

