WIBA 2.0 is released!

WIBA: What Is Being Argued?

Advanced argument detection and analysis tool for researchers, educators, and analysts.

Installation

Important: Before using WIBA, you'll need to create an account to get your API token.

Once you have your API token, install the WIBA client using pip:

pip install wiba

Initialize the client with your API token:

from wiba import WIBA

# Get your API token from the Account tab after registration
analyzer = WIBA(api_token="your_api_token_here")

Quick Start

Note: Make sure you have created an account and obtained your API token before starting.

Here's a simple example to get started with WIBA using comprehensive analysis:

from wiba import WIBA

# Initialize client
analyzer = WIBA(api_token="your_api_token_here")

# Example text
text = "Climate change is real because global temperatures are rising."

# Comprehensive analysis - get all fields in one call
result = analyzer.comprehensive(text)
print(f"Is Argument: {result.is_argument}")
print(f"Confidence: {result.confidence}")
print(f"Claims: {result.claims}")
print(f"Premises: {result.premises}")
print(f"Topic: {result.topic_fine}")
print(f"Stance: {result.stance_fine}")

Comprehensive Analysis (Recommended)

New in v0.2.0: The comprehensive() method provides complete argument analysis in a single call, powered by our unified Qwen3-4B model. This is now the recommended approach for all analysis tasks.

Get full analysis including detection, claims, premises, topics, stance, and argument type:

# Single text analysis
result = analyzer.comprehensive("Climate change requires immediate action to prevent damage.")

print(f"Is Argument: {result.is_argument}")      # True/False
print(f"Confidence: {result.confidence}")         # 0.0-1.0
print(f"Claims: {result.claims}")                 # List of claim strings
print(f"Premises: {result.premises}")             # List of premise strings
print(f"Topic (fine): {result.topic_fine}")       # Specific topic
print(f"Topic (broad): {result.topic_broad}")     # Broad policy area
print(f"Stance (fine): {result.stance_fine}")     # Favor/Against/NoArgument
print(f"Stance (broad): {result.stance_broad}")   # Favor/Against/NoArgument
print(f"Argument Type: {result.argument_type}")   # Deductive/Inductive/etc.
print(f"Argument Scheme: {result.argument_scheme}")

# Batch processing
texts = [
    "Climate change requires immediate action.",
    "The sky is blue today."
]
results = analyzer.comprehensive(texts)
for r in results:
    print(f"Text: {r.text}")
    print(f"Is Argument: {r.is_argument}, Stance: {r.stance_fine}")

# DataFrame processing
import pandas as pd
df = pd.DataFrame({'text': texts})
results_df = analyzer.comprehensive(df, text_column='text')
print(results_df[['text', 'is_argument', 'confidence', 'topic_fine', 'stance_fine']])

Output Fields Reference

The comprehensive() method returns a ComprehensiveResult object with the following fields:

Field | Type | Description | Example Values
is_argument | boolean | Whether the text contains an argument (claim + premise) | True, False
confidence | float | Confidence score for the analysis (0.0-1.0) | 0.95
claims | List[str] | Extracted claim statements from the text | ["Climate action is needed"]
premises | List[str] | Supporting premises/evidence for the claims | ["temperatures are rising"]
topic_fine | string | Specific topic being argued about | "Climate action policy"
topic_broad | string | Broader policy category | "Environment"
stance_fine | string | Position on the fine-grained topic | "Favor", "Against", "NoArgument"
stance_broad | string | Position on the broad topic | "Favor", "Against", "NoArgument"
argument_type | string | Type of reasoning used | "Deductive", "Inductive", "Abductive", "Analogical", "Fallacious"
argument_scheme | string | Classical argument pattern identified | "If [action] then [result]"

Non-Argument Responses

When is_argument=False, fields are set to default values:

  • claims and premises: empty arrays []
  • topic_fine and topic_broad: "NoTopic"
  • stance_fine and stance_broad: "NoArgument"
  • argument_type: "NoArgument"
  • argument_scheme: "none_detected"
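
The defaults above can be checked downstream with a small helper. The following is a minimal sketch, not part of the wiba package: `ResultStub` is a hypothetical stand-in that only mirrors the documented field names, and `has_non_argument_defaults` is an illustrative helper name.

```python
from dataclasses import dataclass, field
from typing import List

# Default values documented above for non-argument responses.
NON_ARGUMENT_DEFAULTS = {
    "topic_fine": "NoTopic",
    "topic_broad": "NoTopic",
    "stance_fine": "NoArgument",
    "stance_broad": "NoArgument",
    "argument_type": "NoArgument",
    "argument_scheme": "none_detected",
}


@dataclass
class ResultStub:
    """Hypothetical stand-in that mirrors the documented field names."""
    is_argument: bool = False
    claims: List[str] = field(default_factory=list)
    premises: List[str] = field(default_factory=list)
    topic_fine: str = "NoTopic"
    topic_broad: str = "NoTopic"
    stance_fine: str = "NoArgument"
    stance_broad: str = "NoArgument"
    argument_type: str = "NoArgument"
    argument_scheme: str = "none_detected"


def has_non_argument_defaults(result) -> bool:
    """Return True if the result carries every documented non-argument default."""
    if result.is_argument or result.claims or result.premises:
        return False
    return all(getattr(result, name) == value
               for name, value in NON_ARGUMENT_DEFAULTS.items())


plain = ResultStub()
argued = ResultStub(is_argument=True, claims=["Action is needed"],
                    premises=["temperatures are rising"],
                    topic_fine="Climate action policy")
print(has_non_argument_defaults(plain), has_non_argument_defaults(argued))
```

Because the helper only reads attributes by name, it should work on any object exposing the fields listed in the reference table.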

Argument Types Explained

  • Deductive: Conclusion follows necessarily from premises
  • Inductive: Conclusion is probably true based on evidence
  • Abductive: Best explanation for observed facts
  • Analogical: Reasoning by comparison to similar cases
  • Fallacious: Contains logical fallacy or flawed reasoning

Detect Arguments (Deprecated)

Deprecated: The detect() method is deprecated. Use comprehensive() instead for better results and more information. See the Comprehensive Analysis section.

The detect() method identifies whether a text contains an argument:

# Single text
result = analyzer.detect("Climate change is real because temperatures are rising.")
print(result.argument_prediction)  # "Argument" or "NoArgument"
print(result.confidence_score)     # Confidence score between 0 and 1

# Multiple texts
texts = [
    "Climate change is real because temperatures are rising.",
    "This is just a simple statement without any argument."
]
results = analyzer.detect(texts)
for r in results:
    print(f"Text: {r.text}")
    print(f"Prediction: {r.argument_prediction}")

# Using DataFrame
import pandas as pd
df = pd.DataFrame({'text': texts})
results_df = analyzer.detect(df, text_column='text')

Extract Topics (Deprecated)

Deprecated: The extract() method is deprecated. Use comprehensive(), which returns topic_fine and topic_broad. See the Comprehensive Analysis section.

The extract() method identifies the main topic being argued about:

# Single text
result = analyzer.extract("Climate change is a serious issue because it affects our environment.")
print(result.topics)  # List of extracted topics

# Multiple texts
texts = [
    "Climate change is a serious issue because it affects our environment.",
    "We need better healthcare because current systems are inadequate."
]
results = analyzer.extract(texts)
for r in results:
    print(f"Text: {r.text}")
    print(f"Topics: {r.topics}")

# Using DataFrame
import pandas as pd
df = pd.DataFrame({'text': texts})
results_df = analyzer.extract(df, text_column='text')

Analyze Stance (Deprecated)

Deprecated: The stance() method is deprecated. Use comprehensive(), which returns stance_fine and stance_broad. See the Comprehensive Analysis section.

The stance() method determines the stance towards a specific topic:

# Single text
text = "We must take action on climate change because the evidence is overwhelming."
topic = "climate change"
result = analyzer.stance(text, topic)
print(f"Stance: {result.stance}")  # "Favor", "Against", or "NoArgument"

# Multiple texts
texts = [
    "We must take action on climate change because the evidence is overwhelming.",
    "Climate change policies will harm the economy and cost jobs."
]
topics = ["climate change", "climate change"]
results = analyzer.stance(texts, topics)
for r in results:
    print(f"Text: {r.text}")
    print(f"Topic: {r.topic}")
    print(f"Stance: {r.stance}")

# Using DataFrame
import pandas as pd
df = pd.DataFrame({
    'text': texts,
    'topic': topics
})
results_df = analyzer.stance(df, text_column='text', topic_column='topic')

Discover Arguments

New in v0.2.1: The discover_arguments() method now uses sliding windows with comprehensive analysis. It returns full analysis results (topics, stance, claims, premises, argument type/scheme) for each discovered segment.

How It Works

  1. Splits text into sentences
  2. Creates overlapping windows of sentences (based on window_size and step_size)
  3. Evaluates each window with comprehensive analysis
  4. Selects the highest-confidence non-overlapping argument segments
  5. Fills gaps with "NoArgument" segments
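
The five steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the client's implementation: the sentence splitter is a naive regex, the selection is a greedy pass over windows sorted by confidence, `end_index` is treated as exclusive, and `toy_score` is a hypothetical stand-in for the comprehensive model's confidence.

```python
import re
from typing import Callable, Dict, List, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def make_windows(n_sentences: int, window_size: int, step_size: int) -> List[Tuple[int, int]]:
    """Return (start, end) sentence index pairs, with end exclusive."""
    if n_sentences == 0:
        return []
    last_start = max(n_sentences - window_size, 0)
    return [(s, min(s + window_size, n_sentences))
            for s in range(0, last_start + 1, step_size)]


def select_segments(text: str, score: Callable[[str], float], window_size: int = 2,
                    step_size: int = 1, threshold: float = 0.5) -> List[Dict[str, object]]:
    sentences = split_sentences(text)
    scored = [(s, e, score(" ".join(sentences[s:e])))
              for s, e in make_windows(len(sentences), window_size, step_size)]

    # Greedy selection: take windows from highest to lowest confidence,
    # skipping any window that overlaps one already chosen.
    chosen: List[Tuple[int, int, float]] = []
    for s, e, conf in sorted(scored, key=lambda w: w[2], reverse=True):
        if conf >= threshold and all(e <= cs or s >= ce for cs, ce, _ in chosen):
            chosen.append((s, e, conf))
    chosen.sort()

    def segment(s: int, e: int, is_arg: bool, conf) -> Dict[str, object]:
        return {"text_segment": " ".join(sentences[s:e]), "start_index": s,
                "end_index": e, "is_argument": is_arg, "argument_confidence": conf}

    # Fill the uncovered sentence ranges with non-argument segments.
    segments, cursor = [], 0
    for s, e, conf in chosen:
        if cursor < s:
            segments.append(segment(cursor, s, False, None))
        segments.append(segment(s, e, True, conf))
        cursor = e
    if cursor < len(sentences):
        segments.append(segment(cursor, len(sentences), False, None))
    return segments


def toy_score(window_text: str) -> float:
    """Hypothetical scorer used only to make the example deterministic."""
    if "because" not in window_text:
        return 0.2
    return 0.9 if "Delay" in window_text else 0.6


example = ("The sky is blue. We should act now because costs are rising. "
           "Delay makes repairs more expensive. It rained yesterday.")
segments = select_segments(example, toy_score, window_size=2, step_size=1)
for seg in segments:
    print(seg["start_index"], seg["end_index"], seg["is_argument"])
```

With `step_size` larger than 1, some trailing sentences may never fall inside a window; in this sketch they end up in a gap segment.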

The discover_arguments() method finds argumentative segments in longer texts:

# Single text
text = """Climate change is a serious issue. Global temperatures are rising at an
unprecedented rate. This is causing extreme weather events. However, some argue
that natural climate cycles are responsible."""

results_df = analyzer.discover_arguments(
    text,
    window_size=2,  # Number of sentences per window
    step_size=1     # Number of sentences to move window
)

# Each row contains full comprehensive analysis
for _, row in results_df.iterrows():
    print(f"Segment: {row['text_segment']}")
    print(f"Is Argument: {row['is_argument']}")
    print(f"Confidence: {row['argument_confidence']}")
    if row['is_argument']:
        print(f"Topic: {row['topic_fine']} ({row['topic_broad']})")
        print(f"Stance: {row['stance_fine']}")
        print(f"Claims: {row['claims']}")
        print(f"Premises: {row['premises']}")
        print(f"Argument Type: {row['argument_type']}")
        print(f"Argument Scheme: {row['argument_scheme']}")
    print("---")

# Using DataFrame for batch processing
import pandas as pd
text2 = "Remote work should be expanded. It cuts commuting time for employees."
df = pd.DataFrame({'text': [text, text2]})
results_df = analyzer.discover_arguments(
    df,
    text_column='text',
    window_size=3,
    step_size=1
)

Output Columns

Column | Description
id | Segment identifier
text_segment | The extracted segment text
start_index, end_index | Sentence indices in original text
is_argument | Boolean: does the segment contain an argument?
argument_confidence | Confidence score (0-1)
claims, premises | Extracted argument components
topic_fine, topic_broad | Identified topics
stance_fine, stance_broad | Stance on topics
argument_type | Deductive/Inductive/etc.
argument_scheme | Argumentation scheme
overlapping_segments | IDs of overlapping windows not selected

Batch Processing

Native Batching: All WIBA methods handle batching internally. Simply pass a DataFrame or list of texts - no manual batching required.

The example below runs comprehensive analysis over a CSV file, saves the results, and then runs argument discovery on the same DataFrame:

# Process large datasets with comprehensive analysis
import pandas as pd

# Read data
df = pd.read_csv('texts.csv')

# Comprehensive analysis - batching is handled automatically
results_df = analyzer.comprehensive(
    df,
    text_column='text',
    batch_size=100,    # Number of texts per API batch
    show_progress=True # Show progress bar
)

# Results include all fields: is_argument, confidence, claims, premises,
# topic_fine, topic_broad, stance_fine, stance_broad, argument_type, argument_scheme

# Save results
results_df.to_csv('results.csv', index=False)

# For discovering arguments in longer texts
results_df = analyzer.discover_arguments(
    df,
    text_column='text',
    window_size=3,
    step_size=1,
    batch_size=5,      # Texts to process concurrently
    show_progress=True
)

Batch Processing Parameters

Parameter | Default | Description
batch_size | 100 | Number of texts per API request (comprehensive)
show_progress | True | Display progress bar during processing
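
To make the meaning of batch_size concrete, here is a minimal sketch of how a list of texts splits into request-sized chunks. The client does this internally, so the helper below (`iter_batches`, a hypothetical name) is only an illustration and not the library's code.

```python
from typing import Iterator, List, Sequence


def iter_batches(texts: Sequence[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start:start + batch_size])


texts = [f"text {i}" for i in range(250)]
batch_lengths = [len(batch) for batch in iter_batches(texts, batch_size=100)]
print(batch_lengths)
```

With 250 texts and the default batch_size of 100, this produces three chunks, the last one partial.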

Error Handling

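The client raises ValidationError for invalid input (for example a missing text column, an empty DataFrame, or a missing topic) and WIBAError for API failures:

import pandas as pd
from wiba import ValidationError, WIBAError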

try:
    # Missing required column
    bad_df = pd.DataFrame({'wrong_column': ['test']})
    analyzer.detect(bad_df)
except ValidationError as e:
    print(f"Validation error: {str(e)}")

try:
    # Empty DataFrame
    empty_df = pd.DataFrame({'text': []})
    analyzer.detect(empty_df)
except ValidationError as e:
    print(f"Validation error: {str(e)}")

try:
    # Invalid stance input
    analyzer.stance("test text", None)
except ValidationError as e:
    print(f"Validation error: {str(e)}")

# Handle API errors
try:
    result = analyzer.detect("some text")
except WIBAError as e:
    print(f"API error: {str(e)}")

Performance

Last Updated: 12/31/2025

Note: The comprehensive() method is the recommended approach for all argument analysis. It provides complete analysis in a single call with optimized performance.

Model Speeds

Model | Status | Speed
WIBA-Comprehensive | Recommended | 2.11 it/s
WIBA-Discover | | 0.45s / 100 chars
WIBA-Detect | Deprecated | Use comprehensive()
WIBA-Extract | Deprecated | Use comprehensive()
WIBA-Stance | Deprecated | Use comprehensive()

WIBA-Discover speed varies based on document length, window size, and number of argument segments found.

Model Performance

Model | Status | Metric | Notes
WIBA-Comprehensive | Recommended | Unified model | Qwen3-4B, updated 12/31/25; provides detection, extraction, stance, and argument type in one call
WIBA-Detect | Deprecated | F1: 82.23% | Legacy model; use comprehensive() instead
WIBA-Extract | Deprecated | F1: 73.3% | Legacy model; use comprehensive() instead
WIBA-Stance | Deprecated | F1: 71.26% | Legacy model; use comprehensive() instead

v0.2.0 Released: The new unified comprehensive() method provides complete argument analysis in a single API call, replacing the need for separate detect, extract, and stance calls.

Argument Mining Resources

Research Paper

WIBA: What Is Being Argued? A Comprehensive Approach to Argument Mining

Arman Irani, Ju Yeon Park, Kevin Esterling, Michalis Faloutsos (2024)

A novel framework and suite of methods that enable the comprehensive understanding of "What Is Being Argued" across contexts. The approach develops a comprehensive framework that detects: (a) the existence, (b) the topic, and (c) the stance of an argument, achieving F1 scores of 79-86% for argument detection, 71% similarity for topic identification, and 71-78% for stance classification across diverse benchmark datasets.

Research Paper

Terminal Veracity: How Russian Propaganda Uses Telegram to Manufacture 'Objectivity' on the Battlefield

Mark W. Perry, Arman Irani (2023)

This article investigates over 130,000 Telegram messages, 15,000 Telegram forwards, and 750 news articles from Russian-affiliated media to assess the information supply chain between Russian media and Telegram channels covering the war in Ukraine. Using machine-learning techniques, this research provides a framework for conducting argument and network analysis for disambiguating narratives, channels, and users, and mapping dissemination pathways of influence operations. The findings indicate that a central feature of Russian war reporting is actually the prevalence of neutral, non-argumentative language. Moreover, dissemination patterns between media sites and Telegram channels reveal a well-cited information laundering network with a distinct supply chain of covert, semi-covert, and overt channel types active at seed, copy, and amplification levels of operation.

Journal of Information Warfare

Research Paper

ArguSense: Argument-Centric Analysis of Online Discourse

Arman Irani, Michalis Faloutsos, Kevin Esterling (2024)

A comprehensive framework for analyzing arguments in online forums, featuring unsupervised topic detection, argument visualization, and content quantification through similarity and clustering algorithms. The study demonstrates its effectiveness through analysis of GMO-related discussions across Reddit communities.

Research Paper

Overview of DialAM-2024: Argument Mining in Natural Language Dialogues

Ramon Ruiz-Dolz, John Lawrence, Ella Schad, Chris Reed (2024)

First shared task in dialogical argument mining, exploring the integration of argumentative relations and speech illocutions in a unified framework. The study presents results from six teams working on identifying propositional and illocutionary relations in argument maps.

Research Paper

Detecting Argumentative Fallacies in the Wild

Ramon Ruiz-Dolz, John Lawrence (2023)

A groundbreaking analysis of the limitations of data-driven approaches in real-world argument mining scenarios. The study introduces a validation corpus for natural language argumentation schemes and provides crucial insights for deploying argument mining systems in practical applications.

Upcoming Conference

The 12th Workshop on Argument Mining (ArgMining 2025)

July 31st or August 1st, 2025 | Vienna, Austria

A premier workshop co-located with ACL 2025, focusing on computational linguistics and argument mining. The workshop aims to broaden its scope by incorporating perspectives from social science, psychology, and humanities while creating synergies between argument mining and natural language reasoning.

Dataset

DialAM-2024 Dataset

Ruiz-Dolz et al. (2024)

A comprehensive dataset for dialogical argument mining, featuring annotated natural language dialogues with both argumentative relations and speech illocutions. Perfect for developing and evaluating dialogue-based argument mining systems.

Dataset

UKP Sentential Argument Mining Corpus

UKP Lab, TU Darmstadt

A large-scale argument mining corpus containing over 25,000 annotated arguments from heterogeneous sources. Perfect for training and evaluating argument mining systems.

Tutorial

Getting Started with WIBA

WIBA Team

A comprehensive guide to using WIBA for argument analysis. Learn how to analyze texts, visualize arguments, and interpret results using our web-based platform.

WIBA: A Unified Framework for Argument Detection, Topic Extraction, and Stance Classification with vLLM-Compatible Model Deployment

Arman Irani
University of California, Riverside

Abstract

This paper presents WIBA 2.0 (What Is Being Argued), a comprehensive framework for training and deploying language models specialized in argumentation mining. We describe the complete pipeline from dataset organization through DSPy-based reasoning generation, supervised fine-tuning with LoRA adapters, optional reinforcement learning refinement via GRPO, and final model merging for vLLM deployment. The framework draws on formal argumentation theory from multiple academic sources, including Walton's argumentation schemes, Toulmin's warrant model, and Pollock's account of defeasible reasoning. Our approach produces vLLM-compatible models capable of hierarchical argument analysis with structured JSON outputs enforced through guided decoding.

Keywords: Argumentation Mining, Large Language Models, DSPy, LoRA Fine-tuning, vLLM, Stance Classification, Topic Extraction

1. Introduction

Argumentation mining, the automatic identification and analysis of argumentative structures in text, represents a critical capability for applications ranging from fact-checking to deliberative democracy platforms. This technical report documents the complete methodology for creating the WIBA argument analysis model, a vLLM-compatible system that performs five interrelated tasks:

  1. Argument Detection: Binary classification determining whether text contains an argument (claim + premises) or not
  2. Topic Extraction: Hierarchical identification of fine-grained and broad topics being argued
  3. Stance Classification: Determining the position (Favor/Against) taken toward identified topics
  4. Argument Scheme Classification: Determining the argumentation scheme a text uses to put forward a perspective
  5. Argument Type Classification: Determining the type of argument that is being made

The distinguishing features of our approach include:

  • A multi-stage DSPy pipeline grounded in formal argumentation theory for human-in-the-loop data augmentation and labeling
  • Hierarchical topic and stance modeling (fine and broad granularity)
  • Optimized deployment via vLLM with guided JSON decoding

2. Theoretical Foundations

2.1 Formal Argumentation Framework

Our approach synthesizes several theoretical frameworks from the argumentation literature:

Walton's Argumentation Schemes (Walton, Reed & Macagno, 2008): We implement 11 classical argument scheme patterns including:

  • argument_from_authority: Citing expert testimony
  • argument_from_analogy: Reasoning from similarity
  • causal_argument: Claiming causal connections
  • argument_from_consequences: Arguing based on outcomes
  • practical_reasoning: Means-end reasoning
  • moral_argument: Value-based reasoning

Toulmin Model (Toulmin, 2003): Our system extracts and reconstructs warrants—the general rules licensing inferences from premises to conclusions. Warrants may be explicit or implicit, requiring reconstruction when not stated.

Defeasible Reasoning (Pollock, 1987): We analyze defeasible justification structures, identifying epistemic hedges, modal qualifiers, and potential defeaters (undercutting and rebutting) as additional argumentation indicators.

2.2 Task Definitions

We follow Mohammad et al. (2016) for stance detection conventions and define the tasks as follows:

Definition 1 (Argument): A text contains an argument if and only if it includes at least one claim (conclusion/assertion) AND at least one premise (evidence/reasoning supporting the claim).

Definition 2 (Hierarchical Topics):

  • topic_fine: The specific issue being argued (e.g., "vaccine mandates")
  • topic_broad: The broader policy domain (e.g., "Healthcare")

Definition 3 (Hierarchical Stance):

  • stance_fine: Position on the specific topic (Favor | Against | NoArgument)
  • stance_broad: Position on the broader policy area (may differ from stance_fine)

3. Dataset Organization

3.1 Training Data Sources

The primary training data derives from multiple annotated corpora:

UKP Lima Training Data: Primary source containing sentence-level annotations with columns for sentence, annotation, topic (broad), wiba_topics (fine), and wiba_stance. Binary label mapping: "NoArgument" preserved, all others mapped to "Argument".
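
The binary label mapping described above can be sketched as follows. The column names `sentence` and `annotation` come from the description; the non-"NoArgument" annotation values in the sample rows are illustrative assumptions, and `to_binary_label` is a hypothetical helper name.

```python
from typing import Dict, List


def to_binary_label(annotation: str) -> str:
    """Keep 'NoArgument' as is; map every other annotation to 'Argument'."""
    return "NoArgument" if annotation == "NoArgument" else "Argument"


# Illustrative rows; the annotation values other than 'NoArgument' are assumed.
rows: List[Dict[str, str]] = [
    {"sentence": "Vaccines save lives because they prevent outbreaks.",
     "annotation": "Argument_for"},
    {"sentence": "Mandates erode trust, so they backfire.",
     "annotation": "Argument_against"},
    {"sentence": "The clinic opens at nine.",
     "annotation": "NoArgument"},
]

labels = [to_binary_label(row["annotation"]) for row in rows]
print(labels)
```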

IBM ArgQual Dataset: Used for validation, filtered for test=True. Provides additional NoArgument examples.

3.2 Hierarchical Annotation Structure

Each training example contains:

{
    'text': str,           # Input text
    'label': str,          # 'Argument' | 'NoArgument'
    'topic_fine': str,     # Specific topic (1-3 words)
    'topic_broad': str,    # Policy domain
    'stance_fine': str,    # 'Favor' | 'Against' | 'NoArgument'
    'stance_broad': str    # May differ from stance_fine
}

4. DSPy-Based Reasoning Generation

4.1 Architecture Overview

The system employs a "Separated Concerns Architecture" implementing a multi-stage DSPy pipeline. Each stage is defined as a DSPy Signature with typed input/output fields.

4.2 Stage 1: Structured Argument Analysis

The StructuredArgumentAnalysis signature performs initial decomposition:

Discourse-Level Analysis: arguer, epistemic_stance, speech_act

Claim Analysis: claim_text (JSON list), claim_type (factual | evaluative | policy | definitional | causal | comparative)

Premise Analysis: premises (JSON list), premise_types per Walton's taxonomy

Warrant Analysis: argument_scheme, warrant_explicit, warrant_reconstruction

4.3 Stage 2-3: Verification and Evaluation

The ArgumentConstructionVerification signature applies formal validity checks including premise-claim independence, inference validity, proof burden assessment, and defeater handling.

The FormalArgumentEvaluation signature produces final classifications with gate check logic: IF claim_text has ≥1 element AND premises has ≥1 element, then is_argument = "Argument".
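
The gate check can be written as a few lines of Python. This is a sketch under two assumptions: that claim_text and premises arrive as JSON-encoded lists (as described for Stage 1), and that the label is "NoArgument" when the gate fails, which the text implies but does not state. The function name `gate_check` is hypothetical.

```python
import json


def gate_check(claim_text: str, premises: str) -> str:
    """Apply the rule: at least one claim AND at least one premise."""
    claims = json.loads(claim_text)
    premise_list = json.loads(premises)
    if len(claims) >= 1 and len(premise_list) >= 1:
        return "Argument"
    # Assumed fallback label when either list is empty.
    return "NoArgument"


with_both = gate_check('["Climate change is real"]',
                       '["global temperatures are rising"]')
claim_only = gate_check('["The sky is blue today"]', '[]')
print(with_both, claim_only)
```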

4.4 Stage 4-7: Topic, Stance, and Synthesis

Subsequent stages handle topic extraction, stance reasoning chains, broad stance mapping, and final hierarchical synthesis with consistency validation.

5. Training Data Formatting

5.1 Schema Types

Three output schemas are supported (detect, comprehensive, and full). The detect and comprehensive schemas are shown here:

DETECT Schema (Minimal):

{"is_argument": true, "confidence": 0.95}

COMPREHENSIVE Schema (Core Fields):

{
    "is_argument": true,
    "claims": ["Claim text here"],
    "premises": ["Premise text here"],
    "topic_fine": "specific topic",
    "topic_broad": "Policy Domain",
    "stance_fine": "Favor",
    "stance_broad": "Favor",
    "argument_type": "Inductive",
    "argument_scheme": "argument_from_example",
    "confidence": 0.92
}

6. Fine-Tuning Pipeline

6.1 Base Model Selection

The framework supports multiple Qwen model variants including Qwen2.5-3B-Instruct (default), Qwen3-4B-Instruct-2507, and Qwen3-8B.

6.2 LoRA Configuration

Parameter-Efficient Fine-Tuning via Low-Rank Adaptation with r=32, lora_alpha=64, targeting attention layers (q_proj, k_proj, v_proj, o_proj) and FFN layers (gate_proj, up_proj, down_proj).
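
As a rough sketch, the settings above can be collected into keyword arguments. The report does not name the fine-tuning library; the key names below follow Hugging Face PEFT's `LoraConfig`, which is an assumption, as are the `task_type` value and the helper name `lora_kwargs`. The dropout value is left as a parameter because it depends on the training mode.

```python
from typing import Dict

# Values stated in the text above.
LORA_SETTINGS: Dict[str, object] = {
    "r": 32,
    "lora_alpha": 64,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention layers
        "gate_proj", "up_proj", "down_proj",      # FFN layers
    ],
}


def lora_kwargs(lora_dropout: float) -> Dict[str, object]:
    """Build keyword arguments, e.g. for peft.LoraConfig(**kwargs) (assumed)."""
    return {**LORA_SETTINGS, "lora_dropout": lora_dropout,
            "task_type": "CAUSAL_LM"}  # assumed task type for a causal LM


kwargs = lora_kwargs(lora_dropout=0.15)
# Standard LoRA scales the adapter update by lora_alpha / r.
scaling = LORA_SETTINGS["lora_alpha"] / LORA_SETTINGS["r"]
print(kwargs["target_modules"], scaling)
```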

6.3 Training Hyperparameters

Parameter | Reasoning Mode | Non-Reasoning Mode
Epochs | 7 | 3
Learning Rate | 1e-4 | 5e-5
LoRA Dropout | 0.25 | 0.15

7. Model Deployment

7.1 Merging Process

The merge_and_save_for_vllm() function prepares the model by loading the base model in FP16, merging LoRA adapters, and saving the merged model and tokenizer.
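
The three steps can be sketched with the Hugging Face transformers and peft libraries. The report does not show the function body, so everything below is an assumption: the parameter names, the choice of libraries, and the decision to load the tokenizer from the base model. It needs torch, transformers, and peft installed to actually run a merge.

```python
def merge_and_save_for_vllm(base_model_name: str, adapter_dir: str,
                            output_dir: str) -> None:
    """Load the base model in FP16, merge LoRA adapters, save model and tokenizer."""
    # Imported here so that defining the function does not require the
    # heavy dependencies to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # 1. Load the base model in half precision.
    base = AutoModelForCausalLM.from_pretrained(
        base_model_name, torch_dtype=torch.float16)

    # 2. Attach the trained adapters and fold them into the base weights.
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()

    # 3. Save the merged weights and the tokenizer side by side.
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model_name).save_pretrained(output_dir)
```

After merging, the output directory holds a plain checkpoint with no adapter files, which an inference server can load like any other model directory.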

7.2 vLLM Deployment

The --guided-decoding-backend outlines flag enables JSON schema enforcement at generation time, ensuring outputs conform to expected structure without post-processing failures.
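
For illustration, a JSON Schema for the comprehensive output might look like the sketch below. It is derived from the example in Section 5.1 and is not the schema shipped with WIBA; how the schema is handed to the server is not shown in this report and is not sketched here. The stdlib-only checker `missing_or_invalid` is a hypothetical helper that covers just required keys and stance values.

```python
import json
from typing import List

STANCES = ["Favor", "Against", "NoArgument"]
STRING_LIST = {"type": "array", "items": {"type": "string"}}

# Assumed schema, reconstructed from the COMPREHENSIVE example.
COMPREHENSIVE_SCHEMA = {
    "type": "object",
    "properties": {
        "is_argument": {"type": "boolean"},
        "claims": STRING_LIST,
        "premises": STRING_LIST,
        "topic_fine": {"type": "string"},
        "topic_broad": {"type": "string"},
        "stance_fine": {"type": "string", "enum": STANCES},
        "stance_broad": {"type": "string", "enum": STANCES},
        "argument_type": {"type": "string"},
        "argument_scheme": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0.0, "maximum": 1.0},
    },
    "required": ["is_argument", "claims", "premises", "topic_fine",
                 "topic_broad", "stance_fine", "stance_broad",
                 "argument_type", "argument_scheme", "confidence"],
    "additionalProperties": False,
}


def missing_or_invalid(output: dict) -> List[str]:
    """Return required keys that are absent plus stance keys with bad values."""
    problems = [k for k in COMPREHENSIVE_SCHEMA["required"] if k not in output]
    for key in ("stance_fine", "stance_broad"):
        if key in output and output[key] not in STANCES:
            problems.append(key)
    return problems


raw = json.dumps({
    "is_argument": True, "claims": ["Claim text here"],
    "premises": ["Premise text here"], "topic_fine": "specific topic",
    "topic_broad": "Policy Domain", "stance_fine": "Favor",
    "stance_broad": "Favor", "argument_type": "Inductive",
    "argument_scheme": "argument_from_example", "confidence": 0.92,
})
print(missing_or_invalid(json.loads(raw)))
```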

8. Evaluation Methodology

8.1 Topic Evaluation: BILUO Sequence Tagging

Topics are evaluated using sequence labeling methodology with span finding, token alignment, BILUO to BIO conversion, and token-level F1 calculation.
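
The last two steps, BILUO to BIO conversion and token-level F1, can be sketched as below. Span finding and token alignment are omitted. The tag label `TOPIC` and the F1 definition used here (a token counts as a true positive when gold and predicted tags match and are not "O") are assumptions, since the report does not spell them out.

```python
from typing import List


def biluo_to_bio(tags: List[str]) -> List[str]:
    """Map BILUO tags to BIO: U- becomes B-, L- becomes I-, others unchanged."""
    converted = []
    for tag in tags:
        if tag == "O":
            converted.append(tag)
            continue
        prefix, _, label = tag.partition("-")
        prefix = {"U": "B", "L": "I"}.get(prefix, prefix)
        converted.append(f"{prefix}-{label}")
    return converted


def token_f1(gold: List[str], pred: List[str]) -> float:
    """Token-level F1 over non-O tags (assumed definition)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    true_pos = sum(1 for g, p in zip(gold, pred) if g == p and g != "O")
    pred_pos = sum(1 for p in pred if p != "O")
    gold_pos = sum(1 for g in gold if g != "O")
    if true_pos == 0:
        return 0.0
    precision = true_pos / pred_pos
    recall = true_pos / gold_pos
    return 2 * precision * recall / (precision + recall)


gold_bio = biluo_to_bio(["O", "B-TOPIC", "L-TOPIC", "O", "U-TOPIC"])
pred_bio = biluo_to_bio(["O", "B-TOPIC", "L-TOPIC", "O", "O"])
f1 = token_f1(gold_bio, pred_bio)
print(gold_bio, round(f1, 2))
```

In this example the prediction finds two of the three gold topic tokens with no false positives, so precision is 1.0 and recall is 2/3.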

8.2 Task-Specific Metrics

Task | Primary Metric
DETECT | F1 (is_argument)
EXTRACT | BERTScore (topic similarity)
STANCE | F1 (stance classification)

9. Conclusion

This paper has presented the complete methodology for creating WIBA, a vLLM-compatible model for argument detection, topic extraction, and stance classification. Key contributions include:

  1. Theoretically-Grounded Architecture: Multi-stage DSPy pipeline implementing formal argumentation theory from Walton, Toulmin, Prakken, and Pollock.
  2. Hierarchical Analysis: Fine and broad granularity for both topics and stances, enabling nuanced argument understanding.
  3. Flexible Output Schemas: Three schema types (detect, comprehensive, full) supporting different deployment requirements.
  4. Optimized Training Pipeline: LoRA fine-tuning with auxiliary heads for clean gradients, optional GRPO refinement with asymmetric reward shaping.
  5. Production-Ready Deployment: Automated merging and vLLM deployment with guided JSON decoding for reliable structured outputs.

References

  • Christiano, P. F., et al. (2017). Deep reinforcement learning from human preferences. NeurIPS.
  • Copi, I. M., et al. (2016). Introduction to Logic (14th ed.).
  • Dung, P. M. (1995). On the acceptability of arguments. Artificial Intelligence.
  • Gordon, T. F., & Walton, D. (2009). Proof burdens and standards. Argumentation in AI and Law.
  • Mohammad, S., et al. (2016). SemEval-2016 Task 6: Detecting stance in tweets. SemEval.
  • Pollock, J. L. (1987). Defeasible reasoning. Cognitive Science.
  • Prakken, H. (2010). An abstract framework for argumentation. Argument & Computation.
  • Toulmin, S. E. (2003). The Uses of Argument (Updated ed.).
  • Walton, D., Reed, C., & Macagno, F. (2008). Argumentation Schemes.