A professional extension of TextAttack for multi-label adversarial example generation, with a focus on toxicity classification. Generate adversarial examples that flip multiple labels simultaneously while preserving semantic meaning and grammatical correctness.
- 🏗️ Modular Architecture: Support for multiple models (Detoxify, custom HF models) and datasets
- ⚙️ CLI Interface: User-friendly command-line tools for all major operations
- 📊 Configuration-Driven: YAML configuration for flexible attack parameters
- 🔬 Professional Testing: Comprehensive test suite with coverage reporting
- 📚 Rich Examples: Jupyter notebooks and tutorials
- 🚀 Easy Installation: Pip-installable with proper dependency management
```bash
# Install from source
git clone https://github.com/QData/TextAttack-Multilabel
cd TextAttack-Multilabel
pip install -e .

# Install with development dependencies
pip install -e ".[dev]"

# Run setup script (optional, creates a conda environment)
python install_env.py
```

```bash
# Show help
textattack-multilabel --help
# Run a basic attack on benign samples
textattack-multilabel attack --attack benign
# Preprocess and analyze data
textattack-multilabel preprocess --data data.csv --analyze --sample benign
# Run test suite
textattack-multilabel test --coverage
```

```bash
# Download data
python example_toxic_adv_examples/download_data.py
# Run main attack script
python example_toxic_adv_examples/attack_multilabel_tae_main.py
# Run baseline example
python example_toxic_adv_examples/baseline_multiclass_toxic_adv_example_attack.py
# Run ACL23 example
python example_toxic_adv_examples/multilabel_acl2023.py
# Run tests
python test/run_tests.py --coverage
```

```
TextAttack-Multilabel/
├── .gitignore # Git ignore file
├── pyproject.toml # Package configuration
├── install_env.py # Environment setup script
├── LICENSE # License file
├── README.md # This file
├── example_toxic_adv_examples/ # Example scripts and configs
│   ├── attack_multilabel_tae_main.py # Main attack script
│   ├── baseline_multiclass_toxic_adv_example_attack.py # Baseline attack example
│   ├── download_data.py # Data download script
│   ├── multilabel_acl2023.py # ACL23 multilabel example
│   ├── config/ # Configuration files
│   │   └── toxic_adv_examples_config.yaml # Configuration for examples
│   └── __pycache__/ # Python cache
├── textattack_multilabel/ # Main package
│   ├── __init__.py # Package initialization
│   ├── attack_components.py # Attack components
│   ├── goal_function.py # Goal function implementations
│   ├── multilabel_model_wrapper.py # Model wrapper for multilabel
│   └── multilabel_target_attack_recipe.py # Multilabel attack recipes
└── test/ # Test suite
    ├── __init__.py
    ├── run_tests.py # Test runner
    ├── test_model_wrapper.py
    ├── test_multilabel_attack_recipes.py
    └── test_shared.py
```
The package uses YAML configuration files for flexible setup:
```yaml
# config/attack_config.yaml
defaults:
  model:
    type: "detoxify" # or "custom"
    variant: "original"
  dataset:
    name: "jigsaw_toxic_comments"
    sample_size: 500
  attack:
    recipe: "MultilabelACL23"
    labels_to_maximize: [] # maximize all toxic labels
    labels_to_minimize: [] # minimize currently toxic labels
    # ... more options
```

```python
import textattack

from textattack_multilabel import MultilabelModelWrapper, MultilabelACL23

# Load your model
model_wrapper = MultilabelModelWrapper(your_model, your_tokenizer, multilabel=True)
# Create attack recipe
attack = MultilabelACL23.build(
    model_wrapper=model_wrapper,
    labels_to_maximize=[0, 1, 2, 3, 4, 5], # maximize all toxic labels
    labels_to_minimize=[],
    wir_method="delete"
)
# Run attack
attacker = textattack.Attacker(attack, dataset)
results = attacker.attack_dataset()
```

See examples/getting_started.ipynb for a complete walkthrough including:
- Model setup and testing
- Creating attack recipes
- Running attacks on sample data
- Results analysis (see the sketch after this list)
- Configuration examples
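For the results-analysis step, a minimal sketch of inspecting the output of attacker.attack_dataset() from the example above is shown below. It assumes only TextAttack's standard AttackResult API (SuccessfulAttackResult, original_text(), perturbed_text()), not any helper from this package.

```python
# Minimal results inspection -- a sketch assuming TextAttack's standard
# AttackResult objects returned by attacker.attack_dataset() above.
from textattack.attack_results import SuccessfulAttackResult

successes = [r for r in results if isinstance(r, SuccessfulAttackResult)]
print(f"Successful attacks: {len(successes)}/{len(results)}")

for result in successes[:5]:  # show a few original/adversarial pairs
    print("original :", result.original_text())
    print("perturbed:", result.perturbed_text())
    print()
```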
Key scripts:
- attack_multilabel_tae_main.py: Modular attack generation supporting multiple models and datasets (see the model-wrapping sketch after this list)
- download_data.py: Secure Kaggle dataset download with environment variables
- install_env.py: Cross-platform environment setup with verification
- run_tests.py: Comprehensive test runner with coverage and parallel execution
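To illustrate the multiple-model support mentioned for attack_multilabel_tae_main.py, a minimal sketch of wrapping either backend might look like the following. It assumes the Detoxify object exposes .model and .tokenizer attributes and uses unitary/toxic-bert only as a stand-in Hugging Face checkpoint.

```python
# Sketch: wrapping either a Detoxify model or a custom HF checkpoint.
from detoxify import Detoxify
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from textattack_multilabel import MultilabelModelWrapper

# Option 1: Detoxify backend (assumes .model / .tokenizer attributes)
detox = Detoxify("original")
detox_wrapper = MultilabelModelWrapper(detox.model, detox.tokenizer, multilabel=True)

# Option 2: custom Hugging Face backend (example checkpoint only)
hf_model = AutoModelForSequenceClassification.from_pretrained("unitary/toxic-bert")
hf_tokenizer = AutoTokenizer.from_pretrained("unitary/toxic-bert")
hf_wrapper = MultilabelModelWrapper(hf_model, hf_tokenizer, multilabel=True)
```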
Run the complete test suite:
```bash
python test/run_tests.py --coverage
```

Test structure:
- Unit tests: Individual function/component testing (see the sketch after this list)
- Integration tests: Full pipeline testing
- Coverage reporting: HTML coverage reports generated
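As an illustration of the unit-test style, a hypothetical minimal test for the model wrapper might look like the sketch below. The call convention (a list of input strings returning one score row per text) is an assumption based on TextAttack's ModelWrapper interface; the actual tests live in test/test_model_wrapper.py and the other test modules.

```python
# Hypothetical example, not part of the shipped suite.
import pytest
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from textattack_multilabel import MultilabelModelWrapper


@pytest.fixture(scope="module")
def wrapper():
    name = "unitary/toxic-bert"  # example 6-label toxicity checkpoint
    model = AutoModelForSequenceClassification.from_pretrained(name)
    tokenizer = AutoTokenizer.from_pretrained(name)
    return MultilabelModelWrapper(model, tokenizer, multilabel=True)


def test_one_score_row_per_input(wrapper):
    texts = ["you are wonderful", "you are terrible"]
    scores = wrapper(texts)  # assumed TextAttack-style __call__ on a list of strings
    assert len(scores) == len(texts)
```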
Running the setup script creates a conda environment (py3.8) with the required dependencies: textattack[tensorflow,optional], detoxify, kaggle, and sentence-transformers. The script uses Python's subprocess module for secure command execution.

```bash
python install_env.py
```
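The real logic lives in install_env.py; purely to illustrate the subprocess pattern described above, a hedged sketch might look like this (the environment name and package list come from the description, everything else is assumed):

```python
# Sketch of the subprocess pattern: commands are passed as argument lists
# and run with check=True, so there is no shell string interpolation.
import subprocess


def run(cmd):
    """Run a command given as an argument list; raise if it fails."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


run(["conda", "create", "-y", "-n", "py3.8", "python=3.8"])
run(["conda", "run", "-n", "py3.8", "pip", "install",
     "textattack[tensorflow,optional]", "detoxify", "kaggle", "sentence-transformers"])
```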
Download the Jigsaw Toxic Comments dataset from Kaggle (requires KAGGLE_USERNAME and KAGGLE_KEY environment variables):
```bash
python example_toxic_adv_examples/download_data.py
```

This uses environment variables for the Kaggle API credentials instead of plaintext credential files, for improved security.
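For reference, the environment-variable flow could look roughly like the sketch below; this is not the contents of download_data.py, and the competition slug and download path are assumptions.

```python
# Sketch of environment-variable-based Kaggle authentication.
# Export KAGGLE_USERNAME and KAGGLE_KEY in your shell before running.
import os

for var in ("KAGGLE_USERNAME", "KAGGLE_KEY"):
    if not os.environ.get(var):
        raise SystemExit(f"{var} is not set; export it before running this script.")

# The kaggle client reads credentials from the environment, so no plaintext
# kaggle.json file is needed.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()
# Competition slug and output path are assumptions for illustration.
api.competition_download_files(
    "jigsaw-toxic-comment-classification-challenge", path="data/", quiet=False
)
```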
See the example scripts in example_toxic_adv_examples/, such as multilabel_acl2023.py, for complete attack examples.