Pricing Intelligence Labpack

This repository contains the experimentation and evaluation framework for the paper "Pricing Intelligence: Rethinking IS Engineering in Volatile SaaS Environments". It includes scripts to generate instantiated questions from templates, run experiments against a Pricing Intelligence agent (HARVEY), and evaluate the results.

Prerequisites

  • Python 3 (the scripts are invoked with python3)
  • pip (to install the Python dependencies)

Installation

  1. Clone this repository.

  2. Install the required Python dependencies:

    pip install -r requirements.txt

Workflow

The typical workflow involves three main steps: Generation, Experimentation, and Evaluation.

1. Generation

Generate instantiated questions from templates and pricing data.

Script: Experimentation/generate_instantiated_questions.py

Usage:

Run from the project root:

python3 Experimentation/generate_instantiated_questions.py \
  --templates Experimentation/pi_task_templates.json \
  --spec Experimentation/instantiation_spec.json \
  --output instantiated_questions.json

Arguments:

  • --templates: Path to the JSON file containing question templates.
  • --spec: Path to the JSON file containing instantiation specifications (placeholders, overrides).
  • --output: Path where the generated questions JSON will be saved. We recommend saving it to instantiated_questions.json in the root directory so run_experiment.py can find it easily.
  • --expected (optional): Path to an existing questions file; the script verifies that the newly generated output is identical to it.
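
For illustration, the sketch below shows the general idea behind template instantiation: placeholders in a template question are filled with concrete values drawn from the instantiation spec. The field names and placeholder syntax are assumptions for illustration only; the actual schemas of pi_task_templates.json and instantiation_spec.json may differ.

import itertools

# Hypothetical template and spec; the real JSON files may use different keys.
template = {"id": "T01", "question": "What is the cheapest plan of {saas} that includes {feature}?"}
spec = {"saas": ["Zoom", "Slack"], "feature": ["SSO", "recording"]}

# The Cartesian product of placeholder values yields the instantiated questions (Q).
instantiated = []
for combo in itertools.product(*spec.values()):
    values = dict(zip(spec.keys(), combo))
    instantiated.append({
        "template_id": template["id"],
        "question": template["question"].format(**values),
    })

for q in instantiated:
    print(q["question"])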

2. Experimentation (HARVEY)

Run the generated questions against the HARVEY agent.

Important: You must first launch the HARVEY project. Follow the instructions in the Pricing-Intelligence-Interpretation-Process repository to start the server. By default, it is expected to be running at http://localhost:8086/chat.

Script: Experimentation/run_experiment.py

Usage:

Run from the project root (ensure instantiated_questions.json is present in the root):

python3 Experimentation/run_experiment.py

Configuration: You may need to edit Experimentation/run_experiment.py to adjust the following constants:

  • API_URL: The URL of the HARVEY agent (default: http://localhost:8086/chat).
  • INPUT_FILE: The input file containing questions (default: instantiated_questions.json). Ensure this matches the output from the Generation step.
  • OUTPUT_FILE: The file where results will be saved (default: experiment_results_gpt_5_nano.json).

Note: The script supports checkpointing. If interrupted, it will resume from where it left off, skipping already processed questions found in the output file.
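
The following is a minimal sketch of that checkpointing pattern, not the script's actual code: the request payload, the response handling, and the question id field are assumptions and should be adapted to HARVEY's actual /chat API.

import json, os
import requests

API_URL = "http://localhost:8086/chat"
INPUT_FILE = "instantiated_questions.json"
OUTPUT_FILE = "experiment_results_gpt_5_nano.json"

# Load previously saved results so already-answered questions are skipped on resume.
results = []
if os.path.exists(OUTPUT_FILE):
    with open(OUTPUT_FILE) as f:
        results = json.load(f)
done_ids = {r["id"] for r in results}

with open(INPUT_FILE) as f:
    questions = json.load(f)

for q in questions:
    if q["id"] in done_ids:
        continue  # checkpoint: skip questions already present in the output file
    # Hypothetical request/response shape; adapt to HARVEY's actual /chat API.
    resp = requests.post(API_URL, json={"message": q["question"]})
    results.append({"id": q["id"], "question": q["question"], "answer": resp.json()})
    with open(OUTPUT_FILE, "w") as f:
        json.dump(results, f, indent=2)  # persist after every question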

3. Evaluation

Analyze the experiment results and generate a report.

Script: Evaluation/generate_evaluation_report.py

Usage:

python3 Evaluation/generate_evaluation_report.py \
  --input Experimentation/experiment_results_gpt_5_nano.json \
  --output_dir Evaluation/reports \
  --outfile evaluation_report.json

Arguments:

  • --input: Path to the experiment results JSON file.
  • --output_dir: Directory where the evaluation report will be saved.
  • --outfile: Name of the output report file.

Project Structure

The repository structure reflects the evaluation workflow described in the PIIP (Pricing Intelligence Interpretation Process) paper:

1. Template Elicitation ($T_0$)

Raw PI tasks/: Contains the initial pool of templates ($T_0$) collected from humans and LLMs.

  • human_tasks.md: Raw templates elicited from human participants.
  • ai_gemini3Pro_tasks.md & ai_qwen3Max_tasks.md: Raw templates generated by external LLMs.

2. Classification & Normalization ($T_2$, $AT$, $UT$)

Classified PI tasks/: Represents the curated and classified set of templates ($T_2$).

  • human_tasks.md & ai_tasks.md: Normalized templates tagged as "Answerable" ($AT$) or "Non answerable" ($UT$).

3. Strategy Definition & Instantiation ($Q$)

Experimentation/: Artifacts and scripts for generating the concrete dataset ($Q$) and defining ground truth strategies.

  • pi_task_templates.json: Contains Answerable Templates ($AT$) with their ideal strategies (ground truth plans).
  • instantiation_spec.json: Specifications for instantiating templates with real values.
  • generate_instantiated_questions.py: Script to generate the 150 concrete PI tasks ($Q$).
  • run_experiment.py: Script to execute tasks against HARVEY ($RQ_1$).

4. Evaluation ($RQ_1, RQ_2, RQ_3$)

Evaluation/: Tools to analyze HARVEY's performance.

  • generate_evaluation_report.py: Computes precision and recall metrics for $RQ_2$ and $RQ_3$ (see the sketch after this list).
  • statistical_evaluation.py & visualization.ipynb: Statistical analysis and visualization tools.
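
As a rough illustration, the sketch below computes precision and recall by treating the agent's executed strategy steps and the ideal (ground truth) steps as sets. The step names are made up, and the report script's actual matching logic (step identity, ordering, partial credit) may differ.

# Illustrative precision/recall over strategy steps.
def precision_recall(predicted_steps, ideal_steps):
    predicted, ideal = set(predicted_steps), set(ideal_steps)
    true_positives = len(predicted & ideal)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ideal) if ideal else 0.0
    return precision, recall

# Example: the agent planned 3 steps, 2 of which appear in the 4-step ideal plan.
p, r = precision_recall(
    ["filter_plans", "compare_prices", "rank"],
    ["filter_plans", "compare_prices", "check_addons", "rank_by_cost"],
)
print(p, r)  # 0.666..., 0.5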

5. Data Source

data/: Contains the real SaaS pricing data (from the TSC'25 dataset) used to instantiate the templates.

Modifying Scripts

  • URLs: To change the target API URL, edit the API_URL constant in Experimentation/run_experiment.py.
  • Input/Output Files: Default file paths are defined as constants (INPUT_FILE, OUTPUT_FILE) in Experimentation/run_experiment.py. You can modify these or update the script to accept command-line arguments (see the sketch below).
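
If you prefer command-line arguments over editing constants, a minimal sketch (assuming API_URL, INPUT_FILE, and OUTPUT_FILE are module-level constants in run_experiment.py, as described above) could look like this:

import argparse

# Hypothetical CLI wrapper for the run_experiment.py configuration constants.
parser = argparse.ArgumentParser(description="Run instantiated questions against HARVEY")
parser.add_argument("--api-url", default="http://localhost:8086/chat")
parser.add_argument("--input", default="instantiated_questions.json")
parser.add_argument("--output", default="experiment_results_gpt_5_nano.json")
args = parser.parse_args()

API_URL = args.api_url
INPUT_FILE = args.input
OUTPUT_FILE = args.output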
