PIE-Bench++: Dataset for Multi-aspect Image Editing

Department of Computer Science and Engineering
University at Buffalo, State University of New York

Latest News 📢

Release of PIE-Bench++ Version 1.0 🚀

We are thrilled to announce the launch of PIE-Bench++ Version 1.0, a comprehensive benchmark dataset for multi-aspect image editing evaluation.

What is PIE-Bench++?

PIE-Bench++ builds upon the foundation laid by the original PIE-Bench dataset introduced by Ju et al. (2024) and is designed to provide a comprehensive benchmark for evaluating multi-aspect image editing. The enhanced dataset contains 700 images and prompts spanning nine distinct edit categories, covering a wide range of manipulations:

  • Object-Level Manipulations: Additions, removals, and modifications of objects within the image.
  • Attribute-Level Manipulations: Changes in content, pose, color, and material of objects.
  • Image-Level Manipulations: Adjustments to the background and overall style of the image.

While retaining the original images, the enhanced dataset features revised source prompts and editing prompts, augmented with additional metadata such as editing types and aspect mappings. These additions aim to support more nuanced and detailed evaluation in the domain of multi-aspect image editing.

Dataset Structure

  • Images
    • 0_random_140
      • 000000000001.jpg
      • ...
      • 000000000140.jpg
    • 1_change_object_80
      • 1_artificial
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
      • 2_natural
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
    • 2_add_object_80
      • 1_artificial
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
      • 2_natural
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
    • 3_delete_object_80
      • 1_artificial
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
      • 2_natural
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
    • 4_change_attribute_content_40
      • 1_artificial
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
      • 2_natural
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
    • 5_change_attribute_pose_40
      • 1_artificial
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
      • 2_natural
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
    • 6_change_attribute_color_40
      • 1_artificial
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
      • 2_natural
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
    • 7_change_attribute_material_40
      • 1_artificial
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
      • 2_natural
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
    • 8_change_background_80
      • 1_artificial
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
      • 2_natural
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
    • 9_change_style_80
      • 1_artificial
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
      • 2_natural
        • 1_animal
          • ...
        • 2_human
          • ...
        • 3_indoor
          • ...
        • 4_outdoor
          • ...
  • annotation.json
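Assuming the layout above is mirrored on disk, with `Images/` as the root and `.jpg` files in the leaf folders, a minimal sketch for counting images per top-level edit category might look like this (the root path and file extensions are assumptions; adjust them to your local copy):

```python
import os
from collections import Counter

def count_images(root="Images"):
    """Count image files under each top-level edit-category folder."""
    counts = Counter()
    for category in sorted(os.listdir(root)):
        cat_dir = os.path.join(root, category)
        if not os.path.isdir(cat_dir):
            continue
        # Walk the nested artificial/natural and scene subfolders.
        for _, _, filenames in os.walk(cat_dir):
            counts[category] += sum(
                1 for f in filenames
                if f.lower().endswith((".jpg", ".jpeg", ".png"))
            )
    return counts
```

The category folder names encode their expected sizes (e.g. `1_change_object_80`), so the counts returned here can double as a quick integrity check after download.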

Data Annotation Guide

Overview

Our dataset annotations are structured to provide comprehensive information for each image, facilitating a deeper understanding of the editing process. Each annotation consists of the following key elements:

  • Source Prompt: The original description or caption of the image before any edits are made.
  • Target Prompt: The description or caption of the image after the edits are applied.
  • Edit Action: A detailed specification of the changes made to the image, including:
    • The position index in the source prompt where changes occur.
    • The type of edit applied (e.g., 1:change object, 2:add object, 3:remove object, 4:change attribute content, 5:change attribute pose, 6:change attribute color, 7:change attribute material, 8:change background, 9:change style).
    • The operation required to achieve the desired outcome (e.g., '+' / '-' means adding/removing words at the specified position, and a word string such as 'cat' identifies the existing words being replaced).
  • Aspect Mapping: A mapping that connects objects undergoing editing to their respective modified attributes. This helps identify which objects are subject to editing and the specific attributes that are altered.
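As a minimal sketch, the annotation file can be loaded with the standard `json` module. The field names used here (`image_path`, `source_prompt`, `edit_action`, `aspect_mapping`) follow the example annotation in the next section; the `summarize` helper is hypothetical, added only for illustration:

```python
import json

def load_annotations(path="annotation.json"):
    """Load the dataset's annotation file into a dict keyed by image id."""
    with open(path) as f:
        return json.load(f)

def summarize(ann):
    """Return a compact summary of a single annotation entry."""
    return {
        "path": ann["image_path"],
        "num_edits": len(ann["edit_action"]),
        "edited_objects": list(ann["aspect_mapping"]),
    }
```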

Example Annotation

Here is an example annotation for an image in our dataset:

{
  "000000000002": {
    "image_path": "0_random_140/000000000002.jpg",
    "source_prompt": "a cat sitting on a wooden chair",
    "target_prompt": "a [red] [dog] [with flowers in mouth] [standing] on a [metal] chair",
    "edit_action": 
      {"red":{"position":1,"edit_type":6,"action":"+"}},
      {"dog":{"position":1,"edit_type":1,"action":"cat"}},
      {"with flowers in mouth":{"position":2,"edit_type":2,"action":"+"}},
      {"standing":{"position":2,"edit_type":5,"action":"sitting"}},
      {"metal":{"position":5,"edit_type":7,"action":"wooden"}},
    "aspect_mapping": {
      "dog":["red","standing"],
      "chair":["metal"],
      "flowers":[]},
    "blended_words": [
      "cat,dog",
      "chair,chair"
    ],
    "mask": "0 262144"
  }
}
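The edit actions carry enough information to rebuild the target prompt from the source prompt. The sketch below assumes one plausible interpretation of the annotation, not an official specification: `position` indexes whitespace-separated words in the source prompt, `+` inserts the keyed words before that index, `-` removes the word there, and any other action string names the existing word that the keyed words replace:

```python
def apply_edits(source_prompt, edit_action):
    """Rebuild the edited prompt from a source prompt and its edit actions.

    Assumed semantics (not an official spec): 'position' indexes
    whitespace-separated words in the source prompt; action '+' inserts
    the keyed words before that index; action '-' removes the word at
    that index; any other action string names the existing word that
    the keyed words replace.
    """
    words = source_prompt.split()
    # Apply replacements and removals first, on the original word indices.
    for new_words, spec in edit_action.items():
        action = spec["action"]
        if action == "-":
            words[spec["position"]] = None  # mark for removal
        elif action != "+":
            words[spec["position"]] = new_words
    # Apply insertions from right to left so earlier indices stay valid.
    inserts = sorted(
        ((spec["position"], new_words)
         for new_words, spec in edit_action.items()
         if spec["action"] == "+"),
        reverse=True,
    )
    for pos, new_words in inserts:
        words.insert(pos, new_words)
    return " ".join(w for w in words if w is not None)
```

On the example entry above, this yields "a red dog with flowers in mouth standing on a metal chair", i.e. the target prompt with the bracket markers stripped.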

BibTex

@misc{PIE-Bench++,
  author = {Mingzhen Huang and Jialing Cai and Shan Jia and Vishnu Lokhande and Siwei Lyu},
  title = {PIE-Bench++: Dataset for Multi-aspect Image Editing},
  year = {2024},
  howpublished = {https://mingzhenhuang.com/projects/piebenchpp.html},
  note = {Accessed: yyyy-mm-dd}
}

Acknowledgement

Our dataset extends the original PIE-Bench dataset introduced in the paper PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code; we thank its authors for their contributions.