Create datamodules from custom data #2515

Open
LorenzoF6 opened this issue Jan 17, 2025 · 7 comments

@LorenzoF6

I want to try Anomalib with my custom data.
Is it correct to organize my data in an MVTec-like structure like the one below? For my object category I have 4 independent types of defect.

CustomData/
├── category_1/
│ ├── train/
│ │ ├── good/
│ │ │ ├── image1.png
│ │ │ ├── image2.png
│ │ │ └── ...
│ ├── test/
│ │ ├── anomaly_type_1/
│ │ │ ├── image1.png
│ │ │ ├── image2.png
│ │ │ └── ...
│ │ ├── anomaly_type_2/
│ │ │ ├── image1.png
│ │ │ ├── image2.png
│ │ │ └── ...
│ ├── ground_truth/
│ │ ├── anomaly_type_1/
│ │ │ ├── image1_mask.png
│ │ │ ├── image2_mask.png
│ │ │ └── ...
│ │ ├── anomaly_type_2/
│ │ │ ├── image1_mask.png
│ │ │ ├── image2_mask.png
│ │ │ └── ...
└── ...

And then use the Folder class to create the custom datamodule, with the option to automatically split the normal data into train and test? Or is it better to put some normal images in a dedicated normal directory under test?
Thanks very much

@abc-125
Contributor

abc-125 commented Jan 18, 2025

Hello, Folder dataset uses just one type of defect, so you can either move all defects to one folder or create a custom class.

In both cases, you can use automatic split or put some normal images in a separate folder.
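
The first option (collecting every defect type into one folder) can be scripted. Below is a minimal sketch using only the Python standard library, assuming the `CustomData/category_1` layout from the question. The function name `merge_defects` and the output folder names `abnormal` and `masks` are illustrative choices, not part of anomalib. Files are prefixed with their defect type because the same file name (`image1.png`) appears under several defect folders.

```python
import shutil
from pathlib import Path


def merge_defects(category_dir: Path, out_dir: Path) -> list[Path]:
    """Copy all defect images (and masks, if present) into single folders.

    Hypothetical helper: assumes test/<defect>/ and ground_truth/<defect>/
    as in the tree above. A test/good folder, if present, is skipped.
    """
    abnormal_out = out_dir / "abnormal"
    masks_out = out_dir / "masks"
    abnormal_out.mkdir(parents=True, exist_ok=True)
    masks_out.mkdir(parents=True, exist_ok=True)

    copied = []
    for defect_dir in sorted((category_dir / "test").iterdir()):
        if not defect_dir.is_dir() or defect_dir.name == "good":
            continue
        for image in sorted(defect_dir.glob("*.png")):
            # Prefix with the defect type so names stay unique after merging.
            target = abnormal_out / f"{defect_dir.name}_{image.name}"
            shutil.copy2(image, target)
            copied.append(target)
        mask_dir = category_dir / "ground_truth" / defect_dir.name
        if mask_dir.is_dir():
            for mask in sorted(mask_dir.glob("*.png")):
                shutil.copy2(mask, masks_out / f"{defect_dir.name}_{mask.name}")
    return copied
```

The resulting `abnormal` and `masks` folders could then be passed to Folder as `abnormal_dir` and `mask_dir`. It is worth checking the mask naming convention expected by the installed anomalib version before relying on this.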

@LorenzoF6
Author

Thanks, but if I want to keep my defects separate (I have four types), how can the model tell the different defects apart?
And if I want to do classification, I think it is correct to put all defects in one folder, but if I want to do segmentation, isn't it correct to keep the different defects separate?

@LorenzoF6
Author

In fact, the 2022 documentation (v1 release), on the Folder class page, shows every type of defect in its own folder.

@LorenzoF6
Author

So now, with the new release, if I want to train a model on a custom dataset, should the dataset have a directory tree like this?
DatasetName:

  • Normal
  • Abnormal

Or is it possible to keep an MVTec-like structure? This is not well explained in the documentation.

Thanks

@abc-125
Contributor

abc-125 commented Jan 19, 2025

how can the model tell the different defects apart

It won't. All the models available in anomalib are trained with normal images only. Anomalous images are used only for testing.
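
To make the train/test roles concrete, here is a small sketch of how such a split could look. It is not anomalib's implementation: `split_normal` and `build_sets` are hypothetical helpers, and the 0/1 labels only mark normal versus anomalous for evaluation. A fraction of the normal images is held out, and all anomalous images go to the test set regardless of defect type.

```python
import random


def split_normal(normal_images, test_ratio=0.2, seed=0):
    """Hold out a fraction of the normal images for testing (illustrative only)."""
    if not 0.0 <= test_ratio < 1.0:
        raise ValueError("test_ratio must be in [0, 1)")
    shuffled = sorted(normal_images)
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def build_sets(normal_images, abnormal_images, test_ratio=0.2, seed=0):
    """Return (train, test): train holds paths, test holds (path, label) pairs."""
    train, normal_test = split_normal(normal_images, test_ratio, seed)
    # Training set: normal images only.
    # Test set: held-out normals (label 0) plus every anomaly (label 1).
    test = [(p, 0) for p in normal_test] + [(p, 1) for p in abnormal_images]
    return train, test
```

Because the defect type never enters training, merging the defect folders changes only how the test images are organized, not what the model learns.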

if I want to do segmentation, isn't it correct to keep the different defects separate

All defects in one folder for either classification or segmentation should be fine.

Or is it possible to keep an MVTec-like structure?

You can make your own version of the MVTecAD class. I think you need to replace the categories with category_1; also make sure you add your class to all the relevant __init__.py files. After this, you can initialize it like this:

dataset = MyDataset(
    root=Path("./CustomData"),
    category="category_1",
)

@LorenzoF6
Author

OK, thanks.
So if I have one "category", it is better to use the Folder class;
if instead I have multiple "categories" of defect, it is better to customize the MVTec class.

But isn't this the same as using the Folder class and changing abnormal_dir every time I want to run the analysis on a different defect? That way I use the Folder class and just point abnormal_dir at a different folder each time, while keeping the MVTec-like structure.

@abc-125
Contributor

abc-125 commented Jan 20, 2025

change abnormal_dir every time I want to run the analysis on a different defect?

I think it should work in your case (if you want to calculate results per type of defect).

MVTecAD class calculates results for all defects together as if they are in the same folder; it just allows you to have a different folder structure.
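
The per-defect approach discussed above could be organized as follows. This is a sketch under assumptions: `per_defect_configs` is a hypothetical helper, the directory names come from the tree in the question, and the dictionary keys (`name`, `root`, `normal_dir`, `abnormal_dir`, `mask_dir`) are meant to mirror the Folder datamodule's parameters, which should be verified against the installed anomalib version.

```python
from pathlib import Path


def per_defect_configs(category_dir: Path) -> list[dict]:
    """Build one set of Folder-style keyword arguments per defect type."""
    configs = []
    for defect_dir in sorted((category_dir / "test").iterdir()):
        if not defect_dir.is_dir() or defect_dir.name == "good":
            continue
        config = {
            "name": f"{category_dir.name}_{defect_dir.name}",
            "root": str(category_dir),
            "normal_dir": "train/good",  # same normal images for every run
            "abnormal_dir": f"test/{defect_dir.name}",  # changes per defect
        }
        # Add masks only if a matching ground_truth folder exists.
        mask_dir = category_dir / "ground_truth" / defect_dir.name
        if mask_dir.is_dir():
            config["mask_dir"] = f"ground_truth/{defect_dir.name}"
        configs.append(config)
    return configs
```

Each dictionary is intended to be unpacked into the datamodule constructor (for example `Folder(**config)`), giving one evaluation run per defect type while the data stays in the MVTec-like layout.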
