sd_hassan

Stable Diffusion 2.1 NSFW training update

Added 2023-01-18 23:10:15 +0000 UTC

CONTEXT

As you know from a previous update, I ran a test of training NSFW content into SD2.1, and it worked well on a small dataset of 300 images across 6 different types of content.

Since the last update I have been gathering and captioning more datasets. Here's where I'm at:

CURRENT STATE

- I've started training SD2.1 again, this time the "real" training, not a test run.

- I'm training one dataset at a time and making sure it comes through properly before moving to the next one. This is for quality: during some tests I found that training multiple datasets at the same time can work well, but some of them end up not being trained as "evenly" as others in the batch. Focusing on one dataset at a time should spread the quality more evenly overall.

- I will train each dataset, download the model as a backup, then start the next training run immediately.

- In parallel, I am continuing to grab more datasets, setting them to 768 resolution and manually captioning them. I think this process will continue even after the model is released, and the model will keep being finetuned with more and more datasets to make it an awesome NSFW model.
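
A minimal sketch of what the 768 resolution step could look like. The exact tooling isn't specified here, so this assumes square 768x768 training images made by a center crop followed by a resize, and only works out the geometry. The names `center_crop_box` and `plan_resize` are hypothetical, chosen for illustration.

```python
TARGET = 768  # training resolution mentioned above


def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def plan_resize(width, height, target=TARGET):
    """Describe how to turn a width x height image into target x target.

    Assumption: square center crop, then a uniform resize to `target`.
    """
    box = center_crop_box(width, height)
    side = box[2] - box[0]
    return {
        "crop_box": box,
        "scale": target / side,
        # True means the source is smaller than the target and would be upscaled
        "upscaled": side < target,
    }


if __name__ == "__main__":
    print(plan_resize(1920, 1080))
```

The `upscaled` flag is there because images below 768 on their short side would have to be enlarged, which is one reason higher-resolution sources are preferable.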

Datasets:

I have the same datasets as the test run, just with more images in each:

- BDSM male/female

- BJ (Male and female performing)

- Female body/anatomy - Various body types, breast types, nipple types, muff types etc

- Male body/anatomy - various body types, penis shapes/states/muff types etc

- Buttplugs

- Dildos

- Men on Men

- Various types of penetration M2M, M2F, F2F etc

- Upskirts

If you want to recommend more datasets, or have datasets at higher than 768 resolution, feel free to reach out or share:

- Sharing a source website is good
- Sharing manually captioned images is very helpful, as I need to manually caption each image; I'm not satisfied with CLIP captioning
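
For anyone preparing captioned images to share, here is a small sketch for checking which images in a folder still lack a caption. It assumes each caption lives in a `.txt` file next to the image with the same base name; that layout is an assumption for illustration, not a stated requirement, and `split_by_caption` is a hypothetical helper name.

```python
from pathlib import Path

# Image extensions to look at (an assumption; adjust as needed)
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def split_by_caption(folder):
    """Return (captioned, uncaptioned) lists of image paths in `folder`.

    An image counts as captioned when a same-named .txt file exists
    next to it and contains non-whitespace text.
    """
    captioned, uncaptioned = [], []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in IMAGE_EXTS:
            continue  # skip caption files and anything that is not an image
        caption_file = path.with_suffix(".txt")
        has_caption = (
            caption_file.is_file()
            and caption_file.read_text(encoding="utf-8").strip() != ""
        )
        (captioned if has_caption else uncaptioned).append(path)
    return captioned, uncaptioned
```

Calling `split_by_caption("my_dataset")` on a folder (the folder name is a placeholder) returns the two lists, so the second one can be used as a to-do list for manual captioning.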

FAQ:
- I expect the first release in 1-2 weeks if all goes well in training; this is based on 24/7 cloud training
- Patreons will gain first access, maybe 1 week of early access to test it and provide feedback before I open it up. I may limit this to certain tiers to prevent leakage and allow us to make sure it's ready

- You can recommend datasets

- The model will be released free like all my models, once testing and early access are complete for supporters

- This model will not be a merge of any sort, only trained content, but other users can freely merge it once it's out in the wild. I want to fully control the quality of what goes into the base model, which I cannot do when it's merged

Thanks for your continued support. The 24/7 cloud training adds up, and at some points I've had multiple cloud instances running in parallel, testing various settings to try to get this right, so your support really goes a long way!