Mr Wisp

The Rest: January 2024 - Process Notes

Added 2024-01-09 17:52:26 +0000 UTC

Hi all! As promised in the last post, here are the detailed process notes for the final Re-AImagined Gallery. Note that I won't be addressing the Evelyn images in detail here since they weren't part of the main gallery. If/when I do another gallery that features touched-up images from that session, I'll include notes about it then.

I've decided to condense the session descriptions here to one per character, as otherwise there would be a lot of process notes for Lilith that are now outdated (since those sessions were done using an outdated model). I'll briefly note any significant differences between sessions, but otherwise I'll be talking about each character's images as part of one group. As usual, these sessions are ordered chronologically.

Lilith Sessions - Mid December 2022

These sessions were all done over the course of a few days in December of 2022, making them among the oldest sessions I've done. If you check the zip folders from the last post, you might notice that Lilith makes up over half of the images in them; this is the only reason I was able to represent her substantially here, as the average image quality throughout each of these sessions is not particularly impressive. But that was to be expected at the time since A) I still didn't really know what I was doing, and B) Getting accurate images of Lilith from an AI image generator was at the time a tall order (haha, pun. Because she tall. haha.) There are several parts of her design that were never guaranteed to translate well into AI art, to such a degree that I was willing to compromise and accept representations that weren't one-to-one as long as they were close enough. I started with a 0.75 power setting, as I usually did at the time, and once I decided that there was potential for Lilith to actually resemble Lilith, I started increasing the power. I believe I settled somewhere between 0.9 and 0.99. It's possible I didn't stay all the way at 0.99 due to incompatibilities between Lil's design and more generic anime characters, which I've noted below. I wasn't messing around with steps and prompt guidance very much at this point, and I don't think I had the noise set very high (between 0.2 and 0.5 was my go-to back then.)

Her tail was the most obvious issue I had to deal with throughout each session: you can see that many of the images in the main gallery alone feature her tailless, which I just had to accept as 'no listen, it's behind her, you just can't see it' like with any given Selkie session. Unlike with Selkie though, there was no comparison I could think of that would convey to NovelAI how Lilith's tail was supposed to look without drastically altering her design. The closest I got were 'Lizard tail', 'Dragon tail', and 'short-dragon tail' (I experimented with dash-combining prompts during this session, though the tags I combined weren't very well represented in the first place, so it's hard to say what difference it made.) In the end, NovelAI did what it wanted with the tail, which I was at least used to from the neko sessions. Multiple tails were common, as were tails coming in from offscreen. Sometimes Lil ended up with what I could only term 'Mewtwo tail', aka a tail that's thin in the middle but bulges at either end (usually accompanied by it dipping offscreen or otherwise out of view.) It's worth noting that I haven't tried any tailed characters using the newer models, so I can't say if these are still relevant issues. I'd definitely like to see how the new models handle Lilith at some point though.

The other most prominent difficulty I had with these sessions was with Lilith's skin colour. As was typical of sessions at the time where I attempted to represent unusually-coloured characters, NovelAI didn't always stick to the plan, resulting in some parts of Lilith's body being white/pink-skinned rather than blue. With other characters, this had been an infuriating issue (there are Carrie sessions that will likely remain unreleased for many reasons, this being one of them.) But with Lilith, it ended up being manageable because most of her body is covered up anyway. Thus, the areas that were most commonly discoloured in a given image were her neck and wrists, and blue-ifying these areas by hand after the fact wasn't too much of an inconvenience if the rest of the image was solid.

Happily, the other issues in translating Lilith's design through NovelAI were minor enough to either work around via prompt or touch up quickly post-generation. For instance, Lilith's lack of feet wasn't such a problem when NovelAI typically cut her feet out of the shot anyway, and facial details such as her head gem and eyebrow scars weren't any trouble to adjust or add by hand. The head gem was a wild card, sometimes appearing exactly as it needed to and sometimes becoming something completely different. You can see a wide variation in 'forehead jewelry' designs throughout the raw images from the Lilith sessions, which was neat to see at least.

If there was one other issue with these sessions, it was with Lilith's midriff, as the single most important facet of her character design is that she cannot go around with her belly exposed. Doing so will literally induce death in those who look upon it. So yes, I died several times while working on this session. Occupational hazard. The sessions that featured Lilith in a button-up shirt generally succeeded in keeping her tummy concealed (with the obligatory straining buttons that I may or may not have encouraged via the prompt.) The session featuring Lilith in a bodysuit was a bit weird in that sometimes NovelAI didn't really distinguish properly between her suit and her bare skin, resulting in some cases where it wasn't clear which was which. This was often the case with her belly, leading to some images not making it into the final gallery due to just looking too awkward and needing too much of a touch-up. That was a slight shame, though the 'bodysuit session' still ended up being one of the most solid Lilith sessions otherwise, so it wasn't a huge problem. (Fun fact! The setting for that particular session was inspired by Belial from Axugaem1, the moon where Lilith once tried to stop her sisters from committing genocide! ...She wasn't successful.)

I must of course mention the blue-skinned elephant in the room (sorry Lilith, you have every right to be offended by what I just said.) You'll notice the last Lilith session, which shockingly did not make it into the main gallery, features her in nothing but an apron. This was, uhhhh, an experiment??
Ok, ok, I wanted to see her in an apron. I thought it would be sexy ;~; Not entirely out-of-canon either, as Lilith does like to bake, and when she's alone or with a theoretical significant-other-that-she-doesn't-actually-have, she'd probably dress in nothing but an apron at some point! She do be feelin' herself like that sometimes!
I like some of the results from this session, and I think if I'd done it at a later date it might have turned out better, but it just ended up being a fish out of water compared to the other Lilith content, and full of inconsistencies at that. The first part of the session had Lilith in a kitchen, holding a spatula. That's how I learned that NovelAI isn't very good with spatulas. The second part had her on a bed with a large tummy, because I have no excuse. The problem is that I used the tag 'naked apron', which was the best I could do to convey what Lil was wearing in this session. But NovelAI, possibly thanks in part to using Danbooru as a point of reference, often took that to mean 'wearing an apron, but with her chest exposed.' This was a little bit too much fanservice for me to justify including it in the main gallery, and the images that did have Lil's chest covered up really weren't comparable to the rest in terms of quality, so I decided the whole 'apron session' was best left raw (god, that just sounds like I was shooting a porno, doesn't it.)

Overall, as I mentioned earlier, I'm genuinely curious as to how a Lilith session would go if I did it now versus back then. Both NovelAI and I have improved our knowledge bases since late 2022, and as the next sessions' notes will show, getting consistent, high-quality images of monstergirls using the utility is no longer a super-ambitious effort. If I do decide to work on the Re-AImagined galleries again in the future, Lilith will definitely be near the top of the list of characters to revisit.

Nimue Sessions - Mid December 2023 and Early January 2024

Back when I wasn't late in getting this month's gallery organized, the first Nimue session was done to fill in a gap in the gallery, which at the time consisted of Lilith and Evelyn content (in hindsight, 'Angels and Demons' would have been a killer gallery theme. Not sure why that didn't occur to me until now...) But after completing said session, I suddenly felt insecure about the rest of the gallery. Why? Because NovelAI had released the 'Diffusion Anime V3' Model, and it kicked ass to such a degree that I felt pressure to raise the rest of the gallery's image quality to match it. This is part of what led to the Evelyn content getting shelved, and it's absolutely part of what kept the 'Lilith apron' session out of the final gallery. I will say though that because my only points of reference for what the new model is capable of are the Nimue and Smish sessions, I'm not certain how representative these sessions are of NovelAI's current capabilities, particularly with regard to monstergirl content (though I'd say the evidence is pretty damn encouraging.) Some of the credit might have to go to the tags I used, which encouraged NovelAI to generate pretty, sparkly, bubbly images (haha, pun. Because watergirls. I slay myself, truly.) Tags such as 'refraction', 'sea sparkle', '[[sparkling aura]]', 'idol', 'cinematic lighting', and 'can't be this cute' probably helped give these images their vibrancy, and if that is the case, then hell yeah, I'm competent!

That said, I noticed a more cartoony, cel-shaded style with these sessions that I didn't explicitly ask for, which has me wondering if the tags prompted that or if the new model just does things differently at different settings. For reference, I went with a baseline of 0.92 power, 0.5 noise, 50 steps and 7.5 prompt guidance, with both 'SMEA' and 'DYN' enabled (I haven't discussed those latter two settings in these notes, but they're optional versions of whatever sampler you're using, designed to make high-resolution images better and more interesting. I've been using them since I started generating larger images back in the summer of '23, and they seem to help more often than not.) These settings are all typical of the last few sessions I've done, with the exception of the prompt guidance value, which I've raised and/or lowered several times out of curiosity. I don't want to say that the results from the Nimue and Smish sessions look more like they were generated at 0.75 power, because that isn't really true, but the simple style definitely does remind me of generating images at that setting. What's more, increasing the power to 0.99 actually made the results less appealing: the model tried to add more complexity but landed in an awkward middle stage that I would expect more from a 0.85 value in previous models. Furthermore, I didn't play around with enhancing images by 0.99 too much like I did in some previous sessions (because the cost of doing so with larger images is ridiculous), but when I did, the results ended up looking a little more like what I'd expect from generating images using a 0.99 power value in the first place. Does this mean it takes more to get to a super-high level of detail with this model, or am I reading into it too much? Ultimately, I shouldn't be making assumptions given how little experience I have with the new model, but it's a point of curiosity at the very least.

That point does bring me to somewhat of a bittersweet note on my relationship with NovelAI, which at the time of writing this has gone on for nearly a year and a half. I can confidently say two things: 1) that the average quality of AI-generated images (using NovelAI's img to img method of generation) has gone up significantly within the last four months alone, and 2) the release of new models toward the tail end of my interest in generating new AI content has meant I don't have as much experience with those models as I do with 'Diffusion Anime V1', leaving me less well-equipped to comment on how to generate good monstergirl content in the present day (the good news is, it seems a lot easier in general.) In a sense, I feel a bit behind the times now, despite having used NovelAI at least once a month for each month since October 2022. I think this is representative of how quickly AI image generation is advancing as a whole, though I do have to add the disclaimer that NovelAI is just one image generating service out of many at this point. But in terms of average image quality, the fact that NovelAI has gotten so much better so quickly doesn't feel like an isolated phenomenon. I'd have to play around with other image generation services to confirm that, which I'm not inclined to do, so take that opinion with however much salt you think is necessary. This is all to say that part of me wishes I'd started doing this a little later, as I'm left at an awkward point where I'm both less and more interested in AI content than ever before: on the one hand, I want to see what the new model is capable of and what difference it makes with several of my characters compared to previous models; on the other hand, putting together these galleries and process notes every month has become more of a chore than I can justify doing at this point, and I'd genuinely prefer to use the energy I put into it for Axugaem2-related content instead. 
Not to mention that NovelAI isn't free to use, and if I'm not posting AI content regularly then I'd prefer not to spend much more time or money using the utility. That's not a complaint, just the reality that I need to be using my money wisely, for things that directly benefit the rest of my content... like stickers, clearly! Right?? RIGHT??????

Anyway, there will be times in the future where I go back to NovelAI and experiment, but they won't be a guaranteed thing, and whatever AI art I do post in the future won't be guaranteed either. But I do hope to play around with it from time to time, with enough success to warrant the occasional mini-gallery. I think that would be fun, and not having it be a monthly duty relieves the pressure of providing enough content to fulfill expectations (most of which are imposed idiosyncratically by myself, of course.) So here's to the occasional bit of Re-AImagining down the road!

...What's that? I still have to finish the notes for this session and the Smish session? Right... should I move that whole prospective segment to the end of the notes so that there's a logical and satisfying conclusion, rather than an awkward jump back into the usual process notes? .........Naaaaaaaaaahhhhhhhhhh.

Fun Fact! The second of these two Nimue sessions (the one done in 'early January 2024', aka a few days ago) came about when I logged into NovelAI and discovered I had unused anlas. I dumped most of those anlas into the Smish session, which I'll discuss momentarily, but I also asked a friend for opinions on what they wanted to see. The answer they gave was 'Nimue eating KFC.' Hence, the fried chicken. Now you know!

Oh, yes, and before I forget, the reason Nimue is wearing a T-shirt in a lot of these images is that... I could not for the life of me convince NovelAI to stop putting her in a T-shirt. Nimue is a watergirl and does not have nipples and such. She does not wear clothes 90% of the time. I tried to convey this through tags ('no shirt', 'shirtless', 'naked', etc.) and through negative tags ('shirt', 't-shirt', 'clothes', 'wearing clothes'... should I have included 'wet t-shirt' as well? I kind of feel like I should have.) But NovelAI would not be swayed from its goal of repeatedly turning the water elemental into a wet t-shirt joke, and so it did. Granted, as far as NovelAI being stubborn goes, this instance worked out just fine, as I could definitely see Nimue going around in just a t-shirt and making the same exact joke to every unfortunate soul who makes eye contact. I just found it curious, since the utility was clearly capable of not putting her in a shirt, as it demonstrated throughout these sessions. Thus, I can only conclude that NovelAI is approaching Nimue levels of intelligence, which, no matter how you look at it, is utterly terrifying.

Smish Session - Early January 2024

As I mentioned above, this session wasn't planned for inclusion but was rather an impromptu means of using up my remaining anlas. Once I got the idea to use those anlas for a Smish session though, it wasn't long before I made the formal decision to reorganize the theme for January's gallery. For one thing, Nimue and Smish are like peanut butter and chocolate (or peanut butter and that other thing, I guess.) They should not be separated if one is able to include both. Additionally, the quality of this session was good enough that I felt it was worth inclusion, more so than the Evelyn content, which I find especially impressive given how small a session it was by comparison. That's another thing I like about the 'Diffusion Anime V3' model: more consistent quality means I don't have to spend as much time or resources just to generate good-looking images, which is especially useful in cases like this one where my resources (anlas) were decidedly limited.

And look, I know I've been dunking on Evelyn for the past several paragraphs, but to be clear, I don't think the cut Evelyn session was bad at all. I'm sure some of you would have loved to see some of the results from that session touched up and included in the main gallery. Thus, I have to admit that the rest of the reason why I cut that session was purely practical: it would have taken much longer to touch up a bunch of images from that session than it did to touch up the same number of images from this Smish session. That's partly due to the simple style of the Smish images, which carried over from the Nimue sessions (I used a lot of the same tags, so again I can't say if that's typical of the new model or not), but it's also because the Evelyn images were flawed in the way that most of the other sessions have been. Think of the early Anya sessions, or the Selkie sessions, if that helps. Those sessions were among the most work for me to touch up, for varying reasons (Anya having multiple tails... Selkie having multiple tails... the perspective being wack... me not knowing what I was doing... malformed background sharks... the usual!) In the case of the Evelyn stuff specifically, there was a lot of awkward posing and anatomy, some blurriness in the linework that would have necessitated I spend a lot of time firming it up, and a myriad other minor issues that, again, were just part of the territory with that particular model. Comparing the Evelyn and Lilith sessions to the Nimue and Smish sessions has been the biggest indicator to me that the new model is legitimately several steps up in terms of quality, as while there were still aspects of these more recent sessions that needed touching up, I didn't feel like I had to perform emergency surgery on the images just to get them up to snuff. 
There were still classic AI errors such as too few and/or too many fingers/toes, but overall I felt much less overwhelmed looking at these sessions than I did looking at Evelyn. The touchups I did make were largely fun to do! For instance, I added all the shine to Smish's eyes and sometimes gave her a belly button when she didn't have one. That's just a lot more appealing than redoing part of a table or what have you.

Back to the Smish session specifically: there isn't a ton more to add here, as most of the settings and tags I used were carried over from the Nimue session, so everything related to the background and Smish's watergirl-ness pretty much stayed the same (I did add the 'lagoon' tag midway through the session to encourage a more swampy or brackish setting, but I was perfectly content with what NovelAI was putting out so I didn't try to force it.) The only obvious issue I could find with this session was that Smish's hat changed colour a lot, but I can't really say if that's the same kind of problem previous models had with clothing, since I didn't actually specify Smish's hat colour via tags. In some other cases, I might have been bothered, but I don't think her hat being brown is a make-or-break aspect of her design. Truthfully, I consider it part of her lore that she doesn't own one specific hat; instead, she just likes 'cabbie hats' and steals them from people she eats, then leaves them around everywhere like some mass-murdering hermit crab. So if that lore works for you, then yeah, the hat inconsistencies check out.

Finally, it must be said that this session has been by far the most consistent to date in terms of my use of the tag 'eating ______' actually resulting in that character eating the thing! I'll let you figure out what food I specified Smish eating. Whether this success is due to the new model or simply a larger sample size of burger-related images, I can't say, but it was nice to get a bunch of content featuring Smish doing Smish things - and the random floating sea-burgers were honestly just charming more than anything else, so I'm willing to overlook their presence.