Detailed Guide for Text to 3D to 2D Animation (Mootion + AnimateDiff)
Added 2023-11-27 13:54:54 +0000 UTC
This is a detailed text version of the video guide on YouTube found here: https://youtu.be/LynYh_qrs8o
Go to https://discord.com/ and sign up for an account if you don’t have one. Log in, then click the link https://discord.com/invite/ZStwxNxnDT to join Mootion’s Discord.
Scroll down and click on one of the creation channels, such as #creation-1.
Type /motion to get started and select what you want to see.
Prompt: This is what you want the character to do.
Character: Select preferred character.
Inplace: True will keep your character and camera static.
When the generation has finished, select your desired saving format. If you’re unsure, select .mov for video and .fbx for use in 3D applications.
Convert your .mov file to .mp4 for use in ComfyUI.
https://new.express.adobe.com/tools/convert-to-mp4
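If you would rather convert the file locally than use the online tool, a minimal sketch with ffmpeg could look like the following. It assumes ffmpeg is installed and available on your PATH. The file name in the usage note is a placeholder, and the codec choices (H.264 with the yuv420p pixel format) are just common, widely compatible defaults, not something the workflow requires.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that re-encodes src into an H.264 .mp4 file."""
    return [
        "ffmpeg",
        "-y",                   # overwrite the output file if it already exists
        "-i", src,              # input file (the .mov downloaded from Mootion)
        "-c:v", "libx264",      # encode the video stream as H.264
        "-pix_fmt", "yuv420p",  # pixel format that most players can decode
        dst,
    ]


def convert_mov_to_mp4(src: str) -> str:
    """Convert src to an .mp4 next to the original and return the new path."""
    dst = str(Path(src).with_suffix(".mp4"))
    # check=True raises CalledProcessError if ffmpeg exits with an error
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return dst
```

Calling convert_mov_to_mp4("dance.mov") would write dance.mp4 into the same folder, which you can then load in ComfyUI.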
Download the preferred workflow for video2video generation in ComfyUI from the attachments below this post or https://civitai.com/articles/2379/guide-comfyui-animatediff-guideworkflows-including-prompt-scheduling-an-inner-reflections-guide.
I recommend workflow 1 or 2:
1 Vid2Vid will utilize 1 ControlNet, mainly providing the skeleton for the character.
2 Vid2Vid Multi will utilize 2 ControlNets, providing both the skeleton and the outlines of the character.
Drop your workflow .json into ComfyUI.
Reminder:
If you need to download and install ComfyUI, here’s a guide for that:
If you use the Vid2Vid Multi ControlNet like in the original video guide (not the ComfyUI linked above), it will look something like this:
IMPORTANT: If you have RED nodes, you need to install missing custom nodes from the manager.
Double-click, type video, and select the VHS_LoadVideo node:
Select the video you want to load (your converted .mp4) and drag a blue line from image on this node to image on the upscale image node.
Frame_load_cap: The maximum number of frames to load from the video. I usually set this to 12 at first to find my preferred visual style, then back to 0 (no limit) to complete the animation.
Skip_first_frames: The number of frames to skip at the beginning of the video.
Select_every_nth: Uses only every nth frame. Set to 2, it uses every other frame. Set to 3, it uses every third frame.
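To see how these three settings interact, here is a small sketch that lists which frame numbers of the source video would end up in the workflow. It is an illustration, not the node's actual code: the function name is made up, and the order of operations (skip first, then take every nth, then apply the cap) is an assumption.

```python
def selected_frames(total_frames, frame_load_cap=0, skip_first_frames=0, select_every_nth=1):
    """Return the indices of the source frames that would be loaded.

    Assumed order: skip the first frames, keep every nth frame,
    then cut the result off at frame_load_cap (0 means no cap).
    """
    indices = list(range(skip_first_frames, total_frames, select_every_nth))
    if frame_load_cap > 0:
        indices = indices[:frame_load_cap]
    return indices


# A 10-frame video, using every other frame
print(selected_frames(10, select_every_nth=2))
# A 100-frame video capped at 12 frames for a quick style test
print(selected_frames(100, frame_load_cap=12))
```

Under these assumptions, a cap of 12 combined with select_every_nth at 2 would cover the first 24 frames of the source video while only generating 12 images.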
Set your preferred size. I recommend 512x512 to start.
Select your preferred model and VAE. Find your models on https://civitai.com/ or in the ComfyUI manager. For the VAE, 840000 is a good default.
OPTIONAL: If you want to use a LoRA, load one by double-clicking and typing lora.
OPTIONAL: Connect model and clip from the checkpoint to the LoRA node, then connect model from the LoRA node to AnimateDiff.
Select your preferred AnimateDiff model. These can be found in the manager or on Civitai.
Context_length is the number of frames used for each “chain” of animation. Context_overlap is how many frames these chains overlap when they are merged together.
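The idea of overlapping chains can be pictured with a short sketch. This is a simplified illustration only: the real AnimateDiff context scheduling is more involved, the function name is made up, and the values 16 and 4 are example numbers chosen for the demonstration.

```python
def context_windows(num_frames, context_length=16, context_overlap=4):
    """Split frame indices into overlapping windows ("chains").

    Simplified model: each window starts (context_length - context_overlap)
    frames after the previous one. The last window is shifted back so it ends
    on the final frame, which means it may overlap more than the others.
    """
    if context_overlap >= context_length:
        raise ValueError("context_overlap must be smaller than context_length")
    if num_frames <= context_length:
        return [list(range(num_frames))]  # everything fits in one window

    stride = context_length - context_overlap
    windows = []
    start = 0
    while start + context_length < num_frames:
        windows.append(list(range(start, start + context_length)))
        start += stride
    # Final window, aligned to the end of the animation
    windows.append(list(range(num_frames - context_length, num_frames)))
    return windows


for window in context_windows(28, context_length=16, context_overlap=4):
    print(window[0], "to", window[-1])
```

With 28 frames this gives two chains, frames 0 to 15 and frames 12 to 27, which share frames 12 to 15. The shared frames are where the chains get blended, so a larger overlap generally means smoother transitions at the cost of more computation.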
Motion_scale is, simply put, how much motion you will see in the animation. Use with care.
Write your positive and negative prompt. You can leave the default negative prompt and only write what you want to see in the positive prompt.
OPTIONAL (good to know): The seed is currently set to fixed. If you want a different generation with the same settings, you must change your seed, either manually or by selecting random.
The ControlNets are set up to use depth and openpose by default. This will give you a skeleton and an outline of your input character. You can adjust how strongly each one is applied with its strength value. You will see a preview of your preprocessed ControlNets in the preview boxes.
The KSampler contains your generation settings.
Steps: How many sampling steps each image will go through. For regular samplers, a value between 15 and 50 is good.
Cfg: Use a value between 3 and 7. Lower values will make your animation steer away from your prompt. Higher values will break the generation and make it look overcooked.
Sampler_name: The specific sampler or “hammer for your nails”. Euler_ancestral is popular, dpm++ 2m karras as well.
OPTIONAL: Use the LCM sampler with 8 steps and CFG 2 for speed. Make sure you also use the LCM LoRA.
Denoise (denoising strength): Controls how much of your original input video will be retained. A higher value means less retention.
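The ranges above can be written down as a small checklist. The sketch below is a hypothetical helper, not part of ComfyUI: it only encodes the values suggested in this guide and returns a warning for anything outside them. The sampler name "lcm" is assumed to be how the LCM sampler is spelled in the KSampler dropdown.

```python
def check_ksampler_settings(steps, cfg, sampler_name, denoise=1.0):
    """Return a list of warnings based on the ranges suggested in this guide."""
    warnings = []
    if sampler_name == "lcm":
        # LCM is the fast path: few steps and a low CFG
        if steps != 8:
            warnings.append("LCM: the guide suggests 8 steps")
        if cfg != 2:
            warnings.append("LCM: the guide suggests CFG 2")
    else:
        if not 15 <= steps <= 50:
            warnings.append("steps: the guide suggests 15-50 for regular samplers")
        if not 3 <= cfg <= 7:
            warnings.append("cfg: the guide suggests 3-7")
    if not 0.0 < denoise <= 1.0:
        warnings.append("denoise should be above 0 and at most 1")
    return warnings


print(check_ksampler_settings(25, 6, "euler_ancestral"))  # regular sampler
print(check_ksampler_settings(8, 2, "lcm"))               # LCM fast path
```

Both calls stay inside the suggested ranges, so each prints an empty list. Reusing the LCM values with a regular sampler, for example check_ksampler_settings(8, 2, "euler_ancestral"), would return two warnings.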
In Video Combine you can select your output frame rate and video format.
Frame_rate: Good options are 8, 12, 24.
Loop_count: How many loops. 0 is infinite.
Filename_Prefix: This name is added to the front of the output file names.
Format: Output file format.
Pingpong: If set to true, the animation plays in reverse when it reaches the end.
Crf: Constant Rate Factor, an advanced setting for video compression. Leave it at the default, and look it up if you want the details. 99.9% of users will never need to change it.
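A bit of arithmetic helps when picking the frame rate. The sketch below uses made-up helper names and two assumptions: the source video runs at 24 fps (check your own file), and pingpong appends the frames in reverse without repeating the first and last frame. The node's exact behaviour may differ.

```python
def matching_frame_rate(source_fps, select_every_nth):
    """Frame rate that keeps the original playback speed after dropping frames."""
    return source_fps / select_every_nth


def pingpong(frames):
    """Play forward, then backward, without repeating the two end frames (assumed)."""
    return frames + frames[-2:0:-1]


def clip_seconds(num_frames, frame_rate):
    """Length of one pass through the output clip in seconds."""
    return num_frames / frame_rate


print(matching_frame_rate(24, 2))              # frame rate for every other frame
print(pingpong([0, 1, 2, 3]))                  # frame order with pingpong enabled
print(clip_seconds(len(pingpong(list(range(24)))), 12))
```

Under the 24 fps assumption, select_every_nth values of 2 and 3 correspond to output frame rates of 12 and 8, which matches the options listed above.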
Start your generation by pressing Queue Prompt. It will take quite some time, so be patient. Start with lower resolutions and lower frame rates, and good luck!