Transcript of Wan 2.1 ComfyUI for Story-Driven Motion First + Last Frame to Video
Video Transcript:
Using Wan 2.1's first-last-frame video model, I use just two images to create seamless motion, making it feel as though the AI filled in the story scenes. So, in this video, I'll show you how to install the Wan 2.1 model, build a ComfyUI workflow that generates believable in-between frames, and at the very end, how to upscale those results using an open-source tool that outperforms the original render. And this is not interpolation. This is imaginative storytelling through gaps.

So to get started, first let's download the Wan 2.1 model from the Hugging Face page. There are quite a few Wan 2.1 models here: we have the VACE model, text-to-video models, and image-to-video models, but for this video we are going to focus on the first-last-frame video model up here. Download the FP8 version, which is about 16 GB, so it takes up less hard drive space. Then save this into ComfyUI, in the models/diffusion_models folder. I have created a folder here to organize my models. If this is your first time using Wan, you need additional models for Wan 2.1, which are the CLIP Vision model, the VAE, and the text encoder. To get these additional models, please go to my previous video, in which I explained in much more detail how to install them. Once you have all the additional models installed, let's open up ComfyUI.

Let's start with the first workflow. Right-click to add the first node we need: go to loaders, Load Diffusion Model, and choose the Wan 2.1 first-last-frame model; this should be the model we just downloaded. Then change the weight type here to FP8. The next node is the Load CLIP node: go to advanced, loaders, and select Load CLIP. Drop this down, then select the UMT5 XXL FP8 text encoder model, and also change the type here to wan. After that, double-click, search Load VAE, select the node here, bring it down, then choose the Wan 2.1 VAE model. Someone might ask why we load these separately. These three nodes are similar to the usual Load Checkpoint node, which outputs model, CLIP, and VAE; however, since these models are not merged together, we have to load them separately. So, drag out the clip and select CLIP Text Encode. Let's change the color to green, as always, for the positive node. Then make a copy of this, color it red as the negative node, and connect the clip here as well. Then, to connect the VAE, I always use the Anything Everywhere node; this auto-connects to every VAE input you may use in the workflow. For now I will not connect the diffusion model; I will connect it a bit later for a better understanding. So let's highlight all of this and place the nodes into a group. So far so easy, right?

I'll zoom out from here. Below this group, let's import the two images we want to animate. I'll right-click to add a node, go to image, and select Load Image. I'll drag my first image into this node. Repeat this node once again, and my second image goes here as well. I have created these two scenes using inpainting with a character LoRA to get my final two images. Let's rename the first one to start image, then change the second to end image. This all feels simple, right? I'll position this below. Next, search for Image Resize; I'll choose the node here from the ComfyUI Essentials custom node pack. Then I'll use a landscape dimension: 1280 for the width and 720 for the height. I'm using this node to avoid any errors; it makes sure the image sizes are equal and not too large, matching the required dimensions for the Wan 2.1 model.
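As a side note, if you prefer to script the earlier download step instead of using the browser, a minimal sketch with huggingface_hub might look like the following. The repo id and filename are assumptions on my part, so verify them against the Hugging Face page shown in the video, and point local_dir at your own ComfyUI install.

```python
# Minimal sketch: fetch the FP8 first-last-frame model from Hugging Face.
# NOTE: repo_id and filename are assumptions; confirm the exact names on the
# Hugging Face page from the video before running this.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Comfy-Org/Wan_2.1_ComfyUI_repackaged",  # assumed repo
    filename="split_files/diffusion_models/wan2.1_flf2v_720p_14B_fp8_e4m3fn.safetensors",  # assumed file
    local_dir="ComfyUI/models/diffusion_models",  # save next to your other diffusion models
)
print("Saved to:", path)
```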
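And if you want to sanity-check the resize step outside ComfyUI, here is a rough Python equivalent of what the Image Resize node is doing, assuming Pillow is installed; the file names are placeholders for your own start and end images.

```python
# Rough equivalent of the Image Resize step: force both frames to 1280x720
# so they match and stay within the dimensions used for the Wan 2.1 model.
from PIL import Image

TARGET_SIZE = (1280, 720)  # width, height used in the video

for name in ("start_image.png", "end_image.png"):  # placeholder file names
    img = Image.open(name).convert("RGB")
    img = img.resize(TARGET_SIZE, Image.LANCZOS)
    img.save(name.replace(".png", "_1280x720.png"))
```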
After that, we need to encode the two images for the model, so choose CLIP Vision Encode. Let's make a copy of this below for the second image as well. Then I'll change the crop setting on this node to none, and let's connect these as well. Next, I'll zoom in here, then drag out the clip vision input and select Load CLIP Vision. From your list here, choose the Wan 2.1 CLIP Vision model, clip_vision_h. Then link the same model into the second CLIP Vision Encode node for the end image. I'll zoom out from here, and let's place all these nodes once again into a group. This all looks easy, right?

Now we have two groups here, so how do we connect them? Right-click to add a node: go to conditioning, video models. We have so far covered how to use the Wan Image to Video node and the Wan Fun Control to Video node; for this workflow, we need the Wan First Last Frame to Video node. Then we can connect both of these groups to work with this node. The prompt will go into the positive, and the negative prompt will go into the negative. The VAE will be auto-connected because we use the Anything Everywhere node. We have to join the CLIP Vision start image here, then we do the same for the CLIP Vision end image here. We also need a start image and an end image, as we renamed them. So from the resized start image I'll create one path into the CLIP Vision Encode node, then a second into the start image input. After that, connect the end image from the resize node into the end image input as well. So, isn't this easier than you thought?

Now, panning to the right, let's connect all of this into the KSampler to generate the animation: the positive into the positive, the negative into the negative, and the latent into the latent input. I'll pull out from here and move the node up. Now we can drag out the model from the diffusion model loader, go to search, type model, and select ModelSamplingSD3. Change the shift to 8, then join this into the KSampler. We are almost there. Let's drag out the latent from the KSampler and select VAE Decode; this will create the pixel video from the latent space. Then from the VAE Decode, search video and select Video Combine from the Video Helper Suite. Zooming out from here, I'll highlight these nodes as well to create our third group. This is not too tricky, right?

And before we animate, let's change the default settings. Inside the positive prompt, write a detailed description of your character or start image, the action you want to see, and even the way you want the character to move, for the video or animation direction. Panning to the negative prompt, I'll paste the generic negative prompt here from Wan. Let's move to the Wan settings: make sure to change the width and height to match the resized image. This is important so you do not get an error like I experienced before. I'll change this to 33 frames as the duration, but you can always increase this. Next, let's focus on the KSampler. I'll change the seed to 466 and set it to fixed. For CFG, I'll use 6. For the sampler, I'll be using uni_pc, recommended for Wan 2.1, then set the scheduler to simple. Denoising I'll leave at 1. Let's also modify the Video Combine settings: I'll use a frame rate of 16, recommended for Wan 2.1 video generations, then add a folder to the filename prefix to save my videos, and change the format to H.264. Everything else will be fine. So, are you ready to see the outcome of all this? Let's move up here and hit Run to see what this will look like using Wan 2.1. If you run this for the first time, loading the models will take some time, maybe longer, so be patient; you can check on it once in a while during the rendering process. So, I'm going to skip ahead to the results to see what we get.
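For anyone curious how this wiring looks outside the graph editor, here is a small sketch of the three central nodes in ComfyUI's API-format JSON (what you get from Save (API Format) with dev mode enabled), as I understand that format. The node ids, the step count, and the ids referenced from the other loaders are assumptions for illustration; your exported workflow will use its own numbering.

```python
# Sketch of the ModelSamplingSD3 -> WanFirstLastFrameToVideo -> KSampler wiring
# in ComfyUI API-format JSON. Assumed node ids for the rest of the graph:
# "1" = Load Diffusion Model, "2"/"3" = positive/negative CLIP Text Encode,
# "4" = Load VAE, "5"/"6" = CLIP Vision Encode (start/end),
# "7"/"8" = resized start/end images.
workflow_fragment = {
    "10": {  # shift the sampling schedule, as set in the video
        "class_type": "ModelSamplingSD3",
        "inputs": {"model": ["1", 0], "shift": 8},
    },
    "11": {  # combine prompts, CLIP Vision embeds, and the two frames
        "class_type": "WanFirstLastFrameToVideo",
        "inputs": {
            "positive": ["2", 0],
            "negative": ["3", 0],
            "vae": ["4", 0],
            "clip_vision_start_image": ["5", 0],
            "clip_vision_end_image": ["6", 0],
            "start_image": ["7", 0],
            "end_image": ["8", 0],
            "width": 1280,   # match the resized images
            "height": 720,
            "length": 33,    # 33 frames, as chosen in the video
            "batch_size": 1,
        },
    },
    "12": {  # sampler settings from the video
        "class_type": "KSampler",
        "inputs": {
            "model": ["10", 0],
            "positive": ["11", 0],
            "negative": ["11", 1],
            "latent_image": ["11", 2],
            "seed": 466,          # seed used in the video, set to fixed in the UI
            "steps": 20,          # assumption: the default step count is kept
            "cfg": 6,
            "sampler_name": "uni_pc",
            "scheduler": "simple",
            "denoise": 1.0,
        },
    },
}
```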
All right, this is all done. Let's go ahead and right-click to resume the preview here, and let's see the results from Wan 2.1. Do you see how smooth this is? We can see there is no warping between the starting and ending images. I am particularly impressed with the background consistency and how Wan 2.1 is simply animating the character in the scene.

Moving on, have you used upscaling to smooth frames and make your whole animation feel more intentional? To do this, visit the page here on GitHub. Scroll down, and this is a basic workflow process. Move down on the page and copy the link here to install the custom node. Go to your ComfyUI directory and open the custom_nodes folder. Up here in the directory path, let's type cmd, paste the link we copied, then press Enter to clone the custom node. You will see "done" once this is completed, and then you can close the command terminal. You should see the custom node here in your folder if it installed correctly. So let's close everything from here: close ComfyUI as well, and do not forget to close the command terminal. Then restart ComfyUI.

Are you ready for an easy way of upscaling? For the first node, search Load Video. Click to load a video created by Wan 2.1. Double-click next, then search video upscale; I'll select the node here from the custom node we just installed. Then go ahead and connect the video node into this. Next, select the upscale model you want to use here. Let's drag this out, then search "free" and select the node here, also from the same custom node we installed. Now drag this out, and it links to the Video Combine node. I'll still use a frame rate of 16, rename this to whatever you prefer, and change the format to H.264. I'll pull out from here, and then we are ready to see the upscale results. So go up here, then click Run. I'm sure this was easier than you thought, right?

All right, cool results. This is done. I'll zoom in here to get a close-up look, so we can see much better clarity and cleaned-up details. I'll go ahead and check the upscaled video size as well. We can see this is higher than 1080p but not 4K; it's sitting between 2K and 4K resolution, and we can go full screen here even with a lower resolution than that. So, this was all simple and easy with the upscale technique, right?

I hope this is all valuable. The trick here is to make sure your background stays close to the same when you inpaint, without many changes. I realize using TeaCache might be faster, but the results might not be as decent. However, I'll provide both workflows, with and without TeaCache, in the description. I always appreciate your likes if you found this beneficial. And if AI video generation sounds a little impossible to you, you can watch this video to learn how to get more control over things. And I look forward to seeing all of you in the next video.
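As a small extra, the resolution check near the end of the video can also be done with a few lines of Python, assuming opencv-python is installed; the file path is a placeholder for your own upscaled output from the Video Combine node.

```python
# Quick check of the upscaled video's resolution, frame rate, and frame count.
import cv2

cap = cv2.VideoCapture("output/upscaled_video.mp4")  # placeholder path
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)
frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
cap.release()

print(f"{width}x{height} @ {fps:.0f} fps, {frames} frames")
# Expect something between 1080p and 4K, depending on the upscale model's scale factor.
```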
Wan 2.1 ComfyUI for Story-Driven Motion First + Last Frame to Video
Channel: goshnii AI