Transcript of Wan Move: Precise Motion Path Control for AI Video | Master Wan Move Trajectories & Camera Control
Video Transcript:
Today I'm introducing a new trajectory-guided video generation technique called Wan Move. The biggest characteristic of this model is its wide range of functions. First, it supports the movement of a single object, and within single-object movement it handles motion across a very large scale while following the laws of physics more faithfully. It also supports the movement of multiple objects: when there are several objects in your frame, you can draw a motion trajectory for each one to guide how and where it moves. Additionally, it supports action transfer, 3D object rendering, and even camera animation. Overall, the functionality of this model is very powerful.

Using trajectories to guide animation generation is not a new technique. ByteDance previously released a technology called ATI, which also guided video generation with drawn trajectories. We once used ATI to generate some very dreamy videos in which the subject stayed still while the surrounding environment moved. However, the similar technologies we discussed before had two major drawbacks: drawing the trajectory was not easy, and precise control was lacking. Both of these drawbacks have been significantly improved in this new model. Take a look at this video: it can precisely control certain actions in the frame, and even camera transitions. It can also precisely control changes to the characters in the frame, even when the trajectory is extremely complex.

What you are seeing now is the Wan Move page on GitHub. From this page we can pick up some useful information. For example, Kijai's WanVideoWrapper extension already supports this model. You can open the extension, click on the example workflows, and you will find a Wan Move workflow. For model downloads, Kijai provides two versions, FP16 and FP8; choose whichever suits your hardware. I will use the FP8 version for this demonstration. You simply need to update the extension to the latest version, drag the example workflow into ComfyUI, and you are ready to go.

Similarly, I have built this workflow on RunningHub, which you can access online. In the ComfyUI space, RunningHub is an excellent online workspace because it keeps up immediately whenever a new model or extension appears. You can register for RunningHub using the invitation link in my video description and receive 1,000 free points, plus a bonus of 100 points just for logging in daily, which lets you try out your own workflows.

First, be aware that although the model claims to offer many capabilities, what we can reproduce inside ComfyUI is still limited, so we will pick a few representative examples to walk through. For the first example, let's use a trajectory to control a single object and generate a video of reasonably good quality. This workflow is my reconstructed version, but its basic structure is the same as Kijai's original.

Now let's look at the model loading. For the main model we are using the FP8 version we just mentioned. I have enabled SageAttention here; if you haven't installed SageAttention, you can select SDPA instead. We also have a LightX2V acceleration model, and block swapping is enabled with the number of swapped blocks set to 25.
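To make the block-swap idea concrete, here is a minimal, illustrative PyTorch sketch of what "swapping 25 blocks" means in principle: a set of transformer blocks is parked in system RAM, and each one is moved onto the GPU only for its own forward pass, then moved back out. This is not Kijai's actual implementation; the toy MLP blocks and the total block count of 40 are placeholders, not the real Wan architecture.

```python
import torch
import torch.nn as nn

class SwappedStack(nn.Module):
    """Toy model of block swapping: offload some blocks to CPU between uses."""

    def __init__(self, num_blocks=40, blocks_to_swap=25, dim=256):
        super().__init__()
        # Stand-in "transformer blocks"; the real model's blocks are far larger.
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(num_blocks)
        )
        self.blocks_to_swap = blocks_to_swap

    def offload(self):
        # Park the first `blocks_to_swap` blocks in system RAM to save VRAM.
        for block in self.blocks[: self.blocks_to_swap]:
            block.to("cpu")

    def forward(self, x):
        device = x.device
        for i, block in enumerate(self.blocks):
            swapped = i < self.blocks_to_swap
            if swapped:
                block.to(device)   # pull the block onto the GPU just before use
            x = block(x)
            if swapped:
                block.to("cpu")    # push it back out to free VRAM again
        return x

device = "cuda" if torch.cuda.is_available() else "cpu"
model = SwappedStack(blocks_to_swap=25).to(device)
model.offload()
x = torch.randn(1, 256, device=device)
print(model(x).shape)  # torch.Size([1, 256])
```

The trade-off is lower VRAM use in exchange for the time spent shuttling weights between system RAM and the GPU, which is why larger swap counts make generation slower.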
Next we have our encoder and VAE. Below that is the reference image, which we first need to resize. The resized image is then passed to a node called the Spline Editor, which is used to draw the trajectory. If you hover your mouse over this node, you can see information on how to add splines and how to add points. Let me briefly explain.

We can generate a new canvas using "New Canvas", and then right-click and select "Background Image" to upload an image to use as the background. Note that this node cannot be zoomed: you cannot drag it larger or smaller, you can only control its actual width and height using the width and height parameters here. Mine is currently set to 480 by 382. So first we create a new canvas, and after creating it we can change the background image. Note that after you change it, the node first displays the image at its original size, and only after a moment does it switch to the size you set. Now I will switch it back; we wait a moment and it returns to the original size.

You can also choose not to change the background image manually. Since my current image is not 9:16, it appears distorted. In that case you can connect the resized image to the BG image input here, which is the background image. To do this you need to run the workflow once to get the desired background image. Let's run it now so you can see: the image has been replaced. The downside of this method is that you must run the workflow once first to get the image. The best approach is to start with a standard 9:16 image, use it as the reference image, and also use it as the background; that way you don't need to run anything in advance.

Next, let's look at drawing the spline. After you create a new canvas you get a default spline that looks like this. We can, for example, adjust the starting position here and shape a trajectory. Now for the basic point operations: there are two ways to add a point. The first is to hold the Ctrl key and left-click, which adds a point in the middle of the spline. You can also hold the Shift key to add a point, which adds a new point on either side of the spline. If you want to delete a point, select it and right-click. Now let's quickly draw a spline. Once drawn, we get a string representing the trajectory, which we will use later (a sketch of what this string might look like follows after this first example).

Next is our reference image. As mentioned, after resizing it we first run it through the visual encoder, and connect the encoding result to the clip embed input of the WanVideo image-to-video encode node. The image itself is connected to the start image input. After passing through this node we get an image embedding, which you can think of as a latent. The most crucial node is the one you see now: its function is to inject our trajectory information into that latent. After processing, we get a new image embedding, and then we can proceed to sampling. Since the image embedding already contains the trajectory information, we connect it directly to the image embeds input. For the prompt you can write something simple; I wrote something very general here. The prompt's output is connected to the text embeds input. Below that is a standard sampler; note that I set the sampling steps to 4.

Let's look at the generated result: the finger slowly moves downwards along our trajectory, so you can see that this type of control is highly effective. This is the simplest workflow.
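Before moving on to the second example, here is a rough sketch of what that trajectory string could look like and how one might build it programmatically instead of drawing it. I'm assuming the Spline Editor exports its sampled points as a JSON list of {"x": ..., "y": ...} pixel coordinates on the canvas and that the number of points is a free choice; the node's real schema and sampling count may differ, so treat this purely as an illustration.

```python
import json

def line_trajectory(start, end, num_points=48):
    """Sample `num_points` evenly spaced points from `start` to `end`."""
    (x0, y0), (x1, y1) = start, end
    points = []
    for i in range(num_points):
        t = i / (num_points - 1)
        points.append({"x": round(x0 + t * (x1 - x0), 1),
                       "y": round(y0 + t * (y1 - y0), 1)})
    return json.dumps(points)

# Example: a finger dragging straight down on the 480x382 canvas from the demo.
coords = line_trajectory(start=(240, 90), end=(240, 340))
print(coords[:90] + "...")
```

If the node expects a different structure, for example one list per spline, wrapping the output accordingly should be straightforward.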
Next, let's look at the second example. The reference image shows a hummingbird perched on a beautiful woman's finger. I drew the spline on the bird's body; I wanted it to fly away from the woman's hand, so I wrote my prompt accordingly. Looking at the final result, we cannot deny that the guidance provided by the trajectory is present and very strong: the bird does move along our trajectory. However, my prompt did not take effect. The bird didn't fly away; instead it moved while still attached to the woman's hand, which is certainly not what we wanted.

This brings us to the second type of video, which is more complex: controlling multiple objects moving simultaneously. Note that I now have two splines. Some may ask how to add splines. It's simple: right-click and choose "Add new spline", and you will have a new spline; if you don't need it, you can right-click and delete it. Currently I have just these two splines, one placed on the bird's body and one on the hand. To indicate that the hand and the bird need to separate, one of my trajectories points upward and the other points downward. The prompt remains unchanged. In the final result, you will notice that the bird does indeed fly away. However, the way it flies is not very pleasing, because it doesn't flap its wings. So although this model can precisely control the movement of objects, the inherent motion of the object itself tends to be suppressed. This is one of its major issues. If the object were a simple ball, this would be fine, but for a bird the level of detail in its performance is insufficient.

Finally, let's look at camera control. Here we use even more splines, about four of them, which essentially extend outward from the woman's face in four directions. What I intended to convey is that the image is zooming in, meaning the camera is pushing forward (see the sketch at the end of this transcript for a programmatic way to build these four trajectories). In the prompt I did not write any camera information; I only stated that the woman was doing a runway show. In the final result, the camera zoom-in effect is quite prominent. However, I feel the character's inherent movement is still lacking; this is the same drawback we observed with the bird earlier.

Through the examples above, we have tested this model's control over single objects, multiple objects, and camera movement, along with the basic usage of the node. I encourage everyone to try it out. Overall, the model's control strength is adequate and its precision is sufficient, but it suppresses the subject's inherent movement within the video, which is where the issues arise. That's all for today. Follow me to become someone who understands AI.
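As a footnote to the camera-control example above, here is a hedged sketch of generating the four outward "zoom in" trajectories programmatically rather than drawing them by hand: four paths that start near the centre of the frame and push toward its corners, which reads as a camera push-in. It reuses the point-list format assumed in the earlier snippet, so it is illustrative only and may need adapting to the Spline Editor's real input format.

```python
import json

def zoom_in_trajectories(width, height, start_frac=0.3, end_frac=0.75, num_points=48):
    """Four splines radiating from the frame centre toward its corners."""
    cx, cy = width / 2, height / 2
    corners = [(0, 0), (width, 0), (0, height), (width, height)]
    splines = []
    for corner_x, corner_y in corners:
        points = []
        for i in range(num_points):
            t = i / (num_points - 1)
            frac = start_frac + (end_frac - start_frac) * t  # move outward over time
            points.append({"x": round(cx + frac * (corner_x - cx), 1),
                           "y": round(cy + frac * (corner_y - cy), 1)})
        splines.append(points)
    return json.dumps(splines)

# Four outward trajectories on a 480x832 (9:16) canvas.
print(zoom_in_trajectories(480, 832)[:100] + "...")
```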
Channel: Veteran AI