Transcript of Lynx ComfyUI Workflow: Realistic, Consistent Characters, No LoRA Needed
Video Transcript:
Look at this. I just made a realistic video inside ComfyUI using the new Lynx workflow, and the face stays perfectly stable in every frame. No training, no LoRA setup, just one workflow. So let me show you how you can do this step by step.

First, download the Lynx workflow that I created. You'll find the link on my website, aistudnow.com. Once you open it in ComfyUI, you'll see the full layout. This workflow is built to keep the same face across every generated frame while keeping natural motion.

We are using the Wan 2.1 model in FP8 here, the same base model I used in my earlier Wan 2.1 workflow. For the VAE, we also use the Wan 2.1 VAE, and for the text encoder, I am using UMT5-XXL enc bf16. These are the same model files we used in our old Wan 2.1 workflow.

For LoRA models, I am using two files. The first one is the LightX2V LoRA, which you can set between 0.8 and 1.0 strength for balance. The other is the HPS 2.1 LoRA at 0.5; HPS stands for Human Preference Score, and this LoRA improves colors and gives the result a clean, human-like tone. If your result looks too saturated, you can lower its strength to around 0.3.

This workflow adds a few new model nodes inside the WanVideo model loader section. Here you'll see a WanVideo Extra Model Select node where we add the Lynx full IP layers model (FP16), the Lynx full reference layers model (FP16), and the Lynx full resampler model (FP32). Let me explain what each one does. The resampler model reads your face, crops it, and converts it into a clean feature that the Wan model can understand; this keeps your facial identity stable across frames. The reference layers model enhances the fine texture of the image, like skin, hair, shadows, and small fabric details. The last one is the IP layers model, the identity-preserving layer, which tells the model to keep the same person through the whole video.

So you will need to download three models: the IP layers, the reference layers, and the resampler. For the resampler and the IP layers, you'll find both full and lite versions. If you have less VRAM, go with the lite models; if you have enough VRAM, use the full FP16 models. After downloading, save all three inside your ComfyUI models/diffusion_models folder.

Now, in the image load section, upload the image whose face you want to keep consistent in the generated video. I always connect everything through a Resolution Master node, which keeps all images at the same size. Select one of the Wan-supported presets for best results.

What's new in this workflow is a node called WanVideo Add Lynx Embed. Inside it you'll find two key settings: IP scale and reference scale. IP scale controls how strongly the system preserves the same face; if you set it too high, the expressions may freeze, so a good value is between 0.6 and 0.7. Reference scale copies small details like skin texture, hair edges, eyebrows, moles, and fabric details near the collar.

As for CFG, the Lynx CFG scale controls the overall motion strength. I usually set it to 1.6 for active motion, and if you prefer natural movement, go for 1.4 to 1.8. Above 2, motion starts to look unnatural, so I won't recommend that. Start percentage tells Lynx when to start working during denoising, and end percentage tells it when to stop; a good end value is around 0.9.

You will also find a node where you can add your image and prompt. Everything else works the same as in Wan 2.1. I also added a simple duration calculator section; a quick sketch of the frame math follows below.
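Before moving on to the examples, here is a compact recap of the values mentioned above as a small Python snippet. This is just a reference sheet, not real ComfyUI code; the dictionary keys are illustrative names I chose, not actual node parameters, and the start percentage is left as a placeholder because the video does not give a specific value.

```python
# Hedged recap of the Lynx settings discussed in this tutorial.
# The key names are illustrative only; they are not actual ComfyUI node fields.
LYNX_RECOMMENDED_SETTINGS = {
    "lightx2v_lora_strength": (0.8, 1.0),  # balanced motion
    "hps_2_1_lora_strength": 0.5,          # drop to ~0.3 if colors look too saturated
    "ip_scale": (0.6, 0.7),                # too high and expressions may freeze
    "reference_scale": 1.0,                # use 0.8-0.9 for slight creative variation
    "lynx_cfg_scale": 1.6,                 # 1.4-1.8 for natural motion; above 2 looks unnatural
    "start_percent": None,                 # placeholder: no specific value given in the video
    "end_percent": 0.9,                    # when Lynx stops acting during denoising
}
```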
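The duration calculator itself is simple arithmetic: seconds times fps gives a frame count. Here is a minimal Python sketch of that math. The snapping to a 4n + 1 frame count is my assumption about how Wan-family models count frames (for example, 81 frames for roughly 5 seconds at 16 fps); the actual node in the workflow may handle this differently.

```python
def frame_count(seconds: float, fps: float) -> int:
    """Sketch of a duration calculator: convert seconds and fps to a frame count.

    Assumption: Wan-family models expect frame counts of the form 4*n + 1,
    so the raw seconds * fps value is snapped to the nearest valid count.
    """
    raw = round(seconds * fps)
    n = max(0, round((raw - 1) / 4))
    return 4 * n + 1

print(frame_count(5, 24))  # 5 seconds at 24 fps -> 121 frames
print(frame_count(5, 16))  # 5 seconds at 16 fps -> 81 frames
```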
If you want a 5-second video at 24 fps, just enter those values and the system automatically calculates the correct number of frames. You can change it to 16 fps or 30 fps if needed. Right now Wan supports clips of up to 5 seconds, so I am using that length.

For my first example, I want to create a cozy cafe scene. Here is my prompt: a person sits beside a large window in a cozy cafe, city lights blur and shimmer through the steady rain outside, and they glance at their phone and laugh with someone off screen. I hit run, and you can see the result: the face is a 100% match with our reference image, the motion follows the prompt perfectly, and the reflections, the lighting, and the laughter all feel natural.

In my second example, I want a rainy platform scene. My prompt is: a close-up on a rainy platform at night; the person faces the camera, closes their eyes, and feels the wind as a train passes behind. When I hit run, the video looks incredible. The face remains identical to the reference image, the wind motion feels real, and the colors have that perfect cinematic grade.

If your reference image has special features that you want to keep, like a tattoo near the eye, here is what you can do: go to WanVideo Add Lynx Embed and set reference scale to 1. This keeps every small detail, like skin texture, posture, eyes, and tattoos, exactly the same. If you want the AI to modify the look slightly, reduce reference scale to 0.8 or 0.9; that gives a bit of creative variation while keeping identity stable.

So that's how you can use Lynx in ComfyUI to generate realistic videos with a consistent face. That's all for today's video. If this helped you, subscribe to the channel for more ComfyUI tutorials and updates.
Channel: ComfyUI Workflow Blog