Transcript of Create HYPERREALISTIC Consistent AI Characters - FREE & LOCAL! [Full ComfyUI Masterclass 2025]
Video Transcript:
Creating hyperrealistic, consistent characters is now easier than ever. All you need is one input image in any style, and my workflow will automatically generate the perfect dataset: high-resolution, upscaled images in different settings and poses, complete with detailed captions. If you want, you can even load in pose references or dress your character using the virtual try-on section of the workflow. All of this is based on entirely free, open-source models that you can run on your own computer. And by the end of this tutorial, you'll know how to create perfectly realistic humans or stylized characters, how to train your own LoRA models of these characters if necessary, and how to generate extremely realistic and perfectly consistent HD or even 4K videos of your characters, all locally on your own computer. So, make sure to subscribe and let's get started.
All right. So, this is made possible by the new instruction-based image models like Seedream 4 or Nano Banana by Google. These models let you input one or multiple images and then say something like, "Hey, show me a side view of this character" or "Put this character in this scene." And this works really well, but these tools are heavily censored, generating images can be a bit tedious, and it can get expensive since these are closed-source commercial tools. So, let's look at the free option, Qwen Image Edit. This is an open-source model by Alibaba, and it lets you merge different characters into one scene, put characters in all sorts of different backgrounds, make them wear all kinds of clothes, or even extract poses from one image and then apply them to your characters. Here, I'm using my free Qwen Image Edit workflow for ComfyUI, and I'm going to show you how to set it up in a minute. But first, let me show you how easy it is. All you need to do is just drag and drop in the images that you want to reference. So let's say we want to marry these two characters. And maybe you remember this guy.
This is Hans Schmankal from my first ever consistent character tutorial. And then you just need to connect them to this text encode node here like this. And then you can write a simple prompt in natural language: "They are holding hands. Getting married. Wedding photography, Pixar style." Let's do that. This worked perfectly. They're holding hands. And now let's make them kiss. I just change out the prompt to "Create an extreme close-up of the faces. The characters are kissing." And now I'm going to use only one reference, the image that we just created. Let's see what happens. Well, that's an awkward kiss. Eyes are closed. Okay, that's much better. So, you can see how easy it is to create entire stories using this technique. For realistic characters, this will work as well, but you can see it still has a little bit of that plastic AI look. I'm currently working on fixes for that, but in most cases, training a custom model for your character, a LoRA, is still the best way to get maximum quality and character consistency. It also gives you flexibility with the model you want to use for generating images or videos, which is something that this model, Qwen Image Edit, can't do, of course.
So, I've built this automated workflow for ComfyUI that'll let you input one image and then create the perfect dataset, including captions, with one click. But, of course, you still need to set it up one time. So, let me show you how to do that. First, install ComfyUI if you haven't already. We have a step-by-step guide on our website explaining how to do that. Once you have it installed, download the workflow file from the link in the description and drag and drop it into the ComfyUI interface. Now you'll need to install a few custom nodes. To do that, simply go to the ComfyUI Manager, click "Install Missing Custom Nodes", select all of them, and install. Restart ComfyUI and the workflow is here. Now, we need to download the required models for this workflow.
And you can find all of them to the left here in these yellow boxes. Each one shows you the link where to download the model, and it will also tell you where to put this model in your ComfyUI folder structure. Right here, I just copy the link. And to make sure this workflow works for most of you, I'm using the GGUF version of the main model here. GGUF is basically a way to compress models, making them smaller at the expense of some quality. And you want to select the version that comfortably fits within your GPU's VRAM. I have a 4090, so I could easily use the Q8 version or even the full FP8 version. But to show you what kind of quality you can expect, I used the Q5 version for this whole workflow and all the demonstrations. Everything you've already seen was made with the GGUF Q5 version. So download the version you want to use. Go to your ComfyUI folder, then models, in this case unet, and I created an extra folder called GGUF and put my GGUF versions in there. You don't have to create this folder, but in any case make sure, once you've copied it there, to press R to reload the ComfyUI window and then select it again here in this node. And this is basically the step that you're going to do for all of these models. These are the models for the first part of the workflow, where you generate all the images.
This second part here is the dataset creation group, and this will create the captions and the upscaled images of your character. For this you also need a few models, and you can also find them in the notes here. These are direct download links, so you can just click and you can see the model starts downloading. I already have that, so I cancel it. And then you need to put it in the right location. Again, check if you have the correct model selected. This model should download automatically once you first run the workflow. And this one right here, the UltraSharp, you can get in the ComfyUI Manager. So just go to Manager, Model Manager, and search for UltraSharp.
And there it is. And then you can just click install, refresh ComfyUI, and you have that. Okay. So now we can start using this workflow. We're going to work from top left to bottom right. So let's start at the top left here. First, we need to input an image, and you can just drag and drop that in there. This could be an image in any style. Next, you want to come down here, where you have the option to create a prompt for the character. The default is "Make an image in this style of the character." That will work fine in most cases, but if you have a stylized character, it would make sense to name the style here. Next, and this is really important, you need to give your character a name. This will also be your trigger word for LoRA training (more on that later), and it will also be the name of your output folder where all the images that you create here will be saved. It doesn't have to be something as cryptic as what I'm using here, but make sure to use something that is unique. That's pretty much all we need to do. Let me just click run and then I'll explain what happened.
And now it's done. So, let's check what happened here. Let's start at the top left. You can see it created a character turnaround sheet. The problem is that with the GGUF versions, it will sometimes double the poses here, but that's not that big of a deal because we're generating all these poses separately later in the workflow. What is important in this group, though, is that this is an opportunity to add additional clothing detail for your character. Think about this: you're only giving it a part of your character here, so somehow the model needs to know what the rest should look like, right? Just add that right here at the end of the prompt. I added "wearing chunky sneakers" because otherwise she would wear black shoes and I just wanted her to wear sneakers. Then the different views of this character are created based on these prompts. Feel free to change these prompts.
They're not set in stone. You can refine them if you want. All of these groups are modular. So if you need more images, you can just copy a group like this one here and change the prompt. So maybe let's add a prompt like this. Run the workflow again. And you can see it remembers that all of these groups are already calculated, so it will start right here at the new group that you just created. And it created this beautiful image right here. Let's move to the bottom row, to the left here. You can see the different emotions of your character will be created, and you can change them by just changing the slider here. Let's make her say "ah" and lower her eyebrows. Click run again, and you can see we changed the emotion. These emotions will be applied to the portrait image that we created up here in front of white. But if you want, you can come in here to the get close-up group and switch the image out. We could, for example, load in the input image and click run, and then it will do the same process but with our input image. To the side here, we have additional poses: a T-pose, character lying down, side view, back view, walking in nature.
And then we have these green groups here. The left one is the virtual try-on group, and this is super easy. You just drag and drop in a clothing item. I would recommend one in front of a simple, monochrome background, but other references will work as well, as long as you name the item in the prompt. So here I said, "Make the character wear the gray winter coat." And this worked very well. To the right here, we have the pose transfer group. Here you can just load in a reference image of a certain pose you want to have. It will be extracted and applied to your character. Okay, we generated a bunch of images here. You can find these images in your ComfyUI output folder under the name that you gave the character. Now, we played around here a little bit with generating different images.
Maybe you want to use all of them for your dataset. Maybe you only want to use the images that are currently displayed. To save out only those, we just need to change the name of the character. That's why I always add this number here. Click run again, and it will create a new folder in your ComfyUI output folder with only the current images in there. It's important that you run this workflow in two parts. So first generate the images, and then come down here and click this button right here, dataset creation. All you need to do for this group is come to the prompt, where you can add a prompt. Then all you need to do is click run.
So now it's done. So what happened here? Well, first of all, all the images we created in the top groups were loaded in here, and you can see we have 24 images. Then detailed captions are created for each of the images, and the trigger word is added to the caption. Then the images are upscaled to the resolution that you set here, in this case 2K resolution, based on the prompt you give it here combined with the captions that were created here. And you can compare the before and after here. So let's compare A15 with B15. And you can see this adds a lot of quality, a lot of detail. A common problem with Qwen Image Edit is that the skin looks a bit plastic. But look at this. Now, using this upscale setup, we not only get a lot more detail in the eyes and the eyebrows, but we also get much more natural skin tones. The upscaling process is done with Flux in combination with USO, which is a model that helps the character stay consistent during upscaling. So compare your character, and if it changes too much, what you can do is increase the start step here. We have 20 steps in total, so we could do something like 18, and this will only make minor changes. If you want to add more detail, what you can do is lower that number to something like 12 or 13.
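To put numbers on the start-step setting just described: with 20 steps in total, the start step decides how many of those steps still rework the image. Here is a minimal sketch of that arithmetic in Python. It assumes the sampler simply skips every step before the start step, and the function name is made up for illustration; it is not part of the workflow.

```python
def refined_fraction(total_steps: int, start_step: int) -> float:
    """Fraction of sampling steps that still run when starting at start_step."""
    if not 0 <= start_step <= total_steps:
        raise ValueError("start_step must be between 0 and total_steps")
    return (total_steps - start_step) / total_steps


# The values mentioned in the video: 20 total steps, start step 18, 13 or 12.
for start in (18, 13, 12):
    share = refined_fraction(20, start)
    print(f"start step {start}: {share:.0%} of the 20 steps rework the image")
```

So a start step of 18 leaves only the last two steps to touch the image, which is why the character barely changes, while 12 hands eight steps to the upscaler and gives it more room to add detail.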
You can find all the images that you created in the folder that is named after your character. In it you will find the upscaled images folder, and this is your full character dataset. We have all the individual images, and next to each one we have a file with the same name containing the caption for that specific image. Now, with this dataset, it is super easy to train your own LoRA for pretty much any kind of AI model. A LoRA is basically a small add-on model that gives more context for a character, a location, or anything else that you train it on. In my last video about creating consistent characters, I showed you how to train a LoRA for the Flux image model using FluxGym. FluxGym is super easy to install using Pinokio, a tool that lets you install AI tools with one click. And then you just need to start it and drag and drop in your dataset. And the cool thing about training a LoRA for Flux is that it works even on older GPUs with only 12 GB of VRAM.
But today I want to show you something different. I'm going to train a model for the Wan video model. This allows you to not only generate amazing videos of your character, but also images with insane detail and realism. For this, I'm going to use AI Toolkit. This is an easy-to-use LoRA trainer for most AI models. You can install it locally following their installation steps, but there's also a one-step installer that will set everything up for you. However, training a Wan LoRA is very resource intensive. Even on my 4090, I had to activate low VRAM mode and I could only train on images of 512 by 512 pixels, and even then it took a few hours. So, I mostly use RunPod to train my LoRAs, which also has the benefit that it's a lot faster. Now, this will cost you a few dollars, but in my experience, it's worth the price. So, once you have an account, you need to go to billing and add a few dollars, and 10 is more than enough. You can train a few LoRAs with that.
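Going back to the dataset layout described above, where every image sits next to a caption file of the same name and every caption contains the trigger word, it can be worth checking the folder before uploading it anywhere. The following is a rough Python sketch of such a check. It assumes the captions are `.txt` files and that the images use one of the listed extensions; both are assumptions, and the function name is hypothetical.

```python
from pathlib import Path

# Assumed image extensions; adjust to whatever the workflow actually saves.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def check_dataset(folder, trigger_word):
    """Return (images without a caption, captions missing the trigger word)."""
    missing_caption, missing_trigger = [], []
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Assumption: the caption has the same name with a .txt extension.
        caption = image.with_suffix(".txt")
        if not caption.exists():
            missing_caption.append(image.name)
        elif trigger_word not in caption.read_text(encoding="utf-8"):
            missing_trigger.append(caption.name)
    return missing_caption, missing_trigger
```

Calling `check_dataset` with the path to your upscaled images folder and your character name returns two lists; if both are empty, every image has a caption and every caption mentions the trigger word.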
And once we have our money, we can go to pod templates and search for AI Toolkit. And then we have this one here by Ostris. This is the creator. We can just click on that and then click deploy pod. Now, we can select a GPU that we want to run this on, and a 5090 works fine. I'm going a bit overkill and I'm going to take this one right here, just because it's a little bit faster. Once you've selected that, you can go to change template, scroll down, and click edit, and the only thing that you need to change here is the password, under environment variables. That's all you need to do. Then you can click on deploy on demand. And once that's done, you can click on HTTP service and this will launch AI Toolkit. Now you need to put in your password, and this is the interface.
Now we need to create a dataset. For that I go to datasets and click new dataset. You just need to give it a name, and then you can select your images and just drag and drop them in here. Now, we don't need all of these images. Take some time to look through them, and just delete the ones that look slightly weird or off. Also, we have a lot of these close-up shots here that are really, really similar, so we don't need all of them. We also have the back view and the side view, so maybe we don't need this image right here. That is a bit weird. Delete that. Okay. So, this is the final dataset that I'm going to use. And now we can just click on new job and name this LoRA. I usually name it the same as the trigger word, but it doesn't matter. Next, we're going to choose the model that we want to use. Now, Wan 2.2 is currently the newest open-source Wan model that we can use. However, it consists of two separate models that work together. So, to be more flexible with how I can use this LoRA, I want to train a LoRA for the older version, the 2.1 version, because this is also compatible with the new version.
So I'm going to select this Wan 2.1. AI Toolkit will save out intermediate versions of the LoRA during training, because sometimes you can overtrain, and the newest version does not necessarily have to be the best version of the LoRA, though usually it is. We can set after how many steps it will save out a version, and I'm just going to say 500 here. For the dataset, we need to select the one that we just created. Next, we can come down here to samples. During training, AI Toolkit will generate example images with the LoRA it just created, so you can track the progress and see how well it trains, and you can set after how many steps these samples will be created. I will match this with the save interval of the LoRA that we set here. Currently it's set so that it will actually generate a video instead of images, which would just take a long time, so you can just set this number to one. Finally, down here we have all the prompts for the example images. What I want to do is take the trigger word and then just adjust the prompts a little bit to fit our character. "A woman holding a coffee." "A horse is a DJ." We don't want to see a horse, so I'm going to do this. And then I'm also going to delete some, because these are just a lot of images and we don't need so many. Okay. So these are my example prompts, and that is all we need to do. Now you can just click create job, and then you can click this start button right here and it will just start training.
Now, if you have any questions about LoRA training, or you want to use the dataset we just created to train a different kind of LoRA, I highly recommend you check out the Ostris AI YouTube channel. He's the creator of AI Toolkit, and he has a bunch of amazing walkthroughs and tutorials showing you how to use this tool to train all sorts of different LoRAs. After one and a half hours, LoRA training is done and you can check the samples.
So, here are the first samples, which are generated before training even starts. And you can see the character is not looking like our character at all. But then with each iteration, every 500 steps, it gets closer and closer to our character. And then even things like the sweater start to shine through. Go back to overview, and here you can find all the iterations. I usually download the final one. In this case, it would be this one right here. Download that. And maybe I'll take the version before that. I mean, we spent like $4 to train this, so we should grab everything we can, right? And once it's downloaded, don't forget to stop the pod so you don't burn through any more money than you have to.
You can use any Wan 2.2 or 2.1 image generation or video generation workflow with this LoRA. I'm just going to show you my free workflow, which you can download via the link below and install in the exact same way. So, you just drag and drop that in there and install missing custom nodes. And then you need to download these models right here. Since I'm aiming for realism with this specific workflow, you need these two LoRAs right here, Lenovo UltraReal and Instareal 2.2. Then make sure to go to where the high-noise model is loaded, and here we want to load in our character LoRA. This is just the one that I downloaded from RunPod. Now we're going to the low-noise model, and here again we are using the same character LoRA that we trained. Now all you need to do is create a detailed prompt like this. So I have my trigger word here for the character, and I just have basic descriptions of my character and the style that I'm going for, and at the end here you can see Instareal and Lenovo. These are the trigger words for these two LoRAs right here. Then there are the samplers. Currently, you don't have to change anything there. And then I also added this post-processing group right here. This will add chromatic aberration.
And this just means that the color channels are a bit separated, creating this imperfect look around the edges. Here, it just makes it a bit more realistic. Next, I'm artificially sharpening the image a little bit. You can deactivate that using Ctrl+B if you don't want it. I have a little bit of bloom, so bright parts of the image will glow a little bit. And then finally, and most importantly, we add a little bit of grain. This just really helps to make it more cinematic. If you have a prompt that you like, you can just change the seed and hit generate again. And now we have a different variation of this image. Also looks really, really good. Since these prompts are so long, it might be helpful to use a large language model to help you with prompt creation. So, I just copied in the prompt and said, "Use this prompt structure to put this character in different situations, highlighting different kinds of framings." And now we can just copy over this prompt and see what Claude wants to create for this character. So, now we have something like this.
Now, these images look really cool, but they take some time to generate. There's actually a way to reduce that time by a lot, and for this, you can use these lightx2v LoRAs. There's one version that lets you generate images and videos with only four steps. Right now, we're using 30. I personally still like to use this one. So, download this, and then you need to load it in right here in both the high-noise and the low-noise LoRA loaders. And now you can actually set the maximum steps to eight. In the first sampler, we want to stop at step three, and we need to set the CFG to one. Do the same thing here: steps eight, CFG one, and for start at step we want to put in three. And now, using Ctrl+B, we're going to activate this right here. So let me just run that. So instead of 162 seconds, this only took 53 seconds, and it still looks pretty great. Now, the cool thing is that Wan 2.2 is actually a video model.
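The sped-up sampler settings described above (eight steps in total, the first sampler stopping at step three, the second one starting at step three, CFG one on both) can be written down and sanity-checked in a few lines. This is only a sketch in Python: the field names are illustrative and do not claim to match the node's real parameter names, and it assumes the low-noise sampler runs through to the last step.

```python
from dataclasses import dataclass


@dataclass
class SamplerStage:
    name: str
    steps: int          # total steps in the shared schedule
    start_at_step: int  # first step this stage handles
    end_at_step: int    # step at which this stage hands over or finishes
    cfg: float


def check_handoff(high: SamplerStage, low: SamplerStage) -> None:
    """Raise if the two stages do not cover the schedule exactly once."""
    if high.steps != low.steps:
        raise ValueError("both stages must use the same total step count")
    if high.start_at_step != 0 or low.end_at_step != low.steps:
        raise ValueError("stages must start at 0 and finish at the last step")
    if high.end_at_step != low.start_at_step:
        raise ValueError("low-noise stage must start where high-noise stops")


# Values from the video; end_at_step=8 for the low-noise stage is an assumption.
high_noise = SamplerStage("high noise", steps=8, start_at_step=0, end_at_step=3, cfg=1.0)
low_noise = SamplerStage("low noise", steps=8, start_at_step=3, end_at_step=8, cfg=1.0)
check_handoff(high_noise, low_noise)
print("high noise handles steps 0-3, low noise handles steps 3-8")
```

The point of the check is the handoff: if you change where the first sampler stops, the second sampler's start step has to move with it, otherwise steps get skipped or run twice.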
So, we could use this exact workflow to also create videos of our character. All we need to do for that is increase the length right here, let's say to 41 frames. And of course, we don't want to save images. Instead, you could just add a Video Combine node like this. And I'm going to bypass these effects right here. Now we could take this prompt and ask Claude, in this case, to rewrite it as a video prompt like this. And now we can run this, and it's done. We have a pretty cool video with a very interesting camera move, and it's actually pretty close to what Claude had in mind. So I guess it worked really well. Now, of course, you don't have to switch out all these nodes in the image workflow. Again, I prepared a free video workflow for you as well. This is already preconfigured, so you can just load in your character LoRA and the other LoRAs, put in a prompt, and generate a video with it.
And now is a good time to mention that I also have advanced versions of this and all the other workflows that I showed you today on my Patreon. These feature extra upscalers and other experimental features for maximum quality. However, for all the demos that I showed you today, I only used the free versions, so they are enough to try everything out. As an additional thank you for your support, you'll also get access to all the LoRAs that I trained and the datasets I used, including the prompts, and you'll also get access to our amazing Discord community. So, thank you for making these videos possible. And let's take a quick look at the advanced video generation workflow that I use to generate 4K videos of my character. This workflow is very similar, but it features an upscaling setup that will break down the video into small parts and then upscale them one by one, keeping VRAM usage pretty low. So, yeah, this is how I was able to generate these 4K videos.
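The idea behind that upscaling setup, breaking the video into small parts and upscaling them one at a time so that only one part is in memory, can be sketched in plain Python. This is not the advanced workflow itself: it assumes the parts are consecutive groups of frames (the real workflow may split the video differently, for example into tiles), and `upscale_chunk` is a hypothetical stand-in for whatever upscaler you use.

```python
def split_into_chunks(num_frames: int, chunk_size: int) -> list[range]:
    """Split frame indices 0..num_frames-1 into consecutive chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        range(start, min(start + chunk_size, num_frames))
        for start in range(0, num_frames, chunk_size)
    ]


def upscale_video(frames, chunk_size, upscale_chunk):
    """Upscale frames chunk by chunk so only one chunk is processed at a time."""
    result = []
    for chunk in split_into_chunks(len(frames), chunk_size):
        # upscale_chunk is a placeholder: it takes a list of frames and
        # returns the upscaled frames in the same order.
        result.extend(upscale_chunk([frames[i] for i in chunk]))
    return result
```

With the 41-frame clip from earlier and a chunk size of 16, `split_into_chunks(41, 16)` gives three parts covering frames 0 to 15, 16 to 31, and 32 to 40. Peak memory then depends on the chunk size rather than on the length of the whole video.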
So, don't forget that even though I showed you this realistic character, using this workflow I showed today, you can generate all kinds of styles and characters. So, have fun generating and feel free to tag me in your work. I always love to see what you come up with. Thanks again to our lovely Patreon supporters who make these videos possible. If you want to get access to the exclusive example files, advanced workflows, and Discord community, consider supporting us. And see you next time.
Channel: Mickmumpitz