
Transcript of Wan Alpha in ComfyUI - Videos with Transparency

Video Transcript:

Hello friends, welcome back to the Comfy Org stream. I'm here with Julian, and I'm Purz, and today we're going to hang out and talk about Wan Alpha. But first, the elephant in the room: Sora 2 came out yesterday. We are going to get to Wan Alpha, but we should talk about how crazy Sora 2 is, and how — what's the word — divergent it is from our idea of what a video model is. Have you had a chance to play with it much, Julian? A little bit, yeah. So what do you think of the social media aspect of Sora? I think it's very interesting that they've gone this way with it. My brain sees it not as a professional tool but as a fun thing — a tool where you create your own little social media videos. It's not professional, it's just fun, and it's good for that. Yeah, it's interesting, because it does one of the things we always talk about: putting creation in the hands of everyone, and then the best idea wins. There's a beauty in that chaos. Of course people are going to make absolutely insane things — they have been, and I've been making insane things myself. It's super fun. The Cameo stuff is really cool too, because if you consent, you can upload your face and even let other people start remixing it. I've already had some people do crazy stuff — @PurzBeats if you want to remix my face. It's pretty fun, and it's cool because you can then insert yourself alongside your friends who've done it as well. That brings a whole new aspect to the thing we all enjoy about AI video: the iteration process.
You get to iterate, but with your friends — it's like multiplayer. Yeah, a multiplayer slop generator, and for that, man, it is cool. I can't wait to see what people do with it. Neco in the chat says it reminds them of Vine — yeah, it's definitely got that fast, short-form thing. Anyway, I don't want to talk about Sora for the whole stream, but it is a very notable moment. Go play with it if you can get in, and if not, ask a friend — I'm out of invite codes, so don't bug me; mine got used up on day one. So let's jump over to Wan Alpha. There it is. Wan Alpha is, as it says, high-quality text-to-video generation with an alpha channel. That sounds like nothing special — until you realize you can now make layers. You can make a background of anything you like, then a subject for that background, then another thing in front of that, and another in front of that. You can start thinking in layers the way a film compositor would. The point is that we can see through the backgrounds of everything, so you can stack elements on top of each other. And as you can see, it does a very good job with translucency as well as transparency — you can actually see through the bubbles. It's really cool. This is based on Wan 2.1 14B, so the resolution is as high as you can render with that model, though it was probably trained around 480p to 720p; I don't know.
I'd love to see your experiments if you get anything super high-res out of it. Wan 2.1 14B is a good-sized model to be pushing resolution with, and there might be a way to do tiled upscaling for the videos, but the issue you'll run into there is that this is a dual-VAE setup: an RGB VAE and an alpha VAE. We'll talk about all of that — I don't want to get too into the weeds right now. Basically, you can either run it by downloading the models and driving them from Python with torch and diffusers, the way research data scientists do, or you can just run it in Comfy. The link is in the description below: scroll down to where it says "official ComfyUI version" and it tells you exactly which files you need and exactly where those models go. Click the link, follow the instructions, put the models in the folders, and it will work when you grab the workflow right there on the page — it all works right out of the box. The one special thing about this specific workflow — let me zoom in a bit — is where it says "install our custom RGBA video previewer and PNG frame zip packer." You will need this to actually use the stuff you make. So grab that rgba_save_tools Python file and put it in your ComfyUI/custom_nodes folder. It's a slightly more intermediate workflow to get set up, but once it's set up, I'll show you it's a very straightforward video generation workflow. All the difficulty, all the complexity, revolves around alphas and mask channels — what they mean, and what that implies for how you run your videos in Comfy.
An alpha channel is not something Comfy is equipped to deal with out of the box; it's something you'll always handle as a secondary step, and we'll talk about that as well. The other thing I want to touch on before we get too deep into the workflow: I don't actually process these layers in Comfy. I process them in a video editor. You can use an old boomer video editor like I do — Resolve or Premiere — but CapCut, Canva, or any of the newer editors will all work too. What the workflow gives you is a zip file; when you unpack it, it's full of PNG files, and those PNGs are an image sequence. A lot of you already know what image sequences are, so you can ignore me here, but for those who don't: an image sequence is all the frames of a video as an ordered set of PNGs in a folder — frame one, frame two, frame three, all the way through. You can use those PNG sequences in video editors by importing them as image sequences; I'll show you how to drag one into Resolve so you can see what I'm talking about. The point is that each PNG carries the transparency that the software needs to see it as a transparent video. MP4s by their very nature do not have an alpha layer, so you can't just save these videos as MP4s and drag them into Resolve — you'd get them on a matte gray or black background. Oh, good point from the chat — Gil Gulamu Laro says you can also output these as ProRes. ProRes does have an alpha layer; you just have to make sure that alpha is being respected in your output settings and how it's encoded.
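To make the idea concrete, here's a minimal sketch of what one frame of such a PNG sequence represents: an RGB image plus a grayscale alpha mask, joined into an RGBA PNG, which an editor then composites over whatever layer sits below it. This uses Pillow purely for illustration — it is not the actual save node's code, and the file names are made up.

```python
from PIL import Image

# Build one RGBA frame from an RGB frame plus its grayscale alpha mask
# (conceptually what the RGBA save tool does for every frame of the clip).
rgb = Image.new("RGB", (64, 64), (255, 0, 0))    # stand-in for the RGB VAE output
alpha = Image.new("L", (64, 64), 0)              # black = fully transparent
# Punch an opaque square into the middle of the mask (white = opaque subject).
for x in range(16, 48):
    for y in range(16, 48):
        alpha.putpixel((x, y), 255)

frame = rgb.copy()
frame.putalpha(alpha)                            # attach the alpha channel
# frame.save("frame_0001.png")                   # PNG keeps the alpha; MP4 would not

# What a video editor does with it: composite the frame over a background layer.
background = Image.new("RGBA", (64, 64), (0, 128, 0, 255))
composited = Image.alpha_composite(background, frame)

print(composited.getpixel((32, 32)))             # inside the mask: (255, 0, 0, 255), the subject
print(composited.getpixel((0, 0)))               # outside: (0, 128, 0, 255), background shows through
```

Anywhere the mask is black, the layer below shows through — which is exactly why an MP4, with no alpha plane to carry that mask, lands on a flat gray or black background instead.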
But yeah, this comes with a custom node for saving these outputs — where did I put it? Down here: the custom RGBA video previewer and PNG sequence maker. You need it to make the little transparent videos, so all that to say: don't skip this step. It's a must. All right. When you load that workflow up, it'll look a lot like this, and it's actually very well laid out. I love this workflow — it's very linear; I prefer this sort of long workflow where nothing is squished underneath, and each section is sectioned off. Very good workflow. So let's talk about what it's doing. We're loading the Wan 2.1 text-to-video 14B model, so you'll select that model after you've put it in the folder — and again, all this is right here on the page: download the files, put them in ComfyUI/models, and so on. The first one is the Wan 2.1 text-to-video 14B FP16; pop that in. These will already be selected in the workflow — all you have to do is click and look for Wan 2.1 T2V 14B. I can barely read this myself, so I don't know how you're getting on with it, but that's the one. If the models aren't listed because you just added them without restarting Comfy, hit the R key anywhere in Comfy, and once you see the update notification come up, you're good to go — it refreshes all these folders. Then the first LoRA we're going to use is the LightX2V one, the 480p 14B image-to-video version. I know: we're using the text-to-video model with the image-to-video LoRA. I don't know. It works. Don't ask questions. There's also a special LoRA that comes with Wan Alpha, called "epoch 13500 changed." I don't know what it does; just grab it — it's in the GitHub. And that's the model setup.
Then we need the UMT5-XXL FP8 e4m3fn scaled text encoder — pop that in. Then we can use the EmptyHunyuanLatentVideo node. This just gives us an empty latent with the right tensor shape for this animation. Don't worry that it says Hunyuan; Wan uses the same tensor shape for its latent canvas — the space it's going to draw in. This ModelSamplingSD3 node, I just left at five. I think if you turn it up it might play with the motion a bit; I'm not actually sure. I wouldn't touch it if you don't know what you're doing — or do touch it; I've made a career out of that. Next is the Wan Alpha 2.1 VAE, RGB channel — the special RGB VAE it comes with. Actually, it's the same VAE for both decoders, and then it comes with the special decoders: the RGB VAE decoder and the alpha VAE decoder. This is where we create the RGB channels — the stuff we see as humans — and the alpha channel, the black-and-white mask that tells the computer where the transparency is in the image. Then, obviously, the positive and negative prompts are in the middle here. I find this thing works best if you tell it you want your subject on a transparent background; it does a better job. Sometimes it'll make a background and then just cut out the back of it, but you really just want the object, because you can make a background any old way you like — with Wan 2.2 text-to-video, image-to-video, whatever — or you might already have backgrounds you want to stick this stuff on. That's the other thing I want to stress today: we are thinking in layers. We are making assets.
We're not treating these as final pieces; they're just components of a later final piece. The whole idea of AI as a one-shot solution for a finished thing — that's not what we're doing here today. Today, we're making elements to play with later. For this one, the prompt was "medium shot, a soap bubble flies through the air on a transparent background." Pretty straightforward. It runs and everything goes fine. The KSampler settings are four steps, CFG of 1, uni_pc, and simple — that's because we're using the LightX2V lightning LoRA, the distilled, nicely light-cooked version. So this is a nice little baby workflow. Once you have it set up, to keep yourself a little less insane, you can group all this stuff together. Actually, first let's explore what's going on before I turn this into a subgraph. Let's find out what it's combining to make this image. Pull a Video Combine node out: 24 frames per second, MP4, CRF 12, and save output — actually, we don't need save output, since these are basically just preview nodes. We just want to see what's coming out of two nodes: the alpha VAE decode and the RGB VAE decode. So we've got our two little previews here, and we'll do another run — it'll actually re-run because I have a new seed in there. Magical seed. If you're wondering why you can see the preview while it's sampling on mine and you can't on yours, that's a very easy fix. It's up in Comfy settings, under VHS (when you have VHS nodes installed): "Display animated previews when sampling."
If you turn that on and either wait a bit or refresh the page, Comfy will start showing you what's happening during sampling — which is great, because if you're going for a specific animation and you're not getting what you want, you can kill it early, which saves you time. So here's what we got out of the VAEs. The RGB VAE gives us the red, green, and blue outputs — the colors — and the alpha VAE gives us the mask. Everywhere it's white, there's going to be stuff; everywhere it's black, there's going to be nothing — it'll be transparent. The whole concept of the final piece is that everything where the mask was black gets pulled out. Very, very cool. Let's delete these previews since we don't need them right now, and move this back here. Now, we've got this RGB, alpha, and all this stuff — let's make it a lot easier on our brains by selecting everything here, right-clicking, and choosing "convert to subgraph." Now double-click on the subgraph we just made — this is everything, the whole workflow. You'll see the workflow is now connected to little inputs and outputs on the ends. We're going to use these to promote widgets out into the real world. Widgets are just the little inputs on a node — clip name, type, device; anything with a little dot when you mouse over it is a widget. We want to expose the stuff we actually care about and hide the stuff we don't, to make this an easier workflow to look at. One thing we definitely want access to is the prompt, so we plug that into the prompt input here. Another thing we might want access to is the number of steps in the run.
Another thing we may want access to is the width, the height, and the length of the animation. And for now, I think that's good. You can also rename these outputs — so this is our RGB out, and this is our mask out; I did that by double-clicking on the name. And now, when I go back to my workflow: it's so clean. Look at it — it's just a node. A nice little node we can call "Wan Alpha." It's so good — you can hide so much. This makes it production-ready for people who don't really understand Comfy; they just use this one node. Yeah, absolutely. So now that we have this, let's talk about how to composite these in Comfy a little. There's a node called ImageCompositeMasked. Sorry — I can't resize this node; it's a custom node, it's very large, and when you try to resize it, it just gets bigger. It's like a bug, so I'll just zoom in on this stuff here. And let me quit these notifications. All right, cool. So what I want to show you now is what ImageCompositeMasked does. Our RGB out — our video out — is going to be the source. The destination is actually going to be black: we're putting our source on a black matte so you can see what it looks like against a plain black contrast. We're going to use the FL Image Blank node, and we're also going to use the Get Image Size & Count node from the KJNodes pack — these are so useful — because I want to use the width and height of our output to determine how big the black canvas behind it is. It's an easy way to never have to do math. And the count output is actually really important too, because this blank frame I'm generating here is only one frame.
That's the FL Image Blank node, from the Fill Nodes pack. There are actually a ton of "image blank" nodes; I just happen to have Fill Nodes installed, so use whichever one you like — they all do pretty much exactly the same thing: select your color, select your width and height, and it generates a frame of that color at that size. Oh, and I meant to say: Phil is on a plane right now, otherwise he'd be with us. He'll be back with us on Tuesday, probably. We miss him. All right — 0, 0, 0 for black. What I need now is a RepeatImageBatch node, because again, it's one frame and I need to make a video, so I have to repeat this image over and over, once per frame, for the background. That's what the count output is really handy for: we plug the count into the amount. So the number of video frames coming out of Wan Alpha goes into this Get Image Size & Count node; the width and height make the canvas the same size, and the count goes into the repeat. That's all you have to do for the destination. The mask is just a little trickier: we grab the mask out, run it through a Convert Image to Mask node, and drag that in here. That should do the thing, and then we hit "resize source." So if we did this right — and I'll open a Video Combine at 24 fps, MP4 — the reason I changed the CRF to 12 is that it increases the quality of the outputs. The lower you go, the less compression it adds and the higher the quality; but below 12, 13, 14 or so, you're not gaining anything — you're just making the files bigger for no reason. If you want really, really good quality files, you want to actually use ProRes here — the second-to-last option.
ProRes is about the highest quality you can get out of Comfy — basically lossless, if not actually lossless. So if we run this again, this nice little chunky Wan Alpha node does all the hard work now, and our little compositor up here should make a copy of the output on a black background. Just so you can see that setup again: remember, the destination is the background, and the source is the foreground. I know "destination" and "source" can feel like they mean the same thing either way, and this confused me for a very long time — I think we've all been through this one. The way I remember it: the destination is what it's landing on, the source is what's landing on it, and the mask is where it lands. And in this case, it did exactly what we wanted — look at this, an image on a black background. Now, one of the really cool things we get for free here is something we were talking about last night: in a lot of diffusion output there is no inky black. In fact, in a lot of diffusion there's no real contrast, in the sense that everything is always happening in every pixel — the model is dreaming an entire scene with every frame, so there's always a little bit of stuff going on everywhere. With this alpha setup, it's actually really nice to just put stuff on a black background; these little videos on black are such a stark contrast to most of what comes out of diffusion. So even if you're not using this for alpha work, you can totally use it to make your stuff stand out, because not a lot of content comes out on a background this dark.
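A quick numeric sketch of what that black-matte setup is doing — NumPy stand-ins, not the actual node code from Comfy: the single blank frame gets repeated to the batch length (RepeatImageBatch), and the mask decides, per pixel, whether you see the source or the destination (ImageCompositeMasked). The shapes and mask convention here are my reading of it, so treat this as illustrative.

```python
import numpy as np

# Shapes follow Comfy's IMAGE convention: (frames, height, width, channels), 0..1 floats.
T, H, W = 33, 4, 4                        # tiny stand-ins for a 33-frame clip

src = np.random.rand(T, H, W, 3)          # RGB out of Wan Alpha (the "source")
mask = np.zeros((T, H, W))                # alpha out: 1 = subject, 0 = transparent
mask[:, 1:3, 1:3] = 1.0                   # a small opaque patch in the middle

# The blank black frame, repeated T times (FL Image Blank + RepeatImageBatch):
dst = np.zeros((1, H, W, 3))
dst = np.repeat(dst, T, axis=0)           # destination: one black copy per frame

# ImageCompositeMasked, conceptually: source where the mask is white,
# destination everywhere else.
out = src * mask[..., None] + dst * (1.0 - mask[..., None])

print(out.shape)                          # (33, 4, 4, 3)
print(float(out[0, 0, 0, 0]))             # a masked-out corner stays black: 0.0
```

Swap the black `dst` for any other video batch of the same shape and you've composited the subject over a real background instead of a matte.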
And this is especially cool when you do things like — I was doing this last night — "a flower blossoms into the sky on a transparent background." "Sky" might actually break it; I might have to just say "a flower blossoms." But the point is, you get this really nice, stark contrast — I've said this about fifty times now, but I love it. What you were showing me last night, and what we're going to see here, Purz, is: think about what you got, somewhat quickly and easily, versus what we used to do a year and a half ago. We'd see people do this and go, whoa, how was this made? This is amazing. And now you literally just do it in forty nodes — or in this case two, once you have it all set up. We could actually make this black-matte part a subgraph as well. Okay, that's not at all what I asked for, but it's pretty neat — that's like a mask on its own. And again, we're only doing 33 frames here; let's try 121 frames. Let's go. All right, while this one's running, I'm going to jump over to Q&A, because I see the chat's been going off while I've been on this little adventure. "You could make microscopic stuff like bacteria and microorganisms" — ooh, I like that. "Looks like a popcorn explosion." Or how about a fractal? A fractal would be cool. Good morning — hey Kez, what's up? Safe flight, Phil. Yes, safe flight, Phil. "Most image formats support grayscale for alphas — partial transparency, opaqueness." This whole transparency-and-alpha thing — we could do a whole stream on what it means in Comfy, because in Comfy, alphas and masks are kind of the same thing; they mean different things, but they're related and they kind of are the same thing.
Basically, whenever you're having problems, it's probably the mask: you probably have to invert it, or uninvert it, or use a mask from something else. And then, just like Photoshop, you've got layers on layers on layers, and you've got blend modes. But that's sort of where I wanted to go after this anyway. Let's jump back to the demo, because I think it's almost done. Come on, baby. Someone was just saying something — or I was — and it's gone. Oh yes, we were going to get into the video editor stuff too, and I've still got time, so let's make a couple of things and then start stacking them in a video editor so you can see what I'm talking about. "Do previews now show up in subgraphs?" They absolutely do, thanks to the hard work of Austin. Thank you, Austin — everybody give a big round of applause. A question: "When you export a transparent video, what format is it? Because MP4s don't have alphas." That's a good question, Jimbo. We covered this a little earlier, but I'll cover it again: the special node that comes with this workflow saves it as a PNG sequence, and that PNG sequence is every frame with a transparent background. You can then import that PNG sequence into any video editor as a video, and it'll just have a transparent background, so you can put it on top of other videos and such. I'll show you what I mean when we get back to it. This looks great — absolutely stunning, Purz, love it. So let's make a bunch of videos the same way — yeah, make another couple of Wan 2.1s. Let's do a blue flower. Oh, you're going to make a bouquet? Yeah, we'll make a little bouquet of flowers in the video editor. Oh, no worries, Jimbo.
Yeah, it's a strange concept, and you're right, MP4 does not handle transparency well. There are a few other things you can do, though. If you're into TouchDesigner, you can use HAP — HAP has a transparency layer. If you're into Resolume, the VJ software, it uses DXV, and you can use Resolume Alley to convert your PNG sequences into DXV files. There's also one called Notch LC, a mostly lossless video format. What those formats let you do is load the video onto your GPU in real time. So if you make HAP videos, Notch LC videos, or DXV videos and load them into VJ software, TouchDesigner, Resolume, anything like that, you can seek anywhere in them instantly — they're loaded right onto the GPU and completely malleable in every way. It's a really cool way to work with video, and a really cool way to level up your stuff: take the sequences we're making, convert them to something like HAP, pull them into TouchDesigner, and just start layering stuff, glitching it out, looping it — and all of a sudden you're making insane glitch art. This little workflow you're showing us right now is almost like a paintbrush: you can literally make stuff that doesn't even look AI anymore. This is very exciting. Yeah, I've been really dorking out over this since I started playing with it. I mean, look at this red flower — on the black, it's so nice. That's the other thing.
These black-background ones just look awesome. Benjamin asked about ProRes — you absolutely can, though sorry, there is no direct ProRes output on this particular save-PNG-zip node. But you're getting a straight PNG sequence, and then you can use ffmpeg — I used it to auto-batch-convert all my folders into HAP files, and then you've got lossless. You could do the same thing for ProRes: just ffmpeg-convert a batch of them into ProRes, or however you want to bulk-handle your videos. This looks cool too. Let's do a purple flower. "And by the end of this stream, you might even propose to me." This is so cool. Yeah, we'll see — you've got to buy me a few more dinners first, buddy. I like how the petals pop up. Oh man, this is beautiful, Purz. It is really cool — I'm super stoked. I think this is how I'm going to make my happy birthday cards, and Christmas cards, using this workflow. So you're running this on your 5090 right now? Yeah, and it doesn't seem to be too slow — honestly, it's not bad. Looking at my log here, it's about 155 seconds start to end for 121 frames. That's pretty sick. "Don't resize — for pixel art, native resolution, retro games." Heck, that's a good point. I don't know how low Wan goes. I wonder if you can make 256-by-256 ones — let's try that after this flower; I'm super curious now. You've gone and sparked me. Do you think it can do an 8-bit character — no background, low-res? Make little sprite-sheet animations? That'd be pretty cool. The problem would be getting them back to their initial state for a loop, I guess. Yeah, it's funny — the render is actually done very quickly.
It's the saving of the PNG zip that takes forever — PNGs just take forever to save in general. Oh yes, thank you for mentioning that, Jimbo; I meant to mention it on Tuesday. Right now on Hugging Face they're having a LoRA frenzy: AI Toolkit by Ostris, which we've talked about a bunch of times on the stream, is free to use on Hugging Face this week if you want to go train stuff. So go train stuff for free on Hugging Face. I think you have to use their API or form input boxes to get it going, but free compute is free compute, so get out there, friends. You know what, I'm going to grab the link for y'all so you don't end up on some crazy website. Frenzy — where is it? There it is. I'll copy the link and drop it in our chat. One moment, friends. There you go: multimodalart's AI Toolkit space on Hugging Face. It's free, I guess, for a couple more days, so if you need to train some LoRAs — Qwen, Wan, Flux, SDXL, I think anything you can train with AI Toolkit — go take advantage. And if it's cool, drop it in the Discord for everybody else to play with. Peter says, "This would be cool if it could output looping clips." Tell me about it, buddy. I don't think we're super far away from loops with Wan — context windows are now a thing, so how do we get there? I don't know, I'm not that guy, but we've got to be close, right? Okay, we've got our flowers. Let's make a hand reaching up into frame — actually, I already did that, and the three flowers is probably fine. We're running out of time, so let me get Resolve open. I'm only using Resolve because it's free — you guys use whatever you like; there are a million video editors out there and a million ways to do this.
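As a sketch of the batch conversion mentioned a moment ago — turning each unzipped PNG-sequence folder into a ProRes 4444 file with the alpha preserved — something like this would drive ffmpeg from Python. The folder layout and the `frame_%04d.png` naming pattern are assumptions; check what the save node actually names its frames before running it.

```python
import subprocess
from pathlib import Path


def prores_cmd(folder: Path, fps: int = 24) -> list[str]:
    """Build an ffmpeg command that encodes a PNG sequence to ProRes 4444,
    keeping the alpha channel (pix_fmt yuva444p10le carries alpha)."""
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),
        # Assumed naming: frame_0001.png, frame_0002.png, ...
        "-i", str(folder / "frame_%04d.png"),
        "-c:v", "prores_ks",
        "-profile:v", "4444",          # the ProRes profile that supports alpha
        "-pix_fmt", "yuva444p10le",
        str(folder.with_suffix(".mov")),
    ]


cmd = prores_cmd(Path("wan_alpha_output_01"))
print(" ".join(cmd))
# To actually run it (requires ffmpeg on your PATH):
# subprocess.run(cmd, check=True)
```

Loop `prores_cmd` over every unzipped folder and you've got the bulk conversion described above; swap the codec arguments for a HAP or DXV encoder if your tools prefer those.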
So in Resolve, I'm going to click over to the Edit tab, which brings me into the Edit page. The other thing I have to do very quickly is move these files from my PC over to my Mac, so if you all just give me two seconds. Have you seen anything on Sora that made you laugh your face off yet, Julian? Yes, definitely. The one that surprised me was — I believe it was Morty; no, Rick, sorry — getting arrested by the police. It was just like, what's going on here? Completely out of nowhere. Yeah, that's one of the things that's kind of funny: the blatant disregard for copyright is kind of hilarious. I'm surprised by it. I guess it's an opt-in thing, so if you want to make that stuff, I think you have to make it now, basically. Yeah, it may just go away eventually, right? That's what I think. Okay, I've put those in my temp folder and we'll bring them in. I know it's really fun to stare at an empty editor, everybody — just give me two seconds. I need to find a more efficient way to move files back and forth between my machines; it's got to exist, right? Downloads, paste. Okay, I'm just going to unzip these image sequences once they make their way across my network. There we go — two, three, and four. So these are the zip files it made, and I've unzipped them, and if I recall correctly you can just drag a folder of PNG sequence straight into Resolve, like I'm doing right here. So there are my three clips. Let's close this and have a look. If I go back and hit play — yeah, they're doing the thing. Again, it's showing a black background because there is no background video.
If I were to add a video of, you know, a field or whatever, then the flower would be in front of that field. But here's another cool thing: the videos on top of each other are literally on top of each other. So now we can scale the flowers down a bit, reposition them, make a little scene out of it. Zoom down, position. By just putting the clips all on top of each other like this, we're basically sticking everything together, and all of a sudden we're making video collages with our assets. It's so cool, man. Actually, this was much faster than I thought it would be. What I was doing there was just playing with the transform controls for each video. But it doesn't end there. In this composite mode here, you can actually try all the different blend modes for the different layers. Let me grab a quick piece of footage to put behind it. Pexels, video, field. That's terrible. Let's find a cool video, something that looks like the flowers could actually be in front of. Maybe that one. Oh, these are too high. This is the really fun part, actually thinking about compositing: where would this actually make sense? I need something lower to the ground, so it would make sense for those flowers to be the size they are, because if it's just a wide shot of a field, it's not really going to work. Maybe a blurry background. You also kind of want it to be a static shot. Maybe this one. So let's download this. Here, I'll just make this a little smaller. There we go. These are our three video layers. I actually need to move them up, because I want to put a layer underneath. And again, apologies to everybody if you already know how to do this. That's cool. Some people don't. Yeah.
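Under the hood, stacking these RGBA clips over a background is the classic alpha "over" operation, applied per pixel per frame. Here's a minimal NumPy sketch of one frame; the arrays are toy stand-ins, not real footage, and this is the general operator, not Resolve's exact internals:

```python
import numpy as np

def over(fg_rgba, bg_rgb):
    """Composite an RGBA foreground over an RGB background using the
    standard straight-alpha 'over' operator."""
    alpha = fg_rgba[..., 3:4].astype(np.float32) / 255.0
    fg = fg_rgba[..., :3].astype(np.float32)
    bg = bg_rgb.astype(np.float32)
    out = fg * alpha + bg * (1.0 - alpha)
    return np.rint(out).astype(np.uint8)

# Toy frame: a half-transparent white layer over black lands at
# mid-grey, which is exactly what the editor's layer stack does.
fg = np.zeros((2, 2, 4), dtype=np.uint8)
fg[..., :3] = 255   # white RGB
fg[..., 3] = 128    # ~50% alpha
bg = np.zeros((2, 2, 3), dtype=np.uint8)
print(over(fg, bg)[0, 0])  # -> [128 128 128]
```

This is also why the clips show a black backdrop when nothing is underneath: with no background layer, the editor composites against black.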
You just show a way to do something unique, you know? Pretty crazy. I mean, it doesn't look great, but you can totally take this and run with it and do all kinds of crazy stuff. And it doesn't necessarily have to have a background either. Yeah, you could composite the scene you want and then literally dream over it. Yeah, exactly, and then it's going to blend things together, add reflections and shadows. Super cool. The world is your oyster at this point, honestly. Or your flower, I should say. Or, to stay on theme, your clam. The world is your clam. Look at that. This is so cool. And also, see how the one on the left is moving at the base? You can actually fix the base and let the head move in the wind. You can go crazy with this. Yeah. Well, that's the other thing: because these are transformable, maybe I actually like this one angled like that, over there. Maybe I want it bigger, maybe smaller. Maybe I can look at it now with that background and say, oh, you know what, these are actually a little big, so let's bring them down. Just get them planted into places where they might make a little more sense. Zoom down, position over here, maybe put it down here. Maybe it's not perfect, but it's pretty cool. This is a bad example, but the possibilities are pretty cool. I'll show you guys something I made with just a bunch of hands. Oh man, my hard drive is just full. You'd think someone with a nonsense brain had my computer. Oops. Play. Can you loop? Every video player feels like a regression from the last video player, every time I use one. But yeah, this is four flowers and three hands, with some hands reaching down to the flowers. Like for music videos.
Like, come on. Yeah, and you could build a completely automated system that just makes assets and run it all night with an LLM in front of it. Yeah, this is super cool. You can really go crazy and make very usable stuff at this point, you know. I'm super stoked to see what everybody makes with it. Which, I have an idea. We can talk about it later, but that would be a cool concept for one of the weekly challenges: see who makes the best composite. Yeah. Let's try a sandwich on a transparent background. Now we're talking. That's right there. While this is rendering, I'd like to talk about the Comfy Challenge. We did the Comfy Challenge last week, Heartbeat, in conjunction with Wan. I believe the Wan side of it is actually still going on, so you can still enter on their side. Our side's done, and we're moving on to a new challenge this week. So I'm going to show you all the trailer for the next challenge, and then we're going to watch the entries for last week's challenge. And let me tell you, I was absolutely blown away. I always say this, but last week... Oh, this is cool. This is great VJ material, stuff to throw on during a dubstep set. Just be like, sandwich, and make it audio reactive. Oh my god. So yeah, that sandwich came up at the most inopportune time, just as we were talking about how moving everyone's pieces were. So let's have a look at the trailer for next week. Next week's challenge is called The Impossible Feast. Ready? [Music] So friends, make something very weird, mildly edible. It doesn't have to be edible, it just has to sort of be food. The weirdest dish you can think of, the most exotic dish you can think of. Bring it this week. Concrete sushi, you know. Go nuts.
That's the point. It's supposed to be weird. Okay, and now I'm going to be serious, and let's check out this amazing montage of pieces based on the theme Heartbeat. This was a storytelling challenge this week, so keep that in mind when you check these out. [Music montage of community entries plays] Holy cow. Are you kidding me? It makes my sandwich look a little stupid. They were amazing. But the ink one, you know, with the ink, just completely blew my mind. All of them blew my mind, but that one was like, whoa. Yeah, that was incredible. You know, once again, just totally blown away by the creativity in our community. I can't wait to see what crazy meals you guys come up with. It should be a little lighter this time around, and I think it'll be fun. It can be funny, you know what I'm saying? It doesn't need to look good. It can also be wild, as far as food goes. Just be creative, essentially. Yeah. And that's the thing, man. We just want you to use Comfy. It doesn't all have to be in Comfy, just a major part of it. But if you make everything in Comfy and then go crazy with a video editor, it's only going to make your piece look better, and that's totally within the spec of what we're trying to do here. We're trying to get you to make stuff and then do stuff with it, you know.
And yeah, this Wan Alpha thing, I think it'd be a really good chance to try making folders and folders of assets and then figure out the finer points of mass asset management. Some tips I can give you: learn how to use Claude or ChatGPT to make bash or batch files, or Python scripts, that can recursively dump the zip files into folders, and same thing to use ffmpeg to convert them into something usable and put them in an output folder. Basically, save yourself as much manual moving, renaming, unzipping, and deleting as you possibly can. It's fine when it's 10 assets, but if you made 200 assets overnight and want them ready to go in a folder, you're going to need to build your own way to do that. There's really no software for that. There's Adobe Bridge, and it kind of does a thing, but not really what we're talking about here. Learning how to process these big chunks of stuff you make is one of the skills you'll have to build if you're going to make large batches and do stuff with them. Or you can take a more pointed approach and only convert the things you're actually going to use. But what's the fun in that? I'd rather have a hard drive full of files I'm never going to touch. Oh yeah, we didn't try the pixel art thing. Let's try that real quick before we close up shop. All right, we're going to try 256 by 256. Let's just YOLO this. Okay, and we're going to say: medium shot, a little man jumping in the air, transparent background, pixel art, 8-bit. I have no idea if this model can even do this resolution or this kind of style, but I am curious. Yep, Gemini CLI, yeah, anything. I didn't mean to only name the expensive ones. Whoa.
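To make that asset-housekeeping advice concrete, here's a rough sketch of the kind of script being described: unzip every PNG-sequence zip into its own folder, then hand each folder to ffmpeg as a ProRes 4444 render, which keeps the alpha channel. The `zips`/`sequences` folder names and the `%05d.png` frame pattern are guesses; match them to whatever your save node actually writes.

```python
import subprocess
import zipfile
from pathlib import Path

def extract_all(zip_dir: Path, out_dir: Path) -> list:
    """Unzip every .zip under zip_dir into its own folder and
    return the list of extracted folders."""
    folders = []
    for zp in sorted(zip_dir.rglob("*.zip")):
        dest = out_dir / zp.stem        # one folder per zip
        dest.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(zp) as zf:
            zf.extractall(dest)
        folders.append(dest)
    return folders

def ffmpeg_cmd(frames: Path, out: Path, fps: int = 16) -> list:
    """Build an ffmpeg command that turns a numbered PNG sequence into
    a ProRes 4444 .mov -- a profile that preserves transparency."""
    return [
        "ffmpeg", "-y",
        "-framerate", str(fps),
        "-i", str(frames / "%05d.png"),   # assumed frame naming
        "-c:v", "prores_ks", "-profile:v", "4444",
        "-pix_fmt", "yuva444p10le",       # 'a' = alpha kept
        str(out),
    ]

if __name__ == "__main__" and Path("zips").exists():
    for folder in extract_all(Path("zips"), Path("sequences")):
        subprocess.run(ffmpeg_cmd(folder, folder.with_suffix(".mov")),
                       check=True)
```

Run it overnight against the batch output folder and everything lands in `sequences/` as editor-ready .mov files, no manual unzipping.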
Well, it's not pixel art, but it's cool. That's 256 by 256. Wow, that's really impressive. Actually, you could just pixelate this guy. Wow. This kind of stuff would be good for children's animations, where you just need a bunch of stuff to fill a scene, like kids running around while somebody's learning how to do math or whatever. Let me try a couple of runs. I don't even know if you can lower the res more. Wow. It just doesn't have any 8-bit stuff in the model, I don't think. Man, it spits these assets out fast. It's taking longer to save the PNG sequences than to generate them. I mean, look, they're pretty good too. Very usable. All right, let's try 64 by 64. Whoa, might have found the lower limit of the model. 128 by 128. This is all very important research, everybody. Whoa. Okay, I think 256 by 256 feels like the lower limit. Let's try 480 by 480. That feels like it should be the right res for Lightning. That was 128 by 128. Thanks, Willie. Appreciate you, bud. That's pretty good, actually. Damn, that's so good. But yeah, this would be a great way to make little tiny assets. And you can always couple this with Phil's FL Prompt Selector. We've covered this a couple of times, but in Fill-Nodes there's a node called FL Prompt Selector, and this is the index. In a batch run, this index will iterate, or rather, this index corresponds to a line in this file. So in our case here, sorry, I've got to get away from this node, there we go. Here's our prompt. I'm going to copy this prompt and take it over to the Fill node here. We have a prepend and an append, so I'm going to prepend some stuff and append some stuff. I'm going to append "transparent background" and prepend "medium shot." And I'm going to get rid of the pixel art stuff, because it ain't doing a damn thing. Boink.
Then we're going to grab this middle bit here and paste it there. So we're going to say: frog, turkey, horse, dog. Medium shot; frog, turkey, horse, dog; index zero, one, two, three. What do we need? We need an incrementer node. You can get these anywhere; you could even use an int. Actually, we could just use an int, but I'm going to use an incrementer node because it has this max value setting. It's over where the prompts are. Doop doop doop doop. Okay, our incrementer is over here. This is the thing that goes up by one with every run, and that's what's going to iterate through each of these prompts, using this prepend and append for each shot. So, seed zero, we're going to set that to increment, and our max value covers indexes 0, 1, 2, 3. Indexes start at zero; regular counting starts at one. So you've got to subtract one from the number of things in your list when you work with indexes. This is a problem you'll run into every time you use an index. I'm going to do a batch of four up here, changing it to four. Then we've got our incrementer incrementing: seed 0, seed 1, seed 2, seed 3. Then we just plug our little string into our Wan Alpha node here, and it's going to go: medium shot, frog, transparent background; medium shot, turkey, transparent background; medium shot, horse; medium shot, dog. Obviously these are very simple prompts, but if you were to use better prompts, you could get much better results. See, we've got four queued up, so it should make a frog, then a turkey, then a horse, then a dog, if all is right with the world. There's your frog. You can also do this with just a text list. There are text-list nodes that do the same thing. If you feed a text list into this input here, this node will automatically step through the list line by line. So if you just newline-separate each prompt, it does the same thing.
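For anyone who'd rather see the index math as code than as nodes, the incrementer-plus-prompt-selector combo boils down to a modulo over the prompt list. This is a plain-Python sketch of the idea, not the actual node implementation:

```python
PREPEND = "medium shot, "
APPEND = ", transparent background"
PROMPTS = ["frog", "turkey", "horse", "dog"]

def prompt_for_run(run):
    """Wrap the run counter around the list length: run 4 loops back to
    index 0. Indexes start at zero, so the highest valid index is
    len(PROMPTS) - 1 -- the 'subtract one' rule described above."""
    return PREPEND + PROMPTS[run % len(PROMPTS)] + APPEND

for run in range(8):  # eight queued runs walk the four-prompt list twice
    print(run, prompt_for_run(run))
```

Queue eight runs against a four-line list and it simply cycles the list twice, which is the same behavior the incrementer's max value gives you.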
The reason I use the incrementer is that I might want to do eight runs and have it loop through the list twice, or twelve runs, and I can say there are only four prompts so it loops every four. It does the same thing. So, we got our turkey, now we're doing our horse, and then we should get our dog. Unless I mathed wrong. I might have mathed too close to the sun, but we'll see. This is, I think, native. Yeah, I think it's the alpha VAE decode. I'm not sure where that comes from. Oh, it's just a regular VAE decode. Yeah, this is all native. Oopsies. Frog, horse... oh, I did frog again. I probably messed up the math here; the max value should probably be four. I think they changed it so the max value isn't the highest index from the zero-one-two-three counting, it's actually how many things you have in your list, and it does the math itself. So yeah, this should have been a dog, but it's a frog. Whatever, man. Friends, hopefully you can have some fun with this. Sorry, where do you select four prompts to loop the list? Right here in the Incrementer node from Masquerade Nodes: I'm selecting a max value of four and starting my seed at zero, so when it gets to four, it loops back to zero. And I ran four runs at the top; if I ran eight, it would loop through this list twice. Hopefully that makes sense. But yeah, friendos, that's been super fun. Thanks for hanging out, as always. We appreciate you very much, and we will be back on Tuesday. C-H-E-W-S-D-A-Y. Tuesday. And I just wanted to say it's good to be back. I missed you all. Missed you, Purz. Yeah, we haven't seen you in two weeks. Feels like a long time. Let me tell you, jet lag is real. You want to tell the people what you've got under your desk right now? Oh yeah, I do have a little RTX 6000 Pro. Cool. Yeah, it's exciting. It's set up, it's running. Everyone has a 6000 but me.
But that's okay. That's fine. I'm fine. It's fine. Thanks, friends. Much love. Go play with Wan Alpha. Again, everything you need is in the link to Wan Alpha below. Just read the GitHub and go through it; the model links are there, and the directions. I'm going to pull it right back up so you can see what I'm talking about. Come on. There we go. Okay, if you go to that link at the bottom, you'll come here, to the WeChatCV Wan-Alpha page. Scroll down past all the demos and you'll see a section on how to get set up with ComfyUI, right here: official ComfyUI version. Download all the models and place them in the folders it tells you to put them in. After you do that, the workflow will just run. This is the workflow. And please don't forget the other thing you have to do: download this RGBA save tools Python file and drop it in the root of your custom_nodes folder. If you don't do that, this node won't appear, the Save PNG Zip and Preview RGBA node. You need this node to see what you're doing and to get the PNGs out. So yeah, that's it. Thanks, friends. Have fun. Enjoy. We'll see you all on Tuesday. And I have a very special guest coming on Tuesday: that'll be CJ from Dadabots and Stable Audio. We'll be talking about Stable Audio 2.5, the state of current open-source and closed-source audio diffusion, and the history of the band Dadabots. Go check them out: D-A-D-A-B-O-T-S, Dadabots. They're probably the first AI band. They've been doing it longer than most of us have even been aware of AI. So that's going to be really sick. Come out on Tuesday. And if you miss it, don't sweat it, it'll be on YouTube like every other one. We always save these on YouTube, and they will always be available for you to come back and watch again.
So, if you miss something, come on back and check it out. The live chat replay usually shows up a day or two after the videos are posted, so if I was sending links in the live chat, apologies for that; they will show up eventually. So yeah, much love. Thanks, Julian. Hopefully we'll see Phil on Tuesday, and we'll have CJ, and we'll have lots of fun. Do the challenge, everybody. I want to see your weird food. I can't wait. That's going to be a fun one. Bye-bye. Have a good weekend, everyone.


Channel: ComfyUI
