
How to Start Making AI Videos in 2026 - Full Course

Youri van Hofwegen

17m 42s · 3,654 words · ~19 min read
Auto-Generated

[0:00]Thousands of people start learning how to make AI videos every day, but over 90% of them give up in the first week. And that's not because the creation itself is hard. It's because they get lost in what AI model to use, what workflow to follow, or they simply get overwhelmed by so many tutorials. That's why in this video, I'm going to walk you step by step through the essentials you need, so by the end of it, you can go from a complete beginner to someone who can actually create impressive videos in less than 15 minutes. But first, before we get into the practical steps, you need to understand something, because this is what decides whether your next generations will look realistic or not. Most people spend 90% of their time inside AI video generators testing prompts and regenerating over and over again. And while that might seem like the right approach, it's actually the biggest mistake you can make.

Whenever you create a video with AI, you have two options: text to video and image to video. Text to video is pretty straightforward. You write a prompt and the AI creates a video based on your description. But there's a problem with that. The AI has to figure out the character, the lighting, the environment, and every other aspect on its own. And what happens almost every time is that it misses details. So let's just test it with this prompt, and this is the video that we get back.

[1:09]As you can see, it looks pretty decent, and most of the details are there, but it didn't really add everything I asked for in the prompt. And this is not even the biggest problem. Here's what I get if I regenerate the same video with the exact same prompt as before. The character looks completely different, and it looks like a different rooftop. And this happens because every time you hit generate, the AI starts completely from scratch. It doesn't remember anything from your last iteration. So if you want to create a video with multiple scenes that doesn't look like random shots stitched together, it's impossible to use the text to video method. And that's exactly why the most experienced AI creators avoid this method altogether. They always go with the second one, image to video.

With image to video, you're showing the AI exactly what the first frame of your video needs to look like before it even starts generating. The model takes that image as a visual reference and builds on top of it. The AI doesn't have to guess what your character looks like or what the lighting should be, because you've already locked them in. So if I take this image and add a simple prompt, I get this result. It kept all the details in place. And on top of that, here's the video if I modify the prompt. The character and all the other aspects look exactly the same in every generation.

But not everyone who uses this method gets realistic results. Most people generate average images and then expect to get a cinematic video from them, which is simply impossible. If you want realistic results, you need to make high-quality images. So the real question is, how do you actually create them? Well, the answer comes down to five specific principles. And the reason thousands of beginners generate low-quality images isn't that they're using the wrong tool. It's that they don't actually know what makes an image cinematic in the first place. Now, after studying real filmmakers from Hollywood, I found out that they all share the same five principles when creating a movie. And the good news is that once you understand what these are, you can start applying them directly inside your AI videos and create better results than most people.

The first principle is lighting, and this is what makes the difference between an image that looks like a normal photo and one that feels alive. Good lighting that has direction and creates shadows makes an image feel 3D instead of 2D. Just take a look at this image with poor lighting, and now at this one. The difference is massive, even though we only changed the lighting. Filmmakers know how important this is. That's the reason they hire multiple teams and spend hundreds of thousands of dollars on lighting equipment alone.

The second principle is depth, and this goes hand in hand with lighting for creating that 3D feeling. If you want to avoid creating flat images, you just need something slightly out of focus in the foreground, your subject sharp in the midground, and the environment in the background. This makes viewers see it as a real three-dimensional space. But it doesn't have to be complicated. Even a small detail, like a blurred railing at the edge of the frame, is enough to create that effect.

The third principle is composition, and this is where you actually place your subject inside the frame.
Beginners who have no real background in filmmaking think the subject should always be in the center. But that's actually a mistake very few people talk about. What professional filmmakers do instead is apply the rule of thirds. Imagine your frame divided into nine equal sections by two horizontal and two vertical lines, and then place your subject on one of those vertical lines, slightly off-center. It might seem complicated, and it actually is for real movies, but when it comes to AI videos, you can adjust this in just a few clicks, like I'm going to show you now. For this principle, you can also take advantage of leading lines from roads, corridors, or walls that naturally pull the viewer's attention toward the subject.

The fourth principle is emotion, and this is actually the secret behind any blockbuster movie. All the cinematic scenes and characters are important, but what really makes them special are the intense emotions behind them. So before you even start generating your images, you always need to ask: what should the viewer feel? Because once you know the emotion you're going for, everything else falls into place. You'll know exactly how to use the lighting, the composition, and all the scenes. But if you skip it, you'll end up making all those technical decisions randomly, and the final result will feel low-quality.

To take full advantage of emotion, you also need the fifth principle, which is color. Every color you see has a feeling behind it, and that's why real filmmakers study them so much. Warm tones like reds, oranges, and golds naturally push toward intensity, tension, and passion. Cool tones like blues, teals, and grays push toward calm, distance, and isolation. So once you know the emotion you want to create, you basically already know which direction your colors should go. And now that you're aware of this, you'll easily avoid mixing warm and cool tones by accident.

These five principles are what every experienced creator applies in their AI projects. But just knowing them is not enough; you still need to know how to use them in practice. To show you how that works, I'm going to go over to Higgsfield. If you've never heard of Higgsfield before, it's one of the most popular all-in-one platforms that gives you full access to all the image and video AI models. So instead of having five different subscriptions and switching between tabs, with Higgsfield you get all of that under one roof. But what really differentiates it from similar platforms are its original AI features. One of them is Cinema Studio 2.5, which is the only workflow I recommend beginners use. It has everything you need, from creating images and videos to even voices, at the highest quality out there. And the only reason this is possible is that Cinema Studio is trained exclusively on cinematic data. So even if you give it a simple prompt, the results are way better than what most AI models produce. On top of that, the overall interface is so easy to understand that even a complete beginner can use it. That's why I'm going to use it for this video.

And by the time you're watching this, Higgsfield will have already dropped Cinema Studio 3.0. It's their latest AI film tool, focused on more realistic optical physics, better scene understanding from references, built-in audio, and an overall jump in cinematic video quality. Everything I'm about to show you still applies; Cinema Studio 3.0 just takes these same techniques and pushes the results even further.
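Before we jump into the tool, here's one way to make those five principles concrete. The sketch below is purely illustrative and has nothing to do with Higgsfield itself; the palette keywords and the helper function are my own, and it just folds the principles into a prompt checklist so nothing gets skipped.

```python
# Illustrative only: a prompt checklist built from the five principles.
# The keyword choices and this function are my own, not from any tool.

PALETTES = {
    # Warm tones push toward intensity, tension, and passion.
    "tension": "warm orange and red tones, golden hour light, dramatic shadows",
    # Cool tones push toward calm, distance, and isolation.
    "isolation": "cool blue and teal tones, overcast light, muted grays",
}

def build_image_prompt(subject: str, environment: str, emotion: str) -> str:
    """Fold lighting, depth, composition, and emotion-driven color into one prompt."""
    return ", ".join([
        f"{subject} on {environment}",
        "strong directional lighting with visible shadows",       # 1. lighting
        "blurred element in the foreground, subject sharp in the"
        " midground, environment in the background",              # 2. depth
        "subject on the left third of the frame, leading lines"
        " pulling toward the subject",                            # 3. composition
        PALETTES[emotion],                                        # 4+5. emotion via color
        "cinematic",
    ])

print(build_image_prompt(
    "a female soldier", "a rooftop over a war-torn city", "tension"))
```

Whether you type the prompt by hand or script it like this, the point is the same: every generation should answer all five principles before you hit generate.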
So if you want to follow along with the tutorial, I'll leave a link to Higgsfield in the description where you can sign up. Once you're inside the platform, this is the homepage you'll see. Now, I'll go to Cinema Studio 2.5 and select the image section, because I want to generate the location for my video. For this, I'll select the location option and then create the prompt. For the scene location, I'll choose a rooftop over a war-torn city at golden hour, distant buildings on fire, thick smoke rising into an orange and red sky, debris and rubble scattered across the rooftop surface, dramatic shadows, cinematic. For the emotions inside, I'll go with tension and urgency. To emphasize that, I'll use warm tones like orange and red in the sunset light that creates dramatic shadows. Here's the exact prompt I'll use for it, and here's the result we get back. Even from a fairly simple prompt, the result already looks like it was made by a real production team. And you can definitely feel the intensity inside, so I'll save it as a new location. This lets me see exactly how the environment will look before animating anything, which is crucial for not wasting my credits.

Now, let's create the character. But you need to be really careful here, because if you don't do this right, your character might look different with every generation, even though you're using the image to video method. And honestly, this is the biggest trap when it comes to generating AI videos. I see too many people falling into it, so let me show you the proven way to generate consistent characters across generations. Just a few months ago, this required a lot of effort to set up, but now you don't even need to write a single prompt, and all of that is thanks to Cinema Studio.

So once you're inside, go to the character section. Here you can build your character through eight specific categories, the same way a casting director would think about building a role for a film. Genre is first, and this matters more than it seems. There are 14 options like action, drama, and horror. And the reason this category is so important is that the same prompt generates completely different results for a thriller than for a comedy. I'll go with war. The budget is set in millions of dollars, and this sets the overall visual polish. A higher budget means a more refined, sleek aesthetic, so I'll pick 250 million. Then we have the era. This is the time period, which shapes the clothing, the grooming, and the overall style. And because I want something modern for my video, I'll choose the 2020s. For the archetype, I'll set my character to be the hero.

But these were just the first options. Let's now get into the physical appearance of the character. Inside the identity section, you can choose the gender, which will be female for my example. For race, I'll pick Asian, and then set the age. My character is a female soldier, so for the physical build, it'll be athletic. Now, these are most of the options you'd get with a normal AI generator. But when it comes to Cinema Studio, they took that extra step and added all the human details you can think of. You have height, eye color, hairstyle and texture. For her physical details, I'll go with brown eyes and brown hair with a fringe and a wavy texture. There are a few more options you can choose from, like the outfit or even adding tattoos, so I'll add that too.

Now, I'll click generate. She looks exactly like I was expecting. Look at the textures on her. It's nowhere close to that regular AI plastic texture you get with most generators. Higgsfield really nailed every detail, and that's because it was trained exclusively on cinematic data.
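If you want to recreate the same character later, or keep several characters straight across projects, it helps to write these casting choices down in a reusable form. Here's a minimal sketch, assuming nothing about Higgsfield's internals; the field names just mirror the on-screen categories, and the age value is illustrative since the video doesn't state a number.

```python
from dataclasses import dataclass

# A plain record of the casting choices from the character builder,
# so the same character can be rebuilt exactly in a later session.
# This is personal bookkeeping, not a Higgsfield API.
@dataclass(frozen=True)
class CharacterSpec:
    genre: str        # one of the 14 genres, e.g. "war"
    budget_musd: int  # production budget in millions of dollars
    era: str          # time period shaping clothing and grooming
    archetype: str    # e.g. "hero"
    gender: str
    race: str
    age: int
    build: str        # e.g. "athletic"
    details: str      # eyes, hair, outfit, tattoos, and so on

soldier = CharacterSpec(
    genre="war", budget_musd=250, era="2020s", archetype="hero",
    gender="female", race="Asian",
    age=28,  # illustrative; the video doesn't state a number
    build="athletic",
    details="brown eyes, brown wavy hair with a fringe, tattoos",
)
```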

[9:13]And what this means in practice is that you can use this exact character across all your different scenes, and it'll always look like the same person, no matter what location, lighting, or mood you choose. Now, go back to the image section and select scenes. This is where you combine the character and the location we just created into a single shot. And it's also where all five principles come together for the first time. So select your character and your location, and then set the resolution to 4K to get the best quality possible. Then for composition, I'll place her on the left third of the frame with the ruined city creating leading lines behind her. For depth, I'll add a piece of debris out of focus in the foreground, her in the midground, and let the skyline stay in the background. Now, I'll hit generate, and here's what we get back. Even as a still image, you can already feel the tension on her face.

But I want to go even further and make it more realistic. For this, Cinema Studio gives you a full set of editing tools to refine it. One of the most important ones is color grading. At the top of the panel, you'll find presets like natural, split tone, and cinematic. These are the fastest way to shift the overall mood of the image in one click, so pick the one that matches your emotion. Once you have a direction you like, you can go into the color settings for more control. This is where you adjust the temperature, hue, saturation, and contrast all in one place. Temperature lets you push the entire image warmer or cooler. So if your scene is supposed to feel cold and distant, but the generation came out too warm, this is where you fix it. Next is bloom, which adds a soft glow around the bright areas of the frame. Then there's halation, which simulates the red glow that forms around highlights on real film. You can also layer in film grain for that cinematic texture. And here's the result. The difference between the before and after is not huge, but these small details are what decide whether people keep watching. If you don't want something extremely detailed and specific, the color presets cover everything you might need.

However, there are a few simple settings that make a massive difference, and everyone should know what these are and how to use them. One is relight. This lets you change the lighting direction after the image has already been generated. So if the scene looks right, but the light is hitting from the wrong angle, you don't need to regenerate the whole image and hope you don't lose important details. Knowing these features alone makes you an intermediate AI user.
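To build intuition for what these sliders actually do, here's a generic sketch of temperature, contrast, and grain as raw pixel operations, using Pillow and NumPy. This is not how Higgsfield implements its grading, just a rough approximation of the underlying idea, and the file names are placeholders.

```python
# Rough approximation of what "temperature", "contrast", and "film grain"
# do to pixel data. Not Higgsfield's implementation; a generic sketch.
import numpy as np
from PIL import Image, ImageEnhance

def grade(path_in, path_out, temperature=0.1, contrast=1.15, grain=8.0):
    img = Image.open(path_in).convert("RGB")

    # Contrast: push pixel values away from the midpoint (Pillow built-in).
    img = ImageEnhance.Contrast(img).enhance(contrast)

    arr = np.asarray(img).astype(np.float32)

    # Temperature: boost red and cut blue to warm the image;
    # a negative value does the opposite and cools it.
    arr[..., 0] *= 1.0 + temperature  # red channel
    arr[..., 2] *= 1.0 - temperature  # blue channel

    # Film grain: per-pixel Gaussian noise, measured in 8-bit levels.
    arr += np.random.normal(0.0, grain, arr.shape)

    Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8)).save(path_out)

# Warm the rooftop still slightly to reinforce the tension we chose earlier.
grade("rooftop_still.png", "rooftop_graded.png", temperature=0.12)
```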
There's one more step you need to take: turning your image into a high-quality video. This is pretty straightforward, but you still need to be really careful here, because if you get this wrong, you'll burn through credits and still end up with unusable results. One of the biggest mistakes you can make is writing a huge prompt that tries to explain all the motions, emotions, and actions you want inside. The video model gets confused, and the output won't be the one you expect. And honestly, this is the part where I see people give up, because they think the tool isn't good enough, when the real problem is their approach. In reality, it's way easier than you think. The video section inside Cinema Studio gives you dozens of pre-built options to handle all of that for you. Instead of spending time trying to describe everything in one prompt, you just write the action you want in plain English and set the rest from the controls.

So let's go to the video section and select single shot. First, upload your reference image as the starting frame and add your character, as well as the location. Now, before you write anything, you can already set most of what you need from the pre-built options. The first is the emotion setting for your character. You can choose from options like hope, anger, and even fear. The model uses this to adjust the character's expression and body language throughout the entire scene. For this shot, I want her to feel tense but focused, so I'll go with this. Then there's genre, which tells the model the overall energy and pacing of the video. And just like with the principles, this one decision shapes how everything else moves and feels. For a military rooftop scene at golden hour, I'll go with action. Next is camera movement, and here Cinema Studio gives you a full range of pre-built options, from slow pushes and dollies to even a 360 roll. For this shot, I want a slow cinematic push toward the character as she scans the horizon. Then, for the motion prompt, you just describe the action in plain English. And the last setting before generating is the speed ramp, which controls how the movement feels emotionally. You can leave it on auto, but if you want full control, you can set it to slow for more tension or make it faster for urgency and action. I'll keep it slow here, because I want the viewer to feel that tension for a couple of seconds before the next scene comes. And here's what we get back. The motion is exactly what I wanted. You can instantly feel the tense atmosphere, and the character's body language already carries the emotion we set before we even wrote a single word.

But with the single shot mode, you can only generate one scene. It's good for testing quick ideas, but if you want to create a longer project, you'd need to stitch all the scenes together in real editing software, which can quickly get complicated. That's why Cinema Studio has the multi-shot manual mode. It has all the same features as the previous one, but here you can actually create a video with multiple shots in one go. You can build up to six shots in a single generation, each with its own prompt, motion, and duration. And even though the workflow is now way more powerful, Higgsfield keeps it very simple to use. You can add, adjust, and even move these shots around with a mouse click, which would have taken multiple complex prompts just a few months ago.

So I'll create three different scenes for this example. For the first one, I'll paste in this prompt. For the camera movement, I'll go with the zoom in, and then slow down the speed ramp so it builds up more tension inside the scene. For the duration, I'll go with four seconds. Now, in the next scene, she'll raise her rifle, so I'll paste in this prompt. For this, I'll make everything more alert by adding a zoom in as she tracks the target and changing the speed ramp. For scene three, I'll keep it at just three seconds and paste in this prompt. This is where the impact happens, so I'll try the hero mode for the speed. Now, this is exactly the moment where everything we built so far comes together. So let's click generate and see what happens. The character stays consistent across all three scenes, the pace follows the exact emotional arc we planned, and everything looks cinematic. It's actually insane what you can build in just a few minutes as a complete beginner.
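When you're planning a multi-shot generation, it helps to lay the shot list out before touching the tool. Here's a hypothetical sketch: the keys mirror the controls used above (prompt, camera, speed ramp, duration), but the structure is my own planning format, and the prompt strings are stand-ins since the exact prompts aren't shown on screen.

```python
# Hypothetical planning format for a multi-shot generation. The keys
# mirror Cinema Studio's controls; the prompt strings are stand-ins.
shots = [
    {"prompt": "soldier scans the burning skyline",           # stand-in text
     "camera": "zoom in", "speed_ramp": "slow", "seconds": 4},
    {"prompt": "she raises her rifle and tracks the target",  # stand-in text
     "camera": "zoom in", "speed_ramp": "fast",               # ramp change assumed
     "seconds": 4},                                           # duration assumed
    {"prompt": "the impact hits near the rooftop",            # stand-in text
     "camera": "static",                                      # camera assumed
     "speed_ramp": "hero", "seconds": 3},
]

# Cinema Studio caps a single generation at six shots.
assert len(shots) <= 6

total = sum(s["seconds"] for s in shots)
print(f"{len(shots)} shots, {total}s total")
```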
Now, at this point, you already have a full cinematic video. But if you want to take this even further, there's one more thing you can do, and this is where one of the newest AI models comes in. I'm talking about Seedance 2.0, which is finally available inside Higgsfield. What makes this different is the way it works with references. Instead of starting from an image, you can take an entire video and use it as a reference to create a new one. Behind the scenes, it rebuilds your video in a completely new way while keeping the same structure and motion. This means you can test different styles, change the mood, or even push the realism further without losing the consistency you already have. So here's the new video I got from Seedance just by referencing a video and using this prompt.

[17:15]Take a look at how it compares side by side with the original clip. The results are absolutely insane. In my opinion, this is the best AI feature I've seen this month, and you can access it right now inside Higgsfield. So if you want to start building your own AI videos without getting overwhelmed, go sign up for Higgsfield with the link in the description below. Thanks for watching, and I'll see you in the next one.
