
Transcript of Stop Hallucinations! Best n8n AI Agent Settings Explained

Video Transcript:

I'm sure most of the videos you have watched so far just show you how to add this AI Agent node, give a rough prompt in the system message, add one random chat model, and then attach a bunch of tools. That's it. But that's a very naive way of creating AI agents. If you truly want to become an expert and build professional-grade AI agents that work in the real world and perform tasks as desired, then you need to watch this video. I will unveil the correct way of creating AI agents with all the required configurations: what to consider while selecting a chat model, how to make the AI agent output data in your desired format, the correct way of using memory, the best way of writing the system prompt, what to add and what to remove from it, and so on. I will also share the system prompt templates and guides that you can use directly in your workflows, so keep watching till the end. Let's get started.

Right now this AI agent is kind of boring and useless. It doesn't know what it needs to do, and it has no capability, no brain, and no memory. Let's configure it step by step to make it the best AI agent in the world.

Let's start with the most important component of this AI agent, which is the brain. Choosing the chat model depends on what you want this AI agent to act as: do you want it to act like an Einstein, or like the best marketer in the world? If we click on this Chat Model button, we can see a long list of large language model options to choose from. Anthropic is really good for longer documents and a safer tone. Azure OpenAI is just OpenAI hosted on Azure, and it provides enterprise security, so if you're looking for that, you can go ahead with this option. AWS Bedrock has models hosted on AWS. DeepSeek has good coding and reasoning capabilities. The Google Gemini Chat Model gives you multimodal LLMs in the Google ecosystem; they're strong at vision, text, and voice, and they integrate well with Google Cloud. Groq is really good for ultra-fast responses, where you're looking for answers in milliseconds, almost real time. Mistral and Ollama are open source, which means they can be hosted on your own server and you keep full control of all the data; if that's what you're looking for, you can explore these options. OpenRouter is really interesting: if you don't want to create separate accounts for all these different companies like OpenAI, Anthropic, and so on, you can create one single account on OpenRouter and access all of those models with a single API key. This is really powerful, and you can even compare different models and experiment with which one performs well on that platform. And finally we have the OpenAI Chat Model, which gives access to a large number of OpenAI models.

OpenAI has reasoning models, chat models, and cost-optimized models. The reasoning models are the most powerful, but they are the most expensive as well: o1, o3, and o1-pro are quite costly but have really good reasoning capabilities. If your use case is brainstorming ideas, where you're trying to come up with solutions to complex problems, these are the models you should be looking at. If you're building a chat assistant, doing general question answering or summarization, or maybe writing a blog, I think GPT-4.1 and GPT-4o are really good, and GPT-4o Audio gives you audio processing capability as well. And if you're looking for cost-optimized models, you can go for the smaller ones like o4-mini, GPT-4.1 mini, GPT-4o mini, and so on.

So let's say I select the OpenAI Chat Model from here; then I can select the model I want to use for this AI agent. However, it's possible that you want to use different models depending on the user's query or the type of task being given to the AI agent. In that case, you can create one more AI agent just before this one and give it a system message like: "You are a lightweight model-router assistant. Given the user query, output only one model." We have given it only three models right now, but you can add more as per your requirements, along with decision rules for when to select which model; it will then output the name of that model, and that's the model we'll use in the primary AI agent. The router itself needs a chat model, and we can use a cheaper one, say the Google Gemini Chat Model with Gemini 2.0 Flash; that should be fine.

To use the dynamically generated model name, we add the OpenRouter Chat Model. You can go to OpenRouter, then to your profile, click on Keys, create your own API key, and use it to create a credential over here. Once you do that, you'll start seeing all the models provided by OpenRouter. But since we need to set the model dynamically, we select Expression, and in the expression we use the model name generated by the previous agent. Let's run it with the query "what is 2 × 5 × 50 × 50", and it returned openai/o1. It looks like it's appending a newline (\n) at the end, so we need to remove that. In the primary AI agent we first change the prompt source from the connected Chat Trigger node to "Define below", then go to the "When chat message received" node and drag the chat input over into the prompt field. Next, in the OpenRouter Chat Model node, we map the router's output into the Model field, but we need to trim the trailing \n, so we can use the trimEnd() function to do that. With that, our model is selected successfully, and that's how we can dynamically select the model for our AI agent.
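For reference, the expression in the OpenRouter Chat Model's Model field could look roughly like this (a minimal sketch; "Model Router" is just a placeholder for whatever your router agent node is actually named):

    {{ $('Model Router').item.json.output.trimEnd() }}

trimEnd() strips the trailing newline so OpenRouter receives a clean model id like openai/o1.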
As a general rule of thumb, if you want to select the best model for your use case, this is what you can do. If you're looking for reasoning capabilities, you can choose from DeepSeek or OpenAI's o1. One thing to keep in mind is that the DeepSeek R1 model does not support function calling at this time, which means you won't be able to add tools with that model; the AI agent can only use it to think and respond, not to execute tools. If you want your AI agent to execute tools, you'll need to select OpenAI's o1 or o3 models. If you have dependencies on certain vendors like Azure or AWS, you can go for the AWS Bedrock Chat Model or the Azure OpenAI Chat Model. If you're looking for accuracy, you can go for GPT-4o or GPT-4.1, or Anthropic's Claude models. If you have strict privacy requirements and can't use external services, you can go for Ollama or Mistral. In most cases, though, you just want to focus on OpenAI, Anthropic, and Google Gemini. Now let me remove all of these extra nodes to keep the explanation clear and select OpenAI's GPT-4.1 model.

In this chat model there are a couple of options we need to understand. The first option is frequency penalty. It penalizes repeated words or phrases in the generated output; a higher frequency penalty makes the model less likely to repeat itself in the final response. You can set it around 0.5 to 1 to encourage fresh and varied responses. In fact, you can start with the default settings, and if you don't like the responses generated by the AI agent, then you should check all of these settings to improve it further; you can fine-tune the model's behavior through them.

The second option is maximum number of tokens, which limits the length of the AI-generated responses; let's say you don't want responses from this chat model to exceed 200 tokens.

Then we have one of the most important settings here: sampling temperature. You can think of temperature like the spice level in your food. If the temperature is zero, there are literally no spices in your food; it's completely bland. If you set the sampling temperature to one, your food is highly spicy. A low temperature produces accurate, factual, and predictable responses; a higher temperature generates imaginative, expressive, and sometimes really unpredictable responses. Its value goes from 0 to 1. As an example, say our prompt is "describe coffee in one sentence". At a temperature of 0.1 it may give an output like "coffee is a hot drink made from roasted beans". At 0.7 it may respond with "coffee is a warm hug in a mug that fuels your morning". At 1 it may generate "coffee is the liquid symphony of sunrise, chaos, and bliss". You can see the difference. So when do you use what? If your use case is a support chatbot or a frequently-asked-questions chatbot, the suggested temperature is 0.2 to 0.3, so it sticks to the facts. If you're creating Instagram captions, blogs, and so on, you can set it from 0.6 to 0.9 so that it becomes creative. If your use case is a code helper or a tools agent, you may want to set it to 0.1 to 0.3 so that you don't get any surprises.

Timeout is how long this node waits before considering the request timed out if it doesn't get a response back from OpenAI; you can leave the default value. Next is max retries, the number of times this node will retry the request to OpenAI if it fails.

And finally we have top P, also called nucleus sampling. It decides how many word choices the AI considers when replying. An LLM is essentially a prediction machine: for a given query it tries to predict the next set of words and returns them as the response. If we set top P to 1, the AI is allowed to consider every possible word for the response. If we set top P to 0.3, the AI can only choose from the most likely words that make up the top 30% of the probability mass; you can think of it like a word filter. With the same prompt, "describe coffee in one sentence", a top P of 0.3 may give "coffee is a caffeinated drink"; at 0.9 we may get "coffee is the fuel of productivity"; and at 1 we may get "coffee is lava for your soul in a ceramic cup".

The final rule of thumb: if you want reliable answers, set temperature to 0.2 and top P to 0.7. For creative writing, set temperature from 0.5 to 0.8 and top P to 1. If you're looking for structured JSON output, set temperature to 0.1 and top P to 0.5. I hope you can see how powerful these settings are, and how important they are for unlocking the true potential of an AI agent.
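To make these knobs concrete, here is a rough sketch of how the same settings map onto a direct OpenAI API call in JavaScript (illustrative only; in n8n you set these through the chat model node's options rather than writing code, and the values below are just the "reliable answers" preset from the rule of thumb above):

    import OpenAI from "openai";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

    const response = await client.chat.completions.create({
      model: "gpt-4.1",
      messages: [{ role: "user", content: "Describe coffee in one sentence." }],
      temperature: 0.2,       // low "spice": factual and predictable
      top_p: 0.7,             // nucleus sampling: restrict to the most likely words
      frequency_penalty: 0.5, // discourage repeated words and phrases
      max_tokens: 200,        // cap the length of the generated response
    });

    console.log(response.choices[0].message.content);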
Now let's quickly talk about memory. If we attach a memory node, it helps this AI agent remember past conversations and context. If the AI agent is handling an ongoing conversation, you have to add memory; if it's handling one-off tasks where previous context is not required, you can skip it. The worst practice is using the Simple Memory node in production. You should only use it during POCs or while building workflows, that's it; it uses your machine's memory, and it may fill it up and ultimately crash your n8n instance. Instead, use one of the other memories from this list. The best and easiest one is Postgres Chat Memory, and you can use Supabase for this; they have a pretty generous free plan. Create an account, go to the dashboard, click Connect at the top, scroll down to Transaction Pooler, expand View Parameters, copy each of those fields one by one into the new credential's fields, and save. Then you just need to give a table name over here; it can be anything, say "conversations" or "chat_histories", and that's it; this node will take care of creating that table in Supabase. The important thing to consider here is the context window length: the number of past interactions shared with the AI agent as context. You may want to reduce or increase it depending on your AI agent's use case. If you expect long conversations with your customers, and context from older messages may still matter, you may want to increase it from 5 to 10; if the context from the past five interactions is good enough, just set it to five.

Now let's talk about tools. Tools give this AI agent the capability to take actions. You can literally have it perform any action in any service, as long as that service exposes its functionality via an API. There is a large number of predefined tools you can select from; you can in fact create your own workflow and attach it as a tool, write your own code and attach it as a tool, make an HTTP request to any service and attach it as a tool, and you can use the MCP Client Tool; if you don't know what MCP is, you must watch the previous video on the MCP client and how to use it. So let's say I select the Gmail tool, and I want to give this AI agent the capability to send messages on Gmail; it has a couple of fields over here: To, Subject, Message, and so on. Now, the best part is this button that says "Let the model define this parameter". If I select it, I don't need to define the value of that field; it will be populated dynamically by the AI agent at runtime. This is really amazing and powerful, but it also has limitations: if you leave a large number of fields to the AI agent, it may hallucinate and not provide proper values. So I would suggest using this option only for fields you really can't define beforehand, to reduce the overall hallucination.
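Under the hood, n8n typically fills such fields with a $fromAI expression, and you can also write one yourself. A minimal sketch for the Gmail tool's Subject and Message fields might look like this (the key names and descriptions here are mine; check the n8n docs for the exact signature in your version):

    {{ $fromAI('subject', 'A short subject line for the email', 'string') }}
    {{ $fromAI('message', 'The full body of the email to send', 'string') }}

The description argument is what the model sees when deciding what to fill in, so the clearer it is, the less room the agent has to hallucinate.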
Now let's take a look at the settings of the AI Agent node itself. As you can see, the first field takes the user's query. Then, in the options, we have the system message. The default system prompt is "You are a helpful assistant", and you should always update this system message for your use case so that you have more control and personalization, and you can give instructions on how the AI agent should behave.

We have three more options. Max iterations limits how many times the AI agent can loop through tool usage and reasoning before stopping. The default value is 10, and if you expect your AI agent to have a large amount of back-and-forth communication with your tools, you may want to increase it. Let's say you're doing some sort of deep analysis with multiple back-and-forths between the AI agent and its tools; if this number is capped at 10, your AI agent will stop once it hits that limit, so you may want to raise it. For other simple use cases like chatbots, you can leave the default value. The next option is return intermediate steps. When enabled, the output includes reasoning steps, tool invocations, and internal logic, not just the final result. You may want to enable it for debugging, audits, or transparency use cases; you can log all of those details somewhere and then perform complete audits on the performance of the AI agent. Otherwise you don't need it. Then we have automatically passthrough binary images: if the AI agent receives binary image input, this lets the image be passed through to the model, but you need to make sure the LLM you have selected is multimodal, like GPT-4o.

And this is what a really good prompt looks like. It contains a role at the top that tells the agent what it is, for example "You are a customer support agent at XYZ company". Then we define the primary goal of this AI agent. Then we define the domain knowledge or context, all the facts this agent should know. Then we give it a list of all the tools available to it, and when and why to use those tools. Then we have knowledge base search: if we're giving this agent access to a vector store or RAG tool, we should keep this section; otherwise we should remove it. Then there are formatting rules for how it should format the final response, then the style and tone of the responses, then safety and accuracy instructions, and finally some additional reasoning instructions if required. You can get this template in the free Skool community; you can find the link in the description, and you can just copy it into your workflows and modify it.
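As a rough skeleton, the sections described above could be laid out like this (a sketch of the structure only, with placeholder wording and made-up tool names, not the exact template from the community):

    # Role
    You are a customer support agent at XYZ company.

    # Primary Goal
    Resolve the customer's query accurately, and escalate when you are unsure.

    # Domain Knowledge / Context
    - Facts, policies, and product details the agent must know.

    # Tools
    - send_email: use when the customer asks for a written confirmation.
    - lookup_order: use when the query mentions an order number.

    # Knowledge Base Search
    Use the vector store tool when the answer may be in internal documents.
    (Remove this section if no RAG tool is attached.)

    # Format Rules
    Reply with a short, direct answer that can be sent to the user as-is.

    # Style and Tone
    Friendly, concise, professional.

    # Safety and Accuracy
    If you don't know something, say so; never invent order numbers or policies.

    # Reasoning
    Think step by step and decide whether a tool is needed before answering.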
Back in our example, this is the prompt I've given the agent for demonstration purposes. One thing to note is the format rules. If this agent is acting as a chat assistant, you may just want to say something like "create the final response for the user's query, which will be sent directly to the user". If you want some kind of JSON, you can spell that out in the format rules instead. However, you should not give these format rules at all if you are using an output parser. In our example, let's use the output parser, so I'll remove the format rules from the prompt, go back out, and enable "Require specific output format". Once we do that, we'll start seeing one more option to add an output parser over here.

Output parsers are the way to make sure the AI speaks a language our workflow understands; they make sure our AI doesn't just talk, but talks in a format our system can use. Let's break down the three output parsers. Really there are just two output parsers, and the third one exists to fix the other two; let me explain how. If I select the Structured Output Parser, here we can define the format in which we want this AI agent to return the response. So let's say we want this AI agent to create Instagram posts; then maybe we want it to output a title, the content, and tags.
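A JSON example for the Structured Output Parser could look something like this (the field names come from the Instagram example above; the values are placeholders):

    {
      "title": "AI in Healthcare",
      "content": "From faster diagnoses to smarter scheduling, AI is quietly transforming patient care...",
      "tags": ["AI", "healthcare", "innovation"]
    }

n8n can infer the expected structure from a JSON example like this, so the agent's response only passes when it matches the shape of title, content, and tags.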
Let's say I run this chat and give the prompt "create an Instagram post showing AI usage in healthcare". The AI agent works on the post, and there we have it: if I open the output, as you can see, it includes the title, content, and tags, which is exactly the format we were looking for.

Now let's talk about the second output parser, the Item List Output Parser. We should use it when we want the AI agent to generate a list of items. To do that, we need to select a separator and tell the parser what kind of separator to look for in the response generated by the AI agent. So let's say we give it a separator, and in the AI agent's prompt we tell it to separate the Instagram posts with that same separator string. Then if we open the chat and instruct it to create three different Instagram posts, the parser will produce three different items.

Now, it may happen that the AI agent is not always able to output the response in the given format, and for that purpose we have the third option: the Auto-fixing Output Parser. It is nothing but a wrapper on top of the two output parsers we just talked about. After adding this node, we need to select either the Item List or the Structured Output Parser under it. If I select the Structured Output Parser here, I can give it the same JSON example we gave earlier, and in the model field we can select the same OpenAI model. Once we do this, if we run it and the AI agent fails to generate the response in the given JSON format, the Auto-fixing Output Parser will make a request to the OpenAI model, sending the received response, the instructions, and the error, and have the response fixed to match the JSON format we defined over here. That's how it works.

And this is how you build a perfect AI agent that works in all situations, one that you can monitor and tweak to improve its performance, do a better job, reduce token usage, and reduce hallucination. If you found value in this video, don't forget to comment and share your insights, and don't forget to like the video and subscribe to the channel for more in-depth videos like these. Thanks for watching, and see you in the next one.

Channel: FuturMinds
