
Meet Ante - How We Developed Our Own AI Chatbot

Juraj S. · 8 min read · Mar 14, 2024 · Technology
Contents:
What are the challenges in AI chatbot development
This is how we developed our very own custom chatbot
How we refined Ante's intelligence with RAG architecture
The benefits we got from our custom chatbot
Conclusion to our AI chatbot development story

As with all good software, it starts with a problem. As Devōt grew in size, we started developing processes and guidelines for certain things, such as a referral program, employee benefits, and education courses. We've documented everything to provide employees with a single source of truth. However, navigating through all this documentation can be tiresome and time-consuming for employees.

This problem led us on the chatbot development path. The goal was to develop an AI-powered chatbot that would give us answers as quickly as possible.

That is how "Ante" was born. Instead of relying on colleagues who might unintentionally provide incorrect information, you can turn to your AI colleague, Ante, for reliable answers. Ante is capable of answering a wide range of questions related to the company's policies, benefits, procedures, and other relevant information.

What are the challenges in AI chatbot development

Creating applications that rely on a large language model, especially for AI chatbots, has its fair share of challenges. With these sorts of applications, we have no control over the output; it’s a magical black box that we try to steer in the right direction but ultimately has a mind of its own. This situation requires special attention to ensure chatbot users (in this case, our employees) have a good experience.

1. Does your chatbot hallucinate?

There is a concern over the risk of fabricated answers when developing custom chatbots. If the chatbot model doesn’t have access to proprietary or niche data, it will generate answers for things it doesn’t know or have context for, leading to users receiving incorrect information. Without citations to verify the source of the content, it can be difficult to confirm whether or not a certain response is hallucinated.

In the tech world, it's called "hallucination" because the system generates responses or information based on patterns it has learned, without any real understanding or basis in factual data, much like how a hallucination is a perception of something that isn't actually present.

2. Even context has its limits

If you have been using ChatGPT, then you have probably heard about the importance of context in AI chatbots. Chatbot models need context with every prompt to improve answer quality and relevance, but there’s a size limit to how much additional context a query can support.

In practice, this means a chatbot will never have all the context that the person using it possesses, no matter how large the context window grows.

3. High query latencies

Adding context to chatbot models significantly increases the cost and time of each query. Incorporating more context requires greater processing resources, and very long contexts can push the compute requirements beyond what is practical to manage.

Essentially, every piece of added context extends the waiting time for responses from the Large Language Model (LLM), as the model needs to process a larger volume of data to generate its responses. Therefore, it's crucial to carefully select the data we provide to the model, ensuring that it's both useful and doesn't overload the system.

4. Inefficient knowledge updates

AI models require tens of thousands of high-cost GPU training hours to retrain on up-to-date information. Once the training process is completed, the AI model is stuck in a “frozen” version of the world it saw during training.

This is how we developed our very own custom chatbot

Using ChatGPT's language model

Chatbots like ChatGPT are powered by AI language models. LLMs (large language models) are trained to predict language based on large datasets of written text. So, if you want to build your own AI chatbot, you need a model trained on an extensive corpus of human-generated text.

This is why we chose to use ChatGPT's advanced language model as our foundation.

Customizing AI chatbot Ante to Devōt needs

Our custom AI chatbot got access to the Devōt Employee Handbook, where we keep documentation related to company matters. Over the years, we have refined our handbook to have all answers to common employee questions in one place.

The process seems straightforward: send the employee handbook with the question and let Ante find the answer.

Well, sending the entire Employee Handbook is neither optimal nor possible. The number of tokens you can use in each request to the ChatGPT API is limited, so we need to extract the most relevant information from the handbook related to the user's question.

Simplifying search with vector embeddings

To find the information most relevant to the user's question, we need a way to measure how similar two pieces of text are. When a user asks about a specific topic, such as "X," the relevant information will likely be found in sections where "X" is discussed. How do we identify which parts of the text relate to the question at hand? The answer lies in vector embeddings. This technique transforms text passages into a compact vector form, making it possible to measure their relevance to the user's query.

This representation makes it possible to translate semantic similarity to proximity in a vector space.

This way, instead of sending the entire Employee Handbook, we can find the most relevant sections and send them with the question to ChatGPT.

[Image: What are vector embeddings]
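To make this concrete, here is a minimal sketch (not our production code) of comparing two texts by embedding them and measuring cosine similarity. It assumes OpenAI embeddings via LangChain's JS package; import paths vary across LangChain versions, and the handbook sentence is hypothetical:

```typescript
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

// Cosine similarity: close to 1 means the texts point the same way in vector space.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const embeddings = new OpenAIEmbeddings(); // reads OPENAI_API_KEY from the environment

// Embed a handbook sentence and a user question into the same vector space.
const section = await embeddings.embedQuery(
  "Every employee is entitled to an annual education budget." // hypothetical handbook line
);
const question = await embeddings.embedQuery("Do we have an education budget?");

console.log(cosineSimilarity(section, question)); // a high score marks a relevant section
```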

Choosing Pinecone as the vector database

Because the employee handbook is stored as vector embeddings, we need a database specialized for this data type. We chose Pinecone, a vector database, for its efficient and fast lookup of nearest neighbors in an N-dimensional space, which is exactly what we need.

What does that actually mean? It means that vector databases are optimized to swiftly identify the closest points to any given query point in a space, regardless of the number of dimensions (referred to as N-dimensional space). Even though the image above uses two-dimensional spaces, it's important to note that embeddings can span more than a thousand dimensions.
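In code, a nearest-neighbor lookup might look like this sketch, using the official @pinecone-database/pinecone client; the index name and the question are hypothetical:

```typescript
import { Pinecone } from "@pinecone-database/pinecone";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pinecone.index("employee-handbook"); // hypothetical index name

// Embed the question, then ask Pinecone for the closest handbook chunks.
const queryVector = await new OpenAIEmbeddings().embedQuery(
  "How does the referral program work?"
);

const results = await index.query({
  vector: queryVector,
  topK: 4,               // the four nearest neighbors
  includeMetadata: true, // metadata carries the original chunk text
});

console.log(results.matches.map((m) => m.score)); // similarity scores, best first
```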

How we refined Ante's intelligence with RAG architecture

RAG is a powerful paradigm for enhancing natural language processing models. It excels in scenarios where domain-specific knowledge is crucial for delivering precise and contextually appropriate responses. We augment the model with private data so it can answer domain-specific questions better. In this case, our domain knowledge is Devōt itself.

The reason behind RAG

The simplest approach is to ask ChatGPT a question directly. We know that a general question, such as "how to bake bread," will get a correct response, assuming the relevant information was available online when ChatGPT's model was trained.

[Image: AI-powered chatbots and RAG]

Of course, ChatGPT knows nothing about Devōt. So, the next idea was to keep track of previous questions and answers so that Ante could learn from them. For example, if we ask a question and flag the response as incorrect, the system should learn from this feedback and avoid repeating the same error in the future.

[Image: Keeping track of previous questions in the custom chatbot]

This means that for nearly all questions about Devōt, Ante will either respond with "I don’t know" or provide hallucinated (fabricated) answers. So, this approach remains inadequate. The direction for the next iteration seems pretty clear. We need to equip Ante with knowledge about Devōt before he answers the questions.

[Image: AI chatbots being equipped with knowledge]

Implementing RAG (Retrieval Augmented Generation) for domain-specific knowledge

RAG has two main stages: indexing, and retrieval & generation.

Indexing

First, we index our domain-specific data, our Employee Handbook, by converting it to embeddings and then storing it in a vector database.
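A sketch of what that indexing step can look like with LangChain's JS utilities; the file name, chunk sizes, and index name are assumptions for illustration:

```typescript
import { readFile } from "node:fs/promises";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { PineconeStore } from "langchain/vectorstores/pinecone";
import { Pinecone } from "@pinecone-database/pinecone";

// Load the handbook and split it into overlapping chunks small enough to embed.
const handbookText = await readFile("employee-handbook.md", "utf8"); // hypothetical file
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const docs = await splitter.createDocuments([handbookText]);

// Embed every chunk and store the vectors in Pinecone.
const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
await PineconeStore.fromDocuments(docs, new OpenAIEmbeddings(), {
  pineconeIndex: pinecone.index("employee-handbook"), // hypothetical index name
});
```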

Retrieval and generation

We retrieve the nearest neighbors of the user's question to find the relevant context in our vector database and feed it to the LLM.
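Conceptually, the two steps reduce to a retrieval call followed by a grounded prompt. Here is a hedged sketch, reusing the same hypothetical index; the prompt wording is ours:

```typescript
import { OpenAI } from "langchain/llms/openai";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { PineconeStore } from "langchain/vectorstores/pinecone";
import { Pinecone } from "@pinecone-database/pinecone";

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const vectorStore = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
  pineconeIndex: pinecone.index("employee-handbook"), // hypothetical index name
});

const question = "What benefits do employees get?";

// Retrieval: the handbook chunks nearest to the question.
const context = await vectorStore.similaritySearch(question, 4);

// Generation: ask the LLM to answer using only the retrieved context.
const llm = new OpenAI({ temperature: 0 });
const answer = await llm.invoke(
  `Answer using only the context below.\n\nContext:\n${context
    .map((doc) => doc.pageContent)
    .join("\n---\n")}\n\nQuestion: ${question}`
);
```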

Ready to see RAG in action? Let's dive in!

Bringing Ante to life - Our development process

The codebase

Keeping track of the chat history is done with the useState React hook. The chat starts with a message from Ante.

Adding a new user question to the history is done through the state setter. It preserves the existing state by creating a new object with the spread operator and modifies only the messages property to include the new user message. By creating a new object, we update the state in an immutable manner.

Then, we send the question and the chat history to the backend for processing and update the chat with Ante’s response.
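A simplified sketch of that flow as a React component; the message shape, greeting, and /api/chat endpoint are assumptions, not our exact code:

```tsx
import { useState } from "react";

type Message = { role: "ante" | "user"; content: string };

export function Chat() {
  // The chat starts with a message from Ante.
  const [chat, setChat] = useState<{ messages: Message[] }>({
    messages: [{ role: "ante", content: "Hi, I'm Ante. What would you like to know?" }],
  });

  async function ask(question: string) {
    // Immutable update: spread the old state and replace only `messages`.
    setChat((prev) => ({
      ...prev,
      messages: [...prev.messages, { role: "user", content: question }],
    }));

    // Send the question and the chat history to the backend for processing.
    const res = await fetch("/api/chat", { // hypothetical endpoint
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ question, history: chat.messages }),
    });
    const { answer } = await res.json();

    // Append Ante's response, again without mutating previous state.
    setChat((prev) => ({
      ...prev,
      messages: [...prev.messages, { role: "ante", content: answer }],
    }));
  }

  return (
    <ul>
      {chat.messages.map((message, i) => (
        <li key={i}>
          {message.role}: {message.content}
        </li>
      ))}
      {/* ...plus an input field that calls ask() on submit... */}
    </ul>
  );
}
```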

We achieve this architecture by creating a handler function that extracts the question and history from the request body.

It validates the request, allowing only POST requests and rejecting any request without a question.

Then comes the important bit: Vector Store Initialization. Here, we initialize a vector store in which our Employee Handbook is indexed.
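As a sketch, a Next.js API route implementing these steps could look like this; the route shape and error messages are assumptions:

```typescript
import type { NextApiRequest, NextApiResponse } from "next";
import { OpenAIEmbeddings } from "langchain/embeddings/openai";
import { PineconeStore } from "langchain/vectorstores/pinecone";
import { Pinecone } from "@pinecone-database/pinecone";

export default async function handler(req: NextApiRequest, res: NextApiResponse) {
  // Only POST requests are allowed.
  if (req.method !== "POST") {
    return res.status(405).json({ error: "Method not allowed" });
  }

  // Reject requests without a question.
  const { question, history } = req.body;
  if (!question) {
    return res.status(400).json({ error: "No question in the request" });
  }

  // Vector store initialization: connect to the index where the
  // Employee Handbook is already stored as embeddings.
  const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
  const vectorStore = await PineconeStore.fromExistingIndex(new OpenAIEmbeddings(), {
    pineconeIndex: pinecone.index("employee-handbook"), // hypothetical index name
  });

  // `question`, `history`, and `vectorStore` feed the chain shown in the next section.
}
```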

Integration of LangChain and improving conversational AI

Chains allow us to chain together multiple calls in a logical sequence, enabling us to keep track of the state of the conversation while also retrieving the context we need to provide to the LLM.

Each conversation uses a new ConversationalRetrievalQAChain imported from LangChain, which we pair with the OpenAI LLM. This chain is specialized for follow-up questions. For instance, if someone asks, "In which projects do we use TypeScript?" and then follows up with "How about Ruby on Rails?", the follow-up would not make sense on its own.

LangChain will add a step combining the chat history and the question into a standalone question. It then performs the standard retrieval steps of looking up relevant documents from the retriever and returns a response.

Finally, we issue a call to the chain, which then returns an object containing the answer from the LLM.
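Continuing the handler sketch from above, the chain setup and call could look like this; returnSourceDocuments and the response shape are our illustrative choices:

```typescript
import { OpenAI } from "langchain/llms/openai";
import { ConversationalRetrievalQAChain } from "langchain/chains";

// `vectorStore`, `question`, `history`, and `res` come from the handler sketch above.
const model = new OpenAI({ temperature: 0 });
const chain = ConversationalRetrievalQAChain.fromLLM(model, vectorStore.asRetriever(), {
  returnSourceDocuments: true, // also return the handbook chunks used as context
});

// LangChain first condenses (history + question) into a standalone question,
// then retrieves relevant documents and generates the final answer.
const result = await chain.call({
  question,
  chat_history: history, // the prior turns, serialized as the chain expects
});

res.status(200).json({ answer: result.text, sources: result.sourceDocuments });
```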

Going the extra mile

We have now addressed the big problems with AI-powered applications, like hallucinations and context limits, but you might have noticed we went a step further.

We are also retrieving the documents LangChain looked up and actually showing the sources to the user on the front end.
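On the front end, rendering those sources can be as simple as this sketch; the document shape follows LangChain's Document type, and the component itself is hypothetical:

```tsx
type SourceDoc = { pageContent: string; metadata?: Record<string, unknown> };

// Show the handbook excerpts the answer was grounded in, so users can verify it.
function Sources({ docs }: { docs: SourceDoc[] }) {
  if (!docs.length) return null;
  return (
    <details>
      <summary>Sources</summary>
      {docs.map((doc, i) => (
        <blockquote key={i}>{doc.pageContent}</blockquote>
      ))}
    </details>
  );
}
```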

[Image: Ante showcasing his chatbot solutions]

The benefits we got from our custom chatbot

When you go down the path of chatbot development, the goal is to achieve tangible, concrete benefits that contribute directly to customer satisfaction. In this context, our customers were our employees. Here are some of the key benefits we got from the AI chatbot Ante:

1. Faster answers

Using AI chatbot development, we've significantly accelerated the pace at which employees can get answers. This expedited response time is not just about speed but also about making efficient use of our employees' time.

The AI-powered chatbot Ante uses advanced natural language processing (NLP) to understand and process queries in real time. This ensures that employees spend less time waiting and more time focusing on their core tasks.

2. Improved accuracy

By providing the sources straight from the documentation, we ensure the reliability of our chatbot.

Accuracy is important, especially if you are building an AI chatbot that needs to answer questions from employees. You do not want to spread misinformation among employees. This is why our chatbot development services have focused on integrating direct sources from our documentation.

This approach ensures that our chatbot's information is timely, reliable, and grounded in the company’s approved knowledge base. By employing NLP and machine learning algorithms, Ante can understand the specific details of user inquiries, pulling the most relevant and accurate information.

3. Reduced workload

Ante's implementation has been a game-changer in optimizing HR processes by autonomously handling routine inquiries. This capability has significantly reduced the workload on our HR department, allowing them to redirect their focus towards more strategic initiatives. You can improve overall operational efficiency by using AI chatbots like Ante.

Ante’s ability to provide instant, accurate responses to frequently asked questions means that HR professionals can concentrate on tasks that require a human touch, such as employee development and strategic planning, thereby adding more value to the company.


Conclusion to our AI chatbot development story

This project shows that a good idea for improving how we do things inside our company can also help us offer new services outside.

By welcoming change in the software development world, we learned a lot about RAG architecture, LLMs, and all the magic behind what the whole world is so excited about. AI is an esoteric tool that is quite simple in theory; it hallucinates text based on 1 trillion parameters generated from a compressed chunk of the internet. Nevertheless, it is changing how we interact with the world. Understanding all these concepts and tools allows you to use a multi-billion dollar company’s AI model to answer problems for your specific domain of knowledge.

If you have any questions about AI chatbot development, feel free to reach out to us.
