Mar 4, 2026 · automation

Building MK ai: From a Stateless Bot to a Context-Aware AI WhatsApp Assistant

By Manases

MK ai did not start as a complex system. It began as a simple experiment: connect AI to WhatsApp and make it reply.

What it became is something very different: a context-aware, location-intelligent, persistent AI assistant deployed in production.

This is the story of that evolution.


Version 1: The Stateless AI Bot


The first iteration of this project was a straightforward integration between the WhatsApp Business API and the Gemini API.

The architecture was simple:

  1. Receive a message from WhatsApp.
  2. Send the message to Gemini.
  3. Return the AI-generated response.
  4. End the request.
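That loop can be sketched as a single pure function. The names below are placeholders for illustration: `generate_reply` stands in for the Gemini call, and `handle_incoming` is not code from the original repo.

```python
def generate_reply(text: str) -> str:
    # Placeholder standing in for the Gemini API call in Version 1.
    return f"Echo: {text}"

def handle_incoming(text: str) -> str:
    # Statelessness in one line: the reply depends only on this message.
    # Nothing is read from or written to storage, so every request
    # forgets the one before it.
    return generate_reply(text)
```

Because nothing is persisted, sending the same message twice always yields the same reply, with no awareness of the earlier turn.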

There was no database.

No memory.

No context retention.

Each message was treated independently.

The first version is still available on GitHub:

https://github.com/Manasess896/Whatsapp-Bot

At the time, the goal was proof of concept — confirm that:

  1. Webhooks worked correctly.
  2. AI responses could be generated dynamically.
  3. Messages could be sent back through WhatsApp reliably.

It worked.

But it had limitations.

The bot forgot everything immediately. Conversations felt artificial because there was no continuity between messages.

Then the Gemini API documentation changed.

That update forced me to revisit the implementation. Instead of patching the existing system, I made a decision: rebuild the architecture properly.

That decision led to the rebranding.


Rethinking the Architecture


If I was rebuilding the project, I wanted to solve the original limitations:

  1. Add conversational memory
  2. Improve inference speed
  3. Introduce contextual intelligence
  4. Make it production-ready


Phase 1: WhatsApp Cloud API Integration


The foundation remained the WhatsApp Cloud API via Meta’s developer platform.

The setup involved:

  1. Creating a Business App
  2. Enabling WhatsApp Cloud API
  3. Generating a test phone number
  4. Configuring webhook verification

Meta requires a verification handshake to confirm server ownership. I implemented a verification route using Flask:

from flask import Flask, request

app = Flask(__name__)

@app.route("/webhook", methods=["GET"])
def verify():
    # VERIFY_TOKEN must match the token configured in the Meta dashboard.
    if request.args.get("hub.verify_token") == VERIFY_TOKEN:
        return request.args.get("hub.challenge"), 200
    return "Forbidden", 403

Once verified, the system could receive real-time message events.
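Message events arrive as POST requests to the same endpoint. The helper below is a sketch of parsing those payloads: the nested field names follow the Cloud API's documented event shape, but `extract_text_message` is an illustrative name, not code from the repo.

```python
def extract_text_message(payload: dict):
    """Pull (sender, text) out of a WhatsApp Cloud API webhook event.

    Returns None for non-text events such as delivery status updates.
    """
    try:
        value = payload["entry"][0]["changes"][0]["value"]
        msg = value["messages"][0]
        if msg.get("type") != "text":
            return None
        return msg["from"], msg["text"]["body"]
    except (KeyError, IndexError):
        return None
```

The POST route can call this helper and return 200 immediately, since Meta retries webhook deliveries that time out or fail.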


Phase 2: Local Development with Ngrok


Meta requires a public HTTPS endpoint. Ngrok was used to expose the local Flask server:

ngrok http 5000

This allowed live webhook testing without deploying the application repeatedly.


Phase 3: Switching AI Providers


In the first version, I used Gemini. During the rebuild, I reviewed the updated documentation and pricing for the Gemini API and decided to look for an alternative. I transitioned to Groq, which serves Llama-based models, for its clear documentation and pricing; its free tier comes with reasonable rate limits. The integration was straightforward:

ai_client = Groq(api_key=GROQ_API_KEY)

However, the real work was in prompt design. I created a structured system prompt that:

  1. Defined the bot's tone and behavior
  2. Enforced strict privacy guidelines
  3. Reduced hallucination risk
  4. Maintained conversational consistency
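Prompt assembly can be sketched as follows. The wording of `SYSTEM_PROMPT` and the model name in the commented Groq call are illustrative assumptions, not the production prompt.

```python
SYSTEM_PROMPT = (
    "You are MK ai, a helpful WhatsApp assistant. Be concise and friendly. "
    "Never reveal other users' data, do not invent facts, and keep a "
    "consistent tone across the conversation."
)

def build_messages(history: list, user_text: str) -> list:
    """Assemble the chat payload: system prompt first, then prior turns,
    then the new user message. History items are {"role", "content"} dicts."""
    return (
        [{"role": "system", "content": SYSTEM_PROMPT}]
        + history
        + [{"role": "user", "content": user_text}]
    )

# With the Groq client (model name assumed):
# completion = ai_client.chat.completions.create(
#     model="llama-3.1-8b-instant",
#     messages=build_messages(history, incoming_text),
# )
# reply = completion.choices[0].message.content
```

Keeping the system prompt as the first message on every call is what enforces tone and privacy rules consistently, regardless of how the conversation drifts.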


Phase 4: Introducing Persistent Memory with MongoDB


The biggest architectural upgrade was adding memory. I introduced MongoDB as the persistence layer. Its document structure aligned naturally with webhook payloads and message history storage.

The conversational pipeline became:

  1. Receive message
  2. Save message to database
  3. Retrieve last 30 messages
  4. Generate AI response using conversation history
  5. Save AI response

This enabled contextual continuity across sessions. The debugging phase here was significant. Early versions responded correctly but failed to store data consistently. Refactoring the order of database writes and AI generation resolved synchronization issues. The result was a bot that could reference prior discussions naturally.
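A sketch of that memory layer, assuming one MongoDB document per message. Field names are illustrative, and the pymongo calls are indicated in comments so the helpers stay dependency-free.

```python
def make_doc(user_id: str, role: str, text: str, ts: float) -> dict:
    # messages.insert_one(doc) would persist this document in MongoDB.
    # Writing the user's message *before* generating the reply is the
    # ordering fix described above: the new message is part of history
    # even if AI generation fails.
    return {"user_id": user_id, "role": role, "content": text, "ts": ts}

def history_for_prompt(docs: list, limit: int = 30) -> list:
    # Equivalent to: messages.find({"user_id": ...}).sort("ts", -1).limit(30),
    # then reversing, so the prompt sees the last `limit` turns oldest-first.
    recent = sorted(docs, key=lambda d: d["ts"], reverse=True)[:limit]
    return [{"role": d["role"], "content": d["content"]} for d in reversed(recent)]
```

Capping the window at 30 messages keeps prompts within the model's context budget while still giving it enough history to reference prior discussions.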


Phase 5: Location-Aware Intelligence


While working with phone numbers, I realized they contain geographic metadata through international dial codes, so I introduced a location detection module based on structured country code mappings. Using a JSON file that maps every international dial code to its country metadata, the bot could identify a user's location and tailor its responses accordingly. Initially, location was detected only during the first interaction. Later, I refactored the system to update location metadata on every message event, accounting for user mobility.

Refactored logic:

location_data = detect_user_location(user_id)
if location_data:
    save_user_location(user_id, location_data, user_name)


This allowed the bot to adapt responses with subtle cultural awareness when relevant without explicitly asking users for their location.
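Dial-code detection itself can be sketched as a longest-prefix lookup. The inline mapping below is a tiny sample standing in for the full JSON file, and this `detect_user_location` is a hypothetical reimplementation, not the project's module.

```python
# Tiny sample of the dial-code metadata; the real file covers every code.
DIAL_CODES = {
    "254": {"country": "Kenya", "iso": "KE"},
    "44": {"country": "United Kingdom", "iso": "GB"},
    "1": {"country": "United States/Canada", "iso": "US/CA"},
}

def detect_user_location(phone_number: str):
    """Match the longest dial-code prefix of an E.164-style number.

    WhatsApp sender IDs are digit strings with the country code first,
    so the longest matching prefix identifies the country.
    Returns None when no dial code matches.
    """
    digits = phone_number.lstrip("+")
    # Longest prefix wins, so "254..." resolves to Kenya rather than
    # falling through to a shorter one-digit code.
    for length in (3, 2, 1):
        meta = DIAL_CODES.get(digits[:length])
        if meta:
            return meta
    return None
```

Because the lookup runs on every message event, a user whose number metadata changes is picked up automatically, matching the refactor described above.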


Phase 6: Deployment and Production Hardening


After stabilizing the architecture locally, I prepared the application for production deployment.

Steps included:

  1. Freezing dependencies in requirements.txt
  2. Configuring a Procfile for Gunicorn
  3. Pushing to GitHub
  4. Deploying to Heroku

Production debugging introduced new challenges: webhook validation errors, environment variable management, token handling, and intermittent 500 responses. Using structured logging and live log monitoring, I resolved these issues until webhook responses stabilized at consistent 200 OK status codes.
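For reference, a minimal Procfile for this setup is a one-liner, assuming the Flask app object is named `app` inside `app.py`:

```shell
web: gunicorn app:app
```

Heroku reads the `web` process type and runs Gunicorn in front of Flask, which is far more robust under concurrent webhook traffic than Flask's built-in development server.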


Final Architecture Overview


The bot now consists of:

  1. WhatsApp Cloud API for messaging
  2. Flask webhook server
  3. MongoDB for persistent memory
  4. Groq for AI inference
  5. Cloud hosting for production deployment

Compared to Version 1, this is no longer a simple message relay.

It is a multi-layered system that:

  1. Maintains conversational memory
  2. Updates contextual metadata dynamically
  3. Optimizes inference performance
  4. Operates reliably in production

What This Project Represents

This project reflects more than API integration skills.

It demonstrates:

  1. Architectural evolution in response to API changes
  2. Migration between AI providers
  3. Database-driven conversational design
  4. Contextual intelligence implementation
  5. Production debugging and stabilization

The transition from a stateless Gemini-based bot to a context-aware AI assistant is what truly defines this project.

You can interact with the bot at MK ai and view the documentation on GitHub.
