Building MK ai: From a Stateless Bot to a Context-Aware AI WhatsApp Assistant
MK ai did not start as a complex system. It began as a simple experiment: connect AI to WhatsApp and make it reply.
What it became is something very different - a context-aware, location-intelligent, persistent AI assistant deployed in production.
This is the story of that evolution.
Version 1: The Stateless AI Bot
The first iteration of this project was a straightforward integration between the WhatsApp Business API and the Gemini API.
The architecture was simple:
- Receive a message from WhatsApp.
- Send the message to Gemini.
- Return the AI-generated response.
- End the request.
There was no database.
No memory.
No context retention.
Each message was treated independently.
The project is still available here:
GitHub Repository:
https://github.com/Manasess896/Whatsapp-Bot
At the time, the goal was proof of concept — confirm that:
- Webhooks worked correctly.
- AI responses could be generated dynamically.
- Messages could be sent back through WhatsApp reliably.
It worked.
But it had limitations.
The bot forgot everything immediately. Conversations felt artificial because there was no continuity between messages.
Then the Gemini API documentation changed.
That update forced me to revisit the implementation. Instead of patching the existing system, I made a decision: rebuild the architecture properly.
That decision led to the rebranding.
Rethinking the Architecture
If I was rebuilding the project, I wanted to solve the original limitations:
- Add conversational memory
- Improve inference speed
- Introduce contextual intelligence
- Make it production-ready
Phase 1: WhatsApp Cloud API Integration
The foundation remained the WhatsApp Cloud API via Meta’s developer platform.
The setup involved:
- Creating a Business App
- Enabling WhatsApp Cloud API
- Generating a test phone number
- Configuring webhook verification
Meta requires a verification handshake to confirm server ownership. I implemented a verification route using Flask:
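The original route isn't shown here, but the handshake is well defined: Meta sends a GET request with `hub.mode`, `hub.verify_token`, and `hub.challenge` query parameters, and the server must echo the challenge back. A minimal sketch (the `/webhook` path and environment variable name are assumptions, not the actual implementation):

```python
import os

from flask import Flask, request

app = Flask(__name__)

# The verify token is an arbitrary secret you choose and also enter in the
# Meta developer dashboard; reading it from the environment is assumed here.
VERIFY_TOKEN = os.environ.get("VERIFY_TOKEN", "my-secret-token")

@app.route("/webhook", methods=["GET"])
def verify_webhook():
    # Meta sends hub.mode, hub.verify_token and hub.challenge as query params.
    mode = request.args.get("hub.mode")
    token = request.args.get("hub.verify_token")
    challenge = request.args.get("hub.challenge")

    if mode == "subscribe" and token == VERIFY_TOKEN:
        # Echo the challenge back to complete the handshake.
        return challenge, 200
    return "Verification failed", 403
```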
Once verified, the system could receive real-time message events.
Phase 2: Local Development with Ngrok
Meta requires a public HTTPS endpoint. Ngrok was used to expose the local Flask server:
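The command itself isn't shown in the source; for a Flask dev server on its default port, it would look something like this (port 5000 is an assumption):

```shell
# Tunnel the local Flask server (default port 5000) to a public HTTPS URL
ngrok http 5000
```

The HTTPS forwarding URL that ngrok prints is then pasted into the webhook configuration in the Meta developer dashboard.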
This allowed live webhook testing without deploying the application repeatedly.
Phase 3: Switching AI Providers
In the first version, I used Gemini. During the rebuild, I reviewed the updated documentation and pricing for the Gemini API and decided to look for an alternative. I transitioned to Groq, which serves Llama-based models, for its clear documentation and pricing, and a free tier with reasonable rate limits. The integration was straightforward:
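A minimal sketch of that integration, assuming the official `groq` Python SDK; the model choice and helper names here are illustrative, not the actual implementation:

```python
import os

# Simplified stand-in for the full structured system prompt.
SYSTEM_PROMPT = "You are MK ai, a helpful WhatsApp assistant."

def build_messages(history, user_message):
    """Assemble the chat payload: system prompt, prior turns, then the new message."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages.extend(history)  # e.g. alternating user/assistant dicts
    messages.append({"role": "user", "content": user_message})
    return messages

def generate_reply(history, user_message):
    """Call Groq's chat completions endpoint (requires GROQ_API_KEY)."""
    from groq import Groq  # pip install groq
    client = Groq(api_key=os.environ["GROQ_API_KEY"])
    completion = client.chat.completions.create(
        model="llama-3.1-8b-instant",  # assumed model; pick from Groq's catalog
        messages=build_messages(history, user_message),
    )
    return completion.choices[0].message.content
```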
However, the real work was in prompt design. I created a structured system prompt that:
- Defined the bot's tone and behavior
- Enforced strict privacy guidelines
- Reduced hallucination risks
- Maintained conversational consistency
Phase 4: Introducing Persistent Memory with MongoDB
The biggest architectural upgrade was adding memory. I introduced MongoDB as the persistence layer. Its document structure aligned naturally with webhook payloads and message history storage.
The conversational pipeline became:
- Receive message
- Save message to database
- Retrieve last 30 messages
- Generate AI response using conversation history
- Save AI response
This enabled contextual continuity across sessions. The debugging phase here was significant. Early versions responded correctly but failed to store data consistently. Refactoring the order of database writes and AI generation resolved synchronization issues. The result was a bot that could reference prior discussions naturally.
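The pipeline above can be sketched as follows; `db` stands for a pymongo database handle and `generate_reply` for the AI call, both assumptions about the real implementation. Note the write order: the incoming message is persisted *before* generation, which is the fix described above for the early synchronization issues.

```python
from datetime import datetime, timezone

def build_history(docs, limit=30):
    """Sort stored messages by timestamp and keep the newest `limit`, oldest first."""
    recent = sorted(docs, key=lambda m: m["ts"])[-limit:]
    return [{"role": m["role"], "content": m["content"]} for m in recent]

def handle_incoming(db, phone_number, text, generate_reply):
    """One pass through the pipeline: save, retrieve, generate, save."""
    # 1. Persist the user's message before doing anything else.
    db.messages.insert_one({"phone": phone_number, "role": "user",
                            "content": text, "ts": datetime.now(timezone.utc)})
    # 2. Load this user's stored messages and keep the last 30.
    docs = db.messages.find({"phone": phone_number})
    history = build_history(docs, limit=30)
    # 3. Generate the AI response from the conversation history.
    reply = generate_reply(history)
    # 4. Persist the assistant's reply, then return it for sending via WhatsApp.
    db.messages.insert_one({"phone": phone_number, "role": "assistant",
                            "content": reply, "ts": datetime.now(timezone.utc)})
    return reply
```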
Phase 5: Location-Aware Intelligence
While working with phone numbers, I realized they contain geographic metadata through international dial codes. I introduced a location detection module based on structured country code mappings. Using a JSON file that maps every international dial code to its country metadata, the bot could identify a user's country and tailor its responses accordingly. Initially, location was detected only during the first interaction. Later, I refactored the system to update location metadata on every message event, accounting for user mobility.
Refactored logic:
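A sketch of that logic; the JSON schema, file name, and function names are assumptions. Longest-prefix matching matters because dial codes overlap (e.g. +1 for the US/Canada vs. +1242 for the Bahamas):

```python
import json

# Assumed entry layout in the dial-code file, e.g.:
# {"dial_code": "+254", "country": "Kenya"}
def load_dial_codes(path="dialcodes.json"):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def detect_country(phone_number, dial_codes):
    """Return the entry with the longest dial-code prefix matching the number."""
    number = phone_number if phone_number.startswith("+") else "+" + phone_number
    best = None
    for entry in dial_codes:
        if number.startswith(entry["dial_code"]):
            if best is None or len(entry["dial_code"]) > len(best["dial_code"]):
                best = entry
    return best  # None if no prefix matched

def refresh_location(user_doc, phone_number, dial_codes):
    """Run on every message event, not just the first, to track user mobility."""
    match = detect_country(phone_number, dial_codes)
    if match is not None:
        user_doc["country"] = match["country"]
    return user_doc
```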
This allowed the bot to adapt responses with subtle cultural awareness when relevant without explicitly asking users for their location.
Phase 6: Deployment and Production Hardening
After stabilizing the architecture locally, I prepared the application for production deployment.
Steps included:
- Freezing dependencies in requirements.txt
- Configuring a Procfile for Gunicorn
- Pushing to GitHub
- Deploying to Heroku

Production debugging introduced new challenges:
- Webhook validation errors
- Environment variable management
- Token handling
- Intermittent 500 responses

Using structured logging and live log monitoring, I resolved these issues until webhook responses stabilized at consistent 200 OK status codes.
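The deployment artifacts aren't shown in the source; a typical setup for a Flask app on Heroku looks roughly like this (the `app:app` module path and remote name are assumptions):

```shell
# Freeze the environment so the host installs the same versions
pip freeze > requirements.txt

# Procfile: one line telling Heroku to serve the Flask app with Gunicorn,
# assuming the Flask object is named `app` inside app.py
echo "web: gunicorn app:app" > Procfile

# Deploy by pushing to the Heroku git remote
git push heroku main
```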
Final Architecture Overview
The bot now consists of:
- WhatsApp Cloud API for messaging
- Flask webhook server
- MongoDB for persistent memory
- Groq for AI inference
- Cloud hosting for production deployment
Compared to Version 1, this is no longer a simple message relay.
It is a multi-layered system that:
- Maintains conversational memory
- Updates contextual metadata dynamically
- Optimizes inference performance
- Operates reliably in production
What This Project Represents
This project reflects more than API integration skills.
It demonstrates:
- Architectural evolution after API changes
- Migration between AI providers
- Database-driven conversational design
- Contextual intelligence implementation
- Production debugging and stabilization
The transition from a stateless Gemini-based bot to a context-aware AI assistant is what truly defines this bot.
You can interact with the bot at MK ai and view the documentation on GitHub.