🧠 ALXR Agent

WebSocket Chat API

This project provides a real-time AI assistant powered by local LLMs and Retrieval-Augmented Generation (RAG).
It exposes a FastAPI WebSocket API that streams model responses chunk-by-chunk, supports product and document retrieval, and can be deployed locally or on cloud infrastructure.


🚀 Features

  • 🔸 Real-time streaming responses through WebSocket
  • 🪄 Local LLM support (GGUF via llama.cpp or Hugging Face Transformers)
  • 📚 Integrated document & product retrieval (RAG)
  • 🧭 Simple API key enforcement (optional)
  • ⚡ Fast cold-start through pre-warming of embeddings
  • 🧩 CORS-enabled for frontend clients (e.g., React / Streamlit)

📦 Tech Stack

  • FastAPI – Web framework
  • uvicorn – ASGI server
  • llama.cpp / Hugging Face Transformers – Model inference
  • ChromaDB – Vector database for RAG
  • Python 3.10+

🧰 Installation

  1. Clone the repository

    git clone https://github.com/your-username/alxr-agent.git
    cd alxr-agent

  2. Install the dependencies

    pip install -r requirements.txt

  3. Set up environment variables

    Create a .env file in the root directory:

    ALXR_MODEL_TYPE=gguf
    ALXR_GGUF_MODEL_PATH=./Mistral-7B-Instruct-v0.3/mistral-7b-instruct-v0.3-Q5_K_M.gguf
    ALXR_TRANSFORMERS_MODEL_PATH=./Mistral-7B-Instruct-v0.3
    ALXR_CHROMA_PATH=./chroma_db
    ALXR_DOC_COLLECTION=documents
    ALXR_PRODUCT_COLLECTION=products
    # Optional
    ALXR_API_KEY=your_secret_key_here
    ALXR_HOST=0.0.0.0
    ALXR_PORT=8080

    # ==== Embeddings (LOCAL only) ====
    ALXR_EMBED_PATH=./models/bge-m3
    EMBED_BACKEND=local
    HF_HUB_OFFLINE=1
    TRANSFORMERS_OFFLINE=1
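
To illustrate how a server process might consume these variables, here is a minimal sketch that reads them into a typed settings object. The `Settings` class and `load_settings` function are hypothetical names, not taken from the project, and the fallback values simply mirror the example `.env` above; the actual server may read or validate its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the ALXR_* variables listed above."""
    model_type: str
    gguf_model_path: str
    transformers_model_path: str
    chroma_path: str
    doc_collection: str
    product_collection: str
    api_key: Optional[str]
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment-style mapping.

    Assumption: the fallbacks mirror the example .env; the project itself
    may use other defaults or require some of these variables.
    """
    raw_port = env.get("ALXR_PORT", "8080")
    try:
        port = int(raw_port)
    except ValueError as exc:
        raise ValueError(f"ALXR_PORT must be an integer, got {raw_port!r}") from exc

    return Settings(
        model_type=env.get("ALXR_MODEL_TYPE", "gguf").strip().lower(),
        gguf_model_path=env.get(
            "ALXR_GGUF_MODEL_PATH",
            "./Mistral-7B-Instruct-v0.3/mistral-7b-instruct-v0.3-Q5_K_M.gguf",
        ),
        transformers_model_path=env.get(
            "ALXR_TRANSFORMERS_MODEL_PATH", "./Mistral-7B-Instruct-v0.3"
        ),
        chroma_path=env.get("ALXR_CHROMA_PATH", "./chroma_db"),
        doc_collection=env.get("ALXR_DOC_COLLECTION", "documents"),
        product_collection=env.get("ALXR_PRODUCT_COLLECTION", "products"),
        # Assumption: an empty or missing key means enforcement is off.
        api_key=env.get("ALXR_API_KEY") or None,
        host=env.get("ALXR_HOST", "0.0.0.0"),
        port=port,
    )
```

Note that `os.environ` does not read a `.env` file by itself. If the project relies on the python-dotenv package, calling its `load_dotenv()` before `load_settings()` would populate the environment from the file.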

  4. Running the server

    python alxr_ws_server.py

    or with uvicorn directly:

    uvicorn alxr_ws_server:app --host 0.0.0.0 --port 8080

  5. WebSocket endpoint

    ws://localhost:8080/v1/chat/ws

    Example payload sent by the client after connecting:

    {
      "api_key": "your_secret_key_here",
      "message": "What is the price of CBD oil?",
      "history": []
    }
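
A client for this endpoint could look roughly like the sketch below, which uses the third-party `websockets` package (`pip install websockets`). The helper names `build_payload` and `chat` are hypothetical. The sketch assumes the server streams each chunk as a plain text frame and closes the connection when the answer is complete; the actual framing and end-of-stream signal are not specified here, so adjust the receive loop to match the server.

```python
import asyncio
import json
from typing import List, Optional

WS_URL = "ws://localhost:8080/v1/chat/ws"


def build_payload(
    message: str,
    history: Optional[List[dict]] = None,
    api_key: Optional[str] = None,
) -> str:
    """Serialize a chat request in the shape shown above.

    History entries are passed through untouched; their format is not
    documented here.
    """
    payload: dict = {}
    if api_key is not None:
        # Only include the key when one is configured (enforcement is optional).
        payload["api_key"] = api_key
    payload["message"] = message
    payload["history"] = history if history is not None else []
    return json.dumps(payload)


async def chat(message: str, api_key: Optional[str] = None, url: str = WS_URL) -> str:
    """Send one message and collect the streamed reply.

    Assumption: chunks arrive as text frames and the server closes the
    socket once the response is finished.
    """
    # Imported here so the payload helper stays usable without the package.
    import websockets

    chunks: List[str] = []
    async with websockets.connect(url) as ws:
        await ws.send(build_payload(message, api_key=api_key))
        async for frame in ws:  # iteration stops on a normal close
            text = frame if isinstance(frame, str) else frame.decode("utf-8")
            print(text, end="", flush=True)
            chunks.append(text)
    return "".join(chunks)
```

With the server running, `asyncio.run(chat("What is the price of CBD oil?", api_key="your_secret_key_here"))` would send the example payload and print the reply as it arrives. If the server keeps the connection open between turns, the loop needs an explicit stop condition instead of waiting for the close.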