GitHub just published details about Qubot, its internal analytics agent that allows any employee to query company data in natural language. This approach democratizes access to business intelligence without requiring SQL skills. Here's how to build a similar system for your organization.
The Problem an Analytics Agent Solves
In most companies, data access remains a bottleneck. Business teams depend on data analysts to get reports. Analysts are buried under ad hoc requests. Strategic decisions sometimes wait days before being informed by numbers.
An AI analytics agent inverts this dynamic. Instead of formulating a SQL query or navigating a complex BI tool, users ask a question in plain language: "What are our sales by region this quarter?" The agent translates this question into a query, executes it against your databases, and returns an understandable response.
GitHub reports that Qubot processes over 2,000 queries per week from non-technical employees. Average time to get an answer dropped from several hours (waiting for an analyst) to under 30 seconds.
This transformation represents a considerable productivity gain. Analysts can focus on high-value tasks while business teams gain autonomy. The ripple effects extend throughout the organization, enabling faster and more data-driven decision making.
Architecture of a Modern Analytics Agent
Component 1: Conversational Interface
Users interact through a chat interface integrated into your existing tools: Slack, Microsoft Teams, or a dedicated web application. The interface must support:
- Natural language questions in multiple languages
- Display of tables and charts in responses
- Ability to ask follow-up questions
- Conversation history for context
Integration with existing collaboration tools is crucial for adoption. An agent accessible from Slack will be used daily, while an isolated application risks being forgotten.
Component 2: Understanding Engine (LLM)
A large language model (LLM) like GPT-4, Claude, or an open-source model like Llama 3 analyzes the user's question. The engine must:
- Identify the intent (data lookup, comparison, trend)
- Extract mentioned entities (products, regions, time periods)
- Determine relevant metrics (revenue, volume, margin)
- Handle ambiguities by asking for clarification
The model choice directly impacts response quality. Proprietary models like GPT-4 or Claude offer the best performance, while open-source models allow full data control.
Component 3: SQL Translation Layer
Once intent is understood, the agent generates the corresponding SQL query. This critical step requires:
- A well-documented database schema with business descriptions
- Query examples to guide the model (few-shot learning)
- Guardrails to prevent dangerous queries (DELETE, UPDATE)
- Syntax validation before execution
Schema documentation quality largely determines agent success. Invest time to name columns explicitly and add business descriptions.
Component 4: Data Connectors
The agent must access your data sources without exposing credentials. Implement:
- Secure connectors to your databases (PostgreSQL, MySQL, BigQuery)
- Permission system based on user identity
- Caching for frequent queries
- Timeout limits to avoid heavy queries
Component 5: Response Formatter
Raw results are transformed into understandable responses:
- Explanatory text summarizing key insights
- Formatted tables for tabular data
- Automatically generated charts when relevant
- Suggestions for follow-up questions
Step-by-Step Implementation
Step 1: Prepare Your Data Infrastructure
Before building the agent, ensure your data is accessible and documented.
Required actions:
- Centralize your data in a data warehouse (BigQuery, Snowflake, Redshift)
- Create a semantic schema with explicit column names
- Document each table with its business description
- Establish a SQL views layer to simplify common queries
Verification: A junior analyst should be able to understand your schema in under 30 minutes.
Step 2: Choose Your Technical Stack
For an SME, prefer managed solutions that reduce operational burden.
Option 1: Cloud stack (recommended for starting)
- LLM: OpenAI GPT-4 or Anthropic Claude API
- Interface: Slack or Microsoft Teams with native bot
- Backend: AWS Lambda or Google Cloud Functions
- Database: Your existing data warehouse
Option 2: On-premise stack (sensitive data)
- LLM: Self-hosted Llama 3 70B via Ollama or vLLM
- Interface: Custom web app (React + FastAPI)
- Backend: Dedicated server with GPU
- Database: Local PostgreSQL or ClickHouse
Monthly cost for a cloud solution serving 100 users ranges from $500 to $2,000 depending on query volume.
Step 3: Build the Translation Pipeline
The agent's core is its ability to transform a question into valid SQL.
Prompt engineering:
You are an analytics assistant. You translate questions into SQL for our database.
Available schema:
- sales(id, date, product_id, region, amount, quantity)
- products(id, name, category, unit_price)
- customers(id, name, segment, created_date)
Rules:
- Use only SELECT, never INSERT/UPDATE/DELETE
- Limit results to 1000 rows maximum
- Aggregate data when the question is about trends
- Return NULL if the question is out of scope
User question: {question}
Few-shot examples: Add 5-10 question examples with their expected SQL to guide the model. This technique considerably improves generated query accuracy.
Step 4: Implement Guardrails
An agent without controls can expose sensitive data or execute destructive queries.
Mandatory security:
- Dangerous keyword filtering (DROP, TRUNCATE, DELETE)
- User filter injection in every query (WHERE user_id = ?)
- 30-second maximum timeout per query
- Logging all queries for audit
- Rate limiting per user (50 queries/hour)
Result validation:
- Verify returned columns are authorized
- Automatic masking of personal data (emails, phone numbers)
- Alert if returned data volume is abnormally high
Step 5: Test with Pilot Users
Deploy the agent to a restricted group before general launch.
Test protocol:
- Select 10 users with varied profiles (sales, marketing, finance)
- Ask them to pose 20 natural questions each
- Measure success rate (correct vs incorrect response)
- Collect failed questions to improve the prompt
- Iterate until reaching minimum 85% accuracy
GitHub indicates they iterated for 3 months before reaching satisfactory accuracy on Qubot. Factor this refinement time into your planning.
Best Practices from GitHub's Experience
Document Your Schema in Business Language
The LLM performs better when column and table names match user vocabulary. "monthly_revenue" is clearer than "rev_m" or "ca_mensuel".
Offer Suggested Questions
Display examples of questions the agent can handle. This educates users and reduces out-of-scope queries. Contextual suggestions based on user role further improve the experience.
Implement a Feedback Loop
Allow users to flag incorrect responses. Use this feedback to enrich your few-shot examples and refine the prompt. This continuous improvement mechanism is essential for long-term quality.
Limit Initial Scope
Start with a specific business domain (sales, or marketing) before expanding company-wide. A specialized agent will be more accurate than a generalist.
Concrete Use Cases for Moroccan SMEs
Real-Time Sales Tracking
Sales teams can query performance without waiting for the weekly report. "How many leads did we convert this week in Casablanca?" becomes an instant-answer question.
Self-Service Financial Analysis
The CFO can explore data without mobilizing the IT team. "What's our margin rate by product category over the last 6 months?" AI transformation services enable integrating these analyses into your existing workflows.
Operational Monitoring
Logistics managers track indicators in real time. "How many orders are delayed in delivery today?" The agent can even trigger proactive alerts.
Strategic Decision Support
Executives get quick answers to prepare board meetings. "How has our market share evolved compared to last year?" CRM and ERP solutions can be enhanced with these natural query capabilities.
Costs and Expected ROI
Initial Investment
- Prototype development: 2-4 weeks for a senior developer
- Cloud costs (LLM + infrastructure): $500-1,500/month
- Integration with your tools: 1-2 additional weeks
Measurable Gains
- 70% reduction in analyst time on ad hoc queries
- Decision acceleration (from days to minutes)
- Business team autonomy (less IT dependency)
- Better data adoption in company culture
GitHub estimates Qubot's ROI at over 200% in the first year, mainly through analyst productivity gains.
Mistakes to Avoid
Not Validating Generated SQL Queries
The LLM can produce syntactically correct but semantically wrong queries. Always implement human review for critical decisions.
Ignoring Data Governance
Ensure the agent respects existing access rules. A marketing user shouldn't be able to query HR data.
Underestimating Maintenance Needs
Schemas evolve, LLM models are updated, users change. Plan for ongoing maintenance of your agent.
Promising Too Soon
Communicate clearly about the agent's limitations. Better a tool that handles 80% of cases well than a tool promised as magical that disappoints.
Advanced Considerations
Multi-Database Federation
Large organizations often have data spread across multiple systems. Consider implementing a federation layer that allows the agent to query across different databases while maintaining security boundaries.
Caching Strategy
Implement intelligent caching based on query patterns. Frequently asked questions about the same time periods or dimensions can be pre-computed, reducing both latency and API costs.
Audit and Compliance
For regulated industries, maintain detailed logs of all queries, results, and user interactions. This audit trail is essential for compliance with GDPR, HIPAA, or industry-specific regulations.
FAQ
Which LLM should I choose for an analytics agent?
To start, GPT-4 or Claude offer the best quality-to-ease-of-integration ratio. If your data is sensitive and cannot leave your infrastructure, self-hosted Llama 3 70B is a viable alternative. API cost for GPT-4 averages around $0.03 per query.
How do I handle ambiguous questions?
Implement a clarification mechanism. If the agent detects ambiguity ("sales" could mean revenue or volume), it asks the user to clarify before executing the query. This improves accuracy and educates users.
Can the agent modify data or only read it?
For security reasons, limit the agent to read operations (SELECT). If you need write operations, implement them as separate workflows with mandatory human validation.
How long does it take to deploy a functional analytics agent?
A functional prototype can be deployed in 2-4 weeks. However, reaching 85% or higher accuracy typically requires 2-3 months of iteration with real users. Factor this refinement time into your planning.
How do I measure the agent's success?
Track these metrics: successful query rate (target: over 85%), average response time (target: under 30 seconds), number of queries per user per week, and user satisfaction score (NPS or CSAT). GitHub also measures "time saved" by comparing with the previous process.
