InspectRAG Domain-Wide Gmail Integration: Step-by-Step Guide
Overview
This guide provides detailed instructions for setting up and configuring InspectRAG's integration with Google Workspace for domain-wide Gmail access. This integration allows you to index and search emails across all users in your domain while maintaining appropriate access controls.
Prerequisites
Before beginning, ensure you have:
- Administrator access to your Google Workspace Admin Console
- Access to Google Cloud Console
- Ability to create and configure service accounts
- A running Redis server (used for temporary data storage and thread tracking)
- InspectRAG application installed and configured
Step 1: Create a Google Cloud Project
- Go to the Google Cloud Console
- Click on the project dropdown at the top of the page
- Click "New Project"
- Enter a project name (e.g., "InspectRAG-Integration")
- Click "Create"
- Wait for the project to be created and select it
Step 2: Enable Required APIs
- In your Google Cloud Project, navigate to "APIs & Services" > "Library"
- Search for and enable the following APIs:
  - Gmail API
  - Admin SDK API
  - Google Drive API (if you plan to index attachments)
Step 3: Create a Service Account
- Navigate to "IAM & Admin" > "Service Accounts"
- Click "Create Service Account"
- Enter a name (e.g., "inspectrag-gmail-integration")
- Add a description: "Service account for domain-wide access to Gmail"
- Click "Create and Continue"
- Skip role assignment (you'll set up domain-wide delegation instead)
- Click "Done"
Step 4: Create and Download Service Account Key
- From the Service Accounts list, click on your newly created service account
- Go to the "Keys" tab
- Click "Add Key" > "Create new key"
- Select "JSON" format
- Click "Create"
- Save the downloaded key file securely
Step 5: Enable Domain-Wide Delegation
- Still on your service account page, click on the "Details" tab
- Under "Domain-wide delegation," click "Edit"
- Check the box for "Enable Google Workspace Domain-wide Delegation"
- Add a product name for the consent screen (e.g., "InspectRAG Gmail Integration")
- Click "Save"
- Note your service account's Client ID for the next step
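The Client ID is also recorded in the JSON key downloaded in Step 4, so it can be read from the file instead of being copied by hand. The following is a minimal sketch, assuming the usual layout of a Google service account key (fields such as `type`, `client_email`, `client_id`, and `private_key`). The helper name `load_service_account_key` is hypothetical and is not part of InspectRAG.

```python
import json
from pathlib import Path

# Fields assumed to be present in a Google service account JSON key.
REQUIRED_FIELDS = ("type", "client_email", "client_id", "private_key")


def load_service_account_key(path):
    """Load a service account key file and check that it looks complete."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))

    # Collect every expected field that is absent or empty.
    missing = [field for field in REQUIRED_FIELDS if not data.get(field)]
    if missing:
        raise ValueError("key file is missing fields: " + ", ".join(missing))

    # A user OAuth client file has a different "type" and will not work here.
    if data["type"] != "service_account":
        raise ValueError(f"unexpected key type: {data['type']!r}")

    return data
```

Calling `load_service_account_key("/path/to/service_accounts_key.json")["client_id"]` would then return the value to enter in the Admin Console in the next step.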
Step 6: Configure Google Workspace Admin Console
- Go to your Google Workspace Admin Console
- Navigate to "Security" > "API controls"
- In the "Domain-wide Delegation" section, click "Manage Domain-wide Delegation"
- Click "Add new"
- Enter the Client ID of your service account
- Enter the following OAuth scopes:
- Click "Authorize"
Step 7: Set Up Environment Variables in InspectRAG
- In your InspectRAG installation directory, locate the environment configuration file (.env)
- Configure the following environment variables:
# Enable/disable domain email processing
ENABLE_DOMAIN_EMAILS=true
# Processing interval (in seconds)
DOMAIN_EMAIL_INTERVAL=15 # How frequently to check for new emails
# Domain-wide delegation settings
SERVICE_ACCOUNT_FILE=/path/to/service_accounts_key.json
WORKSPACE_DOMAIN=yourdomain.com
WORKSPACE_ADMIN=admin@yourdomain.com
DOMAIN_EMAIL_QUERY=is:inbox newer_than:1d
# Redis connection (required for thread tracking)
RAG_REDIS_URL=redis://localhost:6379/0
- Replace the values with your specific configuration:
  - SERVICE_ACCOUNT_FILE: Full path to your downloaded service account key
  - WORKSPACE_DOMAIN: Your Google Workspace domain name (without @ symbol)
  - WORKSPACE_ADMIN: Email of an administrator in your domain
  - DOMAIN_EMAIL_QUERY: Gmail search query to determine which emails to index
  - RAG_REDIS_URL: Your Redis server connection string
Environment Variable Details
Variable | Description | Example |
---|---|---|
ENABLE_DOMAIN_EMAILS | Toggles the domain Gmail integration on/off | true |
DOMAIN_EMAIL_INTERVAL | How often to check for new emails (seconds) | 15 for testing, 300 (5 min) for production |
SERVICE_ACCOUNT_FILE | Path to your service account key file | /home/user/InspectRAG/service_accounts_key.json |
WORKSPACE_DOMAIN | Your Google Workspace domain | company.com |
WORKSPACE_ADMIN | Admin email for domain operations | admin@company.com |
DOMAIN_EMAIL_QUERY | Gmail search query for filtering emails | is:inbox newer_than:1d |
Note: After changing these environment variables, you'll need to restart the InspectRAG application for the changes to take effect.
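Because a typo in any of these variables only shows up once a task runs, it can help to validate them at startup. The sketch below is one way to do that and is not InspectRAG's own loader: the function name `load_domain_email_settings`, the default interval of 300 seconds, and treating a missing `ENABLE_DOMAIN_EMAILS` as disabled are all assumptions made for illustration.

```python
import os

# Variables the guide lists as needed when the integration is enabled.
REQUIRED = (
    "SERVICE_ACCOUNT_FILE",
    "WORKSPACE_DOMAIN",
    "WORKSPACE_ADMIN",
    "DOMAIN_EMAIL_QUERY",
    "RAG_REDIS_URL",
)


def load_domain_email_settings(env=None):
    """Read and sanity-check the domain email settings from a mapping."""
    env = os.environ if env is None else env

    # Assumption: anything other than "true" means the integration is off.
    enabled = env.get("ENABLE_DOMAIN_EMAILS", "false").strip().lower() == "true"
    if not enabled:
        return {"enabled": False}

    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))

    # Assumption: fall back to 300 seconds when no interval is given.
    interval = int(env.get("DOMAIN_EMAIL_INTERVAL", "300"))
    if interval <= 0:
        raise ValueError("DOMAIN_EMAIL_INTERVAL must be a positive number of seconds")

    # The domain is written without the @ symbol.
    if "@" in env["WORKSPACE_DOMAIN"]:
        raise ValueError("WORKSPACE_DOMAIN must not contain '@'")

    return {
        "enabled": True,
        "interval_seconds": interval,
        "service_account_file": env["SERVICE_ACCOUNT_FILE"],
        "workspace_domain": env["WORKSPACE_DOMAIN"],
        "workspace_admin": env["WORKSPACE_ADMIN"],
        "query": env["DOMAIN_EMAIL_QUERY"],
        "redis_url": env["RAG_REDIS_URL"],
    }
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise in tests.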
Step 8: Store the Service Account Key
- Create a directory for storing service account keys if it doesn't exist:
- Set appropriate permissions:
- Copy your service account key file to this directory:
- Set appropriate file permissions:
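The four steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the guide's own commands: the helper name `install_key` is hypothetical, and the modes `0o700` for the directory and `0o600` for the file are a common owner-only choice rather than values given here. The account that runs InspectRAG would need to own both.

```python
import os
import shutil
from pathlib import Path


def install_key(source, key_dir):
    """Copy a key file into key_dir and restrict access to the owner."""
    key_dir = Path(key_dir)

    # Create the directory if it does not exist yet.
    key_dir.mkdir(parents=True, exist_ok=True)
    # Assumed mode: only the owner may list or enter the directory.
    os.chmod(key_dir, 0o700)

    # Copy the key, keeping its original file name.
    target = key_dir / Path(source).name
    shutil.copyfile(source, target)
    # Assumed mode: only the owner may read or write the key.
    os.chmod(target, 0o600)

    return target
```

For example, `install_key("service_accounts_key.json", "/etc/app/service-accounts")` would place the key in the directory used by the Celery examples later in this guide.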
Step 9: Test the Configuration
- Run the following Celery task to verify access to your domain:
- Check the logs to ensure the task is running successfully
- Verify that user emails are being discovered and processed
Step 10: Configure Scheduled Tasks
- Set up a periodic task to regularly sync emails using Celery Beat
- Add the following configuration to your Celery Beat schedule:
# In your Celery configuration
from celery.schedules import crontab  # needed for the crontab() schedules below

app.conf.beat_schedule = {
    # Index the previous day's inbox mail for every user in the domain
    'sync-domain-emails-daily': {
        'task': 'tasks.domain_email_tasks.process_all_domain_users',
        'schedule': crontab(hour=1, minute=30),  # Runs at 1:30 AM daily
        'args': (
            '/etc/app/service-accounts/yourdomain-gmail.json',
            'yourdomain.com',
            'is:inbox newer_than:1d'
        )
    },
    # Refresh thread tracking after the sync has had time to finish
    'track-domain-threads-daily': {
        'task': 'tasks.domain_email_tasks.track_domain_threads',
        'schedule': crontab(hour=3, minute=0),  # Runs at 3:00 AM daily
        'args': (
            '/etc/app/service-accounts/yourdomain-gmail.json',
            'yourdomain.com'
        )
    },
}
Step 11: Fine-Tune Query Parameters
For optimal performance and coverage, you can adjust the Gmail query parameters:
Query Parameter | Description | Example |
---|---|---|
newer_than: | Only emails newer than the given age (for example, 7d for seven days) | newer_than:7d |
is:inbox | Only inbox emails | is:inbox |
is:sent | Only sent emails | is:sent |
is:anywhere | All emails (inbox, sent, archived); Gmail documents this operator as in:anywhere, which also covers Spam and Trash | is:anywhere |
has:attachment | Only emails with attachments | has:attachment |
-label:spam | Exclude spam | -label:spam |
Common query combinations:
- is:inbox newer_than:7d -label:spam (new inbox emails, not spam)
- is:anywhere newer_than:30d has:attachment (all recent emails with attachments)
- is:sent newer_than:14d (recently sent emails)
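Combinations like the ones above can be assembled programmatically so that the sync window and filters are adjusted in one place. The helper below is a small sketch; `build_gmail_query` and its parameters are hypothetical names, and it simply joins the operators listed in the table with spaces.

```python
def build_gmail_query(location="is:inbox", newer_than_days=None,
                      has_attachment=False, exclude_spam=False):
    """Join Gmail search operators into a single query string."""
    parts = [location]

    if newer_than_days is not None:
        if newer_than_days <= 0:
            raise ValueError("newer_than_days must be positive")
        # Gmail expresses the age as a number followed by a unit; d means days.
        parts.append(f"newer_than:{newer_than_days}d")

    if has_attachment:
        parts.append("has:attachment")

    if exclude_spam:
        parts.append("-label:spam")

    # Operators separated by spaces are combined with an implicit AND.
    return " ".join(parts)
```

For instance, `build_gmail_query("is:inbox", 7, exclude_spam=True)` produces the first combination in the list, and the result can be used as the value of `DOMAIN_EMAIL_QUERY`.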
Step 12: Monitor and Troubleshoot
- Check Celery logs for errors:
- Monitor Redis for thread tracking information:
- Common issues and solutions:
Issue | Possible Cause | Solution |
---|---|---|
Authentication errors | Incorrect service account configuration | Verify scopes in Google Workspace Admin Console |
Task failures | Redis connection issues | Check Redis server status and connection URL |
Missing emails | Restrictive query | Adjust query parameters to be more inclusive |
No users found | Insufficient admin permissions | Verify admin account has directory access |
Rate limiting | Too many API requests | Implement progressive backoff; spread tasks over time |
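For the rate-limiting row, "progressive backoff" is read here as exponential backoff with random jitter, which is an interpretation rather than something this guide specifies. The sketch below wraps any callable; the name `call_with_backoff`, its defaults, and the choice of which exceptions count as retryable are all assumptions, since that depends on the API client in use.

```python
import random
import time


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying with exponentially growing delays on failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            # Double the delay on every retry, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% random jitter so parallel workers do not retry in step.
            sleep(delay + random.uniform(0, delay / 2))
```

In practice `retry_on` would be narrowed to the exception the client raises for rate-limit responses, so that genuine configuration errors fail immediately instead of being retried.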
Step 13: Securing Sensitive Information
- Ensure service account keys are stored securely
- Implement access controls to the email index
- Set up appropriate user authentication for InspectRAG
- Configure Redis to require authentication if exposed to network
Advanced Configuration
Processing Specific Users
To process emails for specific users only:
from tasks.domain_email_tasks import fetch_gmail_domain_wide_task

fetch_gmail_domain_wide_task.delay(
    user_id="yourdomain.com:user@yourdomain.com",
    service_account_file="/etc/app/service-accounts/yourdomain-gmail.json",
    subject_email="user@yourdomain.com",
    query="is:inbox newer_than:7d"
)
Cross-Thread Analysis
To analyze relationships between email threads across users:
from tasks.domain_email_tasks import fetch_domain_threads

fetch_domain_threads.delay(
    service_account_file="/etc/app/service-accounts/yourdomain-gmail.json",
    domain="yourdomain.com",
    admin_email="admin@yourdomain.com",
    query="is:anywhere newer_than:30d"
)
Performance Optimization
For large domains:
1. Process users in batches
2. Use different time ranges for initial vs. delta syncs
3. Schedule jobs during off-peak hours
4. Adjust Redis expiration settings for thread tracking
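Processing users in batches can be as simple as slicing the user list into fixed-size groups and handling one group at a time. The generator below is a minimal sketch; the name `in_batches` is hypothetical and the batch size is something to tune against your API quota.

```python
from itertools import islice


def in_batches(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # Take the next `size` items; the final batch may be shorter.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each batch could then be dispatched as its own group of tasks, with a pause between groups to spread the API load over time.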
This guide provides comprehensive instructions for setting up and configuring InspectRAG's domain-wide Gmail integration. Follow these steps carefully to ensure proper configuration and operation.