Admin Guide for InspectRAG 🛠️
About this Guide
This guide provides system administrators with step-by-step instructions to install, configure, and manage InspectRAG. It includes detailed installation guides for both Linux and Windows platforms, instructions on OpenID integration, LDAP configuration, SharePoint setup, real-time synchronization, and the use of environment variables for smooth deployment. Additionally, it covers the configuration of the docker-compose.yml file required for deployment, with necessary modifications for pulling Docker images.
Table of Contents
- Installation Guide
- Prerequisites
- Installation Steps
- System Setup
- Step 1: Install and Configure OpenID Provider
- Step 2: Integrate with Active Directory via LDAP
- Step 3: Set Up VectorDB
- Step 4: Configure SharePoint Integration
- Managing Users and Permissions
- Environment Variables Configuration
- System Monitoring and Logs
- Troubleshooting Common Issues
- Best Practices for Administrators
Installation Guide
This guide explains how to install and set up InspectRAG using the zip file with Docker and Docker Compose.
Prerequisites
Before proceeding with the installation, ensure that your environment meets the following requirements:
1. Personal Access Token (PAT):
- Request your Docker Hub Personal Access Token (PAT) from [email protected]. This token is required for Docker Hub authentication.
2. Software Requirements:
- Docker Engine: Version 20.10.x or newer.
- Docker Compose: Version 2.x or newer.
- GNU C Library (glibc): Version 2.28 or newer (ensure compatibility with the host system).
3. Hardware Requirements:
- RAM: Minimum 24 GB (32 GB recommended for optimal performance).
- CPU: At least 8 cores.
- Storage: 40 GB of available space (SSD recommended for better performance).
4. Network and Security:
- Ensure network access to all required external services.
- On-premises SharePoint should be accessible to the machine running the server.
5. System Preparation:
- Update your system to the latest package versions for compatibility with Docker, glibc, and related tools.
- Verify that your system meets the glibc version requirements.
6. Additional Notes:
- If running on Linux, check for dependencies or libraries required by Docker or Docker Compose.
- Confirm that the environment can handle large workloads if deploying resource-intensive components.
Ensure all prerequisites are met to avoid potential issues during installation.
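The software and hardware requirements above can be checked from a shell before running the installer. The following is a minimal sketch, assuming a Linux host with GNU coreutils (for sort -V); version_ge and check_prereqs are illustrative helper names and are not part of InspectRAG.

```shell
# Sketch only: version_ge and check_prereqs are illustrative helpers,
# not part of the InspectRAG installer.

# version_ge A B: succeeds when version A >= version B (uses GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_prereqs: prints a warning for each software minimum that is not met,
# then prints hardware figures for manual comparison with the list above.
check_prereqs() {
  docker_ver=$(docker version --format '{{.Server.Version}}' 2>/dev/null)
  compose_ver=$(docker compose version --short 2>/dev/null)
  compose_ver=${compose_ver#v}
  glibc_ver=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')

  version_ge "${docker_ver:-0}" 20.10 || echo "WARN: Docker Engine ${docker_ver:-not found}, need 20.10 or newer"
  version_ge "${compose_ver:-0}" 2.0 || echo "WARN: Docker Compose ${compose_ver:-not found}, need 2.x or newer"
  version_ge "${glibc_ver:-0}" 2.28 || echo "WARN: glibc ${glibc_ver:-not found}, need 2.28 or newer"

  mem_gib=$(awk '/^MemTotal:/ {printf "%d", $2 / 1048576}' /proc/meminfo)
  disk_gib=$(df -Pk . | awk 'NR == 2 {printf "%d", $4 / 1048576}')
  echo "CPU cores: $(nproc) (need 8 or more)"
  echo "RAM: ${mem_gib} GiB (minimum 24, recommended 32)"
  echo "Free disk in current directory: ${disk_gib} GiB (need 40)"
}
```

Source the file and call check_prereqs from the directory where you plan to extract the package, so that the disk figure reflects the right filesystem.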
Installation Steps
Step 1: Extract the Package File
Extract the provided package, which includes docker-compose.yml, InspectRAG-CLI, and related files, into a directory.
Step 2: Run the Install Script
Execute the installation script by running the following command:
Important Notes:
- The installer script is designed for Linux systems only.
- The installer should only be used for initial installation.
- After installation, use standard Docker Compose commands for daily operations.
- The -p flag must point to the full path where you extracted the files.
- Rename the docker-compose.yaml file to docker-compose.yml if you get a "path not found" error.
Environment Configuration:
- By default, the installer prompts you for various configuration values during installation.
- To bypass interactive configuration, supply the values with the --env-file parameter.
What the Install Script Does
- Check for Docker: The script checks if Docker is installed. If not, it installs Docker along with necessary dependencies.
- Check for Docker Compose: Similarly, it checks for Docker Compose and installs it if it's missing.
- Prompt for Docker Hub PAT: During execution, the script will request:
  - Your Docker Hub username.
  - The PAT provided by [email protected].
- Authenticate with Docker Hub: The script validates the PAT and authenticates with Docker Hub.
- Environment Variable Configuration: It will prompt you to input or confirm key environment variables:
  - RAG_PORT
  - POSTGRES_USER
  - POSTGRES_PASSWORD
  - SHAREPOINT_HOSTNAME, SHAREPOINT_SITE_PATH
  - RAG_OPENAI_API_KEY, OPENAI_API_KEY
  - GRAPH_API_CLIENT_ID, GRAPH_API_CLIENT_SECRET, GRAPH_API_TENANT_ID
  - CHUNK_SIZE (default: 1500)
  - CHUNK_SCORE_THRESHOLD (default: 0.5)
- Validate docker-compose.yml Configuration: It ensures you've correctly configured docker-compose.yml to use the provided environment variables. You'll be asked to confirm this step.
- Pull Required Docker Images: The script pulls these images from Docker Hub:
  - eunomatix/inspect-rag-api
  - eunomatix/inspect-rag:-celery
  - ankane/pgvector:latest
- Start Docker Compose: All services are launched using docker compose up -d.
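Because the installer asks for the variables listed above, it can help to confirm that a prepared environment file defines all of them before passing it to --env-file. The sketch below is an illustration, not part of the installer: env_get and check_required_env are hypothetical helper names, and the file is assumed to contain plain KEY=VALUE lines.

```shell
# Sketch only: env_get and check_required_env are hypothetical helpers.

# env_get KEY FILE: prints the value of KEY from a KEY=VALUE file (last match wins).
env_get() {
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# check_required_env FILE: reports each variable from the installer prompt list
# that is missing or empty; returns non-zero if any are.
check_required_env() {
  missing=0
  for key in RAG_PORT POSTGRES_USER POSTGRES_PASSWORD \
             SHAREPOINT_HOSTNAME SHAREPOINT_SITE_PATH \
             RAG_OPENAI_API_KEY OPENAI_API_KEY \
             GRAPH_API_CLIENT_ID GRAPH_API_CLIENT_SECRET GRAPH_API_TENANT_ID \
             CHUNK_SIZE CHUNK_SCORE_THRESHOLD; do
    if [ -z "$(env_get "$key" "$1")" ]; then
      echo "missing or empty: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

The file is read with sed rather than sourced into the shell, because values that contain spaces would otherwise be interpreted as commands.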
Post-Installation Steps
Step 3: Verify Installation
- Check Container Status: Ensure all services are running.
- Check Logs: Review logs for any errors during startup.
- Test InspectRAG API: Use the following curl command to verify the API is working. Replace placeholders with actual values, including a valid JWT access_token with the required roles and username.
curl -X POST "http://RAG_API_URL/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -H "x-user-name: janedoe" \
  -H "accesstoken: YOUR_ACCESS_TOKEN" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {
        "role": "user",
        "content": "Hello, InspectRAG!"
      }
    ]
  }'
- Important:
  - Ensure the Bearer token in the Authorization header matches YOUR_ACCESS_TOKEN.
  - The x-user-name header can be the preferred_username from your JWT (for example, "janedoe").
  - The accesstoken header also needs the same token as in Authorization.
  - Confirm that the JWT (access token) includes the roles granting access to the documents or resources you want to query.
If you receive a valid JSON response from InspectRAG, your installation and authentication flow are correctly configured. If you encounter any 401 Unauthorized or 403 Forbidden errors, verify that:
- The token is unexpired and has the correct roles (e.g., "Members").
- The Authorization header and the accesstoken header match.
- Your roles in the JWT align with the permissions set within InspectRAG and your identity provider.
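The container status and log checks in Step 3 correspond to standard Docker Compose commands, and the 401/403 guidance can be scripted around the status code that curl reports. This is a minimal sketch, assuming it is run from the directory containing docker-compose.yml; RAG_API_URL and ACCESS_TOKEN are placeholders you must set, and the function names are illustrative.

```shell
# Sketch only: function names are illustrative; set RAG_API_URL and ACCESS_TOKEN first.

# show_stack_state: lists container status and recent logs for the Compose project.
show_stack_state() {
  docker compose ps
  docker compose logs --tail 100
}

# explain_status CODE: maps an HTTP status code to the checks described above.
explain_status() {
  case "$1" in
    2??) echo "OK: installation and authentication flow look correct" ;;
    401) echo "Unauthorized: check that the token is unexpired and that both token headers match" ;;
    403) echo "Forbidden: check that the token roles match the permissions on the requested content" ;;
    *)   echo "Unexpected status $1: review the container logs" ;;
  esac
}

# probe_api: sends the test request and explains the resulting status code.
probe_api() {
  code=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
    "http://${RAG_API_URL}/v1/chat/completions" \
    -H "Authorization: Bearer ${ACCESS_TOKEN}" \
    -H "Content-Type: application/json" \
    -H "x-user-name: janedoe" \
    -H "accesstoken: ${ACCESS_TOKEN}" \
    -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello, InspectRAG!"}]}')
  explain_status "$code"
}
```

Call show_stack_state first; if a container is not running, probe_api will report an unexpected status rather than an authentication problem.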
Additional Notes
- Ensure your environment variables match the requirements of your deployment.
- If the script encounters issues, check that docker-compose.yml exists and is correctly configured.
- For further assistance, contact [email protected].
- To set up SharePoint Online or On-Prem, see the Online Guide or the OnPrem Guide.
2. System Setup
Prerequisites
Ensure the following prerequisites are met:
- Active Directory (AD) is configured for managing users and roles.
- OpenID provider (e.g., Keycloak or Azure AD) is installed and accessible.
- VectorDB is installed to store embeddings and query data.
- SharePoint Online or On-Prem is available with the required permissions.
Step 1: Install and Configure OpenID Provider
- Set Up OpenID Provider:
  - Install and configure your OpenID Provider (e.g., Keycloak or Azure AD) to handle authentication.
- Enable LDAP Integration:
  - Integrate LDAP with your OpenID provider to import users and roles from AD.
- Configure OpenID Clients for InspectRAG:
  - Specify the callback URI for InspectRAG.
  - Generate and securely store the client ID and secret.
Step 2: Integrate with Active Directory via LDAP
- Configure LDAP Settings:
  - Connect to the AD server using the LDAP protocol.
  - Provide the bind DN and password for querying the AD server.
  - For further details, refer to the LDAP Configuration Guide.
- Synchronize Users and Roles:
  - Import users and groups from AD into the OpenID provider.
  - Ensure hierarchical roles are correctly mapped.
Step 3: Set Up VectorDB
- Install VectorDB:
  - Use Docker or your preferred setup to install VectorDB.
- Initialize the Database:
  - Create schemas to store embeddings and metadata.
- Connect InspectRAG to the Database:
  - Provide the database connection details in the .env file.
Step 4: Configure SharePoint Integration
For SharePoint Online:
- Register an App in Azure AD:
  - Register a new application for Microsoft Graph API access.
- Grant API Permissions:
  - Add permissions: Sites.Read.All, Sites.Manage.All.
- Set Up Webhooks:
  - Configure webhooks to receive real-time file change notifications.
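Once the app registration exists, its credentials can be sanity-checked against Microsoft Graph before InspectRAG uses them. The sketch below assumes the client-credentials grant on the Microsoft identity platform v2.0 token endpoint and the Graph form for addressing a site by hostname and path; the function names are illustrative, and the variables mirror the GRAPH_API_* and SHAREPOINT_* settings used elsewhere in this guide.

```shell
# Sketch only: function names are illustrative, not part of InspectRAG.

# graph_site_url HOSTNAME SITE_PATH: builds the Graph URL for a site addressed by path.
graph_site_url() {
  echo "https://graph.microsoft.com/v1.0/sites/$1:$2"
}

# graph_token_request: requests an app-only token (client-credentials grant).
# Expects GRAPH_API_TENANT_ID, GRAPH_API_CLIENT_ID and GRAPH_API_CLIENT_SECRET
# to be set in the environment.
graph_token_request() {
  curl -s -X POST "https://login.microsoftonline.com/${GRAPH_API_TENANT_ID}/oauth2/v2.0/token" \
    --data-urlencode "client_id=${GRAPH_API_CLIENT_ID}" \
    --data-urlencode "client_secret=${GRAPH_API_CLIENT_SECRET}" \
    --data-urlencode "scope=https://graph.microsoft.com/.default" \
    --data-urlencode "grant_type=client_credentials"
}

# graph_site_lookup TOKEN: fetches the configured site with the given bearer token.
# Expects SHAREPOINT_HOSTNAME and SHAREPOINT_SITE_PATH in the environment.
graph_site_lookup() {
  curl -s -H "Authorization: Bearer $1" \
    "$(graph_site_url "$SHAREPOINT_HOSTNAME" "$SHAREPOINT_SITE_PATH")"
}
```

If graph_site_lookup returns an error body instead of site details, recheck the granted permissions and admin consent on the app registration.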
For SharePoint On-Prem:
- Deploy Event Receivers:
  - Use PowerShell scripts to deploy event receivers.
- Enable InspectRAG Plugin:
  - Install and enable the InspectRAG plugin.
- Verify LDAP Integration:
  - Ensure roles and permissions sync between AD and SharePoint.
3. Managing Users and Permissions
Role Management in OpenID
- Sync Roles and Groups:
  - Ensure roles and groups from AD are synced with the OpenID provider.
- Map Roles to SharePoint Permissions:
  - Align roles with permissions in SharePoint.
- Assign Roles Appropriately:
  - Assign roles based on departments or group membership.
Configuring SharePoint Permissions
- Set Document-Level Permissions:
  - Assign permissions at the document level.
- Align with AD Roles:
  - Ensure permissions correspond to user roles from AD.
Monitoring Synchronization
- Utilize Webhooks and Event Receivers:
  - Capture file changes and role updates in real time.
- Immediate Reflection of Changes:
  - Ensure role changes in AD are promptly updated.
4. Environment Variables Configuration
Ensure the following environment variables are configured correctly in your .env file.
# =============================================================================
# Server Configuration
# =============================================================================
RAG_HOST=0.0.0.0
RAG_PORT=8000
RAG_UPLOAD_DIR=./uploads/
# =============================================================================
# Database Configuration
# =============================================================================
VECTOR_DB_TYPE=pgvector
POSTGRES_DB=inspectrag_db
POSTGRES_USER=inspectrag_user
POSTGRES_PASSWORD=password123
DB_HOST=vectordb
DB_PORT=5432
COLLECTION_NAME=inspectrag_collection
# =============================================================================
# Text Processing Configuration
# =============================================================================
CHUNK_SIZE=1500
CHUNK_SCORE_THRESHOLD=0.5
CHUNK_OVERLAP=100
PDF_EXTRACT_IMAGES=false
MAX_CHUNK_TOKENS=8000
MAX_CHUNK_CHARS=32000
MAX_ITERATIONS=5
MAX_FINAL_TOKENS=16000
HIERARCHICAL_BATCH_SIZE=10
MIN_SUMMARY_RATIO=0.1
MAX_HIERARCHY_LEVELS=3
# =============================================================================
# Logging Configuration
# =============================================================================
DEBUG_RAG_API=false
CONSOLE_JSON=false
# =============================================================================
# Redis Configuration
# =============================================================================
RAG_REDIS_URL=redis://inspectrag-redis:6379/0
# =============================================================================
# OpenAI Configuration
# =============================================================================
OPENAI_API_KEY=sk-your-openai-key
RAG_OPENAI_API_KEY=sk-another-openai-key
RAG_OPENAI_BASEURL=https://api.openai.com
RAG_OPENAI_PROXY=
OPENAI_REVERSE_PROXY=https://api.openai.com/v1
# =============================================================================
# Embeddings Configuration
# =============================================================================
EMBEDDINGS_PROVIDER=openai
EMBEDDINGS_MODEL=text-embedding-3-small
HF_TOKEN=
OLLAMA_BASE_URL=http://ollama:11434
# =============================================================================
# Microsoft Graph API Configuration
# =============================================================================
SHAREPOINT_ENABLED=true
GRAPH_API_CLIENT_ID=your-client-id
GRAPH_API_CLIENT_SECRET=your-client-secret
GRAPH_API_TENANT_ID=your-tenant-id
# =============================================================================
# SharePoint Configuration
# =============================================================================
SHAREPOINT_ON_PREM_ENABLED=true
SHAREPOINT_HOSTNAME=yourcompany.sharepoint.com
SHAREPOINT_SITE_PATH=/sites/UsersSite
# =============================================================================
# Kerberos Authentication Configuration
# =============================================================================
KRB5_CONFIG_PATH=/etc/krb5.conf
KERBEROS_PRINCIPAL=[email protected]
KERBEROS_PASSWORD=your-kerberos-password
KERBEROS_TICKET_LIFETIME=100
# =============================================================================
# License Configuration
# =============================================================================
LICENSE_KEY=your-license-key
LICENSE_REQUEST_LIMIT=10000
MAX_LICENSE_VIOLATIONS=5
LICENSE_CACHE_EXPIRY_HOURS=1
PRIVATE_KEY=
PUBLIC_KEY=
# =============================================================================
# Jira Configuration
# =============================================================================
CLOUD_JIRA_ENABLED=true
JIRA_HOST=your-domain.atlassian.net
JIRA_EMAIL=[email protected]
JIRA_API_TOKEN=your-jira-api-token
JIRA_WEBHOOK_SECRET=your-jira-webhook-secret
ONPREM_JIRA_ENABLED=true
ONPREM_JIRA_HOST=http://localhost:8090
ONPREM_JIRA_PAT=your-onprem-jira-pat
ONPREM_WEBHOOK_SECRET=your-onprem-jira-webhook-secret
# =============================================================================
# Confluence Configuration
# =============================================================================
ONPREM_CONFLUENCE_ENABLED=true
ONPREM_CONFLUENCE_HOST=http://localhost:8070
ONPREM_CONFLUENCE_PAT=your-confluence-pat
ONPREM_CONFLUENCE_WEBHOOK_SECRET=your-confluence-webhook-secret
CONFLUENCE_DEFAULT_PERMISSIONS=
# =============================================================================
# Email Integration Configuration
# =============================================================================
ENABLE_GMAIL=false
ENABLE_PERSONAL_GMAIL=false
ENABLE_DOMAIN_EMAILS=false
ENABLE_OUTLOOK=false
DOMAIN_EMAIL_INTERVAL=300
GMAIL_USER_ID=[email protected]
PERSONAL_GMAIL_INTERVAL=300
GMAIL_CREDENTIALS_FILE=path/to/credentials.json
GMAIL_TOKEN_FILE=path/to/token.json
SERVICE_ACCOUNT_FILE=path/to/service_account.json
WORKSPACE_DOMAIN=your-domain.com
WORKSPACE_ADMIN=[email protected]
DOMAIN_EMAIL_QUERY=is:inbox newer_than:1d
SECURE_SERVICE_ACCOUNTS_DIR=/etc/app/service-accounts
OUTLOOK_INTERVAL=600
OUTLOOK_USER_ID=your-outlook-user-id
OUTLOOK_EMAIL=[email protected]
EMAIL_DEFAULT_PERMISSIONS=
# =============================================================================
# AWS S3 Configuration
# =============================================================================
S3_ENABLED=false
AWS_ACCESS_KEY_ID=your-aws-access-key-id
AWS_SECRET_ACCESS_KEY=your-aws-secret-access-key
AWS_REGION=your-aws-region
S3_BUCKET_NAME=your-s3-bucket-name
S3_DEFAULT_PERMISSIONS=
S3_DEFAULT_CREATOR=s3_webhook
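Several values in the sample above are placeholders (for example your-client-id or password123) that should not reach a live deployment. As a rough pre-deployment check, the sketch below lists lines whose values still look like those samples. The pattern list is an assumption derived only from the sample values in this guide, and find_placeholders is a hypothetical helper name.

```shell
# Sketch only: find_placeholders is a hypothetical helper, and the patterns are
# taken from the sample values shown in this guide.

# find_placeholders FILE: prints KEY=VALUE lines whose value still looks like a
# sample placeholder; returns non-zero when nothing matches.
find_placeholders() {
  grep -E '^[A-Z0-9_]+=(your-|yourcompany\.|sk-your-|sk-another-|password123$|path/to/)' "$1"
}
```

Placeholders that belong to integrations you have disabled (for example S3 settings when S3_ENABLED=false) can be ignored.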
Variable Descriptions
Server Configuration
- RAG_HOST: Server bind address for the RAG API (default: 0.0.0.0)
- RAG_PORT: Port number for the RAG API server (default: 8000)
- RAG_UPLOAD_DIR: Directory path where uploaded files are stored (default: ./uploads/)
Database Configuration
- VECTOR_DB_TYPE: Type of vector database to use (currently only pgvector is supported)
- POSTGRES_DB: PostgreSQL database name
- POSTGRES_USER: PostgreSQL database username
- POSTGRES_PASSWORD: PostgreSQL database password
- DB_HOST: PostgreSQL database host address
- DB_PORT: PostgreSQL database port number
- COLLECTION_NAME: Name of the vector collection in the database
Text Processing Configuration
- CHUNK_SIZE: Maximum character length for each document chunk processed by the embedding model (default: 1500)
  - Implications:
    - Larger values: Fewer chunks with more context per chunk
    - Smaller values: More chunks with less context per chunk
  - Recommendation: Adjust based on document sizes and desired granularity
- CHUNK_SCORE_THRESHOLD: Score threshold, based on cosine similarity, for retrieving relevant chunks (default: 0.5)
  - Cosine similarity: Measures similarity between vectors and ranges from -1 to 1
  - Interpretation:
    - Lower values: Stricter matching; only highly similar chunks are retrieved
    - Higher values: Looser matching; more chunks are retrieved, including less similar ones
  - Purpose: Prevents unrelated chunks of personal data from being sent to your LLM
- CHUNK_OVERLAP: Number of overlapping characters between adjacent chunks (default: 100)
- PDF_EXTRACT_IMAGES: Whether to extract images from PDF files (default: false)
- MAX_CHUNK_TOKENS: Maximum number of tokens allowed per text chunk (default: 8000)
- MAX_CHUNK_CHARS: Maximum number of characters allowed per text chunk (default: 32000)
- MAX_ITERATIONS: Maximum number of processing iterations (default: 5)
- MAX_FINAL_TOKENS: Maximum number of tokens in final output (default: 16000)
- HIERARCHICAL_BATCH_SIZE: Batch size for hierarchical processing (default: 10)
- MIN_SUMMARY_RATIO: Minimum ratio for summary generation (default: 0.1)
- MAX_HIERARCHY_LEVELS: Maximum number of hierarchy levels for processing (default: 3)
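To see how CHUNK_SIZE and CHUNK_OVERLAP interact, the number of chunks for a document can be estimated with integer arithmetic. This sketch assumes a plain sliding-window splitter that advances by CHUNK_SIZE minus CHUNK_OVERLAP characters; InspectRAG's actual splitter may break on separators and produce different counts, and estimate_chunks is a hypothetical helper.

```shell
# Sketch only: assumes a simple sliding-window splitter, which may differ from
# the splitter InspectRAG actually uses.

# estimate_chunks LENGTH SIZE OVERLAP: estimated chunk count for LENGTH characters.
estimate_chunks() {
  length=$1
  size=$2
  overlap=$3
  if [ "$overlap" -ge "$size" ]; then
    echo "overlap must be smaller than chunk size" >&2
    return 1
  fi
  # A text that fits in one window needs exactly one chunk.
  if [ "$length" -le "$size" ]; then
    echo 1
    return 0
  fi
  stride=$((size - overlap))
  # Ceiling division of the text beyond the first overlap by the stride.
  echo $(( (length - overlap + stride - 1) / stride ))
}
```

Under these assumptions, a 10,000-character document with the defaults (1500 and 100) gives 8 chunks, while raising CHUNK_SIZE to 3000 gives 4.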
Logging Configuration
- DEBUG_RAG_API: Enable debug mode for detailed logging (default: false)
- CONSOLE_JSON: Output logs in JSON format (default: false)
Redis Configuration
- RAG_REDIS_URL: Redis connection URL for caching and task queue (default: redis://localhost:6379/0)
OpenAI Configuration
- OPENAI_API_KEY: OpenAI API key for embeddings and completions
- RAG_OPENAI_API_KEY: Separate OpenAI API key for RAG operations (defaults to OPENAI_API_KEY if not set)
- RAG_OPENAI_BASEURL: Custom OpenAI base URL (default: https://api.openai.com)
- RAG_OPENAI_PROXY: OpenAI proxy URL (optional)
- OPENAI_REVERSE_PROXY: Reverse proxy URL for OpenAI API (default: https://api.openai.com/v1)
Embeddings Configuration
- EMBEDDINGS_PROVIDER: Embeddings provider (options: openai, azure, huggingface, huggingfacetei, ollama; default: openai)
- EMBEDDINGS_MODEL: Embeddings model name (defaults vary by provider)
  - OpenAI: text-embedding-3-small
  - Azure: text-embedding-3-small
  - HuggingFace: sentence-transformers/all-MiniLM-L6-v2
  - Ollama: nomic-embed-text
- HF_TOKEN: HuggingFace token (required if using HuggingFace models)
- OLLAMA_BASE_URL: Ollama base URL (default: http://ollama:11434)
SharePoint Configuration
- SHAREPOINT_ENABLED: Enable SharePoint Online integration (default: false)
- SHAREPOINT_ON_PREM_ENABLED: Enable SharePoint On-Premises integration (default: false)
- SHAREPOINT_HOSTNAME: SharePoint hostname (e.g., company.sharepoint.com)
- SHAREPOINT_SITE_PATH: SharePoint site path (e.g., /sites/yoursite)
- GRAPH_API_CLIENT_ID: Microsoft Graph API client ID for SharePoint Online
- GRAPH_API_CLIENT_SECRET: Microsoft Graph API client secret
- GRAPH_API_TENANT_ID: Microsoft Graph API tenant ID
Kerberos Authentication Configuration
- KRB5_CONFIG_PATH: Path to Kerberos configuration file (krb5.conf)
- KERBEROS_PRINCIPAL: Kerberos principal for authentication (e.g., [email protected])
- KERBEROS_PASSWORD: Password for Kerberos authentication
- KERBEROS_TICKET_LIFETIME: Kerberos ticket lifetime in days (default: 100)
License Configuration
- LICENSE_KEY: License key for premium features (required)
- LICENSE_REQUEST_LIMIT: Maximum number of requests allowed (default: 10000)
- MAX_LICENSE_VIOLATIONS: Maximum number of license violations allowed (default: 5)
- LICENSE_CACHE_EXPIRY_HOURS: License cache expiry time in hours (default: 1)
- PRIVATE_KEY: Private key for license generation (optional)
- PUBLIC_KEY: Public key for license validation (optional)
Jira Configuration
- CLOUD_JIRA_ENABLED: Enable Jira Cloud integration (default: false)
- JIRA_HOST: Jira Cloud hostname (e.g., your-domain.atlassian.net)
- JIRA_EMAIL: Jira user email address
- JIRA_API_TOKEN: Jira API token for authentication
- JIRA_WEBHOOK_SECRET: Jira webhook secret for secure communication
- ONPREM_JIRA_ENABLED: Enable On-premises Jira integration (default: false)
- ONPREM_JIRA_HOST: On-premises Jira URL
- ONPREM_JIRA_PAT: On-premises Jira Personal Access Token
- ONPREM_WEBHOOK_SECRET: On-premises Jira webhook secret
Confluence Configuration
- ONPREM_CONFLUENCE_ENABLED: Enable On-premises Confluence integration (default: false)
- ONPREM_CONFLUENCE_HOST: On-premises Confluence URL
- ONPREM_CONFLUENCE_PAT: On-premises Confluence Personal Access Token
- ONPREM_CONFLUENCE_WEBHOOK_SECRET: Confluence webhook secret
- CONFLUENCE_DEFAULT_PERMISSIONS: Default permissions for Confluence content (comma-separated)
Email Integration Configuration
- ENABLE_GMAIL: Enable Gmail integration (default: false)
- ENABLE_PERSONAL_GMAIL: Enable personal Gmail integration (default: false)
- ENABLE_DOMAIN_EMAILS: Enable domain-wide Gmail integration (default: false)
- ENABLE_OUTLOOK: Enable Outlook integration (default: false)
- DOMAIN_EMAIL_INTERVAL: Domain email sync interval in seconds (default: 300)
- GMAIL_USER_ID: Gmail user ID/email address
- PERSONAL_GMAIL_INTERVAL: Personal Gmail sync interval in seconds (default: 300)
- GMAIL_CREDENTIALS_FILE: Path to Gmail credentials JSON file
- GMAIL_TOKEN_FILE: Path to Gmail token JSON file
- SERVICE_ACCOUNT_FILE: Path to service account JSON file for domain-wide access
- WORKSPACE_DOMAIN: Workspace domain for domain-wide Gmail access
- WORKSPACE_ADMIN: Workspace admin email address
- DOMAIN_EMAIL_QUERY: Gmail search query for domain emails (default: is:inbox newer_than:1d)
- SECURE_SERVICE_ACCOUNTS_DIR: Directory for secure service account files (default: /etc/app/service-accounts)
- OUTLOOK_INTERVAL: Outlook sync interval in seconds (default: 600)
- OUTLOOK_USER_ID: Outlook user ID
- OUTLOOK_EMAIL: Outlook email address
- EMAIL_DEFAULT_PERMISSIONS: Default permissions for email content (comma-separated)
AWS S3 Configuration
- S3_ENABLED: Enable AWS S3 integration (default: false)
- AWS_ACCESS_KEY_ID: AWS access key ID for S3 access
- AWS_SECRET_ACCESS_KEY: AWS secret access key for S3 access
- AWS_REGION: AWS region for S3 bucket
- S3_BUCKET_NAME: S3 bucket name for file storage
- S3_DEFAULT_PERMISSIONS: Default permissions for S3 objects (comma-separated)
- S3_DEFAULT_CREATOR: Default creator for S3 objects (default: s3_webhook)
Chunk Score Threshold Table
Threshold Value | Strictness Level | Description |
---|---|---|
0.1 | Very High | Only the most similar chunks are retrieved. |
0.3 | High | Retrieves chunks with strong similarity to the query. |
0.5 | Moderate | Balances precision and recall. |
0.7 | Low | Retrieves more chunks, including moderately similar ones. |
0.9 | Very Low | Retrieves most chunks, even if similarity is low. |
Note: Adjust CHUNK_SCORE_THRESHOLD based on your application's need for precision (fewer, more relevant chunks) versus recall (more chunks, potentially less relevant).
5. System Monitoring and Logs
Enable Audit Logs
- Capture Activities:
  - Record all access attempts, queries, and modifications.
- Secure Log Storage:
  - Ensure logs are stored securely and accessible only to authorized personnel.
Monitor Alerts
- Set Up Alerts for Suspicious Activities:
  - Configure alerts for failed login attempts and unauthorized access.
- Proactive Threat Response:
  - Use alerts to respond quickly to potential security threats.
Regular Log Review
- Periodic Analysis:
  - Regularly review logs to identify patterns or anomalies.
- Compliance and Reporting:
  - Maintain logs for compliance purposes and generate reports as required.
6. Troubleshooting Common Issues
Issue 1: Users Not Syncing from AD to OpenID
Solution:
- Verify LDAP Configuration:
  - Ensure LDAP settings in the OpenID provider are correct.
- Check Network Connectivity:
  - Confirm network connectivity between the OpenID provider and the AD server.
- Validate Bind Credentials:
  - Ensure the bind DN and password are correct and have necessary permissions.
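The connectivity and bind-credential checks above can be exercised directly with the OpenLDAP client tools, independently of the OpenID provider. This is a minimal sketch, assuming ldapsearch is installed on the machine that hosts the OpenID provider; the helper names and the example host are illustrative.

```shell
# Sketch only: helper names are illustrative; requires the OpenLDAP ldapsearch client.

# ldap_uri HOST [PORT]: builds an LDAP URI, defaulting to the standard port 389.
ldap_uri() {
  echo "ldap://$1:${2:-389}"
}

# ldap_bind_check HOST BIND_DN BASE_DN: performs a simple bind and reads the base entry.
# The -W flag prompts for the bind password so it is not stored in shell history.
ldap_bind_check() {
  ldapsearch -x -H "$(ldap_uri "$1")" -D "$2" -W -b "$3" -s base "(objectClass=*)" dn
}
```

A "Can't contact LDAP server" error points to network connectivity, while "Invalid credentials" points to the bind DN or password.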
Issue 2: Incorrect User Access Permissions
Solution:
- Review SharePoint Permissions:
  - Check that SharePoint permissions align with roles from AD.
- Manual Sync:
  - Perform a manual sync of users and roles between AD and the OpenID provider.
Issue 3: Webhooks Not Working in SharePoint Online
Solution:
- Verify Microsoft Graph API Permissions:
  - Ensure the app has the necessary permissions.
- Check Notification URL Accessibility:
  - Confirm that the webhook notification URL is correctly configured and accessible.
7. Best Practices for Administrators
- Periodic Sync of AD Roles:
  - Schedule regular synchronization to keep roles and permissions up to date.
- Audit User Activity:
  - Regularly monitor user activities to detect unauthorized access.
- Test Permissions Regularly:
  - Conduct periodic tests to ensure permissions are correctly enforced.
- Monitor Webhooks and Event Receivers:
  - Ensure that file changes are captured and processed in real time.
- Backup Configurations and Data:
  - Maintain regular backups to prevent data loss.
- Stay Updated:
  - Keep all components, including Docker images and dependencies, updated to the latest stable versions.
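For the backup practice listed above, the vector database can be dumped with the standard PostgreSQL tooling inside the running container. The sketch below assumes the database service in docker-compose.yml is named vectordb (the DB_HOST value in the sample .env) and that POSTGRES_USER and POSTGRES_DB are set in the shell; both function names are hypothetical.

```shell
# Sketch only: assumes the Compose service is named "vectordb"; adjust to your file.

# backup_filename DB: timestamped dump name, for example inspectrag_db-20250101-120000.sql.
backup_filename() {
  echo "$1-$(date +%Y%m%d-%H%M%S).sql"
}

# backup_vectordb: writes a plain SQL dump of the database to the current directory.
# -T disables pseudo-TTY allocation so the dump can be redirected to a file.
backup_vectordb() {
  out=$(backup_filename "${POSTGRES_DB}")
  docker compose exec -T vectordb pg_dump -U "${POSTGRES_USER}" "${POSTGRES_DB}" > "$out" \
    && echo "wrote $out"
}
```

Keep a copy of the .env file and docker-compose.yml alongside each dump so that the configuration can be restored together with the data.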
8. Making a Query Request to InspectRAG
This section outlines how to:
1. Obtain an access token from your OpenID provider.
2. Use the access token in a Chat Completions API request to query InspectRAG.
Step 1: Obtaining the Access Token
To authenticate requests, you need a JWT access token from an OpenID provider. The token must contain specific fields that InspectRAG uses for authorization.
JWT Token Requirements
The token must be a JSON Web Token (JWT) with fields similar to the following example:
{
  "exp": 1731259764,
  "iat": 1731259464,
  "jti": "90bd5c93-0f4d-429b-8536-df475b23a77e",
  "iss": "https://your-openid-provider.com/realms/YourRealm",
  "sub": "926e6a63-8c7e-40a9-96cb-b764362e932e",
  "typ": "Bearer",
  "azp": "your-client-id",
  "allowed-origins": [
    "https://your-app-domain.com"
  ],
  "realm_access": {
    "roles": [
      "default-roles-inspectchat",
      "Members",
      "offline_access",
      "uma_authorization"
    ]
  },
  "scope": "profile email",
  "name": "janedoe",
  "preferred_username": "janedoe",
  "email": "[email protected]"
}
Key Fields in the Token:
- exp and iat: Expiration and issued-at timestamps.
- iss: The issuer URL (your OpenID provider).
- sub: User's unique ID.
- azp: Client ID used to request the token.
- realm_access.roles: Array of roles assigned to the user, such as "Members" or any other roles the user must have to query a given file.
- preferred_username: Username for identifying the user.
- email: User's email address.
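When a request is rejected, it is often quickest to decode the token locally and inspect the fields listed above. The following sketch decodes the payload segment of a JWT with standard tools; it does not verify the signature, the role check is a crude string match, and the function names are illustrative. It assumes GNU base64 (base64 -d).

```shell
# Sketch only: decodes without verifying the signature; function names are illustrative.

# jwt_payload TOKEN: prints the decoded JSON payload (second segment) of a JWT.
jwt_payload() {
  segment=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the base64 padding that the URL-safe encoding strips.
  while [ $(( ${#segment} % 4 )) -ne 0 ]; do
    segment="${segment}="
  done
  printf '%s' "$segment" | base64 -d
}

# jwt_is_expired TOKEN: succeeds when the exp claim is present and in the past.
jwt_is_expired() {
  exp=$(jwt_payload "$1" | tr -d ' \n' | sed -n 's/.*"exp":\([0-9][0-9]*\).*/\1/p')
  [ -n "$exp" ] && [ "$exp" -le "$(date +%s)" ]
}

# jwt_has_role TOKEN ROLE: succeeds when ROLE appears as a quoted string in the payload.
# This matches anywhere in the payload, so treat it as a quick check only.
jwt_has_role() {
  jwt_payload "$1" | grep -q "\"$2\""
}
```

For example, jwt_has_role "$ACCESS_TOKEN" Members confirms that the "Members" role from the sample token is present before you send a query.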
Getting the Token (Example)
Here's an example curl command to obtain the token from an OpenID provider:
curl -X POST "https://your-openid-provider.com/realms/YourRealm/protocol/openid-connect/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "client_id=your-client-id" \
-d "client_secret=your-client-secret" \
-d "grant_type=password" \
-d "username=janedoe" \
-d "password=your-password"
This command will return a JSON response containing the access token under "access_token".
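To reuse the token in later requests, the "access_token" value can be captured into a shell variable. The sketch below wraps the request shown above and extracts the field with sed so that it has no extra dependencies; a JSON parser such as jq is more robust if it is available. The OIDC_* variable names are placeholders chosen for this example.

```shell
# Sketch only: OIDC_* variables are placeholder names chosen for this example.

# extract_access_token: reads a token-endpoint JSON response on stdin and
# prints the value of its access_token field.
extract_access_token() {
  tr -d '\n' | sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# fetch_access_token: sends the password-grant request and prints only the token.
# Expects OIDC_TOKEN_URL, OIDC_CLIENT_ID, OIDC_CLIENT_SECRET, OIDC_USERNAME
# and OIDC_PASSWORD to be set in the environment.
fetch_access_token() {
  curl -s -X POST "$OIDC_TOKEN_URL" \
    -H "Content-Type: application/x-www-form-urlencoded" \
    --data-urlencode "client_id=${OIDC_CLIENT_ID}" \
    --data-urlencode "client_secret=${OIDC_CLIENT_SECRET}" \
    --data-urlencode "grant_type=password" \
    --data-urlencode "username=${OIDC_USERNAME}" \
    --data-urlencode "password=${OIDC_PASSWORD}" \
    | extract_access_token
}
```

A typical use is ACCESS_TOKEN=$(fetch_access_token), after which the variable can be substituted for YOUR_ACCESS_TOKEN in both the Authorization and accesstoken headers.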
Step 2: Using the Access Token in Chat Completions
After obtaining the access token from your OpenID provider, use it in the Chat Completions API request as shown below.
Example Chat Completions Request
curl -X POST "http://RAG_API_URL/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -H "x-user-name: janedoe" \
  -H "accesstoken: YOUR_ACCESS_TOKEN" \
  -d '{
    "model": "gpt-4",
    "temperature": 1,
    "top_p": 1,
    "presence_penalty": 0,
    "frequency_penalty": 0,
    "user": "user-unique-id",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": "Your question here"
      }
    ]
  }'
Explanation of Headers and Body Fields
- Headers:
  - Authorization: Bearer YOUR_ACCESS_TOKEN: Used to authenticate the request, where YOUR_ACCESS_TOKEN is the JWT obtained from your OpenID provider.
  - Content-Type: application/json: Specifies that the request body is in JSON format.
  - x-user-name: janedoe: The username or preferred_username from the token, typically used for tracking or logging purposes.
  - accesstoken: YOUR_ACCESS_TOKEN: Provides the same access token for additional validation by the API.
- Body Fields:
  - "model": "gpt-4": Specifies the model to use for response generation.
  - "temperature", "top_p", "presence_penalty", "frequency_penalty": Control settings for response creativity, randomness, and variety.
  - "user": "user-unique-id": Replace with a unique identifier for the user (e.g., the sub field from the JWT).
  - "messages":
    - "role": "user": Sets the role for the message, indicating it's a user-initiated query.
    - "content": "Your question here": Replace with the question or query content you want to submit to the Chat Completions API.