# Production Readiness

Launch checklist and operational best practices for Kadindexer.

## Pre-Launch Checklist

### Infrastructure

- [ ] Rate limiting configured with exponential backoff
- [ ] Caching layer deployed (permanent, balance, recent)
- [ ] HTTP connection pooling enabled
- [ ] API endpoint configured: `https://graph.kadindexer.io`
- [ ] Environment variables secured (no keys in code)

### Query Optimization

- [ ] All queries use pagination (`first: 50`)
- [ ] Server-side filters applied (chainId, accountName, minHeight)
- [ ] Query complexity under tier limits
- [ ] Query variables used for all user input
- [ ] Only necessary fields selected

### Error Handling

- [ ] 429 rate limit retry logic implemented
- [ ] Network error handling with timeouts
- [ ] GraphQL error parsing configured
- [ ] User-friendly error messages

### Monitoring

- [ ] Request metrics tracked (rate, latency, errors)
- [ ] Rate limit quota monitoring
- [ ] Cache hit rate tracking
- [ ] Performance dashboard live
- [ ] Alerts configured for critical thresholds

### Security

- [ ] No API keys in version control
- [ ] HTTPS enforced
- [ ] Input validation on user queries
- [ ] Query complexity limits respected

## Error Handling

Handle errors gracefully to maintain user experience during failures.

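The retry decisions described in this section can be kept in one place. The sketch below is a minimal illustration, not part of any Kadindexer SDK: `classifyFailure` and `retryDelayMs` are hypothetical names, and the delay values simply mirror the ones used in the implementation pattern in this guide (1 second base, doubling for 429 up to a 60 second cap, linear growth for other retryable failures).

```javascript
// Hypothetical helpers for illustration; names and return shapes are assumptions.

// Map an HTTP status (or null for a network-level failure such as a timeout)
// to a retry decision.
function classifyFailure(status) {
  if (status === null) return { retry: true, reason: 'network' };
  if (status === 429) return { retry: true, reason: 'rate_limited' };
  if (status >= 500) return { retry: true, reason: 'server_error' };
  if (status >= 400) return { retry: false, reason: 'client_error' };
  return { retry: false, reason: 'ok' };
}

// Delay in milliseconds before the next attempt (attempt is zero-based).
function retryDelayMs(reason, attempt) {
  if (reason === 'rate_limited') {
    // Exponential backoff, capped at 60 seconds
    return Math.min(1000 * Math.pow(2, attempt), 60000);
  }
  // Linear backoff for server and network errors
  return 1000 * (attempt + 1);
}
```

With this split, `classifyFailure(503)` is marked retryable while `classifyFailure(400)` is not, so a malformed query fails fast instead of consuming retries.
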
### Common Error Types

| Status | Type | Meaning | Action |
| --- | --- | --- | --- |
| **400** | Bad Request | Invalid query syntax | Log error, fix query |
| **429** | Rate Limited | Burst limit exceeded | Exponential backoff + retry |
| **500** | Server Error | Kadindexer issue | Retry up to 3x |
| **503** | Unavailable | Temporary outage | Retry with backoff |
| **Network** | Timeout/DNS | Connection failure | Retry with timeout |

### Implementation Pattern

```javascript
async function robustQuery(query, variables, options = {}) {
  const { maxRetries = 3, timeout = 30000 } = options;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const controller = new AbortController();
      const timeoutId = setTimeout(() => controller.abort(), timeout);

      const response = await fetch('https://graph.kadindexer.io', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ query, variables }),
        signal: controller.signal
      });

      clearTimeout(timeoutId);

      if (!response.ok) {
        const status = response.status;

        // Rate limited - exponential backoff
        if (status === 429) {
          const delay = Math.min(1000 * Math.pow(2, attempt), 60000);
          console.warn(`Rate limited, retrying in ${delay}ms`);
          await new Promise(resolve => setTimeout(resolve, delay));
          continue;
        }

        // Server error - retry
        if (status >= 500) {
          const delay = 1000 * (attempt + 1);
          console.warn(`Server error ${status}, retrying in ${delay}ms`);
          await new Promise(resolve => setTimeout(resolve, delay));
          continue;
        }

        // Client error - don't retry
        if (status >= 400 && status < 500) {
          const error = await response.json();
          throw new Error(`Query failed: ${error.message || status}`);
        }
      }

      return await response.json();
    } catch (error) {
      // Network/timeout error
      if (error.name === 'AbortError' || error.code === 'ECONNREFUSED') {
        if (attempt < maxRetries - 1) {
          const delay = 1000 * (attempt + 1);
          console.warn(`Network error, retrying in ${delay}ms`);
          await new Promise(resolve => setTimeout(resolve, delay));
          continue;
        }
      }

      // Final attempt failed
      if (attempt === maxRetries - 1) {
        throw new Error(`Query failed after ${maxRetries} attempts: ${error.message}`);
      }

      throw error;
    }
  }

  // All attempts ended in 429/5xx responses: fail loudly instead of returning undefined
  throw new Error(`Query failed after ${maxRetries} attempts`);
}
```

### GraphQL Error Handling

GraphQL can return partial data with errors:

```javascript
const result = await client.request(query, variables);

// Check for GraphQL errors
if (result.errors) {
  result.errors.forEach(error => {
    console.error('GraphQL Error:', {
      message: error.message,
      path: error.path,
      extensions: error.extensions
    });
  });

  // Decide whether to use partial data or throw
  if (!result.data) {
    throw new Error('Query returned no data');
  }
}

// Use data if available
return result.data;
```

### User-Facing Error Messages

Translate technical errors into user-friendly messages:

```javascript
function getUserMessage(error) {
  if (error.response?.status === 429) {
    return 'Too many requests. Please wait a moment and try again.';
  }

  if (error.response?.status >= 500) {
    return 'Service temporarily unavailable. Please try again shortly.';
  }

  if (error.message?.includes('complexity')) {
    return 'Query too complex. Please reduce the amount of data requested.';
  }

  if (error.name === 'AbortError') {
    return 'Request timed out. Please check your connection.';
  }

  return 'An error occurred. Please try again.';
}
```

## High Availability Strategies

### Graceful Degradation

Provide cached or limited data when primary queries fail:

```javascript
async function getAccountBalance(accountName, chainId) {
  // Key the cache per account and chain (balanceCache is any Map-like cache defined elsewhere)
  const cacheKey = `${accountName}:${chainId}`;

  try {
    // Try live query
    const result = await client.request(balanceQuery, { accountName, chainId });
    balanceCache.set(cacheKey, result);
    return result;
  } catch (error) {
    // Fall back to cached data
    const cached = balanceCache.get(cacheKey);
    if (cached) {
      console.warn('Using cached balance due to error:', error.message);
      return { ...cached, stale: true };
    }
    throw error;
  }
}
```

### Circuit Breaker Pattern

Prevent cascading failures by temporarily stopping requests:

```javascript
class CircuitBreaker {
  constructor(threshold = 5, timeout = 60000) {
    this.failureCount = 0;
    this.threshold = threshold;
    this.timeout = timeout;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
    this.nextAttempt = Date.now();
  }

  async execute(fn) {
    if (this.state === 'OPEN') {
      if (Date.now() < this.nextAttempt) {
        throw new Error('Circuit breaker is OPEN');
      }
      // Cool-down elapsed: let one trial request through
      this.state = 'HALF_OPEN';
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  onSuccess() {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  onFailure() {
    this.failureCount++;
    if (this.failureCount >= this.threshold) {
      this.state = 'OPEN';
      this.nextAttempt = Date.now() + this.timeout;
      console.error('Circuit breaker opened');
    }
  }
}

// Usage
const breaker = new CircuitBreaker();

async function queryWithBreaker(query, variables) {
  return breaker.execute(() => client.request(query, variables));
}
```

## Deployment Strategy

### Environment Configuration

Use different configurations per environment:

```javascript
const config = {
  development: {
    endpoint: 'https://graph.kadindexer.io',
    timeout: 10000,
    retries: 1,
    logLevel: 'debug'
  },
  staging: {
    endpoint: 'https://graph.kadindexer.io',
    timeout: 30000,
    retries: 3,
    logLevel: 'info'
  },
  production: {
    endpoint: 'https://graph.kadindexer.io',
    timeout: 30000,
    retries: 3,
    logLevel: 'warn'
  }
};

const env = process.env.NODE_ENV || 'development';
export default config[env];
```

### Health Checks

Implement health checks to verify Kadindexer connectivity:

```javascript
async function healthCheck() {
  try {
    // The third argument assumes your client accepts per-request options such as a timeout
    const result = await client.request(
      `query { graphConfiguration { version } }`,
      {},
      { timeout: 5000 }
    );

    return {
      status: 'healthy',
      version: result.graphConfiguration.version,
      timestamp: new Date().toISOString()
    };
  } catch (error) {
    return {
      status: 'unhealthy',
      error: error.message,
      timestamp: new Date().toISOString()
    };
  }
}

// Run health check every 60 seconds
setInterval(async () => {
  const health = await healthCheck();
  if (health.status === 'unhealthy') {
    console.error('Kadindexer health check failed:', health.error);
  }
}, 60000);
```

### Gradual Rollout

Deploy to a subset of users first:

```javascript
function shouldUseKadindexer(userId) {
  // Gradual rollout based on user ID
  const rolloutPercentage = 10; // Start with 10%

  // Simple deterministic string hash so a given user always lands in the same bucket
  const hash = userId.split('').reduce((a, b) => {
    a = ((a << 5) - a) + b.charCodeAt(0);
    return a & a; // Coerce to a 32-bit integer
  }, 0);

  return Math.abs(hash % 100) < rolloutPercentage;
}
```

## Monitoring & Observability

### Key Metrics Dashboard

Track these metrics in your monitoring system:

```javascript
const metrics = {
  // Request metrics
  totalRequests: 0,
  successfulRequests: 0,
  failedRequests: 0,

  // Performance metrics
  avgResponseTime: 0,
  p95ResponseTime: 0,
  p99ResponseTime: 0,

  // Resource metrics
  cacheHitRate: 0,
  rateLimitUsage: 0,

  // Error metrics
  errorsByType: {},

  // Business metrics
  activeUsers: 0,
  queriesPerUser: 0
};

function updateMetrics(result, duration, error) {
  metrics.totalRequests++;

  if (error) {
    metrics.failedRequests++;
    metrics.errorsByType[error.type] = (metrics.errorsByType[error.type] || 0) + 1;
  } else {
    metrics.successfulRequests++;
  }

  // Update response time (exponential moving average)
  metrics.avgResponseTime = (metrics.avgResponseTime * 0.95) + (duration * 0.05);
}
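
// --- Sketch (an assumption, not part of the original example) ---
// updateMetrics does not touch p95ResponseTime or p99ResponseTime. One simple
// way to fill them is to keep a bounded window of recent durations and read
// percentiles from a sorted copy. recentDurations, MAX_SAMPLES, percentile and
// recordDuration are hypothetical names introduced here for illustration.
const recentDurations = [];
const MAX_SAMPLES = 1000;

function percentile(samples, p) {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  // Nearest-rank method: smallest value with at least a fraction p of samples at or below it
  const rank = Math.ceil(p * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

function recordDuration(duration) {
  recentDurations.push(duration);
  if (recentDurations.length > MAX_SAMPLES) recentDurations.shift();
  metrics.p95ResponseTime = percentile(recentDurations, 0.95);
  metrics.p99ResponseTime = percentile(recentDurations, 0.99);
}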
```

### Logging Best Practices

Log relevant information without exposing sensitive data:

```javascript
function logQuery(query, variables, result, duration) {
  const logData = {
    timestamp: new Date().toISOString(),
    // Assumes query is a parsed GraphQL document; falls back to 'anonymous' otherwise
    queryName: query.definitions?.[0]?.name?.value || 'anonymous',
    duration,
    success: !result.errors,
    // Sanitize variables - remove sensitive data
    variables: sanitizeVariables(variables)
  };

  if (result.errors) {
    logData.errors = result.errors.map(e => ({
      message: e.message,
      path: e.path
    }));
  }

  if (duration > 1000) {
    console.warn('Slow query:', logData);
  } else {
    console.info('Query:', logData);
  }
}

function sanitizeVariables(variables) {
  // Remove or redact sensitive fields
  const sanitized = { ...variables };
  if (sanitized.privateKey) delete sanitized.privateKey;
  if (sanitized.signature) sanitized.signature = '***';
  return sanitized;
}
```

## Incident Response

### Detection & Alerting

Set up alerts for critical issues:

```javascript
function checkAlerts() {
  // Nothing to evaluate until traffic has been recorded (avoids 0/0 and false alerts)
  if (metrics.totalRequests === 0) return;

  // High error rate
  const errorRate = metrics.failedRequests / metrics.totalRequests;
  if (errorRate > 0.05) {
    sendAlert('HIGH_ERROR_RATE', `Error rate: ${(errorRate * 100).toFixed(2)}%`);
  }

  // Approaching rate limit
  if (metrics.rateLimitUsage > 0.9) {
    sendAlert('RATE_LIMIT_WARNING', `Using ${(metrics.rateLimitUsage * 100).toFixed(0)}% of rate limit`);
  }

  // Slow queries
  if (metrics.p95ResponseTime > 2000) {
    sendAlert('SLOW_QUERIES', `P95 response time: ${metrics.p95ResponseTime}ms`);
  }

  // Low cache hit rate
  if (metrics.cacheHitRate < 0.5) {
    sendAlert('LOW_CACHE_HIT_RATE', `Cache hit rate: ${(metrics.cacheHitRate * 100).toFixed(0)}%`);
  }
}

setInterval(checkAlerts, 60000); // Check every minute
```

### Response Procedures

**When errors spike:**

1. Check [Kadindexer status](https://kadindexer.io) (if a status page exists)
2. Review recent deployments (roll back if needed)
3. Examine error logs for patterns
4. Verify rate limits haven't been exceeded
5. Contact support if the problem is on the Kadindexer side: [toni@hackachain.io](mailto:toni@hackachain.io)

**When performance degrades:**

1. Check query complexity and pagination sizes
2. Verify caching is working correctly
3. Review recent query changes
4. Check network connectivity
5. Consider a tier upgrade if high load is sustained

## Support Channels

### By Tier

**Basic (Free):**

- Email: [toni@hackachain.io](mailto:toni@hackachain.io)
- Community support
- Response time: Best effort

**Developer:**

- Email: [toni@hackachain.io](mailto:toni@hackachain.io)
- Priority support
- Response time: 48 hours

**Team:**

- Dedicated support team
- Email: [toni@hackachain.io](mailto:toni@hackachain.io)
- Priority escalation
- Response time: 24 hours

### When to Contact Support

- Persistent 500/503 errors
- Unexpected rate limiting
- Query complexity issues
- Data inconsistencies
- Feature requests
- Tier upgrades

## Final Checklist

Before going live:

**Infrastructure:**

- [ ] Rate limiting with exponential backoff
- [ ] Caching with appropriate TTLs
- [ ] Connection pooling enabled
- [ ] Circuit breaker implemented
- [ ] Health checks running

**Monitoring:**

- [ ] Metrics collection active
- [ ] Logging configured
- [ ] Alerts set up
- [ ] Dashboard created
- [ ] On-call rotation defined

**Error Handling:**

- [ ] Retry logic for 429/500/503
- [ ] No retry for 400 errors
- [ ] User-friendly error messages
- [ ] Graceful degradation strategy

**Security:**

- [ ] API keys in environment variables
- [ ] HTTPS enforced
- [ ] Input validation implemented
- [ ] Sensitive data not logged

**Performance:**

- [ ] Queries optimized
- [ ] Query complexity under limits
- [ ] Appropriate tier selected
- [ ] Load tested

## Resources

- [Query Optimization →](/guides/advanced/query-optimization)
- [Performance & Scaling →](/guides/advanced/performance-scaling)
- [GraphQL API Reference →](https://docs.kadindexer.io/apis)

**Need help?** [toni@hackachain.io](mailto:toni@hackachain.io)