
Launch checklist and operational best practices for Kadindexer.


Pre-Launch Checklist

Infrastructure

  • Rate limiting configured with exponential backoff
  • Caching layer deployed (permanent, balance, recent)
  • HTTP connection pooling enabled
  • API endpoint configured: https://graph.kadindexer.io
  • Environment variables secured (no keys in code)
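
The three cache tiers named above can be sketched as one small TTL cache instantiated three times. This is a minimal in-memory sketch, not a Kadindexer component: the class name, the TTL values, and the guess at what each tier holds are all assumptions to adapt to your data.

```javascript
// Minimal in-memory TTL cache. Tier names follow the checklist; the TTL
// values are illustrative assumptions, not Kadindexer recommendations.
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;       // Infinity means entries never expire
    this.now = now;           // injectable clock, handy for tests
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop expired entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

const caches = {
  permanent: new TtlCache(Infinity), // assumed: data that never changes
  balance: new TtlCache(30000),      // assumed 30 s TTL
  recent: new TtlCache(5000)         // assumed 5 s TTL
};
```

Expired entries are only removed when read, so a long-running process with many distinct keys would also need a size cap or a periodic sweep.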

Query Optimization

  • All queries use pagination (first: 50)
  • Server-side filters applied (chainId, accountName, minHeight)
  • Query complexity under tier limits
  • Query variables used for all user input
  • Only necessary fields selected
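
A paginated, variable-driven query that follows these rules might look like the sketch below. The query text is hypothetical: the field name, argument types, and selected fields only mirror the filters listed above and should be checked against the Kadindexer schema. `fetchPage` is a caller-supplied function, not part of any library.

```javascript
// Hypothetical query: field and argument names mirror the checklist filters
// but are assumptions; confirm them against the Kadindexer schema.
const TRANSACTIONS_QUERY = `
  query Transactions($accountName: String!, $chainId: String, $minHeight: Int, $first: Int!, $after: String) {
    transactions(accountName: $accountName, chainId: $chainId, minHeight: $minHeight, first: $first, after: $after) {
      pageInfo { hasNextPage endCursor }
      edges { node { hash } }
    }
  }
`;

// Walks every page of results. `fetchPage(query, variables)` must resolve to
// the `data` object of a GraphQL response.
async function* paginate(fetchPage, filters, pageSize = 50) {
  let after = null;
  while (true) {
    const data = await fetchPage(TRANSACTIONS_QUERY, { ...filters, first: pageSize, after });
    const { edges, pageInfo } = data.transactions;
    for (const edge of edges) yield edge.node;
    if (!pageInfo.hasNextPage) return;
    after = pageInfo.endCursor; // cursor for the next page
  }
}
```

User input only ever travels in the variables object, never in the query string, and the caller can stop iterating early to avoid fetching pages it does not need.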

Error Handling

  • 429 rate limit retry logic implemented
  • Network error handling with timeouts
  • GraphQL error parsing configured
  • User-friendly error messages

Monitoring

  • Request metrics tracked (rate, latency, errors)
  • Rate limit quota monitoring
  • Cache hit rate tracking
  • Performance dashboard live
  • Alerts configured for critical thresholds

Security

  • No API keys in version control
  • HTTPS enforced
  • Input validation on user queries
  • Query complexity limits respected
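
Input validation can run before any request is sent. The sketch below is illustrative only: the limits it enforces (chain IDs 0 to 19, account names of 3 to 256 characters, page size up to 100) are assumptions, and `validateQueryInput` is a hypothetical helper name.

```javascript
// Assumed constraints for illustration only; adjust them to the network and
// tier limits you actually target.
const CHAIN_COUNT = 20;
const MAX_PAGE_SIZE = 100;

function validateQueryInput({ accountName, chainId, first = 50 }) {
  const errors = [];
  if (typeof accountName !== 'string' || accountName.length < 3 || accountName.length > 256) {
    errors.push('accountName must be a string of 3 to 256 characters');
  }
  // Accept "7" or 7, reject anything that is not a one- or two-digit number
  if (!/^\d{1,2}$/.test(String(chainId)) || Number(chainId) >= CHAIN_COUNT) {
    errors.push(`chainId must be between 0 and ${CHAIN_COUNT - 1}`);
  }
  if (!Number.isInteger(first) || first < 1 || first > MAX_PAGE_SIZE) {
    errors.push(`first must be an integer between 1 and ${MAX_PAGE_SIZE}`);
  }
  return { valid: errors.length === 0, errors };
}
```

Rejecting bad input locally keeps malformed requests from counting against the rate limit.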

Error Handling

Handle errors gracefully to maintain user experience during failures.

Common Error Types

Status   | Type          | Meaning               | Action
400      | Bad Request   | Invalid query syntax  | Log error, fix query
429      | Rate Limited  | Burst limit exceeded  | Exponential backoff + retry
500      | Server Error  | Kadindexer issue      | Retry up to 3x
503      | Unavailable   | Temporary outage      | Retry with backoff
Network  | Timeout/DNS   | Connection failure    | Retry with timeout

Implementation Pattern

async function robustQuery(query, variables, options = {}) {
  const { maxRetries = 3, timeout = 30000 } = options;
  
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const controller = new AbortController();
      const timeoutId = setTimeout(() => controller.abort(), timeout);
      
      const response = await fetch('https://graph.kadindexer.io', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ query, variables }),
        signal: controller.signal
      });
      
      clearTimeout(timeoutId);
      
      if (!response.ok) {
        const status = response.status;
        
        // Rate limited - exponential backoff
        if (status === 429) {
          const delay = Math.min(1000 * Math.pow(2, attempt), 60000);
          console.warn(`Rate limited, retrying in ${delay}ms`);
          await new Promise(resolve => setTimeout(resolve, delay));
          continue;
        }
        
        // Server error - retry
        if (status >= 500) {
          const delay = 1000 * (attempt + 1);
          console.warn(`Server error ${status}, retrying in ${delay}ms`);
          await new Promise(resolve => setTimeout(resolve, delay));
          continue;
        }
        
        // Client error - don't retry
        if (status >= 400 && status < 500) {
          const error = await response.json();
          throw new Error(`Query failed: ${error.message || status}`);
        }
      }
      
      return await response.json();
      
    } catch (error) {
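      // Network/timeout error: fetch rejects with a TypeError on network
      // failure and with an AbortError when the timeout fires
      if (error.name === 'AbortError' || error.name === 'TypeError' || error.code === 'ECONNREFUSED') {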
        if (attempt < maxRetries - 1) {
          const delay = 1000 * (attempt + 1);
          console.warn(`Network error, retrying in ${delay}ms`);
          await new Promise(resolve => setTimeout(resolve, delay));
          continue;
        }
      }
      
      // Final attempt failed
      if (attempt === maxRetries - 1) {
        throw new Error(`Query failed after ${maxRetries} attempts: ${error.message}`);
      }
      
      throw error;
    }
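  }

  // Every attempt ended in a retryable response (429 or 5xx): fail explicitly
  // instead of returning undefined
  throw new Error(`Query failed after ${maxRetries} attempts`);
}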
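
The retry delays in the pattern above are deterministic, so clients throttled at the same moment will also retry at the same moment. One common refinement is to add random jitter and to honor a Retry-After header when the server sends one. The helper below is a sketch; `backoffDelay` is a hypothetical name, and whether Kadindexer returns Retry-After is an assumption.

```javascript
// Computes how long to wait before retry number `attempt` (0-based).
// `retryAfter` is the raw Retry-After header value, or null when absent.
function backoffDelay(attempt, retryAfter = null, options = {}) {
  const { baseMs = 1000, maxMs = 60000, random = Math.random } = options;

  // Honor a Retry-After value given in seconds (HTTP-date values fall
  // through to the jittered backoff below because they parse to NaN)
  const seconds = Number(retryAfter);
  if (retryAfter !== null && retryAfter !== '' && Number.isFinite(seconds) && seconds >= 0) {
    return Math.min(seconds * 1000, maxMs);
  }

  // Full jitter: pick a random delay in [0, ceiling) so that clients
  // throttled together do not all retry at the same instant
  const ceiling = Math.min(baseMs * 2 ** attempt, maxMs);
  return Math.floor(random() * ceiling);
}
```

Inside the 429 branch, the delay could then be computed as `backoffDelay(attempt, response.headers.get('Retry-After'))`.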

GraphQL Error Handling

A GraphQL response can return HTTP 200 and still contain an errors array alongside partial data:

const result = await client.request(query, variables);

// Check for GraphQL errors
if (result.errors) {
  result.errors.forEach(error => {
    console.error('GraphQL Error:', {
      message: error.message,
      path: error.path,
      extensions: error.extensions
    });
  });
  
  // Decide whether to use partial data or throw
  if (!result.data) {
    throw new Error('Query returned no data');
  }
}

// Use data if available
return result.data;

User-Facing Error Messages

Translate technical errors into user-friendly messages:

function getUserMessage(error) {
  if (error.response?.status === 429) {
    return 'Too many requests. Please wait a moment and try again.';
  }
  
  if (error.response?.status >= 500) {
    return 'Service temporarily unavailable. Please try again shortly.';
  }
  
  if (error.message?.includes('complexity')) {
    return 'Query too complex. Please reduce the amount of data requested.';
  }
  
  if (error.name === 'AbortError') {
    return 'Request timed out. Please check your connection.';
  }
  
  return 'An error occurred. Please try again.';
}

High Availability Strategies

Graceful Degradation

Provide cached or limited data when primary queries fail:

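async function getAccountBalance(accountName, chainId) {
  // Key the cache per account and chain so fallbacks never mix balances
  const cacheKey = `${accountName}:${chainId}`;
  try {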
    // Try live query
    const result = await client.request(balanceQuery, { accountName, chainId });
    balanceCache.set(cacheKey, result);
    return result;
  } catch (error) {
    // Fall back to cached data
    const cached = balanceCache.get(cacheKey);
    if (cached) {
      console.warn('Using cached balance due to error:', error.message);
      return { ...cached, stale: true };
    }
    throw error;
  }
}

Circuit Breaker Pattern

Prevent cascading failures by temporarily stopping requests:

class CircuitBreaker {
  constructor(threshold = 5, timeout = 60000) {
    this.failureCount = 0;
    this.threshold = threshold;
    this.timeout = timeout;
    this.state = 'CLOSED'; // CLOSED, OPEN, HALF_OPEN
    this.nextAttempt = Date.now();
  }
  
  async execute(fn) {
    if (this.state === 'OPEN') {
      if (Date.now() < this.nextAttempt) {
        throw new Error('Circuit breaker is OPEN');
      }
      this.state = 'HALF_OPEN';
    }
    
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }
  
  onSuccess() {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }
  
  onFailure() {
    this.failureCount++;
    if (this.failureCount >= this.threshold) {
      this.state = 'OPEN';
      this.nextAttempt = Date.now() + this.timeout;
      console.error('Circuit breaker opened');
    }
  }
}

// Usage
const breaker = new CircuitBreaker();

async function queryWithBreaker(query, variables) {
  return breaker.execute(() => client.request(query, variables));
}

Deployment Strategy

Environment Configuration

Use different configurations per environment:

const config = {
  development: {
    endpoint: 'https://graph.kadindexer.io',
    timeout: 10000,
    retries: 1,
    logLevel: 'debug'
  },
  staging: {
    endpoint: 'https://graph.kadindexer.io',
    timeout: 30000,
    retries: 3,
    logLevel: 'info'
  },
  production: {
    endpoint: 'https://graph.kadindexer.io',
    timeout: 30000,
    retries: 3,
    logLevel: 'warn'
  }
};

const env = process.env.NODE_ENV || 'development';
export default config[env];
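
Since all three environments share one endpoint, it can help to let deployment settings override individual values without a code change. The sketch below assumes two hypothetical environment variable names, KADINDEXER_ENDPOINT and KADINDEXER_TIMEOUT_MS, and validates them before use.

```javascript
// KADINDEXER_ENDPOINT and KADINDEXER_TIMEOUT_MS are hypothetical variable
// names chosen for this sketch.
function applyEnvOverrides(base, env = process.env) {
  const resolved = { ...base };

  if (env.KADINDEXER_ENDPOINT) {
    const url = new URL(env.KADINDEXER_ENDPOINT); // throws on a malformed URL
    if (url.protocol !== 'https:') {
      throw new Error('KADINDEXER_ENDPOINT must use HTTPS');
    }
    resolved.endpoint = env.KADINDEXER_ENDPOINT;
  }

  if (env.KADINDEXER_TIMEOUT_MS) {
    const timeout = Number(env.KADINDEXER_TIMEOUT_MS);
    if (!Number.isInteger(timeout) || timeout <= 0) {
      throw new Error('KADINDEXER_TIMEOUT_MS must be a positive integer');
    }
    resolved.timeout = timeout;
  }

  return resolved;
}
```

Failing at startup on a bad override is usually preferable to discovering a plain-HTTP or malformed endpoint at request time.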

Health Checks

Implement health checks to verify Kadindexer connectivity:

async function healthCheck() {
  try {
    const result = await client.request(
      `query { graphConfiguration { version } }`,
      {},
      { timeout: 5000 }
    );
    
    return {
      status: 'healthy',
      version: result.graphConfiguration.version,
      timestamp: new Date().toISOString()
    };
  } catch (error) {
    return {
      status: 'unhealthy',
      error: error.message,
      timestamp: new Date().toISOString()
    };
  }
}

// Run health check every 60 seconds
setInterval(async () => {
  const health = await healthCheck();
  if (health.status === 'unhealthy') {
    console.error('Kadindexer health check failed:', health.error);
  }
}, 60000);

Gradual Rollout

Deploy to a subset of users first:

function shouldUseKadindexer(userId) {
  // Gradual rollout based on user ID
  const rolloutPercentage = 10; // Start with 10%
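  // Simple string hash (hash * 31 + char code), truncated to a 32-bit
  // integer at each step; userId must be a string
  const hash = userId.split('').reduce((a, b) => {
    a = ((a << 5) - a) + b.charCodeAt(0);
    return a & a;
  }, 0);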
  
  return Math.abs(hash % 100) < rolloutPercentage;
}

Monitoring & Observability

Key Metrics Dashboard

Track these metrics in your monitoring system:

const metrics = {
  // Request metrics
  totalRequests: 0,
  successfulRequests: 0,
  failedRequests: 0,
  
  // Performance metrics
  avgResponseTime: 0,
  p95ResponseTime: 0,
  p99ResponseTime: 0,
  
  // Resource metrics
  cacheHitRate: 0,
  rateLimitUsage: 0,
  
  // Error metrics
  errorsByType: {},
  
  // Business metrics
  activeUsers: 0,
  queriesPerUser: 0
};

function updateMetrics(result, duration, error) {
  metrics.totalRequests++;
  
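  if (error) {
    metrics.failedRequests++;
    // Plain Error objects have no `type`, so fall back to the error name
    const type = error.type || error.name || 'unknown';
    metrics.errorsByType[type] = (metrics.errorsByType[type] || 0) + 1;
  } else {
    metrics.successfulRequests++;
  }
  
  // Update response time (exponential moving average, 5% weight on the newest sample)
  metrics.avgResponseTime = (metrics.avgResponseTime * 0.95) + (duration * 0.05);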
}
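
The function above maintains only the average, while the p95 and p99 fields need raw samples. One way to fill them is a bounded window of recent durations with a nearest-rank percentile. `LatencyWindow` is a hypothetical helper, and the default window of 1000 samples is an arbitrary assumption.

```javascript
// Keeps the most recent durations and derives percentiles from them.
class LatencyWindow {
  constructor(size = 1000) {
    this.size = size;
    this.samples = [];
  }

  record(durationMs) {
    this.samples.push(durationMs);
    if (this.samples.length > this.size) this.samples.shift(); // evict oldest
  }

  // Nearest-rank percentile: the smallest sample such that at least p% of
  // the samples are at or below it. Returns 0 when no samples exist.
  percentile(p) {
    if (this.samples.length === 0) return 0;
    const sorted = [...this.samples].sort((a, b) => a - b);
    const rank = Math.ceil((p * sorted.length) / 100);
    return sorted[Math.max(rank, 1) - 1];
  }
}
```

Calling `latency.record(duration)` on each request and then assigning `latency.percentile(95)` and `latency.percentile(99)` to the metrics object keeps the dashboard values current. Sorting on every read is fine at this window size; a larger window would call for a streaming estimator.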

Logging Best Practices

Log relevant information without exposing sensitive data:

function logQuery(query, variables, result, duration) {
  const logData = {
    timestamp: new Date().toISOString(),
    queryName: query.definitions?.[0]?.name?.value || 'anonymous',
    duration,
    success: !result.errors,
    // Sanitize variables - remove sensitive data
    variables: sanitizeVariables(variables)
  };
  
  if (result.errors) {
    logData.errors = result.errors.map(e => ({
      message: e.message,
      path: e.path
    }));
  }
  
  if (duration > 1000) {
    console.warn('Slow query:', logData);
  } else {
    console.info('Query:', logData);
  }
}

function sanitizeVariables(variables) {
  // Remove or redact sensitive fields
  const sanitized = { ...variables };
  if (sanitized.privateKey) delete sanitized.privateKey;
  if (sanitized.signature) sanitized.signature = '***';
  return sanitized;
}

Incident Response

Detection & Alerting

Set up alerts for critical issues:

function checkAlerts() {
  // High error rate (guard against division by zero before any requests)
  const errorRate = metrics.totalRequests > 0
    ? metrics.failedRequests / metrics.totalRequests
    : 0;
  if (errorRate > 0.05) {
    sendAlert('HIGH_ERROR_RATE', `Error rate: ${(errorRate * 100).toFixed(2)}%`);
  }
  
  // Approaching rate limit
  if (metrics.rateLimitUsage > 0.9) {
    sendAlert('RATE_LIMIT_WARNING', `Using ${(metrics.rateLimitUsage * 100).toFixed(0)}% of rate limit`);
  }
  
  // Slow queries
  if (metrics.p95ResponseTime > 2000) {
    sendAlert('SLOW_QUERIES', `P95 response time: ${metrics.p95ResponseTime}ms`);
  }
  
  // Low cache hit rate
  if (metrics.cacheHitRate < 0.5) {
    sendAlert('LOW_CACHE_HIT_RATE', `Cache hit rate: ${(metrics.cacheHitRate * 100).toFixed(0)}%`);
  }
}

setInterval(checkAlerts, 60000); // Check every minute
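
Because the check runs every minute, a sustained problem would raise the same alert every minute. The snippet above also leaves sendAlert undefined. One possible definition, sketched below, wraps your delivery function with a per-type cooldown. `createAlerter`, the `send` callback, and the 15-minute default are assumptions for illustration.

```javascript
// Wraps an alert sender so the same alert type fires at most once per
// cooldown period. `send` stands in for your pager or chat integration.
function createAlerter(send, cooldownMs = 15 * 60 * 1000, now = () => Date.now()) {
  const lastSent = new Map(); // alert type -> timestamp of last delivery

  return function sendAlert(type, message) {
    const previous = lastSent.get(type);
    if (previous !== undefined && now() - previous < cooldownMs) {
      return false; // suppressed: still inside the cooldown window
    }
    lastSent.set(type, now());
    send(type, message);
    return true;
  };
}
```

For example, `const sendAlert = createAlerter((type, message) => console.error(type, message));` gives checkAlerts a working sender that logs each alert type at most once per 15 minutes.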

Response Procedures

When errors spike:

  1. Check the Kadindexer status page, if one is available
  2. Review recent deployments (rollback if needed)
  3. Examine error logs for patterns
  4. Verify rate limits haven't been exceeded
  5. Contact support at toni@hackachain.io if the issue is on Kadindexer's side

When performance degrades:

  1. Check query complexity and pagination sizes
  2. Verify caching is working correctly
  3. Review recent query changes
  4. Check network connectivity
  5. Consider tier upgrade if sustained high load

Support Channels

By Tier

Basic (Free):

Developer:

Team:

When to Contact Support

  • Persistent 500/503 errors
  • Unexpected rate limiting
  • Query complexity issues
  • Data inconsistencies
  • Feature requests
  • Tier upgrades

Final Checklist

Before going live:

Infrastructure:

  • Rate limiting with exponential backoff
  • Caching with appropriate TTLs
  • Connection pooling enabled
  • Circuit breaker implemented
  • Health checks running

Monitoring:

  • Metrics collection active
  • Logging configured
  • Alerts set up
  • Dashboard created
  • On-call rotation defined

Error Handling:

  • Retry logic for 429/500/503
  • No retry for 400 errors
  • User-friendly error messages
  • Graceful degradation strategy

Security:

  • API keys in environment variables
  • HTTPS enforced
  • Input validation implemented
  • Sensitive data not logged

Performance:

  • Queries optimized
  • Query complexity under limits
  • Appropriate tier selected
  • Load tested

Resources

Need help? Contact toni@hackachain.io