Rate Limiting

The Discord bot must respect Discord's rate limits to avoid being temporarily banned or throttled.

Discord Rate Limits

Global Rate Limit

  • 50 requests per second for all bots
  • Applies to the entire bot, not per-endpoint
  • If no authorization header is provided, applies to IP address
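
Serenity enforces this cap for you, but the mechanics are easy to sketch: a token bucket holding 50 tokens, refilled at 50 tokens per second, approximates the global limit. A minimal std-only sketch (all names here are illustrative, not from any library):

```rust
use std::time::Instant;

/// Simple token bucket: starts full at `capacity` and refills at
/// `refill_per_sec` tokens per second, capped at `capacity`.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self {
            capacity,
            tokens: capacity,
            refill_per_sec,
            last_refill: Instant::now(),
        }
    }

    /// Consume one token if available; returns false when the caller
    /// should wait instead of sending a request.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

For Discord's global limit the bucket would be TokenBucket::new(50.0, 50.0); again, Serenity already does this internally, so this is only to show the shape of the mechanism.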

Per-Route Rate Limits

Routes have individual limits tracked by buckets. The bucket is identified by:

  • Route path
  • Major parameters: channel_id, guild_id, webhook_id

For example:

  • /channels/123/messages and /channels/456/messages are different buckets
  • Exceeding the limit on one doesn't affect the other
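
The real bucket identifier comes from the X-RateLimit-Bucket response header, but the grouping rule above can be illustrated with a toy key function over the method, route template, and major parameter value (everything here is hypothetical, not Discord's actual scheme):

```rust
/// Toy bucket key: requests share a bucket when they hit the same route
/// template with the same major parameter (channel_id, guild_id, webhook_id).
/// Minor parameters such as message ids do not create new buckets.
fn bucket_key(method: &str, route_template: &str, major_param: &str) -> String {
    format!("{method} {route_template} major={major_param}")
}
```

Under this scheme, /channels/123/messages and /channels/456/messages produce different keys, while two different message ids under the same channel share one.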

Invalid Request Limit

  • 10,000 invalid requests per 10 minutes before Cloudflare ban
  • Invalid requests: 401, 403, 429 status codes
  • Exception: 429 responses with X-RateLimit-Scope: shared don't count
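
Spread over the window, that budget works out to roughly 16.7 invalid requests per second sustained; a quick helper makes the arithmetic explicit (illustrative only):

```rust
/// Sustained invalid-request budget: limit divided by the window in seconds.
/// For Discord: 10_000 / (10 * 60) ≈ 16.7 invalid requests per second.
fn invalid_request_budget_per_sec(limit: u32, window_minutes: u32) -> f64 {
    limit as f64 / (window_minutes as f64 * 60.0)
}
```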

Rate Limit Headers

Discord returns these headers on API responses:

  • X-RateLimit-Limit: Number of requests that can be made
  • X-RateLimit-Remaining: Remaining requests in the current window
  • X-RateLimit-Reset: Unix timestamp when the limit resets
  • X-RateLimit-Reset-After: Seconds until reset (includes decimals)
  • X-RateLimit-Bucket: Unique bucket identifier
  • X-RateLimit-Scope: Scope of the limit: user, global, or shared
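
If you handle raw responses yourself (Serenity normally does this for you), the headers above can be parsed into a small struct. This sketch uses a plain HashMap with lowercased keys standing in for your HTTP client's header map, which is an assumption made for illustration:

```rust
use std::collections::HashMap;

/// Parsed subset of Discord's rate limit headers.
#[derive(Debug)]
struct RateLimitInfo {
    limit: u32,
    remaining: u32,
    reset_after: f64,
    bucket: String,
}

/// Returns None when any header is missing or malformed.
fn parse_rate_limit(headers: &HashMap<String, String>) -> Option<RateLimitInfo> {
    Some(RateLimitInfo {
        limit: headers.get("x-ratelimit-limit")?.parse().ok()?,
        remaining: headers.get("x-ratelimit-remaining")?.parse().ok()?,
        reset_after: headers.get("x-ratelimit-reset-after")?.parse().ok()?,
        bucket: headers.get("x-ratelimit-bucket")?.clone(),
    })
}
```

HTTP header names are case-insensitive; lowercased keys are assumed here because that is what most Rust HTTP clients normalize to.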

429 Response

When rate limited, Discord returns HTTP 429 with:

{
  "message": "You are being rate limited.",
  "retry_after": 6.457,
  "global": false
}

Serenity's Built-in Rate Limiting

Serenity has built-in rate limit handling that:

  • Implements pre-emptive rate limiting: when a bucket has no requests remaining, it sleeps until the reset time
  • Parses rate limit headers automatically
  • Tracks buckets per route + major parameters

Default Behavior

By default, Serenity's HTTP client handles rate limits. You don't need to do anything special:

use serenity::http::HttpBuilder;

// Rate limiting enabled by default
let http = HttpBuilder::new(&token).build();

Only disable if using a rate limit proxy like twilight-http-proxy:

// ONLY for proxy setups
let http = HttpBuilder::new(&token)
    .ratelimiter_disabled(true)
    .build();

Command Cooldowns with Poise

Poise provides built-in cooldown support through cooldown attributes on the command macro, specified in seconds:

Basic Cooldowns

use crate::bot::data::{Context, Error};

/// Command with a 5 second cooldown per user
#[poise::command(slash_command, prefix_command, user_cooldown = 5)]
pub async fn ping(ctx: Context<'_>) -> Result<(), Error> {
    ctx.say("Pong!").await?;
    Ok(())
}

Cooldown Configuration

Poise cooldowns can be configured per user, guild, channel, or globally:

use poise::CooldownConfig;
use std::time::Duration;

// Per-user cooldown
let user_cooldown = CooldownConfig {
    user: Some(Duration::from_secs(5)),
    ..Default::default()
};

// Per-guild cooldown
let guild_cooldown = CooldownConfig {
    guild: Some(Duration::from_secs(30)),
    ..Default::default()
};

// Per-channel cooldown
let channel_cooldown = CooldownConfig {
    channel: Some(Duration::from_secs(10)),
    ..Default::default()
};

// Global cooldown (applies to all users)
let global_cooldown = CooldownConfig {
    global: Some(Duration::from_secs(60)),
    ..Default::default()
};

// Combined: user + guild limits
let combined = CooldownConfig {
    user: Some(Duration::from_secs(5)),
    guild: Some(Duration::from_secs(2)),
    ..Default::default()
};

Handling Cooldown Errors

The framework's on_error handler receives CooldownHit errors:

async fn on_error(error: poise::FrameworkError<'_, Data, Error>) {
    match error {
        poise::FrameworkError::CooldownHit {
            remaining_cooldown,
            ctx,
            ..
        } => {
            let _ = ctx
                .say(format!(
                    "Please wait {:.1} seconds before using this command again.",
                    remaining_cooldown.as_secs_f32()
                ))
                .await;
        }
        // ... other error handlers
        _ => {}
    }
}

Advanced Dynamic Cooldowns

For context-aware cooldowns (for example, different durations per user tier), disable automatic enforcement with manual_cooldowns and drive the cooldown tracker yourself:

use poise::serenity_prelude as serenity;
use poise::CooldownConfig;
use std::time::Duration;

/// Command with a dynamic cooldown based on the invoking user
#[poise::command(
    slash_command,
    prefix_command,
    // Disable automatic cooldown enforcement; we check and start cooldowns ourselves
    manual_cooldowns
)]
pub async fn dynamic_cooldown(ctx: Context<'_>) -> Result<(), Error> {
    // Define different cooldowns for different user tiers
    let cooldown_config = if is_premium_user(ctx.author().id).await {
        // Premium users: 2 second cooldown
        CooldownConfig {
            user: Some(Duration::from_secs(2)),
            ..Default::default()
        }
    } else {
        // Regular users: 10 second cooldown
        CooldownConfig {
            user: Some(Duration::from_secs(10)),
            ..Default::default()
        }
    };

    let cooldown_context = poise::CooldownContext {
        user_id: ctx.author().id,
        guild_id: ctx.guild_id(),
        channel_id: ctx.channel_id(),
    };

    // Check the cooldown; scope the lock so the guard is not held across awaits
    let remaining = {
        let tracker = ctx.command().cooldowns.lock().unwrap();
        tracker.remaining_cooldown(cooldown_context.clone(), &cooldown_config)
    };

    if let Some(remaining) = remaining {
        ctx.say(format!(
            "Please wait {:.1} seconds before using this command again.",
            remaining.as_secs_f32()
        ))
        .await?;
        return Ok(());
    }

    // Start the cooldown
    ctx.command()
        .cooldowns
        .lock()
        .unwrap()
        .start_cooldown(cooldown_context);

    // Execute command logic
    ctx.say("Command executed!").await?;

    Ok(())
}

async fn is_premium_user(_user_id: serenity::UserId) -> bool {
    // Check premium status via an external API (stubbed out here)
    false
}

Cooldown Buckets Pattern

For shared cooldowns across related commands:

use std::sync::Arc;
use std::time::Duration;
use parking_lot::Mutex;
use poise::{CooldownConfig, CooldownTracker};

/// Shared cooldown tracker in the Data struct
#[derive(Clone)]
pub struct Data {
    pub moderation_cooldowns: Arc<Mutex<CooldownTracker>>,
    // ... other fields
}

/// All moderation commands share the same cooldown
#[poise::command(slash_command)]
pub async fn warn(ctx: Context<'_>, user: serenity::User) -> Result<(), Error> {
    check_shared_cooldown(ctx).await?;
    // ... warn logic
    Ok(())
}

#[poise::command(slash_command)]
pub async fn kick(ctx: Context<'_>, user: serenity::User) -> Result<(), Error> {
    check_shared_cooldown(ctx).await?;
    // ... kick logic
    Ok(())
}

async fn check_shared_cooldown(ctx: Context<'_>) -> Result<(), Error> {
    let config = CooldownConfig {
        user: Some(Duration::from_secs(5)),
        ..Default::default()
    };

    let cooldown_context = poise::CooldownContext {
        user_id: ctx.author().id,
        guild_id: ctx.guild_id(),
        channel_id: ctx.channel_id(),
    };

    let mut tracker = ctx.data().moderation_cooldowns.lock();

    if let Some(remaining) = tracker.remaining_cooldown(cooldown_context.clone(), &config) {
        return Err(format!("On cooldown for {} more seconds", remaining.as_secs()).into());
    }

    tracker.start_cooldown(cooldown_context);
    Ok(())
}

Custom Rate Limit Handling

For custom API calls (Heimdall API), implement rate limit handling:

// src/utils/rate_limit.rs

use std::time::Duration;
use tokio::time::sleep;
use tracing::{info, warn};

/// Rate limit constants
pub mod limits {
    /// Discord global rate limit
    pub const GLOBAL_REQUESTS_PER_SECOND: u32 = 50;

    /// Invalid request limit before a Cloudflare ban
    pub const INVALID_REQUEST_LIMIT: u32 = 10_000;
    pub const INVALID_REQUEST_WINDOW_MINUTES: u32 = 10;
}

/// Handle a 429 rate limit response with exponential backoff
pub async fn handle_rate_limit(retry_after: f64, attempt: u32) {
    let backoff = if attempt > 1 {
        // Exponential backoff for repeated rate limits
        retry_after * 1.5_f64.powi(attempt as i32 - 1)
    } else {
        retry_after
    };

    warn!(
        retry_after = retry_after,
        attempt = attempt,
        backoff_seconds = backoff,
        "Rate limited, waiting before retry"
    );

    sleep(Duration::from_secs_f64(backoff)).await;

    info!("Rate limit wait complete, resuming");
}

/// Rate-limit-aware wrapper for API calls
pub struct RateLimitedClient {
    max_retries: u32,
}

impl RateLimitedClient {
    pub fn new(max_retries: u32) -> Self {
        Self { max_retries }
    }

    /// Execute an operation, retrying with exponential backoff on failure
    pub async fn execute<F, Fut, T, E>(&self, operation: F) -> Result<T, E>
    where
        F: Fn() -> Fut,
        Fut: std::future::Future<Output = Result<T, E>>,
        E: std::fmt::Debug,
    {
        let mut attempt = 0;

        loop {
            attempt += 1;

            match operation().await {
                Ok(result) => return Ok(result),
                Err(e) => {
                    if attempt >= self.max_retries {
                        return Err(e);
                    }

                    warn!(
                        attempt = attempt,
                        max_retries = self.max_retries,
                        error = ?e,
                        "API call failed, will retry"
                    );

                    // Exponential backoff: 200ms, 400ms, 800ms, ...
                    sleep(Duration::from_millis(100 * 2_u64.pow(attempt))).await;
                }
            }
        }
    }
}

Best Practices

From Discord Documentation

  1. Never hard-code rate limits - Parse from response headers
  2. Serenity handles Discord limits - Built-in ratelimiter works automatically
  3. Check permissions first - Avoid 403 errors counting against invalid request limit
  4. Log failed requests - Monitor for rate limit patterns
  5. Respect Retry-After - Always wait the specified time
  6. Shared scope 429s don't count - X-RateLimit-Scope: shared is free

Additional Recommendations

  • Batch operations - Combine multiple operations when possible
  • Use WebSocket for real-time - Reduces REST API calls
  • Cache responses - Avoid redundant API calls
  • Queue requests - Spread out bulk operations
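
For the queueing point above, one simple policy is to spread a batch evenly across a window instead of firing everything at once. A sketch of computing the send offsets (the function name and numbers are illustrative):

```rust
use std::time::Duration;

/// Evenly spaced send offsets for `n` operations over `window`.
/// The i-th operation is delayed by i * (window / n).
fn spread_schedule(n: u32, window: Duration) -> Vec<Duration> {
    if n == 0 {
        return Vec::new();
    }
    let step = window / n;
    (0..n).map(|i| step * i).collect()
}
```

A sender task would then sleep until each offset before issuing the corresponding request, keeping the effective rate well under the limits.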

Handling Rate Limit Errors

In the Poise error handler:

use rust_i18n::t;

async fn on_error(error: poise::FrameworkError<'_, Data, Error>) {
    match error {
        poise::FrameworkError::CooldownHit {
            remaining_cooldown,
            ctx,
            ..
        } => {
            let _ = ctx
                .say(t!("errors.rate_limited", seconds = remaining_cooldown.as_secs()))
                .await;
        }
        // ... other error handlers
        _ => {}
    }
}

Monitoring

Log rate limit events for monitoring:

use tracing::{warn, info};

// When approaching limits
if remaining < 5 {
    warn!(
        bucket = ?bucket,
        remaining = remaining,
        reset_after = ?reset_after,
        "Approaching rate limit"
    );
}

// When rate limited
if status == 429 {
    warn!(
        retry_after = retry_after,
        global = is_global,
        scope = ?scope,
        "Rate limited by Discord"
    );
}

References