Rate Limiting

The Discord bot must respect Discord's rate limits to avoid being temporarily banned or throttled.

Discord Rate Limits

Global Rate Limit

  • 50 requests per second for all bots
  • Applies to the entire bot, not per-endpoint
  • If no authorization header is provided, applies to IP address
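
Serenity enforces this cap for you, but the mechanics are easy to sketch: a token bucket holding 50 tokens, refilled at 50 tokens per second, approximates the global limit. A minimal std-only sketch (all names here are illustrative, not from any library):

```rust
use std::time::Instant;

/// Simple token bucket: starts full at `capacity` and refills at
/// `refill_per_sec` tokens per second, capped at `capacity`.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self {
            capacity,
            tokens: capacity,
            refill_per_sec,
            last_refill: Instant::now(),
        }
    }

    /// Consume one token if available; returns false when the caller
    /// should wait instead of sending a request.
    fn try_acquire(&mut self) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

For Discord's global limit the bucket would be TokenBucket::new(50.0, 50.0); again, Serenity already does this internally, so this is only to show the shape of the mechanism.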

Per-Route Rate Limits

Routes have individual limits tracked by buckets. The bucket is identified by:

  • Route path
  • Major parameters: channel_id, guild_id, webhook_id

For example:

  • /channels/123/messages and /channels/456/messages are different buckets
  • Exceeding the limit on one doesn't affect the other
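
The real bucket identifier comes from the X-RateLimit-Bucket response header, but the grouping rule above can be illustrated with a toy key function over the method, route template, and major parameter value (everything here is hypothetical, not Discord's actual scheme):

```rust
/// Toy bucket key: requests share a bucket when they hit the same route
/// template with the same major parameter (channel_id, guild_id, webhook_id).
/// Minor parameters such as message ids do not create new buckets.
fn bucket_key(method: &str, route_template: &str, major_param: &str) -> String {
    format!("{method} {route_template} major={major_param}")
}
```

Under this scheme, /channels/123/messages and /channels/456/messages produce different keys, while two different message ids under the same channel share one.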

Invalid Request Limit

  • 10,000 invalid requests per 10 minutes before Cloudflare ban
  • Invalid requests: 401, 403, 429 status codes
  • Exception: 429 responses with X-RateLimit-Scope: shared don't count
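
Spread over the window, that budget works out to roughly 16.7 invalid requests per second sustained; a quick helper makes the arithmetic explicit (illustrative only):

```rust
/// Sustained invalid-request budget: limit divided by the window in seconds.
/// For Discord: 10_000 / (10 * 60) ≈ 16.7 invalid requests per second.
fn invalid_request_budget_per_sec(limit: u32, window_minutes: u32) -> f64 {
    limit as f64 / (window_minutes as f64 * 60.0)
}
```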

Rate Limit Headers

Discord returns these headers on API responses:

  • X-RateLimit-Limit: Number of requests that can be made
  • X-RateLimit-Remaining: Remaining requests in the current window
  • X-RateLimit-Reset: Unix timestamp when the limit resets
  • X-RateLimit-Reset-After: Seconds until reset (includes decimals)
  • X-RateLimit-Bucket: Unique bucket identifier
  • X-RateLimit-Scope: Scope of the limit: user, global, or shared
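
If you handle raw responses yourself (Serenity normally does this for you), the headers above can be parsed into a small struct. This sketch uses a plain HashMap with lowercased keys standing in for your HTTP client's header map, which is an assumption made for illustration:

```rust
use std::collections::HashMap;

/// Parsed subset of Discord's rate limit headers.
#[derive(Debug)]
struct RateLimitInfo {
    limit: u32,
    remaining: u32,
    reset_after: f64,
    bucket: String,
}

/// Returns None when any header is missing or malformed.
fn parse_rate_limit(headers: &HashMap<String, String>) -> Option<RateLimitInfo> {
    Some(RateLimitInfo {
        limit: headers.get("x-ratelimit-limit")?.parse().ok()?,
        remaining: headers.get("x-ratelimit-remaining")?.parse().ok()?,
        reset_after: headers.get("x-ratelimit-reset-after")?.parse().ok()?,
        bucket: headers.get("x-ratelimit-bucket")?.clone(),
    })
}
```

HTTP header names are case-insensitive; lowercased keys are assumed here because that is what most Rust HTTP clients normalize to.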

429 Response

When rate limited, Discord returns HTTP 429 with:

{
  "message": "You are being rate limited.",
  "retry_after": 6.457,
  "global": false
}

Serenity's Built-in Rate Limiting

Serenity has built-in rate limit handling that:

  • Implements pre-emptive rate limiting: when a bucket has no requests remaining, it sleeps until the reset time
  • Parses rate limit headers automatically
  • Tracks buckets per route + major parameters

Default Behavior

By default, Serenity's HTTP client handles rate limits. You don't need to do anything special:

use serenity::http::HttpBuilder;

// Rate limiting enabled by default
let http = HttpBuilder::new(&token).build();

Only disable if using a rate limit proxy like twilight-http-proxy:

// ONLY for proxy setups
let http = HttpBuilder::new(&token)
    .ratelimiter_disabled(true)
    .build();

Command Cooldowns with Poise

Poise provides built-in cooldown support through cooldown attributes on the command macro, specified in seconds:

Basic Cooldowns

use crate::bot::data::{Context, Error};

/// Command with a 5 second cooldown per user
#[poise::command(slash_command, prefix_command, user_cooldown = 5)]
pub async fn ping(ctx: Context<'_>) -> Result<(), Error> {
    ctx.say("Pong!").await?;
    Ok(())
}

Cooldown Configuration

Poise cooldowns can be configured per user, guild, channel, or globally:

use poise::CooldownConfig;
use std::time::Duration;

// Per-user cooldown
let user_cooldown = CooldownConfig {
    user: Some(Duration::from_secs(5)),
    ..Default::default()
};

// Per-guild cooldown
let guild_cooldown = CooldownConfig {
    guild: Some(Duration::from_secs(30)),
    ..Default::default()
};

// Per-channel cooldown
let channel_cooldown = CooldownConfig {
    channel: Some(Duration::from_secs(10)),
    ..Default::default()
};

// Global cooldown (applies to all users)
let global_cooldown = CooldownConfig {
    global: Some(Duration::from_secs(60)),
    ..Default::default()
};

// Combined: user + guild limits
let combined = CooldownConfig {
    user: Some(Duration::from_secs(5)),
    guild: Some(Duration::from_secs(2)),
    ..Default::default()
};

Handling Cooldown Errors

The framework's on_error handler receives CooldownHit errors:

async fn on_error(error: poise::FrameworkError<'_, Data, Error>) {
    match error {
        poise::FrameworkError::CooldownHit {
            remaining_cooldown,
            ctx,
            ..
        } => {
            let _ = ctx
                .say(format!(
                    "Please wait {:.1} seconds before using this command again.",
                    remaining_cooldown.as_secs_f32()
                ))
                .await;
        }
        // ... other error handlers
        _ => {}
    }
}

Advanced Dynamic Cooldowns

For context-aware cooldowns (for example, different durations per user tier), disable automatic enforcement with manual_cooldowns and drive the cooldown tracker yourself:

use poise::serenity_prelude as serenity;
use poise::CooldownConfig;
use std::time::Duration;

/// Command with a dynamic cooldown based on the invoking user
#[poise::command(
    slash_command,
    prefix_command,
    // Disable automatic cooldown enforcement; we check and start cooldowns ourselves
    manual_cooldowns
)]
pub async fn dynamic_cooldown(ctx: Context<'_>) -> Result<(), Error> {
    // Define different cooldowns for different user tiers
    let cooldown_config = if is_premium_user(ctx.author().id).await {
        // Premium users: 2 second cooldown
        CooldownConfig {
            user: Some(Duration::from_secs(2)),
            ..Default::default()
        }
    } else {
        // Regular users: 10 second cooldown
        CooldownConfig {
            user: Some(Duration::from_secs(10)),
            ..Default::default()
        }
    };

    let cooldown_context = poise::CooldownContext {
        user_id: ctx.author().id,
        guild_id: ctx.guild_id(),
        channel_id: ctx.channel_id(),
    };

    // Check the cooldown; scope the lock so the guard is not held across awaits
    let remaining = {
        let tracker = ctx.command().cooldowns.lock().unwrap();
        tracker.remaining_cooldown(cooldown_context.clone(), &cooldown_config)
    };

    if let Some(remaining) = remaining {
        ctx.say(format!(
            "Please wait {:.1} seconds before using this command again.",
            remaining.as_secs_f32()
        ))
        .await?;
        return Ok(());
    }

    // Start the cooldown
    ctx.command()
        .cooldowns
        .lock()
        .unwrap()
        .start_cooldown(cooldown_context);

    // Execute command logic
    ctx.say("Command executed!").await?;

    Ok(())
}

async fn is_premium_user(_user_id: serenity::UserId) -> bool {
    // Check premium status via an external API (stubbed out here)
    false
}

Cooldown Buckets Pattern

For shared cooldowns across related commands:

use std::sync::Arc;
use std::time::Duration;
use parking_lot::Mutex;
use poise::{CooldownConfig, CooldownTracker};

/// Shared cooldown tracker in the Data struct
#[derive(Clone)]
pub struct Data {
    pub moderation_cooldowns: Arc<Mutex<CooldownTracker>>,
    // ... other fields
}

/// All moderation commands share the same cooldown
#[poise::command(slash_command)]
pub async fn warn(ctx: Context<'_>, user: serenity::User) -> Result<(), Error> {
    check_shared_cooldown(ctx).await?;
    // ... warn logic
    Ok(())
}

#[poise::command(slash_command)]
pub async fn kick(ctx: Context<'_>, user: serenity::User) -> Result<(), Error> {
    check_shared_cooldown(ctx).await?;
    // ... kick logic
    Ok(())
}

async fn check_shared_cooldown(ctx: Context<'_>) -> Result<(), Error> {
    let config = CooldownConfig {
        user: Some(Duration::from_secs(5)),
        ..Default::default()
    };

    let cooldown_context = poise::CooldownContext {
        user_id: ctx.author().id,
        guild_id: ctx.guild_id(),
        channel_id: ctx.channel_id(),
    };

    let mut tracker = ctx.data().moderation_cooldowns.lock();

    if let Some(remaining) = tracker.remaining_cooldown(cooldown_context.clone(), &config) {
        return Err(format!("On cooldown for {} more seconds", remaining.as_secs()).into());
    }

    tracker.start_cooldown(cooldown_context);
    Ok(())
}

Custom Rate Limit Handling

For custom API calls (Heimdall API), implement rate limit handling:

// src/utils/rate_limit.rs

use std::time::Duration;
use tokio::time::sleep;
use tracing::{info, warn};

/// Rate limit constants
pub mod limits {
    /// Discord global rate limit
    pub const GLOBAL_REQUESTS_PER_SECOND: u32 = 50;

    /// Invalid request limit before a Cloudflare ban
    pub const INVALID_REQUEST_LIMIT: u32 = 10_000;
    pub const INVALID_REQUEST_WINDOW_MINUTES: u32 = 10;
}

/// Handle a 429 rate limit response with exponential backoff
pub async fn handle_rate_limit(retry_after: f64, attempt: u32) {
    let backoff = if attempt > 1 {
        // Exponential backoff for repeated rate limits
        retry_after * 1.5_f64.powi(attempt as i32 - 1)
    } else {
        retry_after
    };

    warn!(
        retry_after = retry_after,
        attempt = attempt,
        backoff_seconds = backoff,
        "Rate limited, waiting before retry"
    );

    sleep(Duration::from_secs_f64(backoff)).await;

    info!("Rate limit wait complete, resuming");
}

/// Rate-limit-aware wrapper for API calls
pub struct RateLimitedClient {
    max_retries: u32,
}

impl RateLimitedClient {
    pub fn new(max_retries: u32) -> Self {
        Self { max_retries }
    }

    /// Execute an operation, retrying with exponential backoff on failure
    pub async fn execute<F, Fut, T, E>(&self, operation: F) -> Result<T, E>
    where
        F: Fn() -> Fut,
        Fut: std::future::Future<Output = Result<T, E>>,
        E: std::fmt::Debug,
    {
        let mut attempt = 0;

        loop {
            attempt += 1;

            match operation().await {
                Ok(result) => return Ok(result),
                Err(e) => {
                    if attempt >= self.max_retries {
                        return Err(e);
                    }

                    warn!(
                        attempt = attempt,
                        max_retries = self.max_retries,
                        error = ?e,
                        "API call failed, will retry"
                    );

                    // Exponential backoff: 200ms, 400ms, 800ms, ...
                    sleep(Duration::from_millis(100 * 2_u64.pow(attempt))).await;
                }
            }
        }
    }
}

Best Practices

From Discord Documentation

  1. Never hard-code rate limits - Parse from response headers
  2. Serenity handles Discord limits - Built-in ratelimiter works automatically
  3. Check permissions first - Avoid 403 errors counting against invalid request limit
  4. Log failed requests - Monitor for rate limit patterns
  5. Respect Retry-After - Always wait the specified time
  6. Shared scope 429s don't count - X-RateLimit-Scope: shared is free

Additional Recommendations

  • Batch operations - Combine multiple operations when possible
  • Use WebSocket for real-time - Reduces REST API calls
  • Cache responses - Avoid redundant API calls
  • Queue requests - Spread out bulk operations
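
For the queueing point above, one simple policy is to spread a batch evenly across a window instead of firing everything at once. A sketch of computing the send offsets (the function name and numbers are illustrative):

```rust
use std::time::Duration;

/// Evenly spaced send offsets for `n` operations over `window`.
/// The i-th operation is delayed by i * (window / n).
fn spread_schedule(n: u32, window: Duration) -> Vec<Duration> {
    if n == 0 {
        return Vec::new();
    }
    let step = window / n;
    (0..n).map(|i| step * i).collect()
}
```

A sender task would then sleep until each offset before issuing the corresponding request, keeping the effective rate well under the limits.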

Handling Rate Limit Errors

In the Poise error handler:

use rust_i18n::t;

async fn on_error(error: poise::FrameworkError<'_, Data, Error>) {
    match error {
        poise::FrameworkError::CooldownHit {
            remaining_cooldown,
            ctx,
            ..
        } => {
            let _ = ctx
                .say(t!("errors.rate_limited", seconds = remaining_cooldown.as_secs()))
                .await;
        }
        // ... other error handlers
        _ => {}
    }
}

Monitoring

Log rate limit events for monitoring:

use tracing::{warn, info};

// When approaching limits
if remaining < 5 {
    warn!(
        bucket = ?bucket,
        remaining = remaining,
        reset_after = ?reset_after,
        "Approaching rate limit"
    );
}

// When rate limited
if status == 429 {
    warn!(
        retry_after = retry_after,
        global = is_global,
        scope = ?scope,
        "Rate limited by Discord"
    );
}

References