API Rate Limiting and Enumeration Weakness
Description
APIs that lack proper rate limiting, use predictable identifiers (sequential IDs), and don't verify authorization for each request. This allows attackers to systematically enumerate and extract entire datasets by simply iterating through ID ranges. Often combined with Insecure Direct Object Reference (IDOR) vulnerabilities where authorization checks are missing entirely.
Illustrative Cantor Point
The Cantor Point occurs when designing API access patterns - choosing between simple, direct ID-based access versus implementing proper authentication, authorization, and rate limiting. The decision to use sequential IDs without access controls creates a divergent path where a convenience feature becomes a massive data exposure vector.
Real-World Examples / Observed In
- First American Financial (2019): 885 million records exposed via sequential document IDs in URLs without authentication [See: Cases-By-Year/2019 Data Integrity Failures.md#2]
- Facebook Contact Importer (2019): 533 million profiles scraped through bulk phone number queries without rate limiting [See: Cases-By-Year/2019 Data Integrity Failures.md#6]
- Common Pattern: REST APIs exposing /api/resource/{id} without checking if requester owns that resource
Common Consequences & Impacts
Technical Impacts
- - Complete database extraction via enumeration
- - API abuse and resource exhaustion
- - Inability to track or stop ongoing attacks
- - Difficulty retrofitting security controls
Human/Ethical Impacts
- - Privacy violations at scale
- - Identity theft enablement
- - Financial records exposure
- - Personal safety risks
Business Impacts
- - Massive data breaches
- - Regulatory violations
- - API shutdown requirements
- - Competitive disadvantage from data theft
Recovery Difficulty & Escalation
ADI Principles & Axioms Violated
- Principle of Observable Integrity: Cannot detect enumeration without monitoring
- Principle of Trust Insufficiency: Trusting that IDs won't be guessed
Detection / 60-Second Audit
```sql
-- Check for sequential ID usage in APIs
SELECT
table_name,
column_name,
data_type,
column_default
FROM information_schema.columns
WHERE table_schema = 'public'
AND column_name IN ('id', 'document_id', 'resource_id')
AND (data_type IN ('integer', 'bigint', 'serial', 'bigserial')
OR column_default LIKE 'nextval%');
-- Look for missing API rate limiting tables
SELECT
NOT EXISTS (
SELECT 1 FROM information_schema.tables
WHERE table_name IN ('api_rate_limits', 'rate_limit_buckets', 'api_throttle')
) as missing_rate_limiting;
-- Check for API access logging
SELECT
EXISTS (
SELECT 1 FROM information_schema.tables
WHERE table_name LIKE '%api%log%'
OR table_name LIKE '%access%log%'
) as has_api_logging;
-- Identify potential IDOR vulnerabilities
SELECT
tc.table_name,
kcu.column_name as id_column,
CASE
WHEN NOT EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_name = tc.table_name
AND column_name IN ('user_id', 'owner_id', 'created_by')
) THEN 'No ownership tracking - IDOR risk'
ELSE 'Has ownership columns'
END as idor_risk
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
WHERE tc.constraint_type = 'PRIMARY KEY'
AND tc.table_schema = 'public';
```
```sql
-- Check for sequential ID usage in APIs
SELECT
table_name,
column_name,
data_type,
column_default,
extra
FROM information_schema.columns
WHERE table_schema = DATABASE()
AND column_name IN ('id', 'document_id', 'resource_id')
AND (data_type IN ('int', 'bigint')
OR extra = 'auto_increment');
-- Look for missing API rate limiting tables
SELECT
NOT EXISTS (
SELECT 1 FROM information_schema.tables
WHERE table_schema = DATABASE()
AND table_name IN ('api_rate_limits', 'rate_limit_buckets', 'api_throttle')
) as missing_rate_limiting;
-- Check for API access logging
SELECT
EXISTS (
SELECT 1 FROM information_schema.tables
WHERE table_schema = DATABASE()
AND (table_name LIKE '%api%log%'
OR table_name LIKE '%access%log%')
) as has_api_logging;
-- Identify potential IDOR vulnerabilities
SELECT
t.table_name,
kcu.column_name as id_column,
CASE
WHEN NOT EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_schema = DATABASE()
AND table_name = t.table_name
AND column_name IN ('user_id', 'owner_id', 'created_by')
) THEN 'No ownership tracking - IDOR risk'
ELSE 'Has ownership columns'
END as idor_risk
FROM information_schema.tables t
JOIN information_schema.key_column_usage kcu
ON t.table_name = kcu.table_name
WHERE kcu.constraint_name = 'PRIMARY'
AND t.table_schema = DATABASE();
```
```sql
-- Check for sequential ID usage in APIs
SELECT
t.name as table_name,
c.name as column_name,
ty.name as data_type,
c.is_identity
FROM sys.columns c
JOIN sys.tables t ON c.object_id = t.object_id
JOIN sys.types ty ON c.user_type_id = ty.user_type_id
WHERE c.name IN ('id', 'document_id', 'resource_id')
AND (ty.name IN ('int', 'bigint') OR c.is_identity = 1);
-- Look for missing API rate limiting tables
SELECT
CASE WHEN NOT EXISTS (
SELECT 1 FROM sys.tables
WHERE name IN ('api_rate_limits', 'rate_limit_buckets', 'api_throttle')
) THEN 1 ELSE 0 END as missing_rate_limiting;
-- Check for API access logging
SELECT
CASE WHEN EXISTS (
SELECT 1 FROM sys.tables
WHERE name LIKE '%api%log%'
OR name LIKE '%access%log%'
) THEN 1 ELSE 0 END as has_api_logging;
-- Identify potential IDOR vulnerabilities
SELECT
t.name as table_name,
kc.name as id_column,
CASE
WHEN NOT EXISTS (
SELECT 1 FROM sys.columns
WHERE object_id = t.object_id
AND name IN ('user_id', 'owner_id', 'created_by')
) THEN 'No ownership tracking - IDOR risk'
ELSE 'Has ownership columns'
END as idor_risk
FROM sys.tables t
JOIN sys.indexes i ON t.object_id = i.object_id
JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
JOIN sys.columns kc ON ic.object_id = kc.object_id AND ic.column_id = kc.column_id
WHERE i.is_primary_key = 1;
Prevention & Mitigation Best Practices
Use Non-Sequential Identifiers:
-- Bad: Sequential IDs enable enumeration CREATE TABLE documents ( id SERIAL PRIMARY KEY, content TEXT ); -- Good: UUIDs prevent enumeration CREATE TABLE documents ( id UUID DEFAULT gen_random_uuid() PRIMARY KEY, content TEXT, owner_id UUID NOT NULL, created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() );Implement Rate Limiting:
CREATE TABLE api_rate_limits ( id SERIAL PRIMARY KEY, user_id UUID, ip_address INET, endpoint VARCHAR(255), window_start TIMESTAMP WITH TIME ZONE, request_count INTEGER DEFAULT 1, UNIQUE(user_id, ip_address, endpoint, window_start) ); -- Function to check rate limits CREATE OR REPLACE FUNCTION check_rate_limit( p_user_id UUID, p_ip INET, p_endpoint VARCHAR, p_limit INTEGER, p_window_minutes INTEGER ) RETURNS BOOLEAN AS $ DECLARE v_count INTEGER; v_window_start TIMESTAMP WITH TIME ZONE; BEGIN v_window_start := date_trunc('minute', NOW() - INTERVAL '1 minute' * (EXTRACT(MINUTE FROM NOW())::INTEGER % p_window_minutes)); INSERT INTO api_rate_limits (user_id, ip_address, endpoint, window_start) VALUES (p_user_id, p_ip, p_endpoint, v_window_start) ON CONFLICT (user_id, ip_address, endpoint, window_start) DO UPDATE SET request_count = api_rate_limits.request_count + 1 RETURNING request_count INTO v_count; RETURN v_count <= p_limit; END; $ LANGUAGE plpgsql;Implement Authorization Checks:
-- Always verify ownership CREATE OR REPLACE FUNCTION can_access_document( p_user_id UUID, p_document_id UUID ) RETURNS BOOLEAN AS $ BEGIN RETURN EXISTS ( SELECT 1 FROM documents WHERE id = p_document_id AND (owner_id = p_user_id OR EXISTS ( SELECT 1 FROM document_shares WHERE document_id = p_document_id AND shared_with_user_id = p_user_id )) ); END; $ LANGUAGE plpgsql;API Access Pattern Monitoring:
CREATE TABLE api_access_patterns ( id BIGSERIAL PRIMARY KEY, user_id UUID, ip_address INET, endpoint VARCHAR(255), accessed_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), resource_ids TEXT[], pattern_score INTEGER -- Higher = more suspicious ); -- Detect enumeration patterns CREATE VIEW suspicious_api_activity AS SELECT user_id, ip_address, COUNT(*) as request_count, COUNT(DISTINCT unnest(resource_ids)) as unique_resources, MAX(pattern_score) as max_pattern_score FROM api_access_patterns WHERE accessed_at > NOW() - INTERVAL '1 hour' GROUP BY user_id, ip_address HAVING COUNT(*) > 100 -- High volume OR COUNT(DISTINCT unnest(resource_ids)) > 50; -- Many different resourcesAdditional Best Practices:
- Implement CAPTCHA for high-volume requests
- Use API keys with scope limitations
- Deploy Web Application Firewalls (WAF)
- Regular penetration testing focusing on IDOR
- Implement circuit breakers for suspicious patterns
Real World Examples
# The vulnerability:
GET https://docutech-prod.firstam.com/document/view/12345
Response: Full document with sensitive financial data
# How it was exploited:
- No authentication required
- Sequential IDs allowed enumeration
- Simple script could download millions:
for i in range(1, 1000000):
url = f"https://docutech-prod.firstam.com/document/view/{i}"
download(url) # 885 million documents exposed
# Impact:
- Mortgage documents, bank statements, tax records
- Social Security numbers, wire transfer receipts
- Active from 2003 to 2019 (16 years!)
# The vulnerability:
POST /v2.0/me/invitable_friends
Body: { "phone_numbers": ["+1234567890", "+1234567891", ...] }
Response: { "data": [{ "id": "100001234", "name": "John Doe" }, ...] }
# How it was exploited:
- No rate limiting on bulk queries
- Could submit millions of phone numbers
- Returned Facebook profiles for each number
- Scrapers harvested 533 million profiles
# Timeline:
- Vulnerability introduced: Unknown
- Discovered and exploited: 2019
- Patched: August 2019
- Data appeared on hacking forums: 2021
# Before (vulnerable):
@app.route('/api/documents/<int:doc_id>')
def get_document(doc_id):
doc = db.query(f"SELECT * FROM documents WHERE id = {doc_id}")
return jsonify(doc)
# After (secure):
@app.route('/api/documents/<uuid:doc_id>')
@require_auth
@rate_limit(100, per='hour')
def get_document(doc_id):
# Check ownership
if not can_access_document(current_user.id, doc_id):
abort(404) # Always 404, never 403
# Log access for pattern detection
log_api_access(current_user.id, request.remote_addr, doc_id)
# Check for enumeration patterns
if detect_enumeration_pattern(current_user.id):
alert_security_team()
abort(429) # Rate limit response
doc = db.query(
"SELECT * FROM documents WHERE id = %s AND owner_id = %s",
(doc_id, current_user.id)
)
return jsonify(doc)
AI Coding Guidance/Prompt
Prompt: "When designing APIs that access data:"
Rules:
- Never use sequential IDs for public APIs
- Always implement rate limiting from day one
- Require authorization checks for every data access
- Log all API access with enough detail to detect patterns
- Flag any API that returns data based solely on an ID parameter
Example:
# Bad: Direct ID access without controls
GET /api/documents/12345
Response: { "id": 12345, "content": "sensitive data" }
# Good: Proper API security
GET /api/documents/550e8400-e29b-41d4-a716-446655440000
Headers:
Authorization: Bearer <token>
X-API-Key: <key>
Response:
- 200 OK if user owns document
- 404 Not Found if document doesn't exist OR user lacks access
- 429 Too Many Requests if rate limit exceeded
# Implementation example
async function getDocument(req, res) {
// Rate limiting
if (!await checkRateLimit(req.user.id, req.ip)) {
return res.status(429).json({ error: "Rate limit exceeded" });
}
// Authorization
const documentId = req.params.id;
if (!await canAccessDocument(req.user.id, documentId)) {
return res.status(404).json({ error: "Not found" });
}
// Logging
await logApiAccess(req.user.id, req.ip, 'GET /api/documents', documentId);
// Return data
const document = await fetchDocument(documentId);
return res.json(document);
}
Relevant Keywords
rate limiting enumeration weakness Symptoms: slow queries, data inconsistency, constraint violations Preventive: schema validation, constraint enforcement, proper typing Tech stack: PostgreSQL, MySQL, SQL Server, Oracle Industry: all industries, enterprise, SaaS