SWE-0003

Lack of Foreign Keys / Constraints

Cantorian Technical Debt Magnitude: ℵ₁ (Systemic)

Description

Not enforcing relational integrity at the schema level through foreign key constraints, unique constraints, or CHECK constraints. The database becomes unable to guarantee that references between tables are valid or that business rules are upheld. Teams rely solely on application logic for integrity, which fails as systems grow, multiple services access data, or when bugs introduce inconsistencies.

Illustrative Cantor Point

The Cantor Point occurs during initial schema design when the team decides "we'll handle referential integrity in the application layer" to avoid dealing with constraint violations or to simplify development. This convenience-over-correctness decision opens the door to years of accumulating data inconsistencies.

Categories: CP-Schema, CP-API, CP-Process

Real-World Examples / Observed In

  • E-commerce Platform: Orders table had no foreign key to customers table. A race condition during high traffic created thousands of orders pointing to non-existent customers, crashing the billing system
  • Social Media App: No unique constraint on usernames led to duplicate usernames through a race condition, breaking the login system
  • Financial System: Missing constraints allowed negative account balances and duplicate transaction IDs, leading to reconciliation nightmares

Common Consequences & Impacts

Technical Impacts

  • - Orphaned records accumulating over time
  • - Duplicate data violating business logic
  • - Invalid references causing join failures
  • - Inconsistent state across related tables

Human/Ethical Impacts

  • - Customer data integrity compromised
  • - Incorrect billing or missing services
  • - Loss of trust when data errors surface
  • - Engineering time spent on data archaeology

Business Impacts

  • - Financial losses from data inconsistencies
  • - Compliance violations
  • - Manual data cleanup projects
  • - Features built on false assumptions

Recovery Difficulty & Escalation

4
6.5

ADI Principles & Axioms Violated

  • Principle of Invisible Decay: Data corruption accumulates silently
  • Principle of Consequential Stewardship: Failing to protect data integrity affects real people
  • Principle of Performative Integrity: Application-level checks create false confidence

Detection / 60-Second Audit

-- Find tables without primary keys
SELECT 
    schemaname,
    tablename 
FROM pg_tables t
WHERE NOT EXISTS (
    SELECT 1 FROM pg_indexes i 
    WHERE i.tablename = t.tablename 
    AND i.indexname LIKE '%_pkey'
)
AND schemaname = 'public';

-- Find potential foreign key relationships without constraints
WITH potential_fks AS (
    SELECT 
        c1.table_name,
        c1.column_name,
        c2.table_name as referenced_table
    FROM information_schema.columns c1
    JOIN information_schema.columns c2 
        ON c1.column_name = c2.table_name || '_id'
        OR c1.column_name = SUBSTR(c2.table_name, 1, LENGTH(c2.table_name)-1) || '_id'
    WHERE c1.table_schema = 'public'
    AND c2.table_schema = 'public'
    AND c2.column_name = 'id'
)
SELECT * FROM potential_fks pf
WHERE NOT EXISTS (
    SELECT 1 FROM information_schema.table_constraints tc
    WHERE tc.table_name = pf.table_name
    AND tc.constraint_type = 'FOREIGN KEY'
);
-- Find tables without primary keys
SELECT 
    TABLE_SCHEMA,
    TABLE_NAME
FROM information_schema.TABLES t
WHERE TABLE_SCHEMA = DATABASE()
AND NOT EXISTS (
    SELECT 1 FROM information_schema.STATISTICS s
    WHERE s.TABLE_SCHEMA = t.TABLE_SCHEMA
    AND s.TABLE_NAME = t.TABLE_NAME
    AND s.INDEX_NAME = 'PRIMARY'
);

-- Find columns that look like foreign keys but have no constraints
SELECT 
    c.TABLE_NAME,
    c.COLUMN_NAME,
    REPLACE(c.COLUMN_NAME, '_id', '') as potential_ref_table
FROM information_schema.COLUMNS c
WHERE c.TABLE_SCHEMA = DATABASE()
AND c.COLUMN_NAME LIKE '%_id'
AND NOT EXISTS (
    SELECT 1 FROM information_schema.KEY_COLUMN_USAGE kcu
    WHERE kcu.TABLE_SCHEMA = c.TABLE_SCHEMA
    AND kcu.TABLE_NAME = c.TABLE_NAME
    AND kcu.COLUMN_NAME = c.COLUMN_NAME
    AND kcu.REFERENCED_TABLE_NAME IS NOT NULL
);
-- Find tables without primary keys
SELECT 
    SCHEMA_NAME(schema_id) AS schema_name,
    name AS table_name
FROM sys.tables t
WHERE NOT EXISTS (
    SELECT 1 FROM sys.indexes i
    WHERE i.object_id = t.object_id
    AND i.is_primary_key = 1
)
AND type = 'U';

Prevention & Mitigation Best Practices

  1. Schema-First Integrity: Always define constraints at the database level
  2. Foreign Key Discipline: Every reference must have a corresponding FK constraint
  3. Unique Constraints: Enforce business rules (emails, usernames, etc.) in schema
  4. NOT NULL Wisdom: Use NOT NULL where appropriate to prevent incomplete data
  5. CHECK Constraints: Validate data ranges and formats at DB level
  6. Migration Strategy: Add constraints gradually with data cleanup phases
  7. Cascade Planning: Define appropriate ON DELETE/UPDATE behaviors

Real World Examples

-- Missing foreign key constraint allows orphaned orders
CREATE TABLE customers (
    id BIGINT PRIMARY KEY,
    email VARCHAR(255)
);

CREATE TABLE orders (
    id BIGINT PRIMARY KEY,
    customer_id BIGINT,  -- No FK constraint!
    total DECIMAL(10,2)
);

-- Race condition during customer deletion creates orphaned orders
-- Thread 1: DELETE FROM customers WHERE id = 123;
-- Thread 2: INSERT INTO orders (customer_id, total) VALUES (123, 99.99);
-- Result: Order points to non-existent customer
-- No unique constraint leads to duplicate usernames
CREATE TABLE users (
    id BIGINT PRIMARY KEY,
    username VARCHAR(50),  -- Should be UNIQUE!
    email VARCHAR(255)
);

-- Concurrent signups create duplicates
-- Thread 1: INSERT INTO users (username) VALUES ('johndoe');
-- Thread 2: INSERT INTO users (username) VALUES ('johndoe');
-- Result: Two users with same username, login system breaks
-- Proper constraints prevent data integrity issues
CREATE TABLE customers (
    id BIGINT PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE
);

CREATE TABLE orders (
    id BIGINT PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    total DECIMAL(10,2) NOT NULL CHECK (total >= 0),
    FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE RESTRICT
);

CREATE TABLE users (
    id BIGINT PRIMARY KEY,
    username VARCHAR(50) NOT NULL UNIQUE,
    email VARCHAR(255) NOT NULL UNIQUE,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Now the database enforces integrity:
-- Cannot delete customer with orders
-- Cannot create duplicate usernames
-- Cannot have negative order totals

AI Coding Guidance/Prompt

Prompt: "When designing database schemas:"
Rules:
  - Every column ending in _id must have a foreign key constraint
  - Unique business identifiers need UNIQUE constraints
  - Use NOT NULL by default, make nullable only when justified
  - Include ON DELETE behavior for all foreign keys
  
Example:
  # Bad: No constraints
  CREATE TABLE orders (
    id BIGINT PRIMARY KEY,
    customer_id BIGINT,
    email VARCHAR(255)
  );
  
  # Good: Proper constraints
  CREATE TABLE orders (
    id BIGINT PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    email VARCHAR(255) NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers(id) ON DELETE RESTRICT,
    UNIQUE (order_number),
    CHECK (total_amount >= 0)
  );

Relevant Keywords

lack foreign keys constraints foreign key constraint referential integrity Symptoms: slow queries, data inconsistency, constraint violations Preventive: schema validation, constraint enforcement, proper typing Tech stack: PostgreSQL, MySQL, SQL Server, Oracle Industry: all industries, enterprise, SaaS

Related Patterns

The Cantorian Technical Debt Magnitude scale gives developers an intuitive sense of magnitude beyond simple hour counts - some debts aren't just larger in scale, but qualitatively different in their complexity.

Cantor Points are critical decision junctures—or even moments of non-decision—where seemingly small choices can create drastically divergent futures for a system's integrity, security, and evolvability. These are the "forks in the road" where one path might lead to manageable complexity, while another veers towards systemic entanglement or even chaos. They often appear trivial at the time but can set in motion irreversible or costly-to-reverse consequences.

Applied Data Integrity (ADI) is a framework to understanding the far-reaching consequences of schema and data decisions that impact security and reliability, and accumulate into ethical debt that affects real human lives. Built on research from real-world incidents, ADI uncovered 7 Principles to identify when these decisions are being made, and how to make them better, to avoid future technical debt and potentially catastrophic "butterfly effects" of small decisions that ripple into chaotic technical and ethical debt.