2024-11-18
Multi-Tenant Arhitektura: Zašto Tenant-By-Row Ubija Performance
Scenario: Imate SaaS ERP sa 500 tenanta (kompanija). Tenant #47 ima 2 miliona invoice-a. Tenant #312 ima 50 invoice-a.
Query:
SELECT * FROM invoices
WHERE tenant_id = 312
AND status = 'unpaid'
AND due_date < NOW()
ORDER BY created_at DESC
LIMIT 10;
Problem: PostgreSQL mora skenirati 2+ miliona redova da bi našao 10 faktura za tenant #312.
Zašto? Zato što ste koristili tenant-by-row pattern.
Tenant-By-Row: Najgori Multi-Tenant Pattern
Kako Radi
Svaka tabela ima tenant_id kolonu:
CREATE TABLE invoices (
id UUID PRIMARY KEY,
tenant_id UUID NOT NULL REFERENCES tenants(id),
customer_id UUID NOT NULL,
amount DECIMAL(15,2),
status VARCHAR(50),
due_date TIMESTAMP,
created_at TIMESTAMP,
-- ... još 20 kolona
);
CREATE INDEX idx_invoices_tenant ON invoices(tenant_id);
Svaki query mora filtirati po tenant_id:
// Backend kod
const invoices = await db.invoices.find({
tenant_id: currentUser.tenantId, // OBAVEZNO
status: 'unpaid'
});
Problem #1: Indeksiranje Je Neupotrebljivo
PostgreSQL indeks na (tenant_id, status) postaje ogroman kada imate:
- 500 tenanta
- 10M+ total redova
- Neravnomerna distribucija (1 tenant ima 50% podataka)
Index bloat je ogroman:
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE tablename = 'invoices';
-- Rezultat: 12 GB (tabela) + 8 GB (indeksi) = 20 GB TOTAL
Problem #2: Query Planner Ne Može Optimizovati
Query za tenant sa 50 redova koristi isti execution plan kao query za tenant sa 2M redova:
EXPLAIN ANALYZE
SELECT * FROM invoices
WHERE tenant_id = '...' AND status = 'unpaid';
-- Tenant #312 (50 invoices):
-- Seq Scan on invoices (cost=0.00..350000.00 rows=50 actual=23ms)
-- Tenant #47 (2M invoices):
-- Seq Scan on invoices (cost=0.00..350000.00 rows=800000 actual=1870ms)
Isti query, 80x razlika u performance-u.
Problem #3: Vacuum i Maintenance Su Košmar
VACUUM mora procesirati celu tabelu, čak i ako tenant #312 ima samo 50 redova:
VACUUM ANALYZE invoices;
-- Processing 10M rows...
-- Time: 45 minutes
Tenant #312 plaća performance penalty zbog tenant #47.
Problem #4: Backup i Restore Nisu Selektivni
Ne možete uraditi per-tenant backup ili selective restore:
# Backup CELE baze (uključuje sve tenant-e)
pg_dump mydb > backup.sql
# Želite restore samo tenant #312?
# NEMA NAČINA. Moraš restore SVE pa filtirati.
Problem #5: Data Leakage Rizik
Jedan bug u WHERE clause-u = catastrophic data leak:
// BUG: Zaboravili tenant_id filter
const invoices = await db.invoices.find({
status: 'unpaid' // ❌ LEAK: Vrača SVE tenant-e
});
Compliance nightmare: GDPR, HIPAA, SOC2 - sve su violated jednim query-jem.
PMFA Rešenje: Tenant-By-Architecture
Ideja: Svaki tenant ima fizički odvojen storage (schema, tabela, ili database).
Pattern #1: Schema-Per-Tenant (PostgreSQL)
-- Tenant A
CREATE SCHEMA tenant_a;
CREATE TABLE tenant_a.invoices (
id UUID PRIMARY KEY,
-- NEMA tenant_id kolone
customer_id UUID,
amount DECIMAL(15,2),
status VARCHAR(50),
due_date TIMESTAMP
);
-- Tenant B
CREATE SCHEMA tenant_b;
CREATE TABLE tenant_b.invoices (
id UUID PRIMARY KEY,
customer_id UUID,
amount DECIMAL(15,2),
status VARCHAR(50),
due_date TIMESTAMP
);
Query:
-- Automatski radi sa tenant context-om
SET search_path TO tenant_a;
SELECT * FROM invoices WHERE status = 'unpaid';
-- Skenira SAMO tenant_a.invoices (50 rows), ne celu bazu
Prednosti
✅ 10x+ brži queries - Indeksi su mali, samo za jedan tenant
✅ Perfect isolation - Fizički odvojeni podaci, zero leakage rizik
✅ Selective backups - pg_dump --schema=tenant_a
✅ Per-tenant optimizacije - Vacuum, reindex, samo za jedan tenant
✅ Custom fields per tenant - Tenant A ima kolonu warranty_period, Tenant B nema
PMFA Implementation: Automatski Schema Management
YAML Metadata
# metadata/entities/invoice.yml
entity: Invoice
multiTenant: true
strategy: schema-per-tenant # PMFA automatski kreira schema za svakog tenant-a
fields:
- name: id
type: uuid
primary: true
- name: customerId
type: relation
target: Customer
- name: amount
type: decimal
precision: 15
scale: 2
- name: status
type: enum
values: [draft, sent, paid, overdue]
- name: dueDate
type: timestamp
Automatic Schema Creation
Kada se novi tenant registruje, PMFA automatski:
// Korisnik kreira novi tenant
POST /api/tenants
{
"name": "Acme Corp",
"subdomain": "acme"
}
// PMFA automatski izvršava:
await db.query(`
CREATE SCHEMA tenant_acme;
-- Generiše SVAKU tabelu iz metadata/
CREATE TABLE tenant_acme.invoices (...);
CREATE TABLE tenant_acme.customers (...);
CREATE TABLE tenant_acme.products (...);
-- ... i sve ostale entitete
-- Kreira indekse
CREATE INDEX idx_invoices_status ON tenant_acme.invoices(status);
CREATE INDEX idx_invoices_due_date ON tenant_acme.invoices(due_date);
`);
Zero ručni rad. Sve je metadata-driven.
Query Routing
PMFA automatski postavlja search_path na osnovu JWT tokena:
// Middleware
app.use((req, res, next) => {
const tenantId = req.user.tenantId;
const schema = `tenant_${tenantId}`;
// Postavi PostgreSQL context
await db.query(`SET search_path TO ${schema}`);
next();
});
// Kontroler (bez tenant_id filtera!)
app.get('/api/invoices', async (req, res) => {
// Query automatski koristi tenant_acme schema
const invoices = await db.invoices.find({
status: 'unpaid'
});
res.json(invoices);
});
Nema tenant_id kolona. Nema WHERE clause kompleksnost.
Benchmark: Tenant-By-Row vs Tenant-By-Schema
Setup:
- 500 tenanta
- 10M total invoices
- Tenant #47: 2M invoices (20%)
- Tenant #312: 50 invoices (0.0005%)
Query: SELECT * FROM invoices WHERE status = 'unpaid' LIMIT 10
Tenant-By-Row (klasičan pristup)
SELECT * FROM invoices
WHERE tenant_id = '312' AND status = 'unpaid'
LIMIT 10;
-- Execution time: 1,250ms
-- Rows scanned: 10M (full table scan sa filter)
-- Index used: idx_tenant_status (8 GB size)
Tenant-By-Schema (PMFA pristup)
SET search_path TO tenant_312;
SELECT * FROM invoices WHERE status = 'unpaid' LIMIT 10;
-- Execution time: 12ms
-- Rows scanned: 50 (samo tenant 312 podaci)
-- Index used: idx_status (24 KB size)
Rezultat: 104x brže za male tenant-e.
Šta Sa Custom Fields Per Tenant?
Problem: Tenant A želi custom polje warranty_period, Tenant B ne želi.
Klasičan tenant-by-row:
- Dodaj
warranty_periodkolonu u sharedinvoicestabelu - 499 tenanta ima NULL vrednost (storage waste)
- Ili koristite JSONB kolonu (lose performance, no type safety)
PMFA tenant-by-schema:
# Tenant A metadata override
entity: Invoice
customFields:
- name: warrantyPeriod
type: integer
unit: months
validation:
min: 0
max: 60
PMFA generiše:
-- Tenant A
CREATE TABLE tenant_a.invoices (
id UUID PRIMARY KEY,
amount DECIMAL(15,2),
status VARCHAR(50),
warranty_period INTEGER CHECK (warranty_period BETWEEN 0 AND 60) -- CUSTOM
);
-- Tenant B (nema custom field)
CREATE TABLE tenant_b.invoices (
id UUID PRIMARY KEY,
amount DECIMAL(15,2),
status VARCHAR(50)
-- No warranty_period
);
Perfect type safety. Zero storage waste.
Downsides: Tenant-By-Schema Limitations
1. Schema Limit
PostgreSQL podržava ~9900 schema per database. Ako imate 10,000+ tenanta, treba vam database-per-tenant pattern (PMFA podržava).
2. Cross-Tenant Reporting
Analytics queries preko svih tenant-a su kompleksni:
-- Koliko ukupno unpaid invoices ima?
SELECT COUNT(*) FROM (
SELECT COUNT(*) FROM tenant_a.invoices WHERE status = 'unpaid'
UNION ALL
SELECT COUNT(*) FROM tenant_b.invoices WHERE status = 'unpaid'
-- ... x 500 tenants
);
PMFA rešenje: Separate analytics database sa tenant_id (ETL pipeline, read replika).
Zaključak
Tenant-by-row je anti-pattern za multi-tenant SaaS:
- Spor performance zbog shared indeksa
- Security rizik (jedan bug = sve tenant podatke leaks)
- Nema per-tenant optimizacija
Tenant-by-schema (PMFA pristup):
- ✅ 10x-100x brži queries
- ✅ Perfect isolation
- ✅ Custom fields per tenant
- ✅ Selective backups/restores
PMFA automatski generiše sve iz YAML metadata:
- Schema creation
- Table generation
- Index setup
- Query routing
Metadata-driven multi-tenancy = future of SaaS.
Zainteresovani za PMFA? Kontaktirajte nas na office@nonnotech.com za early access program.