Read the Research Paper
Long-form academic writeup of all seven findings, the LOUO transfer benchmark, the Finding 6 flip at 100 users, and the Finding 7 metric-artifact reframing. PDF for download/print, HTML for in-browser reading, and a one-page executive summary for the time-pressed reader.
Paper: PDF (Primary)
16-page xelatex render with embedded figures, references, and reproducibility appendix. Best for download, print, and citation.
Open PDF
Paper: HTML
Self-contained HTML version (figures inlined as base64, MathML math, mobile-responsive). Reads cleanly on phone or desktop; no notebook viewer needed.
Open HTML
Executive Summary (1 page)
One-page distillation: the problem, the three findings that matter, one figure (sigmoid vs linear vs exponential), and the implication for practitioners.
Open Summary
Research Status
Var(log_k) = 0, which makes R² mathematically divergent regardless of absolute prediction error, because the total sum of squares in its denominator is zero. The deployable fix is a metric guard (R² → MAE for low-variance users), not a separate model.
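A minimal sketch of such a metric guard, assuming per-user arrays of true and predicted log_k values. The function name `guarded_score` and the `var_floor` threshold are hypothetical and are not taken from the project code; the real cutoff for "low-variance" may differ.

```python
import numpy as np

def guarded_score(y_true, y_pred, var_floor=1e-8):
    """Return ("r2", value) when the target has usable variance, else ("mae", value).

    var_floor is an illustrative threshold, not a value from the paper.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Total sum of squares: the denominator of R²
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    # Guard: near-zero target variance makes R² blow up, so report MAE instead
    if ss_tot / y_true.size <= var_floor:
        return "mae", float(np.mean(np.abs(y_true - y_pred)))
    ss_res = np.sum((y_true - y_pred) ** 2)
    return "r2", float(1.0 - ss_res / ss_tot)

# A user whose log_k never changes falls back to MAE
metric_flat, value_flat = guarded_score([2.0, 2.0, 2.0], [2.1, 1.9, 2.0])
# A user with real variance keeps R²
metric_var, value_var = guarded_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Returning the metric name alongside the value keeps the two scales from being mixed silently when per-user results are aggregated.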
Source & Supplementary (2026-05-13)
Full Writeup (Markdown)
RESEARCH_FINDINGS.md: original Markdown source for the 7 findings, including the Finding 6 flip (30 → 100 users) and the Finding 7 reframing (outliers are a metric artifact, not a spatial failure).
Read on GitHub
Narrative Notebook (New)
privacy_decay_findings.ipynb: 32-cell guided tour through all 7 findings with reproduced figures, plain-language Finding 7 explainer, and the spatial/temporal tension finale.
Open Notebook
Living Notebook
02_Real_Data_Findings.ipynb: runnable notebook with tables, plots, and the convex-hull / log_k-variance analysis. Auto-refreshed by the Sunday real-data cron.
Open Notebook
Per-User R² Data
JSON dumps with per-user results, outlier rosters, and bootstrap CIs for every analysis (LOUO, fine-tune, hull-membership).
Browse JSONs
Experiment Visualizations
Model Comparison (Latest)
Compare Random Forest, Gradient Boosting, Neural Network, and Ridge Regression performance.
View Chart
Custom Decay Curves
Visual comparison of 6 different privacy decay strategies over 2 years.
View Chart
Feature Engineering Impact
Analysis showing how different feature sets affect model performance.
View Chart
Privacy Theories Comparison
Validation of different privacy protection theories (age-only, context-first, hybrid, data-type specific).
View Chart
MVP Comprehensive Analysis
Complete MVP analysis with decay curves, distributions, and privacy patterns by data type.
View Chart
ML Model Performance
Detailed ML model metrics including predictions vs actual, residuals, and feature importance.
View Chart
Ollama Model Benchmarks
Model Benchmark Results (Baseline)
Performance comparison of tinyllama, phi, llama3.2, and mistral models across privacy-specific tasks.
View Chart · JSON Report
Recommended Models
Default: tinyllama:1.1b (29 tok/s, 4.6s response)
Critical: mistral:7b (80.3 quality, best accuracy)
Automated Test Results
Real-Data Cron (Sundays 06:00 UTC)
Weekly Kaggle benchmark (GDPR + HIPAA + GeoLife): runs LOUO transfer, per-user fine-tune ablation, sigmoid fits, and re-executes the findings notebook in place.
View Cron Log
Phase 3 Research JSONs
Per-analysis output: louo_transfer.json, per_user_finetune.json, convex_hull_outliers.json, etc. Each includes per-user dumps + outlier rosters + bootstrap CIs.
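For readers unfamiliar with how a bootstrap CI over per-user results can be computed, here is a rough sketch using a percentile bootstrap of the mean. The source does not say which bootstrap variant, replicate count, or statistic the project uses, so the function `bootstrap_ci`, its defaults, and the sample values are all illustrative assumptions.

```python
import numpy as np

def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of per-user values."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample users with replacement; one row of indices per bootstrap replicate
    idx = rng.integers(0, values.size, size=(n_boot, values.size))
    means = values[idx].mean(axis=1)
    lo, hi = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)

# Made-up per-user errors, for illustration only (not project data)
per_user_mae = [0.12, 0.08, 0.15, 0.10, 0.09, 0.14, 0.11]
lo, hi = bootstrap_ci(per_user_mae)
```

Resampling at the user level, rather than at the row level, treats each user as one observation, which matches the per-user framing of the JSON dumps.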
Original Nightly Tests
Earlier nightly Gradient Boosting tests on synthetic data (pre-real-data pivot). Kept for historical comparison.
Browse Directory
System Logs
Cron execution logs across all scheduled jobs (real-data, daily, weekly tests).
Browse Logs
Data & Documentation
Datasets (Real Data)
GeoLife GPS (48,406 rows / 100 users), GDPR Fines (212 records, 2018-2020), HIPAA Breaches (1,632 records, 2009-2017), plus the original synthetic baselines (5,000 samples).
Browse Datasets
All Visualizations
Browse complete directory of all experiment visualizations and charts.
Browse All
Local Reproduction
The full Docker stack (Jupyter, TimescaleDB, Redis, Ollama) reproduces every figure in the paper from raw Kaggle datasets. Credentials are read from a local .env file; see .env.example in the repo.
Service ports (after docker compose up): Jupyter :8889 · Postgres/TimescaleDB :5433 · Redis :6380 · Ollama :11435