Overview
Data profiling is the first step of every serious data cleaning project, and the step most often skipped in favor of jumping straight to cleaning. The result is cleaning work that addresses the visible problems while missing the structural issues — the cross-variable inconsistencies, the hidden duplicate keys, the columns that look populated but contain only whitespace.
A systematic data profile is not a summary statistics table. It is a quality audit: per-variable statistics that reveal distribution anomalies, cross-variable checks that reveal logical inconsistencies, completeness analysis that distinguishes structured missingness from random gaps, and a quality score that prioritizes remediation effort.
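To make the difference from a summary table concrete, here is a minimal sketch of three such audit checks in plain Python: a blank rate that counts whitespace-only values as missing, a duplicate-key check, and one cross-variable rule. The sample rows, the column names (`order_id`, `notes`, `order_date`, `ship_date`), and the helper functions are all hypothetical and chosen only for illustration. They are not part of the prompt's output.

```python
from collections import Counter

# Hypothetical sample rows; column names are illustrative assumptions.
rows = [
    {"order_id": "A1", "notes": "   ", "order_date": "2024-01-05", "ship_date": "2024-01-07"},
    {"order_id": "A2", "notes": "", "order_date": "2024-01-06", "ship_date": "2024-01-04"},
    {"order_id": "A2", "notes": " ", "order_date": "2024-01-08", "ship_date": None},
]


def blank_rate(rows, column):
    """Share of rows where the value is None, empty, or whitespace only."""
    blank = sum(
        1 for r in rows
        if r.get(column) is None or not str(r[column]).strip()
    )
    return blank / len(rows)


def duplicate_keys(rows, key):
    """Key values that occur more than once."""
    counts = Counter(r[key] for r in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def shipped_before_ordered(rows):
    """Cross-variable check: ship_date must not precede order_date.

    ISO 8601 date strings compare correctly as plain strings.
    Rows with a missing date are skipped rather than flagged.
    """
    return [
        r["order_id"] for r in rows
        if r["order_date"] and r["ship_date"] and r["ship_date"] < r["order_date"]
    ]


profile = {
    "notes_blank_rate": blank_rate(rows, "notes"),
    "duplicate_order_ids": duplicate_keys(rows, "order_id"),
    "ship_before_order": shipped_before_ordered(rows),
}
print(profile)
```

A plain non-null count would report the `notes` column as mostly populated. The blank-rate check shows that it carries no usable content in any row.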
The Data Profiling & Quality Audit Prompt generates a complete profiling framework: the full set of statistics to compute per variable type, cross-variable checks, quality scoring methodology, and a remediation backlog format that translates audit findings into actionable cleaning tasks.
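As a rough illustration of how a weighted quality score and a prioritized backlog might fit together, the sketch below uses a weighted average over three dimensions and ranks findings by severity multiplied by the share of rows affected. The dimension names, the weights, the priority formula, and the example findings are assumptions made for this example. The methodology the prompt actually generates may differ.

```python
# Hypothetical dimension weights; assumed values, not the prompt's methodology.
WEIGHTS = {"completeness": 0.4, "uniqueness": 0.3, "consistency": 0.3}


def quality_score(dimension_scores, weights=WEIGHTS):
    """Weighted average of per-dimension scores, each expected in [0, 1]."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[d] * dimension_scores[d] for d in weights)
    return weighted_sum / total_weight


def prioritize(findings):
    """Sort findings by severity times affected share, highest first."""
    return sorted(
        findings,
        key=lambda f: f["severity"] * f["affected_share"],
        reverse=True,
    )


# Illustrative inputs only.
score = quality_score({"completeness": 0.5, "uniqueness": 1.0, "consistency": 0.8})
backlog = prioritize([
    {"task": "Trim whitespace-only values in notes", "severity": 2, "affected_share": 0.5},
    {"task": "Resolve duplicate order_id keys", "severity": 5, "affected_share": 0.25},
])

print(round(score, 2))
for item in backlog:
    print(item["task"])
```

Dividing by the sum of the weights keeps the score in the 0 to 1 range even when the weights do not add up to exactly 1.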
What you get:
- Per-variable profiling specification by data type
- Cross-variable consistency checks
- Quality scoring methodology with dimension weights
- Automated profiling tool configuration
- Remediation backlog format with priority scoring
Built for: data analysts, data engineers, and data scientists beginning a new dataset, onboarding a new data source, or conducting a periodic quality review.