Overview
Dimensionality reduction is not a preprocessing step to be applied automatically before modeling. It is an analytical decision with consequences. PCA maximizes variance explained, but each component is a linear combination of all original variables and is often hard to interpret. t-SNE produces visually compelling cluster plots, but the distances between clusters, and the apparent cluster sizes, are not reliably meaningful. UMAP often preserves more global structure than t-SNE, but its hyperparameters (notably `n_neighbors` and `min_dist`) strongly affect the result.
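To make the interpretability point concrete, here is a minimal sketch of PCA computed with a plain SVD in NumPy. The synthetic data, the seed, and the variable names are assumptions chosen for illustration, not part of the prompt's output. It shows that every component carries a weight on every original feature.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Synthetic data (an assumption for illustration): 200 samples, 5 features,
# driven by 2 hidden factors plus a small amount of noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Share of total variance captured by each component.
explained_variance_ratio = S**2 / np.sum(S**2)

# Row i holds the weights of component i on each original feature.
loadings = Vt

print("explained variance ratio:", np.round(explained_variance_ratio, 3))
print("first component weights: ", np.round(loadings[0], 3))
```

The printed weights for the first component are typically all nonzero, which is why a component rarely maps onto a single named variable.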
The right dimensionality reduction method depends on the goal: if the goal is visualization, t-SNE or UMAP; if the goal is feature compression for a linear model, PCA; if the goal is identifying latent constructs, factor analysis; if the goal is removing multicollinearity, PCA or partial least squares. Using the wrong method for the goal produces a technically correct analysis that answers the wrong question.
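The goal-to-method pairing above can be written down as a small lookup table. This is only a sketch: `METHODS_BY_GOAL` and `suggest_methods` are hypothetical names introduced here, and the entries simply restate the pairings in the paragraph.

```python
# Hypothetical lookup table restating the goal-to-method pairing described above.
METHODS_BY_GOAL = {
    "visualization": ["t-SNE", "UMAP"],
    "feature compression for a linear model": ["PCA"],
    "identifying latent constructs": ["factor analysis"],
    "removing multicollinearity": ["PCA", "partial least squares"],
}


def suggest_methods(goal: str) -> list[str]:
    """Return candidate methods for a stated goal; raise if the goal is unknown."""
    key = goal.strip().lower()
    if key not in METHODS_BY_GOAL:
        known = ", ".join(sorted(METHODS_BY_GOAL))
        raise ValueError(f"Unknown goal {goal!r}; expected one of: {known}")
    # Return a copy so callers cannot mutate the table.
    return list(METHODS_BY_GOAL[key])


print(suggest_methods("visualization"))
```

Raising on an unknown goal, rather than falling back to a default method, keeps the choice explicit.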
The Multivariate Analysis & Dimensionality Reduction Prompt generates a complete multivariate analysis framework: method selection by analytical goal, implementation with hyperparameter specification, component interpretation, and a validation framework that tests whether the reduction preserved the structure that matters for the downstream task.
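As a rough illustration of what such a validation step might look like, the sketch below fits PCA on training rows only, then checks held-out reconstruction error and compares a downstream linear regression with and without the reduction. The synthetic data, the choice of `k = 2`, and the helper names `r_squared` and `fit_predict` are assumptions made for this example. A real validation would use the actual downstream model and metric.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (an assumption): 300 samples, 6 features built from 2 hidden
# factors, with a target that depends on those hidden factors.
latent = rng.normal(size=(300, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(300, 6))
y = latent @ np.array([1.5, -2.0]) + 0.1 * rng.normal(size=300)

# Hold out the last 100 rows so both checks use unseen data.
X_train, X_test, y_train, y_test = X[:200], X[200:], y[:200], y[200:]


def r_squared(y_true, y_pred):
    """Coefficient of determination on the given rows."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_predict(A_train, b_train, A_test):
    """Ordinary least squares with an intercept, fitted on the training rows."""
    A1 = np.column_stack([np.ones(len(A_train)), A_train])
    coef, *_ = np.linalg.lstsq(A1, b_train, rcond=None)
    return np.column_stack([np.ones(len(A_test)), A_test]) @ coef


# Fit PCA on the training rows only, keeping k components.
k = 2
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
components = Vt[:k]

Z_train = (X_train - mean) @ components.T
Z_test = (X_test - mean) @ components.T

# Check 1: reconstruction error on held-out rows.
X_test_hat = Z_test @ components + mean
reconstruction_mse = np.mean((X_test - X_test_hat) ** 2)

# Check 2: downstream performance with and without the reduction.
r2_full = r_squared(y_test, fit_predict(X_train, y_train, X_test))
r2_reduced = r_squared(y_test, fit_predict(Z_train, y_train, Z_test))

print(f"held-out reconstruction MSE: {reconstruction_mse:.4f}")
print(f"R^2 on all features:         {r2_full:.3f}")
print(f"R^2 on {k} components:         {r2_reduced:.3f}")
```

A low reconstruction error alone does not show that the reduction kept what the downstream task needs, which is why the second check compares task performance directly.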
What you get:
- Method selection matrix by analytical goal
- PCA, factor analysis, t-SNE, and UMAP implementation
- Component and factor interpretation
- Variance explained and reconstruction error assessment
- Downstream task validation
Built for: data scientists and analysts working with high-dimensional data who need to reduce dimensionality in a way that preserves the structure relevant to their analytical goal.