Perform differential expression analysis separately for each identified cell cluster to determine if certain subpopulations exhibit distinct responses. This approach helps identify genes or pathways specifically affected within sensitive subpopulations.
Differential expression analysis (DEA) in subpopulations is critical for identifying how specific perturbations affect gene or feature expression. MATLAB offers various tools and methods for conducting DEA, particularly suited for analyzing datasets from experiments like single-cell RNA sequencing or other high-dimensional studies. Here's how to approach differential expression analysis in subpopulations with MATLAB:
Ensure your data is appropriately formatted and preprocessed before running differential expression analysis:
Example of Normalizing Data:
% Assuming `expressionData` is a genes-by-samples matrix of raw gene expression data
expressionDataNorm = log2(expressionData + 1); % Log-transform to reduce skewness
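A log transform alone does not correct for differences in sequencing depth between samples. The following is a minimal sketch of one common preprocessing sequence (filter rarely detected genes, scale each sample to a common library size, then log-transform). The toy count matrix, the detection threshold of 2 samples, and the variable names `keep`, `filtered`, `libSize`, and `scaled` are illustrative assumptions, not part of the original workflow.

```matlab
% Hypothetical toy data: 4 genes (rows) x 3 samples (columns) of raw counts
expressionData = [10 0 5; 200 150 300; 0 0 1; 50 40 60];

% Keep genes detected (count > 0) in at least 2 samples (threshold is an assumption)
keep = sum(expressionData > 0, 2) >= 2;
filtered = expressionData(keep, :);

% Scale every sample to the median library size
libSize = sum(filtered, 1);                        % total counts per sample
scaled = filtered ./ libSize * median(libSize);    % implicit expansion, R2016b or later

% Log-transform to reduce skewness, as in the example above
expressionDataNorm = log2(scaled + 1);
```

On releases older than R2016b, `bsxfun(@rdivide, filtered, libSize)` can replace the implicit expansion.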
To analyze differential expression within subpopulations, you must first identify and label these groups:
Use kmeans, hierarchical clustering (linkage and cluster), or fitgmdist for subpopulation identification.
Example of Clustering for Subpopulation Identification:
% Perform K-means clustering to identify subpopulations
idx = kmeans(expressionDataNorm', 3, 'Replicates', 10); % Transpose so samples are rows; replicates reduce sensitivity to random initialization
subpopulationLabels = idx;
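The other clustering options mentioned above can be sketched in the same way. The example below uses hypothetical synthetic data (60 samples, 5 features, three well-separated groups) in place of real expression data. The Ward linkage, the choice of 3 clusters, and the regularization value are assumptions chosen for illustration. With real high-dimensional data, reducing dimensionality first (for example with `pca`) is usually needed before `fitgmdist` will fit reliably.

```matlab
rng(0);                                   % fixed seed so the sketch is repeatable
% Hypothetical data: samples as rows, features as columns
X = [randn(20, 5); randn(20, 5) + 4; randn(20, 5) - 4];

% Hierarchical clustering: build the tree, then cut it into 3 clusters
Z = linkage(X, 'ward');
idxHier = cluster(Z, 'MaxClust', 3);

% Gaussian mixture model: fit 3 components, then assign each sample
gm = fitgmdist(X, 3, 'RegularizationValue', 0.01, 'Replicates', 5);
idxGmm = cluster(gm, X);

% Cross-tabulate the two labelings to see how well they agree
agreement = crosstab(idxHier, idxGmm);
```

Cluster numbers are arbitrary, so two methods can agree on the grouping while using different label values. The cross-tabulation makes that visible.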
To perform DEA between subpopulations:
Use ttest2 for simple two-group comparisons or ANOVA (anova1) for multi-group comparisons.
Use ranksum for non-parametric testing.
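The tests above can be combined into a one-versus-rest comparison for a single subpopulation. The following is a minimal sketch under stated assumptions, not a definitive pipeline. The data are synthetic stand-ins (200 genes, 60 samples, with genes 1 to 10 deliberately shifted in group 1), and the labels are fixed rather than taken from clustering. The Welch variant of the t-test, the hand-written Benjamini-Hochberg adjustment, and the 0.05 cutoff are illustrative choices. Because thousands of genes are usually tested at once, some multiple-testing correction of this kind is needed.

```matlab
rng(1);                                   % fixed seed so the sketch is repeatable
% Hypothetical stand-in for log-transformed data: genes as rows, samples as columns
nGenes = 200; nSamples = 60;
expressionDataNorm = randn(nGenes, nSamples);
subpopulationLabels = repelem((1:3)', nSamples / 3);   % replace with clustering output
inK = subpopulationLabels == 1;
expressionDataNorm(1:10, inK) = expressionDataNorm(1:10, inK) + 3;  % spiked-in signal

% One-versus-rest split for subpopulation 1
inGroup  = expressionDataNorm(:, inK);
outGroup = expressionDataNorm(:, ~inK);

% Welch t-test for all genes at once ('Dim', 2 tests across samples)
[~, pT] = ttest2(inGroup, outGroup, 'Dim', 2, 'Vartype', 'unequal');

% Rank-sum test works on vectors, so loop over genes
pW = zeros(nGenes, 1);
for g = 1:nGenes
    pW(g) = ranksum(inGroup(g, :), outGroup(g, :));
end

% One-way ANOVA across all three groups, shown for a single gene
pAnova = anova1(expressionDataNorm(1, :)', subpopulationLabels, 'off');

% Benjamini-Hochberg adjustment of the t-test p-values
[pSorted, order] = sort(pT);
adj = pSorted .* nGenes ./ (1:nGenes)';
adj = min(flipud(cummin(flipud(adj))), 1);   % enforce monotonicity, cap at 1
pAdj = zeros(nGenes, 1);
pAdj(order) = adj;

% Difference of mean log2 values, an approximate log2 fold change
log2FC = mean(inGroup, 2) - mean(outGroup, 2);
sigGenes = find(pAdj < 0.05);
```

Repeating the split for each value in `subpopulationLabels` gives one gene list per subpopulation. If the Bioinformatics Toolbox is available, `mafdr(pT, 'BHFDR', true)` can replace the manual adjustment.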