Workflow¶
Data import¶

The mspypeline package supports the analysis of label-free shotgun proteomics data processed by the MaxQuant software. For a complete analysis, the mspypeline package uses several MaxQuant output tables; however, the minimal requirement for exploratory analysis is the proteinGroups.txt file, which contains the aggregated protein intensities.
mspypeline requires a strictly followed internal data format. To bring MaxQuant output into this format, the MQReader is provided. MQReader can reformat several output tables provided by MaxQuant, including the following text files:
proteinGroups
peptides
summary
parameters
msScans
msmsScans
evidence
Processing by MQReader includes:
the removal of proteins marked as “Reverse”, “Only identified by site” or “Potential contaminant”
the removal of proteins that are missing both an identified gene name and a FASTA header
the handling of proteins assigned an identical gene name (duplicates): their intensities are either summed or dropped
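The filtering steps listed above can be sketched with pandas. This is an illustration under stated assumptions, not the MQReader implementation: the column names follow MaxQuant's proteinGroups.txt, the toy values and the helper `filter_protein_groups` are hypothetical, and "drop" is interpreted here as discarding every protein whose gene name occurs more than once. Deriving a gene name from the FASTA header is not shown.

```python
import pandas as pd

# Toy stand-in for proteinGroups.txt (values are made up for illustration).
protein_groups = pd.DataFrame({
    "Gene names": ["ACTB", "GAPDH", "GAPDH", None, "TP53", "ALB"],
    "Fasta headers": ["h1", "h2", "h3", None, "h5", "h6"],
    "Reverse": [None, None, None, None, "+", None],
    "Only identified by site": [None] * 6,
    "Potential contaminant": [None, None, None, None, None, "+"],
    "Intensity A": [10.0, 20.0, 5.0, 7.0, 3.0, 9.0],
    "Intensity B": [12.0, 18.0, 6.0, 8.0, 4.0, 11.0],
})


def filter_protein_groups(df, duplicates="sum"):
    """Hypothetical helper mirroring the three steps described in the text."""
    # 1. Remove rows that MaxQuant flags with "+" in any of these columns.
    flag_cols = ["Reverse", "Only identified by site", "Potential contaminant"]
    df = df[~(df[flag_cols] == "+").any(axis=1)]

    # 2. Discard proteins missing both a gene name and a FASTA header.
    df = df[df["Gene names"].notna() | df["Fasta headers"].notna()]

    # 3. Handle proteins that share a gene name: sum or drop.
    intensity_cols = [c for c in df.columns if c.startswith("Intensity")]
    if duplicates == "sum":
        return df.groupby("Gene names")[intensity_cols].sum()
    # Assumption: "drop" removes all rows of a duplicated gene name.
    unique = df.drop_duplicates(subset="Gene names", keep=False)
    return unique.set_index("Gene names")[intensity_cols]


summed = filter_protein_groups(protein_groups, duplicates="sum")
dropped = filter_protein_groups(protein_groups, duplicates="drop")
print(summed)
print(dropped)
```

With the toy table, the reversed and contaminant entries and the unannotated row are removed, and the two GAPDH rows are either summed into one row or dropped entirely.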
Tip
To perform data analysis with different intensity types (e.g. LFQ or iBAQ), these options have to be selected in the MaxQuant analysis.
It is recommended to include the FASTA file header information for the detected proteins, as mspypeline uses the gene name annotation from these headers to index the detected proteins.
To ensure proper analysis, the samples of the experiment have to be named according to the naming convention.
Warning
The output folder, called txt, and all files it contains from a MaxQuant run must not be renamed, or the analysis will not work.
Quality control¶

Data Preprocessing¶

Intensity options¶
LFQ Intensity (“lfq_log2”) (extended information)
Raw Intensity (“raw_log2”)
iBAQ Intensity (“ibaq_log2”) (extended information)
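The "_log2" suffix of the option names suggests that intensities are analyzed on a log2 scale. A minimal sketch of such a transform with pandas and NumPy follows. Treating MaxQuant's zero intensities as missing values is an assumption made here for illustration, not a description of mspypeline's internals, and the sample and protein names are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical raw intensities: rows are proteins, columns are samples.
raw = pd.DataFrame(
    {"Intensity A": [1024.0, 0.0], "Intensity B": [2048.0, 512.0]},
    index=["ACTB", "GAPDH"],
)

# Assumption: an intensity of 0 means "not quantified", so it is masked as
# NaN before taking the logarithm instead of turning into -inf.
log2_intensities = np.log2(raw.where(raw > 0))
print(log2_intensities)
```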
Normalization options¶
No normalization
Median Normalization via: MedianNormalizer
Quantile Normalization with missing value handling via: QuantileNormalizer and interpolate_data()
Tail Robust Quantile Normalization (TRQN) via: TailRobustNormalizer and QuantileNormalizer
TRQN with missing value handling via: same as above and interpolate_data()
Tail Robust Median Normalization via: TailRobustNormalizer and MedianNormalizer
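To make the two basic ideas behind these options concrete, the following sketch implements a textbook median normalization and a textbook quantile normalization on a small table of log2 intensities. It does not use the mspypeline classes named above and may differ from them in detail: the functions `median_normalize` and `quantile_normalize` are hypothetical, the data contain no missing values (the case the list above handles with interpolate_data()), and ties are broken arbitrarily.

```python
import numpy as np
import pandas as pd

# Hypothetical log2 intensities: rows are proteins, columns are samples.
log2_intensities = pd.DataFrame(
    {"A": [5.0, 2.0, 3.0, 4.0], "B": [4.0, 1.0, 3.0, 2.0], "C": [3.0, 4.0, 6.0, 8.0]},
    index=["P1", "P2", "P3", "P4"],
)


def median_normalize(df):
    """Shift every sample (column) so that its median becomes 0."""
    return df - df.median(axis=0)


def quantile_normalize(df):
    """Give every sample the same distribution (complete data only)."""
    values = df.to_numpy(dtype=float)
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = np.sort(values, axis=0).mean(axis=1)
    # 0-based rank of each value within its own column.
    ranks = values.argsort(axis=0).argsort(axis=0)
    # Replace each value by the reference value of its rank.
    return pd.DataFrame(reference[ranks], index=df.index, columns=df.columns)


median_normalized = median_normalize(log2_intensities)
quantile_normalized = quantile_normalize(log2_intensities)
print(median_normalized)
print(quantile_normalized)
```

After median normalization all samples share the same median; after quantile normalization all samples contain exactly the same set of values, assigned according to each protein's rank within its sample.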
See also plot_normalization_overview() and plot_heatmap_overview_all_normalizers().
Exploratory Analysis¶
Create outlier detection and comparison plots¶

Create statistical inference plots¶
for the plot_pathway_analysis() an independent t-test is applied
for the plot_go_analysis() a Fisher’s exact test is applied
for the plot_r_volcano() the moderated t-statistic is applied, which is implemented by the R package limma. Additional R packages might be downloaded when this plot is created for the first time.
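As an illustration of how a Fisher’s exact test can quantify GO term enrichment, the sketch below computes a one-sided p-value from the hypergeometric distribution using only the standard library. The function `fisher_exact_enrichment`, the one-sided alternative, and the choice of which protein sets are compared are assumptions for this example; the text above only states that a Fisher’s exact test is applied.

```python
from math import comb


def fisher_exact_enrichment(overlap, list_size, term_size, background_size):
    """One-sided Fisher's exact test p-value, P(X >= overlap).

    background_size: number of proteins in the background set
    term_size:       background proteins annotated with the GO term
    list_size:       proteins in the list of interest
    overlap:         proteins of interest annotated with the GO term
    """
    total = comb(background_size, list_size)
    upper = min(list_size, term_size)
    # Sum the hypergeometric probabilities of all overlaps at least as
    # large as the observed one.
    favourable = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 20 background proteins, 5 carry the GO term,
# 4 proteins are in the list of interest and 3 of them carry the term.
p_value = fisher_exact_enrichment(overlap=3, list_size=4, term_size=5, background_size=20)
print(p_value)
```

A small p-value indicates that the term is found more often among the proteins of interest than expected by chance.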
Select pathways and GO-Terms of interest¶
Select Pathways. Selecting pathways has the following effects:
for the plot_pathway_analysis() one plot per pathway is created
in the plot_rank(), if a protein is found it is marked on the plot and colored by the pathway
in the plot_r_volcano(), if a pathway is selected, proteins of that pathway are annotated in the plot instead of the most significant proteins, which are annotated by default
Select GO Terms. Selecting GO terms has the following effect:
for the plot_go_analysis() one additional barplot is added per GO term