galaxy wrapper for the scp tool #625

Merged
merged 85 commits into from
Jan 22, 2025
Changes from 70 commits
210fe93
initial macros including filtering
KristinaGomoryova Nov 25, 2024
51b1726
peptide aggregation macro updated
KristinaGomoryova Nov 25, 2024
1575646
peptide filtering section macro updated
KristinaGomoryova Nov 25, 2024
1e34240
peptides processing macro section
KristinaGomoryova Nov 25, 2024
5c89fe4
macros for all scp parameters
KristinaGomoryova Nov 26, 2024
6e48b5f
initial commit of the config file for scp
KristinaGomoryova Nov 26, 2024
c1ea017
initial version of full scp workflow
KristinaGomoryova Nov 29, 2024
eca415a
tool versions fixed
KristinaGomoryova Nov 29, 2024
f0badcc
mostly col select corrected, fails on medianCV filtering
KristinaGomoryova Dec 5, 2024
59d933d
fixed filter_median_CV
KristinaGomoryova Dec 9, 2024
f936cb4
filter based on median CV returns non-zero ncols
KristinaGomoryova Dec 9, 2024
b4d95ed
cbind outcommented, plots saving within IF...END
KristinaGomoryova Dec 11, 2024
0041081
run_UMAP typo corrected
KristinaGomoryova Dec 11, 2024
abf8f7c
outputs available for final table or all intermediate results
KristinaGomoryova Dec 11, 2024
66ae9d6
form parameters updated
KristinaGomoryova Dec 12, 2024
1923445
function for intermediate tables export
KristinaGomoryova Dec 12, 2024
4f0a166
tested median CV filtering
KristinaGomoryova Dec 12, 2024
c292e7d
QC plots generated
KristinaGomoryova Dec 13, 2024
84c0c61
collection output corrected
KristinaGomoryova Dec 13, 2024
a6e5a40
boxplots and heatmap for missing data added
KristinaGomoryova Dec 13, 2024
900d55a
heatmap size changed
KristinaGomoryova Dec 16, 2024
cf88b2f
test data
KristinaGomoryova Dec 16, 2024
8e3c91d
help section for the scp tool
KristinaGomoryova Dec 16, 2024
cbaa24a
test for plots generation passing
KristinaGomoryova Dec 18, 2024
c91fe0b
test for intermediate outputs passing
KristinaGomoryova Dec 18, 2024
4149ae3
version updated to 1.16.0
KristinaGomoryova Jan 6, 2025
70e68e6
requirements section added
KristinaGomoryova Jan 6, 2025
fa65f0a
shed file created
KristinaGomoryova Jan 6, 2025
0f1fdda
Helge added
KristinaGomoryova Jan 6, 2025
93c531a
new line in the end of script introduced
KristinaGomoryova Jan 6, 2025
f9bfd9f
option "hidden" from param removed
KristinaGomoryova Jan 6, 2025
61e8751
expect_num_outputs added
KristinaGomoryova Jan 6, 2025
1e29b19
formatting fixed
KristinaGomoryova Jan 6, 2025
adb9528
formatting changed
KristinaGomoryova Jan 6, 2025
162a446
formatting fixed
KristinaGomoryova Jan 6, 2025
76251f8
fixed formatting
KristinaGomoryova Jan 6, 2025
77605d1
fixed formatting
KristinaGomoryova Jan 6, 2025
ba66811
fixed formatting
KristinaGomoryova Jan 6, 2025
c4a8112
added missing deps
hechth Jan 7, 2025
48e5323
Single Cell category added
KristinaGomoryova Jan 8, 2025
212f992
spaces removed
KristinaGomoryova Jan 8, 2025
43012b0
quote=F arg added to write.table
KristinaGomoryova Jan 8, 2025
7d74a2a
Processed_data format set to tabular instead txt
KristinaGomoryova Jan 8, 2025
ba4f0e2
scplainer citation added
KristinaGomoryova Jan 8, 2025
798651b
spaces removed
KristinaGomoryova Jan 8, 2025
a8015d0
exported tables format set to tabular
KristinaGomoryova Jan 8, 2025
76b9cb5
SCP_HELP token deleted
KristinaGomoryova Jan 8, 2025
9b18c3e
export scp object as RData
KristinaGomoryova Jan 8, 2025
6ef1e99
test for scp Rdata
KristinaGomoryova Jan 8, 2025
cbc7122
png format of plots changed to pdf
KristinaGomoryova Jan 8, 2025
c99c693
plots converted from png to pdf
KristinaGomoryova Jan 8, 2025
c954ea6
saving heatmap as pdf corrected
KristinaGomoryova Jan 8, 2025
5276c78
description adjusted
KristinaGomoryova Jan 8, 2025
f0a6929
aggregation options are now in a macro
KristinaGomoryova Jan 8, 2025
3228192
normalization options in a macro
KristinaGomoryova Jan 8, 2025
a5153c6
include field removed
KristinaGomoryova Jan 8, 2025
d1772be
space added in quote = F
KristinaGomoryova Jan 8, 2025
1d27c5d
min to median intensity threshold set
KristinaGomoryova Jan 8, 2025
1799c28
peptide aggregation column done via column choice, not a text field
KristinaGomoryova Jan 8, 2025
5757ba0
protein aggregation column done via select, not a text
KristinaGomoryova Jan 8, 2025
ad6d2cb
normalizations explained
KristinaGomoryova Jan 8, 2025
50c8f77
aggregation methods described
KristinaGomoryova Jan 8, 2025
f497a21
min and max of median CV threshold set
KristinaGomoryova Jan 8, 2025
f4e336b
updated files and changed to PDFs
hechth Jan 9, 2025
3825158
removed pngs
hechth Jan 9, 2025
7576d18
Update tools/scp/.shed.yml
hechth Jan 9, 2025
4df9dd4
Update tools/scp/.shed.yml
hechth Jan 9, 2025
dc6b5d6
added required files
hechth Jan 9, 2025
e6d2f36
added xref to bio.tools entry
hechth Jan 9, 2025
0b48c7a
switched to sim size diff
hechth Jan 9, 2025
489c178
option to color PCA based on sampleAnnotation variable added
KristinaGomoryova Jan 10, 2025
daea853
added option to export R script directly
hechth Jan 10, 2025
230a1bb
changed intermediate output format to tabular
hechth Jan 10, 2025
5b0e66e
save scp object as rds instead of RData
KristinaGomoryova Jan 13, 2025
2610ed7
utils.R renamed to utils.r
KristinaGomoryova Jan 13, 2025
64d9060
utils.r was missing in command detect_errors
KristinaGomoryova Jan 13, 2025
76ae91c
dependencies explicitly stated
KristinaGomoryova Jan 13, 2025
2c66332
name changed to bioconductor-scp
KristinaGomoryova Jan 13, 2025
23b8895
refs changed to scp
KristinaGomoryova Jan 13, 2025
98b5339
requirements corrected
KristinaGomoryova Jan 13, 2025
3d47dac
renamed to bioconductor-scp
hechth Jan 21, 2025
5e4631a
updated remote url
hechth Jan 21, 2025
cc3286e
renamed tool to bioconductor-scp
hechth Jan 21, 2025
605a45c
small format fixes
hechth Jan 21, 2025
5adfebd
fixed tests and namespaces
hechth Jan 21, 2025
10 changes: 10 additions & 0 deletions tools/scp/.shed.yml
@@ -0,0 +1,10 @@
owner: recetox
remote_repository_url: "https://github.com/RECETOX/galaxytools/tree/master/tools/scp"
homepage_url: "https://uclouvain-cbio.github.io/scp/index.html"
categories:
- Proteomics
- Single Cell
description: "scp is a package for the single cell proteomics data processing."
long_description: "scp is an R package for the analysis of mass spectrometry-based single cell proteomics data. It builds on the QFeatures package and allows aggregation to peptide or protein level, data transformation such as log2 transformation or normalization, batch correction and imputation of missing values. It also provides several quality control metrics."
type: unrestricted
name: bioconductor_scp
38 changes: 38 additions & 0 deletions tools/scp/help.xml
@@ -0,0 +1,38 @@
<macros>

<token name="@GENERAL_HELP@">
Scp Help section
===================

Overview
--------
The `scp` tool facilitates the processing of mass spectrometry-based single-cell proteomics (SCP) data. It builds on the `scp` R package developed in the laboratory of Prof. Laurent Gatto and provides functions for filtering at the peptide-spectrum match (PSM), peptide and protein levels, as well as for normalization, transformation and imputation of missing values.

The source code is available on GitHub_ and Bioconductor_; problems can be reported via the issues_ tracker.

.. _GitHub: https://github.com/UCLouvain-CBIO/scp/
.. _issues: https://github.com/UCLouvain-CBIO/scp/issues
.. _Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/scp.html

Workflow
--------

The scp workflow currently supports the processing of MaxQuant results and requires two input files:

- evidence.txt file (output from MaxQuant)
- sampleAnnotation file (provided by the user). The sampleAnnotation file is a metadata file describing the individual samples (quantification column names, batches, sample types, etc.). Please note that the run identifier column MUST be present in both the evidence and sampleAnnotation files.

The workflow starts at the PSM level. First, the data are filtered extensively to keep only the most reliable identifications: reverse sequences and potential contaminants are removed, as are PSMs that fall below a parental ion fraction threshold or do not pass a q-value threshold. Batches with very few features are also excluded.

Subsequently, PSMs are aggregated to the peptide level, where a further filter based on median relative intensity or median CV is applied. Peptide-level intensities are then normalized and log2 transformed.

These intensities are then aggregated to the protein level, where they are normalized again and missing values are imputed.

Because batch effects are unavoidable in single-cell data, scp offers two methods for batch correction: ComBat and removeBatchEffect() from the limma package.

Finally, dimension reduction by PCA or UMAP (computed on the PCA components) is available. The PCA and UMAP plots are provided together with the optional quality control plots in the `Plots` collection.

The final log2-transformed, normalized, imputed and batch-corrected data are provided, with the option to also export intermediate results.

Because handling the internal data formats is complex, we opted for a single form with pre-defined settings for the whole processing pipeline. However, we highly recommend checking the QC plots and intermediate results and adjusting the workflow settings accordingly.
</token>
</macros>
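The help text above requires the run identifier column to be present in both the evidence and sampleAnnotation files. A quick pre-upload check of that requirement could be sketched as follows. This is not part of the wrapper: the helper names `read_column` and `compare_run_ids` are hypothetical, the column name `Raw.file` is only an assumed example, and both inputs are assumed to be tab-separated with a header row.

```python
import csv
import io


def read_column(handle, column):
    """Return the distinct non-empty values of `column` in a tab-separated table."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in header")
    return {row[column] for row in reader if row[column]}


def compare_run_ids(evidence_handle, annotation_handle, run_column):
    """Compare run identifiers between the two input tables.

    Returns two sorted lists: runs found only in the evidence table and
    runs found only in the sampleAnnotation table.
    """
    evidence_runs = read_column(evidence_handle, run_column)
    annotation_runs = read_column(annotation_handle, run_column)
    return (
        sorted(evidence_runs - annotation_runs),
        sorted(annotation_runs - evidence_runs),
    )


# Tiny in-memory stand-ins for the two input files (made-up values;
# "Raw.file" is an assumed column name, adjust it to your data).
evidence = io.StringIO("Raw.file\tSequence\nrun1\tPEPTIDE\nrun2\tPEPTIDER\n")
annotation = io.StringIO("Raw.file\tSampleType\nrun1\tCell\nrun3\tBlank\n")

only_evidence, only_annotation = compare_run_ids(evidence, annotation, "Raw.file")
print("runs missing from sampleAnnotation:", only_evidence)
print("runs missing from evidence:", only_annotation)
```

With real data, the two `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`. Two empty lists mean that every run is described in both files.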
248 changes: 248 additions & 0 deletions tools/scp/macros.xml

Large diffs are not rendered by default.
