Proteomics
JoBBurt/proteomics-skill
SKILL.md
name: Proteomics
description: Proteomics analysis toolkit for label-free quantitative proteomics. Invokes R scripts for normalization, visualization (volcano, heatmap, PCA, LOPIT), pathway analysis (KEGG, ConsensusPathDB), and protein list cross-referencing (MISEV2018, SASP, Matrisome). USE WHEN user says 'analyze proteomics', 'volcano plot', 'normalize protein data', 'pathway enrichment', 'check EV markers', 'SASP analysis', 'matrisome', OR mentions q-value, fold-change, or protein quantification.
Proteomics
Quantitative proteomics analysis toolkit combining R script invocation with embedded methodology knowledge. Fully portable - all scripts and reference data included.
Skill Directory: ~/.claude/Skills/Proteomics/
Workflow Routing
When executing a workflow, output this notification:
Running the **WorkflowName** workflow from the **Proteomics** skill...
| Workflow | Trigger | File |
|---|---|---|
| Normalize | "normalize data", "apply normalization", "median/quantile/loess normalize" | workflows/Normalize.md |
| VolcanoPlot | "volcano plot", "create volcano", "visualize fold change" | workflows/VolcanoPlot.md |
| Heatmap | "heatmap", "PCA", "correlation plot", "sample clustering" | workflows/Heatmap.md |
| PathwayAnalysis | "pathway analysis", "KEGG enrichment", "ConsensusPathDB", "GO enrichment" | workflows/PathwayAnalysis.md |
| ProteinListQuery | "check EV markers", "MISEV proteins", "exosome markers", "blood contaminants" | workflows/ProteinListQuery.md |
| ExcelWorkup | "create Excel report", "filter by q-value", "generate data tables" | workflows/ExcelWorkup.md |
| Matrisome | "matrisome analysis", "ECM proteins", "extracellular matrix" | workflows/Matrisome.md |
| SaspAnalysis | "SASP analysis", "senescence factors", "core SASP" | workflows/SaspAnalysis.md |
Examples
Example 1: Generate Volcano Plot
User: "Create a volcano plot for my proteomics comparison data"
-> Invokes VolcanoPlot workflow
-> Asks for data file location and parameters (q-value, fold-change threshold)
-> Either invokes Plot_Workup_V10.R or generates custom ggplot2 code
-> Outputs TIFF files to output/ directory
Example 2: Check for EV Markers
User: "Which MISEV2018 EV markers are in my dataset?"
-> Invokes ProteinListQuery workflow
-> Reads user's protein list
-> Cross-references against data/MISEV2018_EV_Markers.txt
-> Returns categorized matches (Category 1-5, tetraspanins, annexins, etc.)
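The cross-referencing step in Example 2 is essentially a set intersection between the identifiers in the user's dataset and a reference list. The base-R sketch below illustrates the idea only. It assumes gene symbols as identifiers, and because the column layout of data/MISEV2018_EV_Markers.txt is not described here, it uses a small hypothetical in-memory table with placeholder category labels.

```r
# Hypothetical reference table; the real file's columns are not documented here,
# and the category labels below are placeholders.
reference <- data.frame(
  gene     = c("CD63", "CD81", "CD9", "TSG101", "ALB", "APOA1"),
  category = c("tetraspanin", "tetraspanin", "tetraspanin",
               "cytosolic", "contaminant", "contaminant"),
  stringsAsFactors = FALSE
)

# Identifiers detected in the user's dataset (illustrative values).
detected <- c("cd63", "TSG101", "ALB", "GAPDH")

# Case-insensitive match, then group the hits by reference category.
cross_reference <- function(detected, reference) {
  hits <- reference[toupper(reference$gene) %in% toupper(detected), ]
  split(hits$gene, hits$category)
}

matches <- cross_reference(detected, reference)
print(matches)
```

Matching on upper-cased symbols is a design choice for this sketch; datasets keyed by UniProt accession would need a different join column.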
Example 3: Full Analysis Pipeline
User: "Run a complete proteomics analysis on my kidney data"
-> Sequences multiple workflows:
1. Normalize (median normalization)
2. Heatmap (PCA, sample correlation)
3. VolcanoPlot (for each comparison)
4. Matrisome (ECM protein analysis)
5. SaspAnalysis (if relevant)
6. ExcelWorkup (generate report)
-> Creates organized output/ directory structure
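Step 1 of the pipeline above names median normalization. One common form of it is sketched below in base R: log2-transform the intensities, then shift each sample so that all samples share the same median. Whether Step_1_Normalization.R implements exactly this variant is not stated here, so treat the function and the toy matrix as assumptions for illustration.

```r
# Toy intensity matrix: rows are proteins, columns are samples (made-up values).
intensities <- matrix(
  c(100, 200, 400,   # sample S1
    220, 380, 900,   # sample S2
     50, 110, 190),  # sample S3
  nrow = 3,
  dimnames = list(c("P1", "P2", "P3"), c("S1", "S2", "S3"))
)

median_normalize <- function(mat) {
  log_mat <- log2(mat)
  col_medians <- apply(log_mat, 2, median, na.rm = TRUE)
  # Shift every sample so its median equals the mean of all sample medians.
  sweep(log_mat, 2, col_medians - mean(col_medians), FUN = "-")
}

normalized <- median_normalize(intensities)
# After normalization, every sample has the same median.
print(apply(normalized, 2, median))
```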
Example 4: Pathway Enrichment
User: "Run KEGG pathway analysis on my significantly altered proteins"
-> Invokes PathwayAnalysis workflow
-> Filters to q < 0.01, |log2FC| > 0.58
-> Runs clusterProfiler or ConsensusPathDB
-> Generates dotplot visualization
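The filtering step in Example 4 (q < 0.01 and |log2FC| > 0.58) can be sketched in base R as follows. The column names `protein`, `log2fc`, and `qvalue` and the values are assumptions for illustration; they are not the layout of the skill's candidates file.

```r
# Illustrative comparison table; column names and values are hypothetical.
candidates <- data.frame(
  protein = c("A", "B", "C", "D", "E"),
  log2fc  = c(1.20, -0.90, 0.30, 0.70, -2.10),
  qvalue  = c(0.001, 0.004, 0.0005, 0.03, 0.009),
  stringsAsFactors = FALSE
)

# Keep rows that pass both the q-value and the absolute log2 fold-change cutoff.
filter_significant <- function(df, q_cut = 0.01, fc_cut = 0.58) {
  keep <- !is.na(df$qvalue) & !is.na(df$log2fc) &
    df$qvalue < q_cut & abs(df$log2fc) > fc_cut
  df[keep, , drop = FALSE]
}

significant <- filter_significant(candidates)

# Split by direction so up- and down-regulated sets can be enriched separately.
up   <- significant$protein[significant$log2fc > 0]
down <- significant$protein[significant$log2fc < 0]
print(list(up = up, down = down))
```

Rows with missing q-values or fold changes are dropped explicitly so that `NA` never propagates into the row index.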
R Script Quick Reference
All scripts are in the skill's rscripts/ directory.
| Script | Purpose | Key Parameters |
|---|---|---|
| Plot_Workup_V10.R | Full visualization pipeline | organism, batch, myFC, myQval, mypattern |
| Excel_Workup_v05.R | Excel report generation | myoutput, batch, myFC, q-value flags |
| normalization/Step_1_Normalization.R | Data normalization | Input matrix (iMat) |
| ConsensusPathDB_23_0411_v03.R | Pathway dotplots | input_dir, output_dir, q.val, t.level |
| toolkit.R | Library loading | Called at start of analysis |
| barplots.R | Bar plot utility | Various |
Standard Parameters
| Parameter | Typical Values | Description |
|---|---|---|
| q-value | 0.05, 0.01, 0.001 | Statistical significance threshold |
| Fold Change | 0.58 (1.5x), 1.0 (2x) | Log2 fold change cutoff |
| Organism | "human", "mouse" | Species for reference lists |
| Pattern | "JB\\d_\\d+" | Regex for sample ID extraction |
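Two of the parameters above are easy to misread: the fold-change cutoff is on the log2 scale, and the pattern is a regular expression written as an R string, which is why the backslashes are doubled. The sketch below checks both in base R. The column names are hypothetical, and `perl = TRUE` is a choice made here, not something the skill's scripts are known to use.

```r
# Fold-change cutoffs are given on the log2 scale: 1.5x and 2x map to ~0.58 and 1.
fold_to_log2 <- function(fold) log2(fold)
cutoffs <- round(fold_to_log2(c(1.5, 2)), 2)
print(cutoffs)

# Sample ID extraction with the pattern from the table.
pattern <- "JB\\d_\\d+"
columns <- c("Run01_JB1_023.raw", "Run02_JB2_7.raw", "Blank.raw")  # hypothetical names

# regmatches() with regexpr() returns the first match per element and
# silently drops elements that do not match (here, "Blank.raw").
sample_ids <- regmatches(columns, regexpr(pattern, columns, perl = TRUE))
print(sample_ids)
```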
Reference Data Available
All protein lists are in the skill's data/ directory.
| List | File | Contents |
|---|---|---|
| MISEV2018 EV Markers | MISEV2018_EV_Markers.txt | 500+ proteins, Category 1-5 |
| EV Categories | MISEV2018_EV_Categories.txt | Category definitions |
| Exosome Markers | Exosome_Protein_Markers.txt | CD63, CD81, CD9, TSG101, etc. |
| Blood Contaminants | Top_10_Blood_Proteins.txt | Albumin, IgG, fibrinogen, etc. |
| Apolipoproteins | Apolipoproteins.txt | APOA1, APOB, etc. |
| Human Core SASP | Human_Core_SASP.csv | 175 SASP factors with IR/RAS/ATV scores |
| Mouse Core SASP | Mouse_Core_SASP.csv | Mouse SASP orthologs |
| Human Matrisome | matrisome_hs_masterlist.csv | ECM proteins by category |
| Mouse Matrisome | matrisome_mm_masterlist.csv | Mouse ECM proteins |
Required Data Structure
For running the full analysis scripts, data should be organized as:
[PROJECT_DIR]/
├── data/
│   ├── [batch]_Protein_Report_2pep.csv   # Protein intensities
│   ├── [batch]_candidates_2pep.csv       # Comparison results
│   └── [batch]_ConditionSetup.csv        # Sample metadata
└── output/
    ├── Data_Tables/                      # Excel reports
    └── [plots will be saved here]
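Before running the full analysis scripts, it can help to confirm that a project folder matches the layout above. The following base-R sketch builds the three expected input paths for a batch and reports any that are missing. The helper names and the batch label "B01" are hypothetical, and the demo runs in a temporary folder.

```r
# Build the expected input paths for a batch from the layout above.
expected_inputs <- function(project_dir, batch) {
  suffixes <- c("_Protein_Report_2pep.csv",
                "_candidates_2pep.csv",
                "_ConditionSetup.csv")
  file.path(project_dir, "data", paste0(batch, suffixes))
}

# Report missing inputs and make sure output/Data_Tables/ exists.
check_project <- function(project_dir, batch) {
  paths <- expected_inputs(project_dir, batch)
  missing <- paths[!file.exists(paths)]
  out_dir <- file.path(project_dir, "output", "Data_Tables")
  if (!dir.exists(out_dir)) dir.create(out_dir, recursive = TRUE)
  missing
}

# Demo in a temporary folder with only one of the three inputs present.
demo_dir <- file.path(tempdir(), "demo_project")
dir.create(file.path(demo_dir, "data"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo_dir, "data", "B01_ConditionSetup.csv"))

missing_files <- check_project(demo_dir, "B01")
print(basename(missing_files))
```

Note that `check_project()` creates the output directory as a side effect; drop that line if the check should be read-only.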
Invocation Pattern
To run R scripts from this skill:
cd [PROJECT_WORKING_DIR]
Rscript ~/.claude/Skills/Proteomics/rscripts/[SCRIPT_NAME].R
Important: Scripts expect:
- Working directory set to project folder
- data/ subdirectory with input files
- output/ subdirectory for results
- Reference data paths point to skill's data/ directory (may need adjustment)
When NOT to Use This Skill
- General R coding questions -> Use standard Claude
- Non-proteomics data analysis -> Use appropriate tools
- Genomics/transcriptomics -> Different methodology
- Statistical consulting without data -> Explain methodology, don't run
README
Proteomics Skill
A Claude Code skill for label-free quantitative proteomics analysis. Provides workflows for data normalization, visualization (volcano plots, heatmaps, PCA), pathway enrichment analysis (KEGG, ConsensusPathDB), and protein list cross-referencing.
Installation
Clone this repository directly to your Claude Code Skills directory:
git clone https://github.com/jobburt-labs/proteomics-skill.git ~/.claude/Skills/Proteomics
R Dependencies
This skill requires R with the following packages installed:
CRAN Packages
install.packages(c(
  "tidyverse",
  "ggplot2",
  "openxlsx",
  "pheatmap",
  "gplots",
  "corrplot",
  "RColorBrewer",
  "VennDiagram",
  "eulerr",
  "scales",
  "stringr"
))
Bioconductor Packages
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c(
  "limma",
  "preprocessCore",
  "vsn",
  "clusterProfiler",
  "org.Hs.eg.db",
  "org.Mm.eg.db",
  "DOSE",
  "enrichplot",
  "ReactomePA",
  "pathview",
  "pRoloc",
  "pRolocdata"
))
One-liner Installation
# Install all dependencies at once
install.packages(c("tidyverse", "ggplot2", "openxlsx", "pheatmap", "gplots", "corrplot", "RColorBrewer", "VennDiagram", "eulerr", "scales", "stringr"))
if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install(c("limma", "preprocessCore", "vsn", "clusterProfiler", "org.Hs.eg.db", "org.Mm.eg.db", "DOSE", "enrichplot", "ReactomePA", "pathview", "pRoloc", "pRolocdata"))
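To see which of the packages listed above are still missing before installing anything, a small check like the following can be run first. It is a sketch using only base R and `utils::installed.packages()`; the helper name `missing_packages` is made up for this example.

```r
# Package lists copied from the installation commands above.
cran_pkgs <- c("tidyverse", "ggplot2", "openxlsx", "pheatmap", "gplots",
               "corrplot", "RColorBrewer", "VennDiagram", "eulerr",
               "scales", "stringr")
bioc_pkgs <- c("limma", "preprocessCore", "vsn", "clusterProfiler",
               "org.Hs.eg.db", "org.Mm.eg.db", "DOSE", "enrichplot",
               "ReactomePA", "pathview", "pRoloc", "pRolocdata")

# Return the packages that are not present in any library on .libPaths().
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(utils::installed.packages()))
}

missing_cran <- missing_packages(cran_pkgs)
missing_bioc <- missing_packages(bioc_pkgs)

cat("Missing CRAN packages:",
    if (length(missing_cran)) missing_cran else "none", "\n")
cat("Missing Bioconductor packages:",
    if (length(missing_bioc)) missing_bioc else "none", "\n")
```

The two result vectors can then be passed to `install.packages()` and `BiocManager::install()` respectively, so only the missing packages are installed.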
Python Dependencies (Optional)
For ConsensusPathDB SOAP client:
pip install zeep # SOAP client for Python 3
Directory Structure
Proteomics/
├── SKILL.md                      # Main skill definition
├── README.md                     # This file
├── DataFormats.md                # Input/output file specifications
├── NormalizationMethods.md       # Statistical normalization reference
├── VisualizationPatterns.md      # ggplot2 plotting patterns
├── PathwayAnalysisGuide.md       # Pathway enrichment methodology
├── workflows/
│   ├── Normalize.md              # Data normalization
│   ├── VolcanoPlot.md            # Volcano plot generation
│   ├── Heatmap.md                # Heatmap/PCA/correlation
│   ├── PathwayAnalysis.md        # KEGG/ConsensusPathDB
│   ├── ProteinListQuery.md       # Protein list cross-reference
│   ├── ExcelWorkup.md            # Excel report generation
│   ├── Matrisome.md              # ECM/Matrisome analysis
│   └── SaspAnalysis.md           # SASP factor analysis
├── rscripts/
│   ├── Plot_Workup_V10.R         # Visualization pipeline
│   ├── Excel_Workup_v05.R        # Excel report generation
│   ├── ConsensusPathDB_23_0411_v03.R
│   ├── toolkit.R                 # Library loading
│   ├── barplots.R                # Bar plot utility
│   └── normalization/
│       ├── Step_1_Normalization.R
│       └── 2201_Label_Free_Functions.R
├── python/
│   ├── client.py                 # ConsensusPathDB SOAP client
│   ├── cpdb_services.py
│   └── cpdb_services_types.py
├── data/
│   ├── MISEV2018_EV_Markers.txt
│   ├── MISEV2018_EV_Categories.txt
│   ├── Exosome_Protein_Markers.txt
│   ├── Top_10_Blood_Proteins.txt
│   ├── Apolipoproteins.txt
│   ├── Human_Core_SASP.csv
│   ├── Mouse_Core_SASP.csv
│   ├── matrisome_hs_masterlist.csv
│   └── matrisome_mm_masterlist.csv
└── tools/
    └── .gitkeep
Workflows
| Workflow | Trigger | Description |
|---|---|---|
| Normalize | "normalize", "median normalization" | Apply normalization methods |
| VolcanoPlot | "volcano plot", "fold change plot" | Generate volcano plots |
| Heatmap | "heatmap", "PCA", "correlation" | PCA, heatmaps, correlation plots |
| PathwayAnalysis | "pathway", "KEGG", "enrichment" | Pathway enrichment analysis |
| ProteinListQuery | "EV markers", "check against" | Cross-reference protein lists |
| ExcelWorkup | "Excel report", "filter proteins" | Generate filtered Excel output |
| Matrisome | "matrisome", "ECM proteins" | ECM/Matrisome analysis |
| SaspAnalysis | "SASP", "senescence" | Core SASP factor analysis |
Usage Examples
"Create a volcano plot for my proteomics data"
"Normalize my protein intensities using quantile normalization"
"Which of my proteins are MISEV2018 EV markers?"
"Run KEGG pathway analysis on significantly altered proteins"
"Check for SASP factors in my aging samples"
"Generate a heatmap of my top differentially expressed proteins"
Reference Data
| Dataset | Description | Organisms |
|---|---|---|
| MISEV2018_EV_Markers | Extracellular vesicle marker proteins | Human |
| Core_SASP | Senescence-associated secretory phenotype factors | Human, Mouse |
| MatrisomeDB | Extracellular matrix proteins | Human, Mouse |
| Apolipoproteins | Blood contamination markers | Human |
License
This skill is provided as-is for proteomics data analysis.