Dig2MZ
Overview
This application digests proteins read from a FASTA file and reports the m/z, charge, and other information for each resulting peak.
Usage
usage: Dig2MZ <fasta> [-a] [--cyscam] [-d <arg>] [-e <arg>] [-f <arg>] [-h] [-i <arg>]
	[-L <arg>] [-l <arg>] [-m <arg>] [--non-verbose] [-o <arg>] [-p <arg>] [-q <arg>]
	[-u <arg>] [-v]
 -a,--average                     set the average mass mode for peptide mass
                                  calculation
                                  by default: MONOISOTOPIC.
    --cyscam                      modification of all protein's cysteins
                                  by S-carboxamidomethyl cysteines (CysCAM, +57 Da).
 -d,--delimiter <arg>             define the field delimiter to display
                                  by default: \t.
 -e,--enzymes <arg>               define enzymes that digest proteins separately with:
                                  enzyme name among 'Caspase-1, Caspase-10, Glu-C_bicarbonate, Caspase-3,
                                  Thermolysine, Lys-C, Pepsin_pH1.3, Caspase-8, Glu-C_phosphate, Caspase-9,
                                  Caspase-5, CNBr, ChymoTrypsin_lowspec, Trypsin, ChymoTrypsin_FYL,
                                  Proteinase-K, Caspase-7, Pepsin_pHgt2, ChymoTrypsin_highspec,
                                  BNPS-Skatole, Enterokinase, Caspase-6, Arg-C, Asp-N, ChymoTrypsin_FYLW,
                                  Caspase-4, Caspase-2'.
                                  or custom motifs respecting the following grammar:
                                  <pre-cut> <cut-token> <post-cut>
                                  <cut-token> := '|'
                                  <pre-cut> := (<AA> or <AA-class>)+
                                  <post-cut> := (<AA> or <AA-class>)+
                                  <AA> := [A-Z]
                                  <AA-class> := '[' AA+ ']'
                                  by default: Trypsin.
 -f,--fields <arg>                define the fields to display (1:MZ, 2:Charge,
                                  3:Enzyme, 4:Seq, 5:MC)
                                  by default: [1, 2, 3, 4, 5].
 -h,--help                        print this message.
 -i,--setting-file <arg>          give a property file with all input settings.
 -L,--pept-len-filter <arg>       define a filter over length of digested peptides
                                  by default: 6.
 -l,--pept-mz-lower-filter <arg>  define the lower mz bound (included) of
                                  digested peptides
                                  by default: 400.
 -m,--mc-max-num <arg>            define the number of maximum missed cleavages
                                  (for digestion)
                                  by default: 1.
    --non-verbose                 non verbose mode (no header for settings).
 -o,--oximet-max-num <arg>        define the number of maximum oxidated
                                  methionines
                                  by default: 0.
 -p,--precision <arg>             define the decimal precision for any
                                  mass-to-charge ratio
                                  by default: 6.
 -q,--pept-charge-filter <arg>    define a filter over charges on digested
                                  peptides as a sequence of integers and/or intervals like in 1, 2, 3:5,
                                  10:7
                                  by default: [2, 3].
 -u,--pept-mz-upper-filter <arg>  define the upper mz bound (included) of
                                  digested peptides
                                  by default: 2000.
 -v,--version                     print the version info.
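
The custom motif grammar accepted by `-e` and the missed-cleavage limit set by `-m` can be illustrated with a minimal Python sketch. This is not Dig2MZ's implementation: the function names `motif_to_regex` and `digest` are hypothetical, and treating `X` as "any residue" is an assumption inferred from the `K|X` pattern that the tool prints for Lys-C. The defaults mirror the documented ones (1 missed cleavage, minimum length 6).

```python
import re


def motif_to_regex(motif):
    """Turn '<pre-cut>|<post-cut>' into a zero-width regex that matches cut sites."""
    pre, sep, post = motif.partition("|")
    if not sep or not pre or not post:
        raise ValueError(f"motif must look like <pre-cut>|<post-cut>: {motif!r}")

    def translate(part):
        # A token is either a single residue letter or a class such as [KR].
        tokens = re.findall(r"\[[A-Z]+\]|[A-Z]", part)
        if "".join(tokens) != part:
            raise ValueError(f"invalid motif part: {part!r}")
        # Assumption: a bare X stands for any residue.
        return "".join("[A-Z]" if t == "X" else t for t in tokens)

    # Every token has width 1, so the lookbehind is fixed-width as `re` requires.
    return re.compile(f"(?<={translate(pre)})(?={translate(post)})")


def digest(sequence, motif, max_missed=1, min_length=6):
    """Return (peptide, missed_cleavages) pairs for one protein sequence."""
    cuts = [m.start() for m in motif_to_regex(motif).finditer(sequence)]
    bounds = [0] + cuts + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        for missed in range(max_missed + 1):
            j = i + 1 + missed
            if j >= len(bounds):
                break
            peptide = sequence[bounds[i]:bounds[j]]
            if len(peptide) >= min_length:
                peptides.append((peptide, missed))
    return peptides


if __name__ == "__main__":
    # Fragment assembled from peptides shown in the example output of this README.
    fragment = "AGREGLEWVELKNDDNAITSPIAGKTSVLRAIPVEVLANSYDISTK"
    for peptide, missed in digest(fragment, "K|X", max_missed=1, min_length=1):
        print(peptide, missed)
```

A class-based motif such as `[KR]|X` cuts after either lysine or arginine. Because the grammar shown in the help text has no negation, a rule like "not before proline" cannot be written as a custom motif in this sketch.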
      
Example
    # a first way to execute the application with lots of options
    # the results are redirected to file digests.out and errors to digests.log
    $ java -jar Dig2MZ-1.1.jar -a -e Lys-C -f 1,2,4,5 -L 1 -p4 -q 2:4 uniprot-human.fasta > digests.out 2> digests.log

    # .. or the more compact way with all options defined in a setting file
    $ java -jar Dig2MZ-1.1.jar -i settings.properties uniprot-human.fasta > digests.out 2> digests.log

    $ more digests.out
    # ==================================================================
    # Generated by Dig2MZ v.1.1
    #
    # Input Proteins ---------------------------------------------------
    # read from file uniprot-human_swissprot.fasta
    # modified (fixed) with CYS_CAM
    # digested by enzyme [Lys-C, pattern: K|X, mc#=1]
    # Digested Peptides ------------------------------------------------
    # filtered with charges [2, 3, 4]
    # filtered with length >= 1
    # filtered over mzs in interval [400.0, 2000.0]
    # with masses computed in mode AVERAGE
    # Output Fields ----------------------------------------------------
    # all [MZ, Charge, Enzyme, Seq, MC]
    # selected indices [1, 2, 4, 5]
    # Real
    # Decimal precision format 4
    # ==================================================================
    MZ Charge Seq MC
    1192.3452 3 NDDNAITSPIAGKTSVLRAIPVEVLANSYDISTK 1
    894.5108 4 NDDNAITSPIAGKTSVLRAIPVEVLANSYDISTK 1
    1355.1842 2 LILSFSLC(C2H3NO)LMVLSC(C2H3NO)SAQLLPWQK 0
    658.7105 2 NDDNAITSPIAGK 0
    1578.9621 2 MSTKLILSFSLC(C2H3NO)LMVLSC(C2H3NO)SAQLLPWQK 1
    1052.9772 3 MSTKLILSFSLC(C2H3NO)LMVLSC(C2H3NO)SAQLLPWQK 1
    694.2949 2 AGREGLEWVELK 0
    463.1991 3 AGREGLEWVELK 0
    651.3932 3 RGGSGRSNGLEQAFC(C2H3NO)NLK 0
    488.7968 4 RGGSGRSNGLEQAFC(C2H3NO)NLK 0
    556.3634 4 NGRQEVEVFRPFQSRDEK 0
    518.5931 2 ERERFSIV 0
    895.6627 3 AGREGLEWVELKNDDNAITSPIAGK 1
    671.9989 4 AGREGLEWVELKNDDNAITSPIAGK 1
    1139.3187 2 TSVLRAIPVEVLANSYDISTK 0
    ...
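
Each peptide appears once per charge state accepted by `-q`, and its m/z must fall inside the inclusive window set by `-l` and `-u`. The Python sketch below shows how a charge specification such as `2:4` might expand and how m/z relates to the neutral mass. It is an illustration only: the function names are hypothetical, the proton mass constant is an assumption (the README does not say which value Dig2MZ adds per charge), and normalizing a reversed interval like `10:7` to 7 through 10 is also an assumption.

```python
PROTON_MASS = 1.007276  # Da; assumed mass added per charge


def parse_charges(spec):
    """Expand a spec such as '1, 2, 3:5, 10:7' into a sorted list of charges."""
    charges = set()
    for item in spec.split(","):
        item = item.strip()
        if ":" in item:
            first, last = (int(part) for part in item.split(":"))
            # Assumption: a reversed interval covers the same charges.
            charges.update(range(min(first, last), max(first, last) + 1))
        else:
            charges.add(int(item))
    return sorted(charges)


def mz_from_mass(neutral_mass, charge):
    """m/z of a peptide carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from an observed m/z and its charge."""
    return charge * (mz - PROTON_MASS)


def in_mz_window(mz, lower=400.0, upper=2000.0):
    """Inclusive m/z filter, using the documented default bounds."""
    return lower <= mz <= upper


if __name__ == "__main__":
    # Start from the charge-3 peak of NDDNAITSPIAGKTSVLRAIPVEVLANSYDISTK above.
    mass = mass_from_mz(1192.3452, 3)
    for z in parse_charges("2:4"):
        mz = mz_from_mass(mass, z)
        print(z, f"{mz:.4f}", in_mz_window(mz))
```

Under these assumptions, the charge-4 value predicted from the charge-3 peak agrees with the reported 894.5108 to within 0.001, which is a quick consistency check on the output rows.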
    # get digested peaks and sort by mz
    $ tail -n+19 digests.out | sort -k1 -n | uniq > digests_sorted.out
    $ more digests_sorted.out
    MZ Charge Seq MC
    400.0003 4 ILLEGRRLISDALK 0
    400.0006 5 HMEDPLEMERSPQLRK 0
    400.0032 4 TAIQQLRSVIRALK 0
    400.0057 5 IRQFEEQFERERNSK 0
    400.0065 5 AFVYNSSLVSHQEIHHK 0
    400.0080 5 RDATHDYRQALATHVNK 0
    400.0101 5 ELVERRRTMMEDFRK 0
    400.0101 5 PMVNHAEASRLNIERMK 0
    400.0117 5 NSPRLRMRTETPSHWK 0
    400.0124 5 PQLHSMVARSLC(C2H3NO)RNAAGK 0
    400.0138 5 LC(C2H3NO)RLSMQC(C2H3NO)LRDFRIK 0
    400.0145 5 HVIIGFSIENSHDRIMK 0
    400.0152 5 RLMADELERFTSMRIK 0
    400.0161 5 RVTRTGFEDGLFAGWRK 0
    400.0167 5 VEQLFGLGLRPRGEC(C2H3NO)HK 0
    400.0210 5 DHLTLGTGVAGIDMRRGVK 0
    400.0212 5 RAALC(C2H3NO)FRRNLGTYNRK 0
    400.0233 5 SIQISHFNPPPPHLRQK 0
    400.0240 5 FYASVRC(C2H3NO)DIRRIQALK 0
    ...
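
The same sort-and-deduplicate step can be done without relying on a fixed header length. The Python sketch below skips every line starting with `#` instead of counting 18 lines. The function name `read_peaks` is hypothetical, and the sketch assumes the default tab delimiter and that the `MZ` field was selected with `-f`.

```python
import csv


def read_peaks(lines, delimiter="\t"):
    """Return (header, rows) with rows deduplicated and sorted by m/z."""
    # Drop blank lines and the '#' settings header written in verbose mode.
    data = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    reader = csv.reader(data, delimiter=delimiter)
    header = next(reader)
    mz_col = header.index("MZ")
    unique_rows = {tuple(row) for row in reader}
    # Sort numerically on m/z, then on the whole row to keep the order stable.
    return header, sorted(unique_rows, key=lambda row: (float(row[mz_col]), row))


if __name__ == "__main__":
    sample = [
        "# Generated by Dig2MZ v.1.1\n",
        "MZ\tCharge\tSeq\tMC\n",
        "694.2949\t2\tAGREGLEWVELK\t0\n",
        "463.1991\t3\tAGREGLEWVELK\t0\n",
        "694.2949\t2\tAGREGLEWVELK\t0\n",
    ]
    header, peaks = read_peaks(sample)
    print(header)
    for peak in peaks:
        print(*peak)
```

To apply it to a real run, pass an open file object, for example `with open("digests.out") as fh: header, peaks = read_peaks(fh)`.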
Releases
The latest version is v1.21 - Download app and the default properties file.
Rel1.21
| Type | Changes | 
|---|---|
| Bug Fix | In missed cleavages mode, modified peptides had incorrect position shifts. | 
| New | New --cyscam option. The CysCAM fixed modification is now optional. | 
| New | New output field "Mods" that gives the number of modifications of each digested peptide. | 
Rel1.1
| Type | Changes | 
|---|---|
| New | Add two new options to define an m/z interval: --pept-mz-lower-filter (-l) and --pept-mz-upper-filter (-u). | 
| Change | Change option name for peptide length filter '-l' -> '-L'. | 





