
Symbol Sequence

Overview

A symbol sequence is used when each symbol has to map to another object type.

Here are a few types of symbol sequences used in the JPL:

  • Sequence of Amino Acids
  • Sequence of Ribo/Deoxy Ribonucleotides

We use decorators such as JPLBioSeq or JPLPeptide to wrap the most commonly used symbol sequences. Decorators make it possible to compose method variations and add extensions to basic objects.
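
To illustrate the decorator idea, here is a minimal, self-contained sketch. It does not use the JPL classes: SymbolSequence, StringSequence and PeptideView are hypothetical names invented for this example, and the real JPLBioSeq and JPLPeptide may be structured differently.

```java
public class DecoratorSketch {

    /** Hypothetical minimal view of a symbol sequence (not the JPL interface). */
    interface SymbolSequence {
        int length();
        char symbolAt(int index);
    }

    /** Basic object: a sequence backed by a plain string. */
    static final class StringSequence implements SymbolSequence {
        private final String symbols;

        StringSequence(String symbols) {
            this.symbols = symbols;
        }

        @Override
        public int length() {
            return symbols.length();
        }

        @Override
        public char symbolAt(int index) {
            return symbols.charAt(index);
        }
    }

    /** Decorator: wraps any SymbolSequence, delegates to it, and adds an extension. */
    static final class PeptideView implements SymbolSequence {
        private final SymbolSequence wrapped;

        PeptideView(SymbolSequence wrapped) {
            this.wrapped = wrapped;
        }

        @Override
        public int length() {
            return wrapped.length();
        }

        @Override
        public char symbolAt(int index) {
            return wrapped.symbolAt(index);
        }

        /** Extension not offered by the wrapped object: count one residue. */
        int count(char residue) {
            int n = 0;
            for (int i = 0; i < wrapped.length(); i++) {
                if (wrapped.symbolAt(i) == residue) {
                    n++;
                }
            }
            return n;
        }
    }

    public static void main(String[] args) {
        PeptideView peptide =
            new PeptideView(new StringSequence("MARVALMMKHGTREEDFFMM"));
        System.out.println(peptide.length());   // 20
        System.out.println(peptide.count('M')); // 5
    }
}
```

The wrapped sequence stays unchanged; new behaviour lives in the wrapper, so several wrappers can be layered over the same basic object.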

Matching

We provide simple matchers over symbol sequences in the util package.

The symbols in the sequence and the matcher must be of the same type. There are several ways to get matches:

	JPLMotifMatcher<JPLAminoAcid> matcher;
	JPLSymbolSequence<JPLAminoAcid> sequence;

	sequence =
	    new JPLSymbolSequence.Builder<JPLAminoAcid>("MPEPTMIDEPEPMTIDEMM",
	      JPLAASymbol.getSymbolType()).build();
	
	matcher = JPLMotifMatcher.newInstance("M");
      

There are two ways to get the positions in the sequence that match the pattern. Here is the old way:

	matcher.setSequence(sequence);
	List<Integer> list = matcher.iterator().nextToList();
      

or the less error-prone way:

	List<Integer> list = matcher.iterator(sequence).nextToList();
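
Why the second form is less error-prone can be sketched with a small stand-in matcher over plain strings. This is only an illustration under assumptions: the MotifMatcher class below is hypothetical, is not the JPL implementation, and reports 0-based positions, which may differ from the convention used by JPLMotifMatcher.

```java
import java.util.ArrayList;
import java.util.List;

public class MatcherSketch {

    /** Hypothetical stand-in for a motif matcher over plain strings. */
    static final class MotifMatcher {
        private final String motif;
        private String sequence; // only used by the stateful style

        MotifMatcher(String motif) {
            if (motif == null || motif.isEmpty()) {
                throw new IllegalArgumentException("motif must not be empty");
            }
            this.motif = motif;
        }

        /** Stateful style: the sequence has to be set before matching. */
        void setSequence(String sequence) {
            this.sequence = sequence;
        }

        /** Fails if setSequence() was forgotten, and silently reuses a stale one. */
        List<Integer> positions() {
            if (sequence == null) {
                throw new IllegalStateException("setSequence() was not called");
            }
            return positions(sequence);
        }

        /** Stateless style: the sequence is an argument, so it cannot be missing. */
        List<Integer> positions(String target) {
            List<Integer> found = new ArrayList<>();
            int from = target.indexOf(motif);
            while (from >= 0) {
                found.add(from);
                from = target.indexOf(motif, from + 1);
            }
            return found;
        }
    }

    public static void main(String[] args) {
        MotifMatcher matcher = new MotifMatcher("M");
        // prints [0, 5, 12, 17, 18]
        System.out.println(matcher.positions("MPEPTMIDEPEPMTIDEMM"));
    }
}
```

With the stateful style, forgetting the setter or reusing the matcher after another sequence was set leads to an exception or to results for the wrong sequence; passing the sequence as an argument removes both failure modes.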
      

Some more specific matchers exist, built on the above models. They are a kind of syntactic sugar ;-)

	// the peptide to search the motif on
	JPLPeptide peptide = 
	    new JPLPeptideBuilder("MARVALMMKHGTREEDFFMM").build();
	
	// the matchers
	JPLAAMatcher simpleMatcher = 
	    JPLAAMatcher.newInstance(JPLAminoAcid.M);

	List<Integer> indicesFromSimpleMatcher =
	    simpleMatcher.iterator(peptide).nextToList();
	
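	// note: modifMatcher is not declared in this snippet; it is assumed to be
	// a second matcher created elsewhere
	List<Integer> indicesFromModifMatcher =
	    modifMatcher.iterator(peptide).nextToList();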