Blastn: What substitution matrix is used?


I'm currently working on aligning sequences, and I need to compute the similarity between pairs of DNA 'words' of a particular length.

For amino acids I am able to use the substitution matrices in Biopython (Bio.SubsMat.MatrixInfo).

However, I haven't found anything similar for DNA. From what I've read, most systems use a match/mismatch scheme in which each aligned nucleotide pair is scored as a match or a mismatch and the scores are summed. This works fine as long as I am only dealing with A, G, C, and T, but I run into problems when a sequence contains ambiguity codes such as N or M (meaning the nucleotide is unknown or only partly known).

Is there a standard way to handle unknowns? That is, how do I score A versus N, or M versus N?

Thanks in advance.


BLASTN does not use a substitution matrix. Instead, it scores alignments with a match reward, a mismatch penalty, and gap penalties, all of which you can set yourself.
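To make the match/mismatch idea concrete, here is a minimal sketch for scoring two equal-length DNA words. The function name `word_score` and the values +1 and -2 are only illustrative assumptions, not anything taken from BLASTN itself, and gaps are not handled.

```python
def word_score(a, b, match=1, mismatch=-2):
    """Sum per-position match/mismatch scores for two equal-length DNA words.

    The match/mismatch values are illustrative assumptions; pick your own.
    """
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    # Any pair of differing characters counts as a mismatch, so an
    # ambiguity code such as N is penalised against every plain base.
    return sum(match if x == y else mismatch
               for x, y in zip(a.upper(), b.upper()))


print(word_score("ACGT", "ACGA"))  # three matches and one mismatch
print(word_score("CAN", "CAC"))    # N is scored as a mismatch here
```

With this naive scheme an N is always penalised, which is the same behaviour you see when unknowns are counted as mismatches.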

There is currently no option for scoring matches against unknowns; they are treated as mismatches (as shown below). If the unknowns are in the middle of an HSP, you can probably re-score the HSP according to your own scheme using a Python script. If the N stretch is disrupting the HSP, you can try relaxing the mismatch penalty and reducing the word size (basically, reducing stringency). I can't think of any other solution.

Query  1    CAGCGTCCANNTCCCGAGGTGCCGGGATTGCAGACGGAGTCTGGTTCACTCAGTGCTCAA  60
            |||||||||  |||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  8    CAGCGTCCACCTCCCGAGGTGCCGGGATTGCAGACGGAGTCTGGTTCACTCAGTGCTCAA  67

Query  61   TGGTGCCCAGGCTGGAGTGCAGTGGCGTGATCTCGGCTCGCTACANNCTCCACCTCCCAG  120
            |||||||||||||||||||||||||||||||||||||||||||||  |||||||||||||
Sbjct  68   TGGTGCCCAGGCTGGAGTGCAGTGGCGTGATCTCGGCTCGCTACAACCTCCACCTCCCAG  127

Query  121  CCGCCTGCCCTGGCCTCCCAAAGTGCCGAGATTGCAGCCTCTGCCCAGCCGCCACCCC  178
            ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  128  CCGCCTGCCCTGGCCTCCCAAAGTGCCGAGATTGCAGCCTCTGCCCAGCCGCCACCCC  18
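If you do want to re-score an HSP yourself, one possible convention (an assumption on my part, not a standard) is to treat each IUPAC ambiguity code as a set of equally likely bases and use the expected match/mismatch score for each aligned pair. The sketch below does that for an ungapped pair of aligned strings; the names `pair_score` and `rescore` and the +1/-2 values are hypothetical, and gap characters are not handled.

```python
# IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pair_score(x, y, match=1.0, mismatch=-2.0):
    """Expected score of one aligned pair.

    Assumes every base compatible with a code is equally likely, so the
    match probability is |X & Y| / (|X| * |Y|). This is one convention
    among several, not an established standard.
    """
    xs, ys = IUPAC[x.upper()], IUPAC[y.upper()]
    p_match = len(set(xs) & set(ys)) / (len(xs) * len(ys))
    return p_match * match + (1.0 - p_match) * mismatch


def rescore(query, subject, match=1.0, mismatch=-2.0):
    """Re-score an ungapped alignment given as two equal-length strings."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    return sum(pair_score(q, s, match, mismatch)
               for q, s in zip(query, subject))


print(pair_score("A", "N"))        # A against a fully unknown base
print(pair_score("M", "N"))        # M (A or C) against N
print(rescore("CANNT", "CACCT"))   # short stretch containing NN
```

A simpler alternative is to give any pair whose base sets overlap a fixed score such as 0, so that unknowns neither help nor hurt the HSP. Which convention is appropriate depends on what you want the similarity to mean.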

