-------------------------------------------------------------------------------------------------------------------------------  

::: Luciano Carlos da Maia :::

Plant Genomics and Breeding Center - Federal University of Pelotas - Brazil

 

::::  SSRLocator - Tool for locating simple sequence repeats ::::

User Guide

 

SSRLocator Utilization Guide

1)  INITIAL CONSIDERATIONS

SSRLocator is a tool for detecting and characterizing microsatellites and minisatellites in DNA sequences. Besides finding these repeat patterns, the program designs primers, simulates PCR reactions and performs global alignments between the homologous regions obtained from the PCR simulation.

The program was written in Pascal, with graphical components from the Borland Delphi suite, and targets the MS-Windows operating system. The repeat detection routine was written in Perl and compiled into an executable file. While SSRLocator runs, its data are stored in a Firebird database.

Delphi is a registered trademark of Borland Software Corporation; we hold the license to use and distribute this application. Perl and the Firebird database are free software from separate free-software initiatives, so no prior license is needed to use them.

To illustrate the main steps of an analysis with SSRLocator, we recommend using a FASTA file of redundant EST sequences, which is available at the address given below.

If you prefer, you can use your own FASTA-format file containing the sequences you want to analyze. Any kind of sequence can be analyzed with this program, including short sequences such as mRNAs, ESTs, cDNAs and genes, and/or longer sequences such as BACs, shotgun sequences and pseudo-molecules.
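For readers unfamiliar with the FASTA layout, the following Python sketch shows how such a file is typically read: a header line starting with ">" followed by one or more sequence lines. It is an illustration only and not part of SSRLocator; the function name parse_fasta is hypothetical.

```python
def parse_fasta(lines):
    """Read FASTA-formatted lines into a dict of {identifier: sequence}.

    The identifier is taken as the first word after ">" (an assumption;
    some tools keep the whole header line instead).
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            # Sequence lines may be wrapped; collect and join them later.
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}
```

With a file on disk, the function can be called as parse_fasta(open("sequences.fasta")), since an open text file yields its lines one by one.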

 

Before reading this tutorial, first read these tutorials: 1) SSRLocator Installation Guide and 2) Firebird Installation Guide.

Maia, Luciano Carlos da

Costa de Oliveira, Antonio

Centro de Genomica e Fitomelhoramento

Universidade Federal de Pelotas

 -------------------------------------------------------------------------------------

2) OBTAINING A FILE OF SEQUENCES FOR ANALYSIS

2.1)     To obtain the sequences.fasta file mentioned above, follow the link to our address (Picture 1).

http://www.ufpel.tche.br/faem/fitotecnia/fitomelhoramento/faleconosco.html 

Picture 1.

2.2)     Copy the sequences.rar file to a folder/directory on your computer (Picture 2) and decompress it with WinRAR (Picture 3).

Picture 2. 

Picture 3. 

2.3)     After decompressing, copy or move the file sequences.fasta to the folder/directory C:\SSRLocatorI. If you have chosen another .fasta file, it must also be copied to the C:\SSRLocatorI folder, as shown in Picture 4.

Picture 4.

3)   Start SSRLocator from the Windows menu or, if you prefer, go to the folder/directory C:\SSRLocatorI and run the program. The initial program screen is shown in Picture 5.

Picture 5.

4)        The first step of an analysis is to format your .fasta file for use by SSRLocator.

4.1)     To do this, select Create File > Format File to SSRLocator Standard in the program's main menu, as shown in Picture 6.

Picture 6. 

4.2)     After you select this option, the program shows a message (Picture 7) asking you to choose the .fasta file to be formatted. Click <OK>, select the file in the Open fasta file box, and click <OPEN>, as shown in Picture 8.

Picture 7. 

Picture 8. 

After you select the .fasta file to be formatted, SSRLocator shows a message asking for a name for the new file that will be created; click <OK> (Picture 9). In the Save fasta file box, type the name of the new file in the File name field and press <SAVE>. In this example, we used the name sequences_formated.fasta (Picture 10).

Picture 9. 

Picture 10. 

When the operation finishes, the program shows a message stating that the new file was created and that, if you wish, you can delete the original file (Picture 11).

Picture 11. 

5) Setting the options for locating repetitive sequences

5.1) In the main menu, select the SSR Search option and then SSRs Especifications (Picture 12).

Picture 12. 

A window opens (Picture 13) for selecting the repeat types, the minimum number of repeats for each type, the minimum spacing between two micro-/minisatellites and the maximum distance for classifying loci as imperfect.

Picture 13. 

This module lets you select which types of repeats are searched for. In this example, the program is configured to find monomer, dimer, trimer, tetramer, pentamer, hexamer, heptamer, octamer, nonamer and decamer repeats. The minimum number of repeats is 20x for monomers (example: AAAAAAAAAAAAAAAAAAAA) and 10x for dimers (example: ATATATATATATATATATAT). For the other types, the minimum numbers of repeats are shown in Picture 13.

If you prefer, you can select only the repeat types of interest and change the minimum number of repeats for each of them as needed. After changing any of these options, use the <SAVE> button to store your configuration in the database.
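The effect of the minimum-repeat settings can be pictured with a short Python sketch. It is an illustration only and does not reproduce SSRLocator's Perl detection routine: the function name find_ssrs and the regular-expression approach are assumptions, and only the two thresholds stated above (20x for monomers, 10x for dimers) are used in the example call, because the values for the other types appear only in Picture 13.

```python
import re

def find_ssrs(sequence, min_repeats):
    """Return sorted (start, motif, repeat_count) tuples, 0-based starts.

    min_repeats maps motif length to the minimum number of repeats,
    for example {1: 20, 2: 10}.
    """
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # The group captures one motif; the backreference then demands
        # at least (min_rep - 1) further copies directly after it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit
            # (e.g. "AA" is a monomer repeat, not a dimer).
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            repeats = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)

example = "CCG" + "A" * 20 + "CGC" + "AT" * 10 + "GG"
print(find_ssrs(example, {1: 20, 2: 10}))
```

Raising a threshold in the dictionary, or removing a motif length from it, corresponds to changing or deselecting that repeat type in the options window.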

6)        Finding microsatellites/minisatellites

6.1)     From the main menu, select SSRs Search and then click Search SSRs (Picture 14). In the Open fasta file box, choose the .fasta file containing the sequences to be analyzed (Picture 15). Click <OPEN> and wait for the message indicating that the operation has finished (Picture 16).

Picture 14. 

Picture 15. 

Picture 16. 

6.2)     Results for microsatellite/minisatellite occurrences

The SSRs Search sub-menu contains options that show the repeat occurrences found in the last analyzed file (Picture 17). In this example, all the data shown refer to the sequences_formated.fasta file.

Picture 17. 

6.2.1)  Results SSRs – Sequence Loci

This option opens a window with details about the micro-/minisatellite occurrences. The results can be sent to an MS-Excel spreadsheet and/or saved to TXT files through the Print XLS and Print TXT options, respectively (Picture 18).

Picture 18. 

We created an algorithm that identifies up to 5 adjacent repeat loci. Loci with no other occurrence within 100 bp on either flank are marked in the Structure field with the letter P (perfect). Repeats that occur in adjacent regions, less than 100 bp and more than 5 bp apart, are considered compound and are marked with the letter C in the Structure field. The number of loci involved is shown in the N.Loci field (Picture 18).
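The P/C rule can be sketched in Python as follows. This is a simplified reading of the description, not the program's actual algorithm: the function name classify_loci is hypothetical, loci are assumed to be (start, end) pairs with an exclusive end, and two details the text leaves open are treated as assumptions (a gap of 5 bp or less starts a new group, and a group is closed once it holds 5 loci).

```python
def classify_loci(loci, min_gap=5, max_gap=100, max_group=5):
    """Group repeat loci of one sequence and label each group P or C.

    loci: list of (start, end) positions, end exclusive.
    Returns a list of (structure, n_loci, members).
    """
    groups = []
    for locus in sorted(loci):
        if groups:
            previous = groups[-1][-1]
            gap = locus[0] - previous[1]  # bases between the two loci
            # Compound: more than min_gap and less than max_gap apart.
            if min_gap < gap < max_gap and len(groups[-1]) < max_group:
                groups[-1].append(locus)
                continue
        groups.append([locus])  # assumption: anything else starts a new group
    return [("C" if len(g) > 1 else "P", len(g), g) for g in groups]

print(classify_loci([(10, 30), (50, 70), (300, 320)]))
```

In this call the first two loci are 20 bp apart and form one compound group with N.Loci = 2, while the third locus is isolated and reported as P.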

6.2.2)  Results SSRs – Statistics type/motif

This option opens a window summarizing the occurrences found for each repeat type analyzed (mono-, di-, tri-, tetra-, penta-, hexa-, hepta-, octa-, nona- and decamers), broken down by number of repeats. The Total column shows the number of occurrences with a given number of repeats. The example in Picture 19 shows that a total of 36 loci with 22 repeats were found. The last line of the picture shows the total number of occurrences found for each type analyzed: in this case, 221 loci made up of monomers, 37 loci made up of dimers, and so on for the other types, for a total of 317 occurrences.

**In this example, the high occurrence of monomers is due to the fact that the data set consists of EST sequences, which contain polyA regions.

Picture 19. 
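The layout of this statistics report (one row per number of repeats, one column per repeat type, plus totals) can be reproduced with a small Python sketch. The function name repeat_statistics and the input format are assumptions made for illustration; the sample data are invented and are not the values from Picture 19.

```python
from collections import Counter

def repeat_statistics(loci):
    """Tabulate loci by number of repeats and motif length.

    loci: iterable of (motif, repeats) pairs.
    Returns (cells, row_totals, column_totals).
    """
    cells = Counter()          # (repeats, motif_length) -> count
    row_totals = Counter()     # per number of repeats (the Total column)
    column_totals = Counter()  # per motif length (the last line)
    for motif, repeats in loci:
        cells[(repeats, len(motif))] += 1
        row_totals[repeats] += 1
        column_totals[len(motif)] += 1
    return cells, row_totals, column_totals

# Invented sample data, for illustration only.
sample = [("A", 22), ("T", 22), ("AT", 22), ("A", 25), ("AG", 10)]
cells, rows, columns = repeat_statistics(sample)
print(rows[22], columns[1], sum(columns.values()))
```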

6.2.3)  Results SSRs – Motifs strand

This option lists each repeat occurrence, detailing the number of bases and the number of repeats that make up each locus.

Picture 20. 

6.2.3)  Results SSRs – Motifs strand/strand 

This report shows the total number of occurrences of each arrangement (type/motif) found in the analysis. In this module, the totals for each arrangement and for its complementary arrangement are presented together in a single report.

In this analysis, for example, the report indicates that, of the 221 monomer occurrences detected, 220 are polyA and polyT tracts: 148 polyA occurrences and 78 polyT occurrences.

Picture 21.
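Pairing a motif with its complementary arrangement can be sketched in Python as below. The sketch assumes that "complementary" means the reverse complement (so A pairs with T and AG pairs with CT); the guide does not state this explicitly, and the function names are hypothetical. Rotations of the same motif (for example AG and GA) are not merged here.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(motif):
    """Return the motif as read on the opposite strand."""
    return motif.translate(COMPLEMENT)[::-1]

def strand_pair_counts(motifs):
    """Count occurrences per (motif, complementary motif) pair."""
    counts = Counter(motifs)
    report = {}
    for motif in sorted(counts):
        pair = tuple(sorted((motif, reverse_complement(motif))))
        total = counts[pair[0]]
        if pair[1] != pair[0]:  # self-complementary motifs count once
            total += counts[pair[1]]
        report[pair] = total
    return report

print(strand_pair_counts(["A", "A", "T", "AG", "CT", "AT"]))
```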

6.2.4)  Results A.Acids – Total occurrence 

This module shows the amino acids derived from loci formed by trimers, hexamers and nonamers, in other words, motifs containing one, two and three codons, respectively. The report lists each occurrence of these arrangements and the amino acids derived from each locus (Picture 22).

Picture 22. 
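How a trimer, hexamer or nonamer locus maps to amino acids can be illustrated with the standard genetic code. The Python sketch below is not the program's code: the function name translate_repeat is hypothetical, and it assumes that the reading frame starts at the first base of the locus on the given strand, which the guide does not specify.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_repeat(motif, repeats):
    """Translate a repeat locus whose motif length is a multiple of 3.

    Returns one-letter amino acid codes ("*" marks a stop codon).
    """
    if len(motif) % 3 != 0:
        raise ValueError("motif length must be a multiple of 3")
    dna = motif.upper() * repeats
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

print(translate_repeat("TTC", 6))     # six phenylalanine (F) codons
print(translate_repeat("GAAGAT", 2))  # hexamer: two codons per motif
```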

6.2.5)  Results A.AcidsSumary 

This window shows a summary of the total occurrences of each amino acid in repetitive regions. In our example, the results show that 6 loci with repeats forming phenylalanine (Phe) codons were found and that, in total, this amino acid was encoded 38 times in repetitive DNA regions (Picture 23).

Picture 23. 

7)         Primer design for the micro-/minisatellite loci detected in the analysis

To design primers anchored in the flanks of each micro-/minisatellite locus detected by SSRLocator, we implemented a module that passes the results obtained to the Primer3** program.

 **(ROZEN, S.; SKALETSKY, H. Primer3 on the WWW for general users and for biologist programmers. Methods in molecular biology, v.132, p.365-386, 2000) 

7.1)     From the Primer Design sub-menu, select the option Set Parameters – Primer3 (Picture 24) to open a window containing the primer parameters (Picture 25). The choice of primer design parameters is left to each user.

After changing any data, always use the <SAVE> option to update the database.

Picture 24.

Picture 25. 

7.2)     Running Primer3 

Select the option Run Primer Design – Primer3 to start the task. Two progress bars show the process running, and a message is displayed when the task ends (Picture 26).

Picture 26. 

7.3)     Obtaining the primer results

In the Primer Design sub-menu, select View Results. This window shows the sequences containing micro-/minisatellite loci and the primer sets obtained for each of them. The melting temperature of each primer and the size of the amplicon produced by each set are also shown (Picture 27).

Picture 27. 

8)        Simulating a Polymerase Chain Reaction - Virtual-PCR

To check the redundancy of each primer set, we implemented an algorithm that simulates PCR reactions by anchoring each primer set on sequences other than the one it was designed from. Once the primer hybridization regions are located, the algorithm runs a global alignment between the primers' original amplicon (source region) and the amplicon found in the redundant sequence, showing the level of homology between the redundant regions.

Two options for simulating PCR were implemented in SSRLocator: one for work with short database sequences (genes, ESTs, cDNAs, mRNAs and BACs) and the other for studies involving large DNA sequences (pseudo-molecules).
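The idea of anchoring a primer pair on other sequences can be sketched in Python. This is a deliberately simplified illustration, not SSRLocator's algorithm: it requires exact primer matches, takes only the first binding site in each sequence, and the function names and the max_len limit are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence as read on the opposite strand."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def virtual_pcr(forward, reverse, sequences, max_len=1000):
    """Return {name: amplicon} for sequences where both primers anneal.

    forward, reverse: primer sequences written 5' to 3'.
    sequences: dict mapping sequence names to DNA strings.
    """
    forward = forward.upper()
    # On the plus strand the reverse primer appears reverse-complemented.
    reverse_site = reverse_complement(reverse)
    amplicons = {}
    for name, seq in sequences.items():
        seq = seq.upper()
        start = seq.find(forward)
        if start == -1:
            continue
        end = seq.find(reverse_site, start + len(forward))
        if end == -1:
            continue
        amplicon = seq[start:end + len(reverse_site)]
        if len(amplicon) <= max_len:
            amplicons[name] = amplicon
    return amplicons

targets = {"s1": "GGACGTACATATATATTCTAGGCC", "s2": "GGGGGGTCTAGG"}
print(virtual_pcr("ACGTAC", "CCTAGA", targets))
```

A primer set that yields amplicons in more than one sequence is redundant in the sense used above; each extra amplicon can then be aligned against the original one.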

 

 8.1)     Running a Virtual-PCR 

In the Virtual-PCR sub-menu, select Running Virtual-PCR BACs/ESTs/RNAs/Gene (Picture 28).

Picture 28. 

In the Open box, choose the file containing the sequences against which you want to simulate the PCR and click <OPEN>. In this example, we selected the same file in which the micro-/minisatellite loci were located and from which the primers stored in the database were designed (Picture 29).

              

Picture 29.

Picture 30.

After this step, SSRLocator shows a progress bar while the process runs, and a completion message is displayed when the task ends (Picture 30).

8.2)     Virtual-PCR Results

Select Results – Virtual-PCR to open the window containing a summary of the PCR simulation. The report identifies each sequence containing micro-/minisatellite loci and the sequence to which its primers anneal, producing redundant amplicons. The same report also shows the results of the global alignment (gaps, transversions, transitions, HSP score and identity) (Picture 31).

Picture 31. 
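The alignment figures in this report (gaps, transitions, transversions, identity) can be illustrated with a textbook Needleman-Wunsch global alignment. The Python sketch below is not the program's implementation: the scoring values (match 1, mismatch -1, gap -2), the function names and the definition of identity as matches divided by alignment length are all assumptions.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]

PURINES = {"A", "G"}

def alignment_stats(aln_a, aln_b):
    """Count gaps, transitions and transversions in an alignment."""
    gaps = transitions = transversions = matches = 0
    for x, y in zip(aln_a, aln_b):
        if x == "-" or y == "-":
            gaps += 1
        elif x == y:
            matches += 1
        elif (x in PURINES) == (y in PURINES):
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return {"gaps": gaps, "transitions": transitions,
            "transversions": transversions,
            "identity": 100.0 * matches / len(aln_a)}

aligned_a, aligned_b, total = global_align("ACGTACGT", "ACATACGT")
print(aligned_a, aligned_b, total, alignment_stats(aligned_a, aligned_b))
```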

8.3)     Primer redundancy results

The Results – Primer Redundance option shows each primer set together with the total number of amplicons produced by that set (Picture 32).

           

Picture 32. 

9)        Maintenance of SSRLocator database              

During each analysis done with SSRLocator, many records are written to and erased from the database, which makes the database grow. To clean up the database, the SSRLocator distribution includes a routine called SSRLocatorBK that performs this task, so the user does not have to use the Firebird administrative tools.

9.1)     To run SSRLocatorBK, close SSRLocator, then go to the folder/directory C:\SSRLocatorI and select SSRLocatorBK.exe, as shown in Picture 33.

Picture 33. 

9.2)      Now, in the SSRLocatorBKP window, select the <Clear DataBase> option, wait for the operation to finish and close the program with the <Exit> button (Picture 34).

Picture 34. 

10)       Contact us:

 

Maia, Luciano Carlos da       -           lucianoc.maia@gmail.com

 

Costa de Oliveira, Antonio   -           acostol@terra.com.br