Skip to content

hash-overlapper for AMOS. project 4 for bioinformatics algorithms class

Notifications You must be signed in to change notification settings

rabbitXIII/cmsc423-proj4

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 

Repository files navigation

Project 4
Rohit Gopal

CMSC423 0101


The files included for this submission are: 
proj4_rgopal.pl 
README 
run_pipeline.sh

For everything all inclusive:

	Assumptions:
		Current directory contains..

		proj4_rgopal.pl 
		toAmos_new 
		bank-transact 
		tigger 
		make-consensus 
		bank2fasta 


	Command:
		./run_pipeline [INPUT_SEQUENCES] [OUTPUT_FASTA]
		
		INPUT_SEQUENCES is a file of many reads
		OUTPUT_FASTA is the resulting finished consensus 
	

To Run the Project:

	Assumptions:
		HOXD2 Matrix
		proj4_rgopal.pl
		Fasta file with many sequences	

	Command:

	./proj4_rgopal.pl [sequences fasta] [hoxd2 file] [output ovl] [gap_start] [gap_extend] [minimum idenitity%] [min overlap between seqeunces] [max hang]


	For example:

		./proj4_rgopal.pl ../amos/small/crp177.seq ../cmsc423-proj3/testcases/HOXD2.txt my_output.ovl -2000 -200 98 45 60

If you so choose, you can also run the script with only the first three fields. The default values will
kick in as follows:

gap_start	-2000
gap_extend	-200
min identity%	98
min overlap	45
max hang	60	

Notes:
This script is VERY slow... the SW Algorithm is likely very slow and needs to be optimized
and some of the data structures are used out of convenience, but could impact performance
 It doesn't help that perl isn't too quick either.
One option would be to just spawn processes every time we want to do a SW Alignment.  The
only issue here was that Parallel::ForkManager couldn't be used, since Parallel doesn't 
seem to be installed on the grace servers. :(

About

hash-overlapper for AMOS. project 4 for bioinformatics algorithms class

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published