handle the stop codons, COSMIC errors download, ENSEMBL releases #31

Merged
merged 44 commits into from
Feb 3, 2021
Commits
204115e
handle the stop codons into split proteins instead of removing the st…
ypriverol Jan 26, 2021
7334999
commandline ordered
ypriverol Jan 26, 2021
bf93f1a
better error handling for cosmic downloads
ypriverol Jan 26, 2021
3715155
change command config path
ypriverol Jan 26, 2021
d8fcd94
new downloader for previous versions of ensembl
ypriverol Jan 26, 2021
87faff7
code clean
ypriverol Jan 26, 2021
1811e21
increase package version
ypriverol Jan 26, 2021
d4de5da
ensembl downloader updated
ypriverol Jan 26, 2021
4eb3b49
change command config path
ypriverol Jan 26, 2021
8b0bea8
add changelog.md
ypriverol Jan 26, 2021
e3155f4
added changelog
ypriverol Jan 26, 2021
95e3bf8
changelog actions added
ypriverol Jan 26, 2021
22a7607
changelog added
ypriverol Jan 26, 2021
b74013b
change command config path
ypriverol Jan 26, 2021
38c017b
change command config path
ypriverol Jan 26, 2021
c7579ad
change command config path
ypriverol Jan 26, 2021
894f3a5
Updates version info for
ypriverol Jan 26, 2021
e0f22f4
changelog done
ypriverol Jan 26, 2021
f03350d
Updates version info for
ypriverol Jan 26, 2021
ed03fd1
add changelog actions
ypriverol Jan 26, 2021
6c468b7
Merge remote-tracking branch 'origin/dev' into dev
ypriverol Jan 26, 2021
3c26fc7
Updates version info for
ypriverol Jan 26, 2021
853d672
update README
ypriverol Jan 26, 2021
fbc4a64
README.md updated
ypriverol Jan 26, 2021
fd67ccf
code coverage
ypriverol Jan 26, 2021
c9dfe7d
code coverage
ypriverol Jan 26, 2021
e772df8
added unit tests
ypriverol Jan 26, 2021
2917989
change command config path
ypriverol Jan 26, 2021
010fbce
coverage updated
ypriverol Jan 26, 2021
5effa00
change command config path
ypriverol Jan 26, 2021
fa84cdf
change command config path
ypriverol Jan 26, 2021
376790e
minor examples
ypriverol Jan 26, 2021
3a2da7b
still there
ypriverol Jan 26, 2021
de40c74
change command config path
ypriverol Jan 26, 2021
48ab4e7
change command config path
ypriverol Jan 27, 2021
7895bd8
code clean step
ypriverol Jan 27, 2021
a31dc39
code clean step
ypriverol Jan 27, 2021
78d4e5d
code clean step
ypriverol Jan 27, 2021
c9fdd9a
code clean step
ypriverol Jan 27, 2021
409f4af
code clean step
ypriverol Jan 27, 2021
6a0730b
Merge multiple clean code commits code clean step (+14 squashed commits)
ypriverol Jan 26, 2021
d57521a
Merge remote-tracking branch 'origin/dev' into dev
ypriverol Jan 27, 2021
74a7b9c
added badges
ypriverol Jan 28, 2021
58f30f7
error file open
ypriverol Feb 3, 2021
code clean step
ypriverol committed Jan 27, 2021
commit c9fdd9a65979cb21c3100bc018ad805573d07f4d
11 changes: 6 additions & 5 deletions pypgatk/ensembl/ensembl.py
@@ -250,11 +250,13 @@ def get_altseq(ref_seq, ref_allele, var_allele, var_pos, strand, features_info,
         nc_index = 0
         if len(ref_allele) == len(var_allele) or ref_allele[0] == var_allele[0]:
             for feature in features_info:  # for every exon, cds or stop codon
-                if var_pos in range(feature[0], feature[1] + 1):  # get index of the var relative to the position of the overlapping feature in the coding region
+                if var_pos in range(feature[0], feature[
+                    1] + 1):  # get index of the var relative to the position of the overlapping feature in the coding region
                     var_index_in_cds = nc_index + (var_pos - feature[0])
                     # modify the coding reference sequence accoding to the var_allele
                     c = len(ref_allele)
-                    alt_seq = ref_seq[0:var_index_in_cds] + var_allele + ref_seq[var_index_in_cds + c::]  # variant and ref strand??
+                    alt_seq = ref_seq[0:var_index_in_cds] + var_allele + ref_seq[
+                        var_index_in_cds + c::]  # variant and ref strand??
                     if strand == '-':
                         return ref_seq[::-1], alt_seq[::-1]
                     else:
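The index arithmetic in this hunk can be illustrated with a minimal, standalone sketch. It is not the project's code: `substitute_allele` is a hypothetical helper, features are assumed to be inclusive `(start, end)` tuples listed in the order their bases appear in `ref_seq`, and the way the running offset advances is an assumption, because the hunk does not show how `nc_index` is updated.

```python
def substitute_allele(ref_seq, ref_allele, var_allele, var_pos, features):
    """Replace ref_allele with var_allele at genomic position var_pos.

    features: inclusive (start, end) genomic intervals, in the order
    their bases appear in ref_seq (an assumption of this sketch).
    """
    offset = 0  # coding bases contributed by the features already passed
    for start, end in features:
        # Same test as `var_pos in range(start, end + 1)` for integers.
        if start <= var_pos <= end:
            idx = offset + (var_pos - start)
            # Drop len(ref_allele) bases at idx and splice in var_allele.
            return ref_seq[:idx] + var_allele + ref_seq[idx + len(ref_allele):]
        offset += end - start + 1
    return ref_seq  # the variant lies outside every feature


# Two features covering positions 10-12 and 20-22.
features = [(10, 12), (20, 22)]
snv = substitute_allele("ACGTTT", "T", "G", 21, features)
insertion = substitute_allele("ACGTTT", "C", "CAA", 11, features)
```

Here `snv` changes the fifth base of the concatenated sequence, and `insertion` lengthens the sequence by two bases. Strand handling (the reversal in the `strand == '-'` branch) is left out of the sketch.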
@@ -277,13 +279,12 @@ def parse_gtf(gene_annotations_gtf, gtf_db_file):
                                keep_order=True, disable_infer_transcripts=True, disable_infer_genes=True,
                                verbose=True,
                                force=False)
-        except:  # already exists
+        except Exception as e:  # already exists
             print("Databae already exists", gtf_db_file)

         db = gffutils.FeatureDB(gtf_db_file)
         return db

-
     @staticmethod
     def get_features(db, feature_id, biotype_str, feature_types=None):
         """
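The `parse_gtf` hunk above swaps a bare `except:` for `except Exception as e:`. A small sketch of what that buys, using hypothetical names (`create_or_reuse`, `create`) rather than anything from pypgatk or gffutils: `Exception` does not cover `KeyboardInterrupt` or `SystemExit`, so an interrupt is no longer swallowed, and binding the exception makes it possible to report the actual cause instead of assuming the file already exists.

```python
import logging

logger = logging.getLogger(__name__)


def create_or_reuse(path, create):
    """Call create(path); on an ordinary error, log the cause and carry on.

    `create` is a hypothetical callable standing in for whatever builds
    the database file.
    """
    try:
        create(path)
        return True
    except Exception as exc:  # KeyboardInterrupt and SystemExit still propagate
        logger.warning("Could not create %s (%s); reusing the existing file", path, exc)
        return False


def already_exists(path):
    raise ValueError("database already exists")


def interrupted(path):
    raise KeyboardInterrupt()


reused = create_or_reuse("example.db", already_exists)
```

A caller that needs to tell "already exists" apart from other failures would still have to inspect the exception; the sketch only logs it.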
@@ -531,7 +532,7 @@ def vcf_to_proteindb(self, vcf_file, input_fasta, gene_annotations_gtf):
             except (ValueError, IndexError):
                 msg = "Could not extra cds position from fasta header for: {}".format(desc)
                 self.get_logger().debug(msg)
-                pass
+
             chrom, strand, features_info, feature_biotype = self.get_features(db, transcript_id_v,
                                                                               self._biotype_str,
                                                                               feature_types)