adds synteny info to GSTF prep tool #148

Open
wants to merge 28 commits into
base: master
28 commits
7bd0304
adds code to GSTF prep tool to create synteny table
Jun 9, 2021
470fbb0
fixes travis errors
Jun 9, 2021
d89c3a3
fixes travis errors
Jun 9, 2021
46a2904
fixes type and repopulate dbs
Jun 10, 2021
f4afedd
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
259f618
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
f4b879f
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
be798d2
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
5e05764
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
6284382
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
e807587
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
45c54f4
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
94cb589
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
36ba71b
Update tools/gstf_preparation/gstf_preparation.py
Jun 10, 2021
6248c7d
Replace tab with spaces
nsoranzo Jun 10, 2021
9b8b67f
Fix syntenic_region_id
nsoranzo Jun 10, 2021
24048e2
Update tools/gstf_preparation/gstf_preparation.py
Jun 11, 2021
ddb0008
Update gstf_preparation.py
Jun 11, 2021
6d589bd
Update tools/gstf_preparation/gstf_preparation.py
Jun 11, 2021
eb38dd4
Update tools/gstf_preparation/gstf_preparation.py
Jun 11, 2021
9c5dcb8
Update tools/gstf_preparation/gstf_preparation.py
Jun 11, 2021
386d4f5
refactors populate_synteny() function
Jun 14, 2021
15c177e
fixes Flake8 error
Jun 14, 2021
157af65
Update gstf_preparation.py
Jun 14, 2021
280eff5
Update gstf_preparation.py
Jun 14, 2021
d322c35
Adds separate cursor for to update syntenic region table in tools/gst…
Jun 15, 2021
19c1812
Deletes fetch_genes_by_order in tools/gstf_preparation/gstf_preparati…
Jun 15, 2021
7b0fcc9
Uses second cursor to update syntenic_region table and inline codes t…
Jun 15, 2021
56 changes: 56 additions & 0 deletions tools/gstf_preparation/gstf_preparation.py
@@ -92,6 +92,12 @@ def create_tables(conn):
        FROM transcript JOIN gene
        USING (gene_id)''')

    cur.execute('''CREATE TABLE syntenic_region (
        species VARCHAR NOT NULL,
        syntenic_region_name VARCHAR NOT NULL,
        gene_id VARCHAR NOT NULL REFERENCES gene(gene_id),
        order_number INTEGER NOT NULL)''')

    conn.commit()
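
A note on the `REFERENCES gene(gene_id)` clause: SQLite parses foreign-key constraints but only enforces them when `PRAGMA foreign_keys` is switched on for the connection. Whether the tool enables the pragma is not visible in this diff. The sketch below shows the difference on an in-memory database; the one-column `gene` table is a stand-in (an assumption, not the tool's real schema), and `make_db` / `insert_orphan` are hypothetical helper names.

```python
import sqlite3


def make_db(enforce_foreign_keys):
    """Create an in-memory database with a stand-in gene table (assumed schema)."""
    conn = sqlite3.connect(':memory:')
    if enforce_foreign_keys:
        # SQLite enforces REFERENCES constraints only when this pragma is on.
        conn.execute('PRAGMA foreign_keys = ON')
    conn.execute('CREATE TABLE gene (gene_id VARCHAR PRIMARY KEY NOT NULL)')
    conn.execute('''CREATE TABLE syntenic_region (
        species VARCHAR NOT NULL,
        syntenic_region_name VARCHAR NOT NULL,
        gene_id VARCHAR NOT NULL REFERENCES gene(gene_id),
        order_number INTEGER NOT NULL)''')
    return conn


def insert_orphan(conn):
    """Insert a row whose gene_id is absent from gene; return True if accepted."""
    try:
        conn.execute(
            'INSERT INTO syntenic_region (species, syntenic_region_name, gene_id, order_number) VALUES (?, ?, ?, ?)',
            ('species_a', 'chr1', 'no_such_gene', 1))
        return True
    except sqlite3.IntegrityError:
        # Raised only when foreign-key enforcement is enabled.
        return False
```

With enforcement off, `insert_orphan(make_db(False))` accepts the orphan row; with `make_db(True)` the same insert is rejected. If the tool relies on the constraint to catch bad gene IDs, the pragma would have to be set on the connection.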


@@ -308,6 +314,54 @@ def remove_id_version(s, force=False):
    return s


def fetch_genomes(conn):
    """
    Fetch all the genomes from the database.
    """
    cur = conn.cursor()

    cur.execute('SELECT DISTINCT species FROM gene')

    return cur.fetchall()


def fetch_seq_region_names(conn, genome):
    """
    Fetches all the sequence region names for a genome.
    """

    cur = conn.cursor()

    cur.execute('SELECT DISTINCT seq_region_name FROM gene WHERE species=?',
                (genome, ))

    return cur.fetchall()


def populate_synteny(conn):
    """
    Populates the syntenic_region table.
    """

    cur = conn.cursor()
    cur2 = conn.cursor()

    for genome in fetch_genomes(conn):
        species = genome['species']
        for row in fetch_seq_region_names(conn, species):
            seq_region_name = row['seq_region_name']
            cur.execute(
                'SELECT gene_id FROM gene WHERE species=? AND seq_region_name=? ORDER BY seq_region_start ASC',
                (species, seq_region_name)
            )
            for order_number, gene in enumerate(cur, start=1):
                cur2.execute(
                    'INSERT INTO syntenic_region (syntenic_region_name, gene_id, species, order_number) VALUES (?, ?, ?, ?)',
                    (seq_region_name, gene["gene_id"], species, order_number)
                )
    conn.commit()


def __main__():
    parser = optparse.OptionParser()
    parser.add_option('--gff3', action='append', default=[], help='GFF3 file to convert, in SPECIES:FILENAME format. Use multiple times to add more files')
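
For trying out the ordering logic of `populate_synteny()` in isolation, here is a minimal self-contained sketch on an in-memory database. It is not the tool's code: the `gene` table is reduced to the four columns the queries touch, the gene IDs and coordinates are made up, and the two helper lookups are folded into a single `SELECT DISTINCT species, seq_region_name` for brevity. It also assumes a name-aware row factory such as `sqlite3.Row`, which name-based access like `genome['species']` needs.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a reduced gene table (assumed columns)."""
    conn = sqlite3.connect(':memory:')
    # Access such as row['species'] needs a name-aware row factory.
    conn.row_factory = sqlite3.Row
    conn.execute('''CREATE TABLE gene (
        gene_id VARCHAR PRIMARY KEY NOT NULL,
        species VARCHAR NOT NULL,
        seq_region_name VARCHAR NOT NULL,
        seq_region_start INTEGER NOT NULL)''')
    conn.execute('''CREATE TABLE syntenic_region (
        species VARCHAR NOT NULL,
        syntenic_region_name VARCHAR NOT NULL,
        gene_id VARCHAR NOT NULL REFERENCES gene(gene_id),
        order_number INTEGER NOT NULL)''')
    # Hypothetical genes, deliberately inserted out of positional order.
    conn.executemany(
        'INSERT INTO gene VALUES (?, ?, ?, ?)',
        [
            ('geneC', 'species_a', 'chr1', 900),
            ('geneA', 'species_a', 'chr1', 100),
            ('geneB', 'species_a', 'chr1', 500),
            ('geneD', 'species_a', 'chr2', 50),
            ('geneE', 'species_b', 'chr1', 10),
        ])
    conn.commit()
    return conn


def populate_synteny_demo(conn):
    """Number the genes of each (species, seq_region_name) pair by start position."""
    read_cur = conn.cursor()
    # Separate cursor for writes: executing on read_cur would discard its pending rows.
    write_cur = conn.cursor()
    regions = conn.execute(
        'SELECT DISTINCT species, seq_region_name FROM gene').fetchall()
    for region in regions:
        read_cur.execute(
            'SELECT gene_id FROM gene WHERE species=? AND seq_region_name=? ORDER BY seq_region_start ASC',
            (region['species'], region['seq_region_name']))
        for order_number, gene in enumerate(read_cur, start=1):
            write_cur.execute(
                'INSERT INTO syntenic_region (syntenic_region_name, gene_id, species, order_number) VALUES (?, ?, ?, ?)',
                (region['seq_region_name'], gene['gene_id'], region['species'], order_number))
    conn.commit()


def ordered_genes(conn, species, seq_region_name):
    """Return the gene IDs of one region, sorted by their stored order_number."""
    rows = conn.execute(
        'SELECT gene_id FROM syntenic_region WHERE species=? AND syntenic_region_name=? ORDER BY order_number',
        (species, seq_region_name)).fetchall()
    return [row['gene_id'] for row in rows]
```

After `populate_synteny_demo(build_demo_db())`, the three `chr1` genes of `species_a` come back from `ordered_genes` in start-position order (`geneA`, `geneB`, `geneC`), and numbering restarts at 1 for every other species/region pair.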
@@ -481,6 +535,8 @@ def __main__():
        else:
            entry.print(output_fasta_file)

    populate_synteny(conn)

    conn.close()


Binary file modified tools/gstf_preparation/test-data/test1.sqlite
Binary file not shown.
Binary file modified tools/gstf_preparation/test-data/test4.sqlite
Binary file not shown.