Do pseudocode of how loader constructs site tags short names #1467

Open
phraenquex opened this issue Jun 20, 2024 · 6 comments

@phraenquex
Collaborator

phraenquex commented Jun 20, 2024

@kaliif Document here the rules by which the loader constructs the short tag names of the various site tags.

Use the style of #1466 (could copy-paste from the original spec in #1277)

@mwinokan use that to update the XCA readme.

@phraenquex phraenquex converted this from a draft issue Jun 20, 2024
@phraenquex
Collaborator Author

@kaliif first needs to know if the tags are now fixed (#1466). That requires Ryan to upload - @max will coordinate.

@mwinokan
Collaborator

A71EV2A is not a great example of the CrystalformSites, and CHIKV_Mac will be better. As CHIKV_Mac is blocked by #1477, @kaliif please go ahead with what you currently have

@kaliif
Collaborator

kaliif commented Jul 11, 2024

Tags generated by target loader:

  • Canon sites:
    Site data:
    A71EV2A-x0379+A+147+1:
      centroid_res: A71EV2A-x0379/A/38/A
      conformer_site_ids: [A71EV2A-x0379+A+147+1]
      global_reference_dtag: A71EV2A-x0528
      reference_conformer_site_id: A71EV2A-x0379+A+147+1
      residues: [A/81/LEU, A/38/ARG, A/90/TYR, A/39/ASP, A/34/GLU, A/95/GLN, A/93/ARG,
        A/96/SER, A/41/LEU, A/37/SER, A/28/TRP, A/33/TRP, A/94/TYR, A/36/SER, A/31/LEU,
        A/35/ASP, A/97/HIS, A/40/LEU, A/20/ARG]
    
    Sites are enumerated (in no particular order): ID = 1,2...
    Tag scheme: f'{ID} - {centroid_res}'
    Ex: 2 - A71EV2A-x0379/A/38/A
  • Conformer sites:
    Site data:
    A71EV2A-x0202+A+201+1:
      members: [A71EV2A-x0202/A/201/1]
      reference_ligand_id: A71EV2A-x0202/A/201/1
      residues: [A/64/TYR, A/65/CYS, A/68/ARG, A/71/HIS, A/67/SER, A/72/TYR, A/70/LYS,
        A/66/SER, A/69/ARG]
    
    Sites are grouped by canon site and enumerated (in no particular order): ID = a,b...
    short name: A71EV2A-x0202+A+201+1 -> A71EV2A-x0202
    Tag scheme: f'{canon site ID}{ID} - {short name}'
    Ex: 8a - A71EV2A-x0202
  • Quat assemblies:
    Assembly data:
    monomer:
        reference: A71_2A_x0090_ref_7_monomer
        biomol: A
        chains: A
    
    Assemblies are enumerated (in no particular order): ID = 1,2...
    Tag scheme: f'A{ID} - {assembly name}'
    Ex: A1 - monomer
  • Crystalforms:
    Site data:
    A71_2A_open:
      xtalform_ref: A71_2A_x0090_ref_7_monomer
      xtalform_space_group: C 1 2 1
      xtalform_cell: {a: 86.98, b: 56.66, c: 32.62, alpha: 90.0, beta: 95.68, gamma: 90.0}
    
    Crystalforms are enumerated (in no particular order): ID = 1,2...
    Tag scheme: f'F{ID} - {crystalform name}'
    Ex: F1 - A71_2A_open
  • Crystalform sites:
    Site data:
    A71EV2A-x0269/A/147/1:
      canonical_site_id: A71EV2A-x0395+A+148+1
      crystallographic_chain: A
      members: [A71EV2A-x0269/A/147/1, A71EV2A-x0152/A/201/1, A71EV2A-x0194/A/147/1,
        A71EV2A-x0202/A/147/1, A71EV2A-x0341/A/147/1, A71EV2A-x0375/A/147/1, A71EV2A-x0395/A/147/1,
        A71EV2A-x0395/A/148/1, A71EV2A-x0831/A/147/1, A71EV2A-x1105/A/147/1, A71EV2A-x1105/A/148/1]
      xtalform_id: A71_2A_open    
    
    chain letter: A71EV2A-x0269/A/147/1 -> A
    Sites are ordered by number of site observations and enumerated alphabetically starting from the chain letter: ID = chain letter,...
    crystalform site name sans version: A71EV2A-x0269/A/147/1 -> A71EV2A-x0269/A/147
    Tag scheme: f'F{crystalform ID}{ID} - {crystalform site name sans version}'
    Ex: F1d - A71EV2A-x0269/A/147
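
The name shortenings and tag schemes listed above can be sketched in Python. This is an illustration only, not the loader's code: the function names are hypothetical, and it assumes conformer site IDs are '+'-separated and crystalform site IDs are '/'-separated, as in the examples. The numeric and letter IDs are passed in rather than computed, since the enumeration order is not fixed.

```python
def conformer_short_name(site_id: str) -> str:
    # 'A71EV2A-x0202+A+201+1' -> 'A71EV2A-x0202' (text before the first '+')
    return site_id.split("+", 1)[0]


def chain_letter(site_id: str) -> str:
    # 'A71EV2A-x0269/A/147/1' -> 'A' (second '/'-separated field)
    return site_id.split("/")[1]


def sans_version(site_id: str) -> str:
    # 'A71EV2A-x0269/A/147/1' -> 'A71EV2A-x0269/A/147' (drop the last field)
    return site_id.rsplit("/", 1)[0]


def canon_site_tag(site_number: int, centroid_res: str) -> str:
    return f"{site_number} - {centroid_res}"


def conformer_site_tag(canon_number: int, letter: str, site_id: str) -> str:
    return f"{canon_number}{letter} - {conformer_short_name(site_id)}"


def crystalform_site_tag(crystalform_number: int, letter: str, site_id: str) -> str:
    return f"F{crystalform_number}{letter} - {sans_version(site_id)}"


print(canon_site_tag(2, "A71EV2A-x0379/A/38/A"))
print(conformer_site_tag(8, "a", "A71EV2A-x0202+A+201+1"))
print(crystalform_site_tag(1, "d", "A71EV2A-x0269/A/147/1"))
```

The three printed tags correspond to the "Ex:" lines for canon sites, conformer sites and crystalform sites.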

@mwinokan
Collaborator

@phraenquex will take a closer look at the pseudocode before it goes into the xca/USER_GUIDE (maybe linked but not embedded directly)

@kaliif
Collaborator

kaliif commented Jul 29, 2024

Tag generation updated (as per the comment in #1482's thread here)

Tags generated by target loader:

  • Canon sites:
    Site data:
    A71EV2A-x0379+A+147+1:
      centroid_res: A71EV2A-x0379/A/38/A
      conformer_site_ids: [A71EV2A-x0379+A+147+1]
      global_reference_dtag: A71EV2A-x0528
      reference_conformer_site_id: A71EV2A-x0379+A+147+1
      residues: [A/81/LEU, A/38/ARG, A/90/TYR, A/39/ASP, A/34/GLU, A/95/GLN, A/93/ARG,
        A/96/SER, A/41/LEU, A/37/SER, A/28/TRP, A/33/TRP, A/94/TYR, A/36/SER, A/31/LEU,
        A/35/ASP, A/97/HIS, A/40/LEU, A/20/ARG]
    
    Sites are enumerated (in no particular order): ID = 1,2...
    Tag scheme: f'{ID} - {name}'
    Ex: 2 - A71EV2A-x0379+A+147+1
  • Conformer sites:
    Site data:
    A71EV2A-x0202+A+201+1:
      members: [A71EV2A-x0202/A/201/1]
      reference_ligand_id: A71EV2A-x0202/A/201/1
      residues: [A/64/TYR, A/65/CYS, A/68/ARG, A/71/HIS, A/67/SER, A/72/TYR, A/70/LYS,
        A/66/SER, A/69/ARG]
    
    Sites are grouped by canon site and enumerated (in no particular order): ID = a,b...
    Tag scheme: f'{canon site ID}{ID} - {name}'
    Ex: 8a - A71EV2A-x0202+A+201+1
  • Quat assemblies:
    Assembly data:
    monomer:
        reference: A71_2A_x0090_ref_7_monomer
        biomol: A
        chains: A
    
    Assemblies are enumerated (in no particular order): ID = 1,2...
    Tag scheme: f'A{ID} - {assembly name}'
    Ex: A1 - monomer
  • Crystalforms:
    Site data:
    A71_2A_open:
      xtalform_ref: A71_2A_x0090_ref_7_monomer
      xtalform_space_group: C 1 2 1
      xtalform_cell: {a: 86.98, b: 56.66, c: 32.62, alpha: 90.0, beta: 95.68, gamma: 90.0}
    
    Crystalforms are enumerated (in no particular order): ID = 1,2...
    Tag scheme: f'F{ID} - {crystalform name}'
    Ex: F1 - A71_2A_open
  • Crystalform sites:
    Site data:
    A71EV2A-x0269/A/147/1:
      canonical_site_id: A71EV2A-x0395+A+148+1
      crystallographic_chain: A
      members: [A71EV2A-x0269/A/147/1, A71EV2A-x0152/A/201/1, A71EV2A-x0194/A/147/1,
        A71EV2A-x0202/A/147/1, A71EV2A-x0341/A/147/1, A71EV2A-x0375/A/147/1, A71EV2A-x0395/A/147/1,
        A71EV2A-x0395/A/148/1, A71EV2A-x0831/A/147/1, A71EV2A-x1105/A/147/1, A71EV2A-x1105/A/148/1]
      xtalform_id: A71_2A_open    
    
    chain letter: A71EV2A-x0269/A/147/1 -> A
    Sites are ordered by number of site observations and enumerated alphabetically starting from the chain letter: ID = chain letter,...
    Tag scheme: f'F{crystalform ID}{ID} - {crystalform site name}'
    Ex: F1d - A71EV2A-x0269/A/147/1
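
As a rough sketch of the updated scheme, where full names are used without shortening, the enumeration could look like the following. The helper names are hypothetical and this is not taken from the loader. It assumes at most 26 conformer sites per canon site. The crystalform site letter is taken as an input, because the ordering rule (by number of observations, starting from the chain letter) is not reproduced here.

```python
from string import ascii_lowercase


def numbered_tags(names, prefix=""):
    # Canon sites use no prefix, quat assemblies 'A', crystalforms 'F'.
    # IDs start at 1 and follow the order in which names are given.
    return [f"{prefix}{i} - {name}" for i, name in enumerate(names, start=1)]


def conformer_site_tags(canon_id, names):
    # Conformer sites within one canon site are lettered a, b, c, ...
    if len(names) > len(ascii_lowercase):
        raise ValueError("this sketch only handles up to 26 conformer sites")
    return [f"{canon_id}{letter} - {name}"
            for letter, name in zip(ascii_lowercase, names)]


def crystalform_site_tag(crystalform_id, letter, name):
    # The letter is supplied by the caller (ordering rule not modelled here).
    return f"F{crystalform_id}{letter} - {name}"


print(numbered_tags(["monomer"], prefix="A"))
print(numbered_tags(["A71_2A_open"], prefix="F"))
print(conformer_site_tags(8, ["A71EV2A-x0202+A+201+1"]))
print(crystalform_site_tag(1, "d", "A71EV2A-x0269/A/147/1"))
```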

@phraenquex phraenquex assigned Waztom and unassigned kaliif and mwinokan Aug 20, 2024
@mwinokan
Collaborator

@Waztom please check this against the new diagram and update documentation (w/ @phraenquex )

@mwinokan mwinokan added the 2024-06-14 mint Data dissemination 2 label Sep 13, 2024
@phraenquex phraenquex added 2024-09-17 olive data curation big items (too big for mint) and removed 2024-03-13 green Data dissemination labels Sep 26, 2024
@mwinokan mwinokan removed the 2024-09-17 olive data curation big items (too big for mint) label Sep 26, 2024
@mwinokan mwinokan added 2024-11-20 mint:docs v2 documentation and removed 2024-06-14 mint Data dissemination 2 labels Nov 20, 2024
Status: v2 Documentation