# Getting Started with Genomic Data
## What is sequencing data?
Sequencing data is the result of determining the precise order of nucleotides (the building blocks of DNA or RNA) in a sample. This data provides a detailed map of genetic information, which is essential to fields such as genetics, genomics, and molecular biology.

### Key Aspects of Sequencing Data

- **Nucleotides**
  - **DNA:** Composed of four nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T).
  - **RNA:** Composed of four nucleotides: adenine (A), cytosine (C), guanine (G), and uracil (U), which replaces thymine.
- **Sequencing**
  - **Process:** Sequencing involves determining the exact order of these nucleotides in a DNA or RNA molecule. This is achieved using various sequencing technologies.
- **Output data**
  - **Raw data:** Initial data generated by the sequencing process, which may include sequence reads, quality scores, and other metrics.
  - **Processed data:** Includes aligned sequences, variant calls (e.g., mutations), and functional annotations.

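
To make the ideas of nucleotide alphabets and raw reads with quality scores concrete, here is a minimal Python sketch. It assumes raw reads are stored as four-line FASTQ records with Phred+33 quality encoding, which is a common convention but is not something this chapter specifies. The function names and the example read are hypothetical.

```python
from typing import Dict, List, Union


def dna_to_rna(dna: str) -> str:
    """Rewrite a DNA sequence in the RNA alphabet by replacing T with U."""
    return dna.upper().replace("T", "U")


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a quality string into one Phred score per base.

    Assumes Phred+33 encoding: each character's ASCII code minus 33.
    """
    return [ord(ch) - offset for ch in quality]


def parse_fastq_record(lines: List[str]) -> Dict[str, Union[str, List[int]]]:
    """Parse one four-line FASTQ record (header, sequence, '+', quality)."""
    if len(lines) != 4:
        raise ValueError("a FASTQ record has exactly four lines")
    header, sequence, _separator, quality = (line.strip() for line in lines)
    if not header.startswith("@"):
        raise ValueError("FASTQ header must start with '@'")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality must have the same length")
    return {
        "id": header[1:],
        "sequence": sequence,
        "quality": phred_scores(quality),
    }


# Hypothetical example read, not real data.
record = parse_fastq_record(["@read_1", "GATTACA", "+", "IIIIH#5"])
print(record["id"], record["sequence"], record["quality"])
print(dna_to_rna(record["sequence"]))
```

A higher Phred score means the sequencer was more confident in that base call, which is why quality scores travel alongside the reads in raw data.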

### How is sequencing data created, and what are the different types?

- **Raw sequence reads**
  - **Short reads:** Small fragments of DNA or RNA sequence, typically ranging from 50 to 300 base pairs (bp).
  - **Long reads:** Longer fragments, typically thousands to tens of thousands of base pairs and in some cases more than a million, providing more comprehensive sequence information.
- **Aligned sequences:** Sequences that have been aligned to a reference genome or transcriptome. Alignment identifies where each read fits within the larger genetic context.
- **Variant data:** Information about genetic variations, such as single nucleotide polymorphisms (SNPs), insertions, deletions, and structural variations, relative to a reference genome.
- **Expression data (from RNA-Seq):** Quantitative data on gene expression levels, showing how much of each gene is transcribed into RNA.

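
The processed data types above can be illustrated with a toy Python sketch. It assumes a read has already been placed on the reference at a known start position with no insertions or deletions, and that genes are given as half-open coordinate intervals. Real pipelines rely on dedicated alignment, variant-calling, and quantification tools; the sequences, gene names, and function names below are hypothetical.

```python
from typing import Dict, List, Tuple


def find_mismatches(reference: str, read: str, start: int) -> List[Tuple[int, str, str]]:
    """List positions where a gap-free aligned read differs from the reference.

    Returns (reference_position, reference_base, read_base) tuples, which are
    candidate single-nucleotide variants in this simplified setting.
    """
    if start < 0 or start + len(read) > len(reference):
        raise ValueError("read does not fit inside the reference")
    mismatches = []
    for offset, read_base in enumerate(read):
        ref_base = reference[start + offset]
        if read_base != ref_base:
            mismatches.append((start + offset, ref_base, read_base))
    return mismatches


def count_reads_per_gene(
    read_starts: List[int], genes: Dict[str, Tuple[int, int]]
) -> Dict[str, int]:
    """Count reads whose start position falls inside each gene interval.

    Gene intervals are half-open: (start, end) covers start <= pos < end.
    """
    counts = {name: 0 for name in genes}
    for pos in read_starts:
        for name, (gene_start, gene_end) in genes.items():
            if gene_start <= pos < gene_end:
                counts[name] += 1
    return counts


# Hypothetical toy data.
reference = "ACGTACGTAC"
variants = find_mismatches(reference, "TACCTA", start=3)
print(variants)  # [(6, 'G', 'C')]

genes = {"geneA": (0, 5), "geneB": (5, 10)}
expression = count_reads_per_gene([0, 1, 3, 6, 7], genes)
print(expression)  # {'geneA': 3, 'geneB': 2}
```

A single mismatching read is only weak evidence of a variant, since it could equally be a sequencing error. In practice, evidence is combined across many overlapping reads together with their quality scores.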

Sequencing data has several major applications:

- **Genomics:** Studying entire genomes to understand genetic variation, evolutionary relationships, and disease mechanisms.
- **Transcriptomics:** Analyzing RNA sequences to study gene expression patterns, identify novel transcripts, and understand regulatory mechanisms.
- **Metagenomics:** Investigating genetic material from environmental samples to study microbial communities and their functions.
- **Personalized medicine:** Using genetic information to tailor medical treatments to individual patients based on their unique genetic makeup.

## What is PII and PHI and why is it important? Why is genomic data often protected?

**PII (Personally Identifiable Information)** is any information that can be used to identify an individual. Examples include names, social security numbers, addresses, phone numbers, email addresses, and other personal details. PII matters because unauthorized access to or disclosure of this information can lead to identity theft, fraud, and other forms of exploitation.

**PHI (Protected Health Information)** is a subset of PII specific to the healthcare industry. It includes any health information that can be linked to a specific individual, such as medical records, health history, treatment plans, and other health-related data. PHI matters because improper handling of it can result in breaches of patient confidentiality, misuse of sensitive health data, and violations of privacy rights.

Protecting PII and PHI is crucial for several reasons. Both contain sensitive details about individuals, and safeguarding them maintains privacy and prevents unauthorized access or misuse. PII such as social security numbers or financial data can be exploited for identity theft or fraud. PHI includes sensitive health data that, if improperly disclosed, could result in stigma, discrimination, or personal distress, so protecting it is key to preserving individuals' dignity and autonomy. Genomic data is often protected for the same reasons: a person's genome is unique to them and also carries information about their biological relatives, so it can be both identifying and health-related.

Laws and regulations such as GDPR in Europe and HIPAA in the U.S. require robust protection of PII and PHI, and non-compliance can lead to significant legal and financial repercussions. Handling this information responsibly also fosters trust with customers, clients, and patients, which is vital for maintaining strong relationships and a positive reputation. Finally, strong safeguards help keep sensitive data secure from cyber threats and malicious actors.

See this course: https://hutchdatascience.org/Ethical_Data_Handling_for_Cancer_Research/

## What is an IRB?

An Institutional Review Board (IRB) is a committee that reviews research involving human participants. IRBs:

- Evaluate research proposals to ensure that they adhere to ethical principles, including respect for persons, beneficence (maximizing benefits and minimizing harm), and justice (fairness in research practices).
- Assess whether the informed consent process is adequate, ensuring that participants are fully informed about the nature, risks, and benefits of the research before agreeing to participate.
- Review the risk-to-benefit ratio of research studies, aiming to minimize potential harm to participants and ensuring that any risks are justified by the potential benefits of the research.
- Ensure that research complies with relevant regulations and guidelines, such as those set forth by the U.S. Department of Health and Human Services (HHS) and the Food and Drug Administration (FDA), as well as institutional policies.
- Monitor ongoing research to ensure continued compliance with ethical standards, and review any changes to the research protocol that might affect participant welfare.

Overall, the IRB plays a crucial role in protecting the rights and welfare of research participants, maintaining the integrity of the research process, and ensuring that research is conducted responsibly and ethically.