Genetic association study
in chronic lymphocytic leukemia
Angel Carracedo
Fundación Gallega de Medicina Genómica
(SERGAS)
CeGen-ISCIII Universidad de Santiago
XXXV MEETING OF THE GALICIAN ASSOCIATION OF
HEMATOLOGY AND HEMOTHERAPY. FERROL, MARCH 2011
Biological Complex Systems
• Why is it so complicated?
• Can we make sense of this complexity?
• Can we convey our understanding of this
complexity?
Systems
biology
Genotype
Environment
Genes
Phenotype
microRNA
Environment
Epigenetics
Proteome
Transcriptome
Better understanding of Mendelian and
complex disease
Genes
40%
Environment
60%
Better classification of diseases
Risk stratification
Pharmacogenetics and pharmacogenomics
CONTENT
- How to look for the genetic component of a disease
- An example with CLL
How to look for low penetrance genes?
Allelic heterogeneity
Locus heterogeneity
Phenocopy
Phenotypic variability
Trait heterogeneity
Gene-gene interactions
Gene-environment
interactions
Genetic Strategies
Traditional (from the 1980s or earlier)
– Linkage analysis on pedigrees
– Allele-sharing methods: candidate genes, genome screen
– Association studies: candidate genes
– Animal models: identifying candidate genes
Newer (from the 1990s)
– Focus on special populations; haplotype-sharing (Jesus M.
Hernández-Fam. Gam.)
– Congenic/consomic lines in mice (new for complex traits); animal models
– Single-nucleotide polymorphisms (SNPs): whole-genome
scans (association studies)
– Admixture mapping
– Functional analyses: finding candidate genes
Linkage analysis or association studies ?
• Linkage analysis is usually more robust in the identification
of Mendelian traits
• Association studies have more power to detect genes with
small effects (Risch & Merikangas, Science 1996)
[Figure: magnitude of effect plotted against frequency of trait in the population and obtainable sample size, contrasting linkage analysis of families with association studies in populations]
Human Genetic Association Study Design
[Figure: individuals with Phenotype A or Phenotype B, each carrying Allele 1 or Allele 2 of SNP A]
SNP A is associated with Phenotype
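As a minimal sketch of how such an association can be tested, the allele counts in the two phenotype groups can be compared with a chi-square test on a 2x2 table. The counts below are hypothetical and the function name `allelic_association` is only illustrative; real studies would also check genotype-based models and data quality.

```python
import math

def allelic_association(case_a1, case_a2, ctrl_a1, ctrl_a2):
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value, odds_ratio) for allele 1 in cases versus controls.
    """
    table = [[case_a1, case_a2], [ctrl_a1, ctrl_a2]]
    total = case_a1 + case_a2 + ctrl_a1 + ctrl_a2
    row_sums = [sum(row) for row in table]
    col_sums = [case_a1 + ctrl_a1, case_a2 + ctrl_a2]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the chi-square tail probability is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_a1 * ctrl_a2) / (case_a2 * ctrl_a1)
    return chi2, p_value, odds_ratio

# Hypothetical counts: 500 cases and 500 controls, two alleles per person.
chi2, p, odds = allelic_association(case_a1=600, case_a2=400,
                                    ctrl_a1=500, ctrl_a2=500)
print(f"chi2={chi2:.2f}  p={p:.2e}  OR={odds:.2f}")
```

An odds ratio above 1 means allele 1 is over-represented in cases; the p-value still has to survive the multiple-comparison thresholds discussed later in the deck.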
SNP: SINGLE NUCLEOTIDE POLYMORPHISM
ATCGGCGTACCTGATTCCGAATCCGTATCG
ATCGGCGTACCTGAATCCGAATCCGTATCG
• 1,000,000 SNPs
1,000 PEOPLE × 1,000,000 SNPs = 1 BILLION ANALYSES AND DATA POINTS
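A small sketch, using the two sequences shown on the slide, locates the single differing base and reproduces the data-volume arithmetic. The variable names are illustrative only.

```python
seq_a = "ATCGGCGTACCTGATTCCGAATCCGTATCG"
seq_b = "ATCGGCGTACCTGAATCCGAATCCGTATCG"

# Positions (1-based) where the two aligned sequences carry different bases.
differences = [
    (pos, a, b)
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
    if a != b
]
print(differences)

# Data volume for a genome-wide scan: one genotype call per person per SNP.
people = 1_000
snps_per_person = 1_000_000
genotypes = people * snps_per_person
print(f"{genotypes:,} genotype calls")
```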
Characteristics of SNP Variation
• Clustering is observed on all the autosomes: haplotype blocks, i.e. blocks with little evidence of recombination
• Some clusters appear functional: MHC on chromosome 6 (with extensive replication)
Gabriel et al. Science, 296, 2002
[Figure: LD blocks (little or no recombination) separated by recombination hotspots; 1 Mb windows, genetic distance (cM) against physical distance (Mb)]
HapMap (2002)
• Catalogue of variation at Single
nucleotide polymorphisms (SNPs)
genome-wide in different populations
• Touted for disease gene identification via
linkage disequilibrium mapping
• ‘Tag’ SNPs can cover whole genome
• Reduction of SNPs required to examine
the entire genome for association with a
phenotype from 20 million to 1,000,000
tagSNPs
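The tagging idea can be sketched as follows: compute pairwise linkage disequilibrium (r²) between SNPs and greedily keep one SNP per group of highly correlated markers. This is an illustrative toy under stated assumptions, not the HapMap procedure itself; the haplotypes, the 0.8 threshold and the function names are all hypothetical.

```python
from itertools import combinations

def r_squared(x, y):
    """Squared correlation between two biallelic SNPs coded 0/1 per haplotype."""
    n = len(x)
    p_x, p_y = sum(x) / n, sum(y) / n
    p_xy = sum(a * b for a, b in zip(x, y)) / n
    d = p_xy - p_x * p_y                      # linkage disequilibrium coefficient D
    denom = p_x * (1 - p_x) * p_y * (1 - p_y)
    return 0.0 if denom == 0 else d * d / denom

def pick_tag_snps(haplotypes, threshold=0.8):
    """Greedy selection: repeatedly keep the SNP that tags the most untagged SNPs."""
    names = list(haplotypes)
    tagged_by = {s: {s} for s in names}
    for a, b in combinations(names, 2):
        if r_squared(haplotypes[a], haplotypes[b]) >= threshold:
            tagged_by[a].add(b)
            tagged_by[b].add(a)
    untagged, tags = set(names), []
    while untagged:
        best = max(names, key=lambda s: len(tagged_by[s] & untagged))
        tags.append(best)
        untagged -= tagged_by[best]
    return tags

# Hypothetical haplotypes: one list entry per chromosome, 1 = minor allele.
haplotypes = {
    "snp1": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp2": [0, 0, 1, 1, 0, 1, 0, 1],   # identical to snp1 (r^2 = 1)
    "snp3": [1, 1, 0, 0, 1, 0, 1, 0],   # mirror image of snp1 (r^2 = 1)
    "snp4": [0, 1, 0, 1, 0, 1, 0, 1],   # only weakly correlated with the others
}
tags = pick_tag_snps(haplotypes)
print(tags)
```

Here three of the four SNPs sit in one block, so two tag SNPs capture all four; the same logic, applied genome-wide, is what reduces the number of markers that need to be typed.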
Spanish National Genotyping Center
CeGen-ISCIII
Scientific International Committee
Ethical International Committee
Coordination
NODE 1
Barcelona
(CRG)
NODE 2
Santiago de
Compostela (USC)
NODE 3
Madrid
(CNIO)
Illumina
Sequenom /
Illumina
Affymetrix
ASSOCIATION STUDIES CARRIED OUT IN CEGEN
[Bar chart, vertical axis 0 to 30: studies by field: cancer, psychiatry, neurology, endocrinology-metabolism, rheumatology, ophthalmology, cardiovascular, others]
2005: 55 PROJECTS
2006: 75 PROJECTS
2007: 114 PROJECTS
2008: 135 PROJECTS
2009: 4 GWAS
2010: 15 GWAS
Association studies
Candidate gene approach
-Causative hypothesis or
candidate genes
Genome wide analysis (GWAs)
-No need of gene selection
-Lack of bias towards specific
genes
Both approaches are complementary
OXALIPLATIN
Previous case-control (association) studies to
identify common, low-penetrance cancer
genes
• Many small-scale studies in past, candidate
genes
• Many positive reports
• A priori p(false+) >>> p(true+)
• Publication bias, failure to match cases and
controls/population stratification, lack of
correction for multiple comparisons, lack of
replication
Correction for multiple comparisons
P < 10⁻⁷ required
TYPE I ERRORS: Population stratification
EPICOLON GWAS
PCA analysis on genotypes: checked genotyping dates, geographical
origin, Nsp-Sty and collection hospital
[PCA plot; sample labels shown: Meixoeiro (N=366), Donosti (N=167), N=944]
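A minimal sketch of a PCA check for stratification, assuming NumPy is available and using simulated genotypes rather than the EPICOLON data: individuals from two populations with different allele frequencies separate along the first principal component.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two populations of 50 individuals typed at 500 SNPs,
# with allele frequencies that differ between the populations.
n_per_pop, n_snps = 50, 500
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, n_snps), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, freq_a, size=(n_per_pop, n_snps)),
    rng.binomial(2, freq_b, size=(n_per_pop, n_snps)),
]).astype(float)

# Centre each SNP, then take principal components from the SVD.
centred = geno - geno.mean(axis=0)
u, s, _ = np.linalg.svd(centred, full_matrices=False)
pcs = u * s                      # individual coordinates on each component
pc1 = pcs[:, 0]
explained = s**2 / np.sum(s**2)

print("PC1 mean, population A:", pc1[:n_per_pop].mean())
print("PC1 mean, population B:", pc1[n_per_pop:].mean())
print("variance explained by PC1:", explained[0])
```

In a real study the same coordinates would be coloured by candidate confounders (genotyping date, geographical origin, collection hospital) to see whether any of them drives the leading components.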
Type I errors: random
Corrections for multiple comparisons (p = 0.01: 1 false positive every 100 comparisons)
• Bonferroni method
$P_{cor} = 1-(1-P_{noncor})^{n}$; new significance level = $\alpha/n$, where $n$ is the number of comparisons
- Very conservative
- Assumption of independence
• Permutations (the most commonly used method; computationally intensive!)
• Other methods:
-False discovery rate (FDR)
-Sum Statistics
-Single Nucleotide Polymorphism Spectral
Decomposition
-Others
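The corrections listed above can be sketched in a few lines. The first two functions follow the formulas on the slide (the form 1-(1-P)^n is often called the Šidák correction); the third is the Benjamini-Hochberg step-up procedure, one common way of controlling the false discovery rate. The function names and example p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error near alpha."""
    return alpha / n_tests

def corrected_p(p_uncorrected, n_tests):
    """P_cor = 1 - (1 - P_noncor)^n, assuming independent tests."""
    return 1.0 - (1.0 - p_uncorrected) ** n_tests

def benjamini_hochberg(p_values, q=0.05):
    """Indices of tests declared significant at false discovery rate q."""
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    m = len(p_values)
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank            # largest rank passing the step-up rule
    return sorted(order[:cutoff_rank])

threshold = bonferroni_threshold(0.05, 1_000_000)
p_cor = corrected_p(1e-8, 1_000_000)
significant = benjamini_hochberg(
    [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
)
print(threshold, p_cor, significant)
```

With one million SNPs, a nominal 0.05 level corresponds to a per-test threshold of 5e-8, which is why genome-wide studies demand such small p-values.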
T. Manolio/ N Engl J Med 2010;363:166-76
Whole-genome association analysis
1 million
Genome-wide association study
(GWAS) to identify low-penetrance
genes
• Require many (>1000) cases and
controls (but not always)-Consortia
• Can improve power by selecting cases
(early-onset, familial) and controls
(cancer-free)
• Search for alleles or genotypes overrepresented in cases
• Verify in other sample sets
Chronic lymphocytic leukemia
Chronic lymphocytic leukemia accounts for ~25% of all
leukemia and is the most common form of lymphoid
malignancy in Western countries.
Despite a strong familial basis to CLL, with risks in first-degree relatives of cases being increased ~8-fold, to date the
inherited genetic basis of the disease is largely unknown.
All association studies with candidate genes have given inconsistent results
CLL
299,983 tagging SNPs
Stage 1: 505 cases and 1,438 controls (UK/Spain)
Stage 2: 180 SNPs in 540 UK cases
Stage 3: 19 SNPs
UK replication series 2 (660 cases, 809 controls)
Spanish replication series (424 cases, 450 controls).
Stage 4, 10 SNPs with the strongest association from a
combined analysis of stages 1–3 in a Swedish replication
series (395 cases, 397 controls)
T. Manolio/ N Engl J Med 2010;363:166-76
D. Crowther-Swanepoel, A. Vega, K. Smedby, C. Ruiz-Ponte, J. Jurlander,
E. Campo, A. Carracedo, R. Houlston. British Journal of Haematology, 2010
Cumulative impact of 10 common genetic
variants on colorectal cancer risk in 42,333
individuals from eight populations (Lancet, in
press)
This study demonstrates that
population subgroups can be
identified with a predicted
absolute CRC risk sufficiently
high as to merit
surveillance/intervention,
although individualized CRC
risk profiling is not currently
feasible. Nonetheless, the
findings provide the first
tangible evidence of public
health relevance for data
from genome-wide studies in
CRC
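A rough sketch of how a cumulative effect of several common variants can be summarised: count the risk alleles a person carries and, under a simple multiplicative model, combine per-allele odds ratios. The ten rsIDs are those this deck lists as previously associated with CRC risk; the genotypes and the odds ratio of 1.1 are placeholders, not values from the study.

```python
CRC_SNPS = ["rs6983267", "rs4779584", "rs4939827", "rs3802842", "rs10795668",
            "rs16892766", "rs4444235", "rs9929218", "rs10411210", "rs961253"]

def risk_allele_count(genotype):
    """Total number of risk alleles (0, 1 or 2 per SNP) carried across the panel."""
    return sum(genotype[snp] for snp in CRC_SNPS)

def relative_odds(genotype, per_allele_or):
    """Odds relative to a non-carrier, multiplying per-allele odds ratios."""
    odds = 1.0
    for snp in CRC_SNPS:
        odds *= per_allele_or[snp] ** genotype[snp]
    return odds

# Hypothetical person and a placeholder odds ratio of 1.1 per risk allele.
person = {snp: 1 for snp in CRC_SNPS}
person["rs6983267"] = 2
placeholder_or = {snp: 1.1 for snp in CRC_SNPS}

n_risk = risk_allele_count(person)
rel = relative_odds(person, placeholder_or)
print(n_risk, round(rel, 3))
```

Individually small effects compound across loci, which is how subgroups with a markedly higher predicted risk can emerge even though no single variant is informative on its own.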
Spanish data GWAS
Birdsuite uses two different approaches for CNV detection:
- Canary: 1500 probes directed to CN polymorphisms (as described in the Human Variation Database browser)
- Birdseye: CNV detection
These data were also analysed with CNVAssoc
Preliminary results pending stratification correction
From tagSNP to causal variation …..
Why is this important?
• Population portability
• Targeted interventions
• Learn more about how cancer develops
Plan: Resequencing
• Check information from WGS
Nature last week: Identified the first CLL genetic mutations through NGS
NGS:
SOLiD 4 System
Throughput: Up to 100 Gb/run
Fragment length:
Fragment: 50 bp
Mate-pair: 2 x 50 bp
Paired-end: 50 x 25 bp
Multiplexing:
96 DNA barcodes
48 RNA barcodes
Panels
Exome sequencing
o All Exon Kit (50 Mb Exome)
o All Exon Kit (38 Mb Exome) (tiling 1x)
o All Exon Plus Kit (38 Mb Exome + 3.3 Mb custom)
Targeted resequencing (custom)
o < 200 kb
o 200-500 kb
o 500 kb – 1.5 Mb
o 1.5 – 3 Mb
Whole genome sequencing
Ion Torrent
Personal Genome Machine (PGM™)
Throughput: Up to 10/100 Mb/run
2012 - 1 Gb/run
2 hours/run
Fragment length:
Fragment: 100-150 bp
Unidirectional sequencing
Bidirectional sequencing
Multiplexing:
2011 - 96 DNA barcodes
SINGLE DNA MOLECULE SEQUENCING
Genetic variegation of clonal architecture and propagating cells in leukaemia,
Anderson et al. Nature 2010
Isidro Sánchez-García, U. Salamanca
GWAS in pharmacogenetics: differences with common diseases
Sample size: For ADRs the number of cases and controls can be much lower than
for common diseases. However, a number of published GWAS on
pharmacogenomics have failed to show a large enough effect for genome-wide
significance; the main reason for this is probably the small sample size, with
insufficient power to detect small or moderate effects.
Reasons: Phenotypic characterization. Some pharmacogenomic effects tend to
be larger and involve fewer genes than in studies on common complex diseases.
Obtaining an adequate number of cases for pharmacogenomics GWAS is more
challenging than for common diseases. In many cases, serious ADRs affect only
one in every 10,000 to 100,000 patients treated.
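A back-of-the-envelope sketch shows why case collection is the bottleneck. The target of 100 cases is a hypothetical figure chosen for illustration; the frequencies are the range quoted above.

```python
def patients_needed(target_cases, one_in):
    """Expected number of treated patients needed to observe target_cases ADRs
    when the reaction affects one in every `one_in` patients."""
    return target_cases * one_in

# Hypothetical target of 100 ADR cases at the two ends of the quoted range.
for one_in in (10_000, 100_000):
    needed = patients_needed(100, one_in)
    print(f"1 in {one_in:,}: about {needed:,} treated patients")
```

Even a modest case series therefore implies surveillance of between one and ten million treated patients, which is why such studies usually depend on multi-centre collaboration.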
Manhattan plot of −log P-value against chromosomal
position of each marker from a study on simvastatin-induced muscle toxicity in 85 cases and 90 drug-exposed
controls (A. Daly, 2009).
GWAS for pharmacogenomics
SNPs (chromosomal locations) shown previously to be associated
with CRC risk are: rs6983267 (chr 8q24), rs4779584 (chr 15q23),
rs4939827 (chr 18q21), rs3802842 (chr 11q23), rs10795668 (chr
10p14), rs16892766 (chr 8q23), rs4444235 (chr 14q22), rs9929218
(chr 16q22), rs10411210 (chr 19q13), rs961253 (chr 20p12).
EPICOLON GWAS TOXICITY
5-FU, oxaliplatin and irinotecan
Efficacy: RECIST
Toxicity: CTC
Strong: At least one "grade 3-4" among all side effects.
Weak: At least one "grade 3-4" among diarrhea and nausea, or at least one "grade
1-2" among the remaining side effects (which are considered more serious than
diarrhea or nausea).
Digestive: At least one "grade 3-4" among diarrhea and nausea, or at least one
"grade 1-2" in mucositis.
Circulatory: At least one "grade 1-2" among leukopenia, thrombocytopenia, anemia and
neutropenia.
Others: At least one "grade 1-2" among neuropathy and hand-foot syndrome.
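The phenotype definitions above can be written as a small rule set. This is an illustrative sketch only: it assumes the side effects are exactly those named on the slide, reads "grade 1-2" and "grade 3-4" literally as grade ranges, and uses hypothetical names (`toxicity_phenotypes`, `any_grade`) that do not come from the study.

```python
LIGHT = {"diarrhea", "nausea"}
BLOOD = {"leukopenia", "thrombocytopenia", "anemia", "neutropenia"}
OTHER = {"neuropathy", "hand_foot_syndrome"}
ALL_EFFECTS = LIGHT | BLOOD | OTHER | {"mucositis"}

def any_grade(grades, effects, low, high):
    """True if any listed effect has a CTC grade within [low, high]."""
    return any(low <= grades.get(e, 0) <= high for e in effects)

def toxicity_phenotypes(grades):
    """Return the set of phenotype labels a patient qualifies for."""
    labels = set()
    if any_grade(grades, ALL_EFFECTS, 3, 4):
        labels.add("strong")
    if any_grade(grades, LIGHT, 3, 4) or any_grade(grades, ALL_EFFECTS - LIGHT, 1, 2):
        labels.add("weak")
    if any_grade(grades, LIGHT, 3, 4) or any_grade(grades, {"mucositis"}, 1, 2):
        labels.add("digestive")
    if any_grade(grades, BLOOD, 1, 2):
        labels.add("circulatory")
    if any_grade(grades, OTHER, 1, 2):
        labels.add("others")
    return labels

# Hypothetical patient: grade 3 diarrhea and grade 2 neutropenia.
patient = {"diarrhea": 3, "neutropenia": 2}
labels = toxicity_phenotypes(patient)
print(sorted(labels))
```

Because the categories overlap, one patient can count as a case for several phenotypes, so each phenotype is effectively a separate case-control comparison.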
EPICOLON GWAS (300 cases): 9 SNPs (p < 10⁻¹⁰) being replicated
OXFORD GWAS (620 cases; capecitabine (5FU), then randomised to
oxaliplatin or no oxaliplatin; phenotypes by symptoms, i.e. diarrhoea and
hand-foot syndrome): 11 SNPs being replicated
Methotrexate consolidation treatment according to pharmacogenetics of
MTHFR ameliorates event-free survival in childhood acute lymphoblastic
leukemia
Running title: Methotrexate pharmacogenetics in childhood
acute lymphoblastic leukemia
Salazar et al. 2011 (Pharmacogenomics Journal, submitted)
We investigated the usefulness of the MTHFR genotype to increase the
methotrexate dosage in the consolidation phase in 141 childhood ALL patients
enrolled in the ALL/SHOP-2005 protocol. Patients with a favourable MTHFR
genotype (normal enzymatic activity) treated with methotrexate doses of 5 g/m2
had a significantly lower risk of suffering an event than patients with an
unfavourable MTHFR genotype (reduced enzymatic activity) who were treated
with the classical methotrexate dose of 3 g/m2 (p=0.012). Our results indicate
that analysis of the MTHFR genotype is a useful tool to optimize methotrexate
therapy in childhood patients with ALL.
Phenotype
Genotype
The better defined the phenotype, the easier it is to find the gene
it depends on. The more complex the system, the harder it is
to define its phenotype.
For anything that varies, where that variation is clinically relevant,
one can search for the causal gene and so begin to understand the phenotype
and, consequently, the disease.