Package picard.fingerprint
Class HaplotypeProbabilities
java.lang.Object
    picard.fingerprint.HaplotypeProbabilities
- Direct Known Subclasses:
HaplotypeProbabilitiesFromGenotype, HaplotypeProbabilitiesFromGenotypeLikelihoods, HaplotypeProbabilitiesFromSequence, HaplotypeProbabilityOfNormalGivenTumor
public abstract class HaplotypeProbabilities extends Object
Abstract class for storing and calculating various likelihoods and probabilities for haplotype alleles given evidence.
-
-
Nested Class Summary
Nested Classes
protected static class HaplotypeProbabilities.Genotype
    Log10(P(evidence | haplotype)) for the 3 different possible haplotypes {aa, ab, bb}
-
Constructor Summary
Constructors
protected HaplotypeProbabilities(HaplotypeBlock haplotypeBlock)
-
Method Summary
Methods
abstract HaplotypeProbabilities deepCopy()
HaplotypeBlock getHaplotype()
    Returns the haplotype for which the probabilities apply.
abstract double[] getLikelihoods()
    Returns the likelihoods, in order, of the AA, Aa and aa haplotypes given the evidence.
double getLodMostProbableGenotype()
    Returns the LOD score between the most probable haplotype and the second most probable.
double[] getLogLikelihoods()
DiploidGenotype getMostLikelyGenotype(Snp snp)
    Gets the genotype for this Snp given the most likely haplotype.
DiploidHaplotype getMostLikelyHaplotype()
    Gets the most likely haplotype given the probabilities.
int getObsAllele1()
    Returns the number of observations of alleles supporting the first/major haplotype allele.
int getObsAllele2()
    Returns the number of observations of alleles supporting the second/minor haplotype allele.
double[] getPosteriorLikelihoods()
    Returns the probabilities, in order, of the AA, Aa and aa haplotypes.
double[] getPosteriorProbabilities()
    Returns the probabilities, in order, of the AA, Aa and aa haplotypes.
double[] getPriorProbablities()
abstract Snp getRepresentativeSnp()
    Returns a representative SNP for this haplotype.
int getTotalObs()
    Returns the total number of observations of any allele.
boolean hasEvidence()
    Returns true if evidence has been added, false if the probabilities are just the priors.
abstract HaplotypeProbabilities merge(HaplotypeProbabilities other)
    Merges in the likelihood information from the supplied haplotype probabilities object.
double scaledEvidenceProbabilityUsingGenotypeFrequencies(double[] genotypeFrequencies)
    Returns the scaled probability of the evidence collected given a vector of priors on the haplotype using the internal likelihood, which may be scaled by an unknown factor.
double shiftedLogEvidenceProbability()
    Returns log(P(evidence)) + c, assuming that the prior on haplotypes is given by the internal haplotypeFrequencies.
double shiftedLogEvidenceProbabilityGivenOtherEvidence(HaplotypeProbabilities otherHp)
    Returns the log-probability of the evidence, using as priors the posteriors of another object.
double shiftedLogEvidenceProbabilityUsingGenotypeFrequencies(double[] genotypeFrequencies)
-
-
-
Constructor Detail
-
HaplotypeProbabilities
protected HaplotypeProbabilities(HaplotypeBlock haplotypeBlock)
-
-
Method Detail
-
getHaplotype
public HaplotypeBlock getHaplotype()
Returns the haplotype for which the probabilities apply.
-
getPriorProbablities
public double[] getPriorProbablities()
-
getPosteriorProbabilities
public double[] getPosteriorProbabilities()
Returns the probabilities, in order, of the AA, Aa and aa haplotypes. Mathematically, this is P(H | D, F), where H is the vector of possible haplotypes {AA, Aa, aa}, D is the data seen by the class, and F is the population frequency of each genotype. Returns the normalized posterior probabilities using the population frequency as a prior.
-
getPosteriorLikelihoods
public double[] getPosteriorLikelihoods()
Returns the posterior likelihoods, in order, of the AA, Aa and aa haplotypes. Mathematically, this is P(H | D, F) up to a normalizing constant, where H is the vector of possible haplotypes {AA, Aa, aa}, D is the data seen by the class, and F is the population frequency of each genotype. Returns the unnormalized likelihoods using the population frequency as a prior.
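The relationship between the unnormalized posterior likelihoods and the normalized posterior probabilities can be illustrated with a small stand-alone sketch. It assumes that the posterior is the element-wise product of likelihood and prior, divided by its sum for the normalized form. The class PosteriorSketch, its method names, and the numeric values are hypothetical and are not part of Picard.

```java
import java.util.Arrays;

// Stand-alone illustration; PosteriorSketch and its methods are hypothetical, not Picard API.
class PosteriorSketch {

    /** Unnormalized posterior: likelihood[i] * prior[i] for each of {AA, Aa, aa}. */
    static double[] posteriorLikelihoods(double[] likelihoods, double[] priors) {
        double[] out = new double[likelihoods.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = likelihoods[i] * priors[i];
        }
        return out;
    }

    /** Normalized posterior: the unnormalized values divided by their sum. */
    static double[] posteriorProbabilities(double[] likelihoods, double[] priors) {
        double[] out = posteriorLikelihoods(likelihoods, priors);
        double sum = Arrays.stream(out).sum();
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] likelihoods = {0.8, 0.1, 0.1}; // P(D | H), illustrative values only
        double[] priors = {0.25, 0.5, 0.25};    // F, illustrative genotype frequencies
        System.out.println(Arrays.toString(posteriorLikelihoods(likelihoods, priors)));
        System.out.println(Arrays.toString(posteriorProbabilities(likelihoods, priors)));
    }
}
```

The normalized vector sums to 1, while the unnormalized one keeps whatever scale the likelihoods and priors carry.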
-
getLikelihoods
public abstract double[] getLikelihoods()
Returns the likelihoods, in order, of the AA, Aa and aa haplotypes given the evidence. Mathematically, this is P(evidence | haplotype) where haplotype = {AA, Aa, aa}. Will be normalized.
-
getLogLikelihoods
public double[] getLogLikelihoods()
-
getRepresentativeSnp
public abstract Snp getRepresentativeSnp()
Returns a representative SNP for this haplotype. Different subclasses may implement this in different ways, but should do so in a deterministic/repeatable fashion.
-
getObsAllele1
public int getObsAllele1()
Returns the number of observations of alleles supporting the first/major haplotype allele. Strictly this doesn't make sense for all subclasses, but it's nice to have it part of the API so a default implementation is provided here.
- Returns:
- int
-
getObsAllele2
public int getObsAllele2()
Returns the number of observations of alleles supporting the second/minor haplotype allele. Strictly this doesn't make sense for all subclasses, but it's nice to have it part of the API so a default implementation is provided here.
- Returns:
- int
-
getTotalObs
public int getTotalObs()
Returns the total number of observations of any allele. Strictly this doesn't make sense for all subclasses, but it's nice to have it part of the API so a default implementation is provided here.
- Returns:
- int
-
hasEvidence
public boolean hasEvidence()
Returns true if evidence has been added, false if the probabilities are just the priors.
-
merge
public abstract HaplotypeProbabilities merge(HaplotypeProbabilities other)
Merges in the likelihood information from the supplied haplotype probabilities object.
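How the merge is carried out is left to each subclass and is not specified here. One plausible reading, assuming the two objects hold independent evidence about the same haplotype block, is to multiply the likelihood vectors element-wise and renormalize. The sketch below shows only that assumption; MergeSketch and mergeLikelihoods are hypothetical names, not Picard API.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch of merging two likelihood vectors.
// Assumption: the two pieces of evidence are independent, so likelihoods multiply.
class MergeSketch {

    /** Multiplies the two vectors element-wise, then rescales so the result sums to 1. */
    static double[] mergeLikelihoods(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("likelihood vectors must have the same length");
        }
        double[] merged = new double[a.length];
        double sum = 0.0;
        for (int i = 0; i < merged.length; i++) {
            merged[i] = a[i] * b[i];
            sum += merged[i];
        }
        for (int i = 0; i < merged.length; i++) {
            merged[i] /= sum;
        }
        return merged;
    }

    public static void main(String[] args) {
        double[] first = {0.5, 0.3, 0.2};  // illustrative likelihoods for {AA, Aa, aa}
        double[] second = {0.5, 0.3, 0.2};
        System.out.println(Arrays.toString(mergeLikelihoods(first, second)));
    }
}
```

Under this assumption, two observations that agree reinforce each other: the merged vector is more concentrated on the favored haplotype than either input.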
-
getMostLikelyHaplotype
public DiploidHaplotype getMostLikelyHaplotype()
Gets the most likely haplotype given the probabilities.
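As an illustration, picking the most likely haplotype amounts to finding the index of the largest entry in a probability vector ordered as {AA, Aa, aa}. The sketch below assumes that ordering; the enum Hap and the class MostLikelySketch are hypothetical stand-ins and do not reproduce the Picard DiploidHaplotype type.

```java
// Hypothetical stand-alone sketch: argmax over a three-element probability vector.
class MostLikelySketch {

    /** Stand-in for the three diploid haplotypes, in the order used by the vectors. */
    enum Hap { AA, Aa, aa }

    /** Returns the haplotype whose probability is largest (first one wins on ties). */
    static Hap mostLikely(double[] probabilities) {
        int best = 0;
        for (int i = 1; i < probabilities.length; i++) {
            if (probabilities[i] > probabilities[best]) {
                best = i;
            }
        }
        return Hap.values()[best];
    }

    public static void main(String[] args) {
        double[] posterior = {0.1, 0.7, 0.2}; // illustrative values only
        System.out.println(mostLikely(posterior));
    }
}
```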
-
getMostLikelyGenotype
public DiploidGenotype getMostLikelyGenotype(Snp snp)
Gets the genotype for this Snp given the most likely haplotype.
-
scaledEvidenceProbabilityUsingGenotypeFrequencies
public double scaledEvidenceProbabilityUsingGenotypeFrequencies(double[] genotypeFrequencies)
This function returns the scaled probability of the evidence collected given a vector of priors on the haplotype using the internal likelihood, which may be scaled by an unknown factor. This factor causes the result to be scaled, hence the name.
Mathematically:
P(Evidence | P(h_i)=F_i) = \sum_i P(Evidence | h_i) P(h_i) = \sum_i P(Evidence | h_i) F_i = c * \sum_i Likelihood_i * F_i
Here, h_i are the three possible haplotypes, F_i are the given priors, and Likelihood_i are the stored likelihoods, which are scaled from the actual likelihoods by an unknown factor, c. Note that the calculation ignores the internal haplotype probabilities (i.e. priors).
- Parameters:
genotypeFrequencies
- vector of (possibly scaled) probabilities of the three haplotypes
- Returns:
- P(evidence | P_h) / c
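The sum \sum_i Likelihood_i * F_i can be written out as a short stand-alone sketch. ScaledEvidenceSketch and its method are hypothetical names used for illustration, and the stored likelihoods are passed in as a plain array rather than read from a HaplotypeProbabilities object.

```java
// Hypothetical stand-alone sketch of sum_i Likelihood_i * F_i.
class ScaledEvidenceSketch {

    /** Dot product of the stored (possibly scaled) likelihoods and the supplied priors. */
    static double scaledEvidenceProbability(double[] likelihoods, double[] genotypeFrequencies) {
        if (likelihoods.length != genotypeFrequencies.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < likelihoods.length; i++) {
            sum += likelihoods[i] * genotypeFrequencies[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] likelihoods = {0.8, 0.1, 0.1};          // illustrative stored likelihoods
        double[] genotypeFrequencies = {0.25, 0.5, 0.25}; // illustrative priors F_i
        System.out.println(scaledEvidenceProbability(likelihoods, genotypeFrequencies));
    }
}
```

Because the result is linear in the likelihoods, multiplying every stored likelihood by a factor multiplies the result by the same factor, which is why the value is only meaningful up to that scale.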
-
shiftedLogEvidenceProbabilityUsingGenotypeFrequencies
public double shiftedLogEvidenceProbabilityUsingGenotypeFrequencies(double[] genotypeFrequencies)
-
shiftedLogEvidenceProbabilityGivenOtherEvidence
public double shiftedLogEvidenceProbabilityGivenOtherEvidence(HaplotypeProbabilities otherHp)
Returns the log-probability of the evidence, using as priors the posteriors of another object.
- Parameters:
otherHp
- an additional HaplotypeProbabilities object representing the same underlying HaplotypeBlock
- Returns:
- log10(P(evidence | P(h_i)=P(h_i|otherHp))) + c, where c is an unknown constant
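A minimal sketch of this quantity, assuming it is the log10 of the stored likelihoods weighted by the other object's posterior probabilities. OtherEvidenceSketch and its method are hypothetical names, and both vectors are passed in as plain arrays ordered as {AA, Aa, aa}.

```java
// Hypothetical stand-alone sketch: log10 of sum_i Likelihood_i * P(h_i | other evidence).
class OtherEvidenceSketch {

    /** Uses the other object's posterior as the prior when scoring this object's likelihoods. */
    static double shiftedLogEvidenceGivenOther(double[] likelihoods, double[] otherPosterior) {
        if (likelihoods.length != otherPosterior.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < likelihoods.length; i++) {
            sum += likelihoods[i] * otherPosterior[i];
        }
        return Math.log10(sum);
    }

    public static void main(String[] args) {
        double[] likelihoods = {0.8, 0.1, 0.1};  // illustrative values only
        double[] agreeing = {0.8, 0.1, 0.1};     // other evidence favors the same haplotype
        double[] conflicting = {0.1, 0.1, 0.8};  // other evidence favors a different haplotype
        System.out.println(shiftedLogEvidenceGivenOther(likelihoods, agreeing));
        System.out.println(shiftedLogEvidenceGivenOther(likelihoods, conflicting));
    }
}
```

When the two sets of evidence favor the same haplotype, the value is higher than when they conflict, which is what makes the quantity useful for comparing samples.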
-
shiftedLogEvidenceProbability
public double shiftedLogEvidenceProbability()
Returns log(P(evidence)) + c, assuming that the prior on haplotypes is given by the internal haplotypeFrequencies.
-
getLodMostProbableGenotype
public double getLodMostProbableGenotype()
Returns the LOD score between the most probable haplotype and the second most probable.
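As a sketch, a LOD score of this kind can be computed as the difference between the log10 of the largest and the second-largest probability. The example assumes that posterior probabilities are the values being compared, which the description above does not state explicitly. LodSketch and lodMostProbable are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch: LOD between the two most probable haplotypes.
class LodSketch {

    /** log10(best) - log10(second best), computed on a sorted copy of the input. */
    static double lodMostProbable(double[] probabilities) {
        double[] sorted = probabilities.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);                     // ascending order
        int n = sorted.length;
        return Math.log10(sorted[n - 1]) - Math.log10(sorted[n - 2]);
    }

    public static void main(String[] args) {
        double[] posterior = {0.09, 0.9, 0.01}; // illustrative values only
        System.out.println(lodMostProbable(posterior));
    }
}
```

A larger value means the best haplotype is better separated from the runner-up. A value near 0 means the top two are nearly tied.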
-
deepCopy
public abstract HaplotypeProbabilities deepCopy()
-
-