All Implemented Interfaces: IntArray, DuplicatesGTRec, GTRec, MarkerContainer
public final class VcfRecord extends java.lang.Object implements GTRec
Class VcfRecord represents a VCF record. Instances of class VcfRecord are immutable.
Modifier and Type | Field | Description |
---|---|---|
static java.lang.String | GL_FORMAT | The VCF FORMAT code for log-scaled genotype likelihood data: "GL". |
static java.lang.String | PL_FORMAT | The VCF FORMAT code for phred-scaled genotype likelihood data: "PL". |
Modifier and Type | Method | Description |
---|---|---|
int | allele1(int sample) | Returns the first allele for the specified sample or -1 if the allele is missing. |
int | allele2(int sample) | Returns the second allele for the specified sample or -1 if the allele is missing. |
int[] | alleles() | Returns an array of length this.size() whose j-th element is equal to this.get(j). |
java.lang.String | filter() | Returns the FILTER field. |
java.lang.String | format() | Returns the FORMAT field. |
java.lang.String[] | formatData(java.lang.String formatCode) | Returns an array of length this.nSamples() containing the specified FORMAT subfield data for each sample. |
int | formatIndex(java.lang.String formatCode) | Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise. |
java.lang.String | formatSubfield(int subfieldIndex) | Returns the specified FORMAT subfield. |
static VcfRecord | fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR) | Constructs and returns a new VcfRecord instance from a VCF record and its GL or PL format subfield data. |
static VcfRecord | fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord) | Constructs and returns a new VcfRecord instance from a VCF record and its GT format subfield data. |
static VcfRecord | fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR) | Constructs and returns a new VcfRecord instance from a VCF record and its GT, GL, and PL format subfield data. |
int | get(int hap) | Returns the allele for the specified haplotype or -1 if the allele is missing. |
float | gl(int sample, int allele1, int allele2) | Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype. |
static int | gtIndex(int a1, int a2) | Returns the VCF genotype index for the specified pair of alleles. |
boolean | hasFormat(java.lang.String formatCode) | Returns true if the specified FORMAT subfield is present, and returns false otherwise. |
java.lang.String | info() | Returns the INFO field. |
boolean | isGTData() | Returns true if the value returned by this.gl() is determined by a called or missing genotype, and returns false otherwise. |
boolean | isPhased() | Returns true if the genotype for each sample is a phased, non-missing genotype, and returns false otherwise. |
boolean | isPhased(int sample) | Returns true if the genotype for the specified sample is a phased, non-missing genotype, and returns false otherwise. |
Marker | marker() | Returns the marker. |
int | nAlleles() | Returns the number of marker alleles. |
int | nFormatSubfields() | Returns the number of FORMAT subfields. |
int | nSamples() | Returns the number of samples. |
java.lang.String | qual() | Returns the QUAL field. |
java.lang.String | sampleData(int sample) | Returns the data for the specified sample. |
java.lang.String | sampleData(int sample, int subfieldIndex) | Returns the data in the specified FORMAT subfield for the specified sample. |
java.lang.String | sampleData(int sample, java.lang.String formatCode) | Returns the data in the specified FORMAT subfield for the specified sample. |
Samples | samples() | Returns the list of samples. |
int | size() | Returns the number of haplotypes. |
java.lang.String | toString() | Returns the VCF record. |
VcfHeader | vcfHeader() | Returns the VCF meta-information lines and the VCF header line. |
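public static final java.lang.String GL_FORMAT
The VCF FORMAT code for log-scaled genotype likelihood data: "GL".
public static final java.lang.String PL_FORMAT
The VCF FORMAT code for phred-scaled genotype likelihood data: "PL".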
public static int gtIndex(int a1, int a2)
Returns the VCF genotype index for the specified pair of alleles.
Parameters:
a1 - the first allele
a2 - the second allele
Throws:
java.lang.IllegalArgumentException - if a1 < 0 || a2 < 0
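The VCF specification orders diploid genotypes so that the genotype j/k with j <= k has index k*(k+1)/2 + j. The following is a minimal sketch of that formula, assuming gtIndex follows the specification's ordering and treats the allele pair as unordered. The class name GtIndexSketch is hypothetical and is not part of the library.

```java
/** Hypothetical stand-alone sketch; not the library's implementation. */
final class GtIndexSketch {

    private GtIndexSketch() {
    }

    /**
     * Returns the VCF genotype index of the unordered allele pair (a1, a2),
     * using the VCF specification's ordering: k*(k+1)/2 + j for j <= k.
     */
    static int gtIndex(int a1, int a2) {
        if (a1 < 0 || a2 < 0) {
            throw new IllegalArgumentException("negative allele: " + a1 + ", " + a2);
        }
        int j = Math.min(a1, a2);
        int k = Math.max(a1, a2);
        return k * (k + 1) / 2 + j;
    }

    public static void main(String[] args) {
        // Enumerate the genotypes of a marker with three alleles (0, 1, 2).
        for (int k = 0; k < 3; ++k) {
            for (int j = 0; j <= k; ++j) {
                System.out.println(j + "/" + k + " -> " + gtIndex(j, k));
            }
        }
    }
}
```

With this ordering, a marker with n alleles has n*(n+1)/2 genotypes, which is also the number of comma-separated values expected in a GL or PL subfield for a diploid sample.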
public static VcfRecord fromGT(VcfHeader vcfHeader, java.lang.String vcfRecord)
Constructs and returns a new VcfRecord instance from a VCF record and its GT format subfield data.
Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT format field corresponding to the specified vcfHeader object
Returns:
a new VcfRecord instance
Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GT format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
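To illustrate how a GT subfield relates to the values reported by allele1, allele2, and isPhased, here is a minimal sketch that parses a single diploid GT value. It relies only on the VCF conventions that '|' separates phased alleles, '/' separates unphased alleles, and '.' denotes a missing allele. The class GtFieldSketch is hypothetical, handles diploid genotypes only, and is not the parser used by fromGT.

```java
/** Hypothetical stand-alone sketch of diploid GT parsing; not the library's code. */
final class GtFieldSketch {

    final int allele1;      // first allele index, or -1 if missing
    final int allele2;      // second allele index, or -1 if missing
    final boolean phased;   // true only for a phased, non-missing genotype

    private GtFieldSketch(int allele1, int allele2, boolean phased) {
        this.allele1 = allele1;
        this.allele2 = allele2;
        this.phased = phased;
    }

    /** Parses a diploid GT value such as "0|1", "1/0", or "./.". */
    static GtFieldSketch parse(String gt) {
        int sep = gt.indexOf('|');
        boolean phasedSeparator = (sep >= 0);
        if (sep < 0) {
            sep = gt.indexOf('/');
        }
        if (sep < 0) {
            throw new IllegalArgumentException("not a diploid genotype: " + gt);
        }
        int a1 = parseAllele(gt.substring(0, sep));
        int a2 = parseAllele(gt.substring(sep + 1));
        return new GtFieldSketch(a1, a2, phasedSeparator && a1 >= 0 && a2 >= 0);
    }

    private static int parseAllele(String s) {
        if (s.equals(".")) {
            return -1;
        }
        // NumberFormatException is a subclass of IllegalArgumentException.
        int allele = Integer.parseInt(s);
        if (allele < 0) {
            throw new IllegalArgumentException("negative allele: " + s);
        }
        return allele;
    }

    public static void main(String[] args) {
        GtFieldSketch g = parse("0|1");
        System.out.println(g.allele1 + " " + g.allele2 + " " + g.phased);
    }
}
```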
public static VcfRecord fromGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRecord instance from a VCF record and its GL or PL format subfield data. If both GL and PL format subfields are present, the GL format subfield will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
Returns:
a new VcfRecord instance
Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GL format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
public static VcfRecord fromGTGL(VcfHeader vcfHeader, java.lang.String vcfRecord, float maxLR)
Constructs and returns a new VcfRecord instance from a VCF record and its GT, GL, and PL format subfield data. If the GT format subfield is present and non-missing, the GT format subfield is used to determine genotype likelihoods. Otherwise the GL or PL format subfield is used to determine genotype likelihoods. If both the GL and PL format subfields are present, only the GL format subfield will be used. If the maximum normalized genotype likelihood is 1.0 for a sample, then any other genotype likelihood for the sample that is less than lrThreshold is set to 0.
Parameters:
vcfHeader - meta-information lines and header line for the specified VCF record
vcfRecord - a VCF record with a GT, a GL or a PL format field corresponding to the specified vcfHeader object
maxLR - the maximum likelihood ratio
Returns:
a new VcfRecord instance
Throws:
java.lang.IllegalArgumentException - if the VCF record does not have a GT, GL, or PL format field
java.lang.IllegalArgumentException - if a VCF record format error is detected
java.lang.IllegalArgumentException - if there are not vcfHeader.nHeaderFields() tab-delimited fields in the specified VCF record
java.lang.NullPointerException - if vcfHeader == null || vcfRecord == null
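The normalization and thresholding described for fromGL and fromGTGL can be sketched as follows. This is only an illustration under stated assumptions: GL values are log10-scaled likelihoods and PL values are phred-scaled (PL = -10*log10(likelihood)), as in the VCF specification, and likelihoods are scaled so that the largest value for a sample is 1.0. The threshold argument is a hypothetical stand-in for lrThreshold; this page does not state how lrThreshold is derived from maxLR, so the sketch does not attempt that step. The class GlNormalizationSketch is not part of the library.

```java
/** Hypothetical stand-alone sketch; not the library's implementation. */
final class GlNormalizationSketch {

    private GlNormalizationSketch() {
    }

    /**
     * Converts log10-scaled likelihoods to likelihoods normalized so that
     * the maximum is 1.0, then sets values below the threshold to 0.
     */
    static float[] fromGL(double[] gl, float threshold) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : gl) {
            max = Math.max(max, v);
        }
        float[] like = new float[gl.length];
        for (int i = 0; i < gl.length; ++i) {
            float x = (float) Math.pow(10.0, gl[i] - max);
            like[i] = (x < threshold) ? 0f : x;
        }
        return like;
    }

    /** Converts phred-scaled likelihoods by first rescaling them to log10 units. */
    static float[] fromPL(int[] pl, float threshold) {
        double[] gl = new double[pl.length];
        for (int i = 0; i < pl.length; ++i) {
            gl[i] = -pl[i] / 10.0;
        }
        return fromGL(gl, threshold);
    }

    public static void main(String[] args) {
        float[] like = fromPL(new int[] {0, 10, 40}, 1e-3f);
        System.out.println(java.util.Arrays.toString(like));
    }
}
```

Normalizing by the maximum before exponentiating also avoids underflow when all log-scaled values for a sample are strongly negative.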
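public java.lang.String qual()
Returns the QUAL field.
public java.lang.String filter()
Returns the FILTER field.
public java.lang.String info()
Returns the INFO field.
public java.lang.String format()
Returns the FORMAT field.
public int nFormatSubfields()
Returns the number of FORMAT subfields.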
public java.lang.String formatSubfield(int subfieldIndex)
Returns the specified FORMAT subfield.
Parameters:
subfieldIndex - a FORMAT subfield index
Throws:
java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
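public boolean hasFormat(java.lang.String formatCode)
Returns true if the specified FORMAT subfield is present, and returns false otherwise.
Parameters:
formatCode - a FORMAT subfield code
Returns:
true if the specified FORMAT subfield is present
public int formatIndex(java.lang.String formatCode)
Returns the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and returns -1 otherwise.
Parameters:
formatCode - the format subfield code
Returns:
the index of the specified FORMAT subfield if the specified subfield is defined for this VCF record, and -1 otherwise
public java.lang.String sampleData(int sample)
Returns the data for the specified sample.
Parameters:
sample - a sample index
Throws:
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.nSamples()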
public java.lang.String sampleData(int sample, java.lang.String formatCode)
Returns the data in the specified FORMAT subfield for the specified sample.
Parameters:
sample - a sample index
formatCode - a FORMAT subfield code
Throws:
java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.nSamples()
public java.lang.String sampleData(int sample, int subfieldIndex)
Returns the data in the specified FORMAT subfield for the specified sample.
Parameters:
sample - a sample index
subfieldIndex - a FORMAT subfield index
Throws:
java.lang.IndexOutOfBoundsException - if subfieldIndex < 0 || subfieldIndex >= this.nFormatSubfields()
java.lang.IndexOutOfBoundsException - if sample < 0 || sample >= this.nSamples()
public java.lang.String[] formatData(java.lang.String formatCode)
Returns an array of length this.nSamples() containing the specified FORMAT subfield data for each sample. The k-th element of the array is the specified FORMAT subfield data for the k-th sample.
Parameters:
formatCode - a format subfield code
Returns:
an array of length this.nSamples() containing the specified FORMAT subfield data for each sample
Throws:
java.lang.IllegalArgumentException - if this.hasFormat(formatCode) == false
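The FORMAT accessors above (formatIndex, hasFormat, sampleData, formatData) all rest on the VCF convention that the FORMAT column is a colon-delimited list of subfield codes and that each sample column lists values in the same order. A minimal sketch of that lookup on raw strings follows. The class FormatFieldSketch and its method signatures are hypothetical and differ from the instance methods documented here; returning "." for a dropped trailing subfield is an assumption based on the VCF specification, not documented behavior of VcfRecord.

```java
/** Hypothetical stand-alone sketch; not the library's implementation. */
final class FormatFieldSketch {

    private FormatFieldSketch() {
    }

    /** Returns the index of formatCode in a FORMAT string such as "GT:DP:GL", or -1. */
    static int formatIndex(String format, String formatCode) {
        String[] codes = format.split(":");
        for (int i = 0; i < codes.length; ++i) {
            if (codes[i].equals(formatCode)) {
                return i;
            }
        }
        return -1;
    }

    /** Returns the value of the given subfield in one sample column. */
    static String sampleData(String format, String sampleField, String formatCode) {
        int index = formatIndex(format, formatCode);
        if (index < 0) {
            throw new IllegalArgumentException("missing FORMAT subfield: " + formatCode);
        }
        // Limit -1 keeps trailing empty strings so that indices stay aligned.
        String[] values = sampleField.split(":", -1);
        // Assumption: VCF permits trailing subfields to be dropped; report them as missing.
        return (index < values.length) ? values[index] : ".";
    }

    public static void main(String[] args) {
        String format = "GT:DP:GL";
        String sample = "0/1:12:-0.1,-1.0,-5.0";
        System.out.println(sampleData(format, sample, "GL"));
    }
}
```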
public Samples samples()
Description copied from interface: GTRec
Returns the list of samples.
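public int nSamples()
Description copied from interface: DuplicatesGTRec
Returns the number of samples. The returned value is equal to this.size()/2.
Specified by:
nSamples in interface DuplicatesGTRec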
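public VcfHeader vcfHeader()
Returns the VCF meta-information lines and the VCF header line.
public Marker marker()
Description copied from interface: MarkerContainer
Returns the marker.
Specified by:
marker in interface MarkerContainer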
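public int allele1(int sample)
Description copied from interface: DuplicatesGTRec
Returns the first allele for the specified sample or -1 if the allele is missing. [...] this.unphased(marker, sample) == false.
Specified by:
allele1 in interface DuplicatesGTRec
Parameters:
sample - a sample index
public int allele2(int sample)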
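Description copied from interface: DuplicatesGTRec
Returns the second allele for the specified sample or -1 if the allele is missing. [...] this.unphased(marker, sample) == false.
Specified by:
allele2 in interface DuplicatesGTRec
Parameters:
sample - a sample index
public int get(int hap)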
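Description copied from interface: DuplicatesGTRec
Returns the allele for the specified haplotype or -1 if the allele is missing. [...] this.unphased(marker, hap/2) == false.
Specified by:
get in interface DuplicatesGTRec
get in interface IntArray
Parameters:
hap - a haplotype index
public int[] alleles()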
Description copied from interface: DuplicatesGTRec
Returns an array of length this.size() whose j-th element is equal to this.get(j).
Specified by:
alleles in interface DuplicatesGTRec
Returns:
an array of length this.size() whose j-th element is equal to this.get(j)
public boolean isPhased(int sample)
Description copied from interface: DuplicatesGTRec
Returns true if the genotype for the specified sample is a phased, non-missing genotype, and returns false otherwise.
Specified by:
isPhased in interface DuplicatesGTRec
Parameters:
sample - a sample index
Returns:
true if the genotype for the specified sample is a phased, non-missing genotype
public boolean isPhased()
Description copied from interface: DuplicatesGTRec
Returns true if the genotype for each sample is a phased, non-missing genotype, and returns false otherwise.
Specified by:
isPhased in interface DuplicatesGTRec
Returns:
true if the genotype for each sample is a phased, non-missing genotype
public boolean isGTData()
Description copied from interface: GTRec
Returns true if the value returned by this.gl() is determined by a called or missing genotype, and returns false otherwise.
public float gl(int sample, int allele1, int allele2)
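Description copied from interface: GTRec
Returns the probability of the observed data for the specified sample if the specified pair of ordered alleles is the true ordered genotype.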
public int nAlleles()
Description copied from interface: MarkerContainer
Returns the number of marker alleles.
Specified by:
nAlleles in interface MarkerContainer
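public int size()
Description copied from interface: DuplicatesGTRec
Returns the number of haplotypes. The returned value is equal to 2*this.nSamples().
Specified by:
size in interface DuplicatesGTRec
size in interface IntArray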
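The relationship between haplotype indices and sample indices (size() equals 2*nSamples(), and haplotype hap belongs to sample hap/2) can be sketched as below. The interleaved layout, in which haplotype 2*s carries the first allele of sample s and haplotype 2*s + 1 carries the second, is an assumption made for illustration and is not stated on this page. The class HapIndexSketch is hypothetical.

```java
/** Hypothetical stand-alone sketch; not the library's implementation. */
final class HapIndexSketch {

    private HapIndexSketch() {
    }

    /** Returns the index of the sample that carries the specified haplotype. */
    static int sampleOf(int hap) {
        if (hap < 0) {
            throw new IndexOutOfBoundsException("negative haplotype index: " + hap);
        }
        return hap / 2;
    }

    /**
     * Builds a per-haplotype allele array of length 2*nSamples from
     * per-sample first and second alleles (-1 denotes a missing allele).
     * Assumed layout: haplotype 2*s is allele1[s], haplotype 2*s + 1 is allele2[s].
     */
    static int[] alleles(int[] allele1, int[] allele2) {
        if (allele1.length != allele2.length) {
            throw new IllegalArgumentException("inconsistent number of samples");
        }
        int[] haps = new int[2 * allele1.length];
        for (int s = 0; s < allele1.length; ++s) {
            haps[2 * s] = allele1[s];
            haps[2 * s + 1] = allele2[s];
        }
        return haps;
    }

    public static void main(String[] args) {
        int[] haps = alleles(new int[] {0, 1}, new int[] {1, -1});
        System.out.println(java.util.Arrays.toString(haps));
    }
}
```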
public java.lang.String toString()
Returns the VCF record.
Overrides:
toString in class java.lang.Object