Class SRAIndex

  • All Implemented Interfaces:
    BAMIndex, BrowseableBAMIndex, Closeable, AutoCloseable

    public class SRAIndex
    extends Object
    implements BrowseableBAMIndex
    Emulates a BAM index so that chunks of records can be requested from SRAFileReader.

    How it works: SRA supports fast reading of alignments by reference position, so the part of our "file" range that covers alignments spans the combined length of all references. Unaligned reads are also fast to read if we look them up by read position and (internally) filter out aligned fragments. The total SRA "file" range is therefore the sum of all reference lengths plus the number of reads (both aligned and unaligned) in the SRA archive. With this layout, Chunks can be used to look up both aligned and unaligned fragments.

    BAM index bins are emulated by mapping SRA reference positions to bin numbers, and each bin number is then mapped to a list of chunks that represent SRA "file" positions (which are simply reference positions). Only the last level of BAM index bins is emulated; each such bin refers to a portion of the reference SRA_BIN_SIZE bases long. For any other bin a RuntimeException is thrown, which is acceptable because only the SRAIndex class creates bins.

    Because last-level bins were not meant to refer to fragments that only partially overlap the bin's reference positions, the returned chunk also extends 5000 bases to the left of the bin start, so that fragments that start before the bin but still overlap it can be retrieved by the SRA reader. Support is planned in the NGS API for obtaining the maximum number of bases to go left to retrieve such fragments.

    Created by andrii.nikitiuk on 9/4/15.
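The "file" range layout described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the SRAIndex implementation: the class and method names are hypothetical, and it assumes each reference starts at the cumulative length of the references before it, with the read range following the aligned range.

```java
final class SraFileRangeSketch {
    /** Start offset of each reference inside the emulated "file" range (assumed cumulative). */
    static long[] referenceOffsets(long[] referenceLengths) {
        long[] offsets = new long[referenceLengths.length];
        long running = 0;
        for (int i = 0; i < referenceLengths.length; i++) {
            offsets[i] = running;
            running += referenceLengths[i];
        }
        return offsets;
    }

    /** End of the aligned part of the range: the sum of all reference lengths. */
    static long alignedRangeEnd(long[] referenceLengths) {
        long total = 0;
        for (long length : referenceLengths) {
            total += length;
        }
        return total;
    }

    /** Total range: sum of reference lengths plus the number of reads (aligned and unaligned). */
    static long totalRange(long[] referenceLengths, long readCount) {
        return alignedRangeEnd(referenceLengths) + readCount;
    }

    /** Maps a 0-based position on one reference to a position in the emulated "file" range. */
    static long toFilePosition(long[] referenceLengths, int referenceIndex, long zeroBasedPosition) {
        if (zeroBasedPosition < 0 || zeroBasedPosition >= referenceLengths[referenceIndex]) {
            throw new IllegalArgumentException("Position outside reference: " + zeroBasedPosition);
        }
        return referenceOffsets(referenceLengths)[referenceIndex] + zeroBasedPosition;
    }
}
```

In this sketch, positions below alignedRangeEnd address reference bases and positions from alignedRangeEnd up to totalRange address reads.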
    • Field Detail

      • SRA_BIN_SIZE

        public static final int SRA_BIN_SIZE
        Number of reference bases that a bin in the last level can represent.
        See Also:
        Constant Field Values
      • SRA_CHUNK_SIZE

        public static final int SRA_CHUNK_SIZE
        Size of the chunks that are created when using the SRA index.
        See Also:
        Constant Field Values
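As a rough illustration of fixed-size chunking, the sketch below splits a half-open range of "file" positions into pieces of at most a given size. The class name, the method, and the half-open convention are assumptions made for this example; the chunk size is passed as a parameter because the value of SRA_CHUNK_SIZE is not stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

final class SraChunkSplitSketch {
    /** Splits [start, end) into consecutive {chunkStart, chunkEnd} pairs of at most chunkSize positions. */
    static List<long[]> split(long start, long end, int chunkSize) {
        if (chunkSize <= 0 || end < start) {
            throw new IllegalArgumentException("Invalid range or chunk size");
        }
        List<long[]> chunks = new ArrayList<>();
        for (long pos = start; pos < end; pos += chunkSize) {
            // The last chunk is shortened so that it never runs past the end of the range.
            chunks.add(new long[] {pos, Math.min(pos + chunkSize, end)});
        }
        return chunks;
    }
}
```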
    • Constructor Detail

    • Method Detail

      • getLevelSize

        public int getLevelSize(int levelNumber)
        Gets the size (number of bins in) a given level of a BAM index.
        Specified by:
        getLevelSize in interface BrowseableBAMIndex
        Parameters:
        levelNumber - Level for which to inspect the size.
        Returns:
        Size of the given level.
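For orientation, the standard BAM binning scheme has six levels, and level k holds 8^k bins. The sketch below reproduces those sizes and the first bin number of each level; the class and method names are hypothetical and this is not the SRAIndex code.

```java
final class BamLevelSizeSketch {
    /** Levels 0..5 of the standard BAM binning scheme. */
    static final int LEVEL_COUNT = 6;

    /** Number of bins on a level: 8^level, i.e. 1, 8, 64, 512, 4096, 32768. */
    static int levelSize(int levelNumber) {
        if (levelNumber < 0 || levelNumber >= LEVEL_COUNT) {
            throw new IllegalArgumentException("No such level: " + levelNumber);
        }
        return 1 << (3 * levelNumber);
    }

    /** Bin number of the first bin on a level: 0, 1, 9, 73, 585, 4681. */
    static int levelStart(int levelNumber) {
        int start = 0;
        for (int level = 0; level < levelNumber; level++) {
            start += levelSize(level);
        }
        return start;
    }
}
```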
      • getLevelForBin

        public int getLevelForBin(Bin bin)
        SRA only operates on bins from the last level.
        Specified by:
        getLevelForBin in interface BrowseableBAMIndex
        Parameters:
        bin - The bin for which to determine the level.
        Returns:
        bin level
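A minimal sketch of the "last level only" rule, working on raw bin numbers instead of Bin objects. The names are hypothetical; the constant 4681 is the first last-level bin number in the standard BAM binning scheme, and the RuntimeException mirrors the behaviour described in the class overview.

```java
final class SraBinLevelSketch {
    /** Index of the last level in the standard BAM binning scheme. */
    static final int LAST_LEVEL = 5;
    /** First bin number on the last level. */
    static final int LAST_LEVEL_START = 4681;

    /** Returns the level of a bin, rejecting every bin that is not on the last level. */
    static int levelForBin(int binNumber) {
        if (binNumber < LAST_LEVEL_START) {
            throw new RuntimeException("Only last-level bins are supported, got bin " + binNumber);
        }
        return LAST_LEVEL;
    }
}
```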
      • getFirstLocusInBin

        public int getFirstLocusInBin(Bin bin)
        Gets the first locus that this bin can index into.
        Specified by:
        getFirstLocusInBin in interface BrowseableBAMIndex
        Parameters:
        bin - The bin to test.
        Returns:
        the first position associated with the given bin number
      • getLastLocusInBin

        public int getLastLocusInBin(Bin bin)
        Gets the last locus that this bin can index into.
        Specified by:
        getLastLocusInBin in interface BrowseableBAMIndex
        Parameters:
        bin - The bin to test.
        Returns:
        the last position associated with the given bin number
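The first and last locus of a last-level bin follow from the bin size by simple arithmetic. The sketch below assumes a bin size of 16 * 1024 bases (the span of a last-level bin in the standard BAM scheme; check Constant Field Values for the actual SRA_BIN_SIZE) and 1-based inclusive loci. All names are hypothetical.

```java
final class SraBinLocusSketch {
    /** Assumed bin size in bases; the real value is the SRA_BIN_SIZE constant. */
    static final int BIN_SIZE = 16 * 1024;
    /** First bin number on the last level of the standard BAM binning scheme. */
    static final int LAST_LEVEL_START = 4681;

    /** 1-based first locus covered by a last-level bin. */
    static int firstLocusInBin(int binNumber) {
        requireLastLevel(binNumber);
        return (binNumber - LAST_LEVEL_START) * BIN_SIZE + 1;
    }

    /** 1-based last locus covered by a last-level bin (inclusive). */
    static int lastLocusInBin(int binNumber) {
        requireLastLevel(binNumber);
        return (binNumber - LAST_LEVEL_START + 1) * BIN_SIZE;
    }

    private static void requireLastLevel(int binNumber) {
        if (binNumber < LAST_LEVEL_START) {
            throw new RuntimeException("Only last-level bins are supported, got bin " + binNumber);
        }
    }
}
```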
      • getBinsOverlapping

        public BinList getBinsOverlapping(int referenceIndex,
                                          int startPos,
                                          int endPos)
        Provides a list of bins that contain bases at the requested positions.
        Specified by:
        getBinsOverlapping in interface BrowseableBAMIndex
        Parameters:
        referenceIndex - index of the reference sequence of the desired SAMRecords
        startPos - 1-based start of the desired interval, inclusive
        endPos - 1-based end of the desired interval, inclusive
        Returns:
        a list of bins that contain relevant data
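To illustrate how a 1-based inclusive interval maps to last-level bin numbers, here is a small sketch. It returns plain bin numbers instead of a BinList, assumes a bin size of 16 * 1024 bases, and requires 1 <= startPos <= endPos; the names are hypothetical and the real method may treat special values differently.

```java
import java.util.ArrayList;
import java.util.List;

final class SraBinsOverlappingSketch {
    /** Assumed bin size in bases; the real value is the SRA_BIN_SIZE constant. */
    static final int BIN_SIZE = 16 * 1024;
    /** First bin number on the last level of the standard BAM binning scheme. */
    static final int LAST_LEVEL_START = 4681;

    /** Last-level bin numbers that contain at least one base of [startPos, endPos] (1-based, inclusive). */
    static List<Integer> binsOverlapping(int startPos, int endPos) {
        if (startPos < 1 || endPos < startPos) {
            throw new IllegalArgumentException("Expected 1 <= startPos <= endPos");
        }
        // Convert to 0-based positions before dividing by the bin size.
        int firstBin = LAST_LEVEL_START + (startPos - 1) / BIN_SIZE;
        int lastBin = LAST_LEVEL_START + (endPos - 1) / BIN_SIZE;
        List<Integer> bins = new ArrayList<>();
        for (int bin = firstBin; bin <= lastBin; bin++) {
            bins.add(bin);
        }
        return bins;
    }
}
```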
      • getSpanOverlapping

        public BAMFileSpan getSpanOverlapping(Bin bin)
        Description copied from interface: BrowseableBAMIndex
        Perform an overlapping query of all bins bounding the given location.
        Specified by:
        getSpanOverlapping in interface BrowseableBAMIndex
        Parameters:
        bin - The bin over which to perform an overlapping query.
        Returns:
        The file pointers
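The class overview notes that the chunk returned for a bin starts 5000 bases to the left of the bin so that fragments overlapping the bin start are not missed. The sketch below shows one way such a chunk could be computed in "file" positions. It is an assumption-laden illustration: the names, the half-open {start, end} pair, the clamping to the reference bounds, and the 16 * 1024 bin size are not taken from the SRAIndex source.

```java
final class SraBinChunkSketch {
    /** Assumed bin size in bases; the real value is the SRA_BIN_SIZE constant. */
    static final int BIN_SIZE = 16 * 1024;
    /** First bin number on the last level of the standard BAM binning scheme. */
    static final int LAST_LEVEL_START = 4681;
    /** Number of bases to extend to the left, as stated in the class overview. */
    static final int LEFT_EXTENSION = 5000;

    /** Returns {start, end} (half-open) in "file" positions for one last-level bin. */
    static long[] chunkForBin(int binNumber, long referenceOffset, long referenceLength) {
        if (binNumber < LAST_LEVEL_START) {
            throw new RuntimeException("Only last-level bins are supported, got bin " + binNumber);
        }
        long binStart = (long) (binNumber - LAST_LEVEL_START) * BIN_SIZE; // 0-based on the reference
        if (binStart >= referenceLength) {
            throw new IllegalArgumentException("Bin starts beyond the end of the reference");
        }
        // Extend left by LEFT_EXTENSION, but never past the start of this reference.
        long start = Math.max(0L, binStart - LEFT_EXTENSION);
        // Stop at the end of the bin or the end of the reference, whichever comes first.
        long end = Math.min(referenceLength, binStart + BIN_SIZE);
        return new long[] {referenceOffset + start, referenceOffset + end};
    }
}
```

For the first bin of a reference there is nothing to the left, so the chunk simply starts at the reference offset.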
      • getSpanOverlapping

        public BAMFileSpan getSpanOverlapping(int referenceIndex,
                                              int startPos,
                                              int endPos)
        Description copied from interface: BAMIndex
        Gets the compressed chunks which should be searched for the contents of records contained by the span referenceIndex:startPos-endPos, inclusive. See the BAM spec for more information on how a chunk is represented.
        Specified by:
        getSpanOverlapping in interface BAMIndex
        Parameters:
        referenceIndex - The contig.
        startPos - Genomic start of query.
        endPos - Genomic end of query.
        Returns:
        A file span listing the chunks in the BAM file.
      • getStartOfLastLinearBin

        public long getStartOfLastLinearBin()
        Description copied from interface: BAMIndex
        Gets the start of the last linear bin in the index.
        Specified by:
        getStartOfLastLinearBin in interface BAMIndex
        Returns:
        a position where aligned fragments end
      • getMetaData

        public BAMIndexMetaData getMetaData(int reference)
        Description copied from interface: BAMIndex
        Gets meta data for the given reference, including the number of aligned, unaligned, and noCoordinate records.
        Specified by:
        getMetaData in interface BAMIndex
        Parameters:
        reference - the reference of interest
        Returns:
        meta data for the reference
      • close

        public void close()
        Description copied from interface: BAMIndex
        Close the index and release any associated resources.
        Specified by:
        close in interface AutoCloseable
        Specified by:
        close in interface BAMIndex
        Specified by:
        close in interface Closeable