org.apache.commons.math3.stat.inference
Class OneWayAnova

java.lang.Object
  extended by org.apache.commons.math3.stat.inference.OneWayAnova

public class OneWayAnova
extends Object

Implements one-way ANOVA (analysis of variance) statistics.

Tests for differences between two or more categories of univariate data (for example, the body mass index of accountants, lawyers, doctors and computer programmers). When two categories are given, this is equivalent to the TTest.
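
The two-category equivalence can be checked by hand. The following is a minimal sketch in plain Java, not the library's code, and it assumes that "the TTest" here means the equal-variance (pooled) two-sample t-test. The class and method names are hypothetical. For two groups the ANOVA F-value equals the square of the pooled t statistic.

```java
final class TwoGroupEquivalenceSketch {

    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Sum of squared deviations of x about the given center. */
    static double ss(double[] x, double center) {
        double total = 0.0;
        for (double v : x) {
            total += (v - center) * (v - center);
        }
        return total;
    }

    /** One-way ANOVA F-value for exactly two groups, so dfbg = 1. */
    static double fValue(double[] a, double[] b) {
        double ma = mean(a);
        double mb = mean(b);
        double grand = (ma * a.length + mb * b.length) / (a.length + b.length);
        double ssbg = a.length * (ma - grand) * (ma - grand)
                    + b.length * (mb - grand) * (mb - grand);
        double sswg = ss(a, ma) + ss(b, mb);
        int dfwg = a.length + b.length - 2;
        return ssbg / (sswg / dfwg); // msbg = ssbg because dfbg = 1
    }

    /** Two-sample t statistic with pooled variance (assumed TTest variant). */
    static double pooledT(double[] a, double[] b) {
        double ma = mean(a);
        double mb = mean(b);
        double pooledVariance = (ss(a, ma) + ss(b, mb)) / (a.length + b.length - 2);
        return (ma - mb) / Math.sqrt(pooledVariance * (1.0 / a.length + 1.0 / b.length));
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3};
        double[] b = {4, 5, 6};
        double t = pooledT(a, b);
        // Both values are approximately 13.5 for this data.
        System.out.println("F = " + fValue(a, b) + ", t^2 = " + (t * t));
    }
}
```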

Uses the commons-math F Distribution implementation to compute p-values.

This implementation is based on a description at http://faculty.vassar.edu/lowry/ch13pt1.html

 Abbreviations: bg = between groups,
                wg = within groups,
                ss = sum of squared deviations
 

Since:
1.2
Version:
$Id: OneWayAnova.java 1462423 2013-03-29 07:25:18Z luc $

Nested Class Summary
private static class OneWayAnova.AnovaStats
          Convenience class to pass the dfbg, dfwg and F values around within OneWayAnova.
 
Constructor Summary
OneWayAnova()
          Default constructor.
 
Method Summary
 double anovaFValue(Collection<double[]> categoryData)
          Computes the ANOVA F-value for a collection of double[] arrays.
 double anovaPValue(Collection<double[]> categoryData)
          Computes the ANOVA P-value for a collection of double[] arrays.
 double anovaPValue(Collection<SummaryStatistics> categoryData, boolean allowOneElementData)
          Computes the ANOVA P-value for a collection of SummaryStatistics.
private  OneWayAnova.AnovaStats anovaStats(Collection<double[]> categoryData)
          This method calls the method that actually does the calculations (except P-value).
private  OneWayAnova.AnovaStats anovaStats(Collection<SummaryStatistics> categoryData, boolean allowOneElementData)
          This method actually does the calculations (except P-value).
 boolean anovaTest(Collection<double[]> categoryData, double alpha)
          Performs an ANOVA test, evaluating the null hypothesis that there is no difference among the means of the data categories.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

OneWayAnova

public OneWayAnova()
Default constructor.

Method Detail

anovaFValue

public double anovaFValue(Collection<double[]> categoryData)
                   throws NullArgumentException,
                          DimensionMismatchException
Computes the ANOVA F-value for a collection of double[] arrays.

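Preconditions: the categoryData Collection must contain at least two double[] arrays, and each array must contain at least two values.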

This implementation computes the F statistic using the definitional formula

   F = msbg/mswg
where
  msbg = between group mean square
  mswg = within group mean square
are as defined at http://faculty.vassar.edu/lowry/ch13pt1.html

Parameters:
categoryData - Collection of double[] arrays each containing data for one category
Returns:
the F-value
Throws:
NullArgumentException - if categoryData is null
DimensionMismatchException - if the categoryData collection contains fewer than 2 arrays or a contained double[] array does not have at least two values
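
The definitional formula F = msbg/mswg can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the library's implementation: the class name AnovaFSketch and its method are hypothetical, and IllegalArgumentException stands in for the library's own exception types.

```java
import java.util.Arrays;
import java.util.Collection;

final class AnovaFSketch {

    /** Computes F = msbg / mswg directly from the raw category data. */
    static double fValue(Collection<double[]> categoryData) {
        if (categoryData == null || categoryData.size() < 2) {
            throw new IllegalArgumentException("at least two categories are required");
        }
        int totalCount = 0;
        double totalSum = 0.0;
        for (double[] group : categoryData) {
            if (group.length < 2) {
                throw new IllegalArgumentException("each category needs at least two values");
            }
            totalCount += group.length;
            for (double v : group) {
                totalSum += v;
            }
        }
        double grandMean = totalSum / totalCount;

        double ssbg = 0.0; // between-groups sum of squared deviations
        double sswg = 0.0; // within-groups sum of squared deviations
        for (double[] group : categoryData) {
            double groupSum = 0.0;
            for (double v : group) {
                groupSum += v;
            }
            double groupMean = groupSum / group.length;
            ssbg += group.length * (groupMean - grandMean) * (groupMean - grandMean);
            for (double v : group) {
                sswg += (v - groupMean) * (v - groupMean);
            }
        }
        int dfbg = categoryData.size() - 1;          // between-groups degrees of freedom
        int dfwg = totalCount - categoryData.size(); // within-groups degrees of freedom
        return (ssbg / dfbg) / (sswg / dfwg);
    }

    public static void main(String[] args) {
        // Group means are 2, 3 and 7; ssbg = 42, sswg = 6, so F = 21.
        double f = fValue(Arrays.asList(
                new double[] {1, 2, 3},
                new double[] {2, 3, 4},
                new double[] {6, 7, 8}));
        System.out.println("F = " + f);
    }
}
```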

anovaPValue

public double anovaPValue(Collection<double[]> categoryData)
                   throws NullArgumentException,
                          DimensionMismatchException,
                          ConvergenceException,
                          MaxCountExceededException
Computes the ANOVA P-value for a collection of double[] arrays.

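Preconditions: the categoryData Collection must contain at least two double[] arrays, and each array must contain at least two values.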

This implementation uses the commons-math F Distribution implementation to compute the p-value, using the formula

   p = 1 - cumulativeProbability(F)
where F is the F-value and cumulativeProbability is the cumulative distribution function of the commons-math F distribution with dfbg and dfwg degrees of freedom.

Parameters:
categoryData - Collection of double[] arrays each containing data for one category
Returns:
the p-value
Throws:
NullArgumentException - if categoryData is null
DimensionMismatchException - if the categoryData collection contains fewer than 2 arrays or a contained double[] array does not have at least two values
ConvergenceException - if the p-value cannot be computed due to a convergence error
MaxCountExceededException - if the maximum number of iterations is exceeded
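
For a quick sanity check of p = 1 - cumulativeProbability(F), the special case of exactly three categories (dfbg = 2) has a closed form: the upper tail of the F distribution with 2 and dfwg degrees of freedom is (1 + 2F/dfwg)^(-dfwg/2). The sketch below uses only that special case. It is not how the library evaluates the distribution in general, and the class and method names are hypothetical.

```java
final class AnovaPValueSketch {

    /**
     * p-value for an ANOVA with exactly three categories (dfbg = 2).
     * Valid only for that case; other dfbg values need the general
     * F distribution.
     */
    static double pValueThreeCategories(double f, int dfwg) {
        if (f < 0 || dfwg < 1) {
            throw new IllegalArgumentException("f must be >= 0 and dfwg must be >= 1");
        }
        return Math.pow(1.0 + 2.0 * f / dfwg, -dfwg / 2.0);
    }

    public static void main(String[] args) {
        // F = 21 with dfwg = 6 gives (1 + 7)^(-3) = 1/512.
        System.out.println("p = " + pValueThreeCategories(21.0, 6));
    }
}
```

A small F-value gives a p-value near 1, and the p-value shrinks toward 0 as F grows.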

anovaPValue

public double anovaPValue(Collection<SummaryStatistics> categoryData,
                          boolean allowOneElementData)
                   throws NullArgumentException,
                          DimensionMismatchException,
                          ConvergenceException,
                          MaxCountExceededException
Computes the ANOVA P-value for a collection of SummaryStatistics.

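Preconditions: unless allowOneElementData is true, the categoryData Collection must contain at least two SummaryStatistics, and each must contain at least two values.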

This implementation uses the commons-math F Distribution implementation to compute the p-value, using the formula

   p = 1 - cumulativeProbability(F)
where F is the F-value and cumulativeProbability is the cumulative distribution function of the commons-math F distribution with dfbg and dfwg degrees of freedom.

Parameters:
categoryData - Collection of SummaryStatistics each containing data for one category
allowOneElementData - if true, allow computation for one category only or for one data element per category
Returns:
the p-value
Throws:
NullArgumentException - if categoryData is null
DimensionMismatchException - if allowOneElementData is false and the categoryData collection contains fewer than 2 SummaryStatistics or a contained SummaryStatistics does not have at least two values
ConvergenceException - if the p-value cannot be computed due to a convergence error
MaxCountExceededException - if the maximum number of iterations is exceeded
Since:
3.2
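
The F-value does not require the raw observations: per-category counts, sums and sums of squares are enough, which are the kind of quantities SummaryStatistics exposes through getN(), getSum() and getSumsq(). The following plain-Java sketch shows one way to do this. The Summary holder and the class name are hypothetical stand-ins, the allowOneElementData flag is not modelled, and this is not claimed to be the library's code.

```java
import java.util.Arrays;
import java.util.List;

final class AnovaSummarySketch {

    /** Hypothetical stand-in holding n, sum and sum of squares for one category. */
    static final class Summary {
        final long n;
        final double sum;
        final double sumsq;

        Summary(double... values) {
            double s = 0.0;
            double sq = 0.0;
            for (double v : values) {
                s += v;
                sq += v * v;
            }
            this.n = values.length;
            this.sum = s;
            this.sumsq = sq;
        }
    }

    /** Computes F = msbg / mswg from summaries only. */
    static double fValue(List<Summary> categories) {
        long totalN = 0;
        double totalSum = 0.0;
        double totalSumsq = 0.0;
        double sswg = 0.0;
        for (Summary c : categories) {
            totalN += c.n;
            totalSum += c.sum;
            totalSumsq += c.sumsq;
            // Within-group ss: sum of squares minus (sum^2 / n).
            sswg += c.sumsq - c.sum * c.sum / c.n;
        }
        double sst = totalSumsq - totalSum * totalSum / totalN; // total ss
        double ssbg = sst - sswg;                               // between-groups ss
        long dfbg = categories.size() - 1;
        long dfwg = totalN - categories.size();
        return (ssbg / dfbg) / (sswg / dfwg);
    }

    public static void main(String[] args) {
        // Same data as {1,2,3}, {2,3,4}, {6,7,8}: sswg = 6, sst = 48, F = 21.
        double f = fValue(Arrays.asList(
                new Summary(1, 2, 3),
                new Summary(2, 3, 4),
                new Summary(6, 7, 8)));
        System.out.println("F = " + f);
    }
}
```

Note that subtracting sum^2 / n from the sum of squares can lose precision when the values are large relative to their spread.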

anovaStats

private OneWayAnova.AnovaStats anovaStats(Collection<double[]> categoryData)
                                   throws NullArgumentException,
                                          DimensionMismatchException
This method calls the method that actually does the calculations (except P-value).

Parameters:
categoryData - Collection of double[] arrays each containing data for one category
Returns:
computed AnovaStats
Throws:
NullArgumentException - if categoryData is null
DimensionMismatchException - if the categoryData collection contains fewer than 2 arrays or a contained double[] array does not contain at least two values

anovaTest

public boolean anovaTest(Collection<double[]> categoryData,
                         double alpha)
                  throws NullArgumentException,
                         DimensionMismatchException,
                         OutOfRangeException,
                         ConvergenceException,
                         MaxCountExceededException
Performs an ANOVA test, evaluating the null hypothesis that there is no difference among the means of the data categories.

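Preconditions: the categoryData Collection must contain at least two double[] arrays, each array must contain at least two values, and alpha must be in the range (0, 0.5].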

This implementation uses the commons-math F Distribution implementation to compute the p-value, using the formula

   p = 1 - cumulativeProbability(F)
where F is the F-value and cumulativeProbability is the cumulative distribution function of the commons-math F distribution with dfbg and dfwg degrees of freedom.

True is returned iff the estimated p-value is less than alpha.

Parameters:
categoryData - Collection of double[] arrays each containing data for one category
alpha - significance level of the test
Returns:
true if the null hypothesis can be rejected with confidence 1 - alpha
Throws:
NullArgumentException - if categoryData is null
DimensionMismatchException - if the categoryData collection contains fewer than 2 arrays or a contained double[] array does not have at least two values
OutOfRangeException - if alpha is not in the range (0, 0.5]
ConvergenceException - if the p-value cannot be computed due to a convergence error
MaxCountExceededException - if the maximum number of iterations is exceeded
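
The decision rule described above (reject the null hypothesis iff the p-value is less than alpha, with alpha restricted to (0, 0.5]) can be sketched in isolation. The class and method names are hypothetical, the p-value is taken as an input rather than computed, and IllegalArgumentException stands in for the library's OutOfRangeException.

```java
final class AnovaDecisionSketch {

    /** Returns true iff the null hypothesis is rejected at significance level alpha. */
    static boolean rejectNull(double pValue, double alpha) {
        // Written as a negated conjunction so that NaN is rejected as well.
        if (!(alpha > 0.0 && alpha <= 0.5)) {
            throw new IllegalArgumentException("alpha must be in (0, 0.5], got " + alpha);
        }
        return pValue < alpha; // strict inequality: p == alpha does not reject
    }

    public static void main(String[] args) {
        System.out.println(rejectNull(0.001953125, 0.05)); // small p-value: reject
        System.out.println(rejectNull(0.2, 0.05));         // large p-value: do not reject
    }
}
```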

anovaStats

private OneWayAnova.AnovaStats anovaStats(Collection<SummaryStatistics> categoryData,
                                          boolean allowOneElementData)
                                   throws NullArgumentException,
                                          DimensionMismatchException
This method actually does the calculations (except P-value).

Parameters:
categoryData - Collection of SummaryStatistics each containing data for one category
allowOneElementData - if true, allow computation for one category only or for one data element per category
Returns:
computed AnovaStats
Throws:
NullArgumentException - if categoryData is null
DimensionMismatchException - if allowOneElementData is false and the number of categories is less than 2 or a contained SummaryStatistics does not contain at least two values


Copyright (c) 2003-2013 Apache Software Foundation