\name{xxtabs}
\title{Compute multiple summary statistics of data subsets}
\alias{xxtabs}
\alias{as.data.frame.xxtabs}
\alias{as.data.frame.xxtabs.list}
\alias{print.xxtabs}
\alias{print.xxtabs.list}
\alias{ALL}
\description{
  Splits the data into subsets, computes several summary statistics for
  each, and returns the results in a data frame. This is a friendlier
  interface to \code{aggregate} or \code{xtabs} when you want to compute multiple
  summary statistics for each subset.
}
\usage{
xxtabs(formula = ~., data = parent.frame(), subset,
       na.action=na.pass, drop.unused.levels=TRUE,
       FUN=list(n=sum),
       nocount.val=if(length(formula)==2) 0 else NA)
\method{as.data.frame}{xxtabs}(x, row.names=NULL, optional=TRUE, drop=TRUE,
       col.var=NULL, order=order.slowfirst, ...)
\method{as.data.frame}{xxtabs.list}(x, row.names=NULL, optional=TRUE, drop=TRUE,
       col.var=NULL, order=order.slowfirst, ...)
\method{print}{xxtabs}(x,...,na.print=NA)
\method{print}{xxtabs.list}(x,...)
ALL(a)
}
\arguments{
  \item{formula}{a formula of the form \code{~var1+...+varn} or
    \code{response~var1+...+varn}, where \code{var1,...,varn} are the
    cross-classifying variables. The left hand side of the formula may
    be either blank or a vector (the Examples also show a data frame);
    it indicates what data will be passed to the functions in
    \code{FUN}. Terms on the right hand side of the formula may
    optionally be wrapped in \code{ALL()}; see Details for an
    explanation.}
  \item{FUN}{a function or named list of functions. If the left hand
    side of the formula is a vector, this vector is split into subsets,
    and each function is applied to each subset. If the left hand side
    is left blank, the entire data frame is split into subsets, and each
    function is applied to each subset.}
  \item{data}{an optional matrix or data frame containing the variables
    in \code{formula}.}
  \item{subset}{an optional vector specifying a subset of observations
    to be used.}
  \item{na.action}{a function which indicates what should happen when
    the cross-classifying variables contain \code{NA}s. Useful values
    are \code{na.pass} and \code{na.omit}: the former keeps any
    \code{NA}s in the cross-classifying variables; the latter drops
    those records.}
  \item{drop.unused.levels}{a logical indicating whether to drop unused
    levels in the classifying factors. For example, if factor \code{fac}
    has levels \code{'a','b','c'} and there are no records at all with
    value \code{'c'}, then level \code{'c'} is omitted when this is
    \code{TRUE}.}
  \item{drop}{a logical indicating whether to drop empty cells, that is,
    combinations of levels of the classifying factors for which there
    are no records. For example, if there are no records for which
    both \code{fac1='a'} and \code{fac2='b'}, then row
    \code{fac1='a',fac2='b'} of the data frame will be omitted.}
  \item{nocount.val}{the value to use for cells with no data. If this is
    \code{NULL}, each of the functions is called with an empty argument;
    otherwise the specified value is used. The default is \code{0} when
    the formula has no left hand side (counts), and \code{NA} when
    there is a response variable.}
  \item{x}{the output of \code{xxtabs}}
  \item{col.var}{an optional string. If present, it should be one of the
    cross-classifying variables; then the data frame is
    reshaped so that this variable is used for column headings}
  \item{order}{a function indicating how the data frame should be
    ordered. The default value, \code{order.slowfirst}, says that the
    first cross-classifying variable listed in the formula should vary
    the slowest. The order function should take a numeric vector of dimensions, and
    return a numeric vector of indexes into an array with that many
    dimensions.}
  \item{na.print}{what to print for NA values in the cross-table}
  \item{row.names, optional}{passed to \code{as.data.frame}}
  \item{...}{other arguments, ignored}
  \item{a}{any object}
}
\details{
  Typical usages:
  \preformatted{
xxtabs(~a+b)
xxtabs(x~a+b, FUN=FUN)
xxtabs(x~a+ALL(b), FUN=FUN)
xxtabs(x~a+b, FUN=list(f,g))
xxtabs(~a+b, data=df, FUN=FUN)
}
  The first is like \code{xtabs}. The second splits \code{x} by \code{a}
  and \code{b}, and applies \code{FUN} to each subset. The third adds an
  extra level for \code{b} to the output, which covers the data at all
  levels of \code{b} combined. The fourth applies both \code{f} and
  \code{g} to each subset. The fifth splits the data frame into subsets
  by \code{a} and \code{b}, and applies \code{FUN} to each subset.
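
  As a rough illustration of the idea behind the second and fourth
  usages, the sketch below applies each function to the subsets using
  base R's \code{tapply}. It is only an analogy and makes no claim about
  how \code{xxtabs} is implemented; the data and the names \code{x},
  \code{a}, \code{b}, \code{f}, \code{g} and \code{res} are hypothetical
  placeholders.
  \preformatted{
## hypothetical data, for illustration only
x <- c(1, 2, 3, 4, 5, 6)
a <- factor(c('p','p','q','q','q','p'))
b <- factor(c('u','v','u','v','u','v'))
f <- length
g <- sum
## one cross-table per function, classified by a and b
res <- lapply(list(f=f, g=g), function(fn) tapply(x, list(a=a, b=b), fn))
res$f
res$g
}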
  }
\value{
  If there is only one function listed in \code{FUN}, \code{xxtabs} returns
  a table similar to what \code{xtabs} returns, with class \code{xxtabs}. If there are several
  functions, the return value is a list of tables, with class
  \code{xxtabs.list}. Each individual table has an attribute
  \code{responsevar} which is a character string with the name of the
  function. The overall object has an attribute \code{counts}, which is
  a table indicating the count in each cell.
}
\author{DJW}
\seealso{\code{\link{aggregate}}, \code{\link{xtabs}}}
\examples{
data(ChickWeight)

## xxtabs can be used as a replacement for xtabs
xtabs(~Diet, data=ChickWeight)
xxtabs(~Diet, data=ChickWeight)

xtabs(weight~Diet, data=ChickWeight)
xxtabs(weight~Diet, data=ChickWeight)

## xxtabs will tell you about missing values in the cross-classifying
#  variables, unless you tell it not to
ChickWeight[c(1,10),'Diet'] <- NA
xtabs(~Diet, data=ChickWeight)
xxtabs(~Diet, data=ChickWeight, na.action=na.omit)
xxtabs(~Diet, data=ChickWeight)

## xxtabs can report table margins, an alternative to margin.table
#  Here, there are some NA values, which xtabs misses but xxtabs keeps.
margin.table(xtabs(~Diet, data=ChickWeight))
xxtabs(~ALL(Diet), data=ChickWeight)
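
## For comparison, a sketch in base R only (no xxtabs involved):
#  addmargins appends a 'Sum' entry to a table of counts. The name
#  'tab.with.margin' is just an illustrative placeholder.
tab.with.margin <- addmargins(xtabs(~Diet, data=ChickWeight))
tab.with.margin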

## xxtabs makes it easy to report multiple summary statistics
#  Here we have converted the xxtabs into a data frame, to see all the
#  summary statistics side by side
data.frame(xxtabs(weight~ALL(Diet), data=ChickWeight, FUN=list(n=length,sd=sd)))
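
## For comparison, a sketch in base R only (no xxtabs involved):
#  aggregate can also return several statistics per group, but it packs
#  them into a single matrix column rather than separate columns.
#  The name 'agg' is just an illustrative placeholder.
agg <- aggregate(weight~Diet, data=ChickWeight,
  FUN=function(w) c(n=length(w), sd=sd(w)))
agg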


## If your functions need access to several columns of a data frame, put
#  the data frame on the left hand side of the formula
data.frame(xxtabs(ChickWeight~Diet, data=ChickWeight,
  FUN=list(nobs=nrow,
           totweight=function(x) sum(x$weight),
           nchick=function(x) length(unique(x$Chick)))))

## When you convert the table to a data frame, the function names are
#  all used as column headings. You can also specify one of the
#  cross-classifying variables to be used as a column heading
set.seed(1)  # make the randomly assigned age labels reproducible
ChickWeight$age <- sample(c('young','old'),size=nrow(ChickWeight),replace=TRUE)
as.data.frame(xxtabs(weight~Diet+age, data=ChickWeight, FUN=list(n=length,sd=sd)), col.var='age')
}
\keyword{manip}
