[R] Validation of R

Thu Apr 17 17:03:07 CEST 2003

On Thu, 17 Apr 2003, Brett Magill wrote:

> The National Institute of Standards and Technology offers reference data
> sets and expected results for various statistical procedures using these
> data sets.  From the web site:
>
> "The purpose of this project is to improve the accuracy of statistical
> software by providing reference datasets with certified computational
> results that enable the objective evaluation of statistical software."
>

Yes, but the NIST tests I have seen (possibly an unrepresentative subset)
have, IMO, been testing the wrong thing. That is, they are good for finding
out where rounding errors creep in when the procedures are really
stressed.
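
To make "really stressed" concrete, here is a minimal sketch of the kind of
rounding failure such tests are good at exposing. The data are made up for
illustration (they are not a NIST dataset), and the function names are
hypothetical: a one-pass "textbook" variance formula is compared with a
two-pass one on values that share a large common offset.

```python
def naive_variance(xs):
    """One-pass formula (sum of squares minus squared sum); cancels badly."""
    n = len(xs)
    s = sum(xs)
    ss = sum(x * x for x in xs)
    return (ss - s * s / n) / (n - 1)


def two_pass_variance(xs):
    """Subtract the mean first, then sum squared deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Illustrative data only: small spread on top of a large offset.
# The exact sample variance of (0.1, 0.2, 0.3) is 0.01.
data = [1e9 + d for d in (0.1, 0.2, 0.3)]

naive = naive_variance(data)
stable = two_pass_variance(data)

print("naive   :", naive)
print("two-pass:", stable)
```

Both functions compute the same quantity on paper; only the order of the
floating-point operations differs, which is the sort of thing a stress test
detects.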

They say
 "In response to industrial concerns about the numerical accuracy of
  computations from statistical software, the Statistical Engineering and
  Mathematical and Computational Sciences Divisions of NIST's Information
  Technology Laboratory are providing datasets with certified values for a
  variety of statistical methods."

In practice I think there's more danger from the wrong calculations being
done than from the results being accurate to 6 rather than 10
digits. Or from the wrong maximum being found in a multi-modal function,
which again is difficult to test.
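
A minimal sketch of the multi-modal problem, under stated assumptions: the
objective f below is an invented two-peak function, and gradient_ascent is a
deliberately simple fixed-step local optimizer, not any particular package's
routine. The same code started from two different points settles on two
different maxima, and each run on its own looks perfectly converged.

```python
import math


def f(x):
    """Hypothetical objective: a small peak near -2, a larger one near +2."""
    return math.exp(-(x + 2.0) ** 2) + 2.0 * math.exp(-(x - 2.0) ** 2)


def grad(x):
    """Analytic derivative of f."""
    return (-2.0 * (x + 2.0) * math.exp(-(x + 2.0) ** 2)
            - 4.0 * (x - 2.0) * math.exp(-(x - 2.0) ** 2))


def gradient_ascent(x0, step=0.1, iters=2000):
    """Plain fixed-step gradient ascent from starting value x0."""
    x = x0
    for _ in range(iters):
        x += step * grad(x)
    return x


x_left = gradient_ascent(-3.0)   # start on the slope of the small peak
x_right = gradient_ascent(1.0)   # start on the slope of the large peak

print("from -3.0:", x_left, "f =", f(x_left))
print("from  1.0:", x_right, "f =", f(x_right))
```

Nothing in either run signals that a better maximum exists elsewhere, so a
reference value accurate to many digits does not help unless the test also
pins down the starting point.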

It's perhaps also worth noting that the worst situation I know of in
recent years arose from accepting a poor default for a user-adjustable
precision setting (in S-PLUS, not that it really matters where).
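
As a generic illustration only (this is not the S-PLUS case, whose details
are not given here), the sketch below shows how a loose convergence tolerance
can quietly limit accuracy. The function fixed_point and its tol parameter
are hypothetical; the iteration x = cos(x) is chosen simply because it
converges slowly enough for the effect to be visible.

```python
import math


def fixed_point(g, x0, tol, max_iter=1000):
    """Iterate x <- g(x) until successive values differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


# Same problem, same start; only the tolerance differs.
loose = fixed_point(math.cos, 1.0, tol=1e-2)    # a "poor default"
tight = fixed_point(math.cos, 1.0, tol=1e-12)   # user-tightened setting

print("loose tolerance:", loose)
print("tight tolerance:", tight)
print("difference     :", abs(loose - tight))
```

Both calls return without complaint; the loose one just stops early, so the
answer is only as good as the tolerance the user happened to accept.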

	-thomas