Free statistical software

Free statistical software comes from a variety of sources, including governments, non-governmental organizations (NGOs), international bodies such as UNESCO, universities, and individuals. Most packages are menu driven and fairly easy to learn, while a few are command driven. Many of these packages have been used in academic research published in peer-reviewed journals or in publications from major organizations. Some are very popular, while others are much less frequently used. In general, though, free statistical software can be seen as a practical alternative to commercial packages.

Sources of free statistical software
Some of the free software packages come from governmental or non-governmental organizations, such as Epi Info from the CDC and IDAMS from UNESCO. Other packages come from smaller or independent organizations or universities, such as Instat and Irristat. A couple of packages are being developed by groups of volunteers. PSPP, from the GNU project, is a free package that is developing into a clone of SPSS. The R project is also very frequently used. A large proportion of free statistical software packages, however, come from individuals. Commonly used packages from individuals include Easyreg, MicrOsiris, OpenStat, and Zelig.

In some cases, the statistical software packages were developed to make key technologies available to those who could not otherwise afford them and to empower development. In other cases, the packages were developed as teaching aids. Other packages were developed for specific purposes but can be used more generally. Examples are Irristat, developed for agricultural analysis, and Epi Info, developed for public health. A couple of packages do not appear to state why they were developed, other than for general statistical analysis.

Reviews of free statistical software
There are a few reviews of free statistical software. Two reviews appeared in journals (though they were not peer reviewed): one by Zhu and Kuljaca, and an article by Grant that consisted mainly of a brief review of R. Zhu and Kuljaca outlined some useful characteristics of software, such as ease of use, a broad range of statistical procedures, and the ability to develop new procedures. They reviewed several programs and identified which ones had the most functionality at that time. Several of the programs may not then have had all of the desired capabilities for advanced statistics. Grant reviewed some of the programming features of R and briefly mentioned the availability of other programs. One other paper reviewed statistical packages, mainly commercial ones, but included R. One article reviewed EasyReg and included a discussion of its accuracy.

There has been only one review that compared the output of various packages. In this review, all of the packages read either CSV files (comma-separated values: text files in which values are separated by commas) or Excel format. All of the packages gave exactly the same results for correlation and regression. The free software packages also gave the same regression results as Excel. One of the main differences among the packages was how they handled missing data. With the example data sets used in the review, and for the package versions available in November 2006 when the review was conducted, two packages, MicrOsiris and Epi Info, could read files in which missing values were left blank. Two other programs, Stat4U and WinIdams, needed a placeholder code for missing values, such as -9 or -9.99. The other packages could only handle data sets with no missing values.
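The three ways of handling missing values described above can be illustrated with a minimal Python sketch. It is not taken from the review or from any of the packages: the data, the function names, and the choice of -9 as the placeholder are all assumptions made for illustration. The sketch reads a CSV file with blank fields, recodes the blanks to a placeholder, drops incomplete rows, and then computes a Pearson correlation and a simple least-squares regression from the textbook formulas.

```python
import csv
import io
import math

# Hypothetical example data (not from the review); a blank field marks a missing value.
RAW_CSV = """x,y
1,2
2,4.1
3,
4,8.2
,10
5,9.9
"""

# Assumed placeholder, of the kind some packages require (e.g. -9 or -9.99).
# Note that a placeholder is only safe if it cannot occur as a real data value.
MISSING_CODE = -9.0


def read_rows(text):
    """Parse CSV text into dicts of floats; blank fields become None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {name: (float(value) if value.strip() else None) for name, value in record.items()}
        for record in reader
    ]


def recode_missing(rows, code=MISSING_CODE):
    """Replace None with a numeric placeholder, for tools that cannot read blanks."""
    return [{k: (code if v is None else v) for k, v in row.items()} for row in rows]


def complete_cases(rows, code=MISSING_CODE):
    """Listwise deletion: keep rows with no missing (None or placeholder) values."""
    return [r for r in rows if all(v is not None and v != code for v in r.values())]


def pearson_and_ols(rows):
    """Return (r, slope, intercept) for the regression of y on x."""
    xs = [r["x"] for r in rows]
    ys = [r["y"] for r in rows]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sxy / math.sqrt(sxx * syy), slope, intercept


rows = read_rows(RAW_CSV)            # blanks read as missing
recoded = recode_missing(rows)       # blanks replaced by the placeholder
clean = complete_cases(recoded)      # only complete rows remain
r, slope, intercept = pearson_and_ols(clean)
print(len(rows), len(clean))
print(round(r, 4), round(slope, 4), round(intercept, 4))
```

Because the correlation and regression coefficients follow directly from these closed-form sums, packages that receive the same complete rows would be expected to agree, which is consistent with the identical results reported in the review. Differences arise earlier, in which rows each package is able to read.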

A couple of websites that list software also have very brief reviews of each package. The two sites that have these are by StatCon and by Pezzullo. These sites mainly offer a brief list of the features available in the packages. Similarly, one bachelor's thesis compared the statistical procedures available in free statistical packages. In that comparison, R had all of the procedures, OpenStat had 16, MacAnova had 15, and MicrOsiris had 12. The others had from 8 to 11 of the procedures.

There is also a journal devoted specifically to statistical software, although its main focus is on commercial software, R, and coding snippets.

These free software packages have been used in a number of scholarly publications, which suggests that at least some journals, NGOs, and other organizations regard them as valid. For example, OpenStat was used in a research letter to JAMA and in several published studies. Irristat has been used in an agricultural report, EasyReg is listed or used in several papers, and WinIdams has also been used in published papers.

Using free statistical software
Before using any statistical package, it is generally a good idea to have a solid background in statistics. The package can then be used to best advantage: for example, to choose the most appropriate test and to check that all the necessary assumptions are met, so that appropriate conclusions can be drawn.

Once the statistical issues are understood, the next step is to decide which package to use. Most of these packages are menu driven and can be learned in a couple of hours at most. The exceptions are R, which is generally code driven and takes much longer to learn, and to some extent the CDC's Epi Info, which also takes some time to learn.

Several of the packages also have tutorials, which provide a basic introduction to the programs. For example, the CDC has tutorials about Epi Info. The CDC page also lists a video slide show tutorial from the University of Nebraska, and another site has online training classes. R has a large number of tutorials and manuals, in English and other languages, as well as an FAQ site. A few of the packages have email discussion lists, including R, OpenStat, and PSPP.

Some of the packages have online manuals, guides, or help pages. These are useful when there are questions about specific procedures or statistical tests. Manuals or guides are available for R, EasyReg, OpenStat, PSPP, Vista, WinIdams, and Zelig. The CDC Epi Info site itself does not have a manual, but a faculty member from Emory's School of Public Health has written an introductory manual.