########## Version 0.15.5 (July 2023) ##############
* Issues arising from clang17 corrected

########## Version 0.15.4 (January 2023) ##############
* Deprecated C function sprintf was replaced with snprintf

########## Version 0.15.3 (August 2022) ###############
* Minor changes to some .Rd files for compatibility with KaTeX

########## Version 0.15 (March 2020) ###############
* Changes to some C files

########## Version 0.14 (February 2020) ##############
* Updates for the C++ One Definition Rule
* Change to createHmat.R file

########## Version 0.13 (July 2017) ###################
* added registration of native routines

########## Version 0.12-6 (July 2016) ##################
* added importFrom to NAMESPACE file, to eliminate the CRAN NOTEs

########## Version 0.12-5 (April 2015) ##################
* S3 methods added to the NAMESPACE file

########## Version 0.12-4 (September 2014) ##################
* Several small changes to .cpp files as a result of CRAN warnings
* The old src/Makevars file was deleted

########## Version 0.12-3 (July 2013) #######################
* R version 3.0 changed the "eigen" function, so that if the "symmetric=TRUE" option is not given explicitly, *and if rownames and colnames are not the same*, it is assumed that the matrix is *not* symmetric. This was causing complex eigenvalues to appear in the submatrices of the "trim.matrix" function (only the main matrix was given the "symmetric=TRUE" option). The bug was reported by a user on 13.6.13.
* Subdirectory "vignettes" was created for vignette sources, as suggested since R 2.14.
* Lines wider than 90 characters were cut down in *.Rd files.

########## Version 0.12-2 (August 2012) #######################
* Bug correction in the eleaps function. In versions 0.11-1 through 0.12-1, for a few data conditions the processing of the variables provided in the include argument was interfering with the checks of singularity. Version 0.12-2 fixes this problem.
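The "eigen" behaviour mentioned under version 0.12-3 can be illustrated with base R alone. The snippet below is a minimal sketch, not code from the package: the matrix "m" and its dimnames are made up for illustration. It shows that a numerically symmetric matrix with different row and column names is not reported as symmetric, and that passing "symmetric=TRUE" explicitly avoids relying on that test.

```r
# Illustrative matrix (an assumption, not package data): symmetric values,
# but row names and column names differ.
m <- matrix(c(2, 1, 0,
              1, 2, 1,
              0, 1, 2), nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("x", "y", "z")))

# isSymmetric() compares attributes (including dimnames) by default, so the
# mismatched names make it report FALSE. eigen() relies on isSymmetric()
# when its "symmetric" argument is not supplied.
isSymmetric(m)
isSymmetric(unname(m))

# Stating the symmetry explicitly selects the symmetric solver, which only
# uses the lower triangle and returns real eigenvalues.
ev <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
ev
```

Dropping the names with unname() before the decomposition is an alternative when the names are not needed afterwards.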

########## Version 0.12-1 (July 2012) #######################
* patch for non-standard C++ code which gave an error message when compiled on Solaris platforms.

########## Version 0.12-0 (July 2012) #######################
* Changes were made to the code, so as to also allow the anneal, improve and genetic algorithms to keep running, when called with criteria "tau2", "xi2", "zeta2" or "ccr12", if a subset associated with a singular "mat" matrix is selected. Whereas until now this would break the execution of the search algorithms, as of now an artificially poor criterion value is associated with such subsets (to ensure that they are not selected), but the algorithm proceeds. At the end, a warning is passed on to the user if some subsets associated with ill-conditioned submatrices were indeed found and dropped in the course of the searches.
* The memory management of the Fortran and C++ code was improved, making the anneal, improve, genetic and eleaps functions more robust when handling larger (up to a few hundred original variables) problems.
* An additional search sequence was made available to the eleaps function, making it more adaptable to characteristics of the data. In particular, when the original "mat" matrix is numerically singular and optimality is not likely to be proved within the specified timelimit, eleaps now uses a different sequencing that tries to ensure that the potentially more interesting subsets are evaluated first.
* The numerical procedures for detecting and handling ill-conditioned matrices were improved.
* Two bugs were corrected in the eleaps function: (i) The argument "timelimit" was not working correctly with large (several thousand seconds) values. (ii) When called with non-NULL include and exclude arguments, searches according to the RV criterion were not working correctly.
* The default value for the "popsize" argument in the genetic function was changed from 100 to max(100,2*ncol(mat)), trying to ensure that for larger problems reasonable genetic diversity is available by default.
* The output returned by the genetic function, when stopped because the initial population did not guarantee enough genetic diversity, was changed. Previously, in this case genetic returned NA, and now it returns the best solutions found in the search before it was aborted.
* The output given by searches that were not able to find numerically reliable solutions for all cardinalities sought was simplified and restricted to the cardinalities where reliable subsets were found.
* Some warning messages are now displayed more selectively and only in the presence of non-expected problems. The warning message about ill-conditioned matrices is only shown when it is not known in advance that the input matrix "mat" was singular (in particular, this message is not displayed when it is known that "mat" originated from a problem with more original variables than observations). Likewise, the warning regarding restricted searches is now displayed only if ill-conditioned subsets were indeed dropped in the course of the search.
* The lda method for the pre-processing function ldaHmat was dropped. This method worked incorrectly if, between creating a MASS lda object and calling the ldaHmat function, the x or grouping objects were changed.

########## Version 0.11-3 (December 2011) #######################
* More changes in the C++ and R code to try and avoid remaining errors in the Solaris compilation and Notes in five other compilations.
* Bug correction in the eleaps function: the change made in July 2011 to allow for "partial" searches when the mat matrix is singular was also conducting partial searches when the mat matrix was OK.
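The "popsize" default introduced in version 0.12-0, max(100,2*ncol(mat)), is simple arithmetic on the number of columns of the input matrix. A small sketch, in which "default_popsize" is a hypothetical helper written for illustration and not a subselect function:

```r
# Hypothetical helper reproducing the documented default for "popsize";
# it is not part of the subselect package.
default_popsize <- function(mat) max(100, 2 * ncol(mat))

small <- diag(30)   # stand-in for a 30-variable covariance matrix
large <- diag(80)   # stand-in for an 80-variable covariance matrix

default_popsize(small)  # 2*30 = 60 is below 100, so the default stays at 100
default_popsize(large)  # 2*80 = 160 exceeds 100, so the default becomes 160
```

The population size therefore only grows beyond 100 once the problem has more than 50 variables.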

########## Version 0.11-2 (October 2011) #######################
* Minor changes in the C++ and R code to try and avoid errors in the Solaris compilation, a warning in Fedora compilation and Notes in all other compilations.
* "using namespace std" added to several *.cpp files.

########## Version 0.11-1 (July 2011) #######################
* The "leaps" function was renamed "eleaps", to distinguish it from other "leaps" functions in R packages. The former "leaps" function is deprecated, but a "leaps" wrapper for the eleaps function was created in order to ensure backward compatibility.
* A Sweave vignette was created in the new inst/doc subdirectory.
* Changes were made to the code, so as to allow the eleaps search algorithm to keep running when called with criteria tau2, xi2, zeta2 or ccr12, if a subset associated with a singular mat matrix is selected. Whereas until now this would break the execution of the search algorithms, as of now an artificially poor criterion value is associated with such subsets (to ensure that they are not selected), but the algorithm proceeds. At the end, a warning is passed on to the user. This change has not yet been introduced in the anneal, improve and genetic algorithms.
* A "data" directory was created. A "farm" data set and its help file were created.
* A new attribute "FisherI" was created, to indicate that an object is of class glm and that the Wald criterion should be the default criterion.
* A Subversion repository was created for the subselect project

########## Version 0.10.1 (December 2009) ###################
* A new criterion for variable selection in a generalized linear model context: the Wald criterion. Files wald.R (in /R) and wald.Rd (in /man) were added.
* All search algorithm files (anneal.R, genetic.R, improve.R and leaps.R) were changed to cater for the Wald criterion.
* All algorithm man files were changed to account for the Wald criterion.
* The default criterion has been changed to "default" and is set to "RM" if r=0 and to "TAU2" if r>0.
* Changes were made to the validation.R file, to cater for the Wald criterion.
* For compatibility with the output of other search functions, "drop=FALSE" was added to the subsetting of the output of the "genetic" function, when "kabort" is involved.
* Changes were made to ensure that the C++ code followed the standards. Problems had arisen with gcc 4.4 and the SunStudio compilers.

########## Version 0.9.9993 (January 2009) ###########
Changes to C++ code that was breaking the package. ("#include " added in sscma.cpp)

########## Version 0.9.9992 (February 2007) ###########
Changes to Fortran code that was breaking the package on Linux 64-bit machines (e.g.: conversions between types were smoothed out).

########## Version 0.9.9991 (January 2007) ###########
Changes to C code that could break the package on Linux 64-bit machines (e.g.: use of "long" data type was changed to "int").

########## Version 0.9.999 (October 2006) ############
1) Changes in the C code required by the transition to R 2.4 (and the use of a new C compiler).
2) The leaps.R, lmHmat.default and glhHmat.default functions underwent minor changes for greater numerical stability.
3) The "qr" criterion used when checking for positive definiteness of the input matrix in the validmat.R function was removed for coherence with the use of the "tolval" argument.

########## Version 0.9.99 (June 2006) ############
1) Three new functions have been added to "pre-process" the data for submission to the search algorithms using any of the four new "multivariate linear hypothesis" criteria: function \code{lmHmat} creates the input for a linear model (linear regression), function \code{ldaHmat} does the same for a linear discriminant analysis and \code{glhHmat} does likewise for the general linear hypothesis.
2) The \code{call} item in the output list of the \code{leaps} function (which had been mistakenly dropped in version 0.9-9) was re-added.

########## Version 0.9.9 (May 2006) ############
Some major changes, as well as some minor ones.
1) Four new criteria have been added: "CCR1_2", "TAU_2", "XI_2" and "ZETA_2". All are criteria for subset selection in the context of the multivariate linear model. The four criteria were added as options for the "criterion" argument in the subset selection algorithms (anneal, genetic, improve and leaps).
2) Four new functions were created to compute the value of the new criteria when given a total matrix, a model effects matrix (H) with its expected rank (r) and a subset of variables. These new functions are called "ccr12.coef", "tau2.coef", "xi2.coef" and "zeta2.coef", and join the similar functions "gcd.coef", "rm.coef" and "rv.coef" which already existed.
3) Two new auxiliary functions were created: "validmat" and "validnovcrit", which include validation checks for the symmetry and positive definiteness of the total matrix and for the suitability of the H and r arguments of the new criteria, respectively. These are not to be called directly by the user.
4) The validation part of the old and new functions has been further modularized, with calls to the new validation functions mentioned above.
5) A new argument "tolsym", with default value 1000*.Machine$double.eps, has been added to the four search functions (anneal, genetic, improve and leaps). It is used in the symmetry checks of the total and effects (H) matrices. When the supplied matrices have corresponding elements differing by more than this value, they are rejected. If corresponding elements differ, but by less than this value, the supplied matrix is replaced by its symmetric part and a warning is provided.
6) Man files were written for the new functions, and the help files for the search functions were improved, with examples using the new criteria.
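The "tolsym" rule in item 5 above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's validation code: "check_symmetry" is a hypothetical helper, and the use of the largest absolute difference between corresponding elements is an assumption about how the differences are measured.

```r
# Hypothetical helper illustrating the documented "tolsym" rule; the real
# checks live in the package's validation functions and may differ in detail.
check_symmetry <- function(mat, tolsym = 1000 * .Machine$double.eps) {
  # Largest absolute difference between corresponding elements (assumption).
  maxdiff <- max(abs(mat - t(mat)))
  if (maxdiff > tolsym) {
    stop("matrix is not symmetric within tolsym")
  }
  if (maxdiff > 0) {
    warning("matrix is not exactly symmetric; using its symmetric part")
    mat <- (mat + t(mat)) / 2
  }
  mat
}

# Tiny asymmetry (about 1e-14, below the default tolsym of roughly 2.2e-13):
# accepted with a warning and replaced by the symmetric part.
a <- matrix(c(1, 0.5, 0.5 + 1e-14, 1), nrow = 2)
s <- suppressWarnings(check_symmetry(a))

# Large asymmetry: rejected with an error, caught here for demonstration.
b <- matrix(c(1, 0.5, 0.6, 1), nrow = 2)
rejected <- inherits(try(check_symmetry(b), silent = TRUE), "try-error")
```

Averaging a matrix with its transpose yields an exactly symmetric result, since floating-point addition is commutative.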

########## Version 0.9.1 (June 2005) #############
1) The default value of argument "tolval" in functions "trim.matrix", "leaps", "anneal", "genetic" and "improve" has been changed to 10*.Machine$double.eps (previously .Machine$double.eps).
2) Correction of minor mistakes in the "leaps" help page and in function "validation" (which is not in the Namespace) - changes made to the "leaps" function between versions 0.8 and 0.9 had not been updated in this function and help page.

########## Version 0.9 (March 2005) #############
1) Namespace added.
2) A new function "trim.matrix" was introduced, to deal with the issue of multicollinearities in the input data.
3) The four search functions ("leaps", "anneal", "genetic" and "improve") now perform a test for multicollinearity, which consists of comparing the ratio of the smallest to the largest eigenvalue of the input covariance/correlation matrix with the new argument "tolval" (.Machine$double.eps, by default). If this ratio is smaller than "tolval", the function exits with a message to the user requiring that the input matrix be cleared of its ill-conditioning (perhaps through the use of the new function "trim.matrix") before resubmitting it to the search function. The use of the search functions can be forced, by lowering the value of "tolval" (negative values of "tolval" are not allowed), but with risks of numerical inaccuracy.
4) The previous maximum number of variables allowed for the search functions (300) was eliminated. The code now runs for any number of variables. A test for input data with more than 400 variables was introduced in the search functions, and when that condition tests positive, the function stops execution and passes a message to the user, who may choose to run the function anyway, by setting a new function argument ("force") to TRUE. The code will now run with any number of variables, but may crash the R session due to memory problems.
5) Non-standard C++ code in the routines was changed and code which caused memory errors was also modified. The new code passes the "valgrind" memory checks.
6) Some Fortran code which did not comply with Fortran 77 standards was changed (e.g.: symbols such as ">" changed to ".GT."; comments using "*" in the first column; etc.). Some non-standard code was kept (use of variable array sizes in subroutine declarations; use of do-loops and do-while-loops).
7) The default behaviour when a non-square matrix is passed as input data to the search functions has been changed: instead of assuming that the covariance matrix of the input data matrix was wanted, the new option is to compute the input data matrix's *correlation* matrix.
8) The default value for "pcindices" has been changed in the "leaps" function and is now "first_k" as in the "anneal", "genetic" and "improve" functions. This change is possible due to changes in the "leaps" source code.

########## Version 0.8 (May 2004) ##############
1) Output lists for the four search functions (leaps, anneal, genetic and improve) now have a fifth object: "$call", which gives the match.call() of the command that produced the output.
2) The C++ routine to compute the eigendecomposition for the "GCD" criterion in the "leaps" function has been improved. A more exact routine based on the QR decomposition is now used (see file "qldiagon.cpp" in the /src subdirectory).
3) Validation of input for the four search functions has now been transferred to separate modules ("validation.R", "validannimp.R" and "validgenetic.R"), which are mentioned in "/man/subselect-internal.Rd". Some minor changes to these validation checks were made.
4) The default values for "pcindices" were changed in the "leaps" function (where it is now NULL) and in the "anneal", "genetic" and "improve" functions (where it is now "first_k").
This seeks to highlight the behaviour of the functions when the GCD criterion (which uses "pcindices") is invoked with the default value of pcindices: "leaps" explicitly requires a set of pcindices, whereas the other three search functions will, by default, compare, for each cardinality "k" that has been requested, the k-variable subsets with the first k PCs.

########## Version 0.7.1 (March 2004) ##############
1) Two lines defining constants "TRUE" and "FALSE" in the C++ file src/matrixb.h, which gave errors when compiling in Mac OS X, were changed. The same change was made in file SR_vsda.h, and similar changes with the constants "EPSLON" were made in files lagmat.cpp, gaussjel.cpp, gausstrg.cpp, matfact.cpp, pdiagon.cpp and simgaussjell.cpp. Constants "MIN" and "MAX" in files SR_vsda.h, sr_sscma.cpp and sr_wrkf1.cpp were renamed MINIMZ and MAXIMZ, respectively.
2) Warnings in the R code and help files were introduced, to caution against the difference in default behaviour of the "leaps" function - when compared to the search algorithm functions "anneal", "genetic" and "improve" - in case the "GCD" criterion is requested. In the latter functions, GCD values compare (by default) the subspaces spanned by k-variable subsets and by the first k PCs. In the leaps function, GCD values, by default, compare the subspaces spanned by k-variable subsets and by the first *kmin* PCs (i.e., the cardinalities of the subsets of variables and PCs are not equal, except for the first cardinality requested). This option is directly related to the nature of the leaps algorithm.

########## Version 0.7 (March 2004) ##############
1) A major new function has been included: "leaps". Leaps implements Duarte Silva's adaptation of Furnival and Wilson's Leaps and Bounds Algorithm for variable selection in Regression Analysis. The bulk of computations are carried out by C++ routines.
2) New checks for symmetry and positive-definiteness of the input matrix were added.
3) The references in the help files were updated.

########## Version 0.4.1 (February 2004) #########
1) Changed file gcd.Rd, which had the old default option for the argument pcindices. This generated a codoc mismatch.

########## Version 0.4 (April 2003) ##############
1) Version 1.7 of R includes LAPACK. The configure scripts to check for the presence of LAPACK in the system have been deleted.
2) Argument "initialsol" in functions "anneal" and "improve", as well as argument "initialpop" in function "genetic" are now NULL by default. The validations and initializations have been changed as a result.
3) Argument "pcindices" in function "gcd.coef" is initially NULL and will be set to 1:kmax unless the user has specified a set of PC indices.
4) Validation tests were added to functions "gcd.coef", "rm.coef" and "rv.coef" to check whether user-specified variable (and/or PC) indices are integers.
5) Validation tests were added to functions "anneal", "improve" and "genetic" to test whether initial solutions or populations are specified as arrays of integers.
6) A bug was fixed in functions "anneal", "genetic" and "improve" in those cases when argument "pcindices" is defined by the user. A logical comparison between a vector "pcindices" and another vector was being treated as if a global value for equality were returned.

########## Version 0.3 (October 2002) ##############
1) Changes to the configure scripts necessary for configuration on Mac OS X.
2) Functions anneal, genetic and improve now include checks for covariance/correlation matrices not of full rank. If non-full rank matrices are given, the functions exit with a warning.

########## Version 0.2 (June 2002) #################
The following changes were made in Version 0.2 of R package subselect:
1) A bug was corrected in function anneal: when more than one solution (nsol > 1) was requested, the initial temperature for subsequent solutions was not being updated.
2) The Fortran code was changed to use R's default random number generator, introducing a call to R as described on pages 48 and 50-51 of the "Writing R Extensions" manual, version 1.5. As a consequence, the LAPACK routine SLARUV is no longer used and the "iseed" argument of functions "anneal", "genetic" and "improve" has been replaced by the logical argument "setseed".
3) Initial solutions (variable subsets) may now be fed by the user to each of functions "anneal" (argument "initialsol"), "genetic" (argument "initialpop") and "improve" (argument "initialsol").
4) The Fortran code was changed to pass messages directly to the R GUI (which did not occur in the Windows version), as described on page 50 of the "Writing R Extensions" manual, version 1.5.
5) The "indices" argument of functions "gcd.coef", "rm.coef" and "rv.coef" can now be 2-d or 3-d arrays. In this way, the "$subsets" and "$bestsets" output options of functions "anneal", "genetic" and "improve" can be passed directly to functions "rm.coef", "rv.coef" and "gcd.coef", allowing easy computation of values of any criteria for subsets produced by the search algorithms using a different criterion.
6) The "printfile" option of functions "anneal", "genetic" and "improve" has been dropped, as printing to files is best done using the R environment.
7) Function "anneal" has a new argument, "coolfreq", which controls the frequency with which the simulated annealing temperature is cooled (used to be hard-coded as every 20 iterations in the Fortran code of version 0.1).
8) Function "genetic" has a new argument "mutprob", which controls the probability of each offspring in the genetic algorithm undergoing a mutation (i.e., being fed to the restricted local improvement algorithm), if the "mutate" option is set to TRUE (previously *all* children were mutated, which considerably slowed down run times).
9) Function "full.k.search" has been dropped from the package.
It was computationally inefficient and not in the general spirit of the package. For computationally efficient full-search algorithms, see program SSCMA by Pedro Duarte Silva of the Universidade Catolica do Porto, available at http://porto.ucp.pt/psilva
10) The help files were updated accordingly. Details of the "rv.coef" and "rm.coef" functions were also introduced.