[R] Chi2 algorithm - R

Olivier Crouzet olivier.crouzet at univ-nantes.fr
Wed Nov 23 22:56:54 CET 2016


Hi,

(1) If the package has been installed from CRAN, then it has been tested (which does not imply that it is exempt from bugs); a package from any other source (e.g. GitHub) has not been tested... according to my understanding;

(2) The GNU GPL explicitly states that you install and use the software "at your own risk"... This is the case for any GPL licensed software, even for base-R... so this is no surprise!
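
For what it's worth, where an installed package came from and who maintains it can be checked from within R. The lines below are only a sketch: they use the base package "stats" as a stand-in because it is always installed, and the assumption is that you would substitute "discretization" once that package is installed. A CRAN install typically records "Repository: CRAN" in the package's DESCRIPTION file; other sources usually do not.

```r
# Sketch: inspect who maintains an installed package and where it came from.
# "stats" is a stand-in that is always available; replace it with
# "discretization" (assumed to be installed) to check that package.
pkg <- "stats"

desc <- utils::packageDescription(pkg)      # parsed DESCRIPTION file
pkg_maintainer <- utils::maintainer(pkg)    # the Maintainer field
pkg_repository <- desc$Repository           # typically "CRAN" for CRAN installs

cat("Maintainer:", pkg_maintainer, "\n")
cat("Repository:",
    if (is.null(pkg_repository)) "(not recorded)" else pkg_repository, "\n")
```

The maintainer is simply the contact person named by the package author(s), which is why questions about one package's internals usually go to that person rather than to the list.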

Olivier.



--
Olivier Crouzet
LLING - Laboratoire de Linguistique de Nantes
UMR 6310 CNRS / Université de Nantes

-----Original Message-----
From: Luke Skywalker <mattered91 at gmail.com>
Sender: "R-help" <r-help-bounces at r-project.org>
Date: Wed, 23 Nov 2016 22:26:45
To: peter dalgaard<pdalgd at gmail.com>
Cc: <r-help at r-project.org>
Subject: Re: [R] Chi2 algorithm - R

What does it mean to "have a maintainer"? Is he a third party? Is he an
individual developer whose package you install at your own risk? Are
the packages created by maintainers not tested?

Anyway, I wrote to him. I'm waiting for a response.

Regards

On 23/Nov/2016 22:21, "peter dalgaard" <pdalgd at gmail.com> wrote:

> Notice that this relates to an R _package_, which has a maintainer. You
> cannot expect general R users or developers to know about the details of
> the package. It doesn't look like there is documentation beyond the help
> pages, so you may need to contact the maintainer or study the actual code.
>
> -pd
>
> > On 23 Nov 2016, at 17:08 , Luke Skywalker <mattered91 at gmail.com> wrote:
> >
> > Good evening,
> >
> > I'm encountering a different kind of discretization from the one
> > described by Liu and Setiono in their 1997 paper, using the Chi2
> > algorithm for feature selection with discretization.
> >
> > As stated in the R documentation (discretization - R (from CRAN)
> > <https://cran.r-project.org/web/packages/discretization/discretization.pdf>),
> > the R package discretization offers the function chi2, which originates in
> > the following papers:
> >
> > Liu, H. and Setiono, R. (1995). Chi2: Feature selection and discretization
> > of numeric attributes, Tools with Artificial Intelligence, 388–391.
> >
> > Liu, H. and Setiono, R. (1997). Feature selection and discretization,
> > IEEE Transactions on Knowledge and Data Engineering, Vol. 9, no. 4, 642–645.
> >
> > I wrote the following R code, in which I have set alpha and delta equal
> > to the values used in the papers above. The code prints out the
> > discretized data frame. I used the Iris data set, as in one of the
> > examples in the two papers. The first paper above states that
> > alpha = 0.5 and delta = 5%, and that "the originally odd numbered data
> > are selected for training (75 patterns) and rest for testing (75
> > patterns)". With this setup, the Sepal attributes should be removed.
> >
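> > library(discretization)
> > data(iris)
> > df1 <- iris[FALSE, ]  # empty data frame with the same columns as iris
> > for (i in 1:nrow(iris)) {
> >     if (i %% 2 != 0) {  # keep the odd-numbered rows (75 training patterns)
> >         df1 <- rbind(df1, iris[i, ])
> >     }
> > }
> > chi2(df1, alp = 0.5, del = 0.05)$Disc.data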
> >
> > The point is that, looking at the data frame printed by the last
> > instruction, you can see that no attribute is removed. The discretized
> > data frame still has 4 discretized attributes: if I understood the
> > above papers correctly, Sepal Length and Sepal Width should both have been
> > discretized into just one interval by the Chi2 algorithm.
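
One way to check this claim without reading the printed data frame by eye is to count the distinct values in each discretized column: an attribute that has been merged into a single interval shows exactly one distinct value. The sketch below uses base R only. `count_intervals` is a hypothetical helper written for this illustration, not part of the discretization package, and the commented-out `chi2` call assumes that package is installed.

```r
# Odd-numbered rows of iris, as in the loop in the quoted message,
# selected by indexing instead of growing a data frame with rbind().
data(iris)
train <- iris[seq(1, nrow(iris), by = 2), ]

# Hypothetical helper (not from any package): number of distinct values
# in each column of a data frame. A count of 1 means the column has been
# collapsed into a single interval.
count_intervals <- function(disc) {
  vapply(disc, function(x) length(unique(x)), integer(1))
}

# With the discretization package installed (an assumption), one could run:
# res <- discretization::chi2(train, alp = 0.5, del = 0.05)
# count_intervals(res$Disc.data)
```

If the Sepal columns had been reduced to one interval as the paper describes, their counts would be 1.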
> >
> > I have posted a question here:
> > http://stats.stackexchange.com/questions/247499/why-does-not-r-chi2-algorithm-discretize-in-the-same-manner-as-in-the-paper-by-l?noredirect=1#comment470974_247499
> >
> >
> > Moreover, it's really hard to understand the cut points that the Chi2
> > algorithm implemented in R produces. For example:
> >
> > res <- chi2(iris, 0.5, 0.05)
> >
> > cut(iris$Sepal.Length, res$cutp, labels=FALSE) is different from
> > res$Disc.data$Sepal.Length
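
One possible source of the mismatch, as far as I can tell from the package's help page, is that `res$cutp` is a list with one element per attribute, so the whole list should not be passed to `cut()` as breaks; `str(res$cutp)` would confirm this. Interior cut points also need outer limits before they can serve as breaks. The sketch below uses made-up cut points (the values in `cuts` are hypothetical, not output of `chi2`) to show how interior cut points map values to interval codes in base R.

```r
# Hypothetical interior cut points for one attribute; with the package one
# would presumably take something like res$cutp[[1]] instead (an assumption).
cuts <- c(5.45, 6.15)
x <- c(4.3, 5.5, 5.8, 7.9)

# cut() needs outer limits in addition to the interior cut points.
codes_cut <- cut(x, breaks = c(-Inf, cuts, Inf), labels = FALSE)

# findInterval() counts how many cut points are <= each value,
# so adding 1 gives the same 1-based interval codes here.
codes_fi <- findInterval(x, cuts) + 1L

print(codes_cut)
print(codes_fi)
```

The two approaches differ only for values that fall exactly on a cut point: `cut()` closes intervals on the right by default, while `findInterval()` closes them on the left. Which convention the package uses is something to check in its source.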
> >
> > Help me understand, please
> >
> > Best regards
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
