[R] Chi2 algorithm - R

Luke Skywalker mattered91 at gmail.com
Wed Nov 23 22:26:45 CET 2016


What does it mean to "have a maintainer"? Is the maintainer a third party? Is
he an individual developer whose package you install at your own risk? Are
packages created by maintainers not tested?

Anyway, I have written to him and am waiting for a response.
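
For anyone following along, the maintainer recorded in a package's DESCRIPTION file can be read from within R, and printing a function shows its source. The lines below are only a minimal sketch: they use the base package "stats" as a stand-in because it ships with every R installation, and "discretization" would have to be substituted once that package is installed.

```r
# Look up the maintainer recorded in an installed package's DESCRIPTION file.
# "stats" is a stand-in here; replace it with "discretization" if installed.
pkg <- "stats"
who <- utils::maintainer(pkg)
print(who)

# Deparsing (or simply printing) a function shows its R source, which is
# one way to study the actual code behind a package function.
src <- deparse(stats::median)
cat(src, sep = "\n")
```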

Regards

On 23 Nov 2016 22:21, "peter dalgaard" <pdalgd at gmail.com> wrote:

> Notice that this relates to an R _package_, which has a maintainer. You
> cannot expect general R users or developers to know about the details of
> the package. It doesn't look like there is documentation beyond the help
> pages, so you may need to contact the maintainer or study the actual code.
>
> -pd
>
> > On 23 Nov 2016, at 17:08 , Luke Skywalker <mattered91 at gmail.com> wrote:
> >
> > Good evening,
> >
> > I'm encountering a discretization that differs from the one described in
> > Liu and Setiono's 1997 paper when using the Chi2 algorithm for feature
> > selection with discretization.
> >
> > As stated in the R documentation (discretization - R (from CRAN)
> > <https://cran.r-project.org/web/packages/discretization/discretization.pdf>),
> > the R package discretization offers the function chi2, which is based on
> > the following papers:
> >
> > Liu, H. and Setiono, R. (1995). Chi2: Feature selection and discretization
> > of numeric attributes, Tools with Artificial Intelligence, 388–391.
> >
> > Liu, H. and Setiono, R. (1997). Feature selection and discretization,
> > IEEE Transactions on Knowledge and Data Engineering, Vol. 9, no. 4, 642–645.
> >
> > I wrote the following R code, in which I set alpha and delta to the values
> > used in the papers above. The code prints out the discretized data frame.
> > I used the Iris data set, as in one of the examples in the two papers. The
> > first paper above states that alpha = 0.5 and delta = 5%, and that "the
> > originally odd numbered data are selected for training (75 patterns) and
> > rest for testing (75 patterns)". With this setup, the Sepal attributes
> > should be removed.
> >
> > library(discretization)
> > data(iris)
> > # Keep the odd-numbered rows (75 training patterns), as in the paper
> > df1 <- iris[seq(1, nrow(iris), by = 2), ]
> > # alp: significance level; del: inconsistency threshold
> > chi2(df1, alp = 0.5, del = 0.05)$Disc.data
> >
> > The point is that, looking at the data frame printed by the last
> > instruction, you can see that no attribute is removed. The discretized data
> > frame still has 4 discretized attributes: if I understood the papers above
> > correctly, Sepal Length and Sepal Width should both have been discretized
> > into just one interval by the Chi2 algorithm.
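
One way to check that expectation is to count the distinct intervals each attribute ends up with; an attribute left with a single interval is the kind the papers describe as removable. The sketch below is only illustrative: `count_intervals` is a hypothetical helper, the toy data frame stands in for `chi2(...)$Disc.data`, and the class label is assumed to be the last column.

```r
# Count how many distinct intervals each attribute of a discretized data
# frame uses. The class column (assumed to be the last one) is skipped.
count_intervals <- function(disc, class_col = ncol(disc)) {
  vapply(disc[-class_col], function(col) length(unique(col)), integer(1))
}

# Toy stand-in for chi2(...)$Disc.data; the values are made up.
toy <- data.frame(
  a = c(1, 1, 1, 1),
  b = c(1, 2, 2, 3),
  class = factor(c("x", "x", "y", "y"))
)

n_int <- count_intervals(toy)
print(n_int)

# Attributes merged into a single interval carry no information.
removable <- names(n_int)[n_int == 1]
print(removable)
```

With the package installed, the same helper could be applied to the `Disc.data` element returned by `chi2` for the odd-numbered rows.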
> >
> > I have posted a question here:
> > http://stats.stackexchange.com/questions/247499/why-does-not-r-chi2-algorithm-discretize-in-the-same-manner-as-in-the-paper-by-l?noredirect=1#comment470974_247499
> >
> >
> > Moreover, it's really hard to understand the cut points that the Chi2
> > algorithm implemented in R produces. For example:
> >
> > res <- chi2(iris, 0.5, 0.05)
> >
> > cut(iris$Sepal.Length, res$cutp, labels=FALSE) is different from
> > res$Disc.data$Sepal.Length
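
Part of such a mismatch may come from how `cut` treats its breaks rather than from the algorithm. If the cut points are interior boundaries only, `cut` returns NA for every value outside them, while an interval coding such as `findInterval` numbers all k+1 intervals. The sketch below uses made-up values and hypothetical cut points, not the package's output. It also assumes that `res$cutp` is a list with one element per attribute, in which case a single element such as `res$cutp[[1]]` would be needed, not the whole list; the help page should confirm this.

```r
# Made-up measurements and hypothetical interior cut points.
x    <- c(4.3, 5.0, 5.5, 6.1, 7.9)
cutp <- c(5.45, 6.15)

# cut() with interior breaks only: values outside (5.45, 6.15] become NA.
with_cut <- cut(x, cutp, labels = FALSE)
print(with_cut)

# findInterval() numbers every interval, including the two open-ended ones.
with_fi <- findInterval(x, cutp) + 1L
print(with_fi)

# Padding the breaks with -Inf and Inf makes cut() cover the whole range.
with_padded <- cut(x, c(-Inf, cutp, Inf), labels = FALSE)
print(with_padded)
```

The two codings agree here because no value falls exactly on a cut point; `cut` closes intervals on the right by default, whereas `findInterval` closes them on the left.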
> >
> > Please help me understand.
> >
> > Best regards
> >
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
