[R] RegExp question
Andrej
andrej.kastrin at gmail.com
Wed Jun 16 19:05:06 CEST 2010
Sorry, I apologize. Below is the minimal example.
library(RWeka)
model <- J48(as.factor(Species)~., data = iris)
> model
J48 pruned tree
------------------
Petal.Width <= 0.6: setosa (50.0)
Petal.Width > 0.6
| Petal.Width <= 1.7
| | Petal.Length <= 4.9: versicolor (48.0/1.0)
| | Petal.Length > 4.9
| | | Petal.Width <= 1.5: virginica (3.0)
| | | Petal.Width > 1.5: versicolor (3.0/1.0)
| Petal.Width > 1.7: virginica (46.0/1.0)
Number of Leaves : 5
Size of the tree : 9
So, the task is to extract the number of leaves.
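
One possible approach, sketched under assumptions: work on the printed text of the model rather than on the Java object itself. The vector `txt` below is a hand-typed stand-in for that printed text, and the names `txt`, `leaf_line` and `n_leaves` are made up for illustration. With a real model, I assume `capture.output(print(model))` would give an equivalent character vector, but I have not checked that here.

```r
# Stand-in for the printed model, one element per line.
# Assumption: with a fitted model this would come from
#   txt <- capture.output(print(model))
txt <- c(
  "J48 pruned tree",
  "------------------",
  "",
  "Petal.Width <= 0.6: setosa (50.0)",
  "Petal.Width > 0.6",
  "|   Petal.Width > 1.7: virginica (46.0/1.0)",
  "",
  "Number of Leaves  : \t5",
  "",
  "Size of the tree : \t9"
)

# Keep only the line that reports the number of leaves.
leaf_line <- grep("^Number of Leaves", txt, value = TRUE)

# Capture the digits after the colon; \\s* absorbs spaces and tabs.
n_leaves <- as.integer(
  sub("^Number of Leaves\\s*:\\s*(\\d+)\\s*$", "\\1", leaf_line, perl = TRUE)
)

print(n_leaves)
```

Anchoring the pattern on the "Number of Leaves" label, rather than on the first tab followed by digits, keeps it from picking up the tree size by mistake.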
Andrej
On Jun 16, 6:58 pm, David Winsemius <dwinsem... at comcast.net> wrote:
> Publicly produce something we can work with. I have no idea how to
> create an example that will match such an object.
>
> ?dput
> ?dump
>
> Read Posting Guide.
> --
> David.
>
> On Jun 16, 2010, at 12:54 PM, Andrej wrote:
>
>
>
> > Thanks David for your fast reply, but now I realized that "string" is
> > of type:
>
> >> class(string)
> > [1] "jobjRef"
> > attr(,"package")
> > [1] "rJava"
>
> > so I get an error when I try with gsub or sub:
>
> >> sub("^.+\\t(\\d+)\\n.+$", "\\1", string)
> > Error in as.character.default(x) :
> > no method for coercing this S4 class to a vector
>
> > I think that there should be a trivial solution, but... Any further
> > ideas?
>
> > Regards, Andrej
>
> > On Jun 16, 6:47 pm, David Winsemius <dwinsem... at comcast.net> wrote:
> >> On Jun 16, 2010, at 12:04 PM, Andrej wrote:
>
> >>> Dear all,
>
> >>> I'm trying to filter out the "number of leaves" (it should be 1 in
> >>> the
> >>> example below) from the following string:
>
> >>>> string
> >>> [1] "Java-Object{J48 pruned tree\n------------------\n: 0
> >>> (15.0/3.0)\n
> >>> \nNumber of Leaves : \t1\n\nSize of the tree : \t1\n}"
>
> >>> Any idea how to do that as simply as possible? Thanks in advance for
> >>> any advice.
>
> >> ?sub # or ?gsub if you need more than one pattern matched (they are
> >> on the same page).
>
> >> This should find the first occurrence of digits following a tab
> >> terminated by a line feed and then return only the digits:
>
> >> string <- "Java-Object{J48 pruned tree\n------------------\n: 0
> >> (15.0/3.0)\n \nNumber of Leaves : \t1\n\nSize of the tree : \t1\n}"
> >> sub("^.+\\t(\\d+)\\n.+$", "\\1", string)
> >> [1] "1"
>
> >> The parenthesized group in the search pattern is what "\\1" refers to
> >> in the replacement. Backslashes need to be doubled within patterns.
>
> >>> Regards, Andrej
>
> >> --
>
> >> David Winsemius, MD
> >> West Hartford, CT
>
> >> ______________________________________________
> >> R-h... at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT