[R] merging corpora and metadata
Joshua Wiley
jwiley.psych at gmail.com
Fri Nov 18 03:01:14 CET 2011
Hi Michael,
require(sos)
findFn("{meta}", sortby = "Function")
## see that only two functions have the exact name, 'meta'
## one is titled, "Meta Data Management" in the package 'tm'
## seems a pretty likely choice
Also, changing the behavior of c() is a truly terrible idea, but that does not mean it is not easy:
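## create an environment holding a replacement for c()
mvir <- new.env()
mvir$c <- function(x, ...) {cat("sure you can!\n"); mean(x, ...)}
## attach() puts it on the search path ahead of base, masking base::c
attach(mvir)
c(x = 1:10)   # now calls the masking version
## clean up so base::c is visible again
detach(mvir)
rm(mvir)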
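
A less invasive route than masking c() globally is an S3 method, which only takes over when the first argument has a particular class. The following is a minimal sketch under stated assumptions, not how 'tm' does it: the class "toycorpus", the attribute "docmeta", and the helper make_toy() are all hypothetical stand-ins for a corpus that keeps per-document metadata in a data frame.

```r
## Hypothetical toy corpus: a list of documents plus a per-document
## metadata data frame stored in the "docmeta" attribute.
make_toy <- function(docs, meta) {
  stopifnot(length(docs) == nrow(meta))
  structure(docs, docmeta = meta, class = c("toycorpus", "list"))
}

## S3 method for c(): used only when the first argument is a toycorpus,
## so base c() behaves as usual for everything else.
c.toycorpus <- function(...) {
  parts <- list(...)
  ## concatenate the documents as plain lists (attributes are dropped)
  docs <- do.call(c, lapply(parts, unclass))
  ## stack the metadata rows in the same order as the documents
  meta <- do.call(rbind, lapply(parts, attr, which = "docmeta"))
  rownames(meta) <- NULL
  make_toy(docs, meta)
}

a <- make_toy(list("doc one", "doc two"),
              data.frame(cid = c(1L, 1L), fname = c("a.scrb", "b.scrb"),
                         stringsAsFactors = FALSE))
b <- make_toy(list("doc three"),
              data.frame(cid = 2L, fname = "c.scrb",
                         stringsAsFactors = FALSE))

both <- c(a, b)
attr(both, "docmeta")   # one metadata row per document, columns kept
```

The sketch assumes every piece carries a metadata data frame with the same columns; rbind() will fail otherwise, which is arguably preferable to silently dropping the metadata.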
Cheers,
Josh
On Thu, Nov 17, 2011 at 5:25 PM, R. Michael Weylandt
<michael.weylandt at gmail.com> wrote:
> What package is all this from?
>
> You might check if there is a special rbind/cbind method provided. I don't think you can easily change the behavior of c().
>
> Michael
>
> On Nov 17, 2011, at 4:43 PM, Henri-Paul Indiogine <hindiogine at gmail.com> wrote:
>
>> Greetings!
>>
>> I lose all my metadata after concatenating corpora. This is an
>> example of what happens:
>>
>>> meta(corpus.1)
>> MetaID cid fid selfirst selend fname
>> 1 0 1 11 2169 2518 WCPD-2001-01-29-Pg217.scrb
>> 2 0 1 14 9189 9702 WCPD-2003-01-13-Pg39.scrb
>> 3 0 1 14 2109 2577 WCPD-2003-01-13-Pg39.scrb
>>
>> ....
>> ....
>>
>> 17 0 1 114 17863 18256 WCPD-2007-04-30-Pg515.scrb
>>
>>
>>> meta(corpus.2)
>> MetaID cid fid selfirst selend fname
>> 1 0 2 2 11016 11600 DCPD-200900595.scrb
>> 2 0 2 6 19510 20098 DCPD-201000636.scrb
>> 3 0 2 6 23935 24573 DCPD-201000636.scrb
>>
>> ....
>> ....
>>
>> 94 0 2 127 16225 17128 WCPD-2009-01-12-Pg22-3.scrb
>>
>>
>>> tot.corpus <- c(corpus.1, corpus.2)
>>> meta(tot.corpus)
>>
>> MetaID
>> 1 0
>> 2 0
>> 3 0
>>
>> ....
>> ....
>>
>> 111 0
>>>
>>
>> This is from the structure of corpus.1
>>
>> ..$ MetaData:List of 2
>> .. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
>> .. ..$ creator : chr "henk"
>> ..$ Children: NULL
>> ..- attr(*, "class")= chr "MetaDataNode"
>> - attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
>> ..$ MetaID : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
>> ..$ cid : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
>> ..$ fid : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
>> ..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
>> ..$ selend : num [1:17] 2518 9702 2577 8881 10102 ...
>> ..$ fname : chr [1:17] "WCPD-2001-01-29-Pg217.scrb"
>> "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb"
>> "WCPD-2004-05-17-Pg856.scrb" ...
>> - attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"
>>
>>
>> Any idea on what I could do to keep the metadata in the merged corpus?
>>
>> Thanks,
>> Henri-Paul
>>
>>
>> --
>> Henri-Paul Indiogine
>>
>> Curriculum & Instruction
>> Texas A&M University
>> TutorFind Learning Centre
>>
>> Email: hindiogine at gmail.com
>> Skype: hindiogine
>> Website: http://people.cehd.tamu.edu/~sindiogine
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/