[R] merging corpora and metadata

Henri-Paul Indiogine hindiogine at gmail.com
Thu Nov 17 22:43:30 CET 2011


Greetings!

I lose all my metadata after concatenating corpora. Here is an
example of what happens:

> meta(corpus.1)
   MetaID cid fid selfirst selend                         fname
1       0   1  11     2169   2518    WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702     WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577     WCPD-2003-01-13-Pg39.scrb

....
....

17      0   1 114    17863  18256    WCPD-2007-04-30-Pg515.scrb


> meta(corpus.2)
   MetaID cid fid selfirst selend                         fname
1       0   2   2    11016  11600           DCPD-200900595.scrb
2       0   2   6    19510  20098           DCPD-201000636.scrb
3       0   2   6    23935  24573           DCPD-201000636.scrb

....
....

94      0   2 127    16225  17128   WCPD-2009-01-12-Pg22-3.scrb


> tot.corpus <- c(corpus.1, corpus.2)
> meta(tot.corpus)

    MetaID
1        0
2        0
3        0

....
....

111      0
>

This is an excerpt from the structure (str output) of corpus.1:

..$ MetaData:List of 2
  .. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
  .. ..$ creator    : chr "henk"
  ..$ Children: NULL
  ..- attr(*, "class")= chr "MetaDataNode"
 - attr(*, "DMetaData")='data.frame':	17 obs. of  6 variables:
  ..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
  ..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
  ..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
  ..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
 - attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"


Any idea on what I could do to keep the metadata in the merged corpus?
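
One possible workaround, offered only as a sketch: since meta() returns
an ordinary data frame for each corpus, the two data frames could be
row-bound separately from the corpora themselves. The example below uses
base R only. The data frames meta.1 and meta.2 are hypothetical stand-ins
built from the first rows shown above (in real use they would be
meta(corpus.1) and meta(corpus.2)). How to attach the result back to
tot.corpus depends on the tm version and is deliberately left out.

```r
# Hypothetical stand-ins for meta(corpus.1) and meta(corpus.2),
# copied from the first three rows of each listing above.
meta.1 <- data.frame(
  MetaID   = c(0, 0, 0),
  cid      = c(1L, 1L, 1L),
  fid      = c(11L, 14L, 14L),
  selfirst = c(2169, 9189, 2109),
  selend   = c(2518, 9702, 2577),
  fname    = c("WCPD-2001-01-29-Pg217.scrb",
               "WCPD-2003-01-13-Pg39.scrb",
               "WCPD-2003-01-13-Pg39.scrb"),
  stringsAsFactors = FALSE
)

meta.2 <- data.frame(
  MetaID   = c(0, 0, 0),
  cid      = c(2L, 2L, 2L),
  fid      = c(2L, 6L, 6L),
  selfirst = c(11016, 19510, 23935),
  selend   = c(11600, 20098, 24573),
  fname    = c("DCPD-200900595.scrb",
               "DCPD-201000636.scrb",
               "DCPD-201000636.scrb"),
  stringsAsFactors = FALSE
)

# Stack the metadata in the same order the corpora are concatenated,
# so that row i describes document i of c(corpus.1, corpus.2).
tot.meta <- rbind(meta.1, meta.2)
rownames(tot.meta) <- NULL  # renumber rows consecutively

print(tot.meta)
```

This only works as long as both data frames have the same columns, which
is the case in the listings above. The cid column still tells which
corpus each row came from.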

Thanks,
Henri-Paul


-- 
Henri-Paul Indiogine

Curriculum & Instruction
Texas A&M University
TutorFind Learning Centre

Email: hindiogine at gmail.com
Skype: hindiogine
Website: http://people.cehd.tamu.edu/~sindiogine


