[R] How to subset an 'ff' object?

arun smartpink111 at yahoo.com
Sat Dec 28 19:33:58 CET 2013


Hi Christofer,
You can check ?subset.ff in the ffbase package. With library(ffbase) loaded, and ffd built as in the ?ffdf example quoted below:

subset(ffd,m1>1)
ffdf (all open) dim=c(2,5), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
   PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix
m1          ffm      integer       integer FALSE           FALSE
m2          ffm      integer       integer FALSE           FALSE
m3          ffm      integer       integer FALSE           FALSE
m4          ffm      integer       integer FALSE           FALSE
v             v      integer       integer FALSE           FALSE
   PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol
m1             TRUE                 1                1               1
m2             TRUE                 1                2               2
m3             TRUE                 1                3               3
m4             TRUE                 1                4               4
v             FALSE                 2                1               1
   PhysicalIsOpen
m1           TRUE
m2           TRUE
m3           TRUE
m4           TRUE
v            TRUE
ffdf data
  m1 m2 m3 m4  v
1  2  5  8 11  2
2  3  6  9 12  3
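
For the Date column in the original question, the same approach should carry over. The following is only a minimal sketch, assuming the ff and ffbase packages are installed; it uses a small in-memory stand-in for 'Dat' (the real object came from read.csv.ffdf), and the condition refers to the bare column name V1 rather than Dat$V1:

```r
library(ff)
library(ffbase)  # provides the subset() method for ffdf objects

# Stand-in for the asker's 'Dat' (assumption: same two columns,
# a Date and a factor, with the four rows shown in the question).
Dat <- as.ffdf(data.frame(
  V1 = as.Date(c("2013-12-28", "2013-12-28", "2013-12-27", "2013-12-27")),
  V2 = factor(c("a", "b", "c", "c"))
))

# The condition is evaluated against the columns of Dat, so V1 is
# written without the 'Dat$' prefix.
res <- subset(Dat, V1 == as.Date("2013-12-27"))

# 'res' is again an ffdf; res[, ] reads it into RAM as a data.frame.
res_df <- res[, ]
```

The result stays on disk as an ffdf, so only pull it into memory with res[, ] once the subset is known to be small enough.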
A.K.

On Saturday, December 28, 2013 1:29 PM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:

Hi Arun,

I will look into why dput is giving an error. In the meantime, I am attaching the CSV file to this mail, though I am not sure whether R-help will accept it.

I tried converting it to a data.frame. However, 'Dat' is huge (which is why I loaded it via ff), so converting the entire data set to a data.frame fails with a memory allocation error.
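
Converting everything at once is not the only option. As a rough sketch using only the ff package, the rows can be filtered one block at a time, so that only one chunk plus the matching rows are ever held in RAM. The small 'Dat' below is a stand-in for the real object, and the names target, pieces and res are purely illustrative:

```r
library(ff)

# Stand-in for the large 'Dat' object (assumption: same column layout
# as in the question; the real data came from read.csv.ffdf).
Dat <- as.ffdf(data.frame(
  V1 = as.Date(c("2013-12-28", "2013-12-28", "2013-12-27", "2013-12-27")),
  V2 = factor(c("a", "b", "c", "c"))
))

target <- as.Date("2013-12-27")

# chunk() splits the row range into blocks sized to fit in RAM. Each
# block is read as an ordinary data.frame and filtered with base R;
# which() drops any NA comparisons instead of returning NA rows.
pieces <- lapply(chunk(Dat), function(i) {
  block <- Dat[i, , drop = FALSE]
  block[which(block$V1 == target), , drop = FALSE]
})

# Combine the matching rows from all chunks into one data.frame.
res <- do.call(rbind, pieces)
```

This assumes the matching rows themselves fit in memory; if they do not, the per-chunk results would need to be written back to disk instead of combined with rbind.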

Thanks and regards,



On Sun, Dec 29, 2013 at 12:03 AM, arun <smartpink111 at yahoo.com> wrote:

>Hi,
>
>The dput() output shows an error message.
>Is it not possible to convert it to a data.frame and subset?
>
>Using the example from ?ffdf
>library(ff)
>m <- matrix(1:12, 3, 4, dimnames=list(c("r1","r2","r3"), c("m1","m2","m3","m4")))
>v <- 1:3
>ffm <- as.ff(m)
>ffv <- as.ff(v)
>ffd <- ffdf(ffm, v=ffv, row.names=row.names(ffm))
>d1 <- data.frame(ffd)
>
>
>subset(d1,m1>1)
>   m1 m2 m3 m4 v
>r2  2  5  8 11 2
>r3  3  6  9 12 3
>
>
>A.K.
>
>
>
>On Saturday, December 28, 2013 12:29 PM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
>Hi again,
>
>I have loaded a huge CSV file into R using the 'ff' package, but I could not
>work out how to subset the loaded object. Below is my attempt:
>
>> suppressMessages(library(ff))
>>
>> Dat <- read.csv.ffdf(file = "f:/Book1.csv", header = FALSE, colClasses = c('Date', 'factor'))
>> Dat
>ffdf (all open) dim=c(4,2), dimorder=c(1,2) row.names=NULL
>ffdf virtual mapping
>   PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol PhysicalIsOpen
>V1           V1       double        double FALSE           FALSE            FALSE                 1                1               1           TRUE
>V2           V2      integer       integer FALSE           FALSE            FALSE                 2                1               1           TRUE
>ffdf data
>          V1         V2
>1 2013-12-28 a
>2 2013-12-28 b
>3 2013-12-27 c
>4 2013-12-27 c
>>
>> subset(Dat, Dat$V1 == as.Date('2013-12-27'))
>ffdf (all open) dim=c(4,0), dimorder=c(1,2) row.names=NULL
>ffdf virtual mapping
> [1] PhysicalName      VirtualVmode      PhysicalVmode     AsIs              VirtualIsMatrix   PhysicalIsMatrix  PhysicalElementNo PhysicalFirstCol  PhysicalLastCol
>[10] PhysicalIsOpen
><0 rows> (or 0-length row.names)
>ffdf data
>[1] "[empty matrix]"
>
>
>
>My resulting object is showing '0' rows!
>
>
>The 'Dat' object looks like this:
>
>> dput(Dat)
>structure(list(virtual = structure(list(VirtualVmode = c("double",
>"integer"), AsIs = c(FALSE, FALSE), VirtualIsMatrix = c(FALSE,
>FALSE), PhysicalIsMatrix = c(FALSE, FALSE), PhysicalElementNo = 1:2,
>    PhysicalFirstCol = c(1L, 1L), PhysicalLastCol = c(1L, 1L)), .Names = c("VirtualVmode",
>"AsIs", "VirtualIsMatrix", "PhysicalIsMatrix", "PhysicalElementNo",
>"PhysicalFirstCol", "PhysicalLastCol"), row.names = c("V1", "V2"
>), class = "data.frame", Dim = c(4L, 2L), Dimorder = 1:2), physical = structure(list(
>    V1 = structure(list(), physical = <pointer: 0x0298f498>, virtual = structure(list(), Length = 4L, Symmetric = FALSE, ramclass = "Date"), class = c("ff_vector",
>    "ff")), V2 = structure(list(), physical = <pointer: 0x0298f4c8>, virtual = structure(list(), Length = 4L, Symmetric = FALSE, Levels = c("a",
>    "b", "c"), ramclass = "factor"), class = c("ff_vector", "ff"
>    ))), .Names = c("V1", "V2")), row.names = NULL), .Names = c("virtual",
>"physical", "row.names"), class = "ffdf")
>
>
>Could the experts here guide me on how to subset it?
>
>Thanks for your time.
>


