[R] How to subset an 'ff' object?
arun
smartpink111 at yahoo.com
Sat Dec 28 19:40:43 CET 2013
Hi,
I tried your example dataset:
Dat <- read.csv.ffdf(file="Book1.csv",header=FALSE,colClasses=c('Date','factor'),sep="")
subset(Dat,V1=='2013-12-27')
ffdf (all open) dim=c(2,2), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
PhysicalName VirtualVmode PhysicalVmode AsIs VirtualIsMatrix
V1 V1 double double FALSE FALSE
V2 V2 integer integer FALSE FALSE
PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol
V1 FALSE 1 1 1
V2 FALSE 2 1 1
PhysicalIsOpen
V1 TRUE
V2 TRUE
ffdf data
V1 V2
1 2013-12-27 c
2 2013-12-27 c
A.K.
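
For readers without Book1.csv, the call above can be reproduced with a small sketch. It assumes the ff and ffbase packages are installed; ffbase is the package that supplies the subset() method for ffdf objects. The temporary file is a hypothetical four-row stand-in for Book1.csv, written comma-separated (so the default sep is used rather than sep=""), and the names csv_path and res are illustrative only.

```r
library(ff)
library(ffbase)  # supplies subset() for ff/ffdf objects

# Hypothetical stand-in for Book1.csv: four rows, no header, comma-separated
csv_path <- tempfile(fileext = ".csv")
writeLines(c("2013-12-28,a",
             "2013-12-28,b",
             "2013-12-27,c",
             "2013-12-27,c"), csv_path)

Dat <- read.csv.ffdf(file = csv_path, header = FALSE,
                     colClasses = c("Date", "factor"))

# Refer to the column by its bare name inside subset(), as in the reply above
res <- subset(Dat, V1 == "2013-12-27")
nrow(res)
```

The result is still an ffdf, so the filtered rows stay on disk until they are pulled into RAM with res[, ].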
On , arun <smartpink111 at yahoo.com> wrote:
Hi Christofer,
You can check ?subset.ff from library(ffbase)
subset(ffd,m1>1)
ffdf (all open) dim=c(2,5), dimorder=c(1,2) row.names=NULL
ffdf virtual mapping
PhysicalName VirtualVmode PhysicalVmode AsIs VirtualIsMatrix
m1 ffm integer integer FALSE FALSE
m2 ffm integer integer FALSE FALSE
m3 ffm integer integer FALSE FALSE
m4 ffm integer integer FALSE FALSE
v v integer integer FALSE FALSE
PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol
m1 TRUE 1 1 1
m2 TRUE 1 2 2
m3 TRUE 1 3 3
m4 TRUE 1 4 4
v FALSE 2 1 1
PhysicalIsOpen
m1 TRUE
m2 TRUE
m3 TRUE
m4 TRUE
v TRUE
ffdf data
m1 m2 m3 m4 v
1 2 5 8 11 2
2 3 6 9 12 3
A.K.
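
The ffd object used above is only defined in the earlier quoted message, so here is a self-contained sketch that builds it from the ?ffdf example and applies the same subset() call. It assumes ff and ffbase are installed; the name res is illustrative.

```r
library(ff)
library(ffbase)  # supplies subset() for ff/ffdf objects

# Construction copied from the ?ffdf example quoted later in this thread
m <- matrix(1:12, 3, 4,
            dimnames = list(c("r1", "r2", "r3"), c("m1", "m2", "m3", "m4")))
v <- 1:3
ffm <- as.ff(m)
ffv <- as.ff(v)
ffd <- ffdf(ffm, v = ffv, row.names = row.names(ffm))

# Keep the rows where m1 exceeds 1; the result is again an ffdf
res <- subset(ffd, m1 > 1)
dim(res)
```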
On Saturday, December 28, 2013 1:29 PM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
Hi Arun,
I will look into why dput() is giving an error. In the meantime I am attaching the CSV file to this mail, though I am not sure whether R-help will accept it.
I tried converting it to a data.frame. However, because 'Dat' is huge (which is why I loaded it via ff), converting the entire dataset to a data.frame causes a memory allocation problem.
Thanks and regards,
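
When the whole object does not fit in RAM, one possible workaround is to filter it chunk by chunk and keep only the matching rows. The sketch below is an illustration, not something proposed in the thread: it uses only ff, relies on chunk() to split the ffdf into row ranges, and uses the small ?ffdf example as a stand-in for the large 'Dat'. The names pieces, block and hits are hypothetical.

```r
library(ff)

# Small stand-in for a large ffdf, following the ?ffdf example
m <- matrix(1:12, 3, 4,
            dimnames = list(c("r1", "r2", "r3"), c("m1", "m2", "m3", "m4")))
v <- 1:3
ffm <- as.ff(m)
ffv <- as.ff(v)
ffd <- ffdf(ffm, v = ffv)

pieces <- list()
for (i in chunk(ffd)) {
  block <- ffd[i, ]  # one row range, read into RAM as a data.frame
  pieces[[length(pieces) + 1]] <- block[block$m1 > 1, , drop = FALSE]
}

# Only the matching rows are ever combined in memory
hits <- do.call(rbind, pieces)
nrow(hits)
```

This keeps peak memory close to the size of one chunk plus the accumulated matches, which helps only if the matches themselves fit in RAM.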
On Sun, Dec 29, 2013 at 12:03 AM, arun <smartpink111 at yahoo.com> wrote:
Hi,
>
>The dput() output is showing an error message.
>Is it not possible to convert it to data.frame and subset?
>
>Using the example from ?ffdf
> m <- matrix(1:12, 3, 4, dimnames=list(c("r1","r2","r3"), c("m1","m2","m3","m4")))
> v <- 1:3
> ffm <- as.ff(m)
> ffv <- as.ff(v)
> ffd <- ffdf(ffm, v=ffv, row.names=row.names(ffm))
>d1 <- data.frame(ffd)
>
>
>subset(d1,m1>1)
> m1 m2 m3 m4 v
>r2 2 5 8 11 2
>r3 3 6 9 12 3
>
>
>A.K.
>
>
>
>On Saturday, December 28, 2013 12:29 PM, Christofer Bogaso <bogaso.christofer at gmail.com> wrote:
>Hi again,
>
>I have loaded a huge CSV file into R using the 'ff' package, but I could not
>work out how to subset the loaded object. Below is my attempt:
>
>> suppressMessages(library(ff))
>>
>> Dat <- read.csv.ffdf(file = "f:/Book1.csv", header = FALSE, colClasses = c('Date', 'factor'))
>> Dat
>ffdf (all open) dim=c(4,2), dimorder=c(1,2) row.names=NULL
>ffdf virtual mapping
>   PhysicalName VirtualVmode PhysicalVmode  AsIs VirtualIsMatrix
>V1           V1       double        double FALSE           FALSE
>V2           V2      integer       integer FALSE           FALSE
>   PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol
>V1            FALSE                 1                1               1
>V2            FALSE                 2                1               1
>   PhysicalIsOpen
>V1           TRUE
>V2           TRUE
>ffdf data
> V1 V2
>1 2013-12-28 a
>2 2013-12-28 b
>3 2013-12-27 c
>4 2013-12-27 c
>>
>> subset(Dat, Dat$V1 == as.Date('2013-12-27'))
>ffdf (all open) dim=c(4,0), dimorder=c(1,2) row.names=NULL
>ffdf virtual mapping
>[1] PhysicalName VirtualVmode PhysicalVmode AsIs VirtualIsMatrix PhysicalIsMatrix PhysicalElementNo PhysicalFirstCol PhysicalLastCol
>[10] PhysicalIsOpen
><0 rows> (or 0-length row.names)
>ffdf data
>[1] "[empty matrix]"
>
>
>
>My resulting object is showing '0' rows!
>
>
>The 'Dat' object looks like below:
>
>> dput(Dat)
>structure(list(virtual = structure(list(VirtualVmode = c("double", "integer"),
>    AsIs = c(FALSE, FALSE), VirtualIsMatrix = c(FALSE, FALSE),
>    PhysicalIsMatrix = c(FALSE, FALSE), PhysicalElementNo = 1:2,
>    PhysicalFirstCol = c(1L, 1L), PhysicalLastCol = c(1L, 1L)),
>    .Names = c("VirtualVmode", "AsIs", "VirtualIsMatrix", "PhysicalIsMatrix",
>    "PhysicalElementNo", "PhysicalFirstCol", "PhysicalLastCol"),
>    row.names = c("V1", "V2"), class = "data.frame", Dim = c(4L, 2L),
>    Dimorder = 1:2), physical = structure(list(
>    V1 = structure(list(), physical = <pointer: 0x0298f498>,
>        virtual = structure(list(), Length = 4L, Symmetric = FALSE,
>        ramclass = "Date"), class = c("ff_vector", "ff")),
>    V2 = structure(list(), physical = <pointer: 0x0298f4c8>,
>        virtual = structure(list(), Length = 4L, Symmetric = FALSE,
>        Levels = c("a", "b", "c"), ramclass = "factor"),
>        class = c("ff_vector", "ff"))), .Names = c("V1", "V2")),
>    row.names = NULL), .Names = c("virtual", "physical", "row.names"),
>    class = "ffdf")
>
>
>Can experts here guide me how to subset that?
>
>Thanks for your time.
>