[R] subset dataframe

arun smartpink111 at yahoo.com
Tue Apr 23 00:11:59 CEST 2013


Hi,
Just noticed that the factor levels carry a trailing space, e.g. "Angola ":
$ X    : Factor w/ 39 levels "Angola ",

If that is the case:
set.seed(15)
agoa <- data.frame(X.1=rep(c("AGOA ","GSP","CST"),3), X1996=sample(1:20000,9,replace=TRUE), X2000=sample(40:30000,9,replace=TRUE))
subset(agoa, X.1=="AGOA")
#[1] X.1   X1996 X2000
#<0 rows> (or 0-length row.names)
agoaSub <- subset(agoa, X.1=="AGOA ")
str(agoaSub)
#'data.frame':    3 obs. of  3 variables:
# $ X.1  : Factor w/ 3 levels "AGOA ","CST",..: 1 1 1
# $ X1996: int  12043 13019 16304
# $ X2000: int  24950 15292 25260
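
A quick way to confirm that a trailing space is the culprit is to print the levels quoted, or to count their characters. This is only a minimal sketch on a toy factor like the one above; the names `x` and `lev` are made up for illustration, and `trimws()` assumes R >= 3.2.0.

```r
# Toy factor with a trailing space in one level, as in the example above
x <- factor(rep(c("AGOA ", "GSP", "CST"), 3))

lev <- levels(x)
dput(lev)                        # prints the levels quoted, so stray spaces show up
n_chars  <- nchar(lev)           # "AGOA " has 5 characters, not 4
has_pad  <- lev != trimws(lev)   # TRUE for levels with leading/trailing whitespace
lev[has_pad]
```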

#To drop the levels:
agoaSub[]<- lapply(agoaSub,function(x) x[drop=TRUE])
 str(agoaSub)
#'data.frame':    3 obs. of  3 variables:
# $ X.1  : Factor w/ 1 level "AGOA ": 1 1 1
# $ X1996: int  12043 13019 16304
# $ X2000: int  24950 15292 25260
A.K.



----- Original Message -----
From: arun <smartpink111 at yahoo.com>
To: Mihai Nica <mihainica at yahoo.com>
Cc: R help <r-help at r-project.org>
Sent: Monday, April 22, 2013 5:57 PM
Subject: Re: [R] subset dataframe

Hi,
Could you provide an example dataset?

set.seed(15)
agoa<- data.frame(X.1=rep(c("AGOA","GSP","CST"),3),X1996= sample(1:20000,9,replace=TRUE),X2000=sample(40:30000,9,replace=TRUE))
 str(agoa)
#'data.frame':    9 obs. of  3 variables:
# $ X.1  : Factor w/ 3 levels "AGOA","CST","GSP": 1 3 2 1 3 2 1 3 2
# $ X1996: int  12043 3901 19330 13019 7342 19778 16304 5080 13745
# $ X2000: int  24950 3175 19399 15292 21211 25875 25260 13445 28942


 subset(agoa,X.1=="AGOA")
 #  X.1 X1996 X2000
#1 AGOA 12043 24950
#4 AGOA 13019 15292
#7 AGOA 16304 25260
agoa[agoa$X.1=="AGOA",]
 #  X.1 X1996 X2000
#1 AGOA 12043 24950
#4 AGOA 13019 15292
#7 AGOA 16304 25260
A.K.


________________________________
From: Mihai Nica <mihainica at yahoo.com>
To: "r-help at r-project.org" <r-help at r-project.org> 
Sent: Monday, April 22, 2013 5:14 PM
Subject: [R] subset dataframe


I can't understand what is happening. This is the code and results:

> agoa <- read.table(file = "C:/Users/HTPC/Documents/_Documents/Research/WithDidia/AGOAUSImports.txt", header = T, sep = "\t", dec = ".", na.strings = "NA", stringsAsFactors = T)#
> str(agoa); names(agoa)

'data.frame':    109 obs. of  19 variables:
 $ X    : Factor w/ 39 levels "Angola ","Benin ",..: 1 1 1 2 2 3 3 3 4 4 ...
 $ X.1  : Factor w/ 3 levels "AGOA ","GSP ",..: 3 1 2 3 2 3 1 2 3 1 ...
 $ X1996: int  2687143 0 2 18084 70 23356 0 3624 3835 0 ...
 $ X1997: int  2427824 0 356492 4303 3437 18758 0 5882 930 0 ...
 $ X1998: int  1205545 0 1045996 1335 2269 14010 0 5660 503 0 ...
 $ X1999: int  1596052 0 828761 6042 11788 12071 0 4824 2695 0 ...
 $ X2000: int  2178246 0 1378777 1026 1414 38024 0 2922 502 0 ...
 $ X2001: int  464083 0 2635482 1108 178 19429 0 1221 4919 0 ...
 $ X2002: int  386118 0 2728387 680 0 25014 3707 871 2862 0 ...
 $ X2003: int  441647 0 3822701 602 0 7293 6343 0 788 0 ...
 $ X2004: int  471009 1349411 2700750 1310 215 52840 20119 7 474 0 ...
 $ X2005: int  1081143 3662774 3740324 509 4 148102 30044 7 1962 0 ...
 $ X2006: int  1670746 4127605 5920870 531 24 224382 27688 27 954 6 ...
 $ X2007: int  2346392 3898345 6262784 5076 0 155818 31331 304 1415 0 ...
 $ X2008: int  8151345 8119377 2639949 31010 0 202938 15803 104 495 0 ...
 $ X2009: int  5257573 3018965 1062246 425 16 119540 12362 8 2096 0 ...
 $ X2010: int  6542843 4741574 662450 271 4 158147 11559 8 2368 2 ...
 $ X2011: int  8423316 5174087 70 1957 14 276223 15479 1585 3599 2 ...
 $ X2012: int  8017601 1761068 45196 2625 49 204337 10427 1757 2233 5 ...

 [1] "X"     "X.1"   "X1996" "X1997" "X1998" "X1999" "X2000" "X2001" "X2002"
[10] "X2003" "X2004" "X2005" "X2006" "X2007" "X2008" "X2009" "X2010" "X2011"
[19] "X2012"

> agoa.AGOA <- subset(agoa, agoa$X.1 == "AGOA")
> str(agoa.AGOA)

'data.frame':    0 obs. of  19 variables:
 $ X    : Factor w/ 39 levels "Angola ","Benin ",..: 
 $ X.1  : Factor w/ 3 levels "AGOA ","GSP ",..: 
 $ X1996: int 
 $ X1997: int 
 $ X1998: int 
 $ X1999: int 
 $ X2000: int 
 $ X2001: int 
 $ X2002: int 
 $ X2003: int 
 $ X2004: int 
 $ X2005: int 
 $ X2006: int 
 $ X2007: int 
 $ X2008: int 
 $ X2009: int 
 $ X2010: int 
 $ X2011: int 
 $ X2012: int 
> 
> 
I did try:

agoa.AGOA = agoa[X.1 == AGOA,]

with similar results.  All the help I looked over gives these as solutions...
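
Since the levels in the `str()` output above end in a space ("AGOA ", "GSP "), one option is to strip the padding when the file is read. This is a hedged sketch: the inline `txt` string is a made-up stand-in for the real tab-separated file, and it relies on the `strip.white` argument of `read.table()`, which only takes effect when `sep` is specified.

```r
# Hypothetical tab-separated input with trailing spaces in the text fields
txt <- "X\tX.1\tX1996\nAngola \tAGOA \t0\nAngola \tGSP \t2\nBenin \tGSP \t70\n"

dat <- read.table(text = txt, header = TRUE, sep = "\t",
                  strip.white = TRUE, stringsAsFactors = TRUE)

levels(dat$X.1)                       # "AGOA" "GSP"
dat.AGOA <- subset(dat, X.1 == "AGOA")
nrow(dat.AGOA)                        # 1
```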
 
mike

