[R] How to subset data, by sorting names alphabetically.
Leandro Roser
learoser at gmail.com
Fri Feb 13 00:47:44 CET 2015
Hi, a solution could be:
# example matrix a:
# (note: assigning strings to column 1 converts the whole matrix to character)
a <- matrix(1:100, 10, 10)
a[, 1] <- sample(c("aa", "bb", "ab"), 10, replace = TRUE)
a <- a[order(a[, 1]), ] # order the rows of the matrix by column 1
#subsetting a:
lev <- levels(as.factor(a[, 1]))
subs <- list()
for(i in seq_along(lev)) {
  # drop = FALSE keeps a group with a single row as a matrix
  subs[[i]] <- a[a[, 1] %in% lev[i], , drop = FALSE]
}
#result:
subs
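The loop above can also be written without an explicit index. A minimal sketch, assuming the same kind of character matrix as in the example (the names `idx` and `subs2` are introduced here only for illustration): split() groups the row numbers by the value in column 1, and lapply() pulls out each group of rows.

```r
# Same kind of example matrix as above; set.seed() is only for reproducibility
set.seed(1)
a <- matrix(1:100, 10, 10)
a[, 1] <- sample(c("aa", "bb", "ab"), 10, replace = TRUE)

# Group the row numbers by the value in column 1
idx <- split(seq_len(nrow(a)), a[, 1])

# Extract each group of rows; drop = FALSE keeps one-row groups as matrices
subs2 <- lapply(idx, function(rows) a[rows, , drop = FALSE])
subs2
```

The result is a list named by the values found in column 1, so a group can be looked up by name, e.g. subs2[["aa"]] if "aa" was drawn.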
## an alternative, using the values in column 1 as the names of the list elements:
# example matrix a:
a <- matrix(1:100, 10, 10)
a[, 1] <- sample(c("aa", "bb", "ab"), 10, replace = TRUE)
a <- a[order(a[, 1]), ] # order the rows of the matrix by column 1
lev <- levels(as.factor(a[, 1]))
subs <- list()
for(i in seq_along(lev)) {
  # -1 drops the name column; drop = FALSE keeps single-row groups as matrices
  subs[[i]] <- a[a[, 1] %in% lev[i], -1, drop = FALSE]
}
names(subs) <- lev
#result:
subs
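For a data frame like the one described in the original question, the same idea might look like the sketch below. The data frame name `ssdf` and its `Name` column follow the naming used in the quoted reply further down; the values, the `var1` column, and the name `by_name` are made up for illustration. order() sorts the rows alphabetically by name, and split() (the function suggested in the quoted reply) returns a named list with one data frame per name.

```r
# Hypothetical stand-in for the real data (names copied from the question's example)
ssdf <- data.frame(
  Name = c("aa", "ab", "bd", "ac", "ad", "af", "ba", "bd", "aa", "av"),
  var1 = 1:10,
  stringsAsFactors = FALSE
)

# Sort the rows alphabetically by Name
ssdf <- ssdf[order(ssdf$Name), ]

# One data frame per distinct Name, collected in a named list
by_name <- split(ssdf, ssdf$Name)

names(by_name)   # the distinct names
by_name[["aa"]]  # all rows whose Name is "aa"
```

Keeping the pieces in one list avoids creating many separate objects in the workspace, and the list can be processed further with lapply().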
2015-02-12 19:20 GMT-03:00 Greg Snow <538280 at gmail.com>:
> The split function does essentially this, but puts the results into a list
> rather than using the dangerous and messy assign function. The overall
> syntax is simpler as well.
>
> On Thu, Feb 12, 2015 at 3:14 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
>
>> Hi Samarvir,
>> Assuming that you want to generate a separate data frame for each
>> value of "Name",
>>
>> # name of initial data frame is ssdf
>> for(nameval in unique(ssdf$Name)) assign(nameval,ssdf[ssdf$Name==nameval,])
>>
>> This will produce as many data frames as there are unique values of
>> ssdf$Name, each named by the values it contains.
>>
>> Jim
>>
>>
>> On Thu, Feb 12, 2015 at 3:57 PM, samarvir singh <samarvir1996 at gmail.com>
>> wrote:
>> > hello,
>> >
>> > I am cleaning some large data with 4 million observations and 7 variables.
>> > Of the 7 variables, 1 is a name/string.
>> >
>> > I want to subset the data so that rows which have the same name are together
>> >
>> > Example-
>> >
>> > Name var1 var2 var3 var4 var5 var6
>> > aa - - - - - -
>> > ab
>> > bd
>> > ac
>> > ad
>> > af
>> > ba
>> > bd
>> > aa
>> > av
>> >
>> > I want to sort the data into something like this
>> >
>> > aa
>> > aa
>> > all aa in a same subset
>> >
>> > and all ab in same subset
>> >
>> > every column with same name in a subset
>> >
>> >
>> >
>> > thanks in advance.
>> > I am new to R community.
>> > appreciate your help
>> > - Samarvir
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538280 at gmail.com
>
--
Lic. Leandro Gabriel Roser
Laboratorio de Genética
Dto. de Ecología, Genética y Evolución,
F.C.E.N., U.B.A.,
Ciudad Universitaria, PB II, 4to piso,
Nuñez, Cdad. Autónoma de Buenos Aires,
Argentina.
tel ++54 +11 4576-3300 (ext 219)
fax ++54 +11 4576-3384