[R] regex expression to select row or column

jim holtman jholtman at gmail.com
Sat Jul 25 15:27:11 CEST 2009


Does this do what you want?

> x
     ID S1-a S1-b S2-a S2-b
1 001-A    1    2    3    4
2 001-B    5    6    7    8
3 002-A    9   10   11   12
4 002-B   13   14   15   16
> # split the column names
> t.n <- strsplit(names(x)[-1], '-')
> t.n <- sort(sapply(t.n, function(a) paste(a[2],a[1])))
> # create the new ordering of column
> t.new <- sapply(strsplit(t.n, ' '), function(a) paste(a[2], a[1], sep='-'))
> # now split by ID after the '-'
> x.s <- split(x, sub('.*-(.*)', '\\1', x$ID))
> # reorder the columns of the data
> x.s[] <- lapply(x.s, function(a) a[,c("ID", t.new)])
>
>
> x.s
$A
     ID S1-a S2-a S1-b S2-b
1 001-A    1    3    2    4
3 002-A    9   11   10   12

$B
     ID S1-a S2-a S1-b S2-b
2 001-B    5    7    6    8
4 002-B   13   15   14   16
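
Since the subject line asks about selecting rows and columns by regex, here
is a minimal sketch of that route as an alternative to the reordering above.
It assumes the data frame was read with check.names=FALSE so the "S1-a" style
names survive; the helper name 'pick' and the list name 'pieces' are made up
for illustration and are not part of any package.

```r
# Rebuild the 4x5 example; check.names = FALSE keeps the "S1-a" style names
x <- data.frame(
  ID     = c("001-A", "001-B", "002-A", "002-B"),
  `S1-a` = c(1L, 5L, 9L, 13L),
  `S1-b` = c(2L, 6L, 10L, 14L),
  `S2-a` = c(3L, 7L, 11L, 15L),
  `S2-b` = c(4L, 8L, 12L, 16L),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose ID ends in "-<row.key>" and
# columns whose name ends in "-<col.key>"
pick <- function(df, row.key, col.key) {
  rows <- grepl(paste0("-", row.key, "$"), df$ID)
  cols <- grep(paste0("-", col.key, "$"), names(df), value = TRUE)
  df[rows, cols, drop = FALSE]
}

# Suffixes after the '-' in the IDs and in the column names
row.keys <- unique(sub(".*-", "", x$ID))
col.keys <- unique(sub(".*-", "", names(x)[-1]))

# One data frame per row-suffix/column-suffix pair
pieces <- list()
for (r in row.keys) {
  for (k in col.keys) {
    pieces[[paste(r, k, sep = ".")]] <- pick(x, r, k)
  }
}

pieces$A.a
```

With the 2x2 example this gives four small data frames (A.a, A.b, B.a, B.b);
the original row names are kept, so reset them with rownames(df) <- NULL if
you want them renumbered from 1.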


On Sat, Jul 25, 2009 at 5:14 AM, Junqian Gordon Xu<xjqian at gmail.com> wrote:
> You're right. Using read.csv, the first column is a factor, not string (or
> should I use str?). The following is a 2x2 version of the data frame after
> read.csv
>
>      ID    S1-a   S1-b  S2-a  S2-b
> 1  001-A    1      2     3     4
> 2  001-B    5      6     7     8
> 3  002-A    9     10    11    12
> 4  002-B   13     14    15    16
>
> the resulting data frame I want is (whether or not to retain the factor ID
> info in the resulting data frame is not important)
>
>   S1-a S2-a       S1-b S2-b
> 1     1    3     1    2    4
> 2     9   11     2   10   12
>
>   S1-a S2-a       S1-b S2-b
> 1     5    7     1    6    8
> 2    13   15     2   14   16
>
> Hope it's clearer.
>
> On 07/25/2009 03:37 AM, jim holtman wrote:
>>
>> Are you using 'read.csv'?  At least include an 'str' of the object you
>> are wanting to convert so that we know the structure of it, since we
>> can not guess at what it is.
>>
>> On Sat, Jul 25, 2009 at 4:32 AM, Junqian Gordon Xu<xjqian at gmail.com>
>> wrote:
>>>
>>> Actually, when I read the spreadsheet from the csv file, "S1-[abcd]" are the
>>> headers and "T1-[abcd]" are the strings in the first column of the data frame.
>>>
>>> Gordon
>>>
>>> On 07/25/2009 03:13 AM, jim holtman wrote:
>>>>
>>>> It is not entirely clear what the format of your data is. If you have
>>>> a dataframe that you would like to separate into several different ones
>>>> based on the value in a column, then something like this will work:
>>>>
>>>> df.list <- split(yourDF, yourDF$column)
>>>>
>>>> This will create a list of dataframes, split according to the contents
>>>> of "column".
>>>>
>>>> On Fri, Jul 24, 2009 at 9:20 PM, Junqian Gordon Xu<xjqian at gmail.com>
>>>> wrote:
>>>>>
>>>>> I have multidimensional data that looks like the following:
>>>>>
>>>>>     "S1-a" "S2-b" "S3-c" "S4-d" "S5-a" "S6-b" "S7-c" "S8-d"
>>>>> "T1-A"
>>>>> "T1-B"
>>>>> "T1-C"
>>>>> "T1-D"
>>>>> "T2-A"
>>>>> "T2-B"
>>>>> "T2-C"
>>>>> "T2-D"
>>>>>
>>>>> I read it from a csv file and would like to have 16 separate data frames
>>>>> like this
>>>>>
>>>>>   "S1-a" "S2-a"     "S1-b" "S2-b"     "S1-c" "S2-c"    "S1-d" "S2-d"
>>>>> "T1-A"            "T1-A"            "T1-A"            "T1-A"
>>>>> "T2-A"            "T2-A"            "T2-A"            "T2-A"
>>>>>
>>>>>   "S1-b" "S2-b"   ...
>>>>> "T1-B"               ...
>>>>> "T1-B"               ...
>>>>>
>>>>> ...
>>>>> ...
>>>>>
>>>>> One way is to use loops to cycle through, but I think it's even simpler
>>>>> to use a regular expression to separate them, since "abcd" and "ABCD" are
>>>>> unique strings in the table. Does anybody have any pointers on how to do
>>>>> this?
>>>>>
>>>>> Thanks
>>>>> Gordon



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



