[R] recode according to specific sequence of characters within a string variable
Marc Schwartz
marc_schwartz at me.com
Fri Feb 4 14:42:54 CET 2011
Do you mean something like:
> with(DF.new, paste(person, year, paste("team", team, sep = ""), sep = "_"))
[1] "jeff_2001_teamx" "jeff_2002_teamy" "robert_2002_teamz"
[4] "mary_2002_teamz" "mary_2003_teamy"
?
See ?paste and ?with for more information, if so.
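
A self-contained sketch of that idea, in case it helps. It assumes DF.new looks like the result of my earlier reply quoted below, and it uses paste0(), which needs a reasonably recent R; paste(..., sep = "") does the same thing. Note that the original element order (e.g. "teamy_jeff_2002") cannot be recovered, since DF.new does not record it:

```r
# DF.new as produced in the earlier reply (values typed in by hand here)
DF.new <- data.frame(person = c("jeff", "jeff", "robert", "mary", "mary"),
                     year   = c(2001, 2002, 2002, 2002, 2003),
                     team   = c("x", "y", "z", "z", "y"))

# Rebuild a single column in the fixed order person_year_team<letter>
DF <- data.frame(name.of.report = with(DF.new,
                   paste(person, year, paste0("team", team), sep = "_")))

DF
```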
HTH,
Marc
On Feb 4, 2011, at 7:26 AM, Denis Kazakiewicz wrote:
> Dear R people,
> Could you please help?
> I have a similar but opposite question:
> how can I reshape the data from DF.new back to DF, as in the example Marc kindly
> provided?
>
> Thank you
> Denis
>
> On Fri, 2011-02-04 at 07:09 -0600, Marc Schwartz wrote:
>> On Feb 4, 2011, at 6:32 AM, D. Alain wrote:
>>
>>> Dear R-List,
>>>
>>> I have a dataframe with one column "name.of.report" containing character values, e.g.
>>>
>>>
>>> > df$name.of.report
>>>
>>> "jeff_2001_teamx"
>>> "teamy_jeff_2002"
>>> "robert_2002_teamz"
>>> "mary_2002_teamz"
>>> "2003_mary_teamy"
>>> ...
>>> (i.e. the bit of interest is not always at same position)
>>>
>>> Now I want to recode the column "name.of.report" into the variables "person", "year","team", like this
>>>
>>> > new.df
>>>
>>> "person" "year" "team"
>>> jeff     2001   x
>>> jeff     2002   y
>>> robert   2002   z
>>> mary     2002   z
>>>
>>> I tried with grep()
>>>
>>> df$person<-grep("jeff",df$name.of.report)
>>>
>>> but of course it didn't exactly result in what I wanted to do. Could not find any solution via RSeek. Excuse me if it is a very silly question, but can anyone help me find a way out of this?
>>>
>>> Thanks a lot
>>>
>>> Alain
>>
>>
>> There will be several approaches, all largely involving the use of ?regex. Here is one:
>>
>>
>> DF <- data.frame(name.of.report = c("jeff_2001_teamx", "teamy_jeff_2002",
>>                                     "robert_2002_teamz", "mary_2002_teamz",
>>                                     "2003_mary_teamy"))
>>
>> > DF
>>      name.of.report
>> 1   jeff_2001_teamx
>> 2   teamy_jeff_2002
>> 3 robert_2002_teamz
>> 4   mary_2002_teamz
>> 5   2003_mary_teamy
>>
>>
>> DF.new <- data.frame(person = gsub("[_0-9]|team.", "", DF$name.of.report),
>>                      year = gsub(".*([0-9]{4}).*", "\\1", DF$name.of.report),
>>                      team = gsub(".*team(.).*", "\\1", DF$name.of.report))
>>
>>
>> > DF.new
>>   person year team
>> 1   jeff 2001    x
>> 2   jeff 2002    y
>> 3 robert 2002    z
>> 4   mary 2002    z
>> 5   mary 2003    y
>>
>>
>>
>> HTH,
>>
>> Marc Schwartz
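
As one more illustration of the "several approaches" mentioned in the quoted reply, the same three fields can also be pulled out with strsplit() and regmatches() rather than gsub() alone. This is only a sketch, assuming every string contains exactly one four-digit year, one "team" token followed by a single character, and one name, separated by underscores:

```r
x <- c("jeff_2001_teamx", "teamy_jeff_2002", "2003_mary_teamy")

# Year: the first run of four digits in each string
year <- regmatches(x, regexpr("[0-9]{4}", x))

# Team: the single character that follows "team"
team <- sub(".*team(.).*", "\\1", x)

# Person: split on "_" and keep the piece that is neither a year nor a team token
parts  <- strsplit(x, "_", fixed = TRUE)
person <- vapply(parts,
                 function(p) p[!grepl("^[0-9]{4}$|^team.$", p)],
                 character(1))

data.frame(person, year, team)
```

The vapply() call stops with an error if a string does not yield exactly one name, which makes malformed entries easy to spot.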