[R] Creating Functions in R

Sarah Goslee sarah.goslee at gmail.com
Thu Jul 24 18:25:08 CEST 2014


Hi,

On Thu, Jul 24, 2014 at 11:08 AM, Pavneet Arora
<pavneet.arora at uk.rsagroup.com> wrote:
> Hello Sarah
>
> Thank you for the detailed explanation, it helped me understand a lot. However,
> I don't understand what you meant by - " It's a really good idea to
> explicitly mark the loop with { } too, to reduce confusion."

Instead of
>   for(k in 1:length(sub))
>   deviation <- sub[k]- target

it's clearer to use

for (k in 1:length(sub)) {
  deviation <- sub[k] - target
}

so it's explicit what's being looped over.
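
A small sketch of why this matters (the counters and the 1:3 range are
made up for illustration, they are not from your code): without braces,
only the first statement after for() belongs to the loop, whatever the
indentation suggests.

```r
# Without braces: only the first statement after for() is the loop body
count_a <- 0
for (k in 1:3)
  x <- k
  count_a <- count_a + 1   # indented like the body, but runs once, after the loop

# With braces: both statements run on every iteration
count_b <- 0
for (k in 1:3) {
  x <- k
  count_b <- count_b + 1
}

count_a   # 1
count_b   # 3
```

In your function the data.frame() line is indented as if it were inside
the loop, but it runs only once, after the loop has finished.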

But as I already explained, along with several other people, not only
do you not need a loop, but your loop overwrites deviation on each
iteration, so it isn't doing what you think it is.
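
To make that concrete, here is a minimal sketch on a three-row stand-in
for your data (the shortened data frame is an assumption for
illustration only). length() of a data frame is its number of columns,
so the loop visits week and then value, and only the last result is
kept.

```r
# Three-row stand-in for the poster's 30-row data frame
sub <- data.frame(week = 1:3, value = c(9.45, 7.99, 9.29))
target <- 10

length(sub)   # number of columns, not number of rows

for (k in 1:length(sub)) {
  deviation <- sub[k] - target   # k = 1 uses week, k = 2 uses value
}
names(deviation)   # only the k = 2 result (the value column) survives

# The vectorised form subtracts target from every row at once, no loop needed
sub$value - target
```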

> Also as per your suggestion I tried to say sub$value, but I get the same
> value "-2.01" for each row. Not sure what I did wrong there?

Where did you do that?

>
> This is my code now:
> vmask <- function(sub,target){
>   for(k in 1:length(sub))
>   deviation <- sub[k]- target
>   dev <- data.frame(sub,deviation=deviation)
>   dev
>
>    cusums <- cumsum(dev$deviation)
>    cusums <- data.frame(dev,cusums=cusums)
>    cusums
>
> }
> vmask(sub,10)
> View(dev)
>
> cusums <- vmask(sub,10)
> View(cusums)
>
>
> Also when I try the cusums command, I get the following error:
> Error in data.frame(dev, cusums = cusums) : arguments imply differing number
> of rows: 30, 0
> What does this mean? And how can I fix it?

It means your function is still a mess and you haven't read the
Introduction to R guide.
Your question isn't clear: "when you try it" where? On its own? When
running the function?
There is no "cusums command" either.
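
As for the message itself, here is a likely reading, sketched on a
three-row stand-in for your data rather than your exact session: the
loop leaves deviation as a one-column data frame named value, so the
combined data frame has no column called deviation. dev$deviation is
then NULL, cumsum() of it is empty, and data.frame() cannot combine your
rows with a zero-length column.

```r
# Three-row stand-in for the poster's data (an assumption for illustration)
sub <- data.frame(week = 1:3, value = c(9.45, 7.99, 9.29))

# What the loop leaves behind: sub[2] is a one-column data frame named "value"
deviation <- sub[2] - 10
dev <- data.frame(sub, deviation = deviation)
names(dev)   # the third column keeps the name "value", made unique as "value.1"

is.null(dev$deviation)          # TRUE: there is no column called deviation
length(cumsum(dev$deviation))   # 0: cumsum(NULL) is an empty vector

# Combining 3 rows with a zero-length vector is what raises the error
res <- tryCatch(data.frame(dev, cusums = cumsum(dev$deviation)),
                error = function(e) conditionMessage(e))
res
```

Using sub$value (a plain vector) instead of sub[k] avoids the problem,
because the deviation = name is then actually applied.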

If you're determined to use a function, needed or not, lose the loop.

sub <- structure(list(week = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30), value = c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04,
11.46, 9.2, 10.34, 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62,
10.31, 10, 13, 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62,
11.31, 10.52)), .Names = c("week", "value"), row.names = c(NA,
-30L), class = "data.frame")


vmask <- function(sub, target) {
  deviation <- sub$value - target
  cusums <- cumsum(deviation)
  data.frame(sub, deviation = deviation, cusums = cusums)
}

vmask(sub, 10)

Note that this makes substantial assumptions about the structure of
the sub argument, namely that it has a column named value.
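
If you wanted to loosen that assumption, one possible variant is
sketched below. The name vmask2 and the column argument are hypothetical
additions for illustration, not part of anything above. It takes the
column name as an argument and stops early if that column is missing.

```r
# Hypothetical variant of vmask(): the column to use is passed by name
vmask2 <- function(sub, target, column = "value") {
  # Fail early with a clear error if the input is not what we expect
  stopifnot(is.data.frame(sub), column %in% names(sub))
  deviation <- sub[[column]] - target   # [[ ]] returns a plain vector
  data.frame(sub, deviation = deviation, cusums = cumsum(deviation))
}

# Three-row stand-in for the full data set
small <- data.frame(week = 1:3, value = c(9.45, 7.99, 9.29))
res <- vmask2(small, 10)
res
```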

Sarah




> PS: Thank you so much for helping me with this.
>
>
>
>
> From:        Sarah Goslee <sarah.goslee at gmail.com>
> To:        Pavneet Arora/UK/RoyalSun at RoyalSun
> Cc:        r-help <r-help at r-project.org>
> Date:        24/07/2014 15:04
> Subject:        Re: [R] Creating Functions in R
> ________________________________
>
>
>
> Hi,
>
>
>
> On Thu, Jul 24, 2014 at 9:35 AM, Pavneet Arora
> <pavneet.arora at uk.rsagroup.com> wrote:
>> Hello Guys
>> I am new at writing Functions in R, and as a result am struggling with it.
>> I am trying to use Google & other resources, but it's hard to find
>> solutions when you don't know what to look for.
>
> How about the introduction to R that comes with your installation?
> It's got a section on writing functions, and some other useful
> information that you seem to not have learned yet.
>
>
>> I have the following small dataset
>>> dput(sub)
>> structure(list(week = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
>> 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
>> 29, 30), value = c(9.45, 7.99, 9.29, 11.66, 12.16, 10.18, 8.04,
>> 11.46, 9.2, 10.34, 9.03, 11.47, 10.51, 9.4, 10.08, 9.37, 10.62,
>> 10.31, 10, 13, 10.9, 9.33, 12.29, 11.5, 10.6, 11.08, 10.38, 11.62,
>> 11.31, 10.52)), .Names = c("week", "value"), row.names = c(NA,
>> -30L), class = "data.frame")
>>
>> I want to take each of the values and subtract a target from it (in this
>> case 10).
>
> Thank you for providing data with dput()!
>
> There are a bunch of things wrong with your function, starting with
> the lack of need for a function.
>
> If I understand your description correctly, what you actually want is:
>
>
> sub$deviation <- sub$value - 10
>
> But for educational purposes, here goes:
>
>
>> This is what I have written in my function so far:
>> vmask <- function(data,target){
>> for(k in 1:length(data))
>
> this actually loops through the COLUMNS of data, so first you're subtracting
> target from week, then from value
>
>
>> deviation <- data[k]- target
>
> but coincidentally it gives you what you thought you were getting,
> because you're overwriting deviation with each value of k, so the week
> -target column is never saved. It's a really good idea to explicitly
> mark the loop with { } too, to reduce confusion.
>
>> dev <- return(data.frame(cbind(data,deviation)))
>
> Hm. I don't know what you're trying to do with return() here, and
> using both data.frame() and cbind() is superfluous. It isn't always
> necessary, but I find it useful to explicitly name the columns of your
> data frame when you create it, which gives
>
> dev <- data.frame(data, deviation = deviation)
>
>> return(dev)
>
> The value of the last expression evaluated in a function is what's
> returned, so all you really need here is
>
> dev
>
>> }
>> vmask(sub,10)
>> View(dev)
>
> dev only exists within the scope of the function. But you didn't
> assign the return value of the function to anything. If you assign it
> to an object named dev, then dev will exist in the global environment:
>
> dev <- vmask(sub, 10)
>
>
>
>> When I run this I get the results as expected. But I expected the new
>> column to be called "deviation", whereas R just calls it "value.1". How
>> can I fix this?
>> Also I was hoping to see this new dataset with columns "week", "value",
>> and now "deviation" when I use "View(dev)", but it comes up with the error
>> 'dev not found'. How can I fix this? Also, is there any way, instead of
>> making a new dataset called "dev" with the 3 columns, that I can just
>> re-use my original dataset "sub" and have it hold all 3 columns?
>>
>> The next step I want to do is to perform a cumulative sum. So looking at
>> the results, I want a new column in the existing dataset (or a new dataset),
>> which will now have 4 columns. The 4th column I want to be called "CuSum".
>> So the first row of CuSum will be "-0.55", the second = "-0.55+(-2.01)"
>> which will give me "-2.56", and so forth.
>>
>> How can I do this in R using a function? Please help
>
> You don't need a function. Just add the cumulative sum as a new column.
>
> sub$Cusum <- cumsum(sub$deviation)
>
>
> Sarah
> --
> Sarah Goslee
> http://www.functionaldiversity.org
>
>
>
-- 
Sarah Goslee
http://www.functionaldiversity.org


