# [R] how to apply sample function to each row of a data frame?

Petr Savicky savicky at cs.cas.cz
Sat Nov 20 09:51:52 CET 2010

```
On Fri, Nov 19, 2010 at 07:22:57PM -0800, wangwallace wrote:
> actually, what I meant is to draw two random numbers from each row
> separately to create a new data frame. for example, an example output could
> be:
>
> 1 3
> 4 5
> 9 8

This may be done, for example, as follows.

X <- matrix(1:9, ncol = 3, byrow = TRUE)
colnames(X) <- c("M", "P", "Q")
X <- data.frame(X)
Y <- t(apply(X, 1, sample, 2))

Y is a matrix, since apply() uses as.matrix() on its first argument,
if it is a data frame. If the samples from all rows have the same
column names, Y gets these column names, otherwise no column names
are used. We may get something like

     M P
[1,] 1 2
[2,] 4 5
[3,] 7 8

but more typically something like

     [,1] [,2]
[1,]    1    2
[2,]    5    6
[3,]    9    8

> Finally, since the column names of the sampled two numbers across these
> three rows will probably be different, I guess I cannot use rbind to put all
> these three rows together.

Combining rows with different column names is possible for matrices.
The column names of the first row are used for the result.
For example

Z <- as.matrix(X)
r1 <- sample(Z[1, ], 2)
r2 <- sample(Z[2, ], 2)
r3 <- sample(Z[3, ], 2)
rbind(r1, r2, r3)

   P Q
r1 2 3
r2 5 4
r3 9 7

> Is there anything else (I don't want use list) I
> can use to align three rows with different column names together? Also, if I
> can write a function for it.

Such a function can be written also for data frames, if it sets the names
of the input rows explicitly to the same vector of names before rbind().
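
# A minimal sketch of such a function. The name sample_rows_df and the
# column names s1, s2, ... are illustrative assumptions, not part of the
# original question.
sample_rows_df <- function(df, k = 2) {
    rows <- lapply(seq_len(nrow(df)), function(i) {
        # pick k columns at random and keep row i as a one-row data frame
        r <- df[i, sample(ncol(df), k), drop = FALSE]
        # give every row the same names, so that rbind() accepts them
        names(r) <- paste0("s", seq_len(k))
        r
    })
    do.call(rbind, rows)
}
# Usage, with X the data frame defined above: sample_rows_df(X)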

> May I use some syntax like the following to
> repeat the whole process 1000 times (i.e., 1000 samples)?
>
> > result<-vector("list",1000)
> > for(i in 1:1000)result[[i]]<-fff(data)#fff(data) is the function name
> > result

This should work. Alternatively, it is possible to use something like

replicate(5, list(t(apply(X, 1, sample, 2))))

[[1]]
     [,1] [,2]
[1,]    1    3
[2,]    5    6
[3,]    9    7

[[2]]
     [,1] [,2]
[1,]    2    3
[2,]    4    5
[3,]    7    8

[[3]]
     [,1] [,2]
[1,]    1    2
[2,]    5    4
[3,]    9    7

[[4]]
     M P
[1,] 1 2
[2,] 4 5
[3,] 7 8

[[5]]
     [,1] [,2]
[1,]    3    2
[2,]    4    6
[3,]    7    9

where t(apply(X, 1, sample, 2)) may be replaced by a function, which
always produces a matrix with column names.

PS.

```
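
The closing remark in the reply suggests replacing t(apply(X, 1, sample, 2)) with a function that always produces a matrix with column names. One way this might look is sketched below. The function name sample_rows and the column names s1, s2 are illustrative assumptions, not part of the original post, and the sketch assumes at least two values are drawn per row.

```r
# Example data from the post
X <- data.frame(matrix(1:9, ncol = 3, byrow = TRUE,
                       dimnames = list(NULL, c("M", "P", "Q"))))

# Draw k values from each row and label the result columns s1, ..., sk
# (hypothetical names), regardless of which original columns were drawn.
# Assumes k >= 2, so that apply() returns a matrix rather than a vector.
sample_rows <- function(x, k = 2) {
    y <- t(apply(x, 1, sample, k))
    colnames(y) <- paste0("s", seq_len(k))
    y
}

# 1000 independent samples, kept as a list of 3 x 2 matrices
result <- replicate(1000, sample_rows(X), simplify = FALSE)
```

With simplify = FALSE, replicate() returns a plain list, so the explicit list() wrapper used in the reply is not needed, and result[[i]] is the i-th sampled matrix.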