[R] Non-Parametric Adventures in R
Jamesp
james.jrp015 at gmail.com
Sat Oct 2 23:27:09 CEST 2010
I just started using R and I'm having all sorts of "fun" trying different
things.
I'm going to document the different things I'm doing here as a kind of case
study. I'm hoping that I'll get help from the community so that I can use R
properly.
Anyways, in this study, I have demographic data, drug usage data, and side
effect data. All of this is loaded into a csv file. I'm using Rweb as an
interface, so I had to modify the cgi-bin code slightly, but it works pretty
well. I'm looking for frequency counts, some summary data for columns where
it makes sense, plots and chi-squared tests. My data frame is named X since
that's what Rweb names it.
----------------------------------------------------------------------------------------------------
1) I was thinking I'd have to go through each nominal variable one at a time
(e.g. table(X$race)), but I think I have it figured out now. summary(X) is
nice, but I need to recode nominal data with labels so the results are
meaningful.
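
A minimal sketch of how the counts for several nominal columns could be produced in one pass. The small data frame and the column names sex and age are made up for illustration; only race appears in the real data described here.

```r
# Hypothetical stand-in for the real data frame (sex and age are assumed names)
X <- data.frame(
  race = c(0, 2, 2, 0, 2),
  sex  = c(1, 1, 0, 0, 1),
  age  = c(34, 51, 27, 45, 62)
)

# Treat the nominal columns as factors so summary() reports counts for them
nominal <- c("race", "sex")
X[nominal] <- lapply(X[nominal], factor)

# One frequency table per nominal column
counts <- lapply(X[nominal], table)
print(counts)

# summary() now shows counts for factors and quantiles for numeric columns
print(summary(X))
```

Once a column is a factor, summary(X) lists its level counts instead of a mean and quartiles, which is probably what is wanted for coded nominal data.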
-----------------------------------------------------------------------------------------------------
2) I had an issue with multiple plots overwriting each other, and I managed
to bypass that with:
par(mfrow=c(2,1))
I think I have to update it to match the number of plots. There's
probably a better way to do this.
barplot(table(X$race)) draws a bar plot, so that's great.
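
One possible way to avoid editing the layout by hand is to size it from the list of variables being plotted. This is only a sketch: the data and the sex column are invented, and pdf(NULL) is there just so the example runs without a display (it would be dropped when the plots are actually wanted).

```r
pdf(NULL)  # assumption: discard graphics output so the sketch runs anywhere

# Hypothetical stand-in data (sex is an assumed column name)
X <- data.frame(race = c(0, 2, 2, 0, 2), sex = c(1, 1, 0, 0, 1))
vars <- c("race", "sex")

# Size the plot grid from the number of plots instead of hard-coding c(2,1)
old_par <- par(mfrow = c(length(vars), 1))
for (v in vars) {
  barplot(table(X[[v]]), main = v)
}
par(old_par)          # restore the previous layout
invisible(dev.off())  # close the device opened above
```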
-----------------------------------------------------------------------------------------------------
3) I was able to code my data so it shows up in tables better with:
X$race <- factor(X$race, levels = c(0,2),
                 labels = c("African American","White,Non-Hispanic"))
----------------------------------------------------------------------------------------------------
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
----------------------------------------------------------------------------------------------------
4) The coding for all of my drug variables is identical, and I'd like to
create a loop that goes through each of them and applies the labels.
I haven't had much success with this yet, but here's what I'm trying:
X[1,] <- factor(X[1,], levels = c(0,1,2,3,4,5),
                labels = c("none","last week","last 3 month","last year",
                           "regular use at least 3 months",
                           "unknown length of usage"))
I know I would need to replace the [1,] with something that gives me the
column, but I'm not sure what to put syntactically at the moment.
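
For reference, X[1,] selects the first row, while X[,1] (or X[[1]]) selects the first column. A hedged sketch of labelling a block of identically coded columns follows; the toy data frame and the names drugA_when etc. are invented, and in the real data the column range would presumably be 24:56.

```r
# Hypothetical stand-in: three drug columns sharing the same 0-5 coding
X <- data.frame(
  id         = 1:4,
  drugA_when = c(0, 1, 5, 3),
  drugB_when = c(2, 0, 0, 4),
  drugC_when = c(1, 1, 2, 0)
)

drug_cols <- 2:4   # assumption: would be 24:56 in the real data frame
usage_labels <- c("none", "last week", "last 3 month", "last year",
                  "regular use at least 3 months", "unknown length of usage")

# Apply the same factor() call to every drug column at once
X[drug_cols] <- lapply(X[drug_cols], factor,
                       levels = 0:5, labels = usage_labels)

print(table(X$drugA_when))
```

Comparisons such as X[,i] > 0 are not meaningful on factors (they return NA with a warning), so any numeric recoding would need to happen before this step or on a copy of the original columns.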
----------------------------------------------------------------------------------------------------
5) I had more success creating new variables based on the old ones. So I
end up with yes/no answers to drug usage
for (i in 24:56) {
  # 1 if any usage was reported, 0 otherwise; results go in columns 197:229
  X[, i + 173] <- ifelse(X[, i] > 0, 1, 0)
}
I'd like to have been able to build each new variable name from the old
variable name (i.e. dropping "_when" from the end of each and replacing it
with "_yn").
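
A sketch of how the new names might be derived with sub(), assuming the source columns all end in "_when". The data frame and the drugA/drugB names are hypothetical.

```r
# Hypothetical stand-in data with two "_when" columns
X <- data.frame(
  id         = 1:4,
  drugA_when = c(0, 1, 5, 3),
  drugB_when = c(2, 0, 0, 4)
)

# Find the "_when" columns by name and build the matching "_yn" names
when_cols <- grep("_when$", names(X), value = TRUE)
yn_cols   <- sub("_when$", "_yn", when_cols)

# 1 if any usage was reported, 0 otherwise; new columns are added by name
X[yn_cols] <- lapply(X[when_cols], function(x) as.integer(x > 0))

print(names(X))
```

Selecting columns by name pattern rather than by position (24:56, i+173) means the code keeps working if columns are added or reordered in the csv file.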
---------------------------------------------------------------------------------------------------
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
---------------------------------------------------------------------------------------------------
6) I'm able to make a cross-tabulated table and perform a chi-squared test
just fine with my recoded variable:
table(X$race,X[,197])
prop.test(table(X$race,X[,197]))
but I would like to be able to do so with all of my drugs, and I can't
seem to make that work. The loop below runs but shows no output:
for (i in 197:229)
{
table(X$race,X[,i])
prop.test(table(X$race,X[,i]))
}
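
A likely cause is that R only auto-prints results at the top level; inside a for loop each result has to be passed to print() explicitly. The sketch below shows this on invented data (the counts and the drugA_yn/drugB_yn names are made up) and also stores the test objects so the p-values can be collected afterwards.

```r
# Hypothetical stand-in data: 40 subjects per race group, two yes/no drug columns
X <- data.frame(
  race     = factor(rep(c("African American", "White,Non-Hispanic"), each = 40)),
  drugA_yn = rep(c(1, 0, 1, 0), times = c(10, 30, 20, 20)),
  drugB_yn = rep(c(1, 0, 1, 0), times = c(15, 25, 18, 22))
)
yn_cols <- c("drugA_yn", "drugB_yn")  # assumption: columns 197:229 in the real data

results <- list()
for (v in yn_cols) {
  tab <- table(X$race, X[[v]])
  results[[v]] <- prop.test(tab)
  print(tab)            # explicit print(): nothing is auto-printed inside a loop
  print(results[[v]])
}

# Collect the p-values from the stored test objects
p_values <- sapply(results, function(r) r$p.value)
print(p_values)
```

Keeping the results in a list makes it possible to pull out statistics or p-values later without rerunning the tests.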
-------------------------------------------------------------------------------------------------
Thanks for reading over this and I do appreciate any help. I understand
that there's "an R way" of doing things, and I look forward to learning the
method.