[R] matrix(unlist(strsplit(""))) 'missing value' issue
MaartenJacobs
maart_jacobs at hotmail.com
Tue Mar 27 16:55:40 CEST 2012
I'm still an R noob; I have only had a couple of lectures about it in our
research master's programme.
I have to write some code for a Deal or No Deal experiment.
Someone wrote a website to gather the data and write it to an .xlsx file.
There is a separate file for each participant, so first I have to import
the separate data files. I do that like this:
# Merge the xlsx files into one data frame
alldata <- rbind(read.xlsx('experimentdata.xlsx', 1),
                 read.xlsx('experimentdata_1.xlsx', 1),
                 read.xlsx('experimentdata_2.xlsx', 1)
                 # etc.: read.xlsx('filepath', 1)
                 )
The website is poorly written and some of the variables are not convenient.
I have the variables 'bankoffer.1', 'bankoffer.3', 'bankoffer.5' etc.
These variables look like the following:
> alldata$bankoffer.1
[1] 246000:accepted 267000:notaccepted 200000:notaccepted
Levels: 246000:accepted 267000:notaccepted 200000:notaccepted
> alldata$bankoffer.3
[1] 9999999 429000:notaccepted 48000:notaccepted
Levels: 9999999 429000:notaccepted 48000:notaccepted
The problem is that the values in the cells are awkward: they consist of, for
example, '246000:accepted'. I would like to decompose that so that 246000 is
in one variable and accepted in another.
That is no problem for bankoffer.1; I just do this:
> as.data.frame(matrix(unlist(strsplit(as.character(alldata$bankoffer.1), ":")),
+               ncol = 2, byrow = TRUE))
      V1          V2
1 246000    accepted
2 267000 notaccepted
3 200000 notaccepted
However, when there are missing values, as in bankoffer.3, there is a
problem:
> as.data.frame(matrix(unlist(strsplit(as.character(alldata$bankoffer.3), ":")),
+               ncol = 2, byrow = TRUE))
           V1      V2
1     9999999  429000
2 notaccepted   48000
3 notaccepted 9999999
Warning message:
In matrix(unlist(strsplit(as.character(alldata$bankoffer.3), ":")),  :
  data length [5] is not a sub-multiple or multiple of the number of rows [3]
R does not encounter a ':' in the 9999999 and therefore places the 429000 in
the second column; it should, however, be in the first one. The result I want
looks like this:
       V1          V2
1 9999999     9999999
2  429000 notaccepted
3   48000 notaccepted
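
The shift can be reproduced without the .xlsx files. The sketch below is only
an illustration: `offers` is a hypothetical stand-in for alldata$bankoffer.3,
with its values copied from the output shown earlier, and `pieces` and `flat`
are made-up names.

```r
# Hypothetical stand-in for alldata$bankoffer.3 (values copied from the post).
offers <- factor(c("9999999", "429000:notaccepted", "48000:notaccepted"))

# strsplit() returns one character vector per cell. The cell without a ":"
# yields a single piece, the other cells yield two pieces each.
pieces <- strsplit(as.character(offers), ":")
sapply(pieces, length)   # 1 2 2

# unlist() flattens the list into 5 values, so filling a 2-column matrix
# row by row shifts every value that comes after the first cell.
flat <- unlist(pieces)
length(flat)             # 5
```

Because 5 values do not fill a 3 x 2 matrix, matrix() recycles the first value
into the last cell, which seems to be where the trailing 9999999 in the wrong
output comes from.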
How can I tell R to place 9999999 in both columns when it encounters a
9999999? Any other solution to my problem is also welcome. For example, I
thought about making R add ':9999999' whenever it encounters 9999999 as a
sort of workaround for the problem, but I have no idea how to do that.
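
A minimal sketch of that workaround, assuming the missing-value marker is
always exactly the string '9999999' and every other cell contains exactly one
':'. The names `offers`, `fixed`, `parts`, `amount` and `decision` are
illustrative and do not come from the data:

```r
# Hypothetical stand-in for as.character(alldata$bankoffer.3).
offers <- c("9999999", "429000:notaccepted", "48000:notaccepted")

# Turn a bare missing-value marker into "9999999:9999999" so that every
# cell splits into exactly two pieces.
fixed <- ifelse(offers == "9999999", "9999999:9999999", offers)

# Same split as before, now with an even number of pieces.
parts <- as.data.frame(matrix(unlist(strsplit(fixed, ":")),
                              ncol = 2, byrow = TRUE),
                       stringsAsFactors = FALSE)
names(parts) <- c("amount", "decision")
parts
```

With the real data, `offers` would be replaced by
as.character(alldata$bankoffer.3). Both columns come back as character, so the
amount would still need as.numeric() before any calculation.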
I hope I have made it reasonably clear what the problem is and what I
eventually want. If not, please ask.
Greetings Maarten
--
View this message in context: http://r.789695.n4.nabble.com/matrix-unlist-strsplit-missing-value-issue-tp4509065p4509065.html
Sent from the R help mailing list archive at Nabble.com.