[R] Splitting one column value into multiple rows
David Winsemius
dwinsemius at comcast.net
Mon Jul 18 06:09:41 CEST 2011
On Jul 17, 2011, at 11:47 PM, David Winsemius wrote:
>
> On Jul 17, 2011, at 11:27 PM, Madana_Babu wrote:
>
>> Hi David,
>>
>> PFB
Whatever that TLA means ...
>> the details of my query. Request your help in getting this resolved.
>>
>> # TESTING is my dataset with almost 40K rows.
A small dataset.
>> I am importing this dataset
>> from my local desktop
>>
>> TESTING <- read.table("/Users/madana/Desktop/testing.txt",
>> header=FALSE,
>> sep="\t", na.strings="", dec=".", strip.white=TRUE)
>
> This is the start of problems. Any text column will come in as a
> factor.
You should also get in the habit of looking at your data as soon as it
comes in, using str() and summary().
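
A minimal sketch of that habit, assuming a small made-up stand-in for the tab-separated file (the values in `txt` are hypothetical, not the poster's data). `stringsAsFactors = TRUE` is set explicitly so the factor behaviour shows up whatever the session default is:

```r
# Hypothetical two-line stand-in for testing.txt (tab-separated)
txt <- c("1,a:2,b\tx", "3,c:4,d\ty")

TESTING <- read.table(text = txt, header = FALSE, sep = "\t",
                      na.strings = "", strip.white = TRUE,
                      stringsAsFactors = TRUE)

str(TESTING)      # shows the class of each column; V1 is a factor here
summary(TESTING)  # per-column summary; factors are shown as level counts
```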
>>
>> TESTING
>>
>> # I tried the following two ways. Let me know if i am using right
>> syntax.
>>
>> Lines <- readLines(textConnection(data.frame(TESTING$V1)))
>
> You would need to instead use:
>
> Lines <- readLines(textConnection(as.character(TESTING$V1)))
I was lying in bed about to go to sleep and realized that this
untested strategy was unnecessary (even if it does work, which I suspect
it may not).
Lines <- as.character(TESTING$V1) # should be enough.
The goal here is to get a character vector with which to work.
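
A rough sketch of the whole sequence once `Lines` is a character vector. The layout of V1 is an assumption (records separated by ":" and fields by ","), guessed from the strsplit() and read.table() calls quoted below; the sample values are made up:

```r
# Hypothetical data: each V1 value packs several "id,label" records
TESTING <- data.frame(V1 = factor(c("1,a:2,b", "3,c:4,d:5,e")))

# strsplit() on a factor fails with "non-character argument"
res <- try(strsplit(TESTING$V1, ":"), silent = TRUE)

# Convert to character first, then split into one record per element
Lines <- as.character(TESTING$V1)
newlines2 <- unlist(strsplit(Lines, ":", fixed = TRUE))

# Parse the records into one row each
cleaned_data <- read.table(text = newlines2, sep = ",",
                           stringsAsFactors = FALSE)
cleaned_data
```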
Good night.
--
David,
>
> (Or you could have just read in the entire dataset with readLines
> instead of read.table.)
>
> (Or you could have used read.table with as.is=TRUE or
> stringsAsFactors = FALSE)
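
A short sketch of those two import options, again on made-up tab-separated input (`txt` is hypothetical). Either argument should keep text columns as character, so no conversion is needed afterwards:

```r
# Hypothetical stand-in for the contents of testing.txt
txt <- c("1,a:2,b\tx", "3,c\ty")

# Option 1: stringsAsFactors = FALSE
TESTING <- read.table(text = txt, header = FALSE, sep = "\t",
                      stringsAsFactors = FALSE)

# Option 2: as.is = TRUE
TESTING2 <- read.table(text = txt, header = FALSE, sep = "\t",
                       as.is = TRUE)

# Both leave V1 as character, so strsplit() works on it directly
strsplit(TESTING$V1, ":", fixed = TRUE)
```

Since R 4.0.0 the default is stringsAsFactors = FALSE, but stating it explicitly keeps the code independent of the R version.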
>
> Seekers of advice take heed. Madana_Babu ignored the advice in the
> Posting Guide to include his code in his two earlier postings.
> Those of us who make efforts at offering advice are unable to read
> minds.
>
>>
>> # Error message is:
>> Error in textConnection(data.frame(TESTING$V1)) : invalid 'text'
>> argument
>>
>> Lines <- readLines(textConnection(data.frame("TESTING", header=FALSE,
>> sep="\t", na.strings="", dec=".", strip.white=TRUE)))
>>
>> # Error message is:
>> Error in textConnection(data.frame("TESTING", header = FALSE, sep =
>> "\t", :
>> argument 'object' must deparse to a single character string
>>
>> closeAllConnections()
>> newlines <- strsplit(Lines, ":")
>>
>> # Error message is:
>> Error in strsplit(Lines, ":") : non-character argument
>>
>> newlines2 <- unlist(newlines)
>>
>>
>> cleaned_data <- read.table(textConnection(newlines2), sep=",")
>>
>> # Error message is:
>> Error in textConnection(newlines2) : invalid 'text' argument
>>
>> My machine Config is: Dual Core.
>
> I doubt that makes any difference, and furthermore it does not tell
> me your OS or your version of R, which in some cases does make a
> difference, but again I think it was the default stringsAsFactors
> setting, which is a universal pitfall.
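
For reference, a minimal sketch of how the OS and R version can be collected for a posting, using only base R calls:

```r
# One-line R version string
R.version.string

# Operating system name as reported by the system
Sys.info()[["sysname"]]

# Full summary: R version, platform, locale, attached packages
sessionInfo()
```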
>
>
> David Winsemius, MD
> West Hartford, CT
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT