[R] Splitting one column value into multiple rows

David Winsemius dwinsemius at comcast.net
Mon Jul 18 05:47:00 CEST 2011


On Jul 17, 2011, at 11:27 PM, Madana_Babu wrote:

> Hi David,
>
> PFB the details of my query. Request your help in getting this  
> resolved.
>
> # TESTING is my dataset with almost 40K rows. I am importing this  
> dataset
> from my local desktop
>
> TESTING <- read.table("/Users/madana/Desktop/testing.txt",  
> header=FALSE,
>  sep="\t", na.strings="", dec=".", strip.white=TRUE)

This is where the problems start. With the default settings, any text  
column will come in as a factor.
>
> TESTING
>
> # I tried the following two ways. Let me know if i am using right  
> syntax.
>
> Lines <- readLines(textConnection(data.frame(TESTING$V1)))

You would need to instead use:

Lines <- readLines(textConnection(as.character(TESTING$V1)))
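
As a minimal sketch of why the conversion matters, here is the same sequence of steps run on a small made-up stand-in for TESTING. The sample values (records separated by ":" with comma-separated fields) are an assumption inferred from the strsplit() and read.table() calls quoted further down, not the poster's actual data, and stringsAsFactors = TRUE is set explicitly to reproduce the factor behaviour.

```r
# Hypothetical stand-in for TESTING; the real file contents were not posted.
# stringsAsFactors = TRUE is set explicitly so V1 is a factor, as in the report.
TESTING <- data.frame(V1 = c("a,1:b,2", "c,3:d,4"), stringsAsFactors = TRUE)
is.factor(TESTING$V1)

# textConnection() needs a character vector, so convert the factor first.
con <- textConnection(as.character(TESTING$V1))
Lines <- readLines(con)
close(con)

# With Lines now a character vector, the later steps run without error.
newlines  <- strsplit(Lines, ":")   # one list element per input row
newlines2 <- unlist(newlines)       # one element per ":"-separated record

con2 <- textConnection(newlines2)
cleaned_data <- read.table(con2, sep = ",")
close(con2)                         # the connection was opened by us, so close it

cleaned_data
```

Closing each connection explicitly avoids having to fall back on closeAllConnections().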

(Or you could have just read in the entire dataset with readLines  
instead of read.table.)

(Or you could have used read.table with as.is=TRUE or stringsAsFactors  
= FALSE)
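
A rough illustration of those two alternatives, assuming a tab-separated text file. The temporary file and its two sample lines are invented for the sketch and stand in for the poster's testing.txt.

```r
# Write a small hypothetical file so the sketch is self-contained.
tmp <- tempfile(fileext = ".txt")
writeLines(c("a,1:b,2", "c,3:d,4"), tmp)

# Alternative 1: keep text columns as character at import time.
TESTING <- read.table(tmp, header = FALSE, sep = "\t", na.strings = "",
                      dec = ".", strip.white = TRUE,
                      stringsAsFactors = FALSE)
is.character(TESTING$V1)

# Alternative 2: skip read.table() and read the raw lines directly.
Lines <- readLines(tmp)
is.character(Lines)

unlink(tmp)  # remove the temporary file
```

Either way the result is already a character vector, so no textConnection() round trip is needed before strsplit().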

Seekers of advice take heed. Madana_Babu violated the advice in the  
Posting Guide to include his code in his two earlier postings.  
Those of us who make efforts at offering advice are unable to read  
minds.

>
> # Error message is:
> Error in textConnection(data.frame(TESTING$V1)) : invalid 'text'  
> argument
>
> Lines <- readLines(textConnection(data.frame("TESTING", header=FALSE,
>  sep="\t", na.strings="", dec=".", strip.white=TRUE)))
>
> # Error message is:
> Error in textConnection(data.frame("TESTING", header = FALSE, sep =  
> "\t",  :
>  argument 'object' must deparse to a single character string
>
> closeAllConnections()
> newlines <- strsplit(Lines, ":")
>
> # Error message is:
> Error in strsplit(Lines, ":") : non-character argument
>
> newlines2 <- unlist(newlines)
>
>
> cleaned_data <- read.table(textConnection(newlines2), sep=",")
>
> # Error message is:
> Error in textConnection(newlines2) : invalid 'text' argument
>
> My machine Config is: Dual Core.

I doubt that makes any difference, and furthermore it does not tell me  
your OS or your version of R, which in some cases does make a  
difference, but again I think it was the default stringsAsFactors  
setting, which is a universal pitfall.
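
For reference, the OS and R version details can be collected with base R alone. A short sketch, with output omitted because it differs from machine to machine:

```r
# Collect the details worth pasting into a posting: R version, platform/OS,
# locale, and attached packages.
info <- sessionInfo()
print(info)

# Just the version string, if that is all that is wanted.
R.version.string
```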


David Winsemius, MD
West Hartford, CT


