[R] Read

Jeff Newmiller
Tue Feb 23 02:33:54 CET 2021


This gets it into a data frame of character columns, with short rows padded on the right with NA. If you know which columns should be numeric, you can convert them afterwards.

s <- 
"x1  x2  x3 x4
1 B22
2         C33
322 B22      D34
4                 D44
51         D53
60 D62            
"

tc <- textConnection( s )
lns <- readLines(tc)
close(tc)
if ( "" == lns[ length( lns ) ] )
  lns <- lns[ -length( lns ) ]

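# Split each line on runs of one or more spaces.
L <- strsplit( lns, " +" )
# Number of columns, taken from the header line.
nc <- length( L[[ 1 ]] )
# Pad short rows on the right with NA so that every row has nc fields.
pad <- function( v ) if ( length( v ) < nc ) c( v, rep( NA, nc - length( v ) ) ) else v
m <- do.call( rbind, lapply( L[ -1 ], pad ) )
colnames( m ) <- L[[ 1 ]]
result <- as.data.frame( m, stringsAsFactors = FALSE )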
result
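
A minimal sketch of the conversion step mentioned above, assuming that only x1 is meant to be numeric (the thread does not say which columns are). It rebuilds the same character data frame from the sample lines so that it can be run on its own:

```r
# Sample lines from the thread, one element per line of input.
lns <- c( "x1  x2  x3 x4"
        , "1 B22"
        , "2         C33"
        , "322 B22      D34"
        , "4                 D44"
        , "51         D53"
        , "60 D62            "
        )

# Same approach as above: split on runs of spaces, pad short rows with NA.
L <- strsplit( lns, " +" )
nc <- length( L[[ 1 ]] )
m <- do.call( rbind, lapply( L[ -1 ], function( v ) c( v, rep( NA, nc - length( v ) ) ) ) )
colnames( m ) <- L[[ 1 ]]
result <- as.data.frame( m, stringsAsFactors = FALSE )

# Assumption: x1 is the only numeric column; the others stay character.
result$x1 <- as.numeric( result$x1 )
str( result )
```

Note that the values are still packed to the left within each row, so this step only changes the column type, not which column a value ends up in.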

On February 22, 2021 4:42:57 PM PST, Val <valkremk using gmail.com> wrote:
>That is my problem. The spacing between columns is not consistent.  It
>may be a single space or multiple spaces (two or three).
>
>On Mon, Feb 22, 2021 at 6:14 PM Bill Dunlap <williamwdunlap using gmail.com>
>wrote:
>>
>> You said the column values were separated by space characters.
>> Copying the text from gmail shows that some column names and column
>> values are separated by single spaces (e.g., between x1 and x2) and
>> some by multiple spaces (e.g., between x3 and x4).  Did the mail mess
>> up the spacing or is there some other way to tell where the omitted
>> values are?
>>
>> -Bill
>>
>> On Mon, Feb 22, 2021 at 2:54 PM Val <valkremk using gmail.com> wrote:
>> >
>> > I tried that one and it did not work. Please see the error message:
>> > Error in read.table(text = "x1  x2  x3 x4\n1 B12 \n2       C23
>> > \n322 B32      D34 \n4            D44 \n51     D53\n60 D62        ", :
>> >   more columns than column names
>> >
>> > On Mon, Feb 22, 2021 at 5:39 PM Bill Dunlap <williamwdunlap using gmail.com> wrote:
>> > >
>> > > Since the columns in the file are separated by a space character, " ",
>> > > add the read.table argument sep=" ".
>> > >
>> > > -Bill
>> > >
>> > > On Mon, Feb 22, 2021 at 2:21 PM Val <valkremk using gmail.com> wrote:
>> > > >
>> > > > Hi all, I am trying to read messy data but am facing difficulty.  The
>> > > > data has several columns separated by blank space(s).  Each column
>> > > > value may have different lengths across the rows.  The first
>> > > > row (header) has four columns. However, each row may not have all four
>> > > > column values.  For instance, the first data row has only the first
>> > > > two column values. The fourth data row has the first and last column
>> > > > values; the second and third column values are missing for this
>> > > > row.  How do I read this data set correctly? Here is my sample data
>> > > > set, output and desired output.  To make each data point clear,
>> > > > I have added the row and column numbers. I cannot use fixed-width
>> > > > format reading because each row may have a different length for a
>> > > > given column.
>> > > >
>> > > > dat<-read.table(text="x1  x2  x3 x4
>> > > > 1 B22
>> > > > 2         C33
>> > > > 322 B22      D34
>> > > > 4                 D44
>> > > > 51         D53
>> > > > 60 D62            ",header=T, fill=T,na.strings=c("","NA"))
>> > > >
>> > > > Output
>> > > >    x1  x2   x3 x4
>> > > > 1   1 B12 <NA> NA
>> > > > 2   2 C23 <NA> NA
>> > > > 3 322 B32  D34 NA
>> > > > 4   4 D44 <NA> NA
>> > > > 5  51 D53 <NA> NA
>> > > > 6  60 D62 <NA> NA
>> > > >
>> > > >
>> > > > Desired output
>> > > >    x1   x2   x3   x4
>> > > > 1   1  B22 <NA>   NA
>> > > > 2   2 <NA>  C33   NA
>> > > > 3 322  B32   NA  D34
>> > > > 4   4 <NA>   NA  D44
>> > > > 5  51 <NA>  D53   NA
>> > > > 6  60  D62 <NA>   NA
>> > > >
>> > > > Thank you,
>> > > >
>> > > > ______________________________________________
>> > > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > > > and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.