[R] Read
Jeff Newmiller
jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Tue Feb 23 02:33:54 CET 2021
This gets it into a data frame of character columns, padding short rows with NA on the right. If you know which columns should be numeric, you can convert them.
s <-
"x1 x2 x3 x4
1 B22
2 C33
322 B22 D34
4 D44
51 D53
60 D62
"
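# Read the text line by line.
tc <- textConnection( s )
lns <- readLines( tc )
close( tc )
# Drop a trailing empty line, if there is one.
if ( "" == lns[ length( lns ) ] )
  lns <- lns[ -length( lns ) ]
# Split each line on runs of one or more spaces.
L <- strsplit( lns, " +" )
# Pad short rows on the right with NA so that every row has as many
# fields as the header, then stack the rows into a character matrix.
n <- length( L[[1]] )
m <- do.call( rbind, lapply( L[-1], function(v) {
  if ( length(v) < n ) c( v, rep( NA, n - length(v) ) ) else v
} ) )
colnames( m ) <- L[[1]]
result <- as.data.frame( m, stringsAsFactors = FALSE )
result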
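
A minimal sketch of the conversion step mentioned above, assuming only x1 is meant to be numeric. The data frame here is a stand-in typed out by hand so the snippet runs on its own, and the names numeric_cols, converted and guessed are illustrative only. The type.convert() call relies on its data frame method, which I believe needs R 3.5.0 or later.

```r
# Stand-in for the `result` built above: every column is character,
# because as.data.frame() was applied to a character matrix.
result <- data.frame(
  x1 = c("1", "2", "322", "4", "51", "60"),
  x2 = c("B22", "C33", "B22", "D44", "D53", "D62"),
  x3 = c(NA, NA, "D34", NA, NA, NA),
  x4 = NA_character_,
  stringsAsFactors = FALSE
)

# Option 1: convert only the columns you know are numeric.
# Assumption: x1 is the only such column.
numeric_cols <- c("x1")
converted <- result
converted[numeric_cols] <- lapply( converted[numeric_cols], as.numeric )

# Option 2: let R guess each column's type.
# as.is = TRUE keeps text columns as character instead of factor.
guessed <- type.convert( result, as.is = TRUE )

str( converted )
str( guessed )
```

Option 1 is the safer choice when a column might contain stray text, since as.numeric() turns anything unparseable into NA with a warning instead of silently leaving the column as character.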
On February 22, 2021 4:42:57 PM PST, Val <valkremk using gmail.com> wrote:
>That is my problem. The spacing between columns is not consistent. It
> may be single space or multiple spaces (two or three).
>
>On Mon, Feb 22, 2021 at 6:14 PM Bill Dunlap <williamwdunlap using gmail.com>
>wrote:
>>
>> You said the column values were separated by space characters.
>> Copying the text from gmail shows that some column names and column
>> values are separated by single spaces (e.g., between x1 and x2) and
>> some by multiple spaces (e.g., between x3 and x4). Did the mail mess
>> up the spacing or is there some other way to tell where the omitted
>> values are?
>>
>> -Bill
>>
>> On Mon, Feb 22, 2021 at 2:54 PM Val <valkremk using gmail.com> wrote:
>> >
>> > I tried that one and it did not work. Please see the error message:
>> > Error in read.table(text = "x1 x2 x3 x4\n1 B12 \n2 C23
>> > \n322 B32 D34 \n4 D44 \n51 D53\n60 D62 ",
>> > : more columns than column names
>> >
>> > On Mon, Feb 22, 2021 at 5:39 PM Bill Dunlap <williamwdunlap using gmail.com> wrote:
>> > >
>> > > Since the columns in the file are separated by a space character, " ",
>> > > add the read.table argument sep=" ".
>> > >
>> > > -Bill
>> > >
>> > > On Mon, Feb 22, 2021 at 2:21 PM Val <valkremk using gmail.com> wrote:
>> > > >
>> > > > Hi all, I am trying to read messy data but am facing difficulty.
>> > > > The data has several columns separated by blank space(s). Each
>> > > > column value may have different lengths across the rows. The first
>> > > > row (header) has four columns. However, each row may not have all
>> > > > four column values. For instance, the first data row has only the
>> > > > first two column values. The fourth data row has the first and last
>> > > > column values; the second and third column values are missing for
>> > > > this row. How do I read this data set correctly? Here is my sample
>> > > > data set, output and desired output. To make each data point clear,
>> > > > I have added the row and column numbers. I cannot use fixed-width
>> > > > format reading because each row may have a different length for a
>> > > > given column.
>> > > >
>> > > > dat<-read.table(text="x1 x2 x3 x4
>> > > > 1 B22
>> > > > 2 C33
>> > > > 322 B22 D34
>> > > > 4 D44
>> > > > 51 D53
>> > > > 60 D62 ",header=T, fill=T,na.strings=c("","NA"))
>> > > >
>> > > > Output
>> > > >    x1  x2   x3 x4
>> > > > 1   1 B12 <NA> NA
>> > > > 2   2 C23 <NA> NA
>> > > > 3 322 B32  D34 NA
>> > > > 4   4 D44 <NA> NA
>> > > > 5  51 D53 <NA> NA
>> > > > 6  60 D62 <NA> NA
>> > > >
>> > > >
>> > > > Desired output
>> > > >    x1   x2   x3  x4
>> > > > 1   1  B22 <NA>  NA
>> > > > 2   2 <NA>  C33  NA
>> > > > 3 322  B32   NA D34
>> > > > 4   4 <NA>   NA D44
>> > > > 5  51 <NA>  D53  NA
>> > > > 6  60  D62 <NA>  NA
>> > > >
>> > > > Thank you,
>> > > >
>> > > > ______________________________________________
>> > > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > > > https://stat.ethz.ch/mailman/listinfo/r-help
>> > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > > > and provide commented, minimal, self-contained, reproducible code.
--
Sent from my phone. Please excuse my brevity.