[R] Gobbling up a repeating, irregular list of data
MacQueen, Don
macqueen1 at llnl.gov
Fri Nov 11 16:58:44 CET 2016
Like Peter, I too will assume that all the white space consists of space
characters, not tabs.
In that case, I would probably start with read.fwf().
I would expect that to get me a data frame with lots of NA in the first
four columns. Then (also like Peter says) you'll have to figure out how to
fill the empty cells.
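
A minimal sketch of the read.fwf() idea, in case it helps. The exact column widths of the model output are not recoverable from the message, so the layout below (six fields of 10 characters each) and the sample lines built with sprintf() are assumptions made only for illustration; substitute the real widths.

```r
# Hypothetical fixed-width layout: six fields, 10 characters each.
# These lines imitate the model output; the real widths may differ.
lines <- c(
  sprintf("%10d%10.2E%10.2E%10.2E%10.3E%10.2E", 1L, 1, 1240, 7.79, 0.1925, 0.188),
  sprintf("%40s%10.3E%10.2E", "", 0.3850, 0.188),
  sprintf("%40s%10.3E%10.2E", "", 0.5775, 0.188)
)

# Blank fields in the first four columns come back as NA.
fw <- read.fwf(textConnection(lines), widths = rep(10, 6))
print(fw)
```

The continuation lines keep their two values in columns 5 and 6, so only the first four columns need filling afterwards.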
By the way, I wouldn't worry too much about using "bad form." If it works,
is reasonably easy for someone else looking at your code to understand
(or for you to understand 5 years from now), and runs fast enough,
that's good enough. But I do appreciate the satisfaction of doing
something "the R way."
Here's another way:
dat <- scan(textConnection(" 1 1.00E+00 1.24E+03 7.79E+00 1.925E-01 1.88E-01
3.850E-01 1.88E-01
5.775E-01 1.88E-01
7.700E-01 1.88E-01
9.626E-01 1.88E-01
1.155E+00 1.88E-01
1.347E+00 1.88E-01
1 2.00E+00 1.26E+03 7.80E+00 1.925E-01 2.80E-01
1.732E+00 2.80E-01
1.925E+00 2.80E-01
2.310E+00 2.93E-01
2.502E+00 2.22E-01
2.695E+00 1.88E-01
2.887E+00 1.88E-01
1 3.00E+00 1.28E+03 7.70E+00 1.925E-01 1.03E-01
3.850E-01 1.30E-01
5.775E-01 1.48E-01
7.701E-01 1.61E-01
9.626E-01 1.72E-01
1.155E+00 1.86E-01
1.347E+00 1.93E-01
1 4.00E+00 1.29E+03 7.60E+00 1.901E-01 1.80E-01
3.803E-01 1.80E-01
5.705E-01 1.38E-01
7.607E-01 1.32E-01
2.282E+00 1.86E-01
2.472E+00 1.98E-01
2.662E+00 2.00E-01"),
what=list(0,0,0,0,0,0),fill=TRUE
)
datf <- do.call(cbind, dat)
Then in datf you just have to move the first 2 columns over to be the last
two, in rows where there are missing values, and then fill in the missing
values in the first four columns from the non-missing values above them.
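
Those two steps could be sketched roughly as follows. To keep the example self-contained it uses a small stand-in for datf (a few rows taken from the data above) rather than the full scan() result, it assumes the very first row is a complete 6-value row, and the column names are hypothetical labels, not anything produced by the model.

```r
# Stand-in for do.call(cbind, dat): full rows have 6 values,
# short rows have 2 values followed by the NA padding from fill=TRUE.
datf <- rbind(
  c(1, 1.00, 1240, 7.79, 0.1925, 0.188),
  c(0.3850, 0.188, NA, NA, NA, NA),
  c(0.5775, 0.188, NA, NA, NA, NA),
  c(1, 2.00, 1260, 7.80, 0.1925, 0.280),
  c(1.732, 0.280, NA, NA, NA, NA)
)

# Short rows are the ones that were padded, i.e. NA in column 3.
short <- is.na(datf[, 3])

# Step 1: move the two values of each short row into columns 5 and 6.
datf[short, 5:6] <- datf[short, 1:2]
datf[short, 1:4] <- NA

# Step 2: fill columns 1 to 4 from the nearest full row above.
# cumsum(!short) numbers the blocks; which(!short) locates each block's
# first (full) row. Assumes row 1 is a full row.
header_row <- which(!short)[cumsum(!short)]
datf[, 1:4] <- datf[header_row, 1:4]

# Hypothetical column names, for readability only.
colnames(datf) <- c("flag", "step", "depth", "thickness", "z", "wc")
print(datf)
```

After this, every row carries its time step, depth and thickness alongside the depth/water content pair, and split(as.data.frame(datf), datf[, "step"]) would give one piece per time step if a list is wanted.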
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
On 11/10/16, 8:26 PM, "R-help on behalf of Morway, Eric"
<r-help-bounces at r-project.org on behalf of emorway at usgs.gov> wrote:
>What would be the sophisticated R method for reading the data shown below
>into a list? The data is output from a numerical model. Pasting the
>second block of example R commands (at the end of the message) results in
>a failure ("Error in scan...line 2 did not have 6 elements"). I no doubt
>could cobble together some script for reading line-by-line using for
>loops, and then appending vectors with values from each line, but this
>strikes me as bad form.
>
>One final note, the lines with 6 values contain important values that
>should somehow remain associated with the data appearing in columns 5 & 6
>(the continuous data). The first value, which is always 1, can be
>discarded, but the second value on these lines contains the time step
>number ("1.00E+00", "2.00E+00", etc.), and the 3rd and 4th values contain
>a depth and thickness, respectively. Columns 5 & 6 are a depth and water
>content pairing and should be associated with the time steps.
>
>Thanks, Eric
>
>Start of example output data (Use of an R script to read in this data
>below)
>
> 1 1.00E+00 1.24E+03 7.79E+00 1.925E-01 1.88E-01
> 3.850E-01 1.88E-01
> 5.775E-01 1.88E-01
> 7.700E-01 1.88E-01
> 9.626E-01 1.88E-01
> 1.155E+00 1.88E-01
> 1.347E+00 1.88E-01
> 1 2.00E+00 1.26E+03 7.80E+00 1.925E-01 2.80E-01
> 1.732E+00 2.80E-01
> 1.925E+00 2.80E-01
> 2.310E+00 2.93E-01
> 2.502E+00 2.22E-01
> 2.695E+00 1.88E-01
> 2.887E+00 1.88E-01
> 1 3.00E+00 1.28E+03 7.70E+00 1.925E-01 1.03E-01
> 3.850E-01 1.30E-01
> 5.775E-01 1.48E-01
> 7.701E-01 1.61E-01
> 9.626E-01 1.72E-01
> 1.155E+00 1.86E-01
> 1.347E+00 1.93E-01
> 1 4.00E+00 1.29E+03 7.60E+00 1.901E-01 1.80E-01
> 3.803E-01 1.80E-01
> 5.705E-01 1.38E-01
> 7.607E-01 1.32E-01
> 2.282E+00 1.86E-01
> 2.472E+00 1.98E-01
> 2.662E+00 2.00E-01
>
>Same data as above, but scan function fails.
>
>dat <- read.table(textConnection(" 1 1.00E+00 1.24E+03 7.79E+00 1.925E-01 1.88E-01
> 3.850E-01 1.88E-01
> 5.775E-01 1.88E-01
> 7.700E-01 1.88E-01
> 9.626E-01 1.88E-01
> 1.155E+00 1.88E-01
> 1.347E+00 1.88E-01
> 1 2.00E+00 1.26E+03 7.80E+00 1.925E-01 2.80E-01
> 1.732E+00 2.80E-01
> 1.925E+00 2.80E-01
> 2.310E+00 2.93E-01
> 2.502E+00 2.22E-01
> 2.695E+00 1.88E-01
> 2.887E+00 1.88E-01
> 1 3.00E+00 1.28E+03 7.70E+00 1.925E-01 1.03E-01
> 3.850E-01 1.30E-01
> 5.775E-01 1.48E-01
> 7.701E-01 1.61E-01
> 9.626E-01 1.72E-01
> 1.155E+00 1.86E-01
> 1.347E+00 1.93E-01
> 1 4.00E+00 1.29E+03 7.60E+00 1.901E-01 1.80E-01
> 3.803E-01 1.80E-01
> 5.705E-01 1.38E-01
> 7.607E-01 1.32E-01
> 2.282E+00 1.86E-01
> 2.472E+00 1.98E-01
> 2.662E+00 2.00E-01"),header=FALSE)