[R] Reading Text files from UK Met Office into R again...
David Winsemius
dw|n@em|u@ @end|ng |rom comc@@t@net
Wed Oct 12 17:02:26 CEST 2022
First one needs to remove the extraneous line-ends that you created by using an editor that inserts those line-ends (or perhaps it was your mail client that added them because you failed to post in plain text). I removed those line-ends "by hand" and then created a text "file".
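If there are too many broken lines to fix by hand, the rejoining could probably be scripted. What follows is only a minimal sketch, assuming every genuine record begins with a "YYYY-MM-DD HH:MM," time stamp at the start of a line and that no wrapped fragment does. The records below carry a second time stamp further along, so a wrap landing exactly there would fool it. The `wrapped` vector is made-up sample data, not the Met Office file:

```r
# Made-up sample: two records, each broken across two lines by an editor
wrapped <- c("2015-01-01 00:00, 03002, WMO, SYNOP, 1, 12,",
             "1011, 4, 7",
             "2015-01-01 00:00, 03005, WMO, SYNOP, 1,",
             "9, 1011, 4, 1")

# A line counts as the start of a record if it opens with a time stamp (assumption)
is_start <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2},", wrapped)

# Every fragment gets the running number of the record it belongs to
record_id <- cumsum(is_start)

# Paste the fragments of each record back into a single line
rejoined <- unname(vapply(split(wrapped, record_id), paste,
                          character(1), collapse = " "))
rejoined
```

With a real file, `wrapped` would come from `readLines()`, and `rejoined` could then be passed to `read.table(text = rejoined, ...)`.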
txt <- "2015-01-01 00:00, 03002, WMO, SYNOP, 1, 12, 1011, 4, 7, 200, 18, 82, , , 8, , , , , 100, 450, 1005.4, 5, , 102, 4, , 129, , , , , , , , 8.7, 7.5, 8.1,1003.6, , , , , , , 1, 1, 1, , , 1, , , , , 1, 1, 1, 1, 1, 1, , 1, , 1, 1, , , , , , , , , , 1, , , , , 2014-12-31 23:53, 0, , , , , , , , , , , , K, , , , , 91.7, A, , , ,
2015-01-01 00:00, 03005, WMO, SYNOP, 1, 9, 1011, 4, 1, 210, 26, 62, 8, 6, ,8, 8, , , 8, 30, 700, 1006, 1, 8, 54, 7, 6, 105, , , , , , , , 8.6, 7.3, 8, 996.1, , 01, , , , , 1, 1, 1, 1, 1, 1, 1, , , 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, , , , , , , , 1, , , , , 2014-12-31 23:55, 0, , , , , , , , , , , , K, , , , , 91.7, A, , , 0, 1
2015-01-01 00:00, 03006, WMO, SYNOP, 1, 10, 1011, 4, 6, 210, 23, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 1, 1, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , 2014-12-31 23:53, 0, , , , , , , , , , , , , , , , , , A, , , ,
2015-01-01 00:00, 03010, WMO, SYNOP, 1, 17, 1011, 4, 6, 230, 21, , , , , , , , , , , 1006.1, , , , , , , , , , , , , , 9.4, 6.2, 7.9, , , , , , , , 1, 1, , , , , , , , , , , 1, 1, 1, 1, , , , , , , , , , , , , , , , , , , ,"
# Then use `count.fields`
count.fields(file=textConnection(txt))
[1] 104 106 105 81
# So I'm guessing you arbitrarily snipped in the middle of one of the text lines
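One thing to keep in mind: `count.fields` splits on white space unless `sep` is given, so for comma-separated input its default counts need not match what `read.table(sep=",")` sees. A minimal sketch of the difference, using a made-up three-line sample (`demo` is hypothetical, not the data above):

```r
# Made-up sample: comma-separated, with a space inside the first field
demo <- c("2015-01-01 00:00, 03002, WMO",
          "2015-01-01 00:00, 03005, WMO, 1",
          "2015-01-01 00:00, 03006")

# Default sep = "" counts white-space-delimited tokens
con <- textConnection(demo)
n_ws <- count.fields(con)
close(con)

# sep = "," counts comma-delimited fields, as read.table(sep = ",") would
con <- textConnection(demo)
n_comma <- count.fields(con, sep = ",")
close(con)

# Side-by-side comparison, one row per input line
data.frame(line = seq_along(demo), n_ws, n_comma)
```

Either way, unequal counts across lines are the signal that some records are short or were broken, which is why `fill=TRUE` is needed in the call that follows.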
dat <- read.table(text=txt, sep=",", fill=TRUE, row.names=NULL, header=FALSE)
str(dat)
'data.frame': 4 obs. of 105 variables:
$ V1 : chr "2015-01-01 00:00" "2015-01-01 00:00" "2015-01-01 00:00" "2015-01-01 00:00"
$ V2 : int 3002 3005 3006 3010
$ V3 : chr " WMO" " WMO" " WMO" " WMO"
$ V4 : chr " SYNOP" " SYNOP" " SYNOP" " SYNOP"
$ V5 : int 1 1 1 1
$ V6 : int 12 9 10 17
$ V7 : int 1011 1011 1011 1011
$ V8 : int 4 4 4 4
$ V9 : int 7 1 6 6
$ V10 : int 200 210 210 230
$ V11 : int 18 26 23 21
$ V12 : int 82 62 NA NA
$ V13 : int NA 8 NA NA
$ V14 : int NA 6 NA NA
$ V15 : int 8 NA NA NA
$ V16 : int NA 8 NA NA
$ V17 : int NA 8 NA NA
$ V18 : logi NA NA NA NA
$ V19 : logi NA NA NA NA
$ V20 : int 100 8 NA NA
#snipped about 80 lines .......
$ V99 : num 91.7 NA NA NA
[list output truncated]
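Two details in that `str` output may be worth tidying: the station identifier "03002" was read as the integer 3002, losing its leading zero, and the character fields kept the blank that follows each comma (" WMO"). A minimal sketch of one way to handle both, using `strip.white` and `colClasses`. The five-column `demo` text is made up for illustration, and treating the time stamps as UTC is an assumption:

```r
# Made-up sample in the same comma-plus-blank layout, cut down to five columns
demo <- "2015-01-01 00:00, 03002, WMO, 12, 8.7
2015-01-01 00:00, 03005, WMO, 9,
2015-01-01 00:00, 03006, WMO, , 8.6"

dat2 <- read.table(text = demo, sep = ",", header = FALSE, fill = TRUE,
                   strip.white = TRUE,   # drop blanks around character fields
                   colClasses = c("character", "character", "character",
                                  "integer", "numeric"))

# Convert the first column to date-times (time zone is an assumption)
dat2$V1 <- as.POSIXct(dat2$V1, format = "%Y-%m-%d %H:%M", tz = "UTC")
str(dat2)
```

For the full file one would need a `colClasses` entry for every column, or a single `"character"` to read everything as text and convert selected columns afterwards.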
ALWAYS use a programming editor and always post in plain-text.
-- David.
> On Oct 9, 2022, at 4:50 PM, Ivan Krylov <krylov.r00t using gmail.com> wrote:
>
> On Sun, 9 Oct 2022 12:01:27 +0100
> Nick Wray <nickmwray using gmail.com> wrote:
>
>> Error in read.table("midas_wxhrly_201501-201512.txt", fill = T) :
>> duplicate 'row.names' are not allowed
>
> Since you don't pass the `header` argument, I think that automatic
> header detection is at play here. This is what ?read.table has to say
> about row names:
>
>>> If there is a header and the first row contains one fewer field than
>>> the number of columns, the first column in the input is used for the
>>> row names. Otherwise if ‘row.names’ is missing, the rows are
>>> numbered.
>
> Perhaps the "one fewer field in the header than the number of columns"
> condition is true for files after 2010? I'm too lazy to sign up for a
> CEDA account and I'm not sure I'd be given access to hourly datasets
> anyway.
>
> If this is the reason for the failure (first column used as rownames()
> and turns out to be non-unique), there's an easy way to fix that:
>
>>> Using ‘row.names = NULL’ forces row numbering.
>
> I don't see a header in your example. If there's actually no header
> containing column names, passing `header = FALSE` will both prevent the
> error and avoid eating the first line of the file.
>
> --
> Best regards,
> Ivan
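The header-detection behaviour Ivan describes can be reproduced on a small scale. This is only a sketch with made-up data (`demo` is hypothetical): the first line has one field fewer than the later ones, so `read.table` takes it for a header and then tries to use the non-unique first column as row names.

```r
# Made-up sample: first line has 3 comma-separated fields, the others have 4
demo <- "2015-01-01 00:00, 03002, 12
2015-01-01 00:00, 03005, 9, 1
2015-01-01 00:00, 03006, 10, 1"

# No `header` argument: the short first line is mistaken for a header
msg <- tryCatch(read.table(text = demo, sep = ",", fill = TRUE),
                error = function(e) conditionMessage(e))
msg

# row.names = NULL avoids the error, but the first line is still used as header
numbered <- read.table(text = demo, sep = ",", fill = TRUE, row.names = NULL)
nrow(numbered)

# header = FALSE keeps every line as data; the short line is padded with NA
ok <- read.table(text = demo, sep = ",", fill = TRUE, header = FALSE)
dim(ok)
```

Whether this is what happens with the actual Met Office file depends on the field counts in its first few lines, which is what `count.fields` is useful for checking.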