[R] Reading very large text files into R
Dr Eberhard W Lisse
Thu Sep 29 20:23:14 CEST 2022
To me this file looks like a CSV with 15 fields on each line, not 16.
The last field is empty except on the one line that has the 'B', and
the 14th is always empty.
I also note that it does not seem to have a new line at the end.
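Both observations can be checked with standard tools before reaching for anything else. What follows is only a rough sketch: the function names field_counts and ends_with_newline are made up for illustration, and the field count assumes that no quoted field contains a comma.

```shell
# Print "<number of fields> <number of lines>" for each field count seen.
# Assumption: fields are split on every comma, so quoted commas are miscounted.
field_counts() {
    awk -F',' '{ n[NF]++ } END { for (k in n) print k, n[k] }' "$1" | sort -n
}

# Print "yes" if the last byte of the file is a newline, otherwise "no".
ends_with_newline() {
    # Command substitution strips a trailing newline, so an empty result
    # from a non-empty file means the last byte was a newline.
    if [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]; then
        echo yes
    else
        echo no
    fi
}
```

Calling field_counts sample.csv should print a single line if every row really has the same number of fields, and ends_with_newline sample.csv shows whether the final newline is missing.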
I can strongly recommend qsv to manipulate CSV files and csview to look
at them.
After renaming the file for convenience you can do something like:
qsv input --trim-fields --trim-headers sample.csv \
| qsv select -n "1,2,6,7,8,9,10" \
| qsv rename "date,c2,type,c4,c5,c6,c7" \
| csview -i5 -np0
and get something like
┌──┬────────────────┬──────┬───────┬────┬────┬──┬──┐
│# │      date      │  c2  │ type  │ c4 │ c5 │c6│c7│
├──┼────────────────┼──────┼───────┼────┼────┼──┼──┤
│1 │1980-01-01 10:00│226918│WAHRAIN│5124│1001│0 │  │
│2 │1980-01-01 10:00│228562│WAHRAIN│491 │1001│0 │  │
│3 │1980-01-01 10:00│231581│WAHRAIN│5213│1001│0 │  │
│4 │1980-01-01 10:00│232671│WAHRAIN│487 │1001│0 │  │
│5 │1980-01-01 10:00│232913│WAHRAIN│5243│1001│0 │  │
│6 │1980-01-01 10:00│234362│WAHRAIN│5265│1001│0 │  │
│7 │1980-01-01 10:00│234682│WAHRAIN│5271│1001│0 │  │
│8 │1980-01-01 10:00│235389│WAHRAIN│5279│1001│0 │  │
│9 │1980-01-01 10:00│236466│WAHRAIN│497 │1001│0 │  │
│10│1980-01-01 10:00│243350│SREW   │484 │1001│0 │  │
│11│1980-01-01 10:00│243350│WAHRAIN│484 │1001│0 │0 │
└──┴────────────────┴──────┴───────┴────┴────┴──┴──┘
As the files do not have headers, you could, if you have multiple files,
even do something like
qsv cat rows s*.csv \
| qsv input --trim-fields --trim-headers \
| qsv select -n "1,2,6,7,8,9,10" \
| qsv rename "date,c2,type,c4,c5,c6,c7" \
| qsv dedup 2>/dev/null -o readmeintoR.csv
If it really were a file with different numbers of fields per line, you
can use csvq and do something like
cat s*.csv \
| csvq --format CSV --no-header --allow-uneven-fields \
"SELECT c1 as date, c2, c6 as type, c7 as c4,
c8 as c5, c9 as c6, c10 as c7
FROM stdin" \
| qsv input --trim-fields --trim-headers \
| qsv dedup 2>/dev/null -o readmeintoR.csv
And, finally, depending on how long reading the CSV takes, I would
save it as an RDS file, which loads very quickly.
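As a rough sketch of that last step, staying on the command line: the helper name csv_to_rds is made up for illustration, and it assumes that Rscript is on the PATH and that the CSV has a header row (as readmeintoR.csv does after the rename above). read.csv(), saveRDS() and readRDS() are base R.

```shell
# Hypothetical helper: read a CSV once in R and save the data frame as RDS.
# Usage: csv_to_rds input.csv output.rds
csv_to_rds() {
    # Assumes a header row; read.csv() uses header = TRUE by default.
    Rscript -e 'a <- commandArgs(trailingOnly = TRUE); saveRDS(read.csv(a[1]), a[2])' "$1" "$2"
}
```

After csv_to_rds readmeintoR.csv readmeintoR.rds, later R sessions can load the data with readRDS("readmeintoR.rds") instead of parsing the CSV again.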
greetings, el
On 2022-09-29 17:26 , Nick Wray wrote:
> Hi Bert
>
> Right Thing is, I didn't know that there even was an instruction like
> read.csv(text = "... your text... ") so at any rate I can paste the
> original text files in by hand if there's no shorter cut
> Thanks v much Nick
[...]