[R] reading fixed width format data with 2 types of lines

Denis Chabot chabotd at globetrotter.net
Thu Aug 12 19:57:19 CEST 2010


Hi,

I know how to read fixed-width format data with read.fwf, but I now need to read a large number of old fwf files that contain 2 types of lines. Lines that begin with "3" in the first column carry one set of variables, and lines that begin with "4" carry another set, like this:

…
3A00206546L070049016090045    99  1015002      001001008010004002004007003   001
3A00206546L070049006090030    99  1029001002001001006014002                     
3A00206546L070049002290004    99  1015            001001                        
3A00206546L070049001692559049033  1015                                 018036024
3A00206546L070049002290004    99  1001                                       002
4A00176546L068047090010111000606516400150010000001501063   065914               
4A00176546L06804709001011100040761600000000         1092   095614               
4A00196546L098000100010111001706214400005010000000051062   065914               
4A00176546L06804709001011100050591300000000         1062   065914               
4A00196546L098000100010111002604721400020010000000201042   046114               
4A00196546L098000100010111002504221400005012000000051042   046114               
4A00196546L098000100010111002903721400050012200000501032   036214               
…

I have searched for tricks to do this, but I must not have used the right keywords: I found nothing.

I suppose I could read the entire file with each line as a single character string, keep the lines that begin with 3, save them in an ASCII file that is then reopened with a read.fwf call, and do the same for the lines that begin with 4. But this seems to me neither very elegant nor efficient. Is there a better method?
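
To make the idea concrete, here is a rough sketch of it that skips the intermediate files by handing each subset to read.fwf through a textConnection. The function name read_record_type and the widths c(1, 10, 69) are placeholders made up for illustration, not the real column layout. The two sample lines are copied from the extract above (the second one without its trailing blanks).

```r
# Sketch only: the widths below are hypothetical, NOT the real layout of
# these files. Replace them with the true field widths for each record type.
read_record_type <- function(lines, type, widths) {
  # Keep only the lines whose first character is the requested record type.
  keep <- lines[substr(lines, 1, 1) == type]
  if (length(keep) == 0) return(NULL)   # no record of this type in the input
  con <- textConnection(keep)           # in-memory "file", nothing on disk
  on.exit(close(con))
  # Read everything as character so that leading zeros are preserved.
  read.fwf(con, widths = widths, colClasses = "character")
}

# For a real file this would be: all_lines <- readLines("some_file")
all_lines <- c(
  "3A00206546L070049016090045    99  1015002      001001008010004002004007003   001",
  "4A00176546L068047090010111000606516400150010000001501063   065914"
)

widths3 <- c(1, 10, 69)   # hypothetical split of an 80-character type "3" line
widths4 <- c(1, 10, 69)   # hypothetical split of an 80-character type "4" line

type3 <- read_record_type(all_lines, "3", widths3)
type4 <- read_record_type(all_lines, "4", widths4)
```

This gives one data frame per record type, each with its own width vector, but it still reads the whole file into memory first, so it may not suit very large files.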

Thanks in advance,


Denis Chabot

