[R] Unwanted Levels in R

MATT BORKOWSKI mpb170 at psu.edu
Tue May 21 16:15:07 CEST 2002


To clarify:
The lines beginning with A, B, C, D, and E are part of a header.  Below each
header are the lines containing the values that correspond to it.  The problem
is that a number of data sets are combined in one file, so the header repeats
after a varying number of data lines.  Would it solve the problem to simply treat
the lines that begin with A, B, C, D, or E differently?  If so, how do they need
to be treated?  I've copied a bit more of the data below to show more clearly
how the data is arranged within the file.

A  900003024 ODEN     SWEDEN          ODEN91          NSIDC.ORG/PROJE 
B     900003     -9  1 NAN OBS         0
C 1991  9  7 13 -9 XX   90.0000     .0000 XX
D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV
E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000
   25.0   25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000  8.630  
   50.0   50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000  8.280  
   89.0   90.0 -1.6550 -1.6566 32.8530 26.4320 35.8807 44.9043 -9.0000  7.430  
   109.0  110.3 -1.5420 -1.5444 33.8830 27.2659 36.6893 45.6886 -9.0000  7.360 
...
...
...
A  900002034 LOUIS ST: LAURENT   UNITED STATES   AO1994   NSIDC.ORG/PROJE 
B     900002         -9  1 NAN OBS         0
C 1994  8 20 22 -9 XX   89.0167  137.1517 XX
D    36   13.0   13.1 4075.0 4159.4 4075.0 Z 13  0 LASTLE
E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 
  13.0   13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000  8.580 

Matt


On Tue, 21 May 2002 15:46:58 +0200, Peter Dalgaard BSA <p.dalgaard at biostat.ku.dk> wrote:

> "MATT BORKOWSKI" <mpb170 at psu.edu> writes:
> 
> > is there any way to overcome it?  Here are a few lines of the data I'm 
> > attempting to read in:
> > 
> > A  900003024 ODEN   SWEDEN  ODEN91 NSIDC.ORG/PROJE 
> > B     900003         -9  1 NAN OBS         0
> > C 1991  9  7 13 -9 XX   90.0000     .0000 XX
> > D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV
> > E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 
> >    10.0   10.1 -1.6970 -1.6971 31.4940 25.3287 34.8044 43.8535 -9.0000 
> > 
> > Here are the commands I have tried using to read in the data:
> > 
> > >alldata <- read.table("/home/mattb/xxx.dat", fill = TRUE, quote = "")
> > 
> > >alldata <- as.list(read.table("/home/mattb/xxx.dat", fill = TRUE, quote = ""))
> 
> As far as I can see, there is no connection between values in the same
> position in different lines? If so, trying to make a data frame out of
> the file is simply inappropriate and you should rather use readLines
> and postprocess the lines according to whatever logic they are
> supposed to obey.
> 
> -- 
>    O__  ---- Peter Dalgaard             Blegdamsvej 3  
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
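
For what it's worth, the readLines-and-postprocess suggestion could be sketched
roughly as below.  This is only an illustration under assumptions not confirmed
in the thread: that every data set starts with a line beginning with "A", that
header lines always begin with one of the letters A to E, and that data lines
never do.  The names block, parse_block and datasets are made up for the
example, and the sample lines stand in for readLines("/home/mattb/xxx.dat").

```r
# Hypothetical sample in the layout shown in this thread; with the real file
# this would be: lines <- readLines("/home/mattb/xxx.dat")
lines <- c(
  "A  900003024 ODEN     SWEDEN          ODEN91          NSIDC.ORG/PROJE",
  "B     900003     -9  1 NAN OBS         0",
  "C 1991  9  7 13 -9 XX   90.0000     .0000 XX",
  "D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "   25.0   25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000  8.630",
  "   50.0   50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000  8.280",
  "A  900002034 LOUIS ST: LAURENT   UNITED STATES   AO1994   NSIDC.ORG/PROJE",
  "B     900002         -9  1 NAN OBS         0",
  "C 1994  8 20 22 -9 XX   89.0167  137.1517 XX",
  "D    36   13.0   13.1 4075.0 4159.4 4075.0 Z 13  0 LASTLE",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "  13.0   13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000  8.580"
)

# Drop lines that contain only whitespace.
lines <- lines[grepl("[^[:space:]]", lines)]

# Assumption: each line starting with "A" opens a new data set, so a running
# count of "A" lines labels every line with the data set it belongs to.
block <- cumsum(grepl("^A", lines))

# Split one data set into its header lines (kept as raw text) and its
# numeric data lines (parsed into a data frame).
parse_block <- function(x) {
  hdr <- grepl("^[A-E]", x)
  dat <- if (any(!hdr)) {
    read.table(text = x[!hdr], header = FALSE, fill = TRUE)
  } else {
    NULL  # a header with no data lines after it
  }
  list(header = x[hdr], data = dat)
}

datasets <- lapply(split(lines, block), parse_block)
```

Each element of datasets then holds the five header lines as plain strings and
the numeric lines as their own data frame, e.g. datasets[[1]]$data.  Because the
header lines are never passed to read.table, their text fields cannot turn data
columns into factors.  Pulling individual fields out of the A to E lines would
need the format specification, which is not given here.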




