[R] Unwanted Levels in R

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Tue May 21 16:34:53 CEST 2002


"MATT BORKOWSKI" <mpb170 at psu.edu> writes:

> To clarify:
> The lines beginning with A,B,C,D,E are part of a header file.  Below the header
> are lines that contain values that correspond.  The problem is that there are 
> a number of data sets combined, so the header randomly repeats after a
> varying number of data lines.  Would it solve the problem to simply treat the lines
> that begin with A, B, C, D, or E differently?  If so, how do they need to be treated?
> I've copied a bit more of the data below to demonstrate more clearly how the 
> data is arranged within the file.
> 
> A  900003024 ODEN     SWEDEN          ODEN91          NSIDC.ORG/PROJE 
> B     900003     -9  1 NAN OBS         0
> C 1991  9  7 13 -9 XX   90.0000     .0000 XX
> D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV
> E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000
>    25.0   25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000  8.630  
>    50.0   50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000  8.280  
>    89.0   90.0 -1.6550 -1.6566 32.8530 26.4320 35.8807 44.9043 -9.0000  7.430  
>    109.0  110.3 -1.5420 -1.5444 33.8830 27.2659 36.6893 45.6886 -9.0000  7.360 
> ...
> ...
> ...
> A  900002034 LOUIS ST: LAURENT   UNITED STATES   AO1994   NSIDC.ORG/PROJE 
> B     900002         -9  1 NAN OBS         0
> C 1994  8 20 22 -9 XX   89.0167  137.1517 XX
> D    36   13.0   13.1 4075.0 4159.4 4075.0 Z 13  0 LASTLE
> E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 
>   13.0   13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000  8.580 

Hmm. If you're on a Unix(-like) system, I suggest you preprocess with 
grep -v "^[A-E]", which drops every line starting with A through E. On
Windows, you could fetch the grep program and do likewise (there is one in
http://www.stats.ox.ac.uk/pub/Rtools/tools.zip).

In pure R, I suppose a combination of readLines(), grep(),
writeLines() (to a temp file) and read.table() would do the trick.
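
A minimal sketch of that pure-R route, for illustration only. The
variable names (raw, infile, cleaned, dat) are made up, and the sample
lines are copied from the quoted message and written to a temporary file
to stand in for the real data file, whose name is not given. It assumes
that every header line starts with one of the letters A to E and that
every data line starts with white space.

```r
# Stand-in for the real file: a few lines copied from the quoted message.
raw <- c(
  "A  900003024 ODEN     SWEDEN          ODEN91          NSIDC.ORG/PROJE",
  "B     900003     -9  1 NAN OBS         0",
  "C 1991  9  7 13 -9 XX   90.0000     .0000 XX",
  "D    36   10.0   10.1 4183.0 4270.7 4219.0 Z 13  0 OBSERV",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "   25.0   25.3 -1.7050 -1.7054 31.4970 25.3313 34.8074 43.8571 -9.0000  8.630",
  "   50.0   50.6 -1.7400 -1.7408 32.3660 26.0382 35.5010 44.5377 -9.0000  8.280",
  "A  900002034 LOUIS ST: LAURENT   UNITED STATES   AO1994   NSIDC.ORG/PROJE",
  "B     900002         -9  1 NAN OBS         0",
  "C 1994  8 20 22 -9 XX   89.0167  137.1517 XX",
  "D    36   13.0   13.1 4075.0 4159.4 4075.0 Z 13  0 LASTLE",
  "E    -9.0   -9.0 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000 -9.0000",
  "  13.0   13.1 -1.7650 -1.7652 32.9160 26.4856 35.9403 44.9690 -9.0000  8.580"
)
infile <- tempfile()          # hypothetical input file
writeLines(raw, infile)

# Read all lines, then keep only those that do NOT start with A-E.
# invert = TRUE with value = TRUE returns the non-matching lines
# themselves, and is safe even when no header line is present.
all_lines  <- readLines(infile)
data_lines <- grep("^[A-E]", all_lines, invert = TRUE, value = TRUE)

# Write the data lines to a temp file and read them as a table.
cleaned <- tempfile()
writeLines(data_lines, cleaned)
dat <- read.table(cleaned, header = FALSE)

str(dat)
```

With the headers gone, every column is read as numeric, so no unwanted
factor levels appear. In current versions of R, read.table(text =
data_lines) should make the second temp file unnecessary.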

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907