[R] NAs introduced by coercion warning?
Sundar Dorai-Raj
sundar.dorai-raj at pdf.com
Thu Feb 19 18:02:07 CET 2004
Jonathan,
It's still hard to tell. Try this:
options(warn = 1) # see ?options for explanation
## RUN YOUR CODE
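As a rough illustration of what that setting changes (the function and inputs below are hypothetical stand-ins, not your code): with warn = 1 each warning is printed as soon as it occurs instead of being collected until the end, so it shows up next to whatever progress output your loop prints. Setting warn = 2 goes one step further and turns the first warning into an error, which lets you inspect the failing call.

```r
## Hypothetical stand-in for one step of a long-running loop
convert <- function(x) as.numeric(x)

options(warn = 1)              # print each warning as soon as it occurs
for (v in c("1", "A", "3")) {
  cat("processing", v, "\n")
  convert(v)                   # the warning for "A" follows its "processing" line
}

## Alternatively, turn warnings into errors so the run stops at the first one
old <- options(warn = 2)
res <- try(convert("A"), silent = TRUE)
inherits(res, "try-error")     # TRUE: the warning was converted to an error
options(old)                   # restore the previous setting
```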
Regards,
Sundar
Jonathan Greenberg wrote:
> It's hard for me to pinpoint where this is happening, since I'm working on an
> image that's about 10000 x 20000 pixels, and 12 bands deep and I'm using a
> set of for-next loops to pull out subsections of data. I can guarantee the
> input values are all floating point values.
>
> To be more specific, I have created a classification tree, and I want to
> apply it to that large floating point image (all the band names match up)
> and write the prediction (probability) values to a file. What happens if a
> decision tree tries to classify a set of input values that are completely
> outside of the range of the input tree?
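One way to explore this question is a small experiment. The sketch below uses the built-in iris data rather than the image data, and assumes the tree package is installed; the data frame "extreme" is made up for illustration. Each split compares a predictor with a threshold, so a finite value far outside the training range still falls on one side of every split and ends up in a leaf.

```r
library(tree)

## Small classification tree on a built-in data set (not the image data)
fit <- tree(Species ~ Petal.Length + Petal.Width, data = iris)

## Made-up rows far outside the range of the training data
extreme <- data.frame(Petal.Length = c(-1000, 0, 1000),
                      Petal.Width  = c(-1000, 0, 1000))

## Class probabilities: one row per input row, one column per class
p <- predict(fit, extreme, type = "vector")
p
rowSums(p)   # each row of probabilities sums to 1
```

This only covers finite out-of-range values; missing or non-finite inputs are a separate case and are not shown here.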
>
> Here's the code I was using. I should mention that this worked on a small
> subset (400 x 400 pixels) that wouldn't have any "weird" values (negative or
> zero). The output file from this is turning out to be slightly smaller than
> it should be given the samples, lines, bands and number type, which is why I'm
> wondering if the tree is simply dropping those "bad" values rather than
> giving them some value (e.g. 0):
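A quick way to quantify "slightly smaller" is to compare the size on disk with the size implied by the dimensions. The sketch below is only an illustration with a tiny temporary file; the helper expected_bytes and the numbers are hypothetical, and it assumes one 2-byte value is written per pixel per class, as in the writeBin(..., size=2) call further down.

```r
## Hypothetical helper: bytes expected for npixels rows of nclasses values
expected_bytes <- function(npixels, nclasses, bytes_per_value = 2) {
  npixels * nclasses * bytes_per_value
}

## Tiny stand-in for the output file: 2 pixels x 2 classes
tmp <- tempfile()
con <- file(tmp, open = "wb")
vals <- as.integer(round(c(0.25, 0.75, 1, 0) * 10000))
writeBin(vals, con, size = 2)
close(con)

actual <- file.info(tmp)$size
actual == expected_bytes(npixels = 2, nclasses = 2)   # TRUE when nothing is missing
```

For the real output, file.info(outfile)$size divided by (number of classes * 2) would give the number of pixels actually written, which can then be compared with samples * lines.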
>
> ## Creating the tree
> library(tree)
> bands=12
> bandnames<-paste(c("B"),1:bands,sep="")
> treetraindata=read.csv("classtrainshad040205.csv",header=TRUE)
> names(treetraindata)[2:6]<-bandnames[1:5]
> names(treetraindata)[8:14]<-bandnames[6:12]
> treetraindata$Class_Name<-as.factor(treetraindata$Class_Name)
>
> ## Create an overfit tree
> treetrain<-tree(Class_Name ~ B1 + B2 + B3 +
> B4+B5+B6+B7+B8+B9+B10+B11+B12,treetraindata,mincut=1,minsize=2,mindev=0)
>
> ## Extracts a slice of data out of an ENVI BSQ file
> envigetslice<-function(fileconnection,samples,lines,bands,interleave,datatype,maxpixels) {
>   currentloc=seek(fileconnection,where=NA,origin="current")
>   ## If data is integer
>   if(datatype==3) {
>     numbersize=2
>     if ((samples*lines)-(currentloc/numbersize) < maxpixels)
>       maxpixels=(samples*lines)-(currentloc/numbersize)
>     envislice <- readBin(fileconnection,integer(),maxpixels,size=numbersize)
>     newloc=seek(fileconnection,where=NA,origin="current")
>     if (bands > 1) {
>       for (i in 1:(bands-1)) {
>         seek(fileconnection,where=currentloc+(samples*lines*numbersize*i),origin="start")
>         currentslice <- readBin(fileconnection,integer(),maxpixels,size=numbersize)
>         envislice=data.frame(envislice,currentslice)
>       }
>     }
>   }
>   ## If data is floating point
>   if(datatype==4) {
>     numbersize=4
>     if ((samples*lines)-(currentloc/numbersize) < maxpixels)
>       maxpixels=(samples*lines)-(currentloc/numbersize)
>     envislice <- readBin(fileconnection,double(),maxpixels,size=numbersize)
>     newloc=seek(fileconnection,where=NA,origin="current")
>     if (bands > 1) {
>       for (i in 1:(bands-1)) {
>         seek(fileconnection,where=currentloc+(samples*lines*numbersize*i),origin="start")
>         currentslice <- readBin(fileconnection,double(),maxpixels,size=numbersize)
>         envislice=data.frame(envislice,currentslice)
>       }
>     }
>   }
>   ## Return to the position just after the first band's slice
>   seek(fileconnection,where=newloc,origin="start")
>   envislice
> }
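The seek/readBin pattern in the function above can be checked on a file small enough to verify by eye. The following is a minimal sketch, not a replacement for the function: read_bsq_slice and the toy dimensions are made up for illustration, and it assumes band-sequential layout with 4-byte floats.

```r
## Toy BSQ file: 3 samples x 2 lines, 2 bands, 4-byte floats
samples <- 3; lines <- 2; bands <- 2; numbersize <- 4
npix <- samples * lines
tmp <- tempfile()
con <- file(tmp, open = "wb")
writeBin(as.numeric(1:npix), con, size = numbersize)            # band 1: 1..6
writeBin(as.numeric(101:(100 + npix)), con, size = numbersize)  # band 2: 101..106
close(con)

## Hypothetical helper: read n pixels starting at pixel `first` from every band
read_bsq_slice <- function(con, first, n, npix, bands, numbersize) {
  out <- vector("list", bands)
  for (b in seq_len(bands)) {
    ## byte offset = (pixels in earlier bands + pixels skipped in this band) * size
    offset <- ((b - 1) * npix + (first - 1)) * numbersize
    seek(con, where = offset, origin = "start")
    out[[b]] <- readBin(con, what = double(), n = n, size = numbersize)
  }
  names(out) <- paste0("B", seq_len(bands))
  as.data.frame(out)
}

con <- file(tmp, open = "rb")
slice <- read_bsq_slice(con, first = 3, n = 2, npix = npix,
                        bands = bands, numbersize = numbersize)
close(con)
slice   # B1 holds pixels 3 and 4 of band 1, B2 the same pixels of band 2
```

Comparing nrow(slice) with the number of pixels requested for each slice is one way to see whether a read near the end of a band returns fewer values than expected.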
>
> ## Read ENVI files in subsets
> ## interleave: 1=bsq
> ## datatype: (follows ENVI format):
> ## 3: long integer
> ## 4:floating point
>
>
> ## Apply the classifier
> imageclasstree<-function(infile,outfile,dectree,samples,lines,bands,interleave,datatype,maxpixels) {
>
>   fileconnection<-file(infile,open="rb")
>   outfileconnection=file(outfile,open="wb")
>
>   numpixels = samples * lines
>   numslices=ceiling(numpixels/maxpixels)
>   if (numslices == floor(numpixels/maxpixels)) numslices=numslices-1
>
>   bandnames<-paste(c("B"),1:bands,sep="")
>
>   ## Loop for processing images
>   for(j in 0:numslices) {
>     print((j/numslices)*100)
>
>     envislice<-envigetslice(fileconnection,samples,lines,bands,interleave,datatype,maxpixels)
>     names(envislice)<-bandnames
>     ## Use the tree passed in as an argument rather than the global object
>     predictslice<-predict(dectree,envislice,type=c("vector"))
>
>     predictslice<-as.integer(round(as.vector(t(predictslice*10000)),digits=0))
>     predictslice
>     writeBin(predictslice,outfileconnection,size=2)
>   }
>   close(fileconnection)
>   close(outfileconnection)
> }
>
> imageclasstree("flt4aall","flt4adt", treetrain,11216,18173,12,1,4,25000)
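With a run of this size, it may help to record which slice raises the warning rather than only how many warnings there were. The sketch below shows one way to do that with withCallingHandlers; process_slice and the toy slices are hypothetical stand-ins for the envigetslice/predict step, so this is an illustration of the pattern, not a diagnosis of the warning itself.

```r
## Hypothetical stand-in for the per-slice work (read, predict, convert)
process_slice <- function(x) as.numeric(x)

## Toy slices; the second one contains a value that cannot be converted
slices <- list(c("1.5", "2.5"), c("3.5", "oops"), c("4.5", "5.5"))

warned <- integer(0)   # indices of slices that raised a warning
for (j in seq_along(slices)) {
  withCallingHandlers(
    process_slice(slices[[j]]),
    warning = function(w) {
      warned <<- c(warned, j)                          # remember the slice index
      message("slice ", j, ": ", conditionMessage(w))  # report it immediately
      invokeRestart("muffleWarning")                   # continue with the loop
    }
  )
}
warned   # here only slice 2 is reported
```

Once the offending slice is known, it can be read again on its own and inspected, for example with summary() or by counting values for which is.finite() is FALSE.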
>
> On 2/18/04 2:25 PM, "Sundar Dorai-Raj" <sundar.dorai-raj at PDF.COM> wrote:
>
>
>>
>>Jonathan Greenberg wrote:
>>
>>
>>>I'm running a decision tree on a large dataset, and I'm getting multiple
>>>instances of "NAs introduced by coercion" (> 50). What does this mean?
>>>
>>>--j
>>>
>>
>>My guess would be you're trying to convert from character to numeric and
>>are unable to do so. As in,
>>
>> > as.numeric("A")
>> [1] NA
>> Warning message:
>> NAs introduced by coercion
>> > as.numeric("1")
>> [1] 1
>>
>>But without more information from you it's impossible to tell.
>>
>>See the posting guide at
>>
>>http://www.R-project.org/posting-guide.html
>>
>>Regards,
>>Sundar
>>
>
>
>