[R] R crashing during batch file formatting

Jon Minton jm540 at york.ac.uk
Tue Oct 31 13:35:51 CET 2006


Thanks,

Windows XP Media Centre Edition (!) Version 2002 Service Pack 2
R version 2.3.1 (already heard that 2.4.0 has better memory handling?)

I think I'll use NaN for 'not applicable' and NA for 'missing'. Does anyone
know how Amelia handles and distinguishes between these two (only the
latter, of course, needs imputing), and whether this can be done without
further formatting?
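
For what it's worth, the NaN/NA split described above can be sketched in base
R as follows. This is only an illustration: the vector `x` and its values are
made up, and it says nothing about how Amelia treats NaN, which would still
need checking.

```r
# Hypothetical BHPS-style column: -8 = inapplicable, other negatives = missing
x <- c(3, -8, 5, -1, -9, 2)

recoded <- x
recoded[x == -8] <- NaN           # 'not applicable'
recoded[x < 0 & x != -8] <- NA    # every other missing-value code

# is.na() is TRUE for both NA and NaN, so is.nan() is needed to separate them
not.applicable <- is.nan(recoded)
missing.only   <- is.na(recoded) & !is.nan(recoded)
```

Note that NaN only exists for double (and complex) vectors, so an integer
column is coerced to double by the first assignment.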

Jon

-----Original Message-----
From: Petr Pikal [mailto:petr.pikal at precheza.cz] 
Sent: 31 October 2006 12:15
To: Jon Minton; r-help at stat.math.ethz.ch
Subject: Re: [R] R crashing during batch file formatting

Hi

You should probably provide more information (OS, R version).
I cannot help you much with the crash, but here is an opinion:
I would try the conversion interactively before wrapping it in
a function.
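
A rough sketch of what such an interactive trial might look like on a single
file. The file contents and column names below are invented stand-ins, not
real BHPS data; the column-by-column loop is one possible alternative to
indexing the whole data frame at once and may keep peak memory lower.

```r
# Stand-in for one BHPS .tab file (hypothetical columns and values)
tmp <- tempfile(fileext = ".tab")
writeLines(c("pid\tincome\tjobstat",
             "1\t1200\t2",
             "2\t-9\t-8",
             "3\t850\t-1"), tmp)

dat <- read.delim(tmp)
str(dat)                                # check column types before recoding
print(object.size(dat), units = "Kb")   # rough in-memory size of the table

# Recode one numeric column at a time; which() ignores any existing NAs
for (j in seq_along(dat)) {
    if (is.numeric(dat[[j]])) {
        dat[[j]][which(dat[[j]] < 0)] <- NA
    }
}
dat
```

Once each step works on one file, the same lines can be moved into the loop
of the batch function.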

However, if you want different types of NA and your data is numeric, 
you probably could make a distinction by using -Inf, Inf, NaN and NA, 
but then you need to be careful when doing analysis, as these values 
can be treated differently.
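
To illustrate the caution above, here is a small base R sketch of how these
special values behave in common summaries. The vectors are made up for the
example.

```r
v <- c(1, 2, NA, NaN, Inf, -Inf)

is.na(v)       # TRUE for NA and for NaN
is.nan(v)      # TRUE for NaN only
is.finite(v)   # TRUE for 1 and 2 only

# na.rm = TRUE drops NA and NaN, but keeps infinite values
m1 <- mean(c(1, 2, NA, NaN), na.rm = TRUE)   # 1.5
m2 <- mean(c(1, 2, Inf), na.rm = TRUE)       # Inf
m3 <- sum(c(Inf, -Inf))                      # NaN

# Restricting to finite values removes all four special codes at once
m4 <- mean(v[is.finite(v)])                  # 1.5
```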

HTH
Petr



On 31 Oct 2006 at 11:43, Jon Minton wrote:

From:           	"Jon Minton" <jm540 at york.ac.uk>
To:             	<r-help at stat.math.ethz.ch>
Date sent:      	Tue, 31 Oct 2006 11:43:22 -0000
Subject:        	[R] R crashing during batch file formatting

> Hi R users:
> 
> 
> 
> I have the British Household Panel Survey (BHPS) in .tab format. I
> want to feed it through the Amelia package (which will be an
> 'interesting' job in itself).
> 
> But first I need to convert the various types of missing value (from
> about -9 to -1) to a more generic 'NA' code.
> 
> 
> 
> I've written the following function to do this:
> 
> 
> 
> BHPS.converter <- function(from = "D:/Data/BHPS/UKDA-5151-tab/tab/",
>                            to = "D:/BHPS/NA/", ext = "tab") {
> 
>     # Match names ending in ".<ext>"; the dot is escaped so it is literal
>     file.pattern <- paste("\\.", ext, "$", sep = "")
>     from.files <- dir(from, pattern = file.pattern)
>     existing.to.files <- dir(to, pattern = file.pattern)
> 
>     # Skip files already converted. setdiff() also works when 'to' is
>     # empty, whereas x[-match(...)] with no matches returns nothing.
>     files.to.do <- setdiff(from.files, existing.to.files)
> 
>     # Looping over the names avoids 1:0 when nothing is left to do
>     for (f in files.to.do) {
> 
>         temp.table <- read.delim(paste(from, f, sep = ""))
>         print(paste("read:", f))
> 
>         # Recode every negative (missing-value) code to NA
>         temp.table[temp.table < 0] <- NA
> 
>         write.table(temp.table, file = paste(to, f, sep = ""))
>         print(paste("written:", f))
>     }
> 
>     # No rm() needed: local variables are discarded on return
> }
> 
> 
> 
> It checks for existing files in the 'to' directory (which holds the
> files already converted in R, negative codes -> NA) because when I
> tried this conversion previously it got about ½ way through and then
> crashed.
> 
> 
> 
> The problem is that it crashes *this time* too, without displaying a
> prompt to say it's read a single file. 
> 
> 
> 
> The file it gets stuck on is about 75 MB in size. 
> 
> 
> 
> I am using a dual-core 3.2 GHz Pentium D processor with 2 GB memory (&
> 2 GB virtual memory), and (unfortunately) Windows XP.
> 
> 
> 
> Questions:
> 
> 1) Any general tips on how to increase the amount of memory available
> to process the file?
> 
> 2) Can you see a more efficient way of doing what I'm doing?
> 
> 3) What's the best way of coding for multiple forms of NA? - the BHPS
> code '-8' (meaning 'inapplicable', not routed for this respondent)
> should really be distinguished from other forms of nonresponse...
> 
> 
> 
> 
> 
> Thanks,
> 
> 
> 
> Jon
> 
> 
> 
> 
> 
> p.s. Apologies if this is slightly too vague/long winded...
> 
> 
> 
> 
> 
> Jon Minton
> 
> 
> 
> 
> 
> 
> 
> 

Petr Pikal
petr.pikal at precheza.cz


