[R] how to improve the efficiency of the following lapply codes
Weiwei Shi
helprhelp at gmail.com
Wed Oct 25 19:04:43 CEST 2006
object.size(intersect.matrix)
41314204
but my machine has 4 GB of memory, so memory alone should not be the
problem. After 12 hours the job has finished 16k of the 60k
iterations, and it keeps slowing down non-linearly.
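Before restructuring anything, it may be worth profiling a short, representative run to see where the time actually goes. A minimal sketch using Rprof/summaryRprof from base R; the mean(rnorm(...)) call is only a stand-in for the real i.lda, whose definition was not posted:

```r
# Profile a small run to find the hotspot.
# mean(rnorm(1e4)) is a stand-in for the real i.lda(...) call.
Rprof("lda-profile.out")
res <- lapply(1:1000, function(k) mean(rnorm(1e4)))
Rprof(NULL)
summaryRprof("lda-profile.out")$by.total
```

If most of the time shows up in memory management rather than in the LDA code itself, chunking the data as described below is more likely to help than rewriting i.lda.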
I am thinking of chopping the 60k rows into multiple 5k data frames
and running the program on each chunk, but I wonder whether there is a
way around this.
> version
_
platform i686-pc-linux-gnu
arch i686
os linux-gnu
system i686, linux-gnu
status
major 2
minor 3.1
year 2006
month 06
day 01
svn rev 38247
language R
version.string Version 2.3.1 (2006-06-01)
[wshi at chopper ox]$ more /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 4189724672 3035549696 1154174976 0 282836992 2057129984
Swap: 4293586944 645042176 3648544768
[wshi at chopper ox]$ more /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Xeon(TM) CPU 3.60GHz
stepping : 3
cpu MHz : 3591.419
cache size : 2048 KB
thanks.
On 10/25/06, Weiwei Shi <helprhelp at gmail.com> wrote:
> Hi,
> I run a series of LDA analyses with the following lapply call:
>
> n <- dim(intersect.matrix)[1]
> net1.lda <- lapply(seq_len(n), function(k)
>   i.lda(data.list, intersect.matrix, i = k, w))
>
> i.lda is the function that does the actual LDA analysis.
>
> intersect.matrix is an n x 1026 matrix, where n can be very large,
> e.g. 60k. The goal is to perform a random search. Building a matrix
> with n = 120k is impossible on my machine. When n = 5k the task
> finishes in 30 minutes, while for n = 60k it is estimated to take 5
> days. So I am wondering what in my code causes this non-linear
> scaling.
>
> If more info is needed, I will provide.
>
> thanks
>
> --
> Weiwei Shi, Ph.D
> Research Scientist
> GeneGO, Inc.
>
> "Did you always know?"
> "No, I did not. But I believed..."
> ---Matrix III
>