[R] Analyzing large files faster
Rui Barradas
ruipbarradas at sapo.pt
Wed Jun 13 00:06:57 CEST 2012
Hello,
The trick is to use logical index vectors. Each comparison is evaluated for all rows at once, so no loop is needed.
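As a minimal illustration of what an index vector is (the values in x are made up, they are not from the data in question):

```r
x <- c(0.5, -2, 3, 0)

# A comparison on a vector returns one TRUE/FALSE per element
big <- x > 1

# Subsetting with a logical vector keeps the elements where it is TRUE
x[big]

# Conditions can be combined with & (and), | (or) and ! (not)
x[!big & x >= 0]
```

The same idea applies to the columns of a data frame.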
Try the following.
muscle <- read.table(text='
"ID" "adj.P.Val" "logFC" "Gene.symbol"
"1419156_at" "5.32e-12" "2.6462565" "Sox4"
"1433575_at" "5.32e-12" "3.9417089" "Sox4"
"1428942_at" "2.64e-11" "3.9163618" "Mt2"
"1454699_at" "2.69e-10" "1.8654677" "LOC100047324///Sesn1"
"1416926_at" "3.19e-10" "2.172342" "Trp53inp1"
"1422557_s_at" "1.58e-09" "2.9569254" "Mt1"
', header=TRUE, stringsAsFactors=FALSE)
muscle
p_thresh <- 6.51e-06
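# Create logical index vectors, one element per row
gsym <- muscle$Gene.symbol != ""            # row has a gene symbol
this_pval <- muscle$adj.P.Val <= p_thresh   # row passes the p-value threshold
this_Ma <- muscle$logFC > -1                # FALSE only when logFC <= -1
this_Mb <- muscle$logFC < 1                 # FALSE only when logFC >= 1
# Use them to subset the gene symbols
downregulated_list <- muscle$Gene.symbol[gsym & !this_Ma & this_pval]  # logFC <= -1
upregulated_list <- muscle$Gene.symbol[gsym & !this_Mb & this_pval]    # logFC >= 1
nochange <- muscle$Gene.symbol[gsym & this_Ma & this_Mb]               # -1 < logFC < 1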
# See the results (use head() if the vectors are long)
upregulated_list
downregulated_list
nochange
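If one label per row is more convenient than three separate vectors, the same conditions can be used to fill a status vector. This is only a sketch under assumptions: the toy data frame and the names toy, sig, status and by_status are made up for illustration and are not part of the original data or code.

```r
# Toy data with the same column names as above (values are invented)
toy <- data.frame(
  Gene.symbol = c("A", "B", "", "D", "E"),
  adj.P.Val   = c(1e-08, 1e-08, 1e-08, 0.5, 1e-08),
  logFC       = c(2.5, -1.7, 3.0, 0.2, 0.4),
  stringsAsFactors = FALSE
)
p_thresh <- 6.51e-06

# Start with NA everywhere, then fill in each class with an index vector
status <- rep(NA_character_, nrow(toy))
sig <- toy$adj.P.Val <= p_thresh
status[toy$logFC > -1 & toy$logFC < 1] <- "nochange"
status[sig & toy$logFC >= 1] <- "up"
status[sig & toy$logFC <= -1] <- "down"
status[toy$Gene.symbol == ""] <- NA   # rows without a gene symbol get no label

# One list element per class; rows with NA status are dropped by split()
by_status <- split(toy$Gene.symbol, status)
by_status
table(status, useNA = "ifany")
```

Rows that meet none of the conditions stay NA, so they are easy to count or inspect afterwards.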
Hope this helps,
Rui Barradas
On 12-06-2012 21:55, mousy0815 wrote:
> upregulated_list = c()
> downregulated_list = c()
> nochange = c()
> p_thresh = 6.51e-06
> x=1
>
> while (x <= nrow(muscle)) {
> this_pval = muscle[x,"adj.P.Val"]
> this_M = muscle[x, "logFC"]