[R] data arranged by p-values

jd6688 jdsignature at gmail.com
Mon Jul 26 07:06:59 CEST 2010


Id	cat1	location	item_values	p-values	sequence	
a111	1	3002737	0.196504377	0.01	1	
a112	1	3017821	0.196504377	0.05	2	
a113	1	3027730	0.196504377	0.02	3	
a114	1	3036220	0.196504377	0.04	4	
a115	1	3053984	0.196504377	0.03	5	
a116	1	3063892	0.196504377	0.07	6	
a117	1	3076333	0.196504377	0.08	7	
a118	1	3090500	0.196504377	0.02	8	
a119	1	3103304	0.196504377	0.03	9	
a120	1	3119350	0.196504377	0.05	10	
a121	1	3129884	0.196504377	0.01	11	
a122	1	3154598	0.196504377	0.03	12	
a123	1	3170910	0.196504377	0.05	13	
a124	1	3180712	0.196504377	0.06	14	
a125	1	3186519	0.196504377	0.07	15	
a126	1	3192256	0.196504377	0.09	16	
a127	1	3198441	0.196504377	0.01	17	
a128	1	3205784	0.196504377	0.02	18	
a129	1	3210685	0.196504377	0.03	19	
a130	1	3218542	0.196504377	0.04	20	
a131	1	3234318	0.196504377	0.05	21	
a132	1	3239972	0.196504377	0.09	22	
a133	1	3245663	0.196504377	0.05	23	
a134	1	3257997	0.196504377	0.02	24	
a135	1	3273226	0.196504377	0.03	26	
a136	1	3285404	0.196504377	0.04	27	
a137	1	3290332	0.196504377	0.05	28	
a138	1	3300679	0.196504377	0.03	29	
a139	1	3310164	0.196504377	0.09	30	


First of all, please pay attention to the p-values: consecutive rows with a
p-value <= 0.05 are considered one region, and the region ends at the first
row with a p-value > 0.05. For instance, REGION 1 is the rows from id a111 to
id a115, REGION 2 is the rows from id a118 to a123, etc.
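
To make the region rule concrete, here is a minimal sketch that labels rows by
region using rle(). It assumes the cutoff is inclusive (p <= 0.05), which is
what the a111 to a115 example implies since a112 has p = 0.05; the variable
names are only illustrative.

```r
# p-values of the first ten rows of the table above (a111 to a120)
p <- c(0.01, 0.05, 0.02, 0.04, 0.03, 0.07, 0.08, 0.02, 0.03, 0.05)

in_region <- p <= 0.05          # TRUE while the row belongs to a region
runs <- rle(in_region)          # consecutive runs of TRUE / FALSE

# number the TRUE runs 1, 2, ...; rows outside any region get NA
run_id <- ifelse(runs$values, cumsum(runs$values), NA)
region <- rep(run_id, runs$lengths)
```

Here rows 1 to 5 fall in region 1, rows 6 and 7 (a116, a117) in no region,
and rows 8 to 10 in region 2.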

What I want to accomplish is to pick the start and end location, and the
peak value of item_values, for each region.

Option 1:

   loop through the rows until a p-value > 0.05 is found, then
        start_location = the first location value of the region
        end_location = the location value of the last row before the p-value > 0.05
        peak_value = the maximum of item_values within the region
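
A rough sketch of what option 1 could look like, using only the first 14 rows
of the table. The column name p_value (instead of "p-values"), the function
name find_regions_loop and the inclusive cutoff are assumptions of this
sketch, not something fixed by the data.

```r
# First 14 rows of the posted table; "p-values" is renamed to p_value so
# that it is a syntactic R name (an assumption of this sketch).
dat <- data.frame(
  Id = paste0("a", 111:124),
  location = c(3002737, 3017821, 3027730, 3036220, 3053984, 3063892, 3076333,
               3090500, 3103304, 3119350, 3129884, 3154598, 3170910, 3180712),
  item_values = 0.196504377,
  p_value = c(0.01, 0.05, 0.02, 0.04, 0.03, 0.07, 0.08,
              0.02, 0.03, 0.05, 0.01, 0.03, 0.05, 0.06)
)

find_regions_loop <- function(d, cutoff = 0.05) {
  regions <- list()
  start <- NA                       # row index where the open region began
  for (i in seq_len(nrow(d))) {
    inside <- d$p_value[i] <= cutoff
    if (inside && is.na(start)) start <- i          # a region opens
    last_row <- i == nrow(d)
    if (!is.na(start) && (!inside || last_row)) {   # a region closes
      end <- if (inside) i else i - 1
      regions[[length(regions) + 1]] <- data.frame(
        start_location = d$location[start],
        end_location   = d$location[end],
        peak_value     = max(d$item_values[start:end])
      )
      start <- NA
    }
  }
  do.call(rbind, regions)           # one row per region
}

res_loop <- find_regions_loop(dat)
```

The extra last_row check is there so that a region still open at the end of
the data is not lost.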

Option 2:

    create a sequence number for each row;
    subset the raw data frame by p-value <= 0.05;
    the regions are then identified by gaps in the sequence numbers.
For instance, sequence numbers 1 to 5 are considered one region:

Id	cat1	location	item_values	p-values	sequence	
a111	1	3002737	0.196504377	0.01	1	
a112	1	3017821	0.196504377	0.05	2	
a113	1	3027730	0.196504377	0.02	3	
a114	1	3036220	0.196504377	0.04	4	
a115	1	3053984	0.196504377	0.03	5	
a118	1	3090500	0.196504377	0.02	8	
a119	1	3103304	0.196504377	0.03	9	
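
Option 2 could be sketched without a loop roughly as follows, again on the
first 14 rows only. The names p_value, row_id, sig and res_gap are
illustrative assumptions. Note that the sequence column as posted jumps from
24 to 26, which would split a region if it were used directly, so this sketch
regenerates the row numbers with seq_len().

```r
# First 14 rows of the posted table; p_value stands in for "p-values".
dat <- data.frame(
  Id = paste0("a", 111:124),
  location = c(3002737, 3017821, 3027730, 3036220, 3053984, 3063892, 3076333,
               3090500, 3103304, 3119350, 3129884, 3154598, 3170910, 3180712),
  item_values = 0.196504377,
  p_value = c(0.01, 0.05, 0.02, 0.04, 0.03, 0.07, 0.08,
              0.02, 0.03, 0.05, 0.01, 0.03, 0.05, 0.06)
)

dat$row_id <- seq_len(nrow(dat))      # regenerated sequence number
sig <- dat[dat$p_value <= 0.05, ]     # keep only rows inside a region

# a new region starts wherever the kept row numbers are not consecutive
sig$region <- cumsum(c(TRUE, diff(sig$row_id) != 1))

# summarise each region: first and last location, maximum item value
res_gap <- do.call(rbind, lapply(split(sig, sig$region), function(g) {
  data.frame(
    start_location = g$location[1],
    end_location   = g$location[nrow(g)],
    peak_value     = max(g$item_values)
  )
}))
```

This assumes at least one row passes the cutoff and that the rows are already
ordered by location within cat1.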


Could you recommend which of these approaches is better, or suggest a
different way to implement this?
Thanks,



