[R] How to create a new data.frame based on calculation of subsets of an existing data.frame
Jim Lemon
drj|m|emon @end|ng |rom gm@||@com
Fri Dec 20 22:04:29 CET 2019
Hi Ioanna,
We're getting somewhere, but there are four possible combinations of
Taxonomy and IM.type:
ER+ETR_H1,PGA
ER+ETR_H2,PGA
ER+ETR_H1,Sa
ER+ETR_H2,Sa
Perhaps you mean that ER+ETR_H1 only occurs with PGA and ER+ETR_H2
only occurs with Sa. I handled that by checking, before each
calculation, that at least one row matches the requested combination.
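
A quick way to see which combinations actually occur is to cross-tabulate the two columns. This is only a sketch: D_small is a hypothetical cut-down data frame holding just the two grouping columns, and combo_counts and present are names made up for the illustration.

```r
# Hypothetical cut-down data: only the two grouping columns
D_small <- data.frame(
  IM.type  = rep(c("PGA", "Sa"), each = 4),
  Taxonomy = rep(c("ER+ETR_H1", "ER+ETR_H2"), each = 4),
  stringsAsFactors = FALSE
)

# cross-tabulate: a zero cell is a combination with no rows
combo_counts <- table(D_small$Taxonomy, D_small$IM.type)
print(combo_counts)

# list only the combinations that are present in the data
present <- unique(D_small[, c("Taxonomy", "IM.type")])
print(present)
```

Looping over the rows of present instead of over every possible pairing would avoid the need for the emptiness check, at the cost of one extra step up front.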
Also, you want a matrix with one row for each combination of Taxonomy
and IM.type in the output. When I run what I think you are asking, I
get a two-element list, each element a one-row data frame holding the
four values. Maybe this is what you want, and it could be coerced into
matrix format:
D <- data.frame(
  Ref.No = c(1622, 1623, 1624, 1625, 1626, 1627, 1628, 1629),
  Region = rep(c('South America'), times = 8),
  IM.type = c('PGA', 'PGA', 'PGA', 'PGA', 'Sa', 'Sa', 'Sa', 'Sa'),
  Damage.state = c('DS1', 'DS2', 'DS3', 'DS4', 'DS1', 'DS2', 'DS3', 'DS4'),
  Taxonomy = c('ER+ETR_H1', 'ER+ETR_H1', 'ER+ETR_H1', 'ER+ETR_H1',
               'ER+ETR_H2', 'ER+ETR_H2', 'ER+ETR_H2', 'ER+ETR_H2'),
  Prob.of.exceedance_1 = c(0, 0, 0, 0, 0, 0, 0, 0),
  Prob.of.exceedance_2 = c(0, 0, 0, 0, 0, 0, 0, 0),
  Prob.of.exceedance_3 =
    c(0.26, 0.001, 0.00019, 0.000000573, 0.04, 0.00017, 0.000215, 0.000472),
  Prob.of.exceedance_4 =
    c(0.72, 0.03, 0.008, 0.000061, 0.475, 0.0007, 0.00435, 0.000405),
  stringsAsFactors = FALSE)
# names of the variables used in the calculations
calc_vars <- paste("Prob.of.exceedance", 1:4, sep = "_")
# get the rows for the four damage states
DS1_rows <- D$Damage.state == "DS1"
DS2_rows <- D$Damage.state == "DS2"
DS3_rows <- D$Damage.state == "DS3"
DS4_rows <- D$Damage.state == "DS4"
# create an empty list
VC <- list()
# set an index variable for VC
VCindex <- 1
# step through all possible values of IM.type and Taxonomy
for(IM in unique(D$IM.type)) {
  for(Tax in unique(D$Taxonomy)) {
    # get a logical vector of the rows to be used in this calculation
    calc_rows <- D$IM.type == IM & D$Taxonomy == Tax
    # show which rows were selected for this combination
    cat(IM, Tax, calc_rows, "\n")
    # check that there are any such rows in the data frame
    if(any(calc_rows)) {
      # if so, fill in the four values for these rows
      VC[[VCindex]] <- 0.0 * (1 - D[calc_rows & DS1_rows, calc_vars]) +
        0.02 * (D[calc_rows & DS1_rows, calc_vars] -
                D[calc_rows & DS2_rows, calc_vars]) +
        0.10 * (D[calc_rows & DS2_rows, calc_vars] -
                D[calc_rows & DS3_rows, calc_vars]) +
        0.43 * (D[calc_rows & DS3_rows, calc_vars] -
                D[calc_rows & DS4_rows, calc_vars]) +
        1.0 * D[calc_rows & DS4_rows, calc_vars]
      # increment the index
      VCindex <- VCindex + 1
    }
  }
}
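
If a matrix with one row per combination is what is wanted, one possible route is to split the data frame by both grouping columns and bind the results together. The following is a minimal sketch rather than a tested solution for the full data set: it assumes each group has exactly one row for each of DS1 to DS4, it uses the same weights as the loop above, and the names vc_one_group, groups and VC_mat are invented for the illustration.

```r
# Same values as the example data, reduced to the columns needed here
D <- data.frame(
  IM.type = rep(c("PGA", "Sa"), each = 4),
  Damage.state = rep(paste0("DS", 1:4), times = 2),
  Taxonomy = rep(c("ER+ETR_H1", "ER+ETR_H2"), each = 4),
  Prob.of.exceedance_1 = rep(0, 8),
  Prob.of.exceedance_2 = rep(0, 8),
  Prob.of.exceedance_3 =
    c(0.26, 0.001, 0.00019, 0.000000573, 0.04, 0.00017, 0.000215, 0.000472),
  Prob.of.exceedance_4 =
    c(0.72, 0.03, 0.008, 0.000061, 0.475, 0.0007, 0.00435, 0.000405),
  stringsAsFactors = FALSE
)
calc_vars <- paste("Prob.of.exceedance", 1:4, sep = "_")

# hypothetical helper: the calculation for one Taxonomy/IM.type group
vc_one_group <- function(g) {
  # order the rows DS1..DS4 (assumes one row per damage state)
  p <- as.matrix(g[match(paste0("DS", 1:4), g$Damage.state), calc_vars])
  0.0 * (1 - p[1, ]) +
    0.02 * (p[1, ] - p[2, ]) +
    0.10 * (p[2, ] - p[3, ]) +
    0.43 * (p[3, ] - p[4, ]) +
    1.0 * p[4, ]
}

# drop = TRUE discards combinations that have no rows
groups <- split(D, list(D$Taxonomy, D$IM.type), drop = TRUE)

# one named vector per group, stacked into a matrix
VC_mat <- do.call(rbind, lapply(groups, vc_one_group))
print(VC_mat)
```

The row names of VC_mat come from split(), which joins Taxonomy and IM.type with a dot, so each row can be traced back to its combination.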
I think we'll get there.
Jim
On Sat, Dec 21, 2019 at 12:45 AM Ioannou, Ioanna
<ioanna.ioannou using ucl.ac.uk> wrote:
>
> Hello Jim,
>
> I made some changes to the code; essentially, I replaced each set of four lines (DS1-4) with one. I estimate VC, which in an ideal world should be a matrix with 4 columns, one for every exceedance_probability_1-4, and 2 rows, one for each unique combination of Taxonomy and IM.type. Could you please check the code I sent last and, based on that, give your solution?