Do I need a meta-analysis, and how do I perform one for observational studies?

If you have the raw data from each study, I would analyze those directly. This is usually called an "individual patient data meta-analysis" (IPDMA). For certain types of models, an IPDMA and a meta-analysis based on effect sizes will yield identical results, but if you have the raw data, I would go with an IPDMA.

And yes, you should include 'study ID' in your model, either as a random effect or as a fixed effect (the merits of those two approaches can be debated).
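As a rough sketch of what the two options could look like in R (the data here are simulated, and the names dat, pheno, score, age, and study are placeholders, not your actual variables): study ID enters as a factor in glm() for the fixed-effects version, or as a random intercept in lme4::glmer() for the random-effects version. The lme4 part is skipped if that package is not installed.

```r
# Simulated stand-in for the pooled data: 5 studies, 1300 patients.
# All variable names are hypothetical placeholders.
set.seed(42)
n <- 1300
dat <- data.frame(
  study = factor(rep(paste0("S", 1:5), each = n / 5)),
  score = rnorm(n),   # immune score (standardized here)
  age   = rnorm(n)    # a covariate (standardized here)
)
# Give each study its own baseline risk so that study ID matters
study_shift <- c(-0.4, -0.2, 0, 0.2, 0.4)[as.integer(dat$study)]
dat$pheno <- rbinom(n, 1, plogis(-0.5 + study_shift + 0.4 * dat$score))

# Option 1: study ID as a fixed effect (one dummy per study beyond the first)
fit_fixed <- glm(pheno ~ score + age + study, family = binomial, data = dat)
or_fixed  <- exp(coef(fit_fixed)["score"])                # adjusted OR
ci_fixed  <- exp(confint.default(fit_fixed)["score", ])   # Wald 95% CI

# Option 2: study ID as a random intercept (requires the lme4 package)
if (requireNamespace("lme4", quietly = TRUE)) {
  fit_random <- lme4::glmer(pheno ~ score + age + (1 | study),
                            family = binomial, data = dat)
  or_random <- exp(lme4::fixef(fit_random)["score"])      # adjusted OR
}
```

With only five studies the between-study variance in the random-intercept model is estimated from very little information, which is one reason the choice between the two is debatable.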

If you want to go with a more traditional meta-analysis, then you would first want to fit logistic regression models to each dataset, including the covariates you want to adjust for. You can then extract the log odds ratio from each model with its corresponding standard error. You can then meta-analyze those values with:

rma(log_odds_ratio, sei=standard_error)

or

rma(log_odds_ratio, standard_error^2)

(The second argument of rma() takes the sampling variances, so standard_error^2 will work; alternatively, you can pass the SEs directly via the 'sei' argument.)
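Here is a minimal sketch of that two-stage workflow on simulated data (dat, pheno, score, age, study, and per_study are made-up names for illustration only). Note that glm() with family = binomial reports coefficients on the log-odds scale, so the "Estimate" column is already the log odds ratio, and the variance is simply the squared "Std. Error".

```r
library(metafor)

# Simulated stand-in for the five studies; all names are hypothetical.
set.seed(1)
dat <- data.frame(
  study = rep(paste0("S", 1:5), each = 260),
  score = rnorm(1300),   # immune score
  age   = rnorm(1300)    # a covariate to adjust for
)
dat$pheno <- rbinom(1300, 1, plogis(-0.5 + 0.4 * dat$score + 0.2 * dat$age))

# Stage 1: fit one adjusted logistic regression per study and keep the
# log odds ratio for the immune score with its standard error.
per_study <- do.call(rbind, lapply(split(dat, dat$study), function(d) {
  fit <- glm(pheno ~ score + age, family = binomial, data = d)
  est <- summary(fit)$coefficients["score", ]
  data.frame(log_odds_ratio = est[["Estimate"]],
             standard_error = est[["Std. Error"]])
}))

# Stage 2: pool the study-level estimates (random-effects model by default).
# Passing the SEs via 'sei' or the variances as the second argument
# gives the same fit.
res_sei <- rma(log_odds_ratio, sei = standard_error, data = per_study)
res_vi  <- rma(log_odds_ratio, standard_error^2, data = per_study)

# Back-transform the pooled log odds ratio (and its CI) to the OR scale
pooled_or <- predict(res_sei, transf = exp)
```

Printing res_sei shows the pooled log odds ratio along with the heterogeneity statistics; pooled_or holds the same estimate and its confidence interval as an odds ratio.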


I have collected raw data from five observational studies, for a total of 1300 patients. Each patient has about 17,000 variables, since all five studies used the same method to measure expression values for over 16,900 genes (plus some phenotypic variables). I normalized the expression data for all of these genes and then calculated an immune score for each patient from them. I need to run a multivariable logistic regression to test the association between the estimated immune score and a cancer development phenotype, adjusting for covariates such as age, race, etc.

My first question: do I really need a meta-analysis approach? I originally just ran a multivariable logistic regression treating all 1300 patients as if they came from one combined study (patient phenotype (yes or no) = immune score + 6 covariates, i.e., estimating an adjusted OR). Then I thought about either including study ID as a random factor in the model, or calculating an OR for each of the five studies and pooling the five ORs to get an overall OR. What do you think?

My second question: from reading a meta-analysis paper, the R code below was designed for clinical trials (randomized designs), where confounding variables are controlled within each study, so only a univariate logistic regression is required and there is no need to adjust for covariates. How do I calculate an overall adjusted OR by meta-analysis? If I calculate an adjusted OR for each study, I will get an "Estimate" (for the OR) and a "Std. Error" for each study. If I use metafor, I guess I need to convert the OR to a log OR (Y in the example code below) and the standard error to a variance (V). Is that right? Also, how do I convert the standard error from each study to a variance?

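library(metafor)  # provides rma() and the dat.bcg example dataset

# Log odds ratio and its sampling variance from each trial's 2x2 table
Y <- with(dat.bcg, log(tpos * cneg/(tneg * cpos)))
V <- with(dat.bcg, 1/tpos + 1/cneg + 1/tneg + 1/cpos)
cbind(Y, V)

# Fixed-effect model for the log odds ratios
result.or.FE <- rma(yi = Y, vi = V, method = "FE")

# Random-effects model with the DerSimonian-Laird estimator
result.or.DL <- rma(yi = Y, vi = V, method = "DL")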

