[R] Method for reduction of independent variables

Daniel Malter daniel at umd.edu
Wed Jan 13 19:32:13 CET 2010


Hi, please read the posting guide. You are not likely to get an extensive
answer to your question from this list. Your question is a "please
solve/explain my statistical problem for me" question. Two things are
problematic with that: first, "statistical", and second, "please solve for
me."

First, the R-help list is mostly concerned with problems in implementing
analyses in R, not with the (choice of the) statistical approach per se
(there are a few exceptions). Second, "please solve for me" questions are
generally frowned upon, unless you show a specific point at which you
are stuck and have to make a choice. That is, the list members want to see
that you have done your "homework" to the extent one can expect you to. To
ask the list to provide an introduction to data reduction methods without
having any background knowledge is, frankly, a waste of your time and the
list members' time. There are books on the topic, which you can buy or
borrow, and certainly many online sources to give you a basic background. Or
you can start here: http://en.wikipedia.org/wiki/Dimension_reduction. If you
want your statistical questions answered and problems solved without reading
up on the matter yourself, your question is better suited to a local
statistician at your institution or a paid service than to this list.

Best,
Daniel 

-------------------------
cuncta stricte discussurus
-------------------------
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of rubystallion
Sent: Wednesday, January 13, 2010 11:57 AM
To: r-help at r-project.org
Subject: [R] Method for reduction of independent variables


Hello

I am currently investigating software code metrics for a variety of software
projects of a company to determine the worst parts of software products
according to specified quality characteristics. 
As the gathering of metrics correlates with effort, I would like to find a
subset of the metrics preserving significant predictive power for the
"problem value" while using the least amount of code metrics. 

I have the results of 25 metrics for 6 software projects for a combined 9355
"individuals", i.e. software parts with metrics.
However, as many metrics only record values above a predefined
limit, 58% of the values of the independent variables are 0.

Which method can I use to determine a reduced set of independent variables
with significant predictive power?
As I do not have a statistics background, I would also appreciate a simple
explanation of the chosen method and sensible choices for parameters, so
that I will be able to infer the reduced set of software metrics to keep.

Thank you in advance!

Johannes
-- 
View this message in context:
http://n4.nabble.com/Method-for-reduction-of-independent-variables-tp1013171p1013171.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.