[R] Re: How to use R to perform prediction based on history data
Petr PIKAL
petr.pikal at precheza.cz
Tue Aug 18 15:27:45 CEST 2009
Hi
r-help-bounces at r-project.org wrote on 15.08.2009 04:27:39:
> Say I have a CSV file in which each row contains several fields, one of
> which records whether the row is a success.
>
> In the history data, I have all the fields, including the result of
> whether it is a success. In the future data, I only have the fields
> without the result.
>
> For example:
>
> history data:
>
> Field1  Field2  Field3  Field4  ResultField
> 1231    CA      TRUE    443     TRUE
> 23231   NC      TRUE    123     FALSE
> 1231    CA      FALSE   243     TRUE
>
> The future data:
> Field1  Field2  Field3  Field4
> 23231   NC      TRUE    123
>
>
>
> I am a newbie in R and statistics; I just feel R could have some
> mechanism to give the probability of success based on the history data.
>
> I tried to read in the CSV data and to call "factor" on the list,
> but I am seeing this error message:
> Error in sort.list(unique.default(x), na.last = TRUE) :
>
> Any ideas are highly welcome.
Well, the first idea would be to buy or borrow an introductory statistics
book and to look through the R introductory documents (they are in the doc
folder of your R installation). There are also introductory statistics
books that use R as the programming language.
If you do not know much about statistics and R, you could possibly toss a
coin and fill in your ResultField accordingly (it could be quicker and
maybe more foolproof :-).
I would not call myself an expert in statistics, but for data like this
you could try ?lm, ?glm or another modelling procedure that can predict
for future data, and/or a tree-based package like ??mvpart
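
As a rough sketch of the glm route, here is a logistic regression fitted to
simulated data and then used with predict() on one "future" row. Everything
about the data is an assumption made for illustration: the 200 history rows
are made up, the relationship used to generate ResultField is invented, the
file name in the comment is hypothetical, and Field1 is left out of the
model on the guess that it is only an identifier. It is not a recommendation
of a particular model for the real data.

```r
# Simulated stand-in for the history CSV; every value here is made up.
set.seed(42)
n <- 200
history <- data.frame(
  Field1 = sample(c(1231, 23231), n, replace = TRUE),
  Field2 = factor(sample(c("CA", "NC"), n, replace = TRUE)),
  Field3 = sample(c(TRUE, FALSE), n, replace = TRUE),
  Field4 = round(runif(n, 100, 500))
)

# Invented relationship, used only to generate example outcomes.
p <- plogis(-2 + 0.006 * history$Field4 + 0.8 * history$Field3)
history$ResultField <- runif(n) < p

# With a real file this might instead be (hypothetical file name):
#   history <- read.csv("history.csv", stringsAsFactors = TRUE)
# factor() is meant for a single column, e.g. factor(history$Field2),
# not for the whole data frame.

# Logistic regression: models the probability that ResultField is TRUE.
fit <- glm(ResultField ~ Field2 + Field3 + Field4,
           data = history, family = binomial)

# A future row has the same fields but no ResultField.
future <- data.frame(
  Field1 = 23231,
  Field2 = factor("NC", levels = levels(history$Field2)),
  Field3 = TRUE,
  Field4 = 123
)

# type = "response" returns a probability rather than log-odds.
prob_success <- predict(fit, newdata = future, type = "response")
print(prob_success)
```

summary(fit) shows which fields appear to matter. With only a handful of
history rows, as in the three-line example, a model like this cannot be
estimated reliably.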
Regards
Petr
>
> Thanks in advance.