[R] Help RFM analysis in R (i want a code where i can define my own breaks instead of system defined breaks used in auto_RFM package)
Hemant Sain
hemantsain55 at gmail.com
Tue Oct 10 07:19:39 CEST 2017
Hello Jim,
I have converted all my variables to the data types used in your attached
example, including the date, and my dataset looks like this:
ID purchase date
1234 10.2 2017-02-18
3453 18.9 2017-03-22
7689 8 2017-03-24
But when I pass the data into the function, it gives me the same values
for every observation, i.e. r=2, f=2, m=2.
Also, which part of your code calculates the recency and frequency
scores? I mean, how does it determine how many times a user made a
purchase in the last 30 days, so that we can put that user into our own
defined category?
One more thing: it would be great if you could explain a little about the
finish date, because I am not able to understand what you mean by it.
Thanks
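
For reference, here is a minimal sketch of one way recency, 30-day frequency and monetary scores could be computed with user-defined breaks. It is not the qdrfm function quoted below. The toy data, the finish date of 2017-04-01 and all break values are assumptions made up for illustration; "finish" is taken here to mean the reference date from which recency is counted back.

```r
# Hypothetical toy data in the same layout as above (ID, purchase, date)
purchases <- data.frame(
  ID       = c(1234, 1234, 3453, 7689, 7689, 7689),
  purchase = c(10.2, 5.5, 18.9, 8, 12.4, 7.1),
  date     = as.Date(c("2017-02-18", "2017-03-20", "2017-03-22",
                       "2017-03-24", "2017-03-10", "2017-01-05"))
)

# Assumed reference ("finish") date: recency is counted back from this day
finish <- as.Date("2017-04-01")

# Days between each purchase and the finish date
purchases$days_ago <- as.numeric(finish - purchases$date)

# Recency: days since each customer's most recent purchase
recency <- tapply(purchases$days_ago, purchases$ID, min)

# Frequency: number of purchases in the 30 days before the finish date
frequency30 <- tapply(purchases$days_ago <= 30, purchases$ID, sum)

# Monetary: total amount spent per customer
monetary <- tapply(purchases$purchase, purchases$ID, sum)

# User-defined breaks (assumed values). They must span the whole data range,
# because cut() returns NA for any value outside the outermost breaks.
rbreaks <- c(0, 10, 30, Inf)
fbreaks <- c(-Inf, 0, 1, Inf)
mbreaks <- c(0, 10, 20, Inf)

scores <- data.frame(
  ID     = names(recency),
  rscore = cut(recency, breaks = rbreaks, labels = FALSE),
  fscore = cut(frequency30, breaks = fbreaks, labels = FALSE),
  mscore = cut(monetary, breaks = mbreaks, labels = FALSE)
)
print(scores)

# A value outside the breaks is scored NA
out_of_range <- cut(50, breaks = c(0, 10, 30), labels = FALSE)
```

Using Inf and -Inf as the outer breaks is one way to make sure no customer falls outside the defined categories. Note that in this sketch a lower rscore means a more recent purchase.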
On 10 October 2017 at 02:27, Jim Lemon <drjimlemon at gmail.com> wrote:
> I seriously doubt that you are running the code I sent. What you have
> probably done is to run your data, which has a different date format,
> without changing the breaks or the date format arguments. As you
> haven't provided any example that shows what you are doing, I can't
> guess what the problem is.
>
> Jim
>
>
> On Mon, Oct 9, 2017 at 9:40 PM, Hemant Sain <hemantsain55 at gmail.com>
> wrote:
> > I'm getting NA for Cscore in every row, and most of the observations
> > in R, F and M are also NA.
> > What could be the reason for this? Please also suggest an appropriate
> > solution.
> >
> > On 9 October 2017 at 15:51, Jim Lemon <drjimlemon at gmail.com> wrote:
> >>
> >> Hi Hemant,
> >> Here is an example that might answer your questions. Please don't run
> >> previous code as it might not work.
> >>
> >> I define the break values as arguments to the function
> >> (rbreaks, fbreaks, mbreaks). If you want the breaks to work, make sure
> >> that they cover the range of the input values, otherwise you get NAs.
> >>
> >> # expects a three (or more) column data frame where
> >> # column 1 is customer ID, column 2 is amount of purchase
> >> # and column 3 is date of purchase
> >> qdrfm<-function(x,rbreaks=3,fbreaks=3,mbreaks=3,date.format="%Y-%m-%d",
> >> weights=c(1,1,1),finish=NA) {
> >>
> >> # if no finish date is specified, use current date
> >> if(is.na(finish)) finish<-Sys.Date()
> >> # recency in days: time from each purchase to the finish date
> >> x$rscore<-as.numeric(finish-as.Date(x[,3],date.format))
> >> # sort the IDs so they line up with the by() and table() results below
> >> custIDs<-sort(unique(x[,1]))
> >> ncust<-length(custIDs)
> >> rfmout<-data.frame(custID=custIDs,rscore=rep(0,ncust),
> >> fscore=rep(0,ncust),mscore=rep(0,ncust))
> >> # bin days since each customer's most recent purchase (binned once only)
> >> rfmout$rscore<-cut(by(x$rscore,x[,1],min),breaks=rbreaks,labels=FALSE)
> >> rfmout$fscore<-cut(table(x[,1]),breaks=fbreaks,labels=FALSE)
> >> rfmout$mscore<-cut(by(x[,2],x[,1],sum),breaks=mbreaks,labels=FALSE)
> >> rfmout$cscore<-(weights[1]*rfmout$rscore+
> >> weights[2]*rfmout$fscore+
> >> weights[3]*rfmout$mscore)/sum(weights)
> >> return(rfmout[order(rfmout$cscore),])
> >> }
> >>
> >> set.seed(12345)
> >> x2<-data.frame(ID=sample(1:50,250,TRUE),
> >> purchase=round(runif(250,5,100),2),
> >> date=paste(rep(2016,250),sample(1:12,250,TRUE),
> >> sample(1:28,250,TRUE),sep="-"))
> >>
> >> # example 1
> >> qdrfm(x2)
> >>
> >> # example 2
> >> qdrfm(x2,rbreaks=c(0,200,400),fbreaks=c(0,5,10),mbreaks=c(0,350,700),
> >> finish=as.Date("2017-01-01"))
> >>
> >> Jim
> >>
> >
> >
> >
> > --
> > hemantsain.com
>
--
hemantsain.com