[R] Link prediction in social network with R
kjetilbrinchmannhalvorsen at gmail.com
Wed Dec 22 20:54:40 CET 2010
You could start by having a look at CRAN packages like sna or statnet,
or search CRAN for "network" and you will find a lot of packages!
On Wed, Dec 22, 2010 at 12:00 AM, EU JIN LOK <ejlok1 at hotmail.com> wrote:
> Dear R users
> I'm a novice user of R and have absolutely no prior knowledge of social network analysis, so apologies if my question is trivial. I've spent a lot of time trying to solve this on my own without success, so I hope someone here can help me out. Cheers!
> The dataset:
> I'm trying to predict the existence of links (True or False) in a test set using a training set. Both data sets are in "edgelist" format: each row holds two User IDs (nodes), with the link directed from the 1st column to the 2nd column (see figure 1 below). Using the AUC to evaluate performance, I am looking for the best algorithm to predict the existence of links in the test data (50% are true and the rest are false).
> Figure 1:
> Vertices: 1133143
> Edges: 999
> Directed: TRUE
>  105 -> 850956
>  105 -> 1073420
>  105 -> 1102667
>  165 -> 888346
>  165 -> 579649
>  165 -> 136665
> I'm having problems obtaining probability scores for the links / edges, as most of the scores I can find are for the nodes. Examples of this are the graph.knn and page.rank functions in igraph.
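One way to get a score per link rather than per node is to compute a neighbourhood-overlap measure for each candidate pair of users. Below is a minimal base-R sketch of that idea, not a definitive implementation. The toy edge list, the helper names `neighbours` and `pair_scores`, and the choice to ignore edge direction when building neighbourhoods are all my own assumptions for illustration. It scores only the pairs you ask for, which matters with ~1.1M vertices, where a full pair-by-pair matrix would not fit in memory.

```r
# Toy directed edge list (made-up IDs, not the poster's data)
edges <- data.frame(
  from = c(1, 1, 2, 2, 3, 4),
  to   = c(3, 4, 3, 4, 5, 5)
)

# Assumption: ignore direction when collecting neighbours
und  <- unique(rbind(edges, data.frame(from = edges$to, to = edges$from)))
nbrs <- split(und$to, und$from)   # named list: node id -> neighbour ids

# Neighbours of node v (empty vector if v never appears)
neighbours <- function(v) {
  out <- nbrs[[as.character(v)]]
  if (is.null(out)) numeric(0) else out
}

# Edge-level scores for one candidate pair (u, v), assuming u != v
pair_scores <- function(u, v) {
  nu <- neighbours(u)
  nv <- neighbours(v)
  common  <- intersect(nu, nv)
  n_union <- length(union(nu, nv))
  # Degree of each common neighbour (at least 2, since it touches u and v)
  deg <- vapply(common, function(z) length(neighbours(z)), numeric(1))
  c(common_neighbours = length(common),
    jaccard           = if (n_union > 0) length(common) / n_union else 0,
    adamic_adar       = sum(1 / log(deg)))
}

# Score a few candidate links: one row per pair, one column per measure
candidates <- data.frame(from = c(1, 1), to = c(2, 3))
features   <- t(mapply(pair_scores, candidates$from, candidates$to))
print(cbind(candidates, features))
```

Each row of `features` belongs to a link, so any single column can be used directly as a ranking score for the AUC, or the columns can be combined as inputs to a classifier.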
> So my questions are:
> 1) What do I need to do to obtain the scores for the links instead of the nodes (I presume there is a data preparation step that I am missing)?
> 2) Which R package would be the best for running the various techniques: Jaccard index, Adamic-Adar, common neighbours, PropFlow, etc.?
> 3) How to implement a supervised learning method such as random forest (I am guessing I need to obtain a feature list but again, how can I get the scores for the edges)?
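For the supervised route, the usual shape is one row per candidate link, with the edge-level scores as columns and a 0/1 label saying whether the link exists. Here is a hedged sketch of that workflow in base R. The feature values are simulated (made-up data, not derived from the poster's graph), the `auc` helper is my own name, and I have swapped in logistic regression via `glm` as a stand-in for random forest so that the example needs no extra packages. A random forest from the randomForest package could be fitted on the same data frame instead.

```r
# Rank-based AUC (Mann-Whitney form): probability that a randomly chosen
# true link gets a higher score than a randomly chosen false one
auc <- function(score, label) {
  r     <- rank(score)              # ties receive average ranks
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

set.seed(1)
n     <- 200
label <- rep(c(1, 0), each = n / 2)

# Simulated edge features: true links tend to score higher (assumption)
pairs <- data.frame(
  label             = label,
  jaccard           = runif(n) * ifelse(label == 1, 1, 0.6),
  common_neighbours = rpois(n, ifelse(label == 1, 3, 1))
)

# Hold out a quarter of the labelled pairs for evaluation
train_idx <- sample(n, 150)
train     <- pairs[train_idx, ]
test      <- pairs[-train_idx, ]

# Logistic regression as a simple stand-in for random forest
fit  <- glm(label ~ jaccard + common_neighbours, data = train,
            family = binomial)
prob <- predict(fit, newdata = test, type = "response")

test_auc <- auc(prob, test$label)
print(test_auc)
```

The predicted probabilities in `prob` are the per-link scores asked about in question 1, and the AUC is computed on held-out pairs so it is not inflated by fitting and evaluating on the same rows.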
> I hope I've explained my questions well, but do let me know if more clarification is needed.
> Thanks in advance
> Eu Jin