## Item-based k-nearest neighbors in rrecsys

Given a target user and her positively (i.e., above a pre-defined threshold) rated items, the algorithm relies on the items' similarities for the formation of a neighborhood of nearest items.

The choice of the k nearest neighbors for the neighborhood formation results in a tradeoff:
a very small k leads to few candidate items that can be recommended because there are not a lot of neighbors to support the predictions. In contrast, a very large k impacts precision as the particularities of user's preferences can be blunted due to the large neighborhood size. In most related works k has been set to be in the range of values from 10 to 100, where the optimum k also depends on data characteristics such as sparsity.

The similarity is measured based on two algorithms: cosine(simFunct ='cos') and adjusted cosine(simFunct = 'adjCos').

For the Rating Prediction task, to train a model with this algorithm, it is required to define an additional argument, *neigh* the neighborhood size.

```
data("ml100k")
d <- defineData(ml100k)
e <- evalModel(d, folds = 2)
ib_model_res <- evalPred(e, "ibknn", simFunct = "cos", neigh = 10)
ib_model_res
```

For the Item Recommendation task, to provide item recommendations, it is required to define two additional arguments, *positiveThreshold* the threshold for “positive” ratings, and the *topN* the number of recommended items.

```
data("ml100k")
d <- defineData(ml100k)
e <- evalModel(d, folds = 2)
ib_model_res <- evalRec(e, "ibknn", simFunct = "cos", neigh = 10, positiveThreshold = 3, topN = 3)
ib_model_res
```

The *neigh* default value is 10.
The *positiveThreshold* default value is 3.
The *topN* default value is 3.

The returned object is of type *IBclass*.

To get more details about the slots read the reference manual.