# Main functions

Functions are coded for easy read and easy understand. * lnmbiclust * lnmfa * plnmfa

1 *lnmbiclust* is the main function that perform our
algorithm, which includes default initial values, main estimations as
well as model selection. For illustration, we will generate a simulation
data from model “GUU” as follows:

```
set.seed(123)
<- 40
n <- rmultinom(n,1,c(0.6,0.4))
simp <- as.factor(apply(t(simp),1,which.max))
lab
#parameter comes from multinomial
<- 11
p <- c(-2.8,-1.3,-1.6,-3.9,-2.6,-2.9,-2.5,-2.7,-3.1,-2.9)
mu1 <- matrix(c(1,0,0,1,0,0,0,0,0,0,0,1,1,0,1,0,1,1,1,1,0,0,0,0,0,1,0,0,0,0),nrow = p-1)
B1 <- diag(c(2.9,0.5,1))
T1 <- diag(c(0.52, 1.53, 0.56, 0.19, 1.32, 1.77, 0.6, 0.53, 0.37, 0.4))
D1 <- B1%*%T1%*%t(B1)+D1
cov1
<- c(1.5,-2.7,-1.1,-0.4,-1.4,-2.6,-3,-3.9,-2.7,-3)
mu2 <- matrix(c(1,0,0,1,0,0,0,0,0,0,0,1,1,0,1,0,1,1,1,1,0,0,0,0,0,1,0,0,0,0),nrow = p-1)
B2 <- diag(c(0.2,0.003,0.15))
T2 <- diag(c(0.01, 0.62, 0.45, 0.01, 0.37, 0.42, 0.08, 0.16, 0.23, 0.27))
D2 <- B2%*%T2%*%t(B2)+D2
cov2
<- matrix(0,nrow=n,ncol=p-1)
df for (i in 1:n) {
if(lab[i]==1){df[i,] <- rmvnorm(1,mu1,sigma = cov1)}
else if(lab[i]==2){df[i,] <- rmvnorm(1,mu2,sigma = cov2)}
}
<- cbind(df,0)
f_df <- exp(f_df)/rowSums(exp(f_df))
z
<- matrix(0,nrow=n,ncol=p)
W_count for (i in 1:n) {
<- rmultinom(1,runif(1,10000,20000),z[i,])
W_count[i,]
}
```

After generated data, we can strat to try to fit one model:

```
<- 2 #define the number of components
range_G <- 2 #define the possible number of bicluster.
range_Q <- "GUU" #select the model you want to fit
cov_str #It will fit GUU model with G=2, Q=c(2,2)
<- lnmbiclust(W_count=W_count, range_G=range_G, range_Q=range_Q, model=cov_str)
res #where res will be a list contain all parameters.
```

Notice the default setting is to run all 16 models if parameter
*model* is missing. There are 3 criteria you can choose: AIC,
BIC(default) and ICL.

If you don’t want to fit a model with specific *G* or
*Q*, the function can do model selection based on criteria you
choose. The output will contain two lists in *res*, one is the
paramters of the best model selected by BIC(AIC or ICL), the other one
is a dataframe of model names along with AIC, BIC and ICL values for all
models that have ran.

```
<- c(2:3)
range_G <- c(2:3)
range_Q <- c("UUU", "GGC")
cov_str <- lnmbiclust(W_count=W_count, range_G=range_G, range_Q=range_Q, model=cov_str, criteria="BIC")
res =res$best_model best_model
```

it will run G=2, Q_g=c(2,2); G=3, Q_g=c(2,2,2); G=2, Q_g=c(3,3);G=3, Q_g=c(3,3,3). In total 4 models for each UUU and GGC, then select the best one based on BIC. If you want to include permutations in UUU, UUG, UUD or UUC:

```
<- 2
range_G <- c(2:3)
range_Q <- "UUU"
cov_str <- lnmbiclust(W_count=W_count, range_G=range_G, range_Q=range_Q, model=cov_str, criteria="BIC",permutation=TRUE)
res $best_model res
```

it will run G=2, Q_g=c(2,2); G=2, Q_g=c(3,3); G=2, Q_g=c(2,3);G=2, Q_g=c(3,2). In total 4 models for UUU.

Sometimes you may want to know more detail about models that ran. For
example, if the BIC values are very close between two models, then they
may equally good. Only choose the best one with highest BIC is not fair.
Here we have output *all_fitted_model* under *lnmbiclust*,
which gives the output of all models you have ran and model selection
criteria.

`$all_fitted_model res`

It will return a dataframe with all combinations of G and Q for all models you have included, decreasing ordered as the criteria you specified, default is ordered by BIC.

2 *lnmfa* The usage is exactly the same as
*lnmbiclust*. Except it doesn’t have parameter permutation. The
only difference would be the model name. Since in this model, the
*T* is fix as identity matrix, the middle position will also
fixed as U. So all the 8 models will be: UUU, UUG, UUD, UUC, GUU, GUG,
GUD, GUC.

```
<- c(2:3)
range_G <- c(2:3)
range_Q <- c("UUU", "GUC")
cov_str <- lnmfa(W_count=W_count, range_G=range_G, range_Q=range_Q, model=cov_str, criteria="BIC")
res =res$best_model
best_model=res$all_fitted_model model_output
```

3 *plnmfa* The usage is exactly the same as *lnmfa*,
with additional tunning parameters. In here, *range_Q* need to be
specified by a number instead of a range. The *range_tuning*
could be a range of number which is between 0 and 1. And it doesn’t
allow model selections between the two model, so you have to specify the
model name between UUU and GUU.

```
<- c(2:3)
range_G =seq(0.5,0.7,length.out=10)
range_tunning<- 2
range_Q <- "UUU"
cov_str <- plnmfa(W_count=W_count, range_G=range_G, range_Q=range_Q, model=cov_str, criteria="BIC",
res range_tuning = range_tuning)
=res$best_model
best_model=res$all_fitted_model model_output
```