# A short introduction to TensorPreAve

library(TensorPreAve)

## Data of tensor time series

This package provides function to estimate the rank and factor loading spaces of time series tensor factor models. A $$K$$-th order tensor time series can be stored as a ‘Tensor’ object with $$K+1$$ modes, with mode-1 corresponds to time. See for the operations of ‘Tensor’ objects. As an example, value weighted Fama-French portfolio returns data formed on 10 levels of size and 10 levels of operating profitability can be shaped as a tensor time series, with monthly data from July 1973 to June 2021 so that T = 576. (See Weilin Chen (2022) for more details.) The data tensor thus has size 576 × 10 × 10 tensor, and can be stored in the following ‘Tensor’ object.

value_weight_tensor@modes
#>  576  10  10

We also provide a function ‘tensor_data_gen’ to generate random tensor time series based on econometric factor model assumptions. (See Weilin Chen (2022) for more details on the assumptions.)

# Setting tensor dimensions and parameters related to data generation
K = 2                     # The number of modes for the tensor time series
n = 100                   # Length of time series.
d = c(10,10)              # Dimensions of each mode of the tensor
r = c(2,2)                # Rank of core tensors,
re = 10                   # Dimension of the cross-sectional common error
eta = list(c(0,0),c(0,0)) # Control factor strengths in each factor loading matrix
u = list(c(-2,2),c(-2,2)) # Control the range of elements in each factor loading matrix

set.seed(10)
Data_test = tensor_data_gen(K,n,d,r,re,eta,u)

X = Data_test$X A = Data_test$A
F_ts = Data_test$F_ts E_ts = Data_test$E_ts

X@modes
#>  100  10  10
F_ts@modes
#>  100   2   2
E_ts@modes
#>  100  10  10
dim(A[])
#>  10  2

## Pre-averaging method to estimate the strongest factor

Given a time series tensor factor model, ‘pre_est’ estimates the factor loading matrices by setting $$z$$ to be the rank of core tensors if it is known.

X = value_weight_tensor
set.seed(10)
Q_PRE_2 = pre_est(X, z = c(2,2))
Q_PRE_2
#> []
#>               [,1]        [,2]
#>  [1,]  0.899541445  0.11899170
#>  [2,]  0.116345254  0.45020724
#>  [3,] -0.003945559  0.55331589
#>  [4,] -0.067410116  0.43390894
#>  [5,] -0.131462950  0.35249401
#>  [6,] -0.172117014  0.28424491
#>  [7,] -0.166848184  0.22955540
#>  [8,] -0.194329983  0.07537761
#>  [9,] -0.187656164  0.02461164
#> [10,] -0.158130363 -0.15712601
#>
#> []
#>              [,1]        [,2]
#>  [1,] -0.49323432  0.65960017
#>  [2,] -0.39909521 -0.03918812
#>  [3,] -0.23058147 -0.28040576
#>  [4,] -0.14495219 -0.35494734
#>  [5,] -0.04484750 -0.35694299
#>  [6,]  0.08343645 -0.24838616
#>  [7,]  0.18881603 -0.08997255
#>  [8,]  0.28671500 -0.06472708
#>  [9,]  0.37141027  0.10650837
#> [10,]  0.50841760  0.38215735

For most cases we do not know the true number of factors. In Weilin Chen (2022), pre-averaging procedure is used to estimate the directions corresponding to the strongest factors by setting $$z$$ to be a vector of ones of length $$K$$, which is also the default value in the function.

set.seed(10)
Q_PRE_1 = pre_est(X)
Q_PRE_1
#> []
#>               [,1]
#>  [1,]  0.899541445
#>  [2,]  0.116345254
#>  [3,] -0.003945559
#>  [4,] -0.067410116
#>  [5,] -0.131462950
#>  [6,] -0.172117014
#>  [7,] -0.166848184
#>  [8,] -0.194329983
#>  [9,] -0.187656164
#> [10,] -0.158130363
#>
#> []
#>              [,1]
#>  [1,] -0.49323432
#>  [2,] -0.39909521
#>  [3,] -0.23058147
#>  [4,] -0.14495219
#>  [5,] -0.04484750
#>  [6,]  0.08343645
#>  [7,]  0.18881603
#>  [8,]  0.28671500
#>  [9,]  0.37141027
#> [10,]  0.50841760

## Re-estimation by iterative projection

Running ‘pre_est’ gives us the estimated directions corresponding to the strongest factors. Using these directions as the initial directions, we can perform iterative projection to better estimate the factor loading spaces, which can be computed by the function ‘iter_proj’. Similarly, we can set $$z$$ to be the number of factors if it is known.

set.seed(10)
Q_PROJ_2 = iter_proj(X, initial_direction = Q_PRE_1, z = c(2,2))
Q_PROJ_2
#> []
#>             [,1]        [,2]
#>  [1,] -0.4186526  0.81958342
#>  [2,] -0.3034959  0.15608438
#>  [3,] -0.2897870  0.03418513
#>  [4,] -0.2822386 -0.02447667
#>  [5,] -0.2884712 -0.06993895
#>  [6,] -0.2819975 -0.14283911
#>  [7,] -0.3100956 -0.25977356
#>  [8,] -0.2982880 -0.19637954
#>  [9,] -0.3101129 -0.25107977
#> [10,] -0.3534712 -0.32828374
#>
#> []
#>             [,1]        [,2]
#>  [1,] -0.2310763 -0.20931155
#>  [2,] -0.2982651 -0.17583427
#>  [3,] -0.3100153 -0.20709533
#>  [4,] -0.3272582 -0.09977626
#>  [5,] -0.3673419 -0.13197027
#>  [6,] -0.3693822 -0.03268300
#>  [7,] -0.3526410 -0.05158860
#>  [8,] -0.3380189  0.05593774
#>  [9,] -0.3091934  0.18851720
#> [10,] -0.2209928  0.90145090

Weilin Chen (2022) use the final estimated directions to further estimate the rank of core tensors. Thus, the default value for $$z$$ is a vector of ones of length $$K$$.

set.seed(10)
Q_PROJ_1 = iter_proj(X, initial_direction = Q_PRE_1)
Q_PROJ_1
#> []
#>   -0.4186526 -0.3034959 -0.2897870 -0.2822386 -0.2884712 -0.2819975
#>   -0.3100956 -0.2982880 -0.3101129 -0.3534712
#>
#> []
#>   -0.2310763 -0.2982651 -0.3100153 -0.3272582 -0.3673419 -0.3693822
#>   -0.3526410 -0.3380189 -0.3091934 -0.2209928

## Core tensor rank estimation using projected data

With the estimated directions produced by iterative projection, ‘bs_cor_rank’ is a function to estimate the rank of core tensors by Bootstrapped Correlation Thresholding.

set.seed(10)
bs_rank = bs_cor_rank(X, Q_PROJ_1)
bs_rank
#>  2 2

## The whole rank and factor loading estimations in a single step

With the help of all above functions, given a tensor time series with unknown rank of core tensor, we can actually estimate the rank and factor loading matrices simultaneously. This can be done by the function ‘rank_factors_est’.

set.seed(10)
results = rank_factors_est(X)
results
#> $rank #>  2 2 #> #>$loadings
#> $loadings[] #> [,1] [,2] #> [1,] -0.4186526 0.81958342 #> [2,] -0.3034959 0.15608438 #> [3,] -0.2897870 0.03418513 #> [4,] -0.2822386 -0.02447667 #> [5,] -0.2884712 -0.06993895 #> [6,] -0.2819975 -0.14283911 #> [7,] -0.3100956 -0.25977356 #> [8,] -0.2982880 -0.19637954 #> [9,] -0.3101129 -0.25107977 #> [10,] -0.3534712 -0.32828374 #> #>$loadings[]
#>             [,1]        [,2]
#>  [1,] -0.2310763 -0.20931155
#>  [2,] -0.2982651 -0.17583427
#>  [3,] -0.3100153 -0.20709533
#>  [4,] -0.3272582 -0.09977626
#>  [5,] -0.3673419 -0.13197027
#>  [6,] -0.3693822 -0.03268300
#>  [7,] -0.3526410 -0.05158860
#>  [8,] -0.3380189  0.05593774
#>  [9,] -0.3091934  0.18851720
#> [10,] -0.2209928  0.90145090

Alternatively, if the rank of core tensor is already known, we can set it as , then the function outputs the estimator of factor loading matrices accordingly.

set.seed(10)
results = rank_factors_est(X, input_r = c(2,2))
results
#> $rank #>  2 2 #> #>$loadings
#> $loadings[] #> [,1] [,2] #> [1,] -0.4186526 0.81958342 #> [2,] -0.3034959 0.15608438 #> [3,] -0.2897870 0.03418513 #> [4,] -0.2822386 -0.02447667 #> [5,] -0.2884712 -0.06993895 #> [6,] -0.2819975 -0.14283911 #> [7,] -0.3100956 -0.25977356 #> [8,] -0.2982880 -0.19637954 #> [9,] -0.3101129 -0.25107977 #> [10,] -0.3534712 -0.32828374 #> #>$loadings[]
#>             [,1]        [,2]
#>  [1,] -0.2310763 -0.20931155
#>  [2,] -0.2982651 -0.17583427
#>  [3,] -0.3100153 -0.20709533
#>  [4,] -0.3272582 -0.09977626
#>  [5,] -0.3673419 -0.13197027
#>  [6,] -0.3693822 -0.03268300
#>  [7,] -0.3526410 -0.05158860
#>  [8,] -0.3380189  0.05593774
#>  [9,] -0.3091934  0.18851720
#> [10,] -0.2209928  0.90145090