Package 'VLTimeCausality' reference manual

Title:	Variable-Lag Time Series Causality Inference Framework
Description:	A framework to infer causality on a pair of time series of real numbers based on variable-lag Granger causality and transfer entropy. Typically, Granger causality and transfer entropy have an assumption of a fixed and constant time delay between the cause and effect. However, for a non-stationary time series, this assumption is not true. For example, considering two time series of velocity of person A and person B where B follows A. At some time, B stops tying his shoes, then running to catch up A. The fixed-lag assumption is not true in this case. We propose a framework that allows variable-lags between cause and effect in Granger causality and transfer entropy to allow them to deal with variable-lag non-stationary time series. Please see Chainarong Amornbunchornvej, Elena Zheleva, and Tanya Berger-Wolf (2021) <doi:10.1145/3441452> when referring to this package in publications.
Authors:	Chainarong Amornbunchornvej [aut, cre]
Maintainer:	Chainarong Amornbunchornvej <[email protected]>
License:	GPL-3
Version:	0.1.5
Built:	2025-03-01 03:35:37 UTC
Source:	https://github.com/darkeyes/vltimeseriescausality

checkMultipleSimulationVLtimeseries

Description

checkMultipleSimulationVLtimeseries is a support function that can compare two adjacency matrices: groundtruth and inferred matrices. It re

Usage

checkMultipleSimulationVLtimeseries(trueAdjMat, adjMat)
checkMultipleSimulationVLtimeseries(trueAdjMat, adjMat)

Arguments

`trueAdjMat`	a groundtruth matrix.
`adjMat`	an inferred matrix.

Value

This function returns a list of precision prec, recall rec, and F1 score F1 of inferred vs. groundtruth matrices.

Examples

## Generate simulation data
#G<-matrix(FALSE,10,10) # groundtruth
#G[1,c(4,7,8,10)]<-TRUE
#G[2,c(5,7,9,10)]<-TRUE
#G[3,c(6,8,9,10)]<-TRUE
#TS <- MultipleSimulationVLtimeseries()
#out<-multipleVLGrangerFunc(TS)
#checkMultipleSimulationVLtimeseries(trueAdjMat=G,adjMat=out$adjMat)

## Generate simulation data
#G<-matrix(FALSE,10,10) # groundtruth
#G[1,c(4,7,8,10)]<-TRUE
#G[2,c(5,7,9,10)]<-TRUE
#G[3,c(6,8,9,10)]<-TRUE
#TS <- MultipleSimulationVLtimeseries()
#out<-multipleVLGrangerFunc(TS)
#checkMultipleSimulationVLtimeseries(trueAdjMat=G,adjMat=out$adjMat)

followingRelation

Description

followingRelation is a function that infers whether Y follows X.

Usage

followingRelation(Y, X, timeLagWindow, lagWindow = 0.2)
followingRelation(Y, X, timeLagWindow, lagWindow = 0.2)

Arguments

`Y`	is a numerical time series of a follower
`X`	is a numerical time series of a leader
`timeLagWindow`	is a maximum possible time delay in the term of time steps.
`lagWindow`	is a maximum possible time delay in the term of percentage of length(X). If `timeLagWindow` is missing, then `timeLagWindow=ceiling(lagWindow*length(X))`. The default is 0.2.

Value

This function returns a list of following relation variables below.

`follVal`	is a following-relation value s.t. if `follVal` is positive, then `Y` follows `X`. If `follVal` is negative, then `X` follows `Y`. Otherwise, if `follVal` is zero, there is no following relation between `X,Y`.
`nX`	is a time series that is rearranged from `X` by applying the lags `optIndexVec` in order to imitate `Y`.
`optDelay`	is the optimal time delay inferred by cross-correlation of `X,Y`. It is positive if `Y` is simply just a time-shift of `X` (e.g. `Y[t]=X[t-optDelay]`).
`optCor`	is the optimal correlation of `Y[t]=X[t-optDelay]` for all `t`.
`optIndexVec`	is a time series of optimal warping-path from DTW that is corrected by cross correlation. It is approximately that `Y[t]=X[t-optIndexVec[t]]`).
`VLval`	is a percentage of elements in `optIndexVec` that is not equal to `optDelay`.
`ccfout`	is an output object of `ccf` function.

Examples

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
out<-followingRelation(Y=TS$Y,X=TS$X)

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
out<-followingRelation(Y=TS$Y,X=TS$X)

GrangerFunc

Description

GrangerFunc is a Granger Causality function. It tests whether X Granger-causes Y.

Usage

GrangerFunc(
  Y,
  X,
  maxLag = 1,
  alpha = 0.05,
  autoLagflag = TRUE,
  gamma = 0.5,
  family = gaussian
)
GrangerFunc(
  Y,
  X,
  maxLag = 1,
  alpha = 0.05,
  autoLagflag = TRUE,
  gamma = 0.5,
  family = gaussian
)

Arguments

`Y`	is a numerical time series of effect
`X`	is a numerical time series of cause
`maxLag`	is a maximum possible time delay. The default is 1.
`alpha`	is a significance level of F-test to determine whether `X` Granger-causes `Y`. The default is 0.05.
`autoLagflag`	is a flag for enabling the automatic lag inference function. The default is true. If it is set to be true, then maxLag is set automatically using cross-correlation. Otherwise, if it is set to be false, then the function takes the maxLag value to infer Granger causality.
`gamma`	is a parameter to determine whether `X` Granger-causes `Y` using BIC difference ratio.
`family`	is a parameter of family of function for Generalized Linear Models function (glm). The default is `gaussian`.

Value

This function returns of whether X Granger-causes Y.

`ftest`	F-statistic of Granger causality.
`p.val`	A p-value from F-test.
`BIC_H0`	Bayesian Information Criterion (BIC) derived from `Y` regressing on `Y` past.
`BIC_H1`	Bayesian Information Criterion (BIC) derived from `Y` regressing on `Y`,`X` past.
`XgCsY`	The flag is true if `X` Granger-causes `Y` using BIC difference ratio where `BICDiffRatio >= gamma`.
`XgCsY_ftest`	The flag is true if `X` Granger-causes `Y` using F-test where `p.val>=alpha`.
`XgCsY_BIC`	The flag is true if `X` Granger-causes `Y` using BIC where `BIC_H0>=BIC_H1`.
`maxLag`	A maximum possible time delay.
`H0`	glm object of `Y` regressing on `Y` past.
`H1`	glm object of `Y` regressing on `Y,X` past.
`BICDiffRatio`	Bayesian Information Criterion difference ratio: `(BIC_H0-BIC_H1)/BIC_H0`.

Examples

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
out<-GrangerFunc(Y=TS$Y,X=TS$X)

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
out<-GrangerFunc(Y=TS$Y,X=TS$X)

MultipleSimulationVLtimeseries

Description

MultipleSimulationVLtimeseries is a support function for generating a set of time series TS[,1],...TS[,10]. TS[,1],TS[,2],TS[,3] are causes X time series that are generated independently. The rest of time series are Y time series that are effects of some causes TS[,1],TS[,2],TS[,3]. TS[,1] causes TS[,4],TS[,7],TS[,8], and TS[,10]. TS[,2] causes TS[,5],TS[,7],TS[,9], and TS[,10]. TS[,3] causes TS[,6],TS[,8],TS[,9], and TS[,10].

Usage

MultipleSimulationVLtimeseries(
  n = 200,
  lag = 5,
  YstFixInx = 110,
  YfnFixInx = 170,
  XpointFixInx = 100,
  arimaFlag = TRUE,
  seedVal = -1
)
MultipleSimulationVLtimeseries(
  n = 200,
  lag = 5,
  YstFixInx = 110,
  YfnFixInx = 170,
  XpointFixInx = 100,
  arimaFlag = TRUE,
  seedVal = -1
)

Arguments

`n`	is length of time series.
`lag`	is a time lag between `X` and `Y` s.t. `Y[t]` is approximately `X[t-lag]`.
`YstFixInx`	is the starting point of variable lag part.
`YfnFixInx`	is the end point of variable lag part.
`XpointFixInx`	is a point in X s.t. `Y[YstFixInx:YfnFixInx]= X[XpointFixInx]` .
`arimaFlag`	is ARMA model flag. If it is true, then `X` is generated by ARMA model. If it is false, then `X` is generated by sampling of the standard normal distribution.
`seedVal`	is a seed parameter for generating random noise.

Value

This function returns a list of time series TS.

Examples

# Generate simulation data
TS <- MultipleSimulationVLtimeseries()

# Generate simulation data
TS <- MultipleSimulationVLtimeseries()

multipleVLGrangerFunc

Description

multipleVLGrangerFunc is a function that infers Variable-lag Granger Causality of all pairwises of m time series TS[,1],...TS[,m].

Usage

multipleVLGrangerFunc(
  TS,
  maxLag,
  alpha = 0.05,
  gamma = 0.3,
  autoLagflag = TRUE,
  causalFlag = 0,
  VLflag = TRUE,
  family = gaussian
)
multipleVLGrangerFunc(
  TS,
  maxLag,
  alpha = 0.05,
  gamma = 0.3,
  autoLagflag = TRUE,
  causalFlag = 0,
  VLflag = TRUE,
  family = gaussian
)

Arguments

`TS`	is a numerical time series of effect where `TS[t,k]` is an element at time `t` of `k`th time series.
`maxLag`	is a maximum possible time delay. The default is 0.2*length(Y).
`alpha`	is a significance level of F-test to determine whether `X` Granger-causes `Y`. The default is 0.05.
`gamma`	is a parameter to determine whether `X` Granger-causes `Y` using BIC difference ratio. The default is 0.3.
`autoLagflag`	is a flag for enabling the automatic lag inference function. The default is true. If it is set to be true, then maxLag is set automatically using cross-correlation. Otherwise, if it is set to be false, then the function takes the maxLag value to infer Granger causality.
`causalFlag`	is a choice of criterion for inferring causality: `causalFlag=0` for BIC difference ratio, `causalFlag=1` for f-test, or `causalFlag=2` for BIC.
`VLflag`	is a flag of Granger causality choice: either `VLflag=TRUE` for VL-Granger or `VLflag=FALSE` for Granger causality.
`family`	is a parameter of family of function for Generalized Linear Models function (glm). The default is `gaussian`.

Value

This function returns of a list of an adjacency matrix of causality where adjMat[i,j] is true if TS[,i] causes TS[,j].

Examples

## Generate simulation data
#TS <- MultipleSimulationVLtimeseries()
## Run the function
#out<-multipleVLGrangerFunc(TS)


## Generate simulation data
#TS <- MultipleSimulationVLtimeseries()
## Run the function
#out<-multipleVLGrangerFunc(TS)

multipleVLTransferEntropy

Description

multipleVLTransferEntropy is a function that infers Variable-lag Transfer Entropy of all pairwises of m time series TS[,1],...TS[,m].

Usage

multipleVLTransferEntropy(
  TS,
  maxLag,
  nboot = 0,
  lx = 1,
  ly = 1,
  VLflag = TRUE,
  autoLagflag = TRUE,
  alpha = 0.05
)
multipleVLTransferEntropy(
  TS,
  maxLag,
  nboot = 0,
  lx = 1,
  ly = 1,
  VLflag = TRUE,
  autoLagflag = TRUE,
  alpha = 0.05
)

Arguments

`TS`	is a numerical time series of effect where `TS[t,k]` is an element at time `t` of `k`th time series.
`maxLag`	is a maximum possible time delay. The default is 0.2*length(Y).
`nboot`	is a number of times of bootstrapping for RTransferEntropy::transfer_entropy() function.
`lx`, `ly`	are lag parameters of RTransferEntropy::transfer_entropy().
`VLflag`	is a flag of Granger causality choice: either `VLflag=TRUE` for VL-Granger or `VLflag=FALSE` for Granger causality.
`autoLagflag`	is a flag for enabling the automatic lag inference function. The default is true. If it is set to be true, then maxLag is set automatically using cross-correlation. Otherwise, if it is set to be false, then the function takes the maxLag value to infer Granger causality.
`alpha`	is a significant-level threshold for TE bootstrapping by Dimpfl and Peter (2013).

Value

This function returns of a list of an adjacency matrix of causality where adjMat[i,j] is true if TS[,i] causes TS[,j].

Examples

## Generate simulation data
#out1<-SimpleSimulationVLtimeseries()
#TS<-cbind(out1$X,out1$Y)
## Run the function
#out2<-multipleVLTransferEntropy(TS,maxLag=1)


## Generate simulation data
#out1<-SimpleSimulationVLtimeseries()
#TS<-cbind(out1$X,out1$Y)
## Run the function
#out2<-multipleVLTransferEntropy(TS,maxLag=1)

plotTimeSeries

Description

plotTimeSeries is a function for visualizing time series

Usage

plotTimeSeries(X, Y, strTitle = "Time Series Plot", TSnames)
plotTimeSeries(X, Y, strTitle = "Time Series Plot", TSnames)

Arguments

`X`	is a 1st numerical time series
`Y`	is a 2nd numerical time series. If it is not supplied, the function plots only `X`.
`strTitle`	is a string of the plot title
`TSnames`	is a list of legend of `X,Y` where TSnames[1] is a legend of `X` and TSnames[2] is a legend of `Y`.

Value

This function returns an object of ggplot class.

Examples

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
plotTimeSeries(Y=TS$Y,X=TS$X)

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
plotTimeSeries(Y=TS$Y,X=TS$X)

SimpleSimulationVLtimeseries

Description

SimpleSimulationVLtimeseries is a support function for generating time series X,Y where X VL-Granger-causes Y.

Usage

SimpleSimulationVLtimeseries(
  n = 200,
  lag = 5,
  YstFixInx = 110,
  YfnFixInx = 170,
  XpointFixInx = 100,
  arimaFlag = TRUE,
  seedVal = -1,
  expflag = FALSE,
  causalFlag = TRUE
)
SimpleSimulationVLtimeseries(
  n = 200,
  lag = 5,
  YstFixInx = 110,
  YfnFixInx = 170,
  XpointFixInx = 100,
  arimaFlag = TRUE,
  seedVal = -1,
  expflag = FALSE,
  causalFlag = TRUE
)

Arguments

`n`	is length of time series.
`lag`	is a time lag between `X` and `Y` s.t. `Y[t]` is approximately `X[t-lag]`.
`YstFixInx`	is the starting point of variable lag part.
`YfnFixInx`	is the end point of variable lag part.
`XpointFixInx`	is a point in X s.t. `Y[YstFixInx:YfnFixInx]= X[XpointFixInx]` .
`arimaFlag`	is ARMA model flag. If it is true, then `X` is generated by ARMA model. If it is false, then `X` is generated by sampling of the standard normal distribution.
`seedVal`	is a seed parameter for generating random noise. If it is not -1, then the rnorm is set the random seed with `seedVal`.
`expflag`	is the flag to set the relation between `Y[i+lag]` and `X[i]`. If it is false `Y,X` has a linear relation, otherwise, they have an exponential relation.
`causalFlag`	is a flag. If it is true, then `X` causes `Y`. Otherwise, `X,Y` have no causal relation.

Value

This function returns a list of time series X,Y where X VL-Granger-causes Y.

Examples

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()

TSNANNearestNeighborPropagation

Description

TSNANNearestNeighborPropagation is a function that fills NA values with nearest real values in the past ( or future if the first position of time series is NA), for time series X.

Usage

TSNANNearestNeighborPropagation(X)
TSNANNearestNeighborPropagation(X)

Arguments

`X`	is a T-by-D matrix numerical time series

Value

This function returns a list of following relation variables below.

Xout

is a T-by-D matrix numerical time series that all NAN have been filled with nearest real values.

Examples

# Load example data

z<-1:20
z[2:5]<-NA
z<-TSNANNearestNeighborPropagation(z)

# Load example data

z<-1:20
z[2:5]<-NA
z<-TSNANNearestNeighborPropagation(z)

VLGrangerFunc

Description

VLGrangerFunc is a Variable-lag Granger Causality function. It tests whether X VL-Granger-causes Y.

Usage

VLGrangerFunc(
  Y,
  X,
  alpha = 0.05,
  maxLag,
  gamma = 0.5,
  autoLagflag = TRUE,
  family = gaussian
)
VLGrangerFunc(
  Y,
  X,
  alpha = 0.05,
  maxLag,
  gamma = 0.5,
  autoLagflag = TRUE,
  family = gaussian
)

Arguments

`Y`	is a numerical time series of effect
`X`	is a numerical time series of cause
`alpha`	is a significance level of f-test to determine whether `X` Granger-causes `Y`. The default is 0.05.
`maxLag`	is a maximum possible time delay. The default is 0.2*length(Y).
`gamma`	is a parameter to determine whether `X` Granger-causes `Y` using BIC difference ratio. The default is 0.5.
`autoLagflag`	is a flag for enabling the automatic lag inference function. The default is true. If it is set to be true, then maxLag is set automatically using cross-correlation. Otherwise, if it is set to be false, then the function takes the maxLag value to infer Granger causality.
`family`	is a parameter of family of function for Generalized Linear Models function (glm). The default is `gaussian`.

Value

This function returns of whether X Granger-causes Y.

`ftest`	F-statistic of Granger causality.
`p.val`	A p-value from F-test.
`BIC_H0`	Bayesian Information Criterion (BIC) derived from `Y` regressing on `Y` past.
`BIC_H1`	Bayesian Information Criterion (BIC) derived from `Y` regressing on `Y`,`X` past.
`XgCsY`	The flag is true if `X` Granger-causes `Y` using BIC difference ratio where `BICDiffRatio >= gamma`.
`XgCsY_ftest`	The flag is true if `X` Granger-causes `Y` using f-test where `p.val>=alpha`.
`XgCsY_BIC`	The flag is true if `X` Granger-causes `Y` using BIC where `BIC_H0>=BIC_H1`.
`maxLag`	A maximum possible time delay.
`H0`	glm object of `Y` regressing on `Y` past.
`H1`	glm object of `Y` regressing on `Y,X` past.
`follOut`	is a list of variables from function `followingRelation`.
`BICDiffRatio`	Bayesian Information Criterion difference ratio: `(BIC_H0-BIC_H1)/BIC_H0`.

Examples

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
out<-VLGrangerFunc(Y=TS$Y,X=TS$X)

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
out<-VLGrangerFunc(Y=TS$Y,X=TS$X)

VLTransferEntropy

Description

VLTransferEntropy is a Variable-lag Transfer Entropy function. It tests whether X VL-Transfer-Entropy-causes Y.

Usage

VLTransferEntropy(
  Y,
  X,
  maxLag,
  nboot = 0,
  lx = 1,
  ly = 1,
  VLflag = TRUE,
  autoLagflag = TRUE,
  alpha = 0.05
)
VLTransferEntropy(
  Y,
  X,
  maxLag,
  nboot = 0,
  lx = 1,
  ly = 1,
  VLflag = TRUE,
  autoLagflag = TRUE,
  alpha = 0.05
)

Arguments

`Y`	is a numerical time series of effect
`X`	is a numerical time series of cause
`maxLag`	is a maximum possible time delay. The default is 0.2*length(Y).
`nboot`	is a number of times of bootstrapping for RTransferEntropy::transfer_entropy() function.
`lx`, `ly`	are lag parameters of RTransferEntropy::transfer_entropy().
`VLflag`	is a flag of Transfer Entropy choice: either `VLflag=TRUE` for VL-Transfer Entropy or `VLflag=FALSE` for Transfer Entropy.
`autoLagflag`	is a flag for enabling the automatic lag inference function. The default is true. If it is set to be true, then maxLag is set automatically using cross-correlation. Otherwise, if it is set to be false, then the function takes the maxLag value to infer Granger causality.
`alpha`	is a significant-level threshold for TE bootstrapping by Dimpfl and Peter (2013).

Value

This function returns of whether X (VL-)Transfer-Entropy-causes Y.

`TEratio`	is a Transfer Entropy ratio. If it is greater than one , then `X` causes `Y`.
`res`	is an object of output from RTransferEntropy::transfer_entropy()
`follOut`	is a list of variables from function `followingRelation`.
`XgCsY_trns`	The flag is true if `X` (VL-)Transfer-Entropy-causes `Y` using Transfer Entropy ratio ratio where `TEratio >1` if `X` causes `Y`. Additionally, if `nboot>1`, the flag is true only when `pval<=alpha`.
`pval`	It is a p-value for TE bootstrapping by Dimpfl and Peter (2013).

Examples

# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
out<-VLTransferEntropy(Y=TS$Y,X=TS$X)


# Generate simulation data
TS <- SimpleSimulationVLtimeseries()
# Run the function
out<-VLTransferEntropy(Y=TS$Y,X=TS$X)

Package 'VLTimeCausality'

Help Index

checkMultipleSimulationVLtimeseries

Description

Usage

Arguments

Value

Examples

followingRelation

Description

Usage

Arguments

Value

Examples

GrangerFunc

Description

Usage

Arguments

Value

Examples

MultipleSimulationVLtimeseries

Description

Usage

Arguments

Value

Examples

multipleVLGrangerFunc

Description

Usage

Arguments

Value

Examples

multipleVLTransferEntropy

Description

Usage

Arguments

Value

Examples

plotTimeSeries

Description

Usage

Arguments

Value

Examples

SimpleSimulationVLtimeseries

Description

Usage

Arguments

Value

Examples

TSNANNearestNeighborPropagation

Description

Usage

Arguments

Value

Examples

VLGrangerFunc

Description

Usage

Arguments

Value

Examples

VLTransferEntropy

Description

Usage

Arguments

Value

Examples