Moving forward in the book, these days covered categorical variables and
significance testing (t-ratios, F-ratios, ANOVA), followed by variable selection,
throughout which I struggled to maintain interest. However, this was crucial
knowledge, and it stressed the importance of intuition to support the math.
Luckily, I was able to sum it up through an example on modeling stock liquidity
using various performance measures.
The case study begins by defining the following
criteria by which investors choose their stocks:
- Expected Return;
- Riskiness (Volatility);
- Length of Time of Investment (Varies for Growth and Income Stocks);
- Liquidity (Ease with which a stock can be sold, measured by its VOLUME in a stock market)
The data covers 123 companies: an original sample of 126 companies, less 3
that were removed due to unusually large volumes or prices. The data represents the
period December 3, 1984, to February 28, 1985. For the trading activity variables,
we examine:
- The three-month total trading volume (VOLUME, in millions of shares)
- The three-month total number of transactions (NTRAN)
- The average time between transactions (AVGT, measured in minutes)
- Opening stock price on January 2, 1985 (PRICE)
- The number of outstanding shares on December 31, 1984 (SHARE, in millions of shares)
- The market equity value (VALUE, in billions of dollars), obtained by taking the product of PRICE and SHARE
- The debt-to-equity ratio (DEB_EQ, labeled DEBEQ in the regression output)
The model attempts to explain liquidity, measured by VOLUME, using the other 6
variables as explanatory variables. A scatterplot matrix illustrates the
relationships between the variables:
[Figure: Scatterplot Matrix]
VOLUME appears to be most strongly related to AVGT
(average time between transactions) and NTRAN (number of transactions), while
AVGT and NTRAN appear to have a near-perfect inversely proportional
relationship with each other.
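
An inverse relationship of this kind is strong but not linear, which matters for a linear regression. The sketch below is only an illustration with made-up numbers: it assumes AVGT is roughly the total trading minutes divided by NTRAN, and both `total_minutes` and the range of `ntran` are hypothetical values, not the book's data.

```python
import numpy as np

# Hypothetical numbers: trading minutes in the period and a range of NTRAN values.
total_minutes = 24_000.0
ntran = np.linspace(100, 5000, 200)

# Assumption: average time between transactions is an exact inverse of NTRAN.
avgt = total_minutes / ntran

# Linear (Pearson) correlation between the two variables.
r = np.corrcoef(ntran, avgt)[0, 1]
print(f"correlation = {r:.2f}")
```

Even with an exact inverse relationship, the linear correlation is clearly negative but well short of -1, so the two variables are not interchangeable in a linear model.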
Taking a rather different route than the book, I
started by regressing the response variable (VOLUME) on all six explanatory
variables, which gave an R² statistic of 84.9%. Judging by the t-ratios, only 2
variables were significant:

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 6.1571 | 1.9567 | 3.1467 | 0.0021 |
| AVGT | (0.4436) | 0.1499 | (2.9601) | 0.0037 |
| NTRAN | 0.0014 | 0.0002 | 7.8732 | 0.0000 |
| PRICE | (0.0138) | 0.0228 | (0.6061) | 0.5456 |
| SHARE | 0.0090 | 0.0073 | 1.2316 | 0.2206 |
| VALUE | 0.0812 | 0.1135 | 0.7152 | 0.4759 |
| DEBEQ | 0.0609 | 0.0593 | 1.0272 | 0.3065 |

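
For readers who want to reproduce this kind of output, here is a minimal sketch of how coefficients, standard errors and t-ratios come out of a least-squares fit. It uses synthetic data rather than the stock data, and the function name `ols_t_ratios` and the variables `x_signal` and `x_noise` are hypothetical names chosen for illustration.

```python
import numpy as np

def ols_t_ratios(X, y):
    """Fit y = b0 + X @ b by least squares; return (coef, std_err, t_ratio)."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])       # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = n - A.shape[1]                        # n minus number of coefficients
    s2 = resid @ resid / dof                    # residual variance estimate
    cov = s2 * np.linalg.inv(A.T @ A)           # covariance of the estimates
    std_err = np.sqrt(np.diag(cov))
    return coef, std_err, coef / std_err        # t-ratio = coefficient / std error

# Synthetic data (an assumption): one variable that matters, one that does not.
rng = np.random.default_rng(0)
n = 123
x_signal = rng.normal(size=n)
x_noise = rng.normal(size=n)
y = 2.0 + 3.0 * x_signal + rng.normal(size=n)

coef, std_err, t = ols_t_ratios(np.column_stack([x_signal, x_noise]), y)
for name, b, se, tr in zip(["Intercept", "x_signal", "x_noise"], coef, std_err, t):
    print(f"{name:10s} {b:8.4f} {se:8.4f} {tr:8.4f}")
```

A common rule of thumb is to treat a variable as significant when its t-ratio exceeds about 2 in absolute value, which corresponds roughly to a P-value below 5%.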
The next regression used NTRAN as the only explanatory variable (the R²
statistic was 83.43%).

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 1.65128 | 0.61730 | 2.67501 | 0.00851 |
| NTRAN | 0.00183 | 0.00007 | 24.68049 | 0.00000 |

Correlations of the residuals with the remaining
explanatory variables revealed that there is still some information contained
in the variable AVGT:

| AVGT | PRICE | SHARE | VALUE | DEBEQ |
|---|---|---|---|---|
| -15.90% | -1.40% | 6.42% | 1.80% | 7.79% |

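
The residual check used in this step can be sketched as follows. The data here is synthetic, and `x1` (a variable already in the model) and `x2` (a candidate left out) are placeholder names, so this only illustrates the mechanics, not the stock results.

```python
import numpy as np

# Synthetic data (an assumption): y depends on both x1 and x2.
rng = np.random.default_rng(1)
n = 123
x1 = rng.normal(size=n)                      # variable already in the model
x2 = rng.normal(size=n)                      # candidate variable left out
y = 1.0 + 2.0 * x1 + 1.0 * x2 + rng.normal(scale=0.1, size=n)

# Step 1: simple regression of y on x1 only.
slope, intercept = np.polyfit(x1, y, 1)
resid = y - (intercept + slope * x1)

# Step 2: correlate the residuals with each variable.
corr_included = np.corrcoef(resid, x1)[0, 1]
corr_omitted = np.corrcoef(resid, x2)[0, 1]
print(f"included: {corr_included:.3f}, omitted: {corr_omitted:.3f}")
```

Residuals are uncorrelated with any variable already in the model by construction, so a clearly nonzero correlation with a left-out variable hints that it still has something to add.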
Finally, the third regression, which added AVGT alongside NTRAN, resulted in a
model whose residuals have negligible correlations with the remaining
explanatory variables (the R² statistic was 84.18%).

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 4.4087 | 1.3012 | 3.3882 | 0.0010 |
| AVGT | (0.3222) | 0.1346 | (2.3942) | 0.0182 |
| NTRAN | 0.0017 | 0.0001 | 17.1337 | 0.0000 |

| PRICE | SHARE | VALUE | DEBEQ |
|---|---|---|---|
| (0.015) | 0.100 | 0.074 | 0.089 |

And so concluded the exercise to determine the
significant explanatory variables for the regression model. However, I was
surprised that AVGT, which appears to be a near-perfect function of NTRAN,
could be significant alongside it. I was under the impression
that both would add the same information to the regression. A likely explanation
is that the relationship is inverse rather than linear, so AVGT is not a linear
function of NTRAN and can still carry information that a linear term in NTRAN misses.
The key takeaway from this exercise was the mathematics behind variable selection, and how some variables may add to the R² statistic while having very little statistical significance. The exercise also got me excited about non-linear regression, though it may take a while to get there.
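
To put a number on that point, the R² values reported in this post can be compared with a partial F-ratio. The sketch below assumes 123 observations, 6 explanatory variables in the full model and 2 (NTRAN and AVGT) in the reduced one. The function name `partial_f` is hypothetical, and the formula is the standard textbook one, not necessarily the exact computation used in the book.

```python
def partial_f(r2_full, r2_reduced, n, k_full, k_reduced):
    """Partial F-ratio for dropping (k_full - k_reduced) variables."""
    q = k_full - k_reduced                      # number of variables dropped
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator

# R² values as reported in the post: 84.9% (all six) and 84.18% (NTRAN + AVGT).
f_ratio = partial_f(0.849, 0.8418, n=123, k_full=6, k_reduced=2)
print(round(f_ratio, 2))  # about 1.38
```

An F-ratio this small, well below the usual 5% critical value of roughly 2.45 for 4 and 116 degrees of freedom, suggests that the four dropped variables add little beyond NTRAN and AVGT, which agrees with their t-ratios.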