Stock Market Quotes Estimator (Part II)
In Part I of the stock market quotes estimator, it was explained that even though stock prices are random, the variable ▲Price is not: it follows a distribution that can be approximated, so the range within which a security's price oscillates over the coming short term can be computed with quite good accuracy. This post starts with a picture of the notebook containing the Python code that creates the Markov Chain Status vs Spline vs Historical Values graph shown in Part I (click to open it in Google Colab).
The possible slowing growth mentioned then for this security will be easier to visualize later, when the final output of this program is shared. At this point, the features required to project the price's range of movement over the short term have been calculated; the length of that short term equals the number of Markov states needed to reach the steady state. Note that although the range is for the price, the features are based on calculations on ▲Price; the last historical price is simply added at the end for easier understanding.
Furthermore, it is recommended not to extend the steady state too long, as the model will lose accuracy; it should be updated every four to six months. So far the inputs are:
First two bins with maximum frequency.
Markov probabilities until reaching the steady state.
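As a reminder of how these two inputs could be produced, here is a minimal sketch in plain Python. It is not the notebook's code: the transition matrix `P`, the tolerance `tol` and the helper names are hypothetical, chosen only for illustration. The distribution over ▲Price bins is propagated state by state until it stops changing, and the two most probable bins are picked in each state.

```python
# Hypothetical row-stochastic transition matrix between three ▲Price bins.
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]

def step(dist, P):
    """One Markov step: new_j = sum_i dist_i * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def states_until_steady(start, P, tol=1e-4, max_states=100):
    """Return the bin distribution of each state until it changes by less than tol."""
    history = []
    dist = start
    for _ in range(max_states):
        new = step(dist, P)
        history.append(new)
        if max(abs(x - y) for x, y in zip(new, dist)) < tol:
            break
        dist = new
    return history

def top_two_bins(dist):
    """Indices of the two bins with the highest probability."""
    return sorted(range(len(dist)), key=lambda j: dist[j], reverse=True)[:2]

# Assumption: the last observed ▲Price fell in bin 0.
history = states_until_steady([1.0, 0.0, 0.0], P)
selected = [top_two_bins(dist) for dist in history]
```

The length of `history` plays the role of the number of states until steadiness, and `selected` holds the two bins per state that feed the range calculation further on.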
And the functions:
Spline for frequency distribution (shown in part I).
Spline for the accumulated distribution (missing).
The first one helps with calculating the ▲Price with the highest probability of occurrence according to historical values, together with its probability. Although the library scipy.interpolate offers the helpful class UnivariateSpline, it is limited for probability purposes, since one needs to ensure that the spline yields a result between zero and one. Additionally, it takes "x", namely ▲Price, as input, so it cannot directly give the desired output and must be modified.
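One possible workaround, given only as a hedged sketch and not as the undisclosed algorithm, is to evaluate the spline on a dense grid of ▲Price values, clip the result to the interval from zero to one, and take the grid point with the highest value. The histogram values and the helper `most_probable_delta` below are hypothetical.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical histogram of ▲Price: bin centers and relative frequencies.
bin_centers = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
rel_freq = np.array([0.05, 0.20, 0.40, 0.25, 0.10])

# s=0 makes the cubic spline pass through every histogram point.
freq_spline = UnivariateSpline(bin_centers, rel_freq, k=3, s=0)

def most_probable_delta(spline, lo, hi, n=401):
    """Search a dense grid for the ▲Price with the highest clipped spline value."""
    grid = np.linspace(lo, hi, n)
    prob = np.clip(spline(grid), 0.0, 1.0)  # keep the result a valid probability
    i = int(np.argmax(prob))
    return float(grid[i]), float(prob[i])

delta_max, p_max = most_probable_delta(freq_spline, bin_centers[0], bin_centers[-1])
```

Clipping is the simplest way to keep the output between zero and one; it does not guarantee that the curve integrates to one, so it should be read as an approximation only.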
As explained in Part I, the algorithm developed is undisclosed, but some hints have been given to readers. The second spline function, on the other hand, helps with computing the probability of the security giving an absolute return (▲Price) higher than zero. This case is easier: calculating spline(0) and subtracting it from one gives the desired output. The graph below shows this function (click to access the real file):
Consequently, with all inputs ready, one additional calculation is required to correctly construct the projected range for the price. This step concerns the probability of each bin in each state and how it should be accumulated. The two bins selected from the Markov chain with the highest probabilities have the following structure:
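The "one minus spline(0)" calculation can be sketched as follows. This is an illustration under stated assumptions, not the code behind the graph: the bin edges and frequencies are hypothetical, and the accumulated distribution is assumed to be fitted on the cumulative relative frequencies at the upper edge of each bin.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical histogram of ▲Price: upper edge of each bin and its relative frequency.
upper_edges = np.array([-1.5, -0.5, 0.5, 1.5, 2.5])
rel_freq = np.array([0.05, 0.20, 0.40, 0.25, 0.10])

# Accumulated distribution: share of observations at or below each upper edge.
cum_freq = np.cumsum(rel_freq)

# Interpolating cubic spline through the accumulated distribution.
cdf_spline = UnivariateSpline(upper_edges, cum_freq, k=3, s=0)

def prob_positive_return(spline):
    """P(▲Price > 0) = 1 - spline(0), with spline(0) clipped to [0, 1]."""
    below_zero = float(np.clip(spline(0.0), 0.0, 1.0))
    return 1.0 - below_zero

p_positive = prob_positive_return(cdf_spline)
```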
State 1 | State 2 | State 3 | ... |
---|---|---|---|
Upper Bound a | Upper Bound c | Upper Bound e | |
Bin a | Bin c | Bin e | ... |
Lower Bound a | Lower Bound c | Lower Bound e | |
Upper Bound b | Upper Bound d | Upper Bound f | |
Bin b | Bin d | Bin f | ... |
Lower Bound b | Lower Bound d | Lower Bound f | |
Let the probability of being in "Bin a" be "a", in "Bin b" be "b", etc. Let also "Upper Bound a" be "Sa", and so on. Then the upper bound of ▲Price for State 1 is simply:
a*Sa+b*Sb
The same applies to the lower bound. Following the same logic, the upper bound for ▲Price in State 2 would be:
a*c*Sc+b*c*Sc+a*d*Sd+b*d*Sd
= (a+b)(c*Sc+d*Sd)
For State 3:
a*c*e*Se+a*d*e*Se+b*c*e*Se+b*d*e*Se+a*c*f*Sf+a*d*f*Sf+b*c*f*Sf+b*d*f*Sf
= (a+b)(c*e*Se+d*e*Se+c*f*Sf+d*f*Sf)
= (a+b)(c+d)(e*Se+f*Sf)
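The pattern in the three states generalizes to any state. The notation here is introduced only for this summary and is not in the original notebook: $p_k$ and $q_k$ are the probabilities of the two selected bins in state $k$, and $S_{p_k}$ and $S_{q_k}$ are their upper bounds. The upper bound of ▲Price in state $n$ is then:

$$
U_n = \Bigl(\prod_{k=1}^{n-1} (p_k + q_k)\Bigr)\,\bigl(p_n S_{p_n} + q_n S_{q_n}\bigr)
$$

For $n = 1$ the product is empty and equals one, which gives $a \cdot Sa + b \cdot Sb$; for $n = 2$ and $n = 3$ it reproduces the factored expressions for State 2 and State 3. Replacing the upper bounds with the lower bounds gives the lower bound of the range.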
Hence the logic is clear: to compute the lower and upper bounds of ▲Price for a specific state, the probability-weighted bound of that state's two bins must be multiplied by the sum of both bins' probabilities in each prior state. The following notebook specifies the algorithm used, which is divided into three sections: the sum of the two maximum probabilities per state, the definition of a function to accumulate the prior sums, and the calculation of ▲Price per state with its corresponding price. This process is repeated for the upper, lower and average values of the range:
Clicking the image redirects the reader to the Google Colab notebook. The resulting range would be as follows (click on the image to see the real file):
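For readers who cannot open the notebook, the three sections can be sketched roughly as below. This is not the notebook's code. The probabilities, bin bounds and `last_price` are made-up values, the "average" is assumed to be the midpoint of the lower and upper values, and each state's ▲Price is assumed to be added directly to the last historical price.

```python
from itertools import accumulate
from operator import mul

# Hypothetical input: per state, the two most probable bins as
# (probability, lower bound of ▲Price, upper bound of ▲Price).
states = [
    [(0.40, 0.0, 1.0), (0.30, -1.0, 0.0)],  # State 1: bins a, b
    [(0.35, 0.0, 1.0), (0.30, 1.0, 2.0)],   # State 2: bins c, d
    [(0.30, 1.0, 2.0), (0.25, 0.0, 1.0)],   # State 3: bins e, f
]
last_price = 100.0  # hypothetical last historical price

def top_two_sums(states):
    """Section 1: sum of the two maximum probabilities in each state."""
    return [sum(p for p, _, _ in bins) for bins in states]

def prior_factors(sums):
    """Section 2: product of the sums of all prior states (1 for State 1)."""
    return [1.0] + list(accumulate(sums, mul))[:-1]

def projected_range(states, last_price):
    """Section 3: (lower, average, upper) price for each state."""
    factors = prior_factors(top_two_sums(states))
    rows = []
    for bins, factor in zip(states, factors):
        lower = factor * sum(p * lo for p, lo, _ in bins)
        upper = factor * sum(p * hi for p, _, hi in bins)
        average = (lower + upper) / 2  # assumption: midpoint of the range
        rows.append((last_price + lower, last_price + average, last_price + upper))
    return rows

price_range = projected_range(states, last_price)
```

With these made-up numbers, the upper ▲Price of State 2 is (0.40 + 0.30) * (0.35 * 1.0 + 0.30 * 2.0) = 0.665, which matches the expanded four-term sum for State 2.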
The range for the next three months has been estimated for the price of the studied security. A quick look at the probabilities, indicated by the marker colors, shows that the farther out one goes, the less probable the estimation becomes; this is why it was mentioned earlier not to extend the steady state too long. The first dark green spot corresponds to the last historical price, from which the estimation was computed.
Finally, if the candlestick for the actual price is added every week and the splines are rerun, the graph looks as follows (click for the real file):
When the real file is examined, it is possible to see how close the estimations were to the fluctuations of the actual prices during the trading period, in this case a week. The dashed line that separates the estimated range into two sections corresponds to the price given by the ▲Price with maximum probability; the second colorbar gives these values. As this line moves towards the upper bound, most of the actual prices are expected to fall within this sub-range, as can be seen; however, how much more probable this is cannot be evaluated yet.
In addition, the last candle closing in the second sub-range would be an indication of slowing growth from now onwards. This is reinforced by the second graph output of the current program (click for the real file):
The silver line corresponds to the probability of the maximum ▲Price, while the green one corresponds to the probability of a return higher than zero. The last four periods show a divergence between both probabilities: while the probability of the maximum ▲Price goes upwards, the probability of a return higher than zero goes downwards. This could be an indication of slowing growth, if not a bear market, as per the actual candlesticks. These outputs for a variety of stocks will be published on this website and will be accessible only with paid subscriptions.