As its name suggests, the stock market quotes estimator is a machine learning program developed to estimate, with an associated probability, the range within which the quote of a security will oscillate in the short term, so that timely decisions can be taken. Current methods focus on estimating the price at which a security should quote by using fundamentals, technicals, or a time-series approach, all centered on price because "all information regarding the stock is represented in its value". However, they overlook the one feature on which ML Perspective focuses: ▲Price.
So, what is ▲Price? Delta-price is the magnitude of the change in the value of a security from one period to the next, where a period may be a day, a week, a month, etc. The exhibit at the beginning of this post shows the quotes for the S&P 500 and the S&P Small and Large Caps for weekly periods; "Dif" corresponds to ▲Price and "Rent" to the return ("rentability") of the period. Approaching value estimation through ▲Price yields a different result, which is the basis of ML Perspective: being different to achieve a different outcome.
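As a minimal sketch of the two columns described above, the snippet below computes "Dif" and "Rent" from a list of closing quotes. The function name `delta_and_return` and the sample quotes are made up for illustration, and "Rent" is assumed here to be the simple percentage return of the period.

```python
def delta_and_return(closes):
    """Return (deltas, rents) for each pair of consecutive periods.

    deltas: change in quote from one period to the next ("Dif").
    rents:  simple return of the period ("Rent"), assumed to be delta / previous quote.
    """
    pairs = list(zip(closes, closes[1:]))
    deltas = [curr - prev for prev, curr in pairs]
    rents = [(curr - prev) / prev for prev, curr in pairs]
    return deltas, rents


# Made-up weekly closing quotes, not real market data.
closes = [100.0, 102.0, 101.0, 104.0]
deltas, rents = delta_and_return(closes)
print(deltas)
print(rents)
```

Four quotes produce three delta-prices, one per pair of consecutive weeks.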
Consequently, to address the new perspective, it first has to be analyzed whether ▲Price is random or whether its values can be grouped into categories, namely bins. A categorizer algorithm is used for this purpose (click on the image to see it on Google Colab):
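The categorizer itself is not reproduced in this text, so the following is only a rough sketch of what such a step could look like, assuming equal-width bins and clamping of out-of-range values. The names `make_edges` and `categorize`, the bin limits, and the number of bins are all hypothetical choices.

```python
import bisect


def make_edges(lo, hi, n_bins):
    """Build n_bins equal-width bins between lo and hi (assumed layout)."""
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def categorize(delta, edges):
    """Return the index of the bin [edges[i], edges[i+1]) that contains delta.

    Values outside the covered range are clamped to the first or last bin.
    """
    n_bins = len(edges) - 1
    index = bisect.bisect_right(edges, delta) - 1
    return min(max(index, 0), n_bins - 1)


edges = make_edges(-4.0, 4.0, 4)  # bins: [-4,-2), [-2,0), [0,2), [2,4)
bins = [categorize(d, edges) for d in [2.0, -1.0, 3.0, -5.0, 0.0]]
print(edges)
print(bins)
```

Equal-width bins are the simplest option; quantile-based bins would be another reasonable assumption if the changes are heavily concentrated around zero.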
Once the categorizer has evaluated all ▲Prices, a histogram of frequencies can be constructed to find out, historically, which magnitudes of change are the most common and thus the most expected. Additionally, this information is used to develop a spline that allows the users of this program to discern key features to tune their decisions. These are:
Probability of positive return (>0).
Most probable expected ▲Price.
Probability of the above feature.
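To make the three features concrete, here is a small sketch that reads them directly from the raw histogram of frequencies. It deliberately leaves out the spline smoothing, so it is a simplification rather than the program's actual procedure; `histogram_features` and the sample data are hypothetical.

```python
from collections import Counter


def histogram_features(deltas, bins, n_bins):
    """Compute relative frequencies and the three key features.

    deltas: observed changes in price.
    bins:   bin index already assigned to each delta.
    Returns (freqs, p_positive, mode_bin, mode_prob).
    """
    n = len(deltas)
    counts = Counter(bins)
    freqs = [counts.get(i, 0) / n for i in range(n_bins)]
    # Feature 1: probability of a positive return (> 0).
    p_positive = sum(1 for d in deltas if d > 0) / n
    # Feature 2: the most frequent bin, i.e. the most probable change.
    mode_bin = max(range(n_bins), key=lambda i: freqs[i])
    # Feature 3: the probability attached to that bin.
    return freqs, p_positive, mode_bin, freqs[mode_bin]


# Made-up changes and their bin indices for bins [-4,-2), [-2,0), [0,2), [2,4).
deltas = [2.0, -1.0, 3.0, 0.5, -0.5, 2.5, 1.0, 3.5]
bins = [3, 1, 3, 2, 1, 3, 2, 3]
freqs, p_positive, mode_bin, mode_prob = histogram_features(deltas, bins, 4)
print(freqs, p_positive, mode_bin, mode_prob)
```

With a fitted spline, the same three quantities would be read from the smoothed curve instead of the raw counts.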
The smoothing factor is selected by the user, as the program allows for this. Why is it important to have a proper spline? Because with each new period, the three features defined above are updated along with the estimated range (to be explained in part II), which is then compared against the real quote. Now, with the historical parameters and the spline worked out, the algorithm builds a transition matrix of the movements of ▲Price. The language used is not disclosed due to privacy terms.
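Since the original implementation is not disclosed, the following is only an illustrative sketch of how a transition matrix can be estimated from a sequence of bin indices: count how often bin a is followed by bin b, then normalize each row. The name `transition_matrix`, the sample sequence, and the choice to give never-visited bins a uniform row are assumptions.

```python
def transition_matrix(states, n_states):
    """Estimate a row-stochastic transition matrix from a sequence of states."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1

    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # Assumption: a bin that was never left gets a uniform row,
            # so that every row still sums to 1.
            matrix.append([1.0 / n_states] * n_states)
        else:
            matrix.append([c / total for c in row])
    return matrix


# Made-up sequence of bin indices for consecutive periods.
states = [0, 1, 1, 2, 1, 0, 1, 2, 2, 1]
P = transition_matrix(states, 3)
for row in P:
    print(row)
```

Entry `P[a][b]` is the estimated probability that a period in bin a is followed by a period in bin b.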
Furthermore, based on the last stock quote, the initial vector and the Markov states are determined until the steady state is reached:
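A sketch of this step under simple assumptions: the initial vector puts all probability on the bin of the last observed change, and it is multiplied by the transition matrix until it stops changing. The names `step` and `steady_state`, the tolerance, and the 2x2 matrix are made up for illustration, and convergence is only expected when the chain is well behaved (irreducible and aperiodic).

```python
def step(vector, matrix):
    """One Markov step: returns vector * matrix (row vector times matrix)."""
    n = len(matrix)
    return [sum(vector[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def steady_state(matrix, start_state, tol=1e-10, max_iter=10_000):
    """Iterate from a one-hot initial vector until successive states agree."""
    vector = [0.0] * len(matrix)
    vector[start_state] = 1.0  # all probability on the last observed bin
    for _ in range(max_iter):
        following = step(vector, matrix)
        if max(abs(a - b) for a, b in zip(following, vector)) < tol:
            return following
        vector = following
    return vector


# Made-up two-bin transition matrix; its exact steady state is [5/6, 1/6].
P = [[0.9, 0.1], [0.5, 0.5]]
pi = steady_state(P, start_state=1)
print(pi)
```

For such a chain the steady state does not depend on the starting bin; the starting bin only affects the intermediate Markov states along the way.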
The probabilities from the Markov states are then used to determine the range within which the security's price will move. However, to avoid working with too wide a range, only the bins whose probabilities sum closest to 80% are used. This number won't be elaborated on here, as it comes from another topic, but in brief it is based on the Pareto Principle. A simple but useful algorithm is used to select the two bins with the highest probabilities, which in the above case would be the second and third ones (click on the image to see it on Google Colab):
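One possible reading of this selection rule, sketched below, is to rank the bins by probability, accumulate them from the top, and keep the group whose total lands closest to the 80% target. The function `select_bins` and the probabilities are hypothetical and are not taken from the program's actual output.

```python
def select_bins(probs, target=0.80):
    """Pick the highest-probability bins whose total is closest to target.

    Returns (sorted bin indices, probability covered by those bins).
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    best, best_gap = [], float("inf")
    chosen, total = [], 0.0
    for index in order:
        chosen.append(index)
        total += probs[index]
        gap = abs(total - target)
        if gap < best_gap:
            best, best_gap = list(chosen), gap
    return sorted(best), sum(probs[i] for i in best)


# Made-up steady-state probabilities for five bins.
probs = [0.05, 0.45, 0.35, 0.10, 0.05]
selected, covered = select_bins(probs)
print(selected, round(covered, 4))
```

With these sample numbers the rule keeps two bins, at indices 1 and 2, which is to say the second and third ones.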
Having defined the states, the following graph is programmed to show in which direction the spline and the probabilities of ▲Price are moving. Why is this foremost among the outcomes of this analysis? Because they define the trend of the histogram computed before (click to see the real file):
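The direction of movement can also be read numerically. As a rough sketch, assuming the same kind of transition matrix as before, the snippet below stores each successive Markov state and labels every bin as moving up, down, or staying flat between two of them. `markov_path`, `bin_directions`, and the matrix are illustrative only.

```python
def markov_path(matrix, start_state, n_steps):
    """Return the list of state vectors v0, v1, ..., v_n_steps."""
    n = len(matrix)
    vector = [0.0] * n
    vector[start_state] = 1.0
    path = [vector]
    for _ in range(n_steps):
        vector = [sum(vector[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        path.append(vector)
    return path


def bin_directions(earlier, later, eps=1e-9):
    """Label each bin's probability as 'up', 'down' or 'flat' between two states."""
    labels = []
    for before, after in zip(earlier, later):
        if after - before > eps:
            labels.append("up")
        elif before - after > eps:
            labels.append("down")
        else:
            labels.append("flat")
    return labels


# Made-up two-bin transition matrix, starting from the second bin.
P = [[0.9, 0.1], [0.5, 0.5]]
path = markov_path(P, start_state=1, n_steps=3)
trend = bin_directions(path[1], path[-1])
print(trend)
```

Here the first bin gains probability and the second loses it as the states approach the steady state.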
For example, the above histogram is right-biased; however, the states are moving downwards for the top probabilities while the values on the left are moving upwards. A flattening curve could reflect a change in the trend of ▲Price towards slow growth or, in a more extreme case, a turn towards bearish behavior. More details will be explained in part II. I invite you to visit the ML Perspective website and find out more about how it is changing the way data is analyzed.