**Remember that when an interactive chart is cited in the post, clicking on it shows its source code. To view it properly, download the file as HTML and open it in your browser.**

In the __prior post__ it was shown that, as a **stochastic** variable, a stock quote is *impossible* to predict using *regression*. The reason is the mistaken assumption that *all information* regarding a stock's value is *reflected* in the *price*. In fact, the price is a *consequence* of the dispute between those who decided to *buy* and those who decided to *sell*; at that point *noise* is added from different sources, e.g. technological means, the speculator's background and mood, etc., which is why focusing on *price changes* eliminates it.

Have a look at the following data:

Most of the data is **grouped** around the *third* and *fourth* intervals, showing a clear *tendency* towards a *central value*, which is *zero*, a behavior consistent with the __Central Limit Theorem (CLT)__. Thus, these tables reflect two main takeaways:

- The central value of zero *confirms* that prices are purely *stochastic*, as the probability of going above or below zero is expected to be the same, something called __market efficiency__.
- *Some* ETFs have their frequency tables *shifted* towards the left or right, what one would call a right/left *tailed* distribution; e.g. the top-right table shows that over 64% of the data is accumulated in the fourth interval, while the bottom-right is well centered, with 60% of it located in the third interval.
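The frequency tables discussed above can be approximated with a few lines of code. The following is only a minimal sketch, not the code used for this post: the function names, the choice of six equal-width intervals (so that zero falls on the boundary between the third and fourth intervals), and the sample prices are all assumptions made for illustration.

```python
def price_changes(prices):
    """Period-over-period differences of a price series."""
    return [b - a for a, b in zip(prices, prices[1:])]


def frequency_table(changes, n_bins=6):
    """Share of observations per interval over the symmetric range [-m, m].

    m is the largest absolute change. With an even n_bins, zero sits on the
    boundary between the two middle intervals (an assumption of this sketch).
    """
    m = max(abs(c) for c in changes)
    if m == 0:
        raise ValueError("all changes are zero; nothing to bin")
    width = 2 * m / n_bins
    counts = [0] * n_bins
    for c in changes:
        # Clamp so the maximum value lands in the last interval.
        idx = min(int((c + m) / width), n_bins - 1)
        counts[idx] += 1
    edges = [-m + i * width for i in range(n_bins + 1)]
    shares = [n / len(changes) for n in counts]
    return edges, shares


# Hypothetical prices in cents, for illustration only.
prices = [10000, 10050, 10020, 10040, 9990, 10010, 10000, 10100, 10080, 10090]
edges, shares = frequency_table(price_changes(prices))
for i, share in enumerate(shares):
    print(f"interval {i + 1}: [{edges[i]:8.2f}, {edges[i + 1]:8.2f})  {share:.1%}")
```

A table whose mass sits mostly in the fourth interval is shifted to the right of zero, while one concentrated in the third interval is shifted to the left.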

Now, one would be tempted to state that the top-right option is better than the bottom-right, but there are other features to take into account, and the following distribution graphs give a better hint:

**Markov** chain states (green color scale) move *towards* the *historic* values (pink dots), and a *manually* constructed *spline* gives the probability distribution for the absolute returns of a certain asset. Consequently, two *new* features can be developed from here:

- *Probability* of a return *above zero* (the probability of earning money).
- *Return* with *maximum likelihood* (two values here: the return and its likelihood).
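As a rough illustration of the two features listed above, the sketch below reads them off a distribution curve sampled on a grid of returns. It does not reproduce the spline or the MGM code: the function names, the trapezoidal integration, the requirement that the grid contains zero, and the sample numbers are all assumptions.

```python
def trapezoid(xs, ys):
    """Area under the piecewise-linear curve through the points (xs, ys)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


def distribution_features(returns, density):
    """Return (P(return > 0), most likely return, its normalized likelihood).

    returns: sorted list of return values that must contain 0 (assumption).
    density: non-negative curve values at those points; it does not need to
             be normalized, the total area is used for that.
    """
    if 0 not in returns:
        raise ValueError("this sketch expects the grid to contain 0")
    total = trapezoid(returns, density)
    if total <= 0:
        raise ValueError("density must have positive area")
    start = returns.index(0)
    p_up = trapezoid(returns[start:], density[start:]) / total
    peak = max(range(len(returns)), key=lambda i: density[i])
    return p_up, returns[peak], density[peak] / total


# Hypothetical curve values (returns in %), for illustration only.
grid = [-2, -1, 0, 1, 2, 3]
curve = [0.0, 1.0, 3.0, 4.0, 2.0, 0.0]
p_up, best_return, likelihood = distribution_features(grid, curve)
print(f"P(return > 0) = {p_up:.2f}")
print(f"most likely return = {best_return}%, likelihood = {likelihood:.2f}")
```

With a finer grid, for example one sampled from a fitted spline, the same two numbers are obtained with better resolution.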

These, added to the **estimated return** calculated with the spline, its __jacobian__, and other features computed by the __MGM code__, allow the user to __accurately__ classify price changes for the next periods, depending on the timeframe used.

However, as the title suggests, mastering the maths involved is of utmost importance in order to **adjust** the standard models to the problem being solved. It is well known that *sklearn* provides __different methodologies__ for a wide range of problems; nevertheless, they are *not enough* for certain complex cases. Although it allows one to *set parameters* such as penalties, learning rates, and others, the resulting *accuracy* (__f1 score__) has proven to be *lower* than that of the MGM code.

Using the standard models, only two manage to get an f1 score above 0.3, which is considered low. __Confusion matrices__ are shown below:
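For readers who want to relate a confusion matrix to its f1 score, here is a minimal sketch. It assumes rows are true classes and columns are predicted classes, and it averages the per-class scores without weighting (macro average); the post does not state which averaging was used, and the matrix below is made up for illustration, not taken from the results.

```python
def f1_per_class(cm):
    """Per-class F1 from a square confusion matrix (rows: true, cols: predicted)."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, actually another class
        fn = sum(cm[k]) - tp                       # actually k, predicted another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(cm):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(cm)
    return sum(scores) / len(scores)


# Hypothetical two-class matrix, for illustration only.
cm = [[3, 1],
      [2, 4]]
print([round(s, 3) for s in f1_per_class(cm)])
print(round(macro_f1(cm), 3))
```

Because F1 combines precision and recall, a model that mostly predicts a single class can still show a reasonable hit rate while scoring poorly here.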

As expected from the low f1 score, both methodologies performed quite poorly. An **underfit** model suggests that a *more complex* method is necessary; hence, by combining features and changing parameters for the two *loss functions* that performed best, the maximum score that could be reached was *0.39*, an improvement of *30%* but still *deficient*:

The **key** here is that one is *constrained* by sklearn's __loss functions__, as only nine are available, and likewise by its penalties (three available plus no penalty) and other parameters. Mastering the *math behind* the *libraries* is important to identify *flaws* (parts that don't fit your case) and change them accordingly. In this regard, a good place to start is the following __book__. Several acquaintances have asked me to teach them to code and trade and, depending on their backgrounds, I always recommend them good *literature* to start with.

*All these calculations are based on **probabilities**, which can **fail** sometimes; however, the algorithm developed to reach those numbers has been designed to **reduce** such failures to their **lowest** level.*
