What's the meaning of rolling_slope, rolling_rsquare, and rolling_resi? #283

Closed
Bowen905 opened this issue Feb 23, 2021 · 11 comments
Labels: question (Further information is requested), stale

Comments

@Bowen905

In the documentation, slope means the regression slope, but the rolling_slope function takes only one parameter and no window size, so I'm confused about what is being regressed on what. Is it an autoregression on the parameter? I have the same question for rolling_rsquare and rolling_resi.

@Bowen905 Bowen905 added the question Further information is requested label Feb 23, 2021
@bxdd
Collaborator

bxdd commented Feb 23, 2021

Hi, you said that "there is only one parameter without window size". Where can I find the related code?
Also, the regression is a linear regression on the (series.index, series.value) pairs within the window, not an autoregression. Please refer to the code; the following line computes the slope of that linear regression:

return (N*self.xy_sum - self.x_sum*self.y_sum) / (N*self.x2_sum - self.x_sum*self.x_sum)
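
For readers who want to check that this expression is the ordinary least-squares slope, here is a minimal standalone sketch in plain Python. The function name `ols_slope` and the sample points are illustrative assumptions, not part of qlib.

```python
def ols_slope(xs, ys):
    """Least-squares slope of the line through the points (xs[i], ys[i])."""
    n = len(xs)
    x_sum = sum(xs)
    y_sum = sum(ys)
    xy_sum = sum(x * y for x, y in zip(xs, ys))
    x2_sum = sum(x * x for x in xs)
    # Same closed form as the quoted return statement, with N = n.
    return (n * xy_sum - x_sum * y_sum) / (n * x2_sum - x_sum * x_sum)


# Hypothetical points lying exactly on the line y = 2x.
slope = ols_slope([1, 2, 3], [2, 4, 6])
print(slope)
```

The sketch assumes at least two distinct x values; otherwise the denominator is zero.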

@Bowen905
Author

The slope function code is:
`cdef class Slope(Rolling):
    """1-D array rolling slope"""
    cdef double i_sum # can be used as i2_sum
    cdef double x_sum
    cdef double x2_sum
    cdef double y_sum
    cdef double xy_sum
    def __init__(self, int window):
        super(Slope, self).__init__(window)
        self.i_sum = 0
        self.x_sum = 0
        self.x2_sum = 0
        self.y_sum = 0
        self.xy_sum = 0

    cdef double update(self, double val):
        self.barv.push_back(val)
        self.xy_sum = self.xy_sum - self.y_sum
        self.x2_sum = self.x2_sum + self.i_sum - 2*self.x_sum
        self.x_sum = self.x_sum - self.i_sum
        cdef double _val
        _val = self.barv.front()
        if not isnan(_val):
            self.i_sum -= 1
            self.y_sum -= _val
        else:
            self.na_count -= 1
        self.barv.pop_front()
        if isnan(val):
            self.na_count += 1
            # return NAN
        else:
            self.i_sum  += 1
            self.x_sum  += self.window
            self.x2_sum += self.window * self.window
            self.y_sum  += val
            self.xy_sum += self.window * val
        cdef int N = self.window - self.na_count
        return (N*self.xy_sum - self.x_sum*self.y_sum) / \
            (N*self.x2_sum - self.x_sum*self.x_sum)`

In this Slope class, x means the window size and y means the updated value, right? My question is: if the inputs are the updated value and the window size, what does this regression mean?

@bxdd
Collaborator

bxdd commented Feb 25, 2021

x means series.index rather than the window size, and y means series.value. The window size is passed in Slope.__init__.
By the way, Slope is not the rolling_slope function itself; rolling_slope is defined here: https://github.com/microsoft/qlib/blob/main/qlib/data/_libs/rolling.pyx#L197
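
To make the roles of x and y concrete, here is a rough plain-Python port of the update logic in the Cython class quoted earlier in the thread. It is only a sketch: the class name `SlopeSketch` is hypothetical, the initial state (a buffer pre-filled with NaN and `na_count` starting at the window size) is an assumption about the `Rolling` base class, and the explicit zero-denominator guard stands in for the NaN that C-style floating-point division would produce.

```python
import math
from collections import deque


class SlopeSketch:
    """Plain-Python port of the quoted Slope.update logic (illustrative only)."""

    def __init__(self, window):
        self.window = window
        # Assumption: the base class pre-fills the buffer with NaN
        # and starts na_count at the window size.
        self.barv = deque([math.nan] * window)
        self.na_count = window
        self.i_sum = 0.0   # number of valid (non-NaN) points in the window
        self.x_sum = 0.0
        self.x2_sum = 0.0
        self.y_sum = 0.0
        self.xy_sum = 0.0

    def update(self, val):
        self.barv.append(val)
        # Every stored point moves one step left: x -> x - 1.
        self.xy_sum -= self.y_sum
        self.x2_sum += self.i_sum - 2 * self.x_sum
        self.x_sum -= self.i_sum
        # The oldest point now sits at x = 0, so only its count and y remain.
        old = self.barv.popleft()
        if math.isnan(old):
            self.na_count -= 1
        else:
            self.i_sum -= 1
            self.y_sum -= old
        if math.isnan(val):
            self.na_count += 1
        else:
            # The newest point always enters at x = window.
            self.i_sum += 1
            self.x_sum += self.window
            self.x2_sum += self.window * self.window
            self.y_sum += val
            self.xy_sum += self.window * val
        n = self.window - self.na_count
        den = n * self.x2_sum - self.x_sum * self.x_sum
        if den == 0:
            return math.nan  # fewer than two valid points in the window
        return (n * self.xy_sum - self.x_sum * self.y_sum) / den


# Hypothetical input series, fed one value at a time.
roller = SlopeSketch(window=2)
slopes = [roller.update(v) for v in [3.0, 5.0, 4.0]]
print(slopes)
```

In this sketch x is never supplied by the caller. It is the position of each value inside the window (1 for the oldest, `window` for the newest), and y is the value itself.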

@Bowen905
Author

Thank you for answering my question, but I'm still confused about the inputs x and y. Could you give me an example for the Slope function? What does the input data set look like?

@bxdd
Collaborator

bxdd commented Feb 25, 2021

The return value of rolling_slope(np.array([1,2,7,10]), 2) is np.array([nan, 1, 5, 3])
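
The numbers above can be reproduced without qlib by refitting a line in every window. The following is a brute-force sketch in plain Python, not qlib's incremental implementation; the name `rolling_slope_sketch` is hypothetical, and the handling of incomplete windows (fit whatever valid points exist, NaN if fewer than two) is an assumption.

```python
import math


def rolling_slope_sketch(values, window):
    """Slope of y against in-window position, recomputed for every window."""
    out = []
    for end in range(1, len(values) + 1):
        ys = values[max(0, end - window):end]
        # x is the position inside the window; NaN observations are dropped.
        pts = [(x, y) for x, y in enumerate(ys, start=1) if not math.isnan(y)]
        n = len(pts)
        if n < 2:
            out.append(math.nan)  # a line needs at least two points
            continue
        x_sum = sum(x for x, _ in pts)
        y_sum = sum(y for _, y in pts)
        xy_sum = sum(x * y for x, y in pts)
        x2_sum = sum(x * x for x, _ in pts)
        out.append((n * xy_sum - x_sum * y_sum) / (n * x2_sum - x_sum * x_sum))
    return out


result = rolling_slope_sketch([1, 2, 7, 10], 2)
print(result)  # [nan, 1.0, 5.0, 3.0]
```

With a window of 2, each slope is simply the difference between consecutive values: 2 - 1, 7 - 2, and 10 - 7.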

@Bowen905
Author

For this example, x (the series.index) is range(4) and y (the series.value) is array([1,2,7,10]). Is that right? Is there any economic meaning to this regression?

@bxdd
Collaborator

bxdd commented Mar 1, 2021

Hi, what you said is almost right (there are still some small details that you can only learn by reading the code).

These operators are just basic operations:

  • rolling_slope: the slope of the linear regression in the given window
  • rolling_rsquare: the R-squared (R^2) of the linear regression in the given window
  • rolling_resi: the regression residual of the last point in the given window, i.e. value_{last} - (slope*index_{last} + intercept)

They may be used when calculating some alpha factors, such as idiosyncratic volatility.
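
As a rough illustration of the three quantities listed above, the sketch below fits one window and returns all three. It is plain Python, not qlib code; the function name `window_regression` is hypothetical, and it assumes the window contains no NaN and that x is the position 1..n inside the window.

```python
import math


def window_regression(ys):
    """Return (slope, rsquare, last_residual) for ys regressed on x = 1..len(ys)."""
    n = len(ys)
    xs = range(1, n + 1)
    x_sum, y_sum = sum(xs), sum(ys)
    xy_sum = sum(x * y for x, y in zip(xs, ys))
    x2_sum = sum(x * x for x in xs)
    y2_sum = sum(y * y for y in ys)
    sxy = n * xy_sum - x_sum * y_sum
    sxx = n * x2_sum - x_sum * x_sum
    syy = n * y2_sum - y_sum * y_sum
    if sxx == 0:
        # Fewer than two points: no line can be fitted.
        return math.nan, math.nan, math.nan
    slope = sxy / sxx
    intercept = (y_sum - slope * x_sum) / n
    # For a one-variable regression, R^2 is the squared correlation of x and y.
    rsquare = sxy * sxy / (sxx * syy) if syy != 0 else math.nan
    # Residual of the newest point, which sits at x = n.
    last_residual = ys[-1] - (slope * n + intercept)
    return slope, rsquare, last_residual


slope, rsquare, resi = window_regression([1, 2, 7, 10])
print(slope, rsquare, resi)
```

For a series that is already a straight line the residual is 0 and R^2 is 1; for a flat series R^2 is undefined, which the sketch reports as NaN.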

@Bowen905
Author

Bowen905 commented Mar 9, 2021

Thank you. I understand the basics of these operators, but I'm still not clear on how to use them. Take the alpha factor 'Beta5', which is defined as 'Slope($close, 5)/$close'. As far as I know, beta measures the correlation between the returns of a benchmark index and the returns of a stock, but I can't see how that relates to the explanation you gave.

@bxdd
Collaborator

bxdd commented Mar 9, 2021

Wow, I’m not sure about the meaning of Beta5, but I guess that Beta5 means the linear regression coefficient (the slope) standardized by the close price.
Is that right? @you-n-g

@you-n-g
Collaborator

you-n-g commented Mar 10, 2021

@bxdd it is right.
@Bowen905 It is different from the Beta you explained.
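
To illustrate that reading of 'Slope($close, 5)/$close', here is a small plain-Python sketch for a single date. The function name `beta5_sketch` and the prices are made up, and it assumes the division uses the most recent close in the window and that at least two closes are available.

```python
def beta5_sketch(closes):
    """Slope of the last 5 closes against position, divided by the latest close."""
    ys = closes[-5:]
    n = len(ys)
    xs = range(1, n + 1)
    x_sum, y_sum = sum(xs), sum(ys)
    xy_sum = sum(x * y for x, y in zip(xs, ys))
    x2_sum = sum(x * x for x in xs)
    slope = (n * xy_sum - x_sum * y_sum) / (n * x2_sum - x_sum * x_sum)
    # Dividing by the price makes the trend comparable across stocks.
    return slope / ys[-1]


# Hypothetical closes rising by 1 per day.
value = beta5_sketch([10.0, 11.0, 12.0, 13.0, 14.0])
print(value)
```

Under these assumptions the factor is a price trend per day expressed as a fraction of the current price, which is why it is unrelated to the market beta from the CAPM.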

@github-actions

github-actions bot commented Jun 8, 2021

This issue is stale because it has been open for three months with no activity. Remove the stale label or comment on the issue; otherwise it will be closed in 5 days.
