Conceptual question regarding norm vector rescaling #19

Closed
dmalzl opened this issue Jul 1, 2020 · 3 comments

Comments

@dmalzl

dmalzl commented Jul 1, 2020

Hi,

First of all thank you for this amazing implementation.
I have a rather conceptual question regarding the code. From using this tool, and from how it is used in HiCExplorer's hicCorrectMatrix, I found that there are two main results that can be returned:

  1. a normalisation vector, either rescaled or not, with kr.get_normalisation_vector(True/False)
  2. a normalised matrix, computed with either the rescaled or the non-rescaled normalisation vector, with kr.get_normalised_matrix(True/False)

My question now is: Why the rescaling?

I get that the non-rescaled result balances the matrix to row and column sums of 1, but is it better to use the rescaled result instead?

Also a line in the hicCorrectMatrix script is a little bit misleading in this sense:

    # set it to False since the vector is already normalised
    # with the previous True
    # correction_factors = np.true_divide(1, kr.get_normalisation_vector(False).todense())
    correction_factors = kr.get_normalisation_vector(False).todense()

However, there is no previous True. For the h5 format it does not matter, since you store the normalised, rescaled matrix anyway. But if the vector is used with cooler, this gives you the non-rescaled vector, or am I wrong here?

Thank you for the answer in advance,
Best regards,
Daniel

@dmalzl
Author

dmalzl commented Jul 1, 2020

Just another question:

Could you also elaborate on how you compute the rescaling? Why do you use the square root of the ratio of the matrix sums? Is there a deeper reason for that?

@LeilyR
Collaborator

LeilyR commented Jul 6, 2020

Hi Daniel,

Thanks for your interest in this tool. In theory, using the rescaled matrix or the original matrix generated directly by the KR algorithm (with row and column sums close to 1) should not change your analysis. Our motivation was to scale the values up to avoid complications caused by very small values in downstream analysis with HiCExplorer. I hope this helps; let me know if you have any further questions.
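
On the square-root question, a minimal NumPy sketch may help. It is an illustration under stated assumptions, not this library's code: the balancing step is a simple damped Sinkhorn-style iteration standing in for KR, the helper name `balance_symmetric` and the toy matrix are made up, and the rescaling target is assumed to be the total sum of the raw matrix. The point it shows is that the vector enters every entry twice (once for the row, once for the column), so multiplying the vector by s multiplies the matrix by s². To change the matrix sum by a given ratio, the vector therefore has to be scaled by the square root of that ratio.

```python
import numpy as np


def balance_symmetric(a, n_iter=500):
    """Find x so that diag(x) @ a @ diag(x) has row/column sums of ~1.

    Damped Sinkhorn-style iteration, used here only as a stand-in for KR.
    Assumes `a` is symmetric with strictly positive entries.
    """
    x = np.ones(a.shape[0])
    for _ in range(n_iter):
        # Fixed point satisfies x * (a @ x) == 1, i.e. unit row sums.
        x = np.sqrt(x / (a @ x))
    return x


# Hypothetical toy contact matrix (symmetric, positive).
raw = np.array([[10.0, 4.0, 2.0, 1.0],
                [4.0, 12.0, 5.0, 2.0],
                [2.0, 5.0, 9.0, 3.0],
                [1.0, 2.0, 3.0, 8.0]])

x = balance_symmetric(raw)
balanced = raw * np.outer(x, x)  # entry (i, j) is raw[i, j] * x[i] * x[j]

# Assumed rescaling target: restore the total sum of the raw matrix.
scale = np.sqrt(raw.sum() / balanced.sum())
x_rescaled = x * scale
rescaled = raw * np.outer(x_rescaled, x_rescaled)

print(balanced.sum(axis=1))        # row sums close to 1
print(rescaled.sum(), raw.sum())   # totals agree
```

The rescaled and non-rescaled matrices differ only by the constant factor `scale ** 2`, which is consistent with the statement above that the choice should not change the analysis as long as downstream steps are insensitive to a global factor.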

@dmalzl
Author

dmalzl commented Jul 6, 2020

I see. I also thought this shouldn't matter. Anyway, thanks for the clarification :)

dmalzl closed this as completed Jul 6, 2020