
Enable overplotting additional data via linking #192

Merged · 8 commits · Jul 29, 2020

Conversation

@rosteen (Collaborator) commented Jul 7, 2020

Previously, datasets loaded after the first could not be plotted in the profile viewer at the same time as earlier datasets. This PR fixes that by "blindly" linking each subsequent dataset to the ones loaded before it, without any check that their units match. From offline discussion today, it sounds like we intend to add intelligent unit conversion, so that datasets actually match, in a later PR.

I ended up putting this in the top-level jdaviz app rather than at a lower level like the spectrum viewer, on the assumption that we're going to standardize on the parser model of data loading, rather than the current separate load_data function in the specviz helper.

@javerbukh (Contributor) left a comment

I noticed that the order in which you add or select data in the viewer changes how the final plot looks. Using the three spectra below, I added the URL one first, followed by the two custom ones, then reversed the order (by toggling the check boxes in the drop-down). The plot looks completely different. This may just be how it works until unit conversion is implemented, in which case I think this PR covers the first part of the issue.

I used this dataset: spec_url = 'https://dr14.sdss.org/optical/spectrum/view/data/format=fits/spec=lite?plateid=1323&mjd=52797&fiberid=12'

In addition to the following spectra:

from specutils import Spectrum1D
from astropy.modeling.models import Gaussian1D
import numpy as np
import astropy.units as u

y = Gaussian1D(mean=50, stddev=10)(np.arange(100)) + np.random.sample(100) * 0.1
spec1 = Spectrum1D(flux=y * u.Jy, spectral_axis=np.arange(100) * u.AA)
specviz.load_data(spec1)

# Same flux model, but with the spectral axis in meters instead of Angstroms
y = Gaussian1D(mean=50, stddev=10)(np.arange(100)) + np.random.sample(100) * 0.1
spec2 = Spectrum1D(flux=y * u.Jy, spectral_axis=np.arange(100) * u.m)
specviz.load_data(spec2)

@rosteen (Collaborator, Author) commented Jul 8, 2020

Well, the exact point you raised isn't a bug (I don't think), but it did lead me to another bug that I should fix. In your case the plot looks different depending on load order because your synthetic spectra span 0-100 Angstroms, whereas the real spectrum spans ~4000-9000 Angstroms (the y scales are also totally different), so whichever dataset loads first sets a plot scaling very different from the other order.

However, I do see that even if you use spec1 = Spectrum1D(flux=y * u.Jy, spectral_axis=np.arange(5000, 5100) * u.AA) instead, the synthetic spectrum still ends up at x pixels 0:100 on the plot. I had been testing with two spectra of the same size, so I didn't notice. I think I need to link the world coordinates rather than the pixel coordinates; giving that a try.
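The misplacement follows directly from what a pixel link means: sample i of the new spectrum is drawn wherever pixel i of the reference sits, regardless of wavelength, while a world link keeps each sample at its own wavelength. A toy numpy illustration (no glue/jdaviz involved; the axis values are made up to mirror the spectra in this thread):

```python
import numpy as np

# Reference spectrum: stands in for the real data at ~4000-9000 Angstrom.
ref_wave = np.linspace(4000, 9000, 100)

# Synthetic spectrum: 100 samples covering 5000-5100 Angstrom.
syn_wave = np.arange(5000, 5100)

# Pixel linking: sample i of the synthetic spectrum lands at the
# reference wavelength of pixel i, smearing it across 4000-9000 A.
pixel_linked_positions = ref_wave[np.arange(syn_wave.size)]

# World linking: each sample stays at its own wavelength.
world_linked_positions = syn_wave

assert pixel_linked_positions.min() == 4000  # stretched across the whole plot
assert world_linked_positions.min() == 5000  # correctly placed at 5000 A
```

With two spectra of identical size and coverage the two schemes happen to coincide, which is why the problem only appeared once the synthetic axis was shifted.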

@rosteen (Collaborator, Author) commented Jul 8, 2020

Committed a change to link the world coordinates instead of the pixel coordinates, which puts the generated spectra in your test in the right place. You'll still see differences due to plot scaling based on the order they're added to the plot.

I recommend the following code snippet for generating spectra that are a bit more compatible with that data file:

from specutils import Spectrum1D
from astropy.modeling.models import Gaussian1D
import numpy as np
import astropy.units as u

y = Gaussian1D(mean=50, stddev=20)(np.arange(100))*100 + np.random.sample(100)*20 + 100
spec1 = Spectrum1D(flux=y * u.Jy, spectral_axis=np.arange(5000,5500,5) * u.AA)
specviz.load_data(spec1)

y = Gaussian1D(mean=50, stddev=20)(np.arange(100))*100 + np.random.sample(100)*20 + 200
spec2 = Spectrum1D(flux=y * u.Jy, spectral_axis=np.arange(5000,5500,5) * u.m)
specviz.load_data(spec2)

@javerbukh (Contributor)

You're right! That looks much clearer.

Side note: I accidentally hit bug #185 while testing and tried to replicate it. I think it happens when you deselect and reselect data too quickly. I don't think it's related to the work here, but we should prioritize fixing that bug next sprint. Otherwise this looks good!

@nmearl (Contributor) commented Jul 9, 2020

This looks good, but I have some reservations about how the helper classes would use it. Currently specviz does not use the base-class load_data method, and so would not get the auto-linking.

Perhaps instead of implementing the check in the load_data and add_data methods, you could just listen for the DataCollectionAddMessage event that's raised whenever data is added to the data collection. Then we can be sure we get linking even if a user does e.g. app.data_collection['new data'] = .... Something like:

self.hub.subscribe(self, DataCollectionAddMessage,
                   handler=self._link_new_data)

@rosteen (Collaborator, Author) commented Jul 9, 2020

I was operating under the assumption that specviz would ditch its load_data method in favor of a parser used by the base load_data, but that doesn't cover the case you mentioned of a user adding data manually to the collection. I think you're right that the DataCollectionAddMessage event is more elegant. It has the added bonus that (I think) we could then remove the manual linking from the gaussian smoothing and model fitting plugins. The only downside I see is that I'm currently using the length of the data_collection list before the data is added to know what to link, so I'd need to code around the fact that the message is broadcast after that point. I'll see if I can make that change and test it this afternoon.
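The length-tracking concern goes away because the add message carries the new dataset itself, so the handler can link it against everything already in the collection. Here is a pure-Python toy of that pattern; the class names echo glue's but this is not glue's actual Hub API, just a sketch of the control flow:

```python
class DataCollectionAddMessage:
    """Toy stand-in for glue's message: carries the dataset just added."""
    def __init__(self, data):
        self.data = data


class App:
    """Toy app that links every newly added dataset to all earlier ones."""
    def __init__(self):
        self.data_collection = []
        self.links = []

    def add_data(self, data):
        self.data_collection.append(data)
        # The broadcast happens after the append, as in glue, but the
        # handler doesn't care: the message identifies the new dataset.
        self._link_new_data(DataCollectionAddMessage(data))

    def _link_new_data(self, msg):
        # Link the dataset named in the message against every earlier one;
        # no pre-add length bookkeeping is needed.
        for existing in self.data_collection:
            if existing is not msg.data:
                self.links.append((existing, msg.data))


app = App()
app.add_data('spec1')
app.add_data('spec2')
app.add_data('spec3')
# spec2 is linked to spec1; spec3 is linked to both spec1 and spec2.
```

Routing all linking through one handler also means any entry point that mutates the collection (loaders, plugins, or direct user assignment) gets linked for free, which is the point of nmearl's suggestion.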

@rosteen (Collaborator, Author) commented Jul 9, 2020

I made the change to use DataCollectionAddMessage and removed the manual linking that was happening in two of the plugins. I left the linking in the collapse plugin, since it happens over more/different axes than the automatic linking added in this PR.

@duytnguyendtn (Collaborator) commented Jul 22, 2020

Tested it on my end and everything looks good! I'd like to see why the checks are failing before giving my approval, but apparently build 20200709.06 just doesn't exist? Or at least I can't find it...

@nmearl (Contributor) commented Jul 23, 2020

The build failures are due to linking failures in the gaussian smoothing plugin tests. @rosteen I've PR'd a fix against your fork.

@rosteen (Collaborator, Author) commented Jul 28, 2020

The latest commit fixes the failing test locally. The test as written was failing because the data linking now happens in jdaviz.Application, while the test was importing and using glue.core.Application. Additionally, the default linking now uses world coordinates rather than pixel coordinates, and the performance hit doesn't seem big enough to override that for cases like smoothing where we know the coordinates match up perfectly.

EDIT: this is obviously exactly what @nmearl said above, my mind is all over the place today and I had forgotten about that comment. I could have sworn that I had accepted and merged the mentioned PR against my fork, but either way the test should be passing now.

@duytnguyendtn (Collaborator) left a comment

Looks good! Just tested again in a clean environment.

@nmearl nmearl merged commit 5769d2e into spacetelescope:master Jul 29, 2020
@rosteen rosteen deleted the overplot branch August 18, 2020 19:28
4 participants