When should we compute values w.r.t. TIME and TIME_CENTROID respectively? #249

sjperkins opened this issue Feb 28, 2018 · 7 comments

@sjperkins
Member

The Measurement Set specification says that UVW coordinates are calculated w.r.t. TIME_CENTROID.

This raises the question of whether we should calculate other quantities w.r.t. TIME_CENTROID rather than TIME. I can think of:

  1. Parallactic Angles?
  2. Direction Independent Effects?

Related to #248
/cc @JSKenyon @o-smirnov @landmanbester
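
For anyone who wants to inspect the two columns in their own data, here is a minimal sketch using python-casacore (`my.ms` is a placeholder path, not a real dataset):

```python
# Compare the TIME and TIME_CENTROID columns of a Measurement Set.
# Both hold MJD seconds; TIME is the nominal interval midpoint,
# TIME_CENTROID the (flag-weighted) centroid of the contributing samples.
import numpy as np
from casacore.tables import table

ms = table("my.ms", ack=False)   # "my.ms" is a placeholder path
time = ms.getcol("TIME")
centroid = ms.getcol("TIME_CENTROID")
ms.close()

diff = centroid - time
print("max |TIME_CENTROID - TIME| = %.3e s" % np.abs(diff).max())
print("rows that differ: %d / %d" % ((diff != 0).sum(), diff.size))
```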

@sjperkins sjperkins changed the title When should we compute values w.r.t. TIME and TIME_CENTROID respectively When should we compute values w.r.t. TIME and TIME_CENTROID respectively? Feb 28, 2018
@o-smirnov
Contributor

In theory, using TIME_CENTROID is the most precise approach. In practice, most applications will not see a difference in the result (for things like PA and DIEs, it's certainly down in the noise and/or machine precision). So I would think really hard before we (a) pour Simon's time into this, or (b) give up any significant performance.

Arguing the other side: at some point we need to support BDA (baseline-dependent averaging) datasets, in which case many things become per-baseline anyway...
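
To put a rough number on the PA point: a quick check with the standard parallactic-angle formula (the latitude, declination, hour angle and 1 s offset below are arbitrary illustrative inputs, not values from this project) suggests the change over a ~1 s TIME/TIME_CENTROID offset is of order 1e-4 rad:

```python
# Rough numerical check of how much the parallactic angle changes over
# a TIME vs TIME_CENTROID offset. All inputs are illustrative.
import numpy as np

def parallactic_angle(ha, dec, lat):
    """Standard spherical-astronomy formula; all angles in radians."""
    return np.arctan2(
        np.sin(ha),
        np.tan(lat) * np.cos(dec) - np.sin(dec) * np.cos(ha))

lat = np.deg2rad(-30.7)            # a MeerKAT-like latitude (assumed)
dec = np.deg2rad(-45.0)            # arbitrary source declination
ha = np.deg2rad(20.0)              # arbitrary hour angle

dt = 1.0                           # assumed 1 s TIME/TIME_CENTROID offset
dha = 2.0 * np.pi * dt / 86164.0   # sidereal rotation in dt seconds [rad]

dq = parallactic_angle(ha + dha, dec, lat) - parallactic_angle(ha, dec, lat)
print("PA change over %.1f s: %.3e rad" % (dt, dq))   # of order 1e-4 rad
```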

@twillis449

I completely agree with OMS on this issue.

@sjperkins
Member Author

@o-smirnov @landmanbester @SaiyanPrince @JSKenyon @MichelleLochner @bennahugo @rdeane @twillis449.

I've written up a short LaTeX document describing the accuracy (DDFacet, CubiCal, Bayesian inference) versus performance (Bayesian inference) issues facing Montblanc. I'd appreciate your input, because it seems we're going to have to sacrifice some performance in order to obtain accuracy when data is flagged.

https://www.overleaf.com/read/ccvvkzgymntg
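
A toy illustration of the flagging issue (an assumed example, not taken from the document): when high-resolution samples are averaged, flags pull the centroid away from the nominal interval midpoint:

```python
# Toy example: averaging ten one-second samples where the first four
# are flagged. The nominal midpoint (TIME) stays put, while the
# flag-weighted centroid (TIME_CENTROID) shifts later.
import numpy as np

samples = np.arange(10.0)                # sample timestamps, 1 s apart
flags = np.zeros(10, dtype=bool)
flags[:4] = True                         # first four samples flagged

time = samples.mean()                    # nominal midpoint: 4.5
time_centroid = samples[~flags].mean()   # unflagged centroid: 6.5
print(time, time_centroid)
```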

@sjperkins
Member Author

/cc @JoshVStaden @IanHeywood too

@sjperkins
Member Author

sjperkins commented Mar 13, 2018

Updated the document to indicate that TIME_CENTROID usually differs from TIME when data has been averaged...

@twillis449

Two comments:

  1. You should always use the UVW coordinates provided in the Measurement Set rather than computing them yourself from differences in the XYZ positions of the antennas making up the baseline. In real-world data, the observatory making the observations may have 'adjusted' the UVW values due to some obscure local conditions. @o-smirnov and I had an extensive discussion about this issue many years ago with Wim Brouw, author of the CASA Measures package.
  2. When you do data averaging, you are doing so on shorter baselines, where the fringe rate is much slower than on long baselines. For example, the SKA will have a longest baseline of somewhere around 150 km, a shortest baseline of 29 m, and a 'standard' integration period of 0.14 s (if the SKA has continued to shrink, my values may no longer be quite right). At a wavelength of 21 cm you can average 512 data points on your 29 m baseline and not lose any information content in your field of view. And I suspect any differences between TIME and TIME_CENTROID would not be noticeable in the averaged data (a back-of-envelope check follows this list).
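
A crude order-of-magnitude check of point 2, using the common worst-case approximation that the fringe rate on a baseline of length B, for a source at offset theta from the phase centre, is roughly 2*pi*omega_E*(B/lambda)*theta. The inputs are the numbers quoted above, not authoritative SKA figures:

```python
# Back-of-envelope time smearing on a short baseline, using the
# worst-case fringe-rate approximation named above. Not SKA-official.
import numpy as np

omega_e = 7.2921e-5          # Earth rotation rate [rad/s]
baseline = 29.0              # shortest baseline [m]
wavelength = 0.21            # [m]
t_avg = 512 * 0.14           # averaging interval [s] (~71.7 s)
theta = np.deg2rad(1.0)      # assumed source offset from phase centre

# Phase swept across the averaging interval (worst case)
dphi = 2.0 * np.pi * omega_e * (baseline / wavelength) * theta * t_avg

# Fractional amplitude lost; np.sinc(x) = sin(pi x) / (pi x)
loss = 1.0 - np.sinc(dphi / (2.0 * np.pi))
print("phase swept: %.3f rad, amplitude loss: %.1e" % (dphi, loss))
# ~0.08 rad swept and a loss of a few parts in 1e4 -- consistent with
# the claim that averaging this hard costs essentially nothing here.
```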

@sjperkins
Member Author

It seems like the per-baseline complex phase (from UVW coordinates) is going to be the way forward for the purposes of correctness. However, I still aim to support the antenna decomposition, so I've been putting switches into the TensorFlow kernels in the dask branch to allow plugging terms in at various places.
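
For concreteness, a minimal sketch of such a per-baseline phase term computed directly from MS UVW rows (the sign convention and the fake inputs are illustrative assumptions, not the actual Montblanc kernels):

```python
# Per-baseline RIME phase term
# K = exp(-2*pi*i*(u*l + v*m + w*(n-1))/lambda), computed straight
# from Measurement Set UVW rows rather than per-antenna terms.
import numpy as np

def phase_term(uvw, lm, wavelength):
    """uvw: (rows, 3) [m]; lm: (sources, 2) direction cosines."""
    l, m = lm[:, 0], lm[:, 1]
    n = np.sqrt(1.0 - l**2 - m**2)
    # (rows, sources) path-length differences [m]
    path = (uvw[:, 0, None] * l[None, :] +
            uvw[:, 1, None] * m[None, :] +
            uvw[:, 2, None] * (n[None, :] - 1.0))
    return np.exp(-2.0j * np.pi * path / wavelength)

uvw = np.random.randn(5, 3) * 100.0   # fake UVW rows [m]
lm = np.array([[0.01, -0.02]])        # one source near phase centre
K = phase_term(uvw, lm, wavelength=0.21)
print(K.shape)                        # (5, 1)
```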
