Fix the catalogue #57

MatthewJA · 2016-05-24T03:52:19Z

Use wcs_pix2world instead of all_pix2world.
Compare rgz-analysis/RGZcatalogcode/RGZcatalog.py to crowdastro/catalogue.py — do they behave differently? (yes)
Find the leak. How many duplicates are deleted? Are we just missing objects due to the IR threshold radius? Plot IR threshold radius against number of identified AGNs for catalogue #50
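One way to approach the threshold-radius question in the last item is to count, for each candidate radius, how many hosts have at least one IR object within that radius. The sketch below is a brute-force illustration under assumptions: the function names and the (RA, Dec) tuple inputs are hypothetical and are not taken from crowdastro/catalogue.py. The resulting counts can then be plotted against radius.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) points (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(min(1.0, a))))


def count_matched(hosts, ir_objects, radius_arcsec):
    """Count hosts with at least one IR object within radius_arcsec."""
    radius_deg = radius_arcsec / 3600
    return sum(
        any(angular_separation_deg(h_ra, h_dec, ra, dec) <= radius_deg
            for ra, dec in ir_objects)
        for h_ra, h_dec in hosts
    )


def matches_by_radius(hosts, ir_objects, radii_arcsec):
    """Map each threshold radius (arcsec) to the number of matched hosts."""
    return {r: count_matched(hosts, ir_objects, r) for r in radii_arcsec}
```

This is quadratic in the number of objects, so for a full catalogue a spatial index would likely be needed; the sketch is only meant to show what is being counted.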


MatthewJA · 2016-05-24T06:35:43Z

Well... I think I found the leak.

DEBUG:root:3059 hosts with no associated SWIRE object.

MatthewJA · 2016-05-24T06:59:01Z

Flipping the coordinates along the y-axis seems to have fixed the issue. I'm not sure why this convention is different to the radio convention - if it isn't different, then the radio matching must be working by pure coincidence, which seems unlikely as the radio sky is not as dense as the IR sky.
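A y-axis flip of this kind typically shows up when one source of pixel coordinates places the origin at the top-left of the image and another places it at the bottom-left. The thread does not say which conventions were involved or exactly how the flip was implemented, so the following is only a minimal sketch assuming zero-indexed pixel coordinates and a known image height; `flip_y` is a hypothetical helper.

```python
def flip_y(x, y, image_height):
    """Reverse the y-axis of a zero-indexed pixel position.

    Row 0 maps to row image_height - 1 and vice versa; x is unchanged.
    """
    return x, (image_height - 1) - y
```

Applying the flip twice returns the original position, which makes it easy to check that a pipeline flips exactly once.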

I throw out ~2200 hosts, but (at least) hundreds are just duplicates (compared with ~60 duplicates before). The debug output went out of buffer, so I'll have to re-run to get better numbers. I've sent this to Julie, so we'll see if these are reasonable numbers soon.

I am now re-running the catalogue with PG-means.

MatthewJA · 2016-05-24T07:02:29Z

Some numbers: PG-means finds 4433 consensuses, compared to KDE's 4400. This means PG-means throws out fewer classifications; I'm not sure what the implications are.

A note mainly for myself: I changed coordinate systems for PG-means consensus generation, so that may have an effect on how much I throw out.

MatthewJA · 2016-05-24T07:40:43Z

PG-means throws out 1632 consensuses for not having an associated SWIRE component, bringing the total number of results down to ~1500 which seems too small. This is really strange! I'll stick with KDE for now but I think I could fix PG-means to work properly. It's entirely possible that GMM is a terrible idea for this data, so I might try k-means instead. PG-means is a wrapper, so it's trivial to change.

jbanfield · 2016-05-27T03:44:16Z

I'm trying to merge the RGZ catalogue you sent with SWIRE.

Look at CI0074C1 and CI0074C2 in the rgz_radio_component file. There is only CI0074C1 in the 11JAN2014 ATLAS catalogue. What is CI0074C2? Plus it is not in the rgz_host file. Not sure how everything goes together.

MatthewJA · 2016-05-27T03:57:59Z

I don't have the 11JAN2014 ATLAS catalogue; I have the 23JULY2015 ATLAS catalogue. This is the one on GitLab.

MatthewJA · 2016-05-27T04:04:31Z

Just to make sure we're on the same page, I'm using the CDFS images from 11JAN2014, also on GitLab.

The sources/Zooniverse IDs in the rgz_host file only refer to an arbitrary subject that contains that line's RGZ/SWIRE object within a 1' radius. This is purely for reference, so these aren't relevant (and are actually ID_RGZ, not ID, in the CSV you sent me).

jbanfield · 2016-05-27T10:26:00Z

Yes, you should have the 23JULY2015 ATLAS catalogue and the 11JAN2014 images. The IDs differ between the two catalogues. The 23JULY2015 catalogue has one entry for every Gaussian fit to the radio image at a signal-to-noise >= 5. If a Gaussian is labelled with C1, C2, C3, etc., then more than one Gaussian was required to fit the radio structure at that location in the image. In cases like this, the RGZ image was only centred on C1 in order to reduce the number of duplicates.
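To illustrate the C1/C2/C3 labelling described above, here is a small sketch that splits a component ID into a base name and a component index, then groups IDs that share a base. It assumes the suffix is always a trailing `C` followed by digits, as in CI0074C1; the helper names are hypothetical and the real ATLAS naming scheme may have cases this does not cover.

```python
import re
from collections import defaultdict

# Assumed pattern: a base name followed by a trailing "C<number>" suffix.
_COMPONENT = re.compile(r"(.+?)C(\d+)")


def split_component_id(component_id):
    """Return (base, index); index is None when there is no C<number> suffix."""
    match = _COMPONENT.fullmatch(component_id)
    if match is None:
        return component_id, None
    return match.group(1), int(match.group(2))


def group_components(component_ids):
    """Group component IDs by their base name, preserving input order."""
    groups = defaultdict(list)
    for component_id in component_ids:
        base, _ = split_component_id(component_id)
        groups[base].append(component_id)
    return dict(groups)
```

For example, `group_components(["CI0074C1", "CI0074C2"])` puts both IDs under the base `CI0074`, which is the set of Gaussians that were fitted to one radio structure.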

I was using the 11JAN2014 catalogue to make the bookkeeping file.

jbanfield · 2016-05-27T10:27:47Z

My question is how do you record the match in the catalogue if there is more than one radio component making up the consensus? E.g., the tutorial image, counting only the double radio source in the centre of the image.

MatthewJA · 2016-05-27T10:34:16Z

> I was using the 11JAN2014 catalogue to make the bookkeeping file.

That might explain why some of the radio components I had expected were missing.

> My question is how do you record the match in the catalogue if there is more than one radio component making up the consensus? E.g., the tutorial image, counting only the double radio source in the centre of the image.

Two rows with the same RGZ name but different component names. The advantage of this is that it means all rows are the same length.

RGZ names should be unique in rgz_hosts.csv and component names should be unique in rgz_radio_components.csv.
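The uniqueness expectations above can be checked mechanically. The sketch below reports values that appear more than once in a given column of a CSV file. The column name `rgz_name` used in the usage note is an assumption for illustration; the actual headers in rgz_hosts.csv and rgz_radio_components.csv may differ.

```python
import csv
import io
from collections import Counter


def duplicated_values(rows, column):
    """Return the sorted values that occur more than once in `column`."""
    counts = Counter(row[column] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def duplicated_in_csv(csv_text, column):
    """Parse CSV text with a header row and report duplicates in `column`."""
    return duplicated_values(csv.DictReader(io.StringIO(csv_text)), column)
```

Calling `duplicated_in_csv(text, "rgz_name")` on the contents of the hosts file should return an empty list if every RGZ name is unique. In the components file the same RGZ name is expected to repeat (one row per component), so there the check belongs on the component-name column instead.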

jbanfield · 2016-06-24T01:30:17Z

What happens to the radio subjects with no infrared host?

MatthewJA · 2016-06-24T01:35:07Z

They're skipped. If a radio subject doesn't appear in the catalogue, then either there was no nearby SWIRE object or "No IR Source" was selected as the majority by volunteers.

jbanfield · 2016-06-24T05:35:50Z

I'm working on creating the RGZ CDFS catalogue. 2 questions:

Where do the radio_component values from rgz_component_kde come from? - I'm thinking from MongoDB

Where do the source values come from for the rgz_host_kde? - I'm thinking from the 23JULY2015 catalogue

Is this correct?

MatthewJA · 2016-06-24T05:40:58Z

Other way around — the component IDs are from the 23JULY2015 catalogue. The zooniverse_id and source columns in the hosts file should be ignored.

jbanfield · 2016-06-24T06:09:26Z

Thanks!

Do you have a list of the RGZ subjects with no consensus or no SWIRE id?

MatthewJA · 2016-06-24T06:19:50Z

Hmm, I don't. I could probably generate one but I'll have to regenerate the catalogue (which shouldn't be different).

MatthewJA added a commit that referenced this issue May 24, 2016

Update coordinates after discussions with Julie. More debug output. #57

99944e8

MatthewJA added the astro label May 25, 2016

MatthewJA added the help wanted label Sep 15, 2016
