Simbad multi-object search behaviour #967

Open
JLeftley opened this issue Aug 8, 2017 · 8 comments

JLeftley commented Aug 8, 2017

Currently, if you query Simbad.query_objects (or Simbad.query_region) with many objects, the returned table can have a different number of rows than the input, which makes comparing input and output very difficult. This is because Simbad can return multiple rows for a single object (Centaurus, for example) or no rows at all for an unrecognised object. The returned object IDs aren't necessarily the input IDs either, so you can't use them to search the returned table. You can query one object at a time, but for a few thousand objects that is quite slow.

Would it be possible to change this behaviour, and likewise for multiple-coordinate searches? I'm not sure of the best way to handle this. Returning the original search names/coordinates, and/or the same number of rows in the same order, could work. Even just a row label that identifies results as belonging to the same search group would be fine. A search by multiple coordinates (or multiple region criteria) would benefit greatly from the second or third behaviour, as at the moment it is very difficult to tell which object belongs to which region search. The same behaviour for Ned would be even better :)

For example:

rac=['144.696458 -60.09181', '203.426453 -65.99033']
Simbad.query_region(SkyCoord(rac, unit=u.deg),'1d')

Would return:

MAIN_ID         RA            DEC           OTYPE  GROUP
                h m s         d m s
object          unicode13     unicode13     object int
--------------- ------------- ------------- ------ ------
IC 2501         9 38 47.146   -60 5 30.52   PN     1
TYC 9003-1531-1 13 33 42.8988 -65 59 11.376 Star   2
NGC 5189        13 33 32.86   -65 58 27.1   PN     2
TYC 9003-1874-1 13 33 27.265  -65 58 27.9   Star   2
TYC 9003-654-1  13 33 25.988  -66 0 14.359  Star   2
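
Until something like the GROUP column exists, one way to emulate it on the client side is to run one region query per input coordinate and tag each returned row with the index of the coordinate that produced it. The sketch below is only an illustration: `tag_with_group` and the `query` callable are hypothetical names, the rows are plain dicts rather than astropy tables, and the stand-in `fake_query` replaces a real per-coordinate call such as `Simbad.query_region` so that the example runs offline.

```python
def tag_with_group(coords, query):
    """Run `query` once per coordinate and label every returned row.

    GROUP is the 1-based position of the coordinate in the input, so
    each row can be traced back to the search that produced it.
    """
    tagged = []
    for group, coord in enumerate(coords, start=1):
        for row in query(coord):
            tagged.append({**row, "GROUP": group})
    return tagged


# Stand-in for a real per-coordinate query (hypothetical data source);
# the rows mirror part of the example table above.
_FAKE_SKY = {
    "144.696458 -60.09181": [{"MAIN_ID": "IC 2501", "OTYPE": "PN"}],
    "203.426453 -65.99033": [
        {"MAIN_ID": "TYC 9003-1531-1", "OTYPE": "Star"},
        {"MAIN_ID": "NGC 5189", "OTYPE": "PN"},
    ],
}


def fake_query(coord):
    # An unknown coordinate yields no rows, like an empty region search.
    return _FAKE_SKY.get(coord, [])


rac = ["144.696458 -60.09181", "203.426453 -65.99033"]
rows = tag_with_group(rac, fake_query)
for row in rows:
    print(row["MAIN_ID"], row["OTYPE"], row["GROUP"])
```

This trades speed for traceability: it is still the one-query-per-position approach, so it stays slow for thousands of positions.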

Multi object example:

from astroquery.eso import Eso
from astroquery.simbad import Simbad

# Log in to the ESO archive
eso = Eso()
login = input('ESO Login: ')
eso.login(login)

# Configure the Simbad query
Sim = Simbad()
Sim.add_votable_fields('otype')
Sim.ROW_LIMIT = 1e6
Sim.TIMEOUT = 500

# Query the ESO archive for MUSE observations
eso.ROW_LIMIT = 1e12
table = eso.query_instrument('muse', night_flag=0, column_filters={'dp_type': 'OBJECT'})
# Remove duplicate names
oname = list(set(table['Object']))
onar = Sim.query_objects(oname)
# Compare the number of returned rows with the number of names queried
print(len(onar['MAIN_ID']), len(oname))
@keflavich
Contributor

Somehow this slipped by me, but yes, this should be possible and maybe even straightforward. A PR implementing it would be welcome, otherwise maybe we can tackle this next time there's a hack session.

@JLeftley
Author

I have no solution for this to post yet but I'll put it here if I manage to make one before the next hack session :)

@cdeil
Member

cdeil commented Feb 19, 2018

I want to query many names in Simbad, to figure out which ones resolve and which don't.

This is what I tried:
https://gist.github.com/cdeil/ad1ffdd724878f4d72d25a117d92d5a5

It doesn't give me what I want, the issues I have are:

  1. The number of rows in table plus the number of entries in table.errors doesn't match the number of names I queried!?
  2. And the result table doesn't contain the name I queried, i.e. I can't easily figure out which row corresponds to which query name?

Is there a way to do this currently with query_objects? Or do I have to run one query_object per object? Will SIMBAD block me if I run ~ 100 queries, possibly a few times?

@keflavich
Contributor

@cdeil query_objects sends the list of names to SIMBAD in a single form. I believe what happened is:

  1. Several entries (7) resulted in no match, but were recognized as valid names (I'm uncertain about this)
  2. Several entries (another 7) somehow errored - perhaps they did not parse properly? I'm again not sure why
  3. Both of the above categories are simply excluded from the results.

This is a good question for the CDS folks. I suggest e-mailing them directly to see if there's a way to get a table returned with blanks for missing fields or something similar.

@aoberto
Contributor

aoberto commented Feb 20, 2018

Hi, regarding question 2: if you are using scripts in SIMBAD, there is a way to get the names you gave: %OBJECT. For votable fields, you can use TYPED_ID.
SIMBAD blocks you if you send more than 6 queries in the same second, and you can query with a list of up to 10000 names.
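
Those two limits suggest a simple client-side pattern: split a long name list into batches of at most 10000 and space the requests so that fewer than 6 go out per second. A minimal sketch follows, assuming a hypothetical `send` callable that stands in for whatever actually submits one batch (for example a wrapper around `Simbad.query_objects`); `batched` and `throttled_map` are illustrative helper names, not astroquery functions.

```python
import time

MAX_NAMES_PER_QUERY = 10000   # list-size limit quoted above
MAX_QUERIES_PER_SECOND = 5    # stay under the 6-per-second limit quoted above


def batched(names, size=MAX_NAMES_PER_QUERY):
    """Yield consecutive slices of `names` with at most `size` entries."""
    for start in range(0, len(names), size):
        yield names[start:start + size]


def throttled_map(send, batches, per_second=MAX_QUERIES_PER_SECOND,
                  clock=time.monotonic, sleep=time.sleep):
    """Call `send(batch)` per batch, starting calls at least 1/per_second apart."""
    min_gap = 1.0 / per_second
    last_call = None
    results = []
    for batch in batches:
        if last_call is not None:
            wait = min_gap - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.append(send(batch))
    return results


# Offline demonstration: `send` just reports the batch size.
names = [f"obj {i}" for i in range(25000)]
sizes = throttled_map(len, batched(names))
print(sizes)
```

Passing `clock` and `sleep` as parameters is a design choice that keeps the pacing logic testable without real waiting.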

@aoberto
Contributor

aoberto commented Feb 20, 2018

The errors generated by this list of names are all here:

::error:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

[4] Identifier not found in the database : GAL 292.2-00.5
[5] 'PWN G292.15-0.54': No known catalog could be found
[10] Identifier not found in the database : GAL 292.2-00.5
[11] Identifier not found in the database : GAL 318.2+00.1
[13] Identifier not found in the database : GAL 292.2-00.5
[14] 'PWN G292.15-0.54': No known catalog could be found
[22] 'AX J150436-5824': this identifier has an incorrect format for catalog:
AX : ASCA satellite, X-ray

[25] Identifier not found in the database : GAL 327.15-01.04
[26] Identifier not found in the database : GAL 327.1-01.1
[30] 'PWN G18.5-0.4': No known catalog could be found
[33] Identifier not found in the database : GAL 018.6-00.2
[52] Identifier not found in the database : GAL 030.8-00.2
[61] Identifier not found in the database : GAL 033.2-00.6
[68] Identifier not found in the database : GAL 042.8+00.6
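
Because each error line carries a bracketed number and the offending identifier, the list can be turned into a lookup of which inputs failed. The sketch below parses the two line shapes visible in the output above. The regular expressions and the `parse_simbad_errors` name are assumptions made for illustration, and whether the bracketed number is a position in the submitted list is a guess that should be checked against the real input.

```python
import re

# Shape 1: [4] Identifier not found in the database : GAL 292.2-00.5
_NOT_FOUND = re.compile(
    r"^\[(\d+)\]\s+(Identifier not found in the database)\s*:\s*(.+)$")
# Shape 2: [5] 'PWN G292.15-0.54': No known catalog could be found
_QUOTED = re.compile(r"^\[(\d+)\]\s+'([^']+)':\s*(.+)$")


def parse_simbad_errors(text):
    """Return a list of (number, identifier, message) tuples.

    Lines matching neither shape (such as the catalog description that
    follows an "incorrect format" error) are skipped.
    """
    parsed = []
    for line in text.splitlines():
        line = line.strip()
        m = _NOT_FOUND.match(line)
        if m:
            parsed.append((int(m.group(1)), m.group(3).strip(), m.group(2)))
            continue
        m = _QUOTED.match(line)
        if m:
            message = m.group(3).rstrip(": ").strip()
            parsed.append((int(m.group(1)), m.group(2), message))
    return parsed


sample = """\
[4] Identifier not found in the database : GAL 292.2-00.5
[5] 'PWN G292.15-0.54': No known catalog could be found
[22] 'AX J150436-5824': this identifier has an incorrect format for catalog:
AX : ASCA satellite, X-ray
"""

for number, identifier, message in parse_simbad_errors(sample):
    print(number, identifier, "->", message)
```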

@keflavich
Contributor

OK, so we should probably add the %OBJECT column in the query_objects query in astroquery.
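
With the typed name carried through in the output, realigning results to the input becomes a dictionary lookup. The following is a rough sketch of that idea, not astroquery's implementation: rows are plain dicts, the `TYPED_ID` key is assumed to hold the name exactly as submitted, and `align_to_input` is a hypothetical helper.

```python
def align_to_input(names, rows, key="TYPED_ID"):
    """Return one entry per input name, in input order.

    Each entry is (name, matches), where `matches` is the list of rows
    whose `key` equals that name; it is empty for unresolved names.
    """
    by_name = {}
    for row in rows:
        by_name.setdefault(row[key], []).append(row)
    return [(name, by_name.get(name, [])) for name in names]


# Illustrative data only: one name resolves, one does not.
names = ["NGC 5189", "PWN G292.15-0.54"]
rows = [{"TYPED_ID": "NGC 5189", "MAIN_ID": "NGC 5189", "OTYPE": "PN"}]

for name, matches in align_to_input(names, rows):
    status = matches[0]["MAIN_ID"] if matches else "unresolved"
    print(f"{name}: {status}")
```

The output has exactly one entry per queried name, so unresolved names show up as empty match lists instead of silently disappearing.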

@keflavich
Contributor

Well, my memory is awful. PR #496 addresses the issue of keeping the input name in the output results.
