
STILTS-CONESKYMATCH(1) Stilts commands STILTS-CONESKYMATCH(1)

NAME

stilts-coneskymatch - Crossmatches table on sky position against remote cone service

SYNOPSIS

stilts coneskymatch [ifmt=<in-format>] [istream=true|false] [in=<table>] [icmd=<cmds>] [ocmd=<cmds>] [omode=out|meta|stats|count|checksum|cgi|discard|topcat|samp|tosql|gui] [out=<out-table>] [ofmt=<out-format>] [ra=<expr>] [dec=<expr>] [sr=<expr/deg>] [find=best|all|each] [usefoot=true|false] [footnside=<int-value>] [copycols=<colid-list>] [scorecol=<col-name>] [parallel=<n>] [erract=abort|ignore|retry|retry<n>] [ostream=true|false] [fixcols=none|dups|all] [suffix0=<label>] [suffix1=<label>] [servicetype=cone|ssa|sia1|sia2|sia] [serviceurl=<url-value>] [verb=1|2|3] [dataformat=<value>] [emptyok=true|false] [compress=true|false]

DESCRIPTION

Note: this command is very inefficient for large tables, and in most cases cdsskymatch or tapskymatch provide better alternatives.

coneskymatch is a utility which performs a cone search-like query to a remote server for each row of an input table. Each of these queries returns a table with one row for each item held by the server in the region of sky represented by the input row. The results of all the queries are then concatenated into one big output table which is the output of this command.

The type of virtual observatory service queried is determined by the servicetype parameter. Typically it will be a Cone Search service, which queries a remote catalogue for astronomical objects or sources in a particular region. However, you can also query Simple Image Access and Simple Spectral Access services in just the same way, to return tables of available image and spectral resources in the relevant regions.

The identity of the server to query is given by the serviceurl parameter. Some advice about how to locate URLs for suitable services is given in SUN/256.

The effect of this command is like doing a positional crossmatch where one of the catalogues is local and the other is remote and exposes its data via a cone search/SIA/SSA service. However, because of both the network communication overhead and the necessarily naive crossmatching algorithm (which scales linearly with the size of the local catalogue), it is only suitable if the local catalogue has a reasonably small number of rows, unless you are prepared to wait a long time.

The parallel parameter allows you to perform multiple cone searches concurrently, so that instead of completing the first cone search, then the second, then the third, the program can be executing a number of them at once. This can speed up operation considerably, especially in the face of network latency, but beware that submitting a very large number of queries simultaneously to the same server may overload it, resulting in some combination of failed queries, ultimately slower runtimes, and unpopularity with server admins. Best to start with a low parallelism and cautiously increase it to see whether there are gains in performance.

Note that when running, coneskymatch can generate a lot of WARNING messages. Most of these are complaining about badly formed VOTables being returned from the cone search services. STILTS does its best to work out what the service responses mean in this case, and usually makes a good enough job of it.

Note: this task was known as multicone in its experimental form in STILTS v1.2 and v1.3.
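As a rough illustration of the usage described above, the following sketch wraps a basic invocation in a shell function. The function name, the column names RA and DEC, the search radius and the file names are all assumptions chosen for the example, not defaults of the command. The STILTS variable is likewise a convenience of this sketch: setting it to echo prints the arguments instead of contacting a server.

```shell
# Command to invoke; override with STILTS=echo to preview the arguments
# without sending any queries.
STILTS=${STILTS:-stilts}

# Hypothetical wrapper: match each row of a small local table against a
# cone search service.
#   $1 = input table, $2 = service base URL, $3 = output table
# RA and DEC are assumed column names; sr=0.001 is 0.001 degrees (3.6 arcsec).
cone_match() {
    $STILTS coneskymatch \
        in="$1" ra=RA dec=DEC sr=0.001 \
        servicetype=cone serviceurl="$2" \
        find=best \
        out="$3"
}
```

A call such as cone_match sources.fits 'http://example.org/cone?' matched.fits (file names and URL are placeholders) would then issue one query per input row and keep the closest match for each.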

OPTIONS

ifmt=<in-format>

Specifies the format of the input table as specified by parameter in. The known formats are listed in SUN/256. This flag can be used if you know what format your table is in. If it has the special value (auto) (the default), then an attempt will be made to detect the format of the table automatically. This cannot always be done correctly however, in which case the program will exit with an error explaining which formats were attempted. This parameter is ignored for scheme-specified tables.

istream=true|false

If set true, the input table specified by the in parameter will be read as a stream. It is necessary to give the ifmt parameter in this case. Depending on the required operations and processing mode, this may cause the read to fail (sometimes it is necessary to read the table more than once). It is not normally necessary to set this flag; in most cases the data will be streamed automatically if that is the best thing to do. However it can sometimes result in less resource usage when processing large files in certain formats (such as VOTable). This parameter is ignored for scheme-specified tables.

in=<table>

The location of the input table. This may take one of the following forms:

  • A filename.
  • A URL.
  • The special value "-", meaning standard input. In this case the input format must be given explicitly using the ifmt parameter. Note that not all formats can be streamed in this way.
  • A scheme specification of the form :<scheme-name>:<scheme-args>.
  • A system command line with either a "<" character at the start, or a "|" character at the end ("<syscmd" or "syscmd|"). This executes the given pipeline and reads from its standard output. This will probably only work on unix-like systems.

In any case, compressed data in one of the supported compression formats (gzip, Unix compress or bzip2) will be decompressed transparently.

icmd=<cmds>

Specifies processing to be performed on the input table as specified by parameter in, before any other processing has taken place. The value of this parameter is one or more of the filter commands described in SUN/256. If more than one is given, they must be separated by semicolon characters (";"). This parameter can be repeated multiple times on the same command line to build up a list of processing steps. The sequence of commands given in this way defines the processing pipeline which is performed on the table.

Commands may alternatively be supplied in an external file, by using the indirection character '@'. Thus a value of "@filename" causes the file filename to be read for a list of filter commands to execute. The commands in the file may be separated by newline characters and/or semicolons, and lines which are blank or which start with a '#' character are ignored.

ocmd=<cmds>

Specifies processing to be performed on the output table, after all other processing has taken place. The value of this parameter is one or more of the filter commands described in SUN/256. If more than one is given, they must be separated by semicolon characters (";"). This parameter can be repeated multiple times on the same command line to build up a list of processing steps. The sequence of commands given in this way defines the processing pipeline which is performed on the table.

Commands may alternatively be supplied in an external file, by using the indirection character '@'. Thus a value of "@filename" causes the file filename to be read for a list of filter commands to execute. The commands in the file may be separated by newline characters and/or semicolons, and lines which are blank or which start with a '#' character are ignored.
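The input and output filter pipelines described above can be used, for instance, to try a match on a small subset of rows before committing to a full run. The sketch below is illustrative only: the function name, the column names RA, DEC and MAG, and the chosen limits are assumptions. The select and head filters are among those documented in SUN/256.

```shell
# Command to invoke; override with STILTS=echo to preview the arguments.
STILTS=${STILTS:-stilts}

# Hypothetical wrapper for a trial run on a reduced input table.
#   $1 = input table, $2 = service base URL, $3 = output table
cone_match_trial() {
    # icmd: keep rows with MAG < 15 (assumed column), then only the first 20.
    # ocmd: cap the size of the combined output table.
    $STILTS coneskymatch \
        in="$1" ra=RA dec=DEC sr=0.001 serviceurl="$2" \
        icmd='select "MAG < 15"; head 20' \
        ocmd='head 1000' \
        out="$3"
}
```

Because the filters are separated by a semicolon inside a single quoted value, they run in the order given: the selection is applied first, then the row limit.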

omode=out|meta|stats|count|checksum|cgi|discard|topcat|samp|tosql|gui

The mode in which the result table will be output. The default mode is out, which means that the result will be written as a new table to disk or elsewhere, as determined by the out and ofmt parameters. However, there are other possibilities, which correspond to uses to which a table can be put other than outputting it, such as displaying metadata, calculating statistics, or populating a table in an SQL database. For some values of this parameter, additional parameters (<mode-args>) are required to determine the exact behaviour.

Possible values are

  • out
  • meta
  • stats
  • count
  • checksum
  • cgi
  • discard
  • topcat
  • samp
  • tosql
  • gui

Use the help=omode flag or see SUN/256 for more information.

out=<out-table>

The location of the output table. This is usually a filename to write to. If it is equal to the special value "-" (the default) the output table will be written to standard output.

This parameter must only be given if omode has its default value of "out".

ofmt=<out-format>

Specifies the format in which the output table will be written (one of the ones in SUN/256 - matching is case-insensitive and you can use just the first few letters). If it has the special value "(auto)" (the default), then the output filename will be examined to try to guess what sort of file is required, usually by looking at the extension. If it's not obvious from the filename what output format is intended, an error will result.

This parameter must only be given if omode has its default value of "out".

ra=<expr>

Right ascension in degrees in the ICRS coordinate system for the position of each row of the input table. This may simply be a column name, or it may be an algebraic expression calculated from columns as explained in SUN/256. If left blank, an attempt is made to guess from UCDs, column names and unit annotations what expression to use.

dec=<expr>

Declination in degrees in the ICRS coordinate system for the position of each row of the input table. This may simply be a column name, or it may be an algebraic expression calculated from columns as explained in SUN/256. If left blank, an attempt is made to guess from UCDs, column names and unit annotations what expression to use.

sr=<expr/deg>

Expression which evaluates to the search radius in degrees for the request at each row of the input table. This will often be a constant numerical value, but may be the name or ID of a column in the input table, or a function involving one.

find=best|all|each

Determines which matches are retained.

  • best: Only the matching query table row closest to the input table row will be output. Input table rows with no matches will be omitted. (Note this corresponds to the best1 option in the pair matching commands, and best1 is a permitted alias).
  • all: All query table rows which match the input table row will be output. Input table rows with no matches will be omitted.
  • each: There will be one output table row for each input table row. If matches are found, the closest one from the query table will be output, and in the case of no matches, the query table columns will be blank.

usefoot=true|false

Determines whether an attempt will be made to restrict searches in accordance with available footprint information. If this is set true, then before any of the per-row queries are performed, an attempt may be made to acquire footprint information about the service. If such information can be obtained, then queries which fall outside the footprint, and hence which are known to yield no results, are skipped. This can speed up the search considerably.

Currently, the only footprints available are those provided by the CDS MOC (Multi-Order Coverage map) service, which covers VizieR and a few other cone search services.

footnside=<int-value>

Determines the HEALPix Nside parameter for use with the MOC footprint service. This tuning parameter determines the resolution of the footprint if available. Larger values give better resolution, hence a better chance of avoiding unnecessary queries, but processing them takes longer and retrieving and storing them is more expensive.

The value must be a power of 2, and at the time of writing, the MOC service will not supply footprints at resolutions greater than nside=512, so it should be <=512.

Only used if usefoot=true.
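The constraint on footnside stated above (a power of 2, no greater than 512) can be checked before a long run is started. The following is a minimal sketch of such a check; the function name is hypothetical and the check is not part of STILTS itself.

```shell
# Hypothetical helper: succeed only if $1 is a power of 2 in the range 1..512,
# which is the constraint stated for the footnside parameter.
valid_footnside() {
    nside=$1
    # Reject empty strings, non-digits and values with a leading zero.
    case $nside in
        ''|*[!0-9]*|0*) return 1 ;;
    esac
    [ "$nside" -le 512 ] || return 1
    # Halve repeatedly; any odd intermediate value above 1 means not a power of 2.
    while [ "$nside" -gt 1 ]; do
        [ $((nside % 2)) -eq 0 ] || return 1
        nside=$((nside / 2))
    done
    return 0
}
```

A value that passes the check could then be supplied together with usefoot=true, for example as footnside=256.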

copycols=<colid-list>

List of columns from the input table which are to be copied to the output table. Each column identified here will be prepended to the columns of the combined output table, and its value for each row taken from the input table row which provided the parameters of the query which produced it. See SUN/256 for list syntax. The default setting is "*", which means that all columns from the input table are included in the output.

scorecol=<col-name>

Gives the name of a column in the output table to contain the distance between the requested central position and the actual position of the returned row. The distance returned is an angular distance in degrees. If a null value is chosen, no distance column will appear in the output table.

parallel=<n>

Allows multiple cone searches to be performed concurrently. If set to the default value, 1, the cone query corresponding to the first row of the input table will be dispatched, when that is completed the query corresponding to the second row will be dispatched, and so on. If set to <n>, then queries will be overlapped in such a way that up to approximately <n> may be running at any one time.

Whether increasing <n> is a good idea, and what might be a sensible maximum value, depends on the characteristics of the service being queried. In particular, setting it to too large a number may overload the service resulting in some combination of failed queries, ultimately slower runtimes, and unpopularity with server admins.

The maximum value permitted for this parameter by default is 5. This limit may be raised by use of the service.maxparallel system property, but use that option with great care since you may overload services and make yourself unpopular with data centre admins. As a rule, you should only increase this value if you have obtained permission from the data centres on whose services you will be using the increased parallelism.

erract=abort|ignore|retry|retry<n>

Determines what will happen if any of the individual cone search requests fails. By default the task aborts. That may be the best thing to do, but for unreliable or poorly implemented services you may find that some searches fail and others succeed so it can be best to continue operation in the face of a few failures. The options are:

  • abort: Failure of any query terminates the task.
  • ignore: Failure of a query is treated the same as a query which returns no rows.
  • retry: Failed queries are retried until they succeed; an increasing delay is introduced for each failure. Use with care - if the failure is for some good, or at least reproducible reason this could prevent the task from ever completing.
  • retry<n>: Failed queries are retried at most a fixed number <n> of times; an increasing delay is introduced for each failure. If failures persist the task terminates.
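The failure-handling options above are often combined with a modest level of parallelism. The sketch below shows one possible combination; the function name, its defaults (2 concurrent queries, at most 3 retries) and the search radius are assumptions made for the example, and ra and dec are left unset so that the command attempts to guess them from the table metadata.

```shell
# Command to invoke; override with STILTS=echo to preview the arguments.
STILTS=${STILTS:-stilts}

# Hypothetical wrapper tolerating a few failed queries.
#   $1 = input table, $2 = service base URL, $3 = output table
#   $4 = number of concurrent queries (default 2, kept low on purpose)
#   $5 = maximum retries per failed query (default 3)
cone_match_tolerant() {
    $STILTS coneskymatch \
        in="$1" sr=0.001 serviceurl="$2" \
        parallel="${4:-2}" erract="retry${5:-3}" \
        out="$3"
}
```

Using retry<n> rather than plain retry bounds the number of attempts, so a query that fails for a reproducible reason ends the task instead of blocking it indefinitely.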

ostream=true|false

If set true, this will cause the operation to stream on output, so that the output table is built up as the results are obtained from the cone search service. The disadvantage of this is that some output modes and formats need multiple passes through the data to work, so depending on the output destination, the operation may fail if this is set. Use with care (or be prepared for the operation to fail).

fixcols=none|dups|all

Determines how input columns are renamed before use in the output table. The choices are:

  • none: columns are not renamed
  • dups: columns which would otherwise have duplicate names in the output will be renamed to indicate which table they came from
  • all: all columns will be renamed to indicate which table they came from

If columns are renamed, the new ones are determined by suffix* parameters.

suffix0=<label>

If the fixcols parameter is set so that input columns are renamed for insertion into the output table, this parameter determines how the renaming is done. It gives a suffix which is appended to all renamed columns from the input table.

suffix1=<label>

If the fixcols parameter is set so that input columns are renamed for insertion into the output table, this parameter determines how the renaming is done. It gives a suffix which is appended to all renamed columns from the cone result table.

servicetype=cone|ssa|sia1|sia2|sia

Selects the type of data access service to contact. Most commonly this will be the Cone Search service itself, but there are one or two other possibilities:

  • cone: Cone Search protocol - returns a table of objects found near each location. See Cone Search standard.
  • ssa: Simple Spectral Access protocol - returns a table of spectra near each location. See SSA standard.
  • sia1: Simple Image Access protocol version 1 - returns a table of images near each location. See SIA 1.0 standard.
  • sia2: Simple Image Access protocol version 2 - returns a table of images near each location. See SIA 2.0 standard.
  • sia: alias for sia1

serviceurl=<url-value>

The base part of a URL which defines the queries to be made. Additional parameters will be appended to this using CGI syntax ("name=value", separated by '&' characters). If this value does not end in either a '?' or a '&', one will be added as appropriate.

See SUN/256 for discussion of how to locate service URLs corresponding to given datasets.
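To make the URL construction described above concrete, the following sketch builds a query URL from a base URL and a position. It is an illustration of the stated rule, not the code STILTS uses: the function name is hypothetical, and the choice between '?' and '&' (based on whether the base URL already contains a '?') is an assumption about what "as appropriate" means. RA, DEC and SR are the parameter names defined by the Cone Search standard.

```shell
# Hypothetical helper: print the query URL for one cone search.
#   $1 = base URL, $2 = RA (deg), $3 = Dec (deg), $4 = search radius (deg)
cone_query_url() {
    base=$1
    case $base in
        *'?'|*'&') ;;              # already ends in a separator
        *'?'*) base="$base&" ;;    # query string present: continue it
        *)     base="$base?" ;;    # no query string yet: start one
    esac
    printf '%s\n' "${base}RA=$2&DEC=$3&SR=$4"
}
```

For example, a base URL of http://example.org/cone (a placeholder) with position (10.5, -3.2) and radius 0.01 would give a query URL ending in cone?RA=10.5&DEC=-3.2&SR=0.01.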

verb=1|2|3

Verbosity level of the tables returned by the query service. A value of 1 indicates the bare minimum and 3 indicates all available information.

dataformat=<value>

Indicates the format of data objects described in the returned table. The meaning of this is dependent on the value of the servicetype parameter:

  • servicetype=cone: not used
  • servicetype=ssa: gives the MIME type of spectra referenced in the output table, also special values "votable", "fits", "compliant", "graphic", "all", and others (value of the SSA FORMAT parameter).
  • servicetype=sia1: gives the MIME type required for images/resources referenced in the output table, corresponding to the SIA FORMAT parameter. The special values "GRAPHIC" (all graphics formats) and "ALL" (no restriction) as defined by SIAv1 are also permissible. For SIA version 1 only, this defaults to "image/fits".
  • servicetype=sia2: gives the MIME type required for images/resources referenced in the output table, corresponding to the SIA FORMAT parameter. The special values "GRAPHIC" (all graphics formats) and "ALL" (no restriction) as defined by SIAv1 are also permissible.
  • servicetype=sia: gives the MIME type required for images/resources referenced in the output table, corresponding to the SIA FORMAT parameter. The special values "GRAPHIC" (all graphics formats) and "ALL" (no restriction) as defined by SIAv1 are also permissible. For SIA version 1 only, this defaults to "image/fits".

emptyok=true|false

Whether the table metadata which is returned from a search result with zero rows is to be believed. According to the spirit, though not the letter, of the cone search standard, a cone search service which returns no data ought nevertheless to return the correct column headings. Unfortunately this is not always the case. If this parameter is set true, it is assumed that the service behaves properly in this respect; if it does not, an error may result. In that case, set this parameter false. A consequence of setting it false is that in the event of no results being returned, the task will return no table at all, rather than an empty one.

compress=true|false

If true, the service is requested to provide HTTP-level compression for the response stream (Accept-Encoding header is set to "gzip", see RFC 2616). This does not guarantee that compression will happen but if the service honours this request it may result in a smaller amount of network traffic at the expense of more processing on the server and client.

SEE ALSO

stilts(1)

If the package stilts-doc is installed, the full documentation SUN/256 is available in HTML format:
file:///usr/share/doc/stilts/sun256/index.html

VERSION

STILTS version 3.4.7-debian

This is the Debian version of Stilts, which lacks support for some file formats and network protocols. For differences see
file:///usr/share/doc/stilts/README.Debian

AUTHOR

Mark Taylor (Bristol University)

Mar 2017