Web sites vs Web services
Many astronomical web sites already exist.
They provide access to archives of object information,
collections of catalogs, as well as databases of images and spectra.
These web sites are mainly designed with a human user in mind.
Information is expected to be entered manually via Web
forms and results of queries are presented in HTML or graphics.
To integrate access to archives into a more automated data
processing flow, data must be transported from application to
application in an efficient manner. Web services are created
for this purpose. While Web sites are designed for human
consumption, Web services enable machines to communicate with other
machines without human intervention.
VO services are Web services that can:
- deliver information on demand,
- deliver information by subscription, and
- perform some action upon request.
VO services provide a consistent interface to access astronomical information
by following standards that define the
data exchange format. Web services, and therefore VO services, rely
on XML, a flexible and easy-to-parse language, to describe
the information to be transported from application to application. By using
XML and related standards, data-consuming applications can
dynamically discover the interfaces and services
offered by data-producing applications with minimal human interaction.
The VO Registry
Demonstration: Keyword search, Advanced search
The VO Registry is a VO service that provides information on available VO services and other related applications. It is the yellow-pages directory for VO services. Several registries are currently available, and they can themselves be found by querying a registry. In theory, all registries contain the same information, because they mirror each other and propagate the newest changes. They should also implement the same Web service interface.
The VO Registry implementation at STScI (http://nvo.stsci.edu/voregistry/index.aspx) features not only a SOAP interface but also HTTP-GET and HTTP-POST implementations. The Registry also provides a WSDL (http://nvo.stsci.edu/VORegistry/registry.asmx?WSDL) that defines the syntax of the methods and their parameters. The methods include:
- KeywordSearch
  Input: a list of words and a flag indicating whether matching entries should contain all words
  Output: list of VOResources (the structure of VOResource is defined in the WSDL)
- QueryResource, QueryRegistry, QueryVOResource
  Input: a predicate (one or more conditions, or a combination of conditions, to match entries)
  Output:
  - QueryRegistry: list of SimpleResources
  - QueryResource: list of DBResources
  - QueryVOResource: list of VOResources
- DumpRegistry, DumpVOResources
  Output: list of SimpleResources and VOResources, respectively
There are two ways to query the VO Registry: via HTTP-GET/POST or via SOAP.
When using HTTP-GET or
HTTP-POST, the querying program acts as a web browser.
Conversely, any browser can act as a client program.
Ignoring the HTTP part for a moment, the result is an XML file
that starts like this:
<?xml version="1.0" encoding="utf-8"?>
<ArrayOfResource xmlns="http://www.us-vo.org">
  <Resource created="date" updated="date" status="active or inactive or deleted">
    <title xmlns="http://www.ivoa.net/xml/VOResource/v0.10">string</title>
    <shortName xmlns="http://www.ivoa.net/xml/VOResource/v0.10">string</shortName>
    <identifier xmlns="http://www.ivoa.net/xml/VOResource/v0.10">anyURI</identifier>
For example, try this link with your browser: http://nvo.stsci.edu/VORegistry/registry.asmx/KeywordSearch?keywords=chandra+einstein&andKeys=True
Client programs must parse the XML and extract the desired information: for example, the URLs of VO Services that match the search criteria. The HTTP-GET/POST method does not provide any special mechanism to handle XML content. It is up to the client to convert the XML structure into a more useful form.
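As an illustration of the client-side work described above, the following Python sketch builds the KeywordSearch URL and extracts three fields from a response. It assumes the response follows the layout of the sample shown earlier (an ArrayOfResource element with Resource children). The function names and the sample record are invented for illustration and are not part of the NVO software; the fetch function is included only to show where the network call would go.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Endpoint taken from the example link in the text.
REGISTRY_URL = "http://nvo.stsci.edu/VORegistry/registry.asmx/KeywordSearch"

# Namespaces as they appear in the sample response.
NS = {
    "reg": "http://www.us-vo.org",
    "vr": "http://www.ivoa.net/xml/VOResource/v0.10",
}

# Made-up record used to exercise the parser without a network connection.
SAMPLE_XML = """<ArrayOfResource xmlns="http://www.us-vo.org">
  <Resource created="2005-01-01" updated="2005-06-01" status="active">
    <title xmlns="http://www.ivoa.net/xml/VOResource/v0.10">Example Catalog</title>
    <shortName xmlns="http://www.ivoa.net/xml/VOResource/v0.10">EXCAT</shortName>
    <identifier xmlns="http://www.ivoa.net/xml/VOResource/v0.10">ivo://example/catalog </identifier>
  </Resource>
</ArrayOfResource>"""


def keyword_search_url(keywords, and_keys=True):
    """Build the HTTP-GET URL for a KeywordSearch request."""
    query = urllib.parse.urlencode(
        {"keywords": " ".join(keywords), "andKeys": str(and_keys)}
    )
    return REGISTRY_URL + "?" + query


def fetch(url):
    """Download the raw XML response (not exercised in this sketch)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def parse_resources(xml_text):
    """Return one dict per Resource with title, shortName and identifier."""
    root = ET.fromstring(xml_text)
    resources = []
    for res in root.findall("reg:Resource", NS):
        record = {}
        for field in ("title", "shortName", "identifier"):
            text = res.findtext("vr:" + field, default="", namespaces=NS)
            record[field] = text.strip()
        resources.append(record)
    return resources


if __name__ == "__main__":
    print(keyword_search_url(["chandra", "einstein"]))
    for entry in parse_resources(SAMPLE_XML):
        print(entry["shortName"], entry["identifier"])
```

Real entries carry many more fields than these three; the same findtext pattern can be extended once the field names are known.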
With SOAP, the situation is different. The underlying SOAP library (assuming we use one) hides the communication with the server, the XML parsing, and the error handling. This abstraction layer also converts the returned XML into native data structures that the client program can process easily. This is in contrast to the HTTP-GET/POST methods, where the client program handles the XML directly.
In either case, whether using HTTP-GET/POST or SOAP, and whether or not the client handles the XML itself, one needs to know the semantics of the content: its fields (tags) and their values. The WSDL defines only the syntax of the data; human interpretation is still needed to determine its meaning. Fortunately, because XML and WSDL are verbose, plain-text descriptions and other supporting material can be embedded, so that the XML/WSDL files become self-documenting. This may lead to more automated data exchange with less human interaction.
One way to understand the interface is to generate data types and
functions (classes and methods) by using one of the conversion
tools, for example wsdl2java (in $NVOSS_HOME/java/bin)
or wsdl2py (not provided). Using the method stubs that these
tools generate, users do not have to know about the underlying mechanics
of networking and XML parsing.
Sample programs showing how to query the VO Registry can be found at:
- Java:
- $NVOSS_HOME/java/dev/nvoregistry/FindConeSearch.java
- $NVOSS_HOME/java/dev/ivoaregistry
- Python: $NVOSS_HOME/python/samples/VORegistryEx.py
- PHP: $NVOSS_HOME/php/web/kwdSearch.php
python VORegistry.py keywordSearch chandra einstein
This keywordSearch example displays only a few fields. The student can experiment with the program and add other fields. But how do we find out the field names?
python VORegistry.py printFields chandra
This example is similar to the last one, but displays all the fields of the first entry. Note that the result is from KeywordSearch, which means that it is a VOResource. Also note that some fields have subfields.
For the list of VO metadata field names, see http://nvo.stsci.edu/VORegistry/ListColumns.aspx.
Using the field names provided by the link, one can make a more advanced query.
See Advanced Query:
http://nvo.stsci.edu/VORegistry/QueryRegistry.aspx?advanced=true&startRes=-1
Example:
python VORegistry.py queryResource "ResourceType = 'CONE' and contentlevel like '%research%'"
The queryResource example displays, among other things, the service type and the service URL, which points to the VO service that provides the actual data. The most common values of ResourceType are CONE, SIAP, SKYNODE and TABULARSKYSERVICE; SIAP types can be SIAP/CUTOUT or SIAP/ARCHIVE.
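When predicates like the one above are assembled by a program rather than typed by hand, a small helper keeps the quoting consistent. The sketch below is only an illustration: build_predicate is a hypothetical name, and doubling embedded single quotes is the usual SQL convention, which the registry is assumed, not known, to accept.

```python
def quote_literal(value):
    """Wrap a value in single quotes, doubling any embedded quotes
    (SQL-style escaping; assumed to be acceptable to the registry)."""
    return "'" + value.replace("'", "''") + "'"


def build_predicate(resource_type, content_level=None):
    """Build a predicate string such as
    ResourceType = 'CONE' and contentlevel like '%research%'."""
    parts = ["ResourceType = " + quote_literal(resource_type)]
    if content_level:
        # The % wildcards match any text before and after the keyword.
        parts.append("contentlevel like " + quote_literal("%" + content_level + "%"))
    return " and ".join(parts)


if __name__ == "__main__":
    print(build_predicate("CONE", "research"))
```

The resulting string can be passed as the predicate argument on the command line shown above.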
Another field of interest is CoverageSpectral, which can contain one or more of the following values:
- Infrared
- EUV
- Gamma-ray
- Millimeter
- Optical
- Radio
- Ultraviolet
- UV
- X-ray
- Not provided
The field CoverageSpatial could provide valuable information that enables applications to check whether the service covers the area of interest, but unfortunately it is not well populated.
Using the service URL provided by the VO Registry, one can proceed to query the VO service. At this point, we know only the type and the spectral coverage of the service. Additional information must be retrieved from the service itself. A special Cone Search query with the search radius set to 0 degrees can be used to retrieve information on the table columns, which must include ID and UCD. Other properties, such as name (for presentation purposes) and plain-text descriptions, are optional.
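The metadata query can be sketched in Python as follows. The sketch assumes the standard Cone Search parameters RA, DEC and SR, and reads the ID, ucd and name attributes of each FIELD element in the returned VOTable. The function names, the example.org URL and the sample VOTable are made up for illustration; no network access is performed.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Made-up metadata-only VOTable of the kind an SR=0 query is expected to return.
SAMPLE_VOTABLE = """<VOTABLE version="1.1">
  <RESOURCE>
    <TABLE>
      <FIELD ID="id" name="Object ID" ucd="ID_MAIN" datatype="char" arraysize="*"/>
      <FIELD ID="ra" ucd="POS_EQ_RA_MAIN" datatype="double"/>
      <FIELD ID="dec" ucd="POS_EQ_DEC_MAIN" datatype="double"/>
    </TABLE>
  </RESOURCE>
</VOTABLE>"""


def cone_metadata_url(service_url, ra=0.0, dec=0.0):
    """Append RA, DEC and SR=0 to a Cone Search service URL."""
    if service_url.endswith(("?", "&")):
        separator = ""
    elif "?" in service_url:
        separator = "&"
    else:
        separator = "?"
    query = urllib.parse.urlencode({"RA": ra, "DEC": dec, "SR": 0})
    return service_url + separator + query


def list_fields(votable_text):
    """Return the ID, ucd and name attributes of every FIELD element."""
    root = ET.fromstring(votable_text)
    fields = []
    for elem in root.iter():
        # Drop any "{namespace}" prefix so that VOTables with or without
        # a declared namespace are handled the same way.
        if elem.tag.rsplit("}", 1)[-1] == "FIELD":
            fields.append({
                "ID": elem.get("ID"),
                "ucd": elem.get("ucd"),
                "name": elem.get("name"),  # optional, may be None
            })
    return fields


if __name__ == "__main__":
    print(cone_metadata_url("http://example.org/cone?"))
    for field in list_fields(SAMPLE_VOTABLE):
        print(field["ID"], field["ucd"])
```

Because name is optional, a client that presents columns to a user should fall back to the ID when name is missing.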
Similarly, SIAP services accept the option 'FORMAT=METADATA', which produces a VOTable with only metadata.
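A minimal sketch of how a client might form the SIAP metadata request from a service URL found in the registry is given below. The helper name siap_metadata_url and the example.org URLs are hypothetical; the only element taken from the text is the FORMAT=METADATA option.

```python
import urllib.parse


def siap_metadata_url(service_url):
    """Return the service URL with FORMAT=METADATA, replacing any
    FORMAT value already present in the query string."""
    parts = urllib.parse.urlsplit(service_url)
    params = [
        (key, value)
        for key, value in urllib.parse.parse_qsl(parts.query, keep_blank_values=True)
        if key.upper() != "FORMAT"
    ]
    params.append(("FORMAT", "METADATA"))
    new_query = urllib.parse.urlencode(params)
    return urllib.parse.urlunsplit(parts._replace(query=new_query))


if __name__ == "__main__":
    print(siap_metadata_url("http://example.org/siap?survey=dss"))
```

Rebuilding the query string, rather than appending text to the URL, avoids duplicate FORMAT parameters when the registered URL already carries one.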
SkyNodes offer a different interface. The operations Tables, Columns and Column can be used to retrieve meta information.
Shui Hung Kwok
NVO Summer School 2005
