Import XSD and corresponding XML

You can use this feature to:

The help file is organized as follows:

Challenges importing XSD into OWL

Some challenges in converting XSD to OWL that are addressed by this module are:

The following description uses the example file catalog.xsd. You can download catalog.xsd and start using it with the help of the description. We extended an example in the book "Definitive XML Schema" by Priscilla Walmsley. The extensions cover many of the special cases we had to support.

Using the Wizard

Getting Started

The first step is to invoke the importer by selecting the target folder for the OWL files that will be generated from the XSD schemas. Right-click the selected folder and pick Import -> TopBraid Composer -> Import XML Schemas to display the first page of this wizard, where you can name the XML Schema files to convert. These can either reside locally or on the Web.

After you name the files to convert and click Next, TopBraid Composer loads them into memory and displays the next dialog box, which lets you edit the namespace mapping, that is, specify which XML Schema namespace is mapped to which OWL file. The mapping is pre-populated for your convenience, and a shortcut button converts URN-based namespaces into HTTP-based ones.
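The URN-to-HTTP conversion can be pictured with the minimal sketch below. The rewriting rule that TopBraid Composer actually applies is not described here, so the function urn_to_http, its host parameter and the example namespaces are all hypothetical and only illustrate the idea.

```python
def urn_to_http(namespace: str, host: str = "example.org") -> str:
    """Rewrite a urn: namespace as an http: one; leave other namespaces unchanged.

    ASSUMPTION: the real rule used by the wizard may differ; this only
    turns the colon-separated URN parts into path segments.
    """
    if not namespace.lower().startswith("urn:"):
        return namespace
    # Drop the "urn:" scheme and keep the non-empty colon-separated parts.
    parts = [p for p in namespace[4:].split(":") if p]
    return "http://" + host + "/" + "/".join(parts)


# Build a namespace mapping for two illustrative schema namespaces.
mapping = {ns: urn_to_http(ns) for ns in (
    "urn:example:catalog",
    "http://example.org/prod",
)}
```

Namespaces that are already HTTP-based pass through unchanged, so the function can be applied to the whole mapping at once.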

Configuration Options

Clicking Next brings up the main panel with the configuration options:

Controlling Class Names

The first group in the options provides some control over class names.

Capitalize the first character of class names

If checked, the generated class names will begin with an upper case letter. If unchecked, none of the characters in the class localname are modified. Default is checked.

Generate "A_Global-" prefix for classes derived from global elements

If checked, the names of all classes derived from global elements will be prefixed with A_Global-. This distinguishes global elements from complex types that have similar names. The A_ prefix ensures that these classes appear at the beginning of the Classes View. Default is unchecked.

Remove trailing "Type" and similar characters from class names

If checked, the importer removes certain suffix words like Type, AbstractType, ComplexType and some non-alphanumeric characters during class generation. If unchecked, no suffix words or characters are removed. Default is unchecked.
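The interplay of the three naming options above can be sketched as follows. This is an illustration under assumptions, not the importer's implementation: the function class_local_name and its parameter names are hypothetical, the suffix list contains only the words named in the text, and the order in which the options are applied is a guess.

```python
import re

# ASSUMPTION: the text names these suffix words only as examples.
SUFFIXES = ("AbstractType", "ComplexType", "Type")


def class_local_name(xsd_name, *, from_global_element=False,
                     capitalize=True, global_prefix=False, strip_suffix=False):
    """Derive a class local name from an XSD construct name."""
    name = xsd_name
    if strip_suffix:
        # Drop trailing non-alphanumeric characters, then the first matching suffix word.
        name = re.sub(r"[^0-9A-Za-z]+$", "", name)
        for suffix in SUFFIXES:
            if name.endswith(suffix) and len(name) > len(suffix):
                name = name[: -len(suffix)]
                break
        name = re.sub(r"[^0-9A-Za-z]+$", "", name)
    if capitalize and name:
        # Only the first character changes; the rest of the name is kept.
        name = name[0].upper() + name[1:]
    if global_prefix and from_global_element:
        name = "A_Global-" + name
    return name
```

With suffix removal and capitalization both enabled, a global element named product and a complex type named ProductType would both yield Product in this sketch. That kind of clash is what the A_Global- prefix is meant to avoid.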

Generate abstract superclasses (e.g. A_AbstractAttributeGroup)

If checked, abstract superclasses will be generated to organize classes. By using the A_ prefix, these classes will appear at the beginning in Classes View, and they are much less likely to clash with similarly named classes derived from XSD. Default is unchecked.

Abstract superclasses are useful for the following reasons:

The following table shows all abstract superclasses.

Table: Abstract superclasses for the OWL classes derived from particular XSD constructs
| XSD constructs | Abstract superclass for the derived OWL classes |
| --- | --- |
| Anonymous types | A_Anon |
| Attribute groups | A_AbstractAttributeGroup |
| Model groups (element groups) | A_AbstractModelGroup |
| Global elements | A_GlobalElements |

Annotations for assisting XML to RDF Conversion

The next checkbox controls the generation of Semantic XML annotations.

Generate Semantic XML statements

If checked, the importer generates Semantic XML annotations, that is, sxml:tag and sxml:attribute property values. These annotations are needed to map XML instance data to RDF instances that correspond to the generated OWL model. If unchecked, no Semantic XML annotations are generated, and they would have to be entered manually before XML instances can be mapped. Default is checked.

Controlling the names of declared properties

Generate object property declarations

If checked, the importer generates object property declarations. If unchecked, no object property declarations are generated. However, datatype property declarations are always generated in either case. Default is checked.

Prefix of object property names

The importer uses the given prefix in all generated object property names. If no prefix is given, the importer by default appends the suffix Ref to every object property name, to distinguish an object property from a datatype property generated with the same name.

Prefix of datatype property names

The importer uses the given prefix in all generated datatype property names. By default no prefix is given, and the XSD construct name is used unchanged. This cannot cause name conflicts, because object property names always carry either a prefix or the suffix Ref.
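The two naming rules can be summarized in a few lines. The sketch below assumes that a prefix is simply concatenated in front of the XSD construct name, which the text does not state explicitly. The function names are hypothetical.

```python
def object_property_name(xsd_name: str, prefix: str = "") -> str:
    """Use the prefix when one is given; otherwise fall back to the Ref suffix."""
    return prefix + xsd_name if prefix else xsd_name + "Ref"


def datatype_property_name(xsd_name: str, prefix: str = "") -> str:
    """Without a prefix, the XSD construct name is kept unchanged."""
    return prefix + xsd_name


# The same XSD name never yields the same property name for both kinds.
pair = (object_property_name("product"), datatype_property_name("product"))
```

Because an object property name always differs from the plain construct name, a datatype property with no prefix cannot collide with it.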

Controlling Enumerated Values

The next line in the options dialog provides control over the transformation of enumerations.

In addition to enumerated values instances, generate an "Enumeration" class instance

If checked, an instance of the "Enumeration" class is generated in addition to the enumerated value instances. This instance refers to the enumerated values, holds their default values and metadata, and acts as a foundation for constructing more complex enumeration structures. Default is unchecked.

Maximum enumerated values in owl:oneOf

This is the maximum size of the owl:oneOf lists generated for enumerated values. If the given size is 0, owl:oneOf is not generated. If the box is empty, there is no limit on the number of enumerated values in owl:oneOf lists. The default is empty.
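The three cases of this setting (empty, 0, and a positive number) can be sketched as below. The text does not say what happens when an enumeration has more values than the limit. This sketch assumes the list is cut off at the limit, which may not match the importer's behavior. The function name one_of_members is hypothetical.

```python
from typing import List, Optional


def one_of_members(values: List[str], max_size: Optional[int]) -> Optional[List[str]]:
    """Return the members for owl:oneOf, or None when no list should be generated."""
    if max_size is None:
        # Box left empty: no limit on the list size.
        return list(values)
    if max_size == 0:
        # Explicit 0: owl:oneOf is not generated at all.
        return None
    # ASSUMPTION: values beyond the limit are left out of the list.
    return list(values[:max_size])
```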

Controlling Labels

Generate skos:prefLabel statements

If checked, the importer constructs skos:prefLabel statements as label annotations for each generated resource. If unchecked, no skos:prefLabel statements are generated, although label annotations using other properties may still be generated if their options are checked. Default is unchecked.

Generate rdfs:label statements

If checked, the importer constructs rdfs:label statements as label annotations for each generated resource. If unchecked, no rdfs:label statements are generated, although label annotations using other properties may still be generated if their options are checked. Default is checked.

Append namespace prefixes to labels

The importer appends namespace prefixes to generated labels. This ensures unique labels across multiple namespaces.

Controlling Descriptions

Generate skos:definition statements

If checked, the importer constructs skos:definition statements from XSD annotations. If unchecked, no skos:definition statements are generated, although description annotations using other properties may still be generated if their options are checked. Default is unchecked.

Generate rdfs:comment statements

If checked, the importer constructs rdfs:comment statements from XSD annotations. If unchecked, no rdfs:comment statements are generated, although description annotations using other properties may still be generated if their options are checked. Default is unchecked.

Generate dc:description statements

If checked, the importer constructs dc:description statements from XSD annotations. If unchecked, no dc:description statements are generated, although description annotations using other properties may still be generated if their options are checked. Default is checked.
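To make the effect of the three description options concrete, the sketch below reads xs:documentation text from a small schema and pairs it with whichever annotation properties are selected. It is a simplified illustration: the function description_statements, the sample schema and the tuple output format are all assumptions, and the real importer produces RDF statements rather than Python tuples.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Illustrative schema, not taken from catalog.xsd.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="ProductType">
    <xs:annotation>
      <xs:documentation>A product in the catalog.</xs:documentation>
    </xs:annotation>
  </xs:complexType>
</xs:schema>"""


def description_statements(schema_text, properties=("dc:description",)):
    """Return (construct name, property, text) for each documented named construct."""
    root = ET.fromstring(schema_text)
    statements = []
    for construct in root.iter():
        name = construct.get("name")
        doc = construct.find(f"{XS}annotation/{XS}documentation")
        if name is None or doc is None or not (doc.text or "").strip():
            continue
        # One statement per selected annotation property.
        for prop in properties:
            statements.append((name, prop, doc.text.strip()))
    return statements
```

The default argument mirrors the wizard defaults described above, where only dc:description is checked. Passing several properties yields one statement per property for the same documentation text.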

Controlling how XSD Datatypes are generated

If checked, only XSD datatypes are used as ranges in datatype property restrictions along with an rdfs:comment value and as datatypes for literals, such as enumeration literals. This lets the Semantic XML mapping use only XSD datatypes on literals that are mapped from XML content. The option is useful if a semantic repository supports only XSD datatypes, or if literals will be compared in SPARQL queries. If unchecked, user-defined datatypes are used in addition to XSD datatypes. Default is unchecked.

Progress Monitor

As the XML Schemas are processed, a progress bar indicates the current file. A stack shows how complex types are processed according to their nested structure.

Final Steps

Once the wizard is finished, the system will create one or more files in the selected folder. These files may import each other, reflecting the imports that have been defined in the original XSD files.

By default, the XSD importer will annotate the generated classes so that they can be used as schema for Semantic XML files.

Converting instance documents to use the generated ontology

XML conversion happens automatically when users import XML files into the generated OWL model in TBC, or use the XML import modules in SPARQLMotion. If a related .sxml file configuration uses the generated model, users can also right-click an XML file in the Project Explorer and pick Open With -> TopBraid (Semantic XML Schema Documents). As long as an XML file is valid against the XSD that it is based on, the XML is transformed in accordance with the schema. Parts of an XML file that do not validate against a schema are still converted, using the default Semantic XML structure.

As an example, save and open the ontology catalog.ttl, which is generated using the importer, in TBC. Save catalog-instance.xml and then drag and drop it into the Imports View. You will see how XML instances are mapped to OWL instances, text content to literals and enumerated value instances. Semantic XML uses the generated classes, restrictions and properties during the mapping.

Summary of Supported Transformations

Table: Conversion from XSD Constructs to OWL Constructs
| XSD/XML constructs | OWL constructs |
| --- | --- |
| xsd:simpleType | owl:Datatype |
| xsd:simpleType with xsd:enumeration | Becomes an owl:Class as a subclass of EnumeratedValue. Instances are created for every enumerated value. Optionally, an instance of Enumeration, referring to all the instances, is created as well as the owl:oneOf union over the instances. |
| xsd:complexType over xsd:complexContent | owl:Class |
| xsd:complexType over xsd:simpleContent | owl:Class |
| xsd:element (global) with complex type | owl:Class and subclass of the class generated from the referenced complex type. Optionally, the generated class is prefixed with A_Global- to distinguish global elements from complex types with similar names during trimming or case modification of characters. Also optionally, the generated class becomes subclass of A_GlobalElements. |
| xsd:element (global) with simple type | owl:Datatype |
| xsd:element (local to a type) | owl:DatatypeProperty or owl:ObjectProperty depending on the element type. OWL Restrictions are built for the occurrence. |
| xsd:group | owl:Class and optionally subclass of A_AbstractModelGroup |
| xsd:attributeGroup | owl:Class and optionally subclass of A_AbstractAttributeGroup |
| xsd:minOccurs and xsd:maxOccurs | Cardinality specified in minimum cardinality, maximum cardinality and universal (allValuesFrom) OWL restrictions. |
| Anonymous Complex Type | As for Complex Type except a URI is constructed from the parent element and the nested element reference. Optionally, the class is defined as a subclass of A_Anon. |
| Anonymous Simple Type | As for Simple Type except a URI is constructed from the parent element and the nested element reference. |
| xsd:default on an attribute | Uses dtype:defaultValue to attach a value to the OWL restriction representing the associated property. |
| Substitution Groups | Subclass statements are generated for the members. Instance files resolve their types by consulting the OWL model at import-time. |
| Annotation attributes on elements | OWL annotation property declarations are created, and the property values are placed directly on the relevant class. |
| Annotations using xsd:annotation | Become, based on user selection, dc:description, rdfs:comment and/or skos:definition OWL annotations. |
| xsi:type on an XML element | Overrides the schema type with the specified type. |
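As an illustration of the minOccurs and maxOccurs row, the sketch below reads the occurrence attributes of local elements and derives the minimum and maximum cardinality values that a restriction would carry. The defaults of 1 for both attributes and the special value unbounded come from XML Schema itself. The function cardinalities and the sample schema are hypothetical, and how the importer actually writes the resulting OWL restrictions is not shown.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Illustrative schema, not taken from catalog.xsd.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="CatalogType">
    <xs:sequence>
      <xs:element name="product" type="ProductType" maxOccurs="unbounded"/>
      <xs:element name="note" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""


def cardinalities(schema_text):
    """Map (type name, element name) to (min, max); max is None when unbounded."""
    root = ET.fromstring(schema_text)
    result = {}
    for ctype in root.iter(f"{XS}complexType"):
        for el in ctype.iter(f"{XS}element"):
            low = int(el.get("minOccurs", "1"))    # XSD default is 1
            high_raw = el.get("maxOccurs", "1")    # XSD default is 1
            high = None if high_raw == "unbounded" else int(high_raw)
            result[(ctype.get("name"), el.get("name"))] = (low, high)
    return result
```

An unbounded element has no maximum cardinality to state, so only a minimum would apply to it, while an optional element with the default maxOccurs allows at most one value.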

Known Issues

The following are known problems:

  1. Namespaces
    1. Schema lacking a default namespace has a prefix mapping for another namespace coming from an imported schema.
    2. Several includes and imports don't work.
  2. Untyped elements
    1. No restriction is generated for an untyped element. Such an element should instead be treated by default as an element with type=xsd:string.
  3. Imports
    1. Wrong property type and wrong range are generated for a simple type referenced from another schema.
  4. Annotation structures
    1. Structured annotations and appinfo are not supported.
  5. Element declarations
    1. The default and fixed attributes don't get processed in XSD.
    2. Empty XML instances associated with default and fixed don't get processed.
  6. Attribute declarations
    1. The fixed attribute doesn't get processed.
  7. Unions and lists
    1. Unions and lists are not supported.
  8. Simple types
    1. Some XSD simple types, such as xs:hexBinary, are not supported. As a workaround, replace them in the XSD with xs:string or another commonly used XSD simple type before doing the import.
  9. Built-in simple types
    1. No special support for ID and IDREF.
  10. Complex types
    1. Mixed content (mixed="true")
    2. xs:any and its attributes
    3. xs:choice with only one element choice allowed.
    4. xs:all - no special processing
    5. xs:anyAttribute
  11. Type derivations
    1. Extension of xs:choice which allows a choice of only one element.
    2. Mixed content extension
    3. Attribute wildcard (xs:anyAttribute) extension
    4. Attribute wildcard (xs:anyAttribute) restriction
    5. Abstract types - no special processing
  12. Groups
    1. xs:all - no special processing
    2. xs:anyAttribute

These issues will be incrementally addressed in future releases.
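Until then, the workaround suggested for unsupported simple types can be applied to a schema before import with a small preprocessing step. The following is a minimal text-based sketch: it assumes the schema uses the xs prefix and double-quoted attributes, it covers only xs:hexBinary (the one type named above), and the function replace_unsupported_types is hypothetical rather than part of TopBraid Composer.

```python
import re

# ASSUMPTION: only xs:hexBinary is named in the text; extend the tuple as needed.
UNSUPPORTED = ("hexBinary",)


def replace_unsupported_types(schema_text: str, prefix: str = "xs") -> str:
    """Rewrite type="xs:hexBinary" and base="xs:hexBinary" references to xs:string."""
    names = "|".join(map(re.escape, UNSUPPORTED))
    pattern = re.compile(r'\b(type|base)="%s:(%s)"' % (re.escape(prefix), names))
    # Keep the attribute name and swap in the string type.
    return pattern.sub(lambda m: '%s="%s:string"' % (m.group(1), prefix), schema_text)


before = '<xs:element name="hash" type="xs:hexBinary"/>'
after = replace_unsupported_types(before)
```

Work on a copy of the schema, since the change discards the original type information.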