Named Graphs and Base URIs in TopBraid

TopBraid Suite (TBS) implements named graphs as specified by the relevant W3C specifications. A TBS workspace defines a dataset consisting of multiple named graphs. Each named graph is defined by a file in the workspace and identified by its base URI. Thus, the best way to understand a workspace is to view it as a quad store that directly or indirectly stores a variety of named graphs.

Only one workspace (dataset) can be loaded for any TBS product, including TopBraid Composer (TBC), TopBraid Live (TBL) and Enterprise Data Governance (EDG). Multiple workspaces can be defined, each representing a separate dataset, but only one can be loaded at a time. Since the workspace is a file directory, it can be located anywhere, but to avoid network latency, it is best for the workspace to be on the same machine as the TBS product.

Each named graph must have a name that uniquely identifies it. All RDF graphs managed by TBS use a base URI as a unique identifier for a graph within the scope of a workspace. The graphs may be files that store RDF data directly, named graphs stored within a TDB or RDBMS model, or connector files to the remotely stored RDF data. These include:

In most cases (with the exception of non-RDF files that TopBraid auto-converts to RDF on the fly as described above), the base URI that identifies a graph is saved within a file using one of the following types of statements:

Since there is no way to directly persist this statement within the remote RDF or relational databases, the statement is captured only in TopBraid connectors (the workspace graphs that stand as a proxy for the remote graphs).

TopBraid assigns base URIs to XML, HTML and Excel files based on their location within the TopBraid workspace and their file name. Thus, changing the location or name of such a file effectively changes its identity, whereas moving or renaming any other file has no bearing on its identity.

All graphs (with the exception of non-RDF files that TopBraid auto-converts to RDF on the fly) should also contain the following triple, which is shown here in the Turtle syntax:

<http://example.org/myExample/baseURI> a owl:Ontology .

At system startup, TBS products must scan the files to name the graphs—that is, to associate the base URIs with the files in the workspace. TopBraid will first look for either a # baseURI or xml:base (depending on serialization) statement at the beginning of the file. This avoids having to look through the entire file to find the owl:Ontology statement. When # baseURI/xml:base and owl:Ontology statements are both present and conflict, precedence is given to the # baseURI/xml:base statement.
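The lookup order described above can be illustrated with a minimal Python sketch. It is not TopBraid's implementation. The header shapes it matches (a `# baseURI: <uri>` comment line and a double-quoted `xml:base` attribute), the `HEAD_CHARS` limit, and the function name `detect_base_uri` are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumed header shapes; the exact syntax TopBraid accepts may differ.
BASE_URI_COMMENT = re.compile(r"^#\s*baseURI:\s*(\S+)", re.MULTILINE)
XML_BASE_ATTR = re.compile(r'xml:base\s*=\s*"([^"]+)"')
# Matches a Turtle statement of the form: <uri> a owl:Ontology
ONTOLOGY_TRIPLE = re.compile(r"<([^>]+)>\s+(?:a|rdf:type)\s+owl:Ontology\b")

# Hypothetical limit on how much of the file counts as "the beginning".
HEAD_CHARS = 2048


def detect_base_uri(text: str) -> Optional[str]:
    """Return the base URI for a serialized graph, or None if none is found."""
    head = text[:HEAD_CHARS]
    for pattern in (BASE_URI_COMMENT, XML_BASE_ATTR):
        match = pattern.search(head)
        if match:
            # A header statement takes precedence; no full scan is needed.
            return match.group(1)
    # Fallback: scan the whole file for the owl:Ontology statement.
    match = ONTOLOGY_TRIPLE.search(text)
    return match.group(1) if match else None
```

Because the header check runs first, a file whose header and owl:Ontology statement disagree resolves to the header value, which matches the precedence rule stated above.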

Thus, each file must:

When graphs are created using TopBraid, TopBraid will ensure that the appropriate statements exist. Authors of graphs created outside of TopBraid must take care to follow the rules specified above.

Other important points:

Base URIs (named graphs) are used by TopBraid in accordance with the standards set in SPARQL and RDF 1.1. Here is a representative, though not comprehensive, list:

When a TBS product encounters a base URI, it will always first try to locate the named graph in the workspace. If the base URI specified is not associated with one of the graphs managed by TBS, TBS will attempt to access the graph in the location provided by the base URI. For example, if no TBS named graphs have the base URI http://www.example.org/graph1, TBS will attempt to connect to the web address http://www.example.org/graph1 to retrieve the graph data. Performance will depend on the response time of that web server. If the graph in question is not hosted in the appropriate manner by the server at the specified location, users will experience some delay as TBS confirms that there is no response (which would be the case with any example.org URIs).
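The resolution order described above (workspace first, web second) can be sketched in Python as follows. This is an illustration only. The `workspace` dictionary, the `fetch_remote` callable, and the name `resolve_graph` are hypothetical stand-ins, not TopBraid APIs.

```python
from typing import Callable, Dict, Optional


def resolve_graph(
    base_uri: str,
    workspace: Dict[str, str],
    fetch_remote: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Look up a graph locally first; fall back to the web only on a miss."""
    if base_uri in workspace:
        # Managed graph: served from the workspace, no network access.
        return workspace[base_uri]
    # Not managed locally: treat the base URI as a web address.
    # This call may be slow, or return None if nothing is served there.
    return fetch_remote(base_uri)
```

Passing the remote fetch in as a parameter keeps the sketch free of network code and makes the cost visible: only base URIs missing from the workspace ever reach `fetch_remote`.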

TopBraid includes a component called FileRegistry that keeps an up-to-date list of all graphs managed by TopBraid. TopBraid Composer provides a view that shows the FileRegistry content for a workspace. For more information, see the File Registry View help panel. For TBS server products, a list of named graphs in the workspace is available through the Administrative Console at the Base URI Management link, at the URL http://[host]:[port]/{evn,tbl}/tbl/admin/baseURIMgmt.

Duplicate base URIs

TopBraid can only operate over a single graph for a given base URI. This is identical to any other named graph implementation and consistent with the named graph specification—each graph must have a unique identity. This policy is enforced in TBS servers and by default in Composer. A preference is provided in TopBraid Composer to allow base URI conflicts (see below). However, the best and highly recommended practice is to avoid using the same base URI for two different files. TopQuadrant cannot provide support to resolve problems resulting from users attempting to keep and use multiple graphs with the same identity.

TopBraid Composer provides warnings when the uniqueness of the base URIs for the workspace graphs is violated. It also prevents users from uploading projects to TopBraid Live that either contain files with conflicting base URIs or contain files with base URIs that conflict with those already present in the TopBraid Live workspace.

The preference to allow base URI conflicts in Composer means that multiple files can use the same base URI, but only one file is defined as the "primary file" for that base URI. Advanced users can use this to control, for example, how data is imported. When two or more files share a base URI, the conflict is resolved in the following way:

  1. TopBraid Composer will decide that the base URI is associated with the file that has been updated most recently.

  2. In TopBraid Composer only, the end user can alter this decision by opening the file that they want to be the primary ("main") file for the base URI. Composer then displays a message asking them to confirm the choice.
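The two rules above can be sketched in Python as follows. This is a simplified illustration, not Composer's actual logic. The `GraphFile` record, the `overrides` mapping (standing in for the user's confirmed choice), and the function name `primary_files` are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class GraphFile(NamedTuple):
    path: str        # workspace-relative file path
    base_uri: str    # base URI declared in the file
    modified: float  # last-modified timestamp, e.g. from os.path.getmtime


def primary_files(
    files: Iterable[GraphFile],
    overrides: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Map each base URI to the path of the file chosen as its primary file."""
    overrides = overrides or {}
    newest: Dict[str, GraphFile] = {}
    for f in files:
        current = newest.get(f.base_uri)
        # Rule 1: the most recently updated file wins by default.
        if current is None or f.modified > current.modified:
            newest[f.base_uri] = f
    chosen = {uri: f.path for uri, f in newest.items()}
    # Rule 2: an explicit user choice replaces the default.
    chosen.update(overrides)
    return chosen
```

For example, given two files that declare the same base URI, the one with the later timestamp is selected unless `overrides` names the other file for that base URI.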