3.3.8 Literal matching and indexing
Literal values are ordered and indexed using a skip list. This index serves the following purposes.
- Unlike hash tables, an ordered index such as a skip list allows for efficient prefix and range matching. Prefix matching is useful in interactive applications to provide feedback while typing, such as auto-completion.
- Having a table of unique literals makes it possible to generate creation and destruction events (see rdf_monitor/2). These events can be used to maintain additional indexing on literals, such as ‘by word’. See library(semweb/litindex).
As string literal matching is most frequently used for searching purposes, the match is executed case-insensitively and after removal of diacritics. Case matching and diacritic removal are based on Unicode character properties and are independent of the current locale. Case conversion is based on the ‘simple uppercase mapping’ defined by Unicode and diacritic removal on the ‘decomposition type’. The approach is lightweight, but somewhat simpleminded for some languages. The tables are generated for Unicode characters up to 0x7fff. For more information, please check the source code of the mapping-table generator unicode_map.pl, available in the sources of this package.
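The case-insensitive matching described above can be observed with a small experiment. The sketch below is illustrative only: the resource URIs, the graph name `case_demo` and the helper `demo_case_insensitive/1` are made up for this example, and it assumes the `icase(Text)` query form of rdf/3 and rdf/4.

```prolog
:- use_module(library(semweb/rdf_db)).

%!  demo_case_insensitive(-Labels) is det.
%
%   Hypothetical helper: store two literals that differ only in case
%   and collect every literal that matches a lowercase query.
demo_case_insensitive(Labels) :-
    % Example data; URIs and graph name are assumptions.
    rdf_assert('http://example.org/a', 'http://example.org/label',
               literal('Amsterdam'), case_demo),
    rdf_assert('http://example.org/b', 'http://example.org/label',
               literal('AMSTERDAM'), case_demo),
    % icase(Text) requests a full, case-insensitive match.
    findall(L,
            rdf(_, 'http://example.org/label',
                literal(icase(amsterdam), L), case_demo),
            Labels0),
    % Sort in standard order so the result does not depend on
    % the order in which the database returns the answers.
    msort(Labels0, Labels).
```

Calling `demo_case_insensitive(Labels)` should bind Labels to both stored spellings, although the query itself is written in lowercase.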
Currently the total order of literals is first based on the type of literal using the ordering numeric < string < term. Numeric values (integer and float) are ordered by value; integers precede floats if they represent the same value. Strings are sorted alphabetically after case-mapping and diacritic removal as described above. If they compare equal, uppercase precedes lowercase and diacritics are ordered on their Unicode value. If they still compare equal, literals without any qualifier precede literals with a type qualifier, which precede literals with a language qualifier. Same qualifiers (both type or both language) are sorted alphabetically.
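The ordering rules above can be modelled with a sort key. The following is a simplified sketch and not the library's implementation: `literal_sort_key/2` and `sort_literals/2` are hypothetical names, upcase_atom/2 stands in for the Unicode case mapping, and diacritic removal is ignored entirely.

```prolog
:- use_module(library(pairs)).

%!  literal_sort_key(+Literal, -Key) is det.
%
%   Hypothetical model of the described order. Key is
%   key(TypeRank, Primary, Secondary, QualifierRank, Qualifier),
%   to be compared in the standard order of terms.
literal_sort_key(Number, key(0, Value, Tie, 0, '')) :-
    number(Number), !,
    Value is float(Number),              % order numbers by value
    (   integer(Number)                  % integer precedes equal float
    ->  Tie = 0
    ;   Tie = 1
    ).
literal_sort_key(Text, key(1, Folded, Text, 0, '')) :-
    atom(Text), !,                       % plain string: no qualifier
    upcase_atom(Text, Folded).
literal_sort_key(type(Type, Text), key(1, Folded, Text, 1, Type)) :-
    atom(Text), !,                       % type qualifier
    upcase_atom(Text, Folded).
literal_sort_key(lang(Lang, Text), key(1, Folded, Text, 2, Lang)) :-
    atom(Text), !,                       % language qualifier
    upcase_atom(Text, Folded).
literal_sort_key(Term, key(2, Term, Term, 0, '')).   % any other term

%!  sort_literals(+Literals, -Sorted) is det.
%
%   Sort the arguments of literal(...) objects by the model key.
sort_literals(Literals, Sorted) :-
    map_list_to_pairs(literal_sort_key, Literals, Keyed),
    keysort(Keyed, SortedPairs),
    pairs_values(SortedPairs, Sorted).
```

In this model the raw text serves as the secondary key, so uppercase sorts before lowercase for ASCII text because uppercase character codes are smaller. For example, `sort_literals([lang(en, abc), abc, 'ABC', 1.0, 1], Sorted)` places the numbers first, then the strings, and among equal strings the unqualified literal before the language-qualified one.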
The ordered index is used for indexed execution of literal(prefix(Prefix), Literal) as well as literal(like(Like), Literal) if Like does not start with a ‘*’. Note that results of queries that use this index are returned in alphabetical order.
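As an illustration of the two indexed query forms, the sketch below collects the literals matching a prefix or a like pattern, as one might do for auto-completion. The helper name `literal_matches/2` is hypothetical and not part of the library.

```prolog
:- use_module(library(semweb/rdf_db)).
:- use_module(library(lists)).

%!  literal_matches(+Query, -Literals) is det.
%
%   Hypothetical helper. Query is prefix(Prefix) or like(Pattern).
%   Literals is the list of distinct matching literal values, in the
%   order in which the database returns them. Qualified literals are
%   returned as lang(Lang, Text) or type(Type, Text) terms.
literal_matches(Query, Literals) :-
    findall(L, rdf(_, _, literal(Query, L)), Literals0),
    % The same literal may be the object of several triples;
    % list_to_set/2 drops duplicates and keeps first-occurrence order.
    list_to_set(Literals0, Literals).
```

A completion widget could call `literal_matches(prefix(ams), Ls)` on each keystroke, while `literal_matches(like('ams*dam'), Ls)` finds literals that start with "ams" and end in "dam". Both patterns start with a fixed prefix, so both can use the ordered index.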