rdf_abstract.pl -- Abstract RDF graphs
The task of this module is to do some simple manipulations on RDF graphs
represented as lists of rdf(S,P,O). Supported operations:
- merge_sameas_graph(+GraphIn, -GraphOut, +Options)
- Merge nodes by owl:sameAs
- bagify_graph(+GraphIn, -GraphOut, -Bags, +Options)
- Bagify a graph, returning a new graph holding bags of resources playing a similar role in the graph.
- abstract_graph(+GraphIn, -GraphOut, +Options)
- Abstract nodes or edges using rdf:type, rdfs:subClassOf and/or rdfs:subPropertyOf
- merge_sameas_graph(+GraphIn, -GraphOut, +Options) is det
- Collapse nodes in GraphIn that are related through an identity
mapping. By default, owl:sameAs is the identity relation.
Options defines:
- predicate(-PredOrList)
- Use an alternate predicate, or a list of predicates, to be treated as identity relations.
- sameas_mapped(-Assoc)
- Assoc mapping each resource to the resource it was mapped to.
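The collapsing step can be illustrated with a minimal, self-contained sketch. It is not the module's implementation: sketch_merge_sameas/2 and its helpers are hypothetical names, the atom 'owl:sameAs' stands in for the full property IRI, and picking the smallest resource in standard order as representative is an assumption made only to keep the example short.

```prolog
:- use_module(library(lists)).

% sketch_merge_sameas(+GraphIn, -GraphOut)
% Hypothetical sketch: collapse nodes linked by 'owl:sameAs' and drop
% the identity triples themselves. GraphOut is sorted and duplicate-free.
sketch_merge_sameas(GraphIn, GraphOut) :-
    SameAs = 'owl:sameAs',                      % assumed stand-in for the IRI
    findall(S-O, member(rdf(S,SameAs,O), GraphIn), Pairs),
    findall(rdf(S,P,O),
            ( member(rdf(S0,P,O0), GraphIn),
              P \== SameAs,
              sameas_representative(Pairs, S0, S),
              sameas_representative(Pairs, O0, O)
            ),
            Triples),
    sort(Triples, GraphOut).

% The representative is the smallest member (standard order of terms)
% of the set of resources reachable through identity links.
sameas_representative(Pairs, R, Repr) :-
    closure([R], [R], Pairs, Set),
    sort(Set, [Repr|_]).

% closure(+Open, +Seen0, +Pairs, -Seen): follow links in both directions.
closure([], Seen, _, Seen).
closure([N|Open], Seen0, Pairs, Seen) :-
    findall(M,
            ( ( member(N-M, Pairs) ; member(M-N, Pairs) ),
              \+ memberchk(M, Seen0)
            ),
            Ms0),
    sort(Ms0, Ms),
    append(Ms, Seen0, Seen1),
    append(Open, Ms, Open1),
    closure(Open1, Seen1, Pairs, Seen).
```

For the graph [rdf(a,'owl:sameAs',b), rdf(b,p,c), rdf(a,p,c), rdf(d,q,b)] this sketch yields [rdf(a,p,c), rdf(d,q,a)].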
- sameas_map(+Graph, +SameAs, -Map:assoc) is det[private]
- Create an assoc with R->Set, where Set contains an ordered set of resources equivalent to R.
- same_as(+Predicate:resource, +SameAs:list) is semidet[private]
- True if Predicate expresses a same-as mapping.
- representer_map(+List:list(Repr-Set), -Assoc) is det[private]
- Assoc maps all elements of Set to their representer.
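A possible shape for such a map, as a sketch only: sketch_representer_map/2 is a hypothetical name, and the code assumes the sets are disjoint, because list_to_assoc/2 raises an error on duplicate keys.

```prolog
:- use_module(library(assoc)).
:- use_module(library(lists)).

% sketch_representer_map(+List:list(Repr-Set), -Assoc)
% Hypothetical sketch: every member of Set becomes a key mapped to Repr.
% Assumes the sets do not overlap.
sketch_representer_map(ReprSets, Assoc) :-
    findall(Member-Repr,
            ( member(Repr-Set, ReprSets),
              member(Member, Set)
            ),
            Pairs),
    list_to_assoc(Pairs, Assoc).
```

With the input [a-[a,b], c-[c]], looking up b in the resulting assoc gives a.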
- bagify_graph(+GraphIn, -GraphOut, -Bags, +Options) is det
- If a graph contains multiple objects of the same type (class) in
the same location in the graph (i.e. all links are the same),
create a bag. The bag is represented by a generated resource
of type rdf:Bag and the RDF for the bags is put in Bags. I.e.
appending GraphOut and Bags provides a proper RDF model. Options
provides additional abstraction properties. In particular:
- class(+Class)
- Try to bundle objects under Class rather than their rdf:type. This option may be given multiple times.
- property(+Property)
- Consider predicates that are an rdfs:subPropertyOf Property the same relations.
- bagify_literals(+Bool)
- If true (default), also try to put literals into a bag. Works well to collapse non-preferred labels.
- canonise_options(+OptionsIn, -OptionsOut) is det[private]
- Rewrite option list from possible Name=Value to Name(Value)
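The rewrite is small enough to show as a sketch. The predicate names below are hypothetical and the code only mirrors the behaviour described above: Name=Value becomes Name(Value), anything else is passed through unchanged.

```prolog
:- use_module(library(apply)).

% sketch_canonise_options(+OptionsIn, -OptionsOut)
% Hypothetical sketch of rewriting Name=Value into Name(Value).
sketch_canonise_options(OptionsIn, OptionsOut) :-
    maplist(canonise_option, OptionsIn, OptionsOut).

canonise_option(Name=Value, Option) :-
    !,
    Option =.. [Name, Value].
canonise_option(Option, Option).
```

For example, [class=c, property(p)] is rewritten to [class(c), property(p)].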
- group_resources_by_class(+Resources, -ByClass, +Options) is det[private]
- ByClass is a list of lists of resources that belong to the same class. As a first step, we process the classes specified in Options.
- has_class(+Match, +Class, +Node) is semidet[private]
- class_of(+Node, +Match, -Class) is det[private]
- class_of(+Node, +Match, +Class) is semidet[private]
- resource_bags(+ByClass:list(list(resource)), +NodeToEdges:list(node-list(edges)), -RawBags:list(list(resource))) is det[private]
- Find bags of resources that have the same connections.
- ord_subkeys(+Keys, +Pairs, -SubPairs) is det[private]
- SubPairs is the sublist of Pairs with a key in Keys.
- same_edges(+NodeToEdges:list(node-edges), -Bags:list(list)) is det[private]
- Bags is a list of lists of resources (nodes) that share the same (abstracted) edges with the rest of the graph.
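Grouping nodes by identical edge lists can be sketched with library(pairs). This is an illustration under assumptions, not the module's code: sketch_same_edges/2 is a hypothetical name, the edge lists are taken to be already abstracted and sorted, and discarding groups with a single member is a choice made for the example.

```prolog
:- use_module(library(pairs)).
:- use_module(library(apply)).

% sketch_same_edges(+NodeToEdges:list(node-edges), -Bags:list(list))
% Hypothetical sketch: nodes with identical edge lists end up in one bag.
sketch_same_edges(NodeToEdges, Bags) :-
    transpose_pairs(NodeToEdges, ByEdges),   % Edges-Node, sorted on Edges
    group_pairs_by_key(ByEdges, Grouped),    % Edges-[Node, ...]
    pairs_values(Grouped, Groups),
    include(multiple, Groups, Bags).         % assumed: singletons are no bag

multiple([_,_|_]).
```

Given [a-[e1], b-[e1], c-[e2]], the sketch returns [[a,b]].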
- graph_node_edges(+Graph, -NodeEdges:assoc, +Options) is det[private]
- NodeEdges is an assoc from resource to a sorted list of involved
triples. Only subjects and objects are considered. Processes the
bagify_literals and property options.
- property_map(+Options, -Map:assoc(P-Super))[private]
- Process the options, creating a map that replaces a property by its registered super.
- abstract_property(+P0, +Map0, -P, -Map) is det[private]
- Find the abstract property for some property P.
- assign_bagids(+Bags:list(bag), -IDBags:list(id-bag))[private]
- Assign bag identifiers to each bag in Bags.
- make_rdf_graphs(+IDBags, -RDFBags) is det[private]
- Translate BagID-Members into an RDF graph.
- merge_properties(+GraphIn, -GraphOut, +Options) is det[private]
- Merge equivalent properties joining the same nodes. They are replaced by their common ancestors.
- common_ancestor_forest(:Pred, +Objects, -Forest) is det[private]
- Forest is a minimal set of minimal spanning trees with real
branching (more than one child per node) covering all Objects.
The partial ordering is defined by the non-deterministic goal
call(Pred, +Node, -Parent).
- Build up a graph represented as Node->Children and a list of roots. The initial list of roots is Objects. The graph is built using breadth-first search to minimize depth.
- Once we have all roots, we delete all branches that have only a single child.
- keys_to_assoc(+Keys:list, +Value, -Assoc) is det[private]
- True if Assoc is an assoc where each Key maps to Value.
- ancestor_tree(+Open, +Closed, +Targets, :Pred, +NodesIn, -NodesOut, -Roots) is det[private]
- Explore the ancestor graph one more step. This is the main loop
looking for a spanning tree. We are done if
- There is only one open node left and no closed ones. We found the single common root.
- No open nodes are left. We have a set of closed roots which form our starting points. We still have to figure out the minimal set of these, as some of the trees may overlap others.
- We have an open node covering all targets. This is the lowest one as we used breadth-first expansion. This step is too expensive.
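The third termination condition can be illustrated with a much simplified relative of this search. The sketch below only finds a single lowest common ancestor by level-wise (breadth-first) expansion and does not build or prune a forest. All names are hypothetical, the parent/2 facts are invented toy data, and the hierarchy is assumed to be finite and acyclic. The covering test is repeated at every level, which hints at why the description calls this step expensive.

```prolog
:- use_module(library(lists)).
:- use_module(library(ordsets)).

% Hypothetical toy hierarchy: parent(Child, Parent).
parent(cat,    mammal).
parent(dog,    mammal).
parent(mammal, animal).
parent(trout,  fish).
parent(fish,   animal).

% sketch_lca(:Pred, +Objects, -Ancestor)
% Ancestor is the first node, found level by level, that covers all Objects.
sketch_lca(Pred, Objects, Ancestor) :-
    sort(Objects, Targets),
    lca_level(Targets, [], Pred, Targets, Ancestor).

lca_level(Open, _Seen, Pred, Targets, Ancestor) :-
    member(Ancestor, Open),
    forall(member(T, Targets), ancestor_or_self(Pred, T, Ancestor)),
    !.
lca_level(Open, Seen0, Pred, Targets, Ancestor) :-
    Open \== [],                              % fail if nothing is left to expand
    ord_union(Seen0, Open, Seen),
    findall(P, ( member(N, Open), call(Pred, N, P) ), Ps),
    sort(Ps, Parents),
    ord_subtract(Parents, Seen, Next),        % do not revisit nodes
    lca_level(Next, Seen, Pred, Targets, Ancestor).

% Assumes an acyclic hierarchy; a cycle would make this loop.
ancestor_or_self(_, Node, Node).
ancestor_or_self(Pred, Node, Ancestor) :-
    call(Pred, Node, Parent),
    ancestor_or_self(Pred, Parent, Ancestor).
```

With the toy facts, sketch_lca(parent, [cat,dog], A) binds A to mammal and sketch_lca(parent, [cat,trout], A) binds A to animal.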
- expand_ancestor_tree(+Open0, -Open, +Closed0, -Closed, +Nodes0, -Nodes, :Pred)[private]
- Expand the explored graph with one level. Open are the currently open nodes. Closed are the nodes that have no parent and therefore are roots.
- add_parents(+Parents:list, +Child, -NR, +NRT, +Nodes0, -Nodes)[private]
- Add links Parent->Child to the tree Nodes0. The difference list NR\NRT contains the Parents that were newly added to the tree.
- in_tree(?Node, +Root, +Nodes) is nondet[private]
- True if Node appears in the tree below Root.
- prune_forest(+Nodes, +Roots, -MinimalForest) is det[private]
- MinimalForest is the minimal forest overlapping all targets.
- prune_root(+Nodes, +Root0, -Root) is det[private]
- Prune the parts of the search tree that ended up nowhere. The first real branch is where we find a solution or there are multiple parents. This avoids doing double work pruning the trees themselves.
- prune_ancestor_tree(Nodes, Root, Tree) is det[private]
- Tree is a pruned hierarchy from Root using the branching paths of Nodes.
- tree_covers(+Root, +Nodes, -Targets:list) is det[private]
- True if Targets is the sorted list of targets covered by the tree for which Root is the root.
- map_graph(+GraphIn, +Map:assoc, -GraphOut) is det[private]
- Map a graph to a new graph by mapping all fields of the RDF
statements over Map. Then delete duplicates from the resulting
graph as well as rdf(S,P,S) links that did not appear before the mapping.
- map_graph(+GraphIn, +Map:assoc, -GraphOut, -AbstractMap) is det[private]
- Map a graph to a new graph by mapping all fields of the RDF
statements over Map. The nodes in these graphs are terms of the
form Abstract-list(concrete).
- pairs_keys_intersection(+Pairs, +Keys, -PairsInKeys) is det[private]
- True if PairsInKeys is a subset of Pairs whose keys appear in
Keys. Pairs must be key-sorted and Keys must be sorted. E.g.
?- pairs_keys_intersection([a-1,b-2,c-3], [a,c], X).
X = [a-1,c-3]
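Because both inputs are sorted, the intersection can be computed in a single merge pass. The following is a sketch of that idea under a hypothetical name, not the module's own definition. It keeps the current key when a match is found so that several pairs sharing one key are all retained.

```prolog
% sketch_pairs_keys_intersection(+Pairs, +Keys, -PairsInKeys)
% Hypothetical sketch: one merge pass over a key-sorted pair list and a
% sorted key list, comparing in the standard order of terms.
sketch_pairs_keys_intersection([], _, []) :- !.
sketch_pairs_keys_intersection(_, [], []) :- !.
sketch_pairs_keys_intersection([K1-V|T1], [K2|T2], Result) :-
    compare(Order, K1, K2),
    (   Order == (=)
    ->  Result = [K1-V|Rest],
        sketch_pairs_keys_intersection(T1, [K2|T2], Rest)
    ;   Order == (<)
    ->  sketch_pairs_keys_intersection(T1, [K2|T2], Result)
    ;   sketch_pairs_keys_intersection([K1-V|T1], T2, Result)
    ).
```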
- map_to_bagged_graph(+GraphIn, +Map, -GraphOut, -Bags) is det[private]
- GraphOut is a graph between objects and bags, using the most specific common ancestor for representing properties.
- rdf_to_paired_graph(+GraphIn, -PairedGraph) is det[private]
- used_properties(+S0, +O0, +GraphIn, +AbstractMap, -PredList) is det[private]
- Find properties actually used between two bags. S0 and O0 are the subject and object from the abstract graph.
- graph_resources(+Graph, -Resources:list(atom)) is det
- Resources is a sorted list of unique resources appearing in Graph. All resources are in Resources, regardless of the role played in the graph: node, edge (predicate) or type for a typed literal.
- graph_nodes(+Graph, -Nodes) is det[private]
- Nodes is a sorted list of all resources and literals appearing in Graph.
- graph_resources(+Graph, -Resources:list(atom), -Predicates:list(atom), -Types:list(atom)) is det
- Resources is a sorted list of unique resources appearing in Graph as subject or object of a triple. Predicates is a list of all unique predicates in Graph and Types is a list of all unique literal types in Graph.
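The three result lists can be collected with findall/3 and sort/2. The sketch below uses a hypothetical name and assumes the SWI-Prolog RDF representation in which typed literals are written literal(type(Type, Value)) and resources are atoms.

```prolog
:- use_module(library(lists)).

% sketch_graph_resources(+Graph, -Resources, -Predicates, -Types)
% Hypothetical sketch: split the names used in a triple list by role.
sketch_graph_resources(Graph, Resources, Predicates, Types) :-
    findall(R,
            ( member(rdf(S,_,O), Graph),
              (   R = S
              ;   atom(O), R = O              % skip literal objects
              )
            ),
            Rs),
    sort(Rs, Resources),
    findall(P, member(rdf(_,P,_), Graph), Ps),
    sort(Ps, Predicates),
    findall(T, member(rdf(_,_,literal(type(T,_))), Graph), Ts),
    sort(Ts, Types).
```

For [rdf(a,p,b), rdf(b,q,literal(type(xsd_int,'1')))], with xsd_int as a made-up type name, the sketch returns [a,b], [p,q] and [xsd_int].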
- abstract_graph(+GraphIn, -GraphOut, +Options) is det
- Unify GraphOut with an abstracted version of GraphIn. The
abstraction is carried out triple-by-triple. Note there is no
need to abstract all triples to the same level. We do however
need to map nodes in the graph consistently. I.e. if we abstract
the object of rdf(s,p,o), we must abstract the subject of
rdf(o, p2, o2) to the same resource.
If we want to do incremental growing we must keep track of which nodes were mapped to which resources. Option?
We must also decide on the abstraction level for a node. This can be based on the weight in the search graph, the involved properties and focus such as location and time. Should we express this focus in the weight?
Options:
- map_in(?Map)
- If present, this is the initial resource abstraction map.
- map_out(-Map)
- Provide access to the final resource abstraction map.
- bags(-Bags)
- If provided, bagify the graph, returning the triples that define the bags in Bags. The full graph is created by appending Bags to GraphOut.
- merge_concepts_with_super(+Boolean)
- If true (default), merge nodes if one is a super-concept of another.
- node_map(+Nodes, +Map0, -Map, +Options) is det[private]
- Create the abstraction map for the nodes of the graph. It
consists of two steps:
- Map all instances to their class, except for concepts
- If some instances are mapped to class A and others to class B, where A is a super-class of B, map all instances to class A.
- identity_map(+List, -Map) is det[private]
- find_broaders(+List, +Map0, -Map) is det[private]
- deref_map(+Map0, -Map) is det[private]
- deref(+Pairs0, -NewPairs) is det[private]
- Dereference chains V1-V2, V2-V3 into V1-V3, V2-V3. Note that Pairs0 may contain cycles, in which case all the members of the cycle are replaced by the representative as defined by rdf_representative/2.
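Chain dereferencing can be sketched with an assoc lookup per pair. This is only an illustration under hypothetical names: it assumes each key occurs once in Pairs0, and it does not implement the cycle handling described above. The Seen list merely guarantees termination if a cycle is present; it does not pick a representative.

```prolog
:- use_module(library(assoc)).
:- use_module(library(apply)).
:- use_module(library(lists)).

% sketch_deref(+Pairs0, -Pairs)
% Hypothetical sketch: replace each value by the end of its chain.
sketch_deref(Pairs0, Pairs) :-
    list_to_assoc(Pairs0, Map),               % assumes unique keys
    maplist(deref_pair(Map), Pairs0, Pairs).

deref_pair(Map, K-V0, K-V) :-
    follow(Map, V0, [], V).

% Follow V0 through Map until a value has no further mapping.
follow(Map, V0, Seen, V) :-
    (   get_assoc(V0, Map, V1),
        \+ memberchk(V0, Seen)                % termination guard only
    ->  follow(Map, V1, [V0|Seen], V)
    ;   V = V0
    ).
```

The input [v1-v2, v2-v3] is rewritten to [v1-v3, v2-v3].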
- edge_map(+Edges, +MapIn, -MapOut) is det[private]
- concept_of(+Resource, -Concept) is det[private]
- True if Concept is the concept Resource belongs to. If Resource is a concept itself, Concept is Resource.
- broader(+Term, -Broader) is nondet[private]
- True if Broader is a broader term according to the SKOS schema.
- rdf_representative(+Resources:list, -Representative:atom) is det[private]
- Representative is the most popular resource from the non-empty list Resources. The preferred representative is currently defined as the resource with the highest number of associated edges.
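The popularity criterion can be sketched over a triple list instead of the RDF store. The name sketch_representative/3 and the extra Graph argument are assumptions made for this example. Ties are broken by the standard order of terms, which is an arbitrary choice.

```prolog
:- use_module(library(aggregate)).
:- use_module(library(lists)).

% sketch_representative(+Graph, +Resources, -Representative)
% Hypothetical sketch: pick the resource touching the most triples in Graph.
sketch_representative(Graph, Resources, Representative) :-
    findall(Count-R,
            ( member(R, Resources),
              aggregate_all(count,
                            ( member(rdf(S,_,O), Graph),
                              (   S == R
                              ->  true
                              ;   O == R
                              )
                            ),
                            Count)
            ),
            Counted),
    max_member(_-Representative, Counted).    % fails on an empty list
```

For the graph [rdf(a,p,x), rdf(a,q,y), rdf(b,p,x)] and the candidates [a,b], the sketch selects a.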
- minimise_graph(+GraphIn, -GraphOut) is det
- Remove redundant triples from a graph. Redundant triples are
defined as:
- Super-properties of another property
- Inverse
- Symmetric
- Entailed transitive
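Two of these cases can be illustrated with a short sketch. Everything here is hypothetical: the predicate names, the sub_property_of/2 and symmetric/1 facts that stand in for schema lookups, and the decision to keep the direction of a symmetric pair whose subject sorts first. Inverse and transitive redundancy are not covered.

```prolog
:- use_module(library(apply)).
:- use_module(library(lists)).

% Hypothetical schema facts standing in for RDFS/OWL declarations.
sub_property_of(mother, parent).
symmetric(knows).

% sketch_minimise(+GraphIn, -GraphOut)
sketch_minimise(GraphIn, GraphOut) :-
    sort(GraphIn, Set),                       % also removes duplicates
    exclude(redundant(Set), Set, GraphOut).

% A triple is redundant if a direct sub-property links the same nodes ...
redundant(Set, rdf(S,P,O)) :-
    sub_property_of(Sub, P),
    memberchk(rdf(S,Sub,O), Set),
    !.
% ... or if it is the reverse direction of a symmetric link that is kept.
redundant(Set, rdf(S,P,O)) :-
    symmetric(P),
    O @< S,
    memberchk(rdf(O,P,S), Set).
```

With these facts, [rdf(x,mother,y), rdf(x,parent,y), rdf(a,knows,b), rdf(b,knows,a)] is reduced to [rdf(a,knows,b), rdf(x,mother,y)].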