protobufs.pl -- Google's Protocol Buffers ("protobufs")
Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data -- think XML, but smaller, faster, and simpler. You define how you want your data to be structured once. This takes the form of a template that describes the data structure. You use this template to encode and decode your data structure into wire-streams that may be sent-to or read-from your peers. The underlying wire stream is platform independent, lossless, and may be used to interwork with a variety of languages and systems regardless of word size or endianness. Techniques exist to safely extend your data structure without breaking deployed programs that are compiled against the "old" format.
The idea behind Google's Protocol Buffers is that you define your structured messages using a domain-specific language and tool set. Further documentation on this is at https://developers.google.com/protocol-buffers.
There are two ways you can use protobufs in Prolog:
- with a compiled
.proto
file: protobuf_parse_from_codes/3 and protobuf_serialize_to_codes/3. - with a lower-level interface protobuf_message/2, which allows you to define your own domain-specific language for parsing and serializing protobufs.
The protobuf_parse_from_codes/3 and protobuf_serialize_to_codes/3
interface translates between a "wire stream" and a Prolog term. This
interface takes advantage of SWI-Prolog's
dict.
There is a protoc
plugin (protoc-gen-swipl
) that generates a
Prolog file of meta-information that captures the .proto
file's
definition in the protobufs
module:
proto_meta_normalize(Unnormalized, Normalized)
proto_meta_package(Package, FileName, Options)
proto_meta_message_type( Fqn, Package, Name)
proto_meta_message_type_map_entry( Fqn)
proto_meta_field_name( Fqn, FieldNumber, FieldName, FqnName)
proto_meta_field_json_name( FqnName, JsonName)
proto_meta_field_label( FqnName, LabelRepeatOptional) % 'LABEL_OPTIONAL', 'LABEL_REQUIRED', 'LABEL_REPEATED'
proto_meta_field_type( FqnName, Type) % 'TYPE_INT32', 'TYPE_MESSAGE', etc
proto_meta_field_type_name( FqnName, TypeName)
proto_meta_field_default_value( FqnName, DefaultValue)
proto_meta_field_option_packed( FqnName)
proto_meta_enum_type( FqnName, Fqn, Name)
proto_meta_enum_value( FqnName, Name, Number)
proto_meta_field_oneof_index( FqnName, Index)
proto_meta_oneof( FqnName, Index, Name)
The protobuf_message/2 interface allows you to define your message
template as a list of predefined
Prolog terms that correspond to production rules in the Definite Clause
Grammar (DCG) that realizes the interpreter. Each production rule has an
equivalent rule in the protobuf grammar. The process is not unlike
specifiying the format of a regular expression. To encode a template to
a wire-stream, you pass a grounded template, X
, and variable, Y
, to
protobuf_message/2. To decode a wire-stream, Y
, you pass an ungrounded
template, X
, along with a grounded wire-stream, Y
, to
protobuf_message/2. The interpreter will unify the unbound variables in
the template with values decoded from the wire-stream.
For an overview and tutorial with examples, see
library(protobufs)
: Google's Protocol Buffers
Examples of usage may also be found by inspecting
test_protobufs.pl
and the
demo
directory, or by looking at the "addressbook" example that is typically
installed at
/usr/lib/swi-prolog/doc/packages/examples/protobufs/interop/addressbook.pl
- protobuf_parse_from_codes(+WireCodes:list(int), +MessageType:atom, -Term) is semidet
- Process bytes (list of int) that is the serialized form of a message (designated
by
MessageType
), creating a Prolog term.Protoc
must have been run (with the--swipl_out=
option and the resulting top-level _pb.pl
file loaded. For more details, see the "protoc" section of the overview documentation.Fails if the message can't be parsed or if the appropriate meta-data from
protoc
hasn't been loaded.All fields that are omitted from the
WireCodes
are set to their default values (typically the empty string or 0, depending on the type; or[]
for repeated groups). There is no way of testing whether a value was specified inWireCodes
or given its default value (that is, there is no equivalent of the Python implementation's =HasField`). Optional embedded messages and groups do not have any default value -- you must check their existence by using get_dict/3 or similar. If a field is part of a "oneof" set, then none of the other fields is set. You can determine which field had a value by using get_dict/3. - protobuf_serialize_to_codes(+Term:dict, -MessageType:atom, -WireCodes:list(int)) is det
- Process a Prolog term into bytes (list of int) that is the serialized form of a
message (designated by
MessageType
).Protoc
must have been run (with the--swipl_out=
option and the resulting top-level _pb.pl
file loaded. For more details, see the "protoc" section of the overview documentation.Fails if the term isn't of an appropriate form or if the appropriate meta-data from
protoc
hasn't been loaded, or if a field name is incorrect (and therefore nothing in the meta-data matches it). - protobuf_var_int(?A:int)// is det[private]
- Conversion between an int A and a list of codes, using the
"varint" encoding.
The behvior is undefined if
A
is negative. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term. e.g.phrase(protobuf_var_int(300), S)
=> S = [172,2]phrase(protobuf_var_int(A), [172,2])
-> A = 300 - protobuf_tag_type(?Tag:int, ?WireType:atom)// is det[private]
- Conversion between Tag (number) + WireType and wirestream codes. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term. @arg Tag The item's tag (field number) @arg WireType The item's wire type (see prolog_type//2 for how to convert this to a Prolog type)
- prolog_type(?Tag:int, ?PrologType:atom)// is semidet[private]
- Match Tag (field number) + PrologType. When Type is a variable, backtracks through all the possibilities for a given wire encoding. Note that 'repeated' isn't here because it's handled by single_message//3. See also segment_type_tag/3.
- payload(?PrologType, ?Payload) is det[private]
- Process the codes into
Payload
, according toPrologType
TODO
: payload//2 "mode" is sometimes module-sensitive, sometimes not.payload(enum, A)
// has A as a callable all other uses of payload//2, the 2nd arg is not callable.- This confuses check/0; it also makes defining an enumeration
more difficult because it has to be defined in module protobufs
(see
vector_demo.pl
, which defines commands/2)
- This confuses check/0; it also makes defining an enumeration
more difficult because it has to be defined in module protobufs
(see
- single_message(+PrologType:atom, ?Tag, ?Payload)// is det[private]
- Processes a single messages (e.g., one item in the list in
protobuf([...])
. The PrologType, Tag, Payload are from Field =.. [PrologType, Tag, Payload] in the caller - protobuf_message(?Template, ?WireStream) is semidet
- protobuf_message(?Template, ?WireStream, ?Rest) is nondet
- Marshals and unmarshals byte streams encoded using Google's Protobuf grammars. protobuf_message/2 provides a bi-directional parser that marshals a Prolog structure to WireStream, according to rules specified by Template. It can also unmarshal WireStream into a Prolog structure according to the same grammar. protobuf_message/3 provides a difference list version.
- protobuf_segment_message(+Segments:list, -WireStream:list(int)) is det[private]
- protobuf_segment_message(-Segments:list, +WireStream:list(int)) is det[private]
- Low level marshalling and unmarshalling of byte streams. The
processing is independent of the
.proto
description, similar to the processing done byprotoc --decode_raw
. This means that field names aren't shown, only field numbers.For unmarshalling, a simple heuristic is used on length-delimited segments: first interpret it as a message; if that fails, try to interpret as a UTF8 string; otherwise, leave it as a "blob" (if the heuristic was wrong, you can convert to a string or a blob by using protobuf_segment_convert/2). 32-bit and 64-bit numbers are left as codes because they could be either integers or floating point (use int32_codes_when/2, float32_codes_when/2, int64_codes_when/2, uint32_codes_when/2, uint64_codes_when/2, float64_codes_when/2 as appropriate); variable-length numbers ("varint" in the Protocol Buffers encoding documentation), might require "zigzag" conversion, int64_zigzag_when/2.
For marshalling, use the predicates int32_codes_when/2, float32_codes_when/2, int64_codes_when/2, uint32_codes_when/2, uint64_codes_when/2, float64_codes_when/2, int64_zigzag_when/2 to put integer and floating point values into the appropriate form.
- detag(+Compound, -Name, -Tag, -Value, List, -CompoundWithList) is semidet[private]
- Deconstruct
Compound
or the formName(Tag,Value)
and create a newCompoundWithList
that replacesValue
withList
. This is used by packed_list/2 to transform[varint(1,0),varint(1,1)]
tovarint(1,[0,1])
.Some of
Compound
items are impossible forpacked
with the current protobuf spec, but they don't do any harm. - protobuf_segment_convert(+Form1, ?Form2) is multi[private]
- A convenience predicate for dealing with the situation where
protobuf_segment_message/2 interprets a segment of the wire stream
as a form that you don't want (e.g., as a message but it should have
been a UTF8 string).
Form1
is converted back to the original wire stream, then the predicate non-deterimisticly attempts to convert the wire stream to astring
orlength_delimited
term (or both: the lattter always succeeds).The possible conversions are:
message(Tag,Segments)
=>string(Tag,String)
message(Tag,Segments)
=>length_delimited(Tag,Codes)
string(Tag,String)
=>length_delimited(Tag,Codes)
length_delimited(Tag,Codes)
=>length_delimited(Tag,Codes)
Note that for fixed32, fixed64, only the signed integer forms are given; if you want the floating point forms, then you need to do use int64_float64_when/2 and int32_float32_when/2.
For example:
?- protobuf_segment_convert( message(10,[fixed64(13,7309475598860382318)]), string(10,"inputType")). ?- protobuf_segment_convert( message(10,[fixed64(13,7309475598860382318)]), length_delimited(10,[105,110,112,117,116,84,121,112,101])). ?- protobuf_segment_convert( string(10, "inputType"), length_delimited(10,[105,110,112,117,116,84,121,112,101])). ?- forall(protobuf_segment_convert(string(1999,"\x1\\x0\\x0\\x0\\x2\\x0\\x0\\x0\"),Z), writeln(Z)). string(1999, ) packed(1999,fixed64([8589934593])) packed(1999,fixed32([1,2])) packed(1999,varint([1,0,0,0,2,0,0,0])) length_delimited(1999,[1,0,0,0,2,0,0,0])
These come from:
Codes = [82,9,105,110,112,117,116,84,121,112,101], protobuf_message(protobuf([embedded(T1, protobuf([integer64(T2, I)]))]), Codes), protobuf_message(protobuf([string(T,S)]), Codes). T = 10, T1 = 10, T2 = 13, I = 7309475598860382318, S = "inputType".
- uint32_codes_when(?Uint32, ?Codes) is det[private]
- Convert between a 32-bit unsigned integer value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
This predicate delays until either
Uint32
orCodes
is sufficiently instantiated.There is also a non-delayed uint32_codes/2
SWI-Prolog doesn't have a 32-bit integer type, so 32-bit integer is simulated by doing a range check.
- int32_codes_when(?Int32, ?Codes) is det[private]
- Convert between a 32-bit signed integer value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
This predicate delays until either
Int32
orCodes
is sufficiently instantiated.There is also a non-delayed int32_codes/2
SWI-Prolog doesn't have a 32-bit integer type, so 32-bit integer is simulated by doing a range check.
- float32_codes_when(?Value, ?Codes) is det[private]
- Convert between a 32-bit floating point value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
This predicate delays until either
Value
orCodes
is sufficiently instantiated.There is also a non-delayed float32_codes/2
- uint64_codes_when(?Uint64, ?Codes) is det[private]
- Convert between a 64-bit unsigned integer value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
SWI-Prolog allows integer values greater than 64 bits, so a range check is done.
This predicate delays until either
Uint64
orCodes
is sufficiently instantiated.There is also a non-delayed uint64_codes/2
- int64_codes_when(?Int64, ?Codes) is det[private]
- Convert between a 64-bit signed integer value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
SWI-Prolog allows integer values greater than 64 bits, so a range check is done.
This predicate delays until either
Int64
orCodes
is sufficiently instantiated.There is also a non-delayed int64_codes/2
- float64_codes_when(?Value, ?Codes) is det[private]
- Convert between a 64-bit floating point value and its wirestream codes.
This is a low-level predicate; normally, you should use
template_message/2 and the appropriate template term.
This predicate delays until either
Value
orCodes
is sufficiently instantiated.There is also a non-delayed float64_codes/2
- int64_zigzag_when(?Original, ?Encoded) is det[private]
- Convert between a signed integer value and its zigzag encoding,
used for the protobuf
sint32
andsint64
types. This is a low-level predicate; normally, you should use template_message/2 and the appropriate template term.SWI-Prolog allows integer values greater than 64 bits, so a range check is done.
This predicate delays until either
Original
orEncoded
is sufficiently instantiated.There is also a non-delayed int64_zigzag/2
- uint64_int64_when(?Uint64:integer, ?Int64:integer) is det[private]
- Reinterpret-cast between uint64 and int64. For example,
uint64_int64(0xffffffffffffffff,-1)
.This predicate delays until either
Uint64
orInt64
is sufficiently instantiated.There is also a non-delayed uint64_int64/2
- uint32_int32_when(?Uint32, ?Int32) is det[private]
- Reinterpret-case between uint32 and int32.
This predicate delays until either
Uint32
orInt32
is sufficiently instantiated.There is also a non-delayed uint32_int32/2
- int64_float64_when(?Int64:integer, ?Float64:float) is det[private]
- Reinterpret-cast between uint64 and float64. For example,
int64_float64(3ff0000000000000,1.0)
.This predicate delays until either
Int64
orFloat64
is sufficiently instantiated.There is also a non-delayed uint64_int64/2
- int32_float32_when(?Int32:integer, ?Float32:float) is det[private]
- Reinterpret-cast between uint32 and float32. For example,
int32_float32(0x3f800000,1.0)
.This predicate delays until either
Int32
orFloat32
is sufficiently instantiated.There is also a non-delayed uint32_int32/2
- segment_to_term(+ContextType:atom, +Segment, -FieldAndValue) is det[private]
- ContextType is the type (name) of the containing message
Segment is a segment from protobuf_segment_message/2
TODO
: if performance is an issue, this code can be combined with protobuf_segment_message/2 (and thereby avoid the use of protobuf_segment_convert/2) - convert_segment(+Type:atom, +ContextType:atom, Tag:atom, ?Segment, ?Value) is det[private]
- Compute an appropriate
Value
from the combination of descriptor "type" (inType
) and aSegment
. Reversible onSegment
,Values
. - add_defaulted_fields(+Value0:dict, ContextType:atom, -Value:dict) is det[private]
- message_field_default(+ContextType:atom, Name:atom, -DefaultValue) is semidet[private]
- combine_fields(+Fields:list, +MsgDict0, -MsgDict) is det[private]
- Combines the fields into a dict and sets missing fields to their default values. If the field is marked as 'norepeat' (optional/required), then the last occurrence is kept (as per the protobuf wire spec) If the field is marked as 'repeat', then all the occurrences are put into a list, in order. This code assumes that fields normally occur all together, but can handle (less efficiently) fields not occurring together, as is allowed by the protobuf spec.
- combine_fields_repeat(+Fields:list, Field:atom, -Values:list, RestFields:list) is det[private]
- Helper for combine_fields/3
Stops at the first item that doesn't match
Field
- the assumption is that all the items for a field will be together and if they're not, they would be combined outside this predicate. - field_and_type(+ContextType:atom, +Tag:int, -FieldName:atom, -FqnName:atom, -ContextType2:atom, -RepeatOptional:atom, -Type:atom) is det[private]
- Lookup a
ContextType
andTag
to get the field name, type, etc. - fqn_repeat_optional(+FqnName:atom, -RepeatOptional:atom) is det[private]
- Lookup up
proto_meta_field_label(FqnName, _)
,proto_meta_field_option_packed(FqnName)
and set RepeatOptional to one ofnorepeat
,repeat
,repeat_packed
. - fqn_repeat_optional_2(+DescriptorLabelEnum:atom, -RepeatOrEmpty:atom) is det[private]
- Map the descriptor "label" to 'repeat' or 'norepeat'.
From
proto_meta_enum_value('.google.protobuf.FieldDescriptorProto.Label', Label, _)
. - field_descriptor_label_repeated(+Label:atom) is semidet[private]
- From
proto_meta_enum_value('.google.protobuf.FieldDescriptorProto.Label', 'LABEL_REPEATED', _)
.TODO
: unused - field_descriptor_label_single(+Label:atom) is semidet[private]
- From
proto_meta_enum_value('.google.protobuf.FieldDescriptorProto.Label', Label, _)
. - term_to_segments(+Term:dict, +MessageType:atom, Segments) is det[private]
- Recursively traverse a
Term
, generating message segments - field_segment_scalar_or_repeated(+Label, +Packed, +FieldType, +Tag, +FieldTypeName, ?Value, Segment) is det[private]
FieldType
is from the.proto
meta information ('TYPE_SINT32', etc.)- protobuf_field_is_map(+MessageType, +FieldName) is semidet
- Succeeds if
MessageType
'sFieldName
is defined as a map<...> in the .proto file. - protobuf_map_pairs(+ProtobufTermList:list, ?DictTag:atom, ?Pairs) is det
- Convert between a list of protobuf map entries (in the form
DictTag{key:Key, value:Value}
and a key-value list as described in library(pairs). At least one ofProtobufTermList
andPairs
must be instantiated;DictTag
can be uninstantiated. IfProtobufTermList
is from a term created by protobuf_parse_from_codes/3, the ordering of the items is undefined; you can order them by using keysort/2 (or by a predicate such as dict_pairs/3, list_to_assoc/2, or list_to_rbtree/2.
Re-exported predicates
The following predicates are exported from this file while their implementation is defined in imported modules or non-module files loaded by this module.
- protobuf_message(?Template, ?WireStream) is semidet
- protobuf_message(?Template, ?WireStream, ?Rest) is nondet
- Marshals and unmarshals byte streams encoded using Google's Protobuf grammars. protobuf_message/2 provides a bi-directional parser that marshals a Prolog structure to WireStream, according to rules specified by Template. It can also unmarshal WireStream into a Prolog structure according to the same grammar. protobuf_message/3 provides a difference list version.