Refactor to clean up unnecessary Serialization/Deserialization to/from Matter (and ducktyped) subclasses (Indexer, Counter) #753

SmithSamuelM · 2024-04-12T17:44:00Z

Matter subclasses support round-trippable transformation between the three CESR domains: Text (qb64), Binary (qb2), and Raw, a (code, raw) tuple. The Matter primitive holds the value as a (code, raw) tuple and transforms it to Text or Binary as needed. The raw value is needed to perform cryptographic operations on the primitive, or in other cases to use the primitive as some other type such as a number (int) or a string.
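To make the three domains concrete, here is a minimal sketch of the round trip. It is not keripy's Matter. `ToyPrimitive` and `RAW_SIZES` are hypothetical names, and the sketch assumes only one-character codes with 32-byte raw values, where the single lead pad character of the Base64 text is replaced by the code.

```python
import base64
from dataclasses import dataclass

# Toy size table (assumption for illustration): code -> raw size in bytes.
RAW_SIZES = {"D": 32, "E": 32}


@dataclass(frozen=True)
class ToyPrimitive:
    """Holds only the Raw domain value: the (code, raw) tuple."""
    code: str
    raw: bytes

    @property
    def qb64(self) -> str:
        # Text domain: prepad raw to a multiple of 3 bytes, Base64 encode,
        # then replace the lead pad character(s) with the code.
        pad = (3 - len(self.raw) % 3) % 3
        text = base64.urlsafe_b64encode(bytes(pad) + self.raw).decode()
        return self.code + text[pad:]

    @property
    def qb2(self) -> bytes:
        # Binary domain: the Base64 decoding of the whole qb64 text.
        return base64.urlsafe_b64decode(self.qb64)

    @classmethod
    def from_qb64(cls, text: str) -> "ToyPrimitive":
        code = text[0]
        size = RAW_SIZES[code]
        pad = (3 - size % 3) % 3
        full = (size + pad) * 4 // 3  # framed length in characters
        framed = text[:full]          # ignores anything after the frame
        if len(framed) < full:
            raise ValueError("not enough characters for this code")
        padded = base64.urlsafe_b64decode("A" * pad + framed[len(code):])
        return cls(code, padded[pad:])

    @classmethod
    def from_qb2(cls, data: bytes) -> "ToyPrimitive":
        size = RAW_SIZES[chr(65 + (data[0] >> 2))]  # first sextet is the code
        pad = (3 - size % 3) % 3
        text = base64.urlsafe_b64encode(data[:size + pad]).decode()
        return cls.from_qb64(text)


prim = ToyPrimitive("D", bytes(range(32)))
same_from_text = ToyPrimitive.from_qb64(prim.qb64)
same_from_binary = ToyPrimitive.from_qb2(prim.qb2)
```

Both `same_from_text` and `same_from_binary` compare equal to `prim`, which is the round-trip property described above.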

In many cases a Matter subclass is used for a spurious round trip: a value is transformed from Text to a primitive instance (code, raw) and back to Text without ever being needed in (code, raw) form. This is unnecessary. Just keep it in Text form.

There are three reasons to use a Matter instance and convert from Text or Binary to (code, raw) form. (Note this is in addition to the major use case for the Matter instance which is to generate the Text or Binary from (code, raw) form in the first place.)

1. Parsing a concatenated (composed) stream of primitives. Each primitive is self-framing, and the primitive instance knows how to extract the Text or Binary from the stream, which leaves it in (code, raw) form. So even if the only use is to re-serialize, the instance was required in order to deserialize in a framed way. If the Text or Binary value is not concatenated, then there is no reason to convert to the instance (code, raw) form if it is only ever going to be used in Text or Binary form.
2. When an operation is to be performed on the raw value after deserialization from Text or Binary.
3. When the Text or Binary value is to be stored in a database in concatenated form and the database doesn't know what instance type to use to deconcatenate it later. In other words, the database stored value is not sniffable, so the database must be preconfigured with the list of instance types so it can deconcatenate later.
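The self-framing property in the first reason can be sketched as follows. This is an illustration only: `FULL_SIZES` is a toy table whose entries are assumptions, `split_stream` is a hypothetical helper, and the real parsing in keripy is done by the primitive classes themselves.

```python
from typing import List

# Toy table (assumption for illustration): code -> full qb64 length in characters.
FULL_SIZES = {"D": 44, "E": 44, "0A": 24, "0B": 88}


def split_stream(stream: str) -> List[str]:
    """Split a concatenated qb64 stream into framed qb64 strings.

    Assumes codes starting with "0" are two characters long and all
    other codes are one character long.
    """
    parts = []
    i = 0
    while i < len(stream):
        code_len = 2 if stream[i] == "0" else 1
        code = stream[i:i + code_len]
        if code not in FULL_SIZES:
            raise ValueError(f"unknown code {code!r} at offset {i}")
        size = FULL_SIZES[code]
        if i + size > len(stream):
            raise ValueError("stream ends inside a primitive")
        parts.append(stream[i:i + size])
        i += size
    return parts


# Placeholder payloads, not real keys or salts.
stream = "D" + "a" * 43 + "0A" + "b" * 22 + "E" + "c" * 43
parts = split_stream(stream)
```

The code at the front of each primitive is what tells the parser how many characters to consume, so no delimiter is needed between primitives. A value that arrives already framed, on its own, needs none of this.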

We can largely remove 3. above by changing the interface to CesrSuber and CatCesrSuber. What has happened is that the end state of storing has forced the creation of instances everywhere instead of simply at the interface to the database. The backpressure of the database interface is driving the interface of everything upstream, which is an antipattern. We need to fix the database interface to be smarter and/or better decouple it from its upstream.
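One possible shape for such a decoupled interface, purely as an illustration: the store below accepts and returns qb64 strings, so framing knowledge lives only at the database boundary. `ToyCatStore` is hypothetical, uses a plain dict instead of LMDB, and is not the CesrSuber or CatCesrSuber API. It is configured with full qb64 sizes rather than instance classes, which is an assumption of this sketch.

```python
from typing import Dict, List, Sequence


class ToyCatStore:
    """Stores concatenated qb64 strings; callers never pass instances."""

    def __init__(self, sizes: Sequence[int]):
        # Full qb64 length of each field, in storage order.
        self.sizes = tuple(sizes)
        self._db: Dict[str, str] = {}

    def put(self, key: str, vals: Sequence[str]) -> None:
        if len(vals) != len(self.sizes) or any(
            len(v) != s for v, s in zip(vals, self.sizes)
        ):
            raise ValueError("values do not match the configured framing")
        self._db[key] = "".join(vals)

    def get(self, key: str) -> List[str]:
        cat = self._db[key]
        out = []
        i = 0
        for size in self.sizes:
            out.append(cat[i:i + size])
            i += size
        return out


store = ToyCatStore([44, 24])
# Placeholder payloads, not real keys or salts.
vals = ["D" + "a" * 43, "0A" + "b" * 22]
store.put("key", vals)
fetched = store.get("key")
```

Upstream code in this sketch only ever handles strings. Whether the real fix configures the database with sizes, codes, or classes is a design decision this sketch does not settle.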

Looking at this a different way: the convenience of the fully qualified Text domain representation (qb64) is that we have a human-readable ASCII (Base64) representation of a primitive that we can use as an identifier. When using it as an identifier we don't need the Matter instance that generated it. We just need the string. In many cases we are better off just passing the string around between functions and methods, and only instancing as Matter at the point of need for a cryptographic operation, instead of passing around the instance and then re-serializing it over and over everywhere it's used as an identifier. This is especially true because use as an identifier is the predominant case and use in a cryptographic operation is comparatively rare.
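A small sketch of that pattern, under stated assumptions: the qb64 string is used directly as a dictionary key, and the raw bytes are only recovered at the one place a cryptographic check happens. The helper names are hypothetical, SHA-256 from the standard library stands in for whatever digest is actually in use, and the code letter "I" is an assumption for illustration. None of this is keripy API.

```python
import base64
import hashlib
from typing import Dict


def digest_qb64(data: bytes, code: str = "I") -> str:
    """Toy qb64 digest: one-character code plus a 32-byte SHA-256 digest."""
    raw = hashlib.sha256(data).digest()
    return code + base64.urlsafe_b64encode(b"\x00" + raw).decode()[1:]


def raw_from_qb64(qb64: str) -> bytes:
    """Recover the 32 raw bytes; only needed for the cryptographic check."""
    return base64.urlsafe_b64decode("A" + qb64[1:])[1:]


# Identifier use: the string is the key. No instance, no decoding.
registry: Dict[str, bytes] = {}


def register(said: str, data: bytes) -> None:
    registry[said] = data


def verify(said: str) -> bool:
    # Point of need: convert to raw here and nowhere else.
    return hashlib.sha256(registry[said]).digest() == raw_from_qb64(said)


said = digest_qb64(b"hello")
register(said, b"hello")
ok = verify(said)
```

Every lookup, log line, and comparison in this sketch uses the string as is. Only `verify` pays for a conversion.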

Matter instances (and their duck types Indexer and Counter) don't store the qb64 and qb2 representations. They generate them as properties when referenced. This avoids always doing what may be an unnecessary extra serialization. An instance may be created from (code, raw), from some other input that is used to compute (code, raw), or by parsing a qb64 or qb2 stream, where the instance determines how many bytes to pull from the stream (self-framing). The instance does not distinguish between a stream and framed input for qb2 or qb64, so it always parses. When it parses, it pulls off parts of the serialization one at a time in order to compute the (code, raw). It actually doesn't have the full qb2 or qb64 once it's done, so it would then have to reserialize the (code, raw). If it doesn't parse, it can't ensure that it computes the (code, raw) correctly. Consequently, it must first compute (code, raw), and only then can the corresponding qb2 or qb64 be recomputed from the (code, raw). Recall that it may have been given a qb2 stream or a qb64 stream, not both. So storing all three would mean that in most cases at least one or two spurious reserializations are required.
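The compute-on-reference behavior can be illustrated with a toy class that counts how often it serializes. `LazyPrimitive` and its `serializations` counter are hypothetical and exist only for this example. The encoding assumes a one-character code and a 32-byte raw value.

```python
import base64


class LazyPrimitive:
    """Keeps only (code, raw); qb64 and qb2 are computed when referenced."""

    def __init__(self, code: str, raw: bytes):
        self.code = code
        self.raw = raw
        self.serializations = 0  # instrumentation for this example only

    @property
    def qb64(self) -> str:
        self.serializations += 1
        pad = (3 - len(self.raw) % 3) % 3
        text = base64.urlsafe_b64encode(bytes(pad) + self.raw).decode()
        return self.code + text[pad:]

    @property
    def qb2(self) -> bytes:
        # Goes through qb64, so it also counts as one serialization.
        return base64.urlsafe_b64decode(self.qb64)


prim = LazyPrimitive("D", bytes(32))
count_after_init = prim.serializations
first = prim.qb64
second = prim.qb64
count_after_two_reads = prim.serializations
```

Creating the instance costs no serialization, and each reference to `qb64` costs one. Code that reads `.qb64` repeatedly from a passed-around instance pays that cost every time, whereas code that passes the string pays it once.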

This means that in general, if the use case needs the qb64 rather than the (code, raw), then extract the qb64 once from the qb64, qb2, or (code, raw) input and just pass around the qb64. If a conversion to one of the other forms is needed sometime later, then and only then re-instance in order to do the conversion.
