6 format
sel edited this page 2026-02-08 18:55:41 +01:00

Modular Object Format Specification

This specification makes several references to structs, enums, types, etc. in the implementation found in https://git.linuxposting.xyz/sel/wgu. These references generally begin with crate::.

I'll probably remove those references whenever this document becomes ready enough to stand as a thing of its own.

For now, please treat this as a draft / project documentation / NAND dump of my brain :3

Introduction

The modular object format is an informal, framework-like set of schemas, accompanied by an informal specification, for analyzing and performing basic operations on structured computer data. Units of such data are referred to as object instances.

The basic data structure of object instances is defined in the base template below (see: struct crate::types::ObjectInstance). Object instances are key-value stores, where keys are generally referred to as 'properties'. Property names may contain alphanumerics, - (dashes), and _ (underscores). A . (dot) within a property name represents a level of nesting.
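As a rough sketch of the naming rule above, a property name can be checked segment by segment. The helper name is_valid_property_name is hypothetical and not taken from the implementation. It assumes that 'alphanumerics' means ASCII letters and digits, and that empty segments (such as a leading dot or two dots in a row) are invalid; neither is stated in this document.

```rust
/// Returns true if `name` is a dot-separated path whose segments are
/// non-empty and contain only ASCII alphanumerics, '-' or '_'.
/// Hypothetical helper for illustration; not part of the wgu crate.
fn is_valid_property_name(name: &str) -> bool {
    name.split('.').all(|segment| {
        // Assumption: an empty segment is never valid.
        !segment.is_empty()
            && segment
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
    })
}
```

With these assumptions, input.value and remote.some-thing_2 are accepted, while input..value and names containing spaces are rejected.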

When transmitted over a network or stored on disk, object instances are generally serialized into JSON object literals.

Data within object instances is divided into three categories: input, local, and remote, each corresponding to the origin of the data it contains. The inner structure of these categories is determined by a module-defined template, which is specified by the object_type property. Metadata (version, object_type, timestamps, hashes, etc.) is stored alongside these categories. The full template of any object instance is the combination of the base template and the module-defined template; in other words, the module-defined template is merely an extension of the base template.

For referencing object instances, the properties version and input_sha256 are used, separated by a . (dot) symbol.

Example: 0.abcd...1234

To refer to a property within an object, add a : (colon) symbol, followed by the property name.

Example: 0.abcd...1234:input.value

If the property being referred to is an array, the position of the target value must also be specified within [] (square brackets):

Example: 0.abcd...1234:remote.some.thing[2]
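The reference syntax above could be parsed roughly as follows. This is only a sketch: the Reference struct and parse_reference function are hypothetical names, the version is assumed to fit in a u64, and the hash is taken to be everything between the first dot and the first colon, without checking that it is a real SHA-256 hex digest.

```rust
/// A parsed object or property reference. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
struct Reference {
    version: u64,
    input_sha256: String,
    /// Property path after the ':' (if any), without the array index.
    property: Option<String>,
    /// Array position given in square brackets (if any).
    index: Option<usize>,
}

/// Parses `version.hash`, `version.hash:property` or `version.hash:property[n]`.
fn parse_reference(s: &str) -> Option<Reference> {
    // Split off the optional ":property" part first.
    let (object, property_part) = match s.split_once(':') {
        Some((object, property)) => (object, Some(property)),
        None => (s, None),
    };
    // The version is everything before the first dot, the hash everything after.
    let (version, hash) = object.split_once('.')?;
    let version = version.parse::<u64>().ok()?;
    if hash.is_empty() {
        return None;
    }
    let (property, index) = match property_part {
        None => (None, None),
        // A trailing "[n]" selects one position of an array property.
        Some(p) => match p.strip_suffix(']').and_then(|rest| rest.split_once('[')) {
            Some((name, idx)) => (Some(name.to_string()), Some(idx.parse::<usize>().ok()?)),
            None => (Some(p.to_string()), None),
        },
    };
    Some(Reference {
        version,
        input_sha256: hash.to_string(),
        property,
        index,
    })
}
```

The property path itself is not validated here; a stricter parser would also apply the property naming rules.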

Base template

The following properties serve as the starting point for the structure of all object instances.

Categories

| property | transforms | subobjects | conditions | array | value |
|---|---|---|---|---|---|
| input | none | none | none | false | Holds essential data needed for defining the object instance's content, and populating other properties |
| local | none | none | none | false | Holds locally calculated data |
| remote | none | none | none | false | Holds remotely obtained data |
| logs | none | none | none | true | Holds logs produced in relation to the object instance |

General

| property | transforms | subobjects | conditions | array | value |
|---|---|---|---|---|---|
| object_type | meta/text:value | none | none | false | Specifies the object's expected content. Content before the / (slash) is the module name, after that . (dots) are used to separate layers of nesting |
| version | meta/number:value | none | none | false | Used for differentiating between multiple object instances with the same input_sha256. Starts at 0 (zero) for each hash. |

Timestamps

| property | transforms | subobjects | conditions | array | value |
|---|---|---|---|---|---|
| created | meta/timestamp:unix-ms | none | none | false | Timestamp of the object instance's creation |
| edited | meta/timestamp:unix-ms | none | none | false | Timestamp of the object instance's last manual edit |
| hashed | meta/timestamp:unix-ms | none | none | false | Timestamp of input_sha256 being calculated |

Hashing

| property | transforms | subobjects | conditions | array | value |
|---|---|---|---|---|---|
| input_sha256 | meta/text:value | none | none | false | Used for naming, querying, and referencing object instances (together with the version property) |
| full_sha256 | meta/text:value | none | none | false | Used for validating object instances (together with properties under timestamps) |

input_sha256 is calculated as the SHA-256 checksum of the value of object_type and the serialized contents of input (including the opening and closing braces), joined by a + (plus) symbol.

Example: meta/dummy+{"value":"you are now breathing manually"} -> 5aea ... b0ca

full_sha256 is calculated as the SHA-256 checksum of the value of object_type, the timestamp hashed, and the serialized categories input, local, and remote (each including its opening and closing braces), joined by + (plus) symbols.

Example: meta/dummy+1735689600000+{"value":"you are now breathing manually"}+{}+{} -> 7f17 ... cbdc
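The strings that get hashed in the two examples above can be assembled as in the sketch below. The function names input_preimage and full_preimage are hypothetical. The sketch assumes the categories are already serialized to JSON exactly as they should be hashed, since key order and whitespace rules are not specified here. It stops before the digest step, because the Rust standard library has no SHA-256; that part is left to whichever hashing crate the implementation uses.

```rust
/// Builds the string that is hashed to obtain `input_sha256`.
/// `input_json` is assumed to be the already-serialized `input` category,
/// including its braces.
fn input_preimage(object_type: &str, input_json: &str) -> String {
    format!("{}+{}", object_type, input_json)
}

/// Builds the string that is hashed to obtain `full_sha256`.
/// `hashed_ms` is the `hashed` timestamp in Unix milliseconds.
fn full_preimage(
    object_type: &str,
    hashed_ms: u64,
    input_json: &str,
    local_json: &str,
    remote_json: &str,
) -> String {
    let hashed = hashed_ms.to_string();
    // All five parts are joined by '+' in the order given by the specification.
    [object_type, hashed.as_str(), input_json, local_json, remote_json].join("+")
}
```

Feeding the values from the examples into these helpers yields exactly the strings shown to the left of the arrows.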

Logging

| property | transforms | subobjects | conditions | array | value |
|---|---|---|---|---|---|
| logs.level | meta/text:value | none | logs.level: error, warn, info, debug | false | Categorizing the log entry by severity level |
| logs.variant | meta/text:value | none | none | false | Categorizing the log entry generally |
| logs.timestamp | meta/timestamp:unix-ms | none | none | false | Timestamp of the log entry |
| logs.message | meta/text:value | none | none | false | Human-readable message |

Transformations

A transformation (most commonly referred to as a 'transform') is an operation that takes the value of any property of an object instance and places it into a property under input of a new object instance. These are some examples of common transforms:

  • existing meta/text:local.length -> new meta/number:value
  • existing net/url:local.domain -> new net/domain:value
  • existing social/twitter.post:remote.posted -> new meta/timestamp:unix-ms

(all of these examples are only demonstrational)

Outgoing transforms of any property are defined in the template of the source object under the source property as the destination object & property, which must always be under input.

If an entire object template is used as an outgoing transform, a level of nesting is created within the source property, where the properties under input of the destination object template are placed.

Subobjects

Object instances may have parent-child relationships with other object instances as defined in the parent object instance's template, provided the parent and child object instances share a property. This works similarly to foreign keys in SQL, and is commonly used when a piece of data is categorized under another in the data origin. These are some examples:

  • social/twitter.post is a subobject of social/twitter.user, using remote.author
  • web/youtube.video is a subobject of web/youtube.user, using remote.author
  • web/gitea.repository is a subobject of web/gitea.user and web/gitea.organization, using remote.owner

(all of these examples are only demonstrational)

Similarly to transformations, subobjects are defined in the template of the parent object, as the child object & the two objects' shared property (most commonly an identifier, such as an incrementing number or UUID). This shared property must have the same outgoing transforms in both the parent and child.

Object templates

Object templates (also referred to as 'templates') are collections of TOML tables, each representing and named after a property. Modules may expose templates (see: enum variant crate::modules::ModuleItem::Template), which extend the base template for object instances of a specific object_type.

For referencing templates, the name of the module and the name of the template are used, separated by a / (slash) symbol.

Example: meta/text
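A template reference (and thus an object_type value) can be split into its module name and the nested layers of the template name. The following is a minimal sketch; split_object_type is a hypothetical helper, and rejecting an empty module or template name is an assumption.

```rust
/// Splits an object type such as "social/twitter.post" into its module name
/// and the dot-separated layers of the template name.
/// Hypothetical helper for illustration; not part of the wgu crate.
fn split_object_type(object_type: &str) -> Option<(&str, Vec<&str>)> {
    // Everything before the first '/' is the module name.
    let (module, template) = object_type.split_once('/')?;
    if module.is_empty() || template.is_empty() {
        return None;
    }
    // Dots in the template name separate layers of nesting.
    Some((module, template.split('.').collect()))
}
```

For meta/text this gives the module meta and the single layer text; for social/twitter.post it gives the module social and the layers twitter and post.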

A good example is the (possibly simplest) module-exposed template of the meta/dummy object type.

[input.value]
transforms = ["meta/text:value"]
subobjects = []
conditions = {}
array = false

Which then looks like this in documentation:

| property | transforms | subobjects | conditions | array | value |
|---|---|---|---|---|---|
| input.value | meta/text:value | none | none | false | dummy text value |

transforms

Array of object types and their properties under input that this property transforms to. Each entry joins the object type and the property with a : (colon) symbol.

subobjects

Array of objects and their properties under input whose outgoing transforms this property matches. Each entry joins the object and the property with a : (colon) symbol.

conditions

Inline table of arrays. Each array is named after a reference to a property within the object (which may be the property itself) and lists the values that the referenced property must have for this property to be valid in resulting object instances. Some examples:

conditions = {"input.value" = ["owo"], "input.version" = ["fact-checked by real transgender patriots"]}
conditions = {"input.version" = ["function", "transform", "subobject", "custom"]}

If the parent property of a property with conditions is an array, the conditions apply to all items of the array.
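Evaluating a conditions table could look something like the sketch below. It is deliberately simplified: conditions_met is a hypothetical function, property values are treated as plain strings in a flat map keyed by property path, a referenced property that is missing counts as a failed condition (an assumption), and the array case described above is not handled.

```rust
use std::collections::HashMap;

/// Returns true if every property named in `conditions` is present in
/// `properties` and holds one of its allowed values.
/// Hypothetical helper; only covers non-array properties.
fn conditions_met<'a>(
    conditions: &HashMap<&'a str, Vec<&'a str>>,
    properties: &HashMap<&'a str, &'a str>,
) -> bool {
    conditions.iter().all(|(property, allowed)| {
        properties
            .get(property)
            // Assumption: a missing property fails the condition.
            .map_or(false, |value| allowed.contains(value))
    })
}
```

An empty conditions table, as in the meta/dummy template, is always satisfied under this reading.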

array

If true, the property will act as a collection of properties (referred to as an 'array'). When used as a transform destination, arrays add the source property's value to the first (0th) position.

Functions

Functions (see: enum variant crate::modules::ModuleItem::Function) are for calculating more object instances from existing object instances.

Functions are also used for ensuring the integrity and accuracy of data contained within object instances. Such functions (known as 'validators') take only one object instance, and return one log (see: struct crate::types::Log) about the state of said object instance. In general, validators have the name validator; thus, they are addressed as object_type:func:validator.

Functions take one or two key-value stores (for inputs and parameters respectively), where the key is an item's name, and the value is a vector of object instances (see: type crate::modules::FunctionData, HashMap<String, Vec<ObjectInstance>>). The difference between inputs and parameters is that inputs are generally data the function works with, while parameters are generally instructions for working with this data.

A simple example of that would be a function which produces a SHA(256|512) digest of a meta/text:input.value:

[conditions]
"params.algorithm:input.value" = ["sha256", "sha512"]

[inputs.main]
required = true
object_type = "meta/text"
amount = 1

[params.algorithm]
required = true
object_type = "meta/text"
amount = 1

[outputs.main]
required = true
object_type = "meta/text"
amount = 1

Functions return a vector of logs, as well as a key-value store, where the key is an item's name, and the value is a data structure containing instructions to create resulting object instances (see: type crate::modules::FunctionResult, (Vec<Log>, HashMap<String, FunctionOutput>)) (see: struct crate::modules::FunctionOutput).

Note that functions themselves don't create object instances on their own. They merely return the instructions needed (an object_type and values of input, local, and remote) to do so by the implementation.

Function templates

The below applies to the function as a whole.

conditions

Table containing arrays named after references to items within the function, as well as to properties of the object instances within, separated by a : (colon) symbol, containing values the target property must have for the function as a whole to be valid.

The below apply to all function items (inputs, parameters, and outputs).

required

Boolean indicating if the item must be present in any instance of the function.

object_type

The object_type of the object instances present within the item.

amount

The expected amount of object instances present within the item.
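The three per-item keys could be checked against the contents of an item roughly like this. The names ItemSpec and item_matches are hypothetical, and the sketch reads amount as an exact count, which this document does not spell out. Object instances are reduced to their object_type strings to keep the example short.

```rust
/// One entry of a function template (an input, parameter, or output item).
/// Hypothetical type for illustration; not part of the wgu crate.
struct ItemSpec<'a> {
    required: bool,
    object_type: &'a str,
    amount: usize,
}

/// `instances` holds the object_type of each object instance supplied for
/// the item, or None if the item is absent altogether.
fn item_matches(spec: &ItemSpec<'_>, instances: Option<&[&str]>) -> bool {
    match instances {
        // An absent item is only acceptable if it is not required.
        None => !spec.required,
        // Assumption: `amount` is an exact count, not a minimum or maximum.
        Some(types) => {
            types.len() == spec.amount && types.iter().all(|t| *t == spec.object_type)
        }
    }
}
```

For the digest example earlier, inputs.main would accept exactly one meta/text object instance and reject a missing item, a second instance, or an instance of another object type.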

Modularity

The modular object format is, by definition, modular.

Module items (see: enum crate::modules::ModuleItem) should be exposed as a key-value store, where the key is a reference string and the value is the module item (see: function crate::modules::item_registry -> HashMap<&'static str, ModuleItem>).

The reference strings for module items are defined as follows:

| variant of crate::modules::ModuleItem | reference string |
|---|---|
| Template | object type |
| Function | (optional) object type:func:name |
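A registry lookup key can be classified by its shape, as sketched below. ItemRef and parse_item_ref are hypothetical names. One detail is an outright guess: when the optional object type is left out of a function reference, the sketch assumes the string is written as func:name without a leading colon.

```rust
/// The two kinds of module item reference strings.
/// Hypothetical type for illustration; not part of the wgu crate.
#[derive(Debug, PartialEq)]
enum ItemRef<'a> {
    Template { object_type: &'a str },
    Function { object_type: Option<&'a str>, name: &'a str },
}

fn parse_item_ref(s: &str) -> Option<ItemRef<'_>> {
    if s.is_empty() {
        return None;
    }
    // "object_type:func:name" is a function tied to an object type.
    if let Some((object_type, name)) = s.split_once(":func:") {
        if object_type.is_empty() || name.is_empty() {
            return None;
        }
        return Some(ItemRef::Function { object_type: Some(object_type), name });
    }
    // Assumption: a function without an object type is written "func:name".
    if let Some(name) = s.strip_prefix("func:") {
        if name.is_empty() {
            return None;
        }
        return Some(ItemRef::Function { object_type: None, name });
    }
    // Anything else is treated as a template reference (an object type).
    Some(ItemRef::Template { object_type: s })
}
```

Under these assumptions, meta/text resolves to a template, and meta/text:func:validator resolves to the validator function of that object type.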