added more prose on functions and did some minor copyedits
parent
5ce88985f7
commit
ff6ee67261
1 changed files with 61 additions and 21 deletions
82
format.md
|
|
@ -1,14 +1,20 @@
|
|||
# Modular Object Format Specification
|
||||
|
||||
> This specification makes several references to structs, enums, types, etc. in the implementation found in https://git.linuxposting.xyz/sel/wgu. These references generally begin with `crate::`.
|
||||
>
|
||||
> I'll probably remove those references whenever this document becomes ready enough to stand as a thing of its own.
|
||||
>
|
||||
> For now, please treat this as a draft / project documentation / NAND dump of my brain :3
|
||||
|
||||
## Introduction
|
||||
|
||||
The **modular object format** is a framework-esque *informal* set of schemas & an *informal* specification, for analyzing and performing basic operations on structured computer data, referred to as **object instances**.
|
||||
|
||||
The basic data structure of object instances is defined in the struct `wgu::types::ObjectInstance` / by the **base template**, available below. Object instances are key-value stores, where keys are generally referred to as 'properties'. Property names may contain [alphanumerics](https://en.wikipedia.org/wiki/Alphanumericals), `-` (dashes), and `_` (underscores). `.` (dots) represent a level of nesting.
|
||||
The basic data structure of object instances is defined in the **base template** below (see: struct `crate::types::ObjectInstance`). Object instances are key-value stores, where keys are generally referred to as 'properties'. Property names may contain [alphanumerics](https://en.wikipedia.org/wiki/Alphanumericals), `-` (dashes), and `_` (underscores). `.` (dots) represent a level of nesting.
|
||||
|
||||
When transmitted over a network or stored on disk, object instances are generally serialized into [JSON](https://www.json.org/json-en.html) [object literals](https://benalman.com/news/2010/03/theres-no-such-thing-as-a-json/).
|
||||
|
||||
Data within object instances is divided into three categories, `input`, `local`, and `remote`, each corresponding to the origin of data contained within. The inner structure of these categories is determined by a module-defined template, which is specified by the `object_type` property. Metadata (`version`, `object_type`, timestamps, hashes, etc.) is stored alongside these categories.
|
||||
Data within object instances is divided into three categories, `input`, `local`, and `remote`, each corresponding to the origin of data contained within. The inner structure of these categories is determined by a module-defined template, which is specified by the `object_type` property. Metadata (`version`, `object_type`, timestamps, hashes, etc.) is stored alongside these categories. **The full template of any object instance is the combination of the base template and the module-defined template - as in, the module-defined template is merely an extension to the base template.**
|
||||
|
||||
For referencing object instances, the properties `version` and `input_sha256` are used, separated by a `.` (dot) symbol.
|
||||
|
||||
|
|
@ -18,7 +24,7 @@ To refer to a property within an object, add a `:` (colon) symbol, followed by t
|
|||
|
||||
> **Example:** `0.abcd...1234:input.value`
|
||||
|
||||
if the property you're referring to is an array, the position of the target value must also be specified within `[]` (square brackets):
|
||||
If the property being referred to is an array, the position of the target value must also be specified within `[]` (square brackets):
|
||||
|
||||
> **Example:** `0.abcd...1234:remote.some.thing[2]`
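The reference syntax above can be sketched as a small parser. This is only an illustration and not the implementation's own code: the `Reference` struct and `parse_reference` function are hypothetical names, the version is kept as a plain string, property names are assumed to be ASCII-only, and the index is parsed as-is without assuming where counting starts.

```rust
/// Hypothetical parsed form of a reference (not a type from the implementation).
#[derive(Debug, PartialEq)]
pub struct Reference {
    pub version: String,
    pub input_sha256: String,
    /// Property path split on `.`; empty when the reference names no property.
    pub property: Vec<String>,
    /// Array position from a trailing `[n]`, if present.
    pub index: Option<usize>,
}

/// Parses `version.input_sha256`, optionally followed by `:property.path[index]`.
pub fn parse_reference(s: &str) -> Option<Reference> {
    // The first ':' separates the instance from the optional property.
    let (instance, property) = match s.split_once(':') {
        Some((instance, property)) => (instance, Some(property)),
        None => (s, None),
    };
    // The first '.' separates the version from the hash.
    let (version, hash) = instance.split_once('.')?;
    if version.is_empty() || hash.is_empty() {
        return None;
    }

    let mut path = Vec::new();
    let mut index = None;
    if let Some(mut p) = property {
        // A trailing `[n]` selects an item of an array property.
        if let Some(stripped) = p.strip_suffix(']') {
            let (head, idx) = stripped.rsplit_once('[')?;
            index = Some(idx.parse::<usize>().ok()?);
            p = head;
        }
        // Assumption: names are ASCII alphanumerics, '-' and '_'; '.' nests.
        for segment in p.split('.') {
            let valid = !segment.is_empty()
                && segment
                    .chars()
                    .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
            if !valid {
                return None;
            }
            path.push(segment.to_string());
        }
    }

    Some(Reference {
        version: version.to_string(),
        input_sha256: hash.to_string(),
        property: path,
        index,
    })
}
```

Returning `None` for any malformed part keeps the sketch short; a real parser would likely report which part failed.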
|
||||
|
||||
|
|
@ -45,8 +51,6 @@ The following properties serve as the starting point for the structure of all ob
|
|||
|
||||
### Timestamps
|
||||
|
||||
> **this section needs more expansion!** you can contribute by documenting how this feature or module operates in the latest available version (also check for pull requests that would change that)
|
||||
|
||||
| property | transforms | subobjects | conditions | array | value |
|
||||
| --------- | ------------------------ | ---------- | ---------- | ----- | --------------------------------------------------- |
|
||||
| `created` | `meta/timestamp:unix-ms` | none | none | false | Timestamp of the object instance's creation |
|
||||
|
|
@ -85,6 +89,7 @@ A transformation (most commonly referred to as a 'transform') is an operation in
|
|||
- existing `meta/text:local.length` -> new `meta/number:value`
|
||||
- existing `net/url:local.domain` -> new `net/domain:value`
|
||||
- existing `social/twitter.post:remote.posted` -> new `meta/timestamp:unix-ms`
|
||||
|
||||
<small>(all of these examples are only demonstrational)</small>
|
||||
|
||||
Outgoing transforms of any property are defined in the [template of the source object](#transforms-1) under the source property as the destination object & property, which must always be under `input`.
|
||||
|
|
@ -98,13 +103,14 @@ Object instances may have parent-child relationships with other object instances
|
|||
- `social/twitter.post` is a subobject of `social/twitter.user`, using `remote.author`
|
||||
- `web/youtube.video` is a subobject of `web/youtube.user`, using `remote.author`
|
||||
- `web/gitea.repository` is a subobject of `web/gitea.user` and `web/gitea.organization`, using `remote.owner`
|
||||
|
||||
<small>(all of these examples are only demonstrational)</small>
|
||||
|
||||
Similarly to [transformations](#transformations), subobjects are defined in the [template of the parent object](#subobjects-1), as the child object & the two objects' shared property (most commonly an identifier, such as an incrementing number or UUID). This shared property must have the same [outgoing transforms](#transforms) in both the parent and child.
|
||||
|
||||
## Template structure
|
||||
## Object templates
|
||||
|
||||
Object templates (also referred to as 'templates') are a collection of [TOML tables](https://toml.io/en/v1.1.0#table) representing & named after each property. Modules may expose templates (enum variant `wgu::modules::ModuleItem::Template`), which extend the base template for object instances of a specific `object_type`.
|
||||
Object templates (also referred to as 'templates') are a collection of [TOML tables](https://toml.io/en/v1.1.0#table) representing & named after each property. Modules may expose templates (see: enum variant `crate::modules::ModuleItem::Template`), which extend the base template for object instances of a specific `object_type`.
|
||||
|
||||
For referencing templates, the name of the module and the name of the template are used, separated by a `/` (slash) symbol.
|
||||
|
||||
|
|
@ -120,7 +126,7 @@ conditions = {}
|
|||
array = false
|
||||
```
|
||||
|
||||
Which then looks like this in the documentation:
|
||||
Which then looks like this in documentation:
|
||||
|
||||
| property | transforms | subobjects | conditions | array | value |
|
||||
| ------------- | ----------------- | ---------- | ---------- | ----- | ---------------- |
|
||||
|
|
@ -136,14 +142,14 @@ Array of objects & their properties under `input`, that this property matches th
|
|||
|
||||
### `conditions`
|
||||
|
||||
Object of arrays of references to other properties within the object, which may be itself, combined with values the target property must have for this property to be valid in the object instance. Some examples:
|
||||
Inline table containing arrays named after references to other properties within the object (which may be itself), containing values the target property must have for this property to be valid in resulting object instances. Some examples:
|
||||
|
||||
```toml
|
||||
conditions = {input.value: ["owo"], input.version: ["fact-checked by real transgender patriots"]}
|
||||
conditions = {"input.value" = ["owo"], "input.version" = ["fact-checked by real transgender patriots"]}
|
||||
```
|
||||
|
||||
```toml
|
||||
conditions = {input.version: ["function", "transform", "subobject", "custom"]}
|
||||
conditions = {"input.version" = ["function", "transform", "subobject", "custom"]}
|
||||
```
|
||||
|
||||
If the parent property of a property with conditions is an array, the conditions apply to all items of the array.
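As a rough sketch of how such conditions might be evaluated, the function below checks a `conditions` table against a flattened view of an object instance. It rests on several assumptions that the text does not state: values are compared as strings, every listed condition must hold, a missing target property counts as a failure, and the instance is represented as a flat map keyed by dotted property paths. `conditions_met` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Returns true when every condition's target property holds one of the
/// allowed values. `instance` maps dotted property paths (for example
/// "input.version") to string values; this flat layout is an assumption.
pub fn conditions_met(
    conditions: &HashMap<String, Vec<String>>,
    instance: &HashMap<String, String>,
) -> bool {
    conditions.iter().all(|(target, allowed)| {
        instance
            .get(target)
            // A missing target property is treated as a failed condition.
            .map_or(false, |value| allowed.contains(value))
    })
}
```

An empty `conditions` table (as in `conditions = {}`) trivially passes, since there is nothing to check.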
|
||||
|
|
@ -154,35 +160,69 @@ If `true`, the property will act as a collection of properties (referred to as a
|
|||
|
||||
## Functions
|
||||
|
||||
> a lot of things here will have an 'in general' before them! those things are not a strict requirement of the specification, but are heavily recommended.
|
||||
Functions (see: enum variant `crate::modules::ModuleItem::Function`) are for calculating more object instances from existing object instances.
|
||||
|
||||
Functions (enum variant `wgu::modules::ModuleItem::Function`) are for calculating more object instances from existing object instances. Functions are also used for ensuring the integrity and accuracy of data contained within object instances. Such functions (known as 'validators') take one object instance, and return one log (struct `wgu::types::Log`) about the state of said object instance. In general, validators have the name `validator` - thus, they are addressed as *`object_type`*`:func:validator`
|
||||
Functions are also used for ensuring the integrity and accuracy of data contained within object instances. Such functions (known as 'validators') take only one object instance, and return one log (see: struct `crate::types::Log`) about the state of said object instance. In general, validators have the name `validator` - thus, they are addressed as *`object_type`*`:func:validator`.
|
||||
|
||||
A function takes a `FunctionData` as input and optionally another as parameters. The difference between inputs and parameters is that (in general) inputs are data the function works with, while parameters are the specifics of handling the data itself. A good example of that would be a function which produces an SHA digest of a `meta/text:input.value` (function templates will be explained in a bit):
|
||||
Functions take one or two key-value stores (for inputs and parameters respectively), where the key is an item's name, and the value is a vector of object instances (see: type `crate::modules::FunctionData`, `HashMap<String, Vec<ObjectInstance>>`). The difference between inputs and parameters is that inputs are generally data the function works with, while parameters are generally instructions for working with this data.
|
||||
|
||||
A simple example of that would be a function which produces a SHA(256|512) digest of a `meta/text:input.value`:
|
||||
|
||||
```toml
|
||||
[conditions]
|
||||
"params.algorithm:input.value" = ["sha256", "sha512"]
|
||||
|
||||
[inputs.main]
|
||||
required = true
|
||||
object_type = "meta/text"
|
||||
conditions = {}
|
||||
amount = 1
|
||||
|
||||
[params.algorithm]
|
||||
required = true
|
||||
object_type = "meta/text"
|
||||
conditions = {}
|
||||
amount = 1
|
||||
|
||||
[outputs.main]
|
||||
required = true
|
||||
object_type = "meta/text"
|
||||
amount = 1
|
||||
```
|
||||
|
||||
Functions return a vector of logs, as well as a key-value store, where the key is an item's name, and the value is a data structure containing instructions to create resulting object instances (see: type `crate::modules::FunctionResult`, `(Vec<Log>, HashMap<String, FunctionOutput>)`, and struct `crate::modules::FunctionOutput`).
|
||||
|
||||
Note that functions themselves don't create object instances on their own. They merely return the instructions (an `object_type` and values of `input`) that the implementation needs to create them.
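To make the shapes above concrete, here is a minimal sketch of a function that takes a `FunctionData` and returns a `FunctionResult`. The two type aliases follow the signatures quoted in this section, but the struct definitions are simplified stand-ins (the real `ObjectInstance`, `Log`, and `FunctionOutput` are not reproduced here and may differ). `text_length` is a hypothetical toy function, and the `meta/number` output type is borrowed from the demonstrational transform examples.

```rust
use std::collections::HashMap;

// Simplified stand-ins for the implementation's structs (assumed fields).
#[derive(Debug, Clone)]
pub struct ObjectInstance {
    pub object_type: String,
    pub input: HashMap<String, String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Log {
    pub message: String,
}

/// Instructions for creating an object instance: an `object_type` and
/// the values of `input`.
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionOutput {
    pub object_type: String,
    pub input: HashMap<String, String>,
}

pub type FunctionData = HashMap<String, Vec<ObjectInstance>>;
pub type FunctionResult = (Vec<Log>, HashMap<String, FunctionOutput>);

/// Toy function: reads `input.value` of the first instance in the `main`
/// item and emits instructions for a `meta/number` holding its length.
pub fn text_length(inputs: &FunctionData, _params: Option<&FunctionData>) -> FunctionResult {
    let mut logs = Vec::new();
    let mut outputs = HashMap::new();

    let text = inputs
        .get("main")
        .and_then(|items| items.first())
        .and_then(|instance| instance.input.get("value"));

    match text {
        Some(text) => {
            let mut input = HashMap::new();
            input.insert("value".to_string(), text.chars().count().to_string());
            // Only instructions are returned; no object instance is created here.
            outputs.insert(
                "main".to_string(),
                FunctionOutput {
                    object_type: "meta/number".to_string(),
                    input,
                },
            );
        }
        None => logs.push(Log {
            message: "missing inputs.main or its input.value".to_string(),
        }),
    }

    (logs, outputs)
}
```

The function never constructs an `ObjectInstance` itself, which matches the note above: turning a `FunctionOutput` into an object instance is left to the caller.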
|
||||
|
||||
## Function templates
|
||||
|
||||
> The below applies to the function as a whole.
|
||||
|
||||
### `conditions`
|
||||
|
||||
Table containing arrays named after references to items within the function, as well as to properties of the object instances within, separated by a `:` (colon) symbol, containing values the target property must have for the function as a whole to be valid.
|
||||
|
||||
> The below apply to all function items (inputs, parameters, and outputs).
|
||||
|
||||
### `required`
|
||||
|
||||
Boolean indicating if the item must be present in any instance of the function.
|
||||
|
||||
### `object_type`
|
||||
|
||||
The `object_type` of the object instances present within the item.
|
||||
|
||||
### `amount`
|
||||
|
||||
The expected amount of object instances present within the item.
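The three item-level keys could be checked along the following lines. This is a sketch under assumptions the text leaves open: `amount` is read as an exact count, an absent item is acceptable only when `required` is false, and the item is represented simply by the `object_type` strings of its instances. `ItemTemplate` and `check_item` are hypothetical names.

```rust
/// Hypothetical in-memory form of one function item template.
pub struct ItemTemplate {
    pub required: bool,
    pub object_type: String,
    pub amount: usize,
}

/// Checks one item against its template. `present` holds the `object_type`
/// of each object instance in the item, or `None` if the item is absent.
pub fn check_item(template: &ItemTemplate, present: Option<&[&str]>) -> Result<(), String> {
    let object_types = match present {
        Some(types) => types,
        None if template.required => return Err("required item is missing".to_string()),
        None => return Ok(()),
    };
    // Assumption: `amount` is an exact count, not a minimum or maximum.
    if object_types.len() != template.amount {
        return Err(format!(
            "expected {} object instance(s), found {}",
            template.amount,
            object_types.len()
        ));
    }
    if let Some(wrong) = object_types.iter().find(|t| **t != template.object_type) {
        return Err(format!(
            "expected object_type {}, found {}",
            template.object_type, wrong
        ));
    }
    Ok(())
}
```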
|
||||
|
||||
## Modularity
|
||||
|
||||
The modular object format is, by definition, modular.
|
||||
|
||||
Module items (enum `wgu::modules::ModuleItem`) should be exposed as a key-value store, where the key is a reference string and the value is the module item (see function `wgu::modules::item_registry` -> `HashMap<&'static str, ModuleItem>`).
|
||||
Module items (see: enum `crate::modules::ModuleItem`) should be exposed as a key-value store, where the key is a reference string and the value is the module item (see: function `crate::modules::item_registry` -> `HashMap<&'static str, ModuleItem>`).
|
||||
|
||||
The reference strings for module items are defined as follows:
|
||||
|
||||
| variant of `wgu::modules::ModuleItem` | reference string |
|
||||
| ------------------------------------- | ------------------------------------------ |
|
||||
| `Template` | *`object type`* |
|
||||
| `Function` | *`(optional) object type`*`:func:`*`name`* |
|
||||
| variant of `crate::modules::ModuleItem` | reference string |
|
||||
| --------------------------------------- | ------------------------------------------ |
|
||||
| `Template` | *`object type`* |
|
||||
| `Function` | *`(optional) object type`*`:func:`*`name`* |
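The two reference string shapes in the table can be built with a pair of small helpers. These are illustrative only: `template_reference` and `function_reference` are hypothetical names, and the form used when the object type is omitted (a leading `:func:`) is a literal reading of the table rather than something the text confirms.

```rust
/// Reference string for a template: module name and template name joined
/// by '/', which is also the `object_type` it defines.
pub fn template_reference(module: &str, template: &str) -> String {
    format!("{}/{}", module, template)
}

/// Reference string for a function, with an optional object type in front.
/// Assumption: without an object type the string starts with ":func:".
pub fn function_reference(object_type: Option<&str>, name: &str) -> String {
    format!("{}:func:{}", object_type.unwrap_or(""), name)
}
```

For instance, `function_reference(Some(&template_reference("meta", "text")), "validator")` yields `meta/text:func:validator`, the validator address form described under Functions.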
|
||||