scratch/.agents/instructions/ArchieML/ARCHIEML-BETTY.md
2026-05-11 22:00:11 -04:00


# ArchieML-Veronica Language Reference
Veronica is a dialect of [ArchieML](https://archieml.org) designed for newsrooms and content editors. Where ArchieML is deliberately "forgiving," Veronica makes some syntax more explicit to prevent common parsing errors, especially in multiline content and nested structures.
## Table of Contents
- [Keys and Values](#keys-and-values)
- [Objects](#objects)
  - [Nested Objects](#nested-objects)
  - [Named Closures](#named-closures-veronica-extension)
- [Arrays](#arrays)
  - [Arrays of Objects](#arrays-of-objects)
  - [Arrays of Strings (Simple Arrays)](#arrays-of-strings-simple-arrays)
  - [Freeform Arrays](#freeform-arrays)
  - [Nested Arrays](#nested-arrays)
  - [Named Array Closures](#named-array-closures-veronica-extension)
  - [Repeating Keys in Arrays](#repeating-keys-in-arrays-veronica-extension)
- [Multiline Values](#multiline-values)
  - [Veronica Multiline Syntax](#veronica-multiline-syntax-veronica-extension)
  - [ArchieML Multiline Syntax](#archieml-multiline-syntax)
- [Comments and Ignored Content](#comments-and-ignored-content)
- [Escaping](#escaping)
- [Hooks and Customization](#hooks-and-customization)
- [Complete Example](#complete-example)
- [Summary of Veronica Extensions](#summary-of-veronica-extensions)
## Keys and Values
The simplest form of ArchieML is a key-value pair: a key, a colon, and a value, all on one line.
```
title: My Article
author: Jane Smith
published: 2024-01-15
```
**Result:**
```json
{
  "title": "My Article",
  "author": "Jane Smith",
  "published": "2024-01-15"
}
```
### Key Rules
- Keys can contain letters, numbers, underscores, hyphens, and Unicode characters
- Keys **cannot** contain: whitespace, `{`, `}`, `[`, `]`, `:`, `.`, or `+` (a `.` is read as a separator between nested keys; see [Dot Notation](#dot-notation))
- Keys are case-sensitive (`Title` and `title` are different)
- Whitespace around keys and values is automatically trimmed
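
The key rules above can be approximated with one regular expression. This is a minimal sketch, not Veronica's actual implementation: `KEY_SEGMENT`, `KEY_VALUE_LINE`, and `parseKeyValueLine` are hypothetical names, and the real parser may treat edge cases differently.

```javascript
// Illustrative only: the names below are hypothetical, not Veronica's API.
// One key segment: anything except whitespace and { } [ ] : . +
const KEY_SEGMENT = "[^\\s{}\\[\\]:.+]+";
// A full key is one or more segments joined by dots (dot notation).
const KEY_VALUE_LINE = new RegExp(
  `^\\s*(${KEY_SEGMENT}(?:\\.${KEY_SEGMENT})*)\\s*:(.*)$`
);

function parseKeyValueLine(line) {
  const match = KEY_VALUE_LINE.exec(line);
  if (match === null) return null; // not a key-value line, so it is ignored
  // Whitespace around the key is consumed by the pattern; trim the value.
  return { key: match[1], value: match[2].trim() };
}

console.log(parseKeyValueLine("  title :  My Article  "));
console.log(parseKeyValueLine("This is just a note.")); // null
```

Lines that fail the pattern return `null`, which mirrors how non-matching text is ignored rather than rejected.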
### Ignored Content
Any text that doesn't match ArchieML syntax is ignored, allowing you to include notes and comments freely:
```
This is just a note that will be ignored.
title: My Article
Here's another note about the article.
author: Jane Smith
```
## Objects
### Dot Notation
You can create nested objects using dot notation:
```
colors.red: #ff0000
colors.green: #00ff00
colors.blue: #0000ff
```
**Result:**
```json
{
  "colors": {
    "red": "#ff0000",
    "green": "#00ff00",
    "blue": "#0000ff"
  }
}
```
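
To make the mapping from dotted keys to nested objects concrete, here is a small sketch of how such a path could be expanded. `setByPath` is a hypothetical helper written for illustration; it is not part of Veronica and says nothing about how the parser does this internally.

```javascript
// Hypothetical helper: assign `value` at a dotted key path, creating
// intermediate objects as needed.
function setByPath(target, dottedKey, value) {
  const segments = dottedKey.split(".");
  const last = segments.pop();
  let node = target;
  for (const segment of segments) {
    // Replace a missing or non-object intermediate with a fresh object.
    if (typeof node[segment] !== "object" || node[segment] === null) {
      node[segment] = {};
    }
    node = node[segment];
  }
  node[last] = value;
  return target;
}

const doc = {};
setByPath(doc, "colors.red", "#ff0000");
setByPath(doc, "colors.green", "#00ff00");
setByPath(doc, "colors.blue", "#0000ff");
console.log(JSON.stringify(doc));
// {"colors":{"red":"#ff0000","green":"#00ff00","blue":"#0000ff"}}
```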
### Object Blocks
For more complex objects, use curly braces:
```
{colors}
red: #ff0000
green: #00ff00
blue: #0000ff
{}
```
**Result:** Same as above.
### Nested Objects
Prepend a period (`.`) to a block name to nest it within the current object:
```
{author}
name: Jane Smith
email: jane@example.com
{.social}
twitter: @janesmith
github: janesmith
{}
bio: Award-winning journalist
{}
```
**Result:**
```json
{
  "author": {
    "name": "Jane Smith",
    "email": "jane@example.com",
    "social": {
      "twitter": "@janesmith",
      "github": "janesmith"
    },
    "bio": "Award-winning journalist"
  }
}
```
### Closing Objects
You can close an object in several ways:
1. **Empty braces** `{}` - closes the current object
2. **Opening a new object** at the same level
3. **Named closure** (Veronica extension) - see below
### Named Closures (Veronica Extension)
Veronica extends ArchieML with named closures, allowing you to close a specific object by name, even if you're nested several levels deep:
```
{outer}
value: test
{.middle}
value: nested
{.inner}
value: deep
{/middle}
This closes middle and inner, returning to outer scope.
stillOuter: yes
{}
```
**Result:**
```json
{
  "outer": {
    "value": "test",
    "middle": {
      "value": "nested",
      "inner": {
        "value": "deep"
      }
    },
    "stillOuter": "yes"
  }
}
```
**Important:** The slash must be flush with the opening brace: `{/name}` works, but `{ /name }` does not.
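
One way to picture a named closure is as unwinding a stack of open scopes. The sketch below models only that idea; `closeNamed` and the array-of-names representation are hypothetical, and the behavior for an unknown name is an assumption rather than documented Veronica behavior.

```javascript
// Hypothetical model of the open scopes: names of open objects, innermost last.
function closeNamed(scopeStack, name) {
  const index = scopeStack.lastIndexOf(name);
  // Assumption: closing a name that is not open leaves the scopes untouched.
  if (index === -1) return scopeStack.slice();
  // Drop `name` and everything nested inside it.
  return scopeStack.slice(0, index);
}

console.log(closeNamed(["outer", "middle", "inner"], "middle")); // [ 'outer' ]
```

In the example above, `{/middle}` plays the role of `closeNamed(..., "middle")`: both `middle` and `inner` are closed, and subsequent keys land in `outer`.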
## Arrays
### Arrays of Objects
Arrays are defined using square brackets. The first key that repeats signals the start of a new item:
```
[people]
name: Alice
age: 30
name: Bob
age: 25
[]
```
**Result:**
```json
{
  "people": [
    {"name": "Alice", "age": "30"},
    {"name": "Bob", "age": "25"}
  ]
}
```
### Arrays of Strings (Simple Arrays)
For simple lists of strings, use asterisks:
```
[tags]
* news
* technology
* AI
[]
```
**Result:**
```json
{
  "tags": ["news", "technology", "AI"]
}
```
### Freeform Arrays
Freeform arrays (marked with `[+arrayName]`) preserve the order of different types of content. They're useful for mixed content like articles with text, images, and pull quotes:
```
[+content]
This is a paragraph of text.
Another paragraph here.
{.image}
src: photo.jpg
caption: A beautiful photo
{}
More text after the image.
{.quote}
text: An inspiring quotation
author: Famous Person
{}
[]
```
**Result:**
```json
{
  "content": [
    {"type": "text", "value": "This is a paragraph of text."},
    {"type": "text", "value": "Another paragraph here."},
    {"type": "image", "value": {"src": "photo.jpg", "caption": "A beautiful photo"}},
    {"type": "text", "value": "More text after the image."},
    {"type": "quote", "value": {"text": "An inspiring quotation", "author": "Famous Person"}}
  ]
}
```
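
Because every freeform item carries a `type` and a `value`, downstream code can dispatch on `type`. The following is a rough sketch of a consumer, assuming the output shape shown above; `renderers` and `renderFreeform` are hypothetical names, and the HTML is deliberately simplistic (a real renderer should escape values).

```javascript
// Hypothetical renderers keyed by the `type` field of each freeform item.
const renderers = new Map([
  ["text", (value) => `<p>${value}</p>`],
  ["image", (value) =>
    `<figure><img src="${value.src}"><figcaption>${value.caption}</figcaption></figure>`],
  ["quote", (value) => `<blockquote>${value.text} (${value.author})</blockquote>`],
]);

function renderFreeform(items) {
  return items
    .filter((item) => renderers.has(item.type)) // skip types with no renderer
    .map((item) => renderers.get(item.type)(item.value));
}

const content = [
  { type: "text", value: "This is a paragraph of text." },
  { type: "image", value: { src: "photo.jpg", caption: "A beautiful photo" } },
];
console.log(renderFreeform(content).join("\n"));
```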
### Nested Arrays
Prepend a period to an array name to nest it:
```
[sections]
title: Introduction
[.subsections]
heading: Background
content: Context here
heading: Methodology
content: How we did it
[]
[/sections]
```
**Result:**
```json
{
  "sections": [
    {
      "title": "Introduction",
      "subsections": [
        {"heading": "Background", "content": "Context here"},
        {"heading": "Methodology", "content": "How we did it"}
      ]
    }
  ]
}
```
### Named Array Closures (Veronica Extension)
Like objects, arrays support named closures to jump out of nested structures:
```
[parent]
value: test
[.nested]
* one
* two
[/parent]
This is outside the parent array.
```
This is especially useful when you have deeply nested arrays and want to exit multiple levels at once.
### Repeating Keys in Arrays (Veronica Extension)
**This is a key difference from standard ArchieML.**
In Veronica, arrays start a new item when **any** key is redefined, not just the first key:
```
[items]
name: First
description: First item
color: red
description: Second item
name: Second
color: blue
[]
```
**Result:**
```json
{
  "items": [
    {"name": "First", "description": "First item", "color": "red"},
    {"name": "Second", "description": "Second item", "color": "blue"}
  ]
}
```
In standard ArchieML, the second item would need to redefine `name` (the first key) to start a new item. Veronica is more flexible: redefining `description` also triggers a new item.
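
The difference between the two boundary rules can be modeled in a few lines. This is a simplified sketch, not either parser's implementation: `groupItems` is a hypothetical function that takes already-extracted `[key, value]` pairs and a rule name.

```javascript
// Hypothetical model: group ordered [key, value] pairs into array items.
function groupItems(pairs, rule) {
  const items = [];
  let current = null;
  const firstKey = pairs.length > 0 ? pairs[0][0] : null;
  for (const [key, value] of pairs) {
    const repeated =
      current !== null &&
      (rule === "veronica"
        ? Object.prototype.hasOwnProperty.call(current, key) // any key already set
        : key === firstKey); // standard rule: only the first key delimits items
    if (current === null || repeated) {
      current = {};
      items.push(current);
    }
    current[key] = value;
  }
  return items;
}

const pairs = [
  ["name", "First"],
  ["description", "First item"],
  ["color", "red"],
  ["description", "Second item"],
  ["name", "Second"],
  ["color", "blue"],
];

console.log(groupItems(pairs, "veronica"));
console.log(groupItems(pairs, "archieml"));
```

In this model the `"veronica"` rule yields two complete items, while the `"archieml"` rule lets the second `description` overwrite the first item's description before `name` starts the second item.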
## Multiline Values
### Veronica Multiline Syntax (Veronica Extension)
Veronica introduces an explicit multiline syntax that's less ambiguous than ArchieML's `:end` syntax:
```
description::
This is the first line of my description.

This is the second paragraph.

This can contain [brackets] and {braces} safely.
::description
```
**Result:**
```json
{
  "description": "This is the first line of my description.\n\nThis is the second paragraph.\n\nThis can contain [brackets] and {braces} safely."
}
```
**Syntax:**
- Open with `key::` (key followed by double colon)
- Write your content on following lines
- Close with `::key` (double colon followed by the same key name)
This syntax is more explicit and helps prevent accidentally consuming subsequent content.
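
The open/close pairing described above can be sketched as a small line scanner. `extractMultiline` is a hypothetical helper for illustration only; it handles nothing but `key::` blocks, and dropping an unclosed block is an assumption, not documented Veronica behavior.

```javascript
// Hypothetical helper: collect `key::` ... `::key` blocks from raw text.
function extractMultiline(text) {
  const result = {};
  let openKey = null;
  let buffer = [];
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (openKey === null) {
      const open = /^([^\s{}\[\]:.+]+)::$/.exec(trimmed);
      if (open !== null) {
        openKey = open[1];
        buffer = [];
      }
    } else if (trimmed === `::${openKey}`) {
      // Only the matching key closes the block; the body is kept verbatim.
      result[openKey] = buffer.join("\n").trim();
      openKey = null;
    } else {
      buffer.push(line);
    }
  }
  return result; // assumption: a block that is never closed is dropped
}

const sample = "description::\nFirst line.\n\nSecond [para] {x}.\n::description";
console.log(extractMultiline(sample));
```

Because the closer must repeat the key, brackets, braces, and even other `::name` lines inside the body are treated as plain text.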
### ArchieML Multiline Syntax
Veronica still supports the traditional ArchieML multiline syntax with `:end`:
```
description:
This is a multiline value.
It continues until :end is found.
:end
```
**Important:** Within multiline blocks, ArchieML syntax is **not** parsed. To include literal backslashes or the `:end` marker, use a backslash escape:
```
code:
To end a multiline, use \:end
A literal backslash: \\
:end
```
## Comments and Ignored Content
### Block Comments
Use `:skip` and `:endskip` to comment out entire sections:
```
title: My Article
:skip
This entire section is ignored.
author: Will Not Parse
{test}
{}
:endskip
published: 2024-01-15
```
**Result:**
```json
{
  "title": "My Article",
  "published": "2024-01-15"
}
```
### Stop Parsing
Use `:ignore` to immediately stop parsing. Everything after `:ignore` is ignored:
```
title: My Article
author: Jane Smith
:ignore
This and everything below is completely ignored.
Nothing here will be parsed.
```
**Result:**
```json
{
  "title": "My Article",
  "author": "Jane Smith"
}
```
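
The combined effect of `:skip`, `:endskip`, and `:ignore` can be pictured as a filter over input lines. The sketch below is a simplified model with a hypothetical `visibleLines` function; it assumes each command sits alone on its line and is matched case-insensitively, which may not match every detail of the real parser.

```javascript
// Hypothetical pre-pass: return only the lines a parser would still see.
function visibleLines(text) {
  const kept = [];
  let skipping = false;
  for (const line of text.split(/\r?\n/)) {
    const command = line.trim().toLowerCase();
    if (command === ":ignore") break; // nothing after this line is parsed
    if (command === ":skip") {
      skipping = true;
      continue;
    }
    if (command === ":endskip") {
      skipping = false;
      continue;
    }
    if (!skipping) kept.push(line);
  }
  return kept;
}

const source = [
  "title: My Article",
  ":skip",
  "author: Will Not Parse",
  ":endskip",
  "published: 2024-01-15",
  ":ignore",
  "footer: never seen",
].join("\n");
console.log(visibleLines(source));
// [ 'title: My Article', 'published: 2024-01-15' ]
```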
## Escaping
Use a backslash (`\`) to escape special ArchieML characters when they appear at the start of a line in a multiline context:
```
description:
\[This is not an array]
\{This is not an object}
\:end is not the end marker
The real end:
:end
```
**Note:** Escaping only works in multiline contexts and only for characters at the start of a line. Outside multiline blocks no escaping is needed: a line that does not form valid syntax is simply ignored.
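
As a rough illustration of the start-of-line rule, the sketch below removes a single leading backslash from each line of a multiline value. `unescapeMultiline` is a hypothetical helper, and it assumes the value has already been extracted from between the key and `:end`.

```javascript
// Hypothetical helper: strip one leading backslash per line, so `\:end`
// becomes `:end` and `\[x]` becomes `[x]`. Backslashes elsewhere are kept.
function unescapeMultiline(value) {
  return value
    .split("\n")
    .map((line) => (line.startsWith("\\") ? line.slice(1) : line))
    .join("\n");
}

const raw = "\\[This is not an array]\n\\:end is not the end marker\nThe real end:";
console.log(unescapeMultiline(raw));
```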
## Hooks and Customization
Veronica provides several hooks for customizing parsing behavior:
### `onFieldName(name: string) => string`
Transform field names during parsing. Useful for normalizing keys:
```javascript
const result = parse(text, {
  onFieldName: (name) => name.toLowerCase()
});
```
This is particularly helpful when working with Google Docs, which may capitalize the first word of a line.
### `onValue(value: any, key: string) => any`
Transform values during parsing. Useful for type coercion:
```javascript
const result = parse(text, {
  onValue: (value, key) => {
    // Leave non-string values (objects, arrays) untouched
    if (typeof value !== "string") return value;
    // Auto-convert boolean strings
    if (value === "true") return true;
    if (value === "false") return false;
    // Auto-convert numbers (always pass a radix to parseInt)
    if (/^\d+$/.test(value)) return parseInt(value, 10);
    if (/^\d+\.\d+$/.test(value)) return parseFloat(value);
    // Auto-parse ISO dates
    if (/^\d{4}-\d{2}-\d{2}/.test(value)) {
      return new Date(value);
    }
    return value;
  }
});
```
### `onEnter(keypath: string[], item: any) => void`
Called when entering an object or array. The `keypath` is an array of strings representing the path from the root:
```javascript
const result = parse(text, {
  onEnter: (keypath, item) => {
    console.log(`Entering: ${keypath.join('.')}`);
    console.log(`Type: ${Array.isArray(item) ? 'array' : 'object'}`);
  }
});
```
### `onExit(keypath: string[], item: any) => any`
Called when exiting an object or array. This is ideal for validation and adding computed properties. If you return a value, it replaces the item in the output:
```javascript
const result = parse(text, {
  onExit: (keypath, item) => {
    // Match only the top-level "person" object, not anything nested inside it
    const isPerson = keypath.length === 1 && keypath[0] === "person";
    // Validate required fields
    if (isPerson && !item.name) {
      throw new Error("Person must have a name");
    }
    // Add computed properties
    if (isPerson && item.firstName && item.lastName) {
      item.fullName = `${item.firstName} ${item.lastName}`;
    }
    return item; // Return the modified item
  }
});
```
**Example with validation:**
```javascript
const text = `
{author}
firstName: Jane
lastName: Smith
{/author}
`;
const result = parse(text, {
  onExit: (keypath, item) => {
    if (keypath[0] === "author") {
      if (!item.firstName || !item.lastName) {
        throw new Error("Author must have both firstName and lastName");
      }
      // Add computed fullName
      item.fullName = `${item.firstName} ${item.lastName}`;
    }
    return item;
  }
});
// Result: { author: { firstName: "Jane", lastName: "Smith", fullName: "Jane Smith" } }
```
### `verbose: boolean`
Enable verbose logging to see detailed parsing information:
```javascript
const result = parse(text, {
  verbose: true
});
```
## Complete Example
Here's a comprehensive example showing many Veronica features:
```
title: Understanding Veronica
subtitle: A Guide to Structured Content
published: 2024-01-15
{metadata}
tags.primary: archieml
tags.secondary: parsing
wordCount: 1500
{/metadata}
intro::
This is a multiline introduction to Veronica.
It supports multiple paragraphs and preserves formatting.
::intro
[sections]
heading: Introduction
body: Welcome to Veronica, a dialect of ArchieML.
heading: Features
body: Veronica adds several improvements over standard ArchieML.
[.examples]
* Named closures
* Explicit multiline syntax
* Better array handling
[/sections]
[+mixedContent]
This is a text paragraph.
{.callout}
type: warning
message: Remember to close your arrays!
{}
Another text paragraph here.
[]
{author}
firstName: Jane
lastName: Smith
email: jane@example.com
{.social}
twitter: @janesmith
github: janesmith
{/author}
footer: Copyright 2024
```
This example demonstrates:
- Simple key-value pairs
- Object blocks with nested objects
- Dot notation for nested keys
- Veronica multiline syntax (`key::` / `::key`)
- Arrays of objects with nested arrays
- Freeform arrays with mixed content
- Named closures (`{/author}`, `[/sections]`)
## Summary of Veronica Extensions
Veronica extends ArchieML with these key features:
1. **Named closures** - `{/name}` and `[/name]` to exit specific nesting levels
2. **Explicit multiline syntax** - `key::` / `::key` delimiters for unambiguous multiline values
3. **Flexible array items** - Any redefined key starts a new array item, not just the first key
4. **Lifecycle hooks** - `onEnter` and `onExit` callbacks for validation and transformation
5. **Value transformation** - `onFieldName` and `onValue` hooks for custom processing
These changes make Veronica more predictable and robust when working with complex structured content, especially in collaborative editing environments like Google Docs.