Arch
Summary
We use unified plugins to parse markdown. Every plugin can be composed of the following three components:
- parser: classify tokens and attach metadata to tokens
- transformer: transform the entire syntax tree
- compiler: change token to output format
Unified gives a few ways of working with processors:
- parse: calls the parser and results in an abstract syntax tree
- process: calls parser, transformer, and compiler and usually results in a string output
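To make the difference between the two entry points concrete, here is a minimal toy sketch of the three components. It does not use unified's actual API: ToyNode, ToyTree, toyParse, toyProcess, and the /notes/ URL scheme are all hypothetical names made up for illustration.

```typescript
// Toy model of the three plugin components. Illustrative only; this is NOT
// unified's real API, and every name here is hypothetical.
type ToyNode = { type: "text" | "wikiLink"; value: string };
type ToyTree = { type: "root"; children: ToyNode[] };

// parser: classify tokens and attach metadata (here: split out [[wikilinks]])
function parser(input: string): ToyTree {
  const children: ToyNode[] = [];
  const pattern = /\[\[([^\]]+)\]\]/g;
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(input)) !== null) {
    if (match.index > last) {
      children.push({ type: "text", value: input.slice(last, match.index) });
    }
    children.push({ type: "wikiLink", value: match[1] });
    last = match.index + match[0].length;
  }
  if (last < input.length) {
    children.push({ type: "text", value: input.slice(last) });
  }
  return { type: "root", children };
}

// transformer: rewrite the entire tree (here: lowercase every link target)
function transformer(tree: ToyTree): ToyTree {
  return {
    type: "root",
    children: tree.children.map((node) =>
      node.type === "wikiLink" ? { ...node, value: node.value.toLowerCase() } : node
    ),
  };
}

// compiler: change nodes to the output format (here: an HTML-like string)
function compiler(tree: ToyTree): string {
  return tree.children
    .map((n) => (n.type === "wikiLink" ? `<a href="/notes/${n.value}">${n.value}</a>` : n.value))
    .join("");
}

// "parse" stops after the parser; "process" runs all three stages.
const toyParse = (input: string): ToyTree => parser(input);
const toyProcess = (input: string): string => compiler(transformer(parser(input)));
```

In this sketch, toyParse returns a tree that callers can inspect or modify, while toyProcess goes straight to a string.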
Components
Steps
Dendron has different processors depending on the desired output (e.g. markdown vs. HTML). Below is the interface for the markdown processor:
- source: procRemarkFull
static procRemarkFull(
data: ProcDataFullOptsV5,
opts?: { mode?: ProcMode; flavor?: ProcFlavor }
)
Each processor takes the following arguments:
- ProcMode
- ProcFlavor
- ProcData
- NOTE: disregard the V5 suffix. It is an artifact of our current migration to the new processor architecture and will be removed in future versions.
ProcMode
/**
* What mode a processor should run in
*/
export enum ProcMode {
/**
* Expect no properties from {@link ProcDataFullV5} when running the processor
*/
NO_DATA = "NO_DATA",
/**
* Expect all properties from {@link ProcDataFullV5} when running the processor
*/
FULL = "all data",
/**
* Running processor in import mode. Notes don't exist. Used for import pods like {@link MarkdownPod}
* where notes don't exist in the engine prior to import.
*/
IMPORT = "IMPORT",
}
ProcFlavor
export enum ProcFlavor {
/**
* No special processing
*/
REGULAR = "REGULAR",
/**
* Apply publishing rules
*/
PUBLISHING = "PUBLISHING",
/**
* Apply preview rules
*/
PREVIEW = "PREVIEW",
}
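As a rough illustration of how the two enums might drive behavior, the sketch below re-declares them locally and branches on their values. ToyProcData, its fname and vaultName fields, missingFields, and describeRules are hypothetical; they are not taken from the Dendron source, and the real processor may validate its data differently.

```typescript
// Local copies of the enums shown above so the sketch is self-contained.
enum ProcMode {
  NO_DATA = "NO_DATA",
  FULL = "all data",
  IMPORT = "IMPORT",
}
enum ProcFlavor {
  REGULAR = "REGULAR",
  PUBLISHING = "PUBLISHING",
  PREVIEW = "PREVIEW",
}

// Hypothetical stand-in for the processor data; the field names are assumptions.
type ToyProcData = { fname?: string; vaultName?: string };

// Assumed behavior: only FULL mode requires every property to be present.
// Returns the names of the properties that are missing.
function missingFields(data: ToyProcData, mode: ProcMode): string[] {
  if (mode !== ProcMode.FULL) {
    return [];
  }
  const required: (keyof ToyProcData)[] = ["fname", "vaultName"];
  return required.filter((key) => data[key] === undefined);
}

// Map each flavor to the rule set described in the enum comments.
function describeRules(flavor: ProcFlavor): string {
  switch (flavor) {
    case ProcFlavor.PUBLISHING:
      return "publishing rules";
    case ProcFlavor.PREVIEW:
      return "preview rules";
    default:
      return "no special processing";
  }
}
```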
Compilation
Depending on what output you plan on converting Dendron into, different plugin combinations get invoked.
Dendron has a few different target outputs, which are listed under DendronASTTypes.
For everything except HTML, Dendron will call the compile method of the specific plugin.
For HTML, Dendron will transform remark nodes into rehype, which means that compile methods on remark plugins will never be called. If you are writing a custom plugin, use the transformer, dendronPub (https://github.com/dendronhq/dendron/blob/master/packages/engine-server/src/markdown/remark/dendronPub.ts), which runs after parsing but before compilation to transform the remark nodes into rehype nodes.
See Writing a custom Dendron Unified Plugin for more details.
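The sketch below shows the general shape of a transformer that prepares a node for HTML output instead of relying on a compile method. It is not a copy of dendronPub. It uses local toy types and relies on the convention that the remark-to-rehype conversion reads the hName, hProperties, and hChildren fields from a node's data. MdNode, walk, wikiLinkToAnchor, and the /notes/ URL scheme are hypothetical.

```typescript
// Minimal local node type; a real plugin would use the mdast/unist typings.
type MdNode = {
  type: string;
  value?: string;
  children?: MdNode[];
  data?: {
    hName?: string;
    hProperties?: Record<string, string>;
    hChildren?: MdNode[];
  };
};

// Depth-first walk standing in for a tree-visiting utility.
function walk(node: MdNode, fn: (n: MdNode) => void): void {
  fn(node);
  (node.children ?? []).forEach((child) => walk(child, fn));
}

// Transformer: annotate wikiLink nodes in place so the HTML conversion can
// render them as anchors without any remark compiler being called.
function wikiLinkToAnchor(tree: MdNode): MdNode {
  walk(tree, (node) => {
    if (node.type === "wikiLink" && node.value !== undefined) {
      node.data = {
        hName: "a",
        hProperties: { href: `/notes/${node.value}` },
        hChildren: [{ type: "text", value: node.value }],
      };
    }
  });
  return tree;
}
```

Because the annotation happens in the transformer stage, the same tree can still be handed to a non-HTML compiler, which can simply ignore the extra data fields.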
Engine-Less Processor Design
Background
As we migrate to Engine V3, data passed into the unified processor cannot contain the engine for several reasons:
- The engine.notes property is deprecated and will be removed with engine v3
- The replacement for engine.notes is getNote, which necessarily must be an async function. Async functions are not allowed in unified parser and compiler methods.
Furthermore, having the full engine be a part of the 'data' that's passed into the unified processor causes performance problems. Whenever a unified processor gets frozen, any modifications to it cause a deep clone of the processor to be created - this may cause us to create multiple copies of all notes (50k+ for org private) in memory! This has caused problems in the past where hover preview would render very slowly and cause VS Code to slow down.
Engine-Less Workflow
We need the engine today during the processing to help with wikilinks and note references.
- In order to know where a wikilink points to, we need to find the wikilink's target note's information.
- In order to render a note reference, we need the NoteProps of the note reference target.
We can get this data without the engine being part of the data payload by separating the parsing and compiling steps with the following workflow:
- Processor Setup - Don't include engine as part of the ProcDataFullOptsV5 data payload. (Somewhat similar to procRemarkParseNoData mode)
- Parsing Step - unchanged
- Data Gathering (new):
  - After the parsing step, visit the nodes in the AST returned from the previous step. Figure out all of the note dependencies we have from wikilinks, noteRefs, or anything else needed for rendering or compiling.
  - Call engine.getNote(s)/engine.findNotes() to get the data. This call can (and should) be async b/c we're executing it in our own code, not in the unified code. Optimization: only get NotePropsMetadata for wikilinks instead of all NoteProps fields
  - Now, add this NoteProps[] data to the processor data.
- Transformers Step - unchanged
- Compiling Step - we replace engine.notes calls and use our small note cache instead.
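The workflow above can be sketched roughly as follows. This is a simplified model under stated assumptions, not Dendron's implementation: NoteLite, AstNode, fakeGetNote, collectDependencies, buildNoteCache, compileWithCache, and render are hypothetical names, the AST is flattened to a list of nodes, and the transformers step is omitted for brevity. The point is only that the async lookups happen between parsing and compiling, so the compile step stays synchronous.

```typescript
// All types and names here are hypothetical stand-ins, not Dendron's real API.
type NoteLite = { fname: string; title: string };
type AstNode = { type: "text" | "wikiLink" | "noteRef"; value: string };

// Fake note store and async lookup; the real call would go to the engine.
const noteStore = new Map<string, NoteLite>([
  ["foo", { fname: "foo", title: "Foo" }],
  ["bar", { fname: "bar", title: "Bar" }],
]);
async function fakeGetNote(fname: string): Promise<NoteLite | undefined> {
  return noteStore.get(fname);
}

// Data gathering, part 1: collect every note the parsed nodes depend on.
function collectDependencies(nodes: AstNode[]): string[] {
  const names = nodes.filter((n) => n.type !== "text").map((n) => n.value);
  return Array.from(new Set(names)); // de-duplicate repeated targets
}

// Data gathering, part 2: async lookups run in our own code, outside unified.
async function buildNoteCache(fnames: string[]): Promise<Map<string, NoteLite>> {
  const cache = new Map<string, NoteLite>();
  const notes = await Promise.all(fnames.map((fname) => fakeGetNote(fname)));
  notes.forEach((note) => {
    if (note) {
      cache.set(note.fname, note);
    }
  });
  return cache;
}

// Compiling: fully synchronous, reads only from the small note cache.
function compileWithCache(nodes: AstNode[], cache: Map<string, NoteLite>): string {
  return nodes
    .map((n) => {
      if (n.type === "text") {
        return n.value;
      }
      const note = cache.get(n.value);
      return note ? `[${note.title}](/notes/${note.fname})` : `[missing: ${n.value}]`;
    })
    .join("");
}

// Parse result in, rendered string out: gather data first, then compile.
async function render(nodes: AstNode[]): Promise<string> {
  const cache = await buildNoteCache(collectDependencies(nodes));
  return compileWithCache(nodes, cache);
}
```

The cache holds only the notes that the current document references, which is what keeps the processor data small compared with passing the whole engine.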