Parsing structure
This is a follow-up article that explains the structural behaviour of Skript and how elements behave in a script. It is advised that you read this article first to get a better understanding of how the pattern matching works. Skript can be very intimidating to new contributors, since the parsing system seems like such a mess at times and it is difficult to keep track of what is going on at which moment. This leads to contributors being scared away or only contributing to the so-called syntax classes, since those follow a simple and consistent path.
That's why I made this document, which explains in as much detail as possible how Skript handles parsing lexically and structurally.
This document was made by Mwexim and is shared on the wiki to provide further information to potential contributors.
Terminology used in this document:
- the parsing chain: registration time, parse time and runtime
- registration time: when syntaxes are registered
- parse time: when scripts are parsed, after registration time
- runtime: when scripts are run, after parse time
- Skript: the Skript Spigot plugin
- skript-parser: this project
- structural elements: Statements, Effects, CodeSections, Triggers and so forth. Expressions are not a part of this.
- parse tree: a tree consisting of Statements that define some sort of inheritance. This means that some statements are nested inside another one.
As you have read in the other article, the parsing chain starts with the registration of all syntax. We will not be repeating what was already said, but the main key to remember here is that all syntax is bundled before the actual script is parsed. This means that all rulesets already have been defined at this point.
The parser takes in a file and reads its lines one by one. Each line therefore holds its own piece of syntax and performs its own task. This is the general rule in Skript and it keeps scripts easy to understand.
Before the parser reads the lines, they are structured into FileElements:
- Events and sections are put into FileSections instead (extends FileElement of course). All elements that have more indentation than said section are put inside this FileSection.
- Comments are erased from this list and will be put into VoidElements, essentially doing nothing.
- Multiline tokens are taken into account and their lines are merged into one FileElement:
    on script load:
        print "Hello World, this string\
        continues at the next line"
- Other add-ons can participate in this process as well, allowing unique structures that would not be possible otherwise.
As said before, all elements are structured by indentation in their respective FileSection. This process is now finished and returns a List with all top-level elements (which are always FileSections, because they are all triggers). This is very handy, as we don't need to reorder any elements further down the parsing chain.
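The structuring step described above can be sketched roughly as follows. This is a minimal illustration and not the actual skript-parser code: the class names (SketchElement, SketchSection, SketchVoid, FileStructureSketch) are hypothetical stand-ins for FileElement, FileSection and VoidElement. It also assumes that comments start with `#`, that a section line ends with `:`, and that a trailing backslash joins a line directly to the next one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for FileElement: one logical line of a script.
class SketchElement {
    final String content;
    final int indent;

    SketchElement(String content, int indent) {
        this.content = content;
        this.indent = indent;
    }
}

// Stand-in for FileSection: an element that owns the more-indented lines below it.
class SketchSection extends SketchElement {
    final List<SketchElement> children = new ArrayList<>();

    SketchSection(String content, int indent) {
        super(content, indent);
    }
}

// Stand-in for VoidElement: a comment or blank line that does nothing.
class SketchVoid extends SketchElement {
    SketchVoid(int indent) {
        super("", indent);
    }
}

class FileStructureSketch {
    private final List<String> lines;
    private int index = 0;

    private FileStructureSketch(List<String> lines) {
        this.lines = lines;
    }

    // Returns the top-level elements of the file.
    static List<SketchElement> structure(List<String> lines) {
        return new FileStructureSketch(lines).read(0);
    }

    private static int indentOf(String line) {
        int i = 0;
        while (i < line.length() && (line.charAt(i) == ' ' || line.charAt(i) == '\t')) {
            i++;
        }
        return i;
    }

    // Reads elements until a line with less indentation than minIndent shows up.
    private List<SketchElement> read(int minIndent) {
        List<SketchElement> out = new ArrayList<>();
        while (index < lines.size()) {
            String raw = lines.get(index);
            String text = raw.trim();
            int indent = indentOf(raw);
            if (text.isEmpty() || text.startsWith("#")) {
                // Assumption: comments start with '#'; they stay in the section being read.
                index++;
                out.add(new SketchVoid(indent));
                continue;
            }
            if (indent < minIndent) {
                break; // this line belongs to an enclosing section
            }
            index++;
            // Assumption: a trailing backslash merges the next line into this element.
            while (text.endsWith("\\") && index < lines.size()) {
                text = text.substring(0, text.length() - 1) + lines.get(index).trim();
                index++;
            }
            if (text.endsWith(":")) {
                SketchSection section = new SketchSection(text.substring(0, text.length() - 1), indent);
                section.children.addAll(read(indent + 1));
                out.add(section);
            } else {
                out.add(new SketchElement(text, indent));
            }
        }
        return out;
    }
}
```

Calling FileStructureSketch.structure with the lines of a script returns only the top-level sections. Everything nested is reachable through their children lists, which is why no further ordering is needed later on.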
Thus, the elements that have no indentation will be loaded first. They are called triggers or events. These triggers are parsed in the order they occur in the file but are loaded in a different order.
As described in the previous paragraph, triggers are not parsed and loaded in the same order.
- Parsing is the process of matching lines of code with the registered syntax. It parses the code into structural elements (Triggers in this example) that can be used later to perform actions.
- Loading code means that you execute the code within the trigger (or section). In order to do that, we have to parse the code that happens within this trigger (otherwise we don't know what we need to execute).
Therefore, loading a trigger would mean you would parse all the lines within that trigger and transform them into structural elements. These elements can be Sections, Effects and even Expressions. Then, these parsed structural elements are executed.
Parsing triggers is done in the order they occur in the file, but loading their content is done in the order specified in the syntax class (SkriptEvent#getLoadingPriority()). High-priority events will be loaded first.
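To make the difference between the two orders concrete, here is a small sketch. It is only an illustration: PendingTrigger and LoadOrderSketch are hypothetical names, and it assumes that the loading priority is an integer where a larger value means "load earlier". The real return type and direction of SkriptEvent#getLoadingPriority() may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a trigger that has been parsed but not loaded yet.
class PendingTrigger {
    final String name;
    final int loadingPriority; // assumption: larger value = loaded earlier

    PendingTrigger(String name, int loadingPriority) {
        this.name = name;
        this.loadingPriority = loadingPriority;
    }
}

class LoadOrderSketch {
    // Takes the triggers in file (parse) order and returns them in loading order.
    static List<PendingTrigger> loadingOrder(List<PendingTrigger> parsedInFileOrder) {
        List<PendingTrigger> sorted = new ArrayList<>(parsedInFileOrder);
        // List.sort is stable, so triggers with equal priority keep their file order.
        sorted.sort(Comparator.comparingInt((PendingTrigger t) -> t.loadingPriority).reversed());
        return sorted;
    }
}
```

The input list is left untouched, so the file order stays available for anything that still depends on it.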
We arranged the file and parsed all triggers into their respective structural elements. What happens next is the loading of all the elements inside that trigger. These elements can be CodeSections (like the if-statement) or Effects (like the print-effect). Let's look at our previous example and expand it a bit:
on script load:
    if 1 > 0:
        print "Correct: 1 > 0!"
    else:
        print "False: 1 <= 0!"
The parser will bundle this script into a structure like this:
Trigger
|___ CodeSection (Conditional)
     |___ Effect (EffPrint)
|___ CodeSection (Conditional)
     |___ Effect (EffPrint)
The script is clearly organised into a parse tree. The main trigger branches into different elements (two Conditionals), each of which branches again into an Effect. Note that these effects cannot branch further: only a section can branch into more statements.
The general rule of thumb is that when a line is indented more than the line before, it is a branch of that line, which is probably a trigger or a section.
After a trigger has been parsed successfully, the following process starts:
1. The content of the trigger is loaded:
   - If it is a CodeSection, the content of that section is loaded. This is a recursive process, since sections can be nested inside each other to any depth: go back to step 1, but now for the parsed section.
   - If it is an Effect, it is parsed. This means that the line of code is matched against all registered effects. If a match is found, the effect is initialised and stored as an item of the section it is in. Note that triggers are essentially equal to sections. The only difference is that they are only allowed at top-level code.
2. The parsed and loaded triggers are now broadcast to their respective registrars.
   - A trigger cannot do anything on its own. The registrar of the trigger needs to handle it.
   - For example: EvtScriptLoad does something when the script is loaded. When the trigger of this event is successfully loaded, it is handled by its registrar (in this case Skript itself). This process is called handling the trigger. Most of the time, the trigger is just added to a list so it can be used again later (see step 3).
3. SkriptAddon#finishedLoading() is called on all registered addons.
   - It's at this place that the magic happens. Remember the list of triggers we stored earlier in step 2? It's here that we'll need them.
   - What happens is that each addon can decide what it will do with the loaded triggers. In our example with EvtScriptLoad, Skript will run all statements in that trigger. This can be done with the parse tree we stored earlier. Each element from that parse tree is run in turn until all elements have run successfully.
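Steps 2 and 3 can be sketched as follows. This is a simplified illustration under stated assumptions, not the real SkriptAddon API: LoadedTrigger and RegistrarSketch are hypothetical names, the body of a trigger is modelled as a plain Runnable, and only the method name finishedLoading() is taken from the text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a fully loaded trigger.
class LoadedTrigger {
    final String event;   // e.g. "script load"
    final Runnable body;  // stands for running the statements of the trigger

    LoadedTrigger(String event, Runnable body) {
        this.event = event;
        this.body = body;
    }
}

// Hypothetical registrar that only cares about "script load" triggers.
class RegistrarSketch {
    private final List<LoadedTrigger> scriptLoadTriggers = new ArrayList<>();

    // Step 2: called for every loaded trigger; nothing is executed yet.
    void handleTrigger(LoadedTrigger trigger) {
        if (trigger.event.equals("script load")) {
            scriptLoadTriggers.add(trigger);
        }
    }

    // Step 3: called once all triggers are loaded; now the stored triggers run.
    void finishedLoading() {
        for (LoadedTrigger trigger : scriptLoadTriggers) {
            trigger.body.run();
        }
    }
}
```

The separation means a registrar sees every trigger it is responsible for before any of them runs, which is what allows it to decide freely what to do with them.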
There is a static method in the Statement class called runAll(Statement, TriggerContext), which runs all the statements, starting from the first statement given as a parameter of that function.
How does the parser know if a statement has nested content? Remember what we discussed earlier: sections can add another level of nested code into your parse tree, and that needs to be handled as well. Each statement has an abstract walk(TriggerContext) method. This method defines the behaviour of the syntax element. For Effects, this behaviour is defined to run the execute(TriggerContext) method, which performs an action. For sections, however, this behaviour is different.
This walk-method also returns a Statement. It will be the next statement to run. The runAll(Statement, TriggerContext) method, therefore, iterates over all the items by walking over the first one and then using the result of that function as the next item. When there are no items left, the trigger has successfully run and gets terminated.
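The walk/runAll mechanism can be illustrated with a small, self-contained sketch. The names Stmt, Ctx, PrintEffect and IfSection are hypothetical stand-ins for Statement, TriggerContext, an effect and a conditional section. How a section picks its next statement is an assumption here: the sketch enters its first nested statement when the condition holds, and a statement without a successor falls back to whatever follows its parent.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for TriggerContext; it only collects printed messages.
class Ctx {
    final List<String> output = new ArrayList<>();
}

// Stand-in for Statement.
abstract class Stmt {
    Stmt parent; // enclosing section, or null at trigger level
    Stmt next;   // following statement at the same nesting level, or null

    // Runs this statement and returns the statement to run next (or null).
    abstract Stmt walk(Ctx ctx);

    // The statement after this one, climbing to the parent when a level ends.
    Stmt following() {
        if (next != null) {
            return next;
        }
        return parent != null ? parent.following() : null;
    }

    // Links the items as siblings under the given parent and returns the first one.
    static Stmt chain(Stmt parent, Stmt... items) {
        for (int i = 0; i < items.length; i++) {
            items[i].parent = parent;
            items[i].next = i + 1 < items.length ? items[i + 1] : null;
        }
        return items.length > 0 ? items[0] : null;
    }

    // Walks from the start statement until no statement is left.
    static void runAll(Stmt start, Ctx ctx) {
        Stmt current = start;
        while (current != null) {
            current = current.walk(ctx);
        }
    }
}

// An effect: performs its action, then continues with the following statement.
class PrintEffect extends Stmt {
    final String message;

    PrintEffect(String message) {
        this.message = message;
    }

    void execute(Ctx ctx) {
        ctx.output.add(message);
    }

    @Override
    Stmt walk(Ctx ctx) {
        execute(ctx);
        return following();
    }
}

// A section: either enters its nested statements or skips over them.
class IfSection extends Stmt {
    final boolean condition;
    Stmt first; // first nested statement

    IfSection(boolean condition) {
        this.condition = condition;
    }

    @Override
    Stmt walk(Ctx ctx) {
        return condition && first != null ? first : following();
    }
}
```

Because every walk call returns the next statement, runAll needs no knowledge of nesting at all. The whole tree is traversed with one flat loop.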