HSP2HF

This article is about the OHRRPGCE FMF project, which is an alternate implementation of the OHRRPGCE for Java mobile phones. Technical implementation details discussed here should not be confused with those of the RPG format.

Most of you will never have to touch Henceforth if you don't want to. (But who wouldn't want to?) The HSP2HF cross-compiler is used to automatically convert your HamsterSpeak into Henceforth bytecode. It is part of the RPG2XRPG conversion process.

How the Cross-Compiler Works: Motivations

Some parts of the OHRRPGCE FMF project are slightly incompatible with the standard OHRRPGCE, but for scripting this is simply unacceptable. A slightly jittery enemy HP value is immediately apparent when you first load your game on a phone. A silent error in a script, by contrast, can cost hours of headache-inducing debugging, and that effort will probably not be worth anything at all in the end.

So, the goal of the cross-compiler is script-level compatibility. Efficiency and conciseness, while important, take a backseat to this driving need.

How the Cross-Compiler Works: Naive Compiler

The Henceforth cross-compiler benefits from HamsterSpeak's tree-like structure; a naive conversion can simply convert each node to a local Henceforth function. Consider the script "setnpcspeed" (from Wandering Hamster, loaded in the HSP2HF utility).

As is typical of most HSZ scripts, "setnpcspeed" contains a head "do" node. This node happens to contain only one element, which takes three arguments, each of which is a simple number or variable. Clicking "cross-compile" will invoke the naive converter, which produces the following code:

 \[14]{
   [1]@
 }
 \[12]{
   3
 }
 \[10]{
   [0]@
 }
 \[4]{
   [10]()
   [12]()
   [14]()
   [HS:78]()
 }
 
 #Init local variables
 @[1]
 @[0]
 
 #Main script loop
 do_start
 [4]()
 do_end

Let's start from the local variables section:

 @[1]

This is a shorthand syntax; it basically calls "local store", a function deep within the script interpreter which does something like this:

 void localStore(int arg) {
   local_variables[arg] = pop_parent();
 }

Next, we have the main loop. The "do_start" and "do_end" primitives are there to help the "break" and "continue" primitives to function properly. The meat of the main loop is the call to:

 [4]()

...which is simply a call to a script-local subroutine.

The following code defines script-local subroutine 4:

 \[4]{
   [10]()
   [12]()
   [14]()
   [HS:78]()
 }

...as three script-local calls ([10], [12], and [14]) and one built-in function call (HamsterSpeak:78, alterNPC). The remaining local functions are equally easy to understand. For example, "[1]@" calls "local load" with "1" as an argument. Local load is defined internally as:

 void localLoad(int arg) {
   push(local_variables[arg]);
 }
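As a rough illustration of how these two primitives might interact, the sketch below models them in Python. The `ScriptFrame` class, its field names, and the reading of `pop_parent()` as "pop from the calling script's stack" are assumptions made for this example; they are not taken from the interpreter's actual source.

```python
class ScriptFrame:
    """Hypothetical model of one running script (names are illustrative)."""

    def __init__(self, parent_stack, num_locals):
        self.parent_stack = parent_stack          # assumed: the caller's stack
        self.stack = []                           # this script's own stack
        self.local_variables = [0] * num_locals

    def local_store(self, arg):
        # Models "@[arg]": take a value from the parent and keep it locally.
        self.local_variables[arg] = self.parent_stack.pop()

    def local_load(self, arg):
        # Models "[arg]@": push a local variable onto this script's stack.
        self.stack.append(self.local_variables[arg])


# The caller pushed two arguments; "@[1]" then "@[0]" pops them in reverse.
caller_stack = [7, 2]
frame = ScriptFrame(caller_stack, num_locals=2)
frame.local_store(1)
frame.local_store(0)

# Body of the example script: [0]@  3  [1]@
frame.local_load(0)
frame.stack.append(3)
frame.local_load(1)
```

After this runs, the frame's stack holds the three arguments that the built-in call would consume, in the order they were pushed.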

The HSP2HF utility is also an excellent example of a place where recommended syntax is ignored. Although alter_npc is more readable to humans, [HS:78]() is a lot easier for a compiler to parse. Likewise, do_start and do_end are cleaner machine syntax than do{ ... }.
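The naive conversion described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout of a node (`id`, `kind`, `value`, `args`), the function name `emit_node`, and the choice to emit children in reverse order (so that the output matches the listing above) are all hypothetical, and only three node kinds are covered.

```python
def emit_node(node, lines):
    """Append one script-local subroutine per node, children first."""
    args = node.get("args", [])
    for child in reversed(args):
        emit_node(child, lines)

    kind = node["kind"]
    if kind == "number":
        body = [str(node["value"])]
    elif kind == "local":
        # Local variable read: "[index]@"
        body = ["[" + str(node["value"]) + "]@"]
    elif kind == "builtin":
        # Call each argument's subroutine, then the built-in function.
        body = ["[" + str(a["id"]) + "]()" for a in args]
        body.append("[HS:" + str(node["value"]) + "]()")
    else:
        raise ValueError("node kind not covered by this sketch: " + kind)

    lines.append("\\[" + str(node["id"]) + "]{")
    lines.extend("  " + token for token in body)
    lines.append("}")
    return lines


# The single element inside the head "do" node of "setnpcspeed".
alter_npc = {"id": 4, "kind": "builtin", "value": 78, "args": [
    {"id": 10, "kind": "local", "value": 0},
    {"id": 12, "kind": "number", "value": 3},
    {"id": 14, "kind": "local", "value": 1},
]}
naive_lines = emit_node(alter_npc, [])
```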

How the Cross-Compiler Works: Reasonable Inlining

Simple functions like "setnpcspeed" are very easy to inline, just by copying the leaf nodes' source into their parents. The previous script can be re-written as:

 #Init local variables
 @[1]
 @[0]
 
 #Main script loop
 do_start
 [0]@
 3
 [1]@
 [HS:78]()
 do_end

...which is much more concise. Due to the nature of HF bytecode, inlining usually improves both performance and storage efficiency. (When I have time to profile, I hope to collect some facts to back this up.) However, inlining everything is often either impossible or unwise, which is why one needs a policy for inlining. At the moment, the OHRRPGCE FMF's cross-compiler uses the following algorithm to determine what to inline:

 1) Start by doing a naive conversion, and retain the tree structure. 
    Mark all nodes as "not dead" and with a "count" of 1.
 2) Loop until all nodes are inlined or dead. At each iteration, do the following.
 3) Determine which nodes are leaf nodes. (Leaf nodes have no children, or only dead children.)
    For each leaf node:
    a) If this node cannot be inlined (e.g., it's self-referential, or checks b/c/d/e below fail), 
       mark it as "dead".
    b) If this node's count is "1", inline it. (Copy its corresponding source into any node that 
       references it, incrementing their "count" value by 1, and then delete it.)
    c) If this node's count is "2", and it is referenced 8 times or fewer, inline it.
    d) If this node's count is "3", and it is referenced 4 times or fewer, inline it.
    e) If this node's count is "4" or more, only inline it if it is referenced exactly once.
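The steps above can be sketched roughly as follows. This is a minimal Python model, not the cross-compiler's code: the `Node` class, the token-list representation of a node's source, and the decision to mark a node that nothing references (such as the main body) as "dead" are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    body: list          # Henceforth tokens; "[k]()" is a call to node k
    count: int = 1
    dead: bool = False


def call(node_id):
    return "[" + str(node_id) + "]()"


def passes_checks(count, refs):
    """Checks b) through e) from the list above."""
    if count == 1:
        return True
    if count == 2:
        return refs <= 8
    if count == 3:
        return refs <= 4
    return refs == 1


def inline_all(nodes):
    """nodes maps node ID -> Node; it is modified in place and returned."""
    progressed = True
    while progressed:
        progressed = False
        for nid in list(nodes):
            node = nodes[nid]
            if node.dead:
                continue
            children = [k for k in nodes if k != nid and call(k) in node.body]
            if any(not nodes[k].dead for k in children):
                continue  # not a leaf yet
            parents = [p for k, p in nodes.items()
                       if k != nid and call(nid) in p.body]
            refs = sum(p.body.count(call(nid)) for p in parents)
            self_ref = call(nid) in node.body
            # Assumption: a node with no referrers has nowhere to be inlined.
            if self_ref or refs == 0 or not passes_checks(node.count, refs):
                node.dead = True
            else:
                for parent in parents:
                    new_body = []
                    for token in parent.body:
                        new_body.extend(node.body if token == call(nid) else [token])
                    parent.body = new_body
                    parent.count += 1
                del nodes[nid]
            progressed = True
    return nodes


script = {
    14: Node(["[1]@"]),
    12: Node(["3"]),
    10: Node(["[0]@"]),
    4: Node(["[10]()", "[12]()", "[14]()", "[HS:78]()"]),
    "main": Node(["do_start", "[4]()", "do_end"]),
}
inline_all(script)
```

Run on the "setnpcspeed" nodes, this collapses everything into the main body, which matches the hand-inlined listing earlier in this section.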

We are still discussing what makes a node impossible to inline. In general the problem is difficult, but HamsterSpeak bytecode is fairly structured, which means we can probably define a few simple criteria for exclusion.

Primitives & HSPEAK->HF Snippets

The cross-compiler inserts snippets of HF code when it encounters an HSPEAK node of a given type. For example, at node 10, given a "number" node with value 75, it inserts:

 \[10] {
   75
 }

Henceforth, we shall refer to a node of ID n as \[n] {}; this allows us to generalize HSPEAK nodes into simple templates. Just a reminder: \[n] {} represents a script-local subroutine; it is not valid Format T syntax.

The following templates are loaded when the HVM initializes, so the cross-compiler makes use of them to simplify syntax:

Templates

#Template: -2 4 set_var will set local variable 1 to value 4
\set_var {
  swap
  dup
  0 lt if {
    1 add -1 mult
    @[]
  } else {
    @[.G]
  }
}

#Template: 2 get_var will return the contents of global variable 2
\get_var {
  dup
  0 lt if {
    1 add -1 mult
    []@
  } else {
    [.G]@
  }
}
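To make the ID convention in these two templates concrete, here is a small Python model of what they compute. It is a sketch under assumptions: the stack is a plain list, and `local_vars` and `global_vars` are hypothetical dictionaries standing in for the HVM's variable storage. The key point is that a negative ID n addresses local variable -(n+1), while a non-negative ID addresses a global.

```python
def set_var(stack, local_vars, global_vars):
    # Stack on entry: ... id value
    value = stack.pop()
    var_id = stack.pop()
    if var_id < 0:
        # "1 add -1 mult" turns -1 into 0, -2 into 1, and so on.
        local_vars[-(var_id + 1)] = value
    else:
        global_vars[var_id] = value


def get_var(stack, local_vars, global_vars):
    # Stack on entry: ... id
    var_id = stack.pop()
    if var_id < 0:
        stack.append(local_vars[-(var_id + 1)])
    else:
        stack.append(global_vars[var_id])


local_vars, global_vars = {}, {2: 99}

stack = [-2, 4]
set_var(stack, local_vars, global_vars)   # "-2 4 set_var"

stack2 = [2]
get_var(stack2, local_vars, global_vars)  # "2 get_var"
```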

Here are the snippets used by the cross-compiler; we repeat numbers for the sake of completeness:

Numbers
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
number	value		\[n] { value }

Do Loops
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
flow	do	node_x node_y ...	\[n] { do{ [x]() [y]() ... } }

If Statements
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
flow	if	conditional_x then_y else_z	\[n]{ [x]() if { [y]() } else { [z]() } }

Then/Else Loops
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
flow	then/else	node_x node_y ...	\[n] { [x]() [y]() ... }

Break/Continue
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
flow	command	amount (if amount == 1, omit amount; otherwise append "_x" to command)	\[n] { [amount] command }

Returning
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
flow	return	value	\[n] { value @[-1] }
flow	exitscript (at a given depth)		\[n] { invalid @[-1] depth break_x }
flow	exitscript (at a given depth)	value	\[n] { value @[-1] depth break_x }

While/For
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
flow	while	conditional_x do_y	\[n] { do { [x]() not if { break } inline_y{ y_command_1 y_command_2 ... y_command_z } } }
flow	for	count_id_x counter_start_s count_end_e counter_step_w do_y	\[n] { [x]() [s]() set_var do { [w]() 0 gt [x]() get_var [e]() gt xor not [x]() get_var [e]() neq and if { break } inline_y{ y_command_1 y_command_2 ... y_command_z } [x]() get_var [w]() add } }
Note 1: The block inline_y simply unrolls the do_y block into the body of [n](). This is done so that break and continue will function properly.
Note 2: The upshot of this is that do_y will be instantly culled from the source, unless another node references it (which would be a bit of a hack, in my opinion). Regardless, this is fine, and will not affect program validity in any way.
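The exit test in the "for" snippet is easier to follow when written out as ordinary boolean logic. The Python sketch below mirrors the token sequence "[w]() 0 gt [x]() get_var [e]() gt xor not [x]() get_var [e]() neq and". The function names are hypothetical, the loop assumes the incremented value is written back to the counter at the end of each pass, and the `limit` parameter is only a safety guard for this sketch (a step of 0 would otherwise never terminate).

```python
def for_should_break(step, value, end):
    # (step > 0) xor (value > end), negated, and value != end
    return (not ((step > 0) ^ (value > end))) and value != end


def for_values(start, end, step, limit=1000):
    """Counter values for which the loop body would run."""
    values = []
    value = start
    while not for_should_break(step, value, end) and len(values) < limit:
        values.append(value)
        value += step   # assumed: counter is updated after each pass
    return values
```

With a positive step the loop exits once the counter passes the end value, with a negative step once it drops below it, and in both cases the end value itself is still visited.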

Switch
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
flow	switch	???	This is not yet documented in HSZ, so we will deal with it later.

Variable Access
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
global	variable_x		\[n] { [x]() [.G]@ }
local	variable_x		\[n] { [x]() []@ }

Math Functions
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
math	set_variable	lhs_l rhs_r	\[n] { [l]() [r]() set_var }
math	increment_variable	lhs_l rhs_r	\[n] { [l]() get_var [r]() add set_var }
math	decrement_variable	lhs_l rhs_r	\[n] { [l]() get_var [r]() sub set_var }
math	not	lhs_l	\[n] { [l]() not }
math	and	lhs_l rhs_r	\[n] { [l]() if { [r]() } else { False } }
math	or	lhs_l rhs_r	\[n] { [l]() not if { [r]() } else { True } }
math	operand (if the operand is listed above, use that code block, not this one)	lhs_l rhs_r	\[n] { [l]() [r]() operand }
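The "and" and "or" snippets differ from the generic operand form because they only call [r]() when its result is needed. A small Python sketch of that short-circuit behaviour follows; `hf_and`, `hf_or`, and the use of zero-argument callables to stand in for the [l]() and [r]() subroutines are assumptions made for illustration.

```python
def hf_and(lhs, rhs):
    # [l]() if { [r]() } else { False }
    return rhs() if lhs() else False


def hf_or(lhs, rhs):
    # [l]() not if { [r]() } else { True }
    return rhs() if not lhs() else True


calls = []


def right_side():
    # Records that the right-hand subroutine was actually evaluated.
    calls.append("r")
    return True


and_result = hf_and(lambda: False, right_side)   # right side skipped
or_result = hf_or(lambda: True, right_side)      # right side skipped
both_result = hf_and(lambda: True, right_side)   # right side evaluated once
```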

Built-In and User-Defined Functions
HSpeak Kind	HSpeak ID	HSpeak args[]	Henceforth Snippet
built-in	func_id_x		\[n] { hspeak_api_call_x }
user-script	func_id_x		\[n] { user_script_call_x }

Resolution Engine

After cross-compiling, the HSP2HF utility is basically left with a sequence of bytecodes for each script. The final step is to lump these together into HF lumps. The size of a script, along with its potential to call other scripts, is used to properly group several scripts into one HF lump. To gather this information, a single pass is made over each script. During the same pass, the resolution engine scans scripts to find simple optimizations which can be performed in place. A list of these optimizations follows.

Found	Replaced By	Reasoning
number get_var	[number.G]@ if number >= 0; [-(number+1)]@ if number < 0	The parameter ID is often known.
number value set_var	value @[number.G] if number >= 0; value @[-(number+1)] if number < 0	Same as above.
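A rough Python sketch of how such in-place replacements might be applied to a token stream is shown below. It follows the set_var and get_var templates, in which a negative ID n addresses local variable -(n+1) and a non-negative ID addresses a global. The function names and the flat list-of-strings token format are hypothetical, and the set_var rule is restricted here to the case where the value is a single numeric literal.

```python
def is_int(token):
    try:
        int(token)
        return True
    except ValueError:
        return False


def var_ref(number):
    # Index expression used inside the brackets, per the templates.
    return str(number) + ".G" if number >= 0 else str(-(number + 1))


def peephole(tokens):
    out = []
    i = 0
    while i < len(tokens):
        if (i + 1 < len(tokens) and is_int(tokens[i])
                and tokens[i + 1] == "get_var"):
            # number get_var  ->  [ref]@
            out.append("[" + var_ref(int(tokens[i])) + "]@")
            i += 2
        elif (i + 2 < len(tokens) and is_int(tokens[i])
                and is_int(tokens[i + 1]) and tokens[i + 2] == "set_var"):
            # number value set_var  ->  value @[ref]
            out.append(tokens[i + 1])
            out.append("@[" + var_ref(int(tokens[i])) + "]")
            i += 3
        else:
            out.append(tokens[i])
            i += 1
    return out
```

For example, `peephole(["-2", "4", "set_var"])` yields `["4", "@[1]"]`, while a sequence whose ID is not a literal (such as `["[10]()", "get_var"]`) is left untouched.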
