6. Toolchain
6.1. spicy-build
spicy-build is a shell frontend that compiles Spicy source code
into a standalone executable by running spicyc to generate the
necessary C++ code, then spawning the system compiler to compile and
link that.
spicy-build [options] <input files>
-d Build a debug version.
-o <file> Destination name for the compiled executable; default is "a.out".
-t Do not delete temporary files (useful for inspecting them, and for use with a debugger).
-v Verbose output, display command lines executing.
-S Do not compile the "spicy-driver" host application into executable.
Input files may be anything that spicyc can compile to C++.
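As a rough illustration of how the options above combine, the following Python sketch assembles a spicy-build command line. The helper name spicy_build_argv and the file names my-http.spicy and my-http are made up for this example; only the flags -d, -t, -v, and -o are taken from the usage summary above.

```python
import shlex


def spicy_build_argv(inputs, output="a.out", debug=False, keep_tmps=False, verbose=False):
    """Assemble a spicy-build command line (hypothetical helper, not part of Spicy)."""
    argv = ["spicy-build"]
    if debug:
        argv.append("-d")  # build a debug version
    if keep_tmps:
        argv.append("-t")  # keep temporary files around
    if verbose:
        argv.append("-v")  # display the command lines being executed
    argv += ["-o", output]  # destination name for the executable
    argv += list(inputs)    # anything spicyc can compile to C++
    return argv


if __name__ == "__main__":
    # The file names are placeholders for illustration only.
    print(shlex.join(spicy_build_argv(["my-http.spicy"], output="my-http", debug=True)))
```

On a system where spicy-build is installed, the returned list could be handed to subprocess.run() as is.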
6.2. spicy-config
spicy-config reports information about Spicy’s build &
installation options.
Usage: spicy-config [options]
Available options:
--bindir Prints the path to the directory where binaries are installed.
--build Prints "debug" or "release", depending on the build configuration.
--cmake-path Prints the path to Spicy-provided CMake modules
--cxx Print the path to the C++ compiler used to build Spicy
--cxxflags Print flags for C++ compiler when compiling generated code statically
--cxxflags-hlto Print flags for C++ compiler when building precompiled HLTO libraries
--debug Output flags for working with debugging versions.
--distbase Print path of the Spicy source distribution.
--dynamic-loading Adjust --ldflags for host applications that dynamically load precompiled modules
--have-toolchain Prints 'yes' if the Spicy toolchain was built, 'no' otherwise.
--have-zeek Prints 'yes' if Spicy was compiled with Zeek support, 'no' otherwise.
--help Print this usage summary
--include-dirs Prints the Spicy runtime's C++ include directories
--ldflags Print flags for linker when compiling generated code statically
--ldflags-hlto Print flags for linker when building precompiled HLTO libraries
--libdirs Print standard Spicy library directories.
--prefix Print path of installation
--spicy-build Print the path to the spicy-build script.
--spicyc Print the path to the spicyc binary.
--version Print the Spicy version as a string.
--version-number Print the Spicy version as a numerical value.
--zeek Print the path to the Zeek executable
--zeek-include-dirs Print the Spicy runtime's C++ include directories
--zeek-module-path Print the path of the directory the Zeek plugin searches for *.hlto modules
--zeek-plugin-path Print the path to go into ZEEK_PLUGIN_PATH for enabling the Zeek Spicy plugin
--zeek-prefix Print the path to the Zeek installation prefix
--zeek-version Print the Zeek version (empty if no Zeek available)
--zeek-version-number Print the Zeek version as a numerical value (zero if no Zeek available)
6.3. spicyc
spicyc compiles Spicy code into C++ output, optionally also
executing it directly through JIT.
Usage: spicyc [options] <inputs>
Options controlling code generation:
-c | --output-c++ Print out all generated C++ code (including linker glue by default).
-d | --debug Include debug instrumentation into generated code.
-e | --output-all-dependencies Output list of dependencies for all compiled modules.
-g | --disable-optimizations Disable HILTI-side optimizations of the generated code.
-j | --jit-code Fully compile all code, and then execute it unless --output-to gives a file to store it
-l | --output-linker Print out only generated HILTI linker glue code.
-o | --output-to <path> Path for saving output.
-p | --output-hilti Just output parsed HILTI code again.
-v | --version Print version information.
-A | --abort-on-exceptions When executing compiled code, abort() instead of throwing HILTI exceptions.
-B | --show-backtraces Include backtraces when reporting unhandled exceptions.
-C | --dump-code Dump all generated code to disk for debugging.
-D | --compiler-debug <streams> Activate compile-time debugging output for given debug streams (comma-separated; 'help' for list).
-E | --output-code-dependencies Output list of dependencies for all compiled modules that require separate compilation of their own.
-L | --library-path <path> Add path to list of directories to search when importing modules.
-O | --optimize Build optimized release version of generated code.
-P | --output-prototypes Output C++ header with prototypes for public functionality.
-R | --report-times Report a break-down of compiler's execution time.
-S | --skip-dependencies Do not automatically compile dependencies during JIT.
-T | --keep-tmps Do not delete any temporary files created.
-V | --skip-validation Don't validate ASTs (for debugging only).
-X | --debug-addl <addl> Implies -d and adds selected additional instrumentation (comma-separated; see 'help' for list).
--cxx-link <lib> Link specified static archive or shared library during JIT or to produced HLTO file. Can be given multiple times.
-Q | --include-offsets Include stream offsets of parsed data in output.
Inputs can be .spicy, .hlt, .cc/.cxx, *.hlto.
spicyc also supports the following environment variables to
control the compilation process:
SPICY_PATH
    Replaces the built-in search path for *.spicy source files.
SPICY_CACHE
    Location for storing precompiled C++ headers. Default is ~/.cache/spicy/<VERSION>.
HILTI_CXX
    Specifies the path to the C++ compiler to use.
HILTI_CXX_COMPILER_LAUNCHER
    Specifies a command to prefix compiler invocations with during JIT. This can, e.g., be used to enable a compiler cache like ccache. If Spicy was configured with, e.g., --with-hilti-compiler-launcher=ccache (the equivalent CMake option is HILTI_COMPILER_LAUNCHER), ccache would automatically be used during JIT. Setting this variable to an empty value disables use of ccache in that case.
HILTI_CXX_INCLUDE_DIRS
    Specifies additional, colon-separated C++ include directories to search for header files.
HILTI_JIT_PARALLELISM
    Set to specify the maximum number of background compilation jobs to run during JIT. Defaults to the number of cores.
HILTI_JIT_SEQUENTIAL
    Set to prevent spawning multiple concurrent C++ compiler instances. This overrides any value set for HILTI_JIT_PARALLELISM and effectively sets it to one.
HILTI_OPTIMIZER_PASSES
    Colon-separated list of optimizer passes to activate. If unset, uses the default-enabled set.
HILTI_PATH
    Replaces the built-in search path for *.hlt source files.
HILTI_PRINT_SETTINGS
    Set to see a summary of compilation options.
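To show how a few of these variables might be combined, here is a minimal Python sketch that prepares an environment for a spicyc or spicy-driver invocation. The helper jit_environment is hypothetical, and the value "1" for HILTI_JIT_SEQUENTIAL is an assumption: the text above only says the variable needs to be set.

```python
import os


def jit_environment(base=None, launcher=None, parallelism=None, sequential=False):
    """Return a copy of an environment with JIT-related variables set (hypothetical helper)."""
    env = dict(os.environ if base is None else base)
    if launcher is not None:
        # An empty string disables a launcher that was configured at build time.
        env["HILTI_CXX_COMPILER_LAUNCHER"] = launcher
    if parallelism is not None:
        # Maximum number of background compilation jobs during JIT.
        env["HILTI_JIT_PARALLELISM"] = str(parallelism)
    if sequential:
        # Assumption: any value counts as "set"; "1" is used here.
        env["HILTI_JIT_SEQUENTIAL"] = "1"
    return env


if __name__ == "__main__":
    env = jit_environment(base={}, launcher="ccache", parallelism=4)
    for name in sorted(env):
        print(f"{name}={env[name]}")
```

The resulting dictionary can be passed as the env argument of subprocess.run() when launching one of the tools.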
6.4. spicy-driver
spicy-driver is a standalone Spicy host application that compiles
and executes Spicy parsers on the fly, and then feeds them data for
parsing from standard input.
Usage: cat <data> | spicy-driver [options] <inputs> ...
Options:
-d | --debug Include debug instrumentation into generated code.
-i | --increment <i> Feed data incrementally in chunks of size <i>.
-f | --file <path> Read input from <path> instead of stdin.
-l | --list-parsers List available parsers and exit.
-p | --parser <name> Use parser <name> to process input. Only needed if more than one parser is available.
-v | --version Print version information.
-A | --abort-on-exceptions When executing compiled code, abort() instead of throwing HILTI exceptions.
-B | --show-backtraces Include backtraces when reporting unhandled exceptions.
-D | --compiler-debug <streams> Activate compile-time debugging output for given debug streams (comma-separated; 'help' for list).
-F | --batch-file <path> Read Spicy batch input from <path>; see docs for description of format.
-L | --library-path <path> Add path to list of directories to search when importing modules.
-O | --optimize Build optimized release version of generated code.
-R | --report-times Report a break-down of compiler's execution time.
-S | --skip-dependencies Do not automatically compile dependencies during JIT.
-U | --report-resource-usage Print summary of runtime resource usage.
-X | --debug-addl <addl> Implies -d and adds selected additional instrumentation (comma-separated; see 'help' for list).
Environment variables:
SPICY_PATH Colon-separated list of directories to search for modules. In contrast to --library-path, setting this variable overrides the built-in paths.
Inputs can be .hlt, .spicy, .cc/.cxx, *.o, *.hlto.
spicy-driver supports the same environment variables as
spicyc.
6.4.1. Specifying the parser to use
If there’s only a single public unit in the Spicy source code,
spicy-driver will automatically use that for parsing its input. If
there’s more than one public unit, you need to tell spicy-driver
which one to use through its --parser (or -p) option. To see
the parsers that are available, use --list-parsers (or -l).
In addition to the names shown by --list-parsers, you can also
specify a parser through a port or MIME type if the corresponding unit
defines them through properties. For example,
if a unit defines %port = 80/tcp, you can use spicy-driver -p
80/tcp to select it. To specify a direction, add either %orig or
%resp (e.g., -p 80/tcp%resp); then only units with a port
tagged with an &originator or &responder attribute,
respectively, will be considered. If a unit defines %mime-type =
application/test, you can select it through spicy-driver -p
application/test. (Note that there must be exactly one unit with a
matching property for this to work; otherwise you’ll get an error
message.)
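The three forms a --parser argument can take may be easier to see in code. The following Python sketch classifies an argument as a port, a MIME type, or a unit name. It is purely illustrative and not how spicy-driver resolves parsers internally: the function name, the regular expression, and the unit name HTTP::Request are assumptions made for this example.

```python
import re

# Assumed shape of a port selector: <number>/<protocol>, optionally
# followed by %orig or %resp to pick a direction.
_PORT = re.compile(r"^(\d+)/([a-z]+)(?:%(orig|resp))?$")


def classify_parser_spec(spec):
    """Classify a --parser argument (illustrative only).

    Returns a tuple (kind, value, direction), where direction is
    "orig", "resp", or None.
    """
    match = _PORT.match(spec)
    if match:
        port, proto, direction = match.groups()
        return ("port", f"{port}/{proto}", direction)
    if "/" in spec:
        # Anything else containing a slash is treated as a MIME type.
        return ("mime-type", spec, None)
    return ("unit", spec, None)


if __name__ == "__main__":
    for spec in ["80/tcp", "80/tcp%resp", "application/test", "HTTP::Request"]:
        print(spec, "->", classify_parser_spec(spec))
```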
6.4.2. Batch input
spicy-driver provides a batch input mode for processing multiple
interleaved input flows in parallel, mimicking how host applications
like Zeek would be employing Spicy parsers for processing many
sessions concurrently. The batch input must be prepared in a specific
format (see below) that provides embedded meta information about the
contained flows of input. The easiest way to generate such a batch
is to use a Zeek script that comes with Spicy. If you run Zeek with this script
on a PCAP trace, it will record the contained TCP and UDP sessions
into a Spicy batch file:
# zeek -b -r http/methods.trace record-spicy-batch.zeek
tracking [orig_h=128.2.6.136, orig_p=46562/tcp, resp_h=173.194.75.103, resp_p=80/tcp]
tracking [orig_h=128.2.6.136, orig_p=46563/tcp, resp_h=173.194.75.103, resp_p=80/tcp]
tracking [orig_h=128.2.6.136, orig_p=46564/tcp, resp_h=173.194.75.103, resp_p=80/tcp]
tracking [orig_h=128.2.6.136, orig_p=46565/tcp, resp_h=173.194.75.103, resp_p=80/tcp]
tracking [orig_h=128.2.6.136, orig_p=46566/tcp, resp_h=173.194.75.103, resp_p=80/tcp]
tracking [orig_h=128.2.6.136, orig_p=46567/tcp, resp_h=173.194.75.103, resp_p=80/tcp]
[...]
tracking [orig_h=128.2.6.136, orig_p=46608/tcp, resp_h=173.194.75.103, resp_p=80/tcp]
tracking [orig_h=128.2.6.136, orig_p=46609/tcp, resp_h=173.194.75.103, resp_p=80/tcp]
tracking [orig_h=128.2.6.136, orig_p=46610/tcp, resp_h=173.194.75.103, resp_p=80/tcp]
recorded 49 sessions total
output in batch.dat
You will now have a file batch.dat that you can use with
spicy-driver -F batch.dat ....
The batch created by the Zeek script will select parsers for the
contained sessions through well-known ports. That means your units
need to have a %port property matching the responder port of the
sessions you want them to parse. So for the HTTP trace above, our
Spicy source code would need to provide a public unit with property
%port = 80/tcp;.
In case you want to create batches yourself, we document the batch
format in the following. A batch needs to start with a line
!spicy-batch v2<NL>, followed by lines with commands of the form
@<tag> <arguments><NL>.
There are two types of input that the batch format can represent: (1)
individual, uni-directional flows; and (2) bi-directional connections
consisting in turn of one flow per side. The type is determined
through an initial command: @begin-flow starts a flow, and
@begin-conn starts a connection. Either form introduces a unique,
free-form ID that subsequent commands will then refer to. The
following commands are supported:
@begin-flow FID TYPE PARSER<NL>
    Initializes a new input flow for parsing, associating the unique ID FID with it. TYPE must be either stream for stream-based parsing (think: TCP), or block for parsing each data block independent of others (think: UDP). PARSER is the name of the Spicy parser to use for parsing this input flow, given in the same form as with spicy-driver’s --parser option (i.e., either as a unit name, a %port, or a %mime-type).
@begin-conn CID TYPE ORIG_FID ORIG_PARSER RESP_FID RESP_PARSER<NL>
    Initializes a new input connection for parsing, associating the unique connection ID CID with it. TYPE must be either stream for stream-based parsing (think: TCP), or block for parsing each data block independent of others (think: UDP). ORIG_FID is a separate unique ID for the originator-side flow, and ORIG_PARSER is the name of the Spicy parser to use for parsing that flow. RESP_FID and RESP_PARSER work accordingly for the responder-side flow. The parsers can be given in the same form as with spicy-driver’s --parser option (i.e., either as a unit name, a %port, or a %mime-type).
@data FID SIZE<NL>
    A block of data for the input flow FID. This command must be followed directly by binary data of length SIZE, plus a final newline character. The data represents the next chunk of input for the corresponding flow. @data can be used only inside corresponding @begin-* and @end-* commands bracketing the flow ID.
@end-flow FID<NL>
    Finalizes parsing of the input flow associated with FID, releasing all state. This must come only after a corresponding @begin-flow command, and every @begin-flow must eventually be followed by an @end-flow.
@end-conn CID<NL>
    Finalizes parsing the input connection associated with CID, releasing all state (including for its two flows). This must come only after a corresponding @begin-conn command, and every @begin-conn must eventually be followed by an @end-conn.
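The commands above can be assembled by hand. The following Python sketch writes a small batch containing one flow and one connection, following the format as described in this section. All helper names, IDs (f1, c1, c1-orig, c1-resp), and the unit names Test::Request and Test::Reply are hypothetical and chosen only for illustration.

```python
import io


def write_header(out):
    """Every batch starts with this line."""
    out.write(b"!spicy-batch v2\n")


def write_data(out, fid, chunk):
    """Write one @data command: the size line, the raw bytes, a final newline."""
    out.write(f"@data {fid} {len(chunk)}\n".encode())
    out.write(chunk + b"\n")


def write_flow(out, fid, flow_type, parser, chunks):
    """Write a uni-directional flow; flow_type is "stream" or "block"."""
    out.write(f"@begin-flow {fid} {flow_type} {parser}\n".encode())
    for chunk in chunks:
        write_data(out, fid, chunk)
    out.write(f"@end-flow {fid}\n".encode())


def write_conn(out, cid, conn_type, orig, resp, chunks):
    """Write a connection; orig/resp are (fid, parser), chunks is [(fid, bytes)]."""
    (orig_fid, orig_parser), (resp_fid, resp_parser) = orig, resp
    out.write(
        f"@begin-conn {cid} {conn_type} {orig_fid} {orig_parser} "
        f"{resp_fid} {resp_parser}\n".encode()
    )
    for fid, chunk in chunks:
        write_data(out, fid, chunk)
    out.write(f"@end-conn {cid}\n".encode())


def build_example():
    out = io.BytesIO()
    write_header(out)
    write_flow(out, "f1", "stream", "80/tcp", [b"hello"])
    write_conn(
        out, "c1", "block",
        ("c1-orig", "Test::Request"), ("c1-resp", "Test::Reply"),
        [("c1-orig", b"ping"), ("c1-resp", b"pong")],
    )
    return out.getvalue()


if __name__ == "__main__":
    print(build_example().decode())
```

Writing the returned bytes to a file in binary mode gives something that could be passed to spicy-driver via -F, provided the Spicy source defines matching parsers.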
6.5. spicy-dump
spicy-dump is a standalone Spicy host application that compiles
and executes Spicy parsers on the fly, feeds them data for processing,
and then at the end prints out the parsed information in either a
readable, custom ASCII format, or as JSON (--json or -J). By
default, spicy-dump disables showing the output of Spicy print
statements; --enable-print or -P reenables that.
Usage: cat <data> | spicy-dump [options] <inputs> ...
Options:
-d | --debug Include debug instrumentation into generated code.
-f | --file <path> Read input from <path> instead of stdin.
-l | --list-parsers List available parsers and exit.
-p | --parser <name> Use parser <name> to process input. Only needed if more than one parser is available.
-v | --version Print version information.
-A | --abort-on-exceptions When executing compiled code, abort() instead of throwing HILTI exceptions.
-B | --show-backtraces Include backtraces when reporting unhandled exceptions.
-D | --compiler-debug <streams> Activate compile-time debugging output for given debug streams (comma-separated; 'help' for list).
-L | --library-path <path> Add path to list of directories to search when importing modules.
-J | --json Print JSON output.
-O | --optimize Build optimized release version of generated code.
-P | --enable-print Show output of Spicy 'print' statements (default: off).
-Q | --include-offsets Include stream offsets of parsed data in output.
-R | --report-times Report a break-down of compiler's execution time.
-S | --skip-dependencies Do not automatically compile dependencies during JIT.
-X | --debug-addl <addl> Implies -d and adds selected additional instrumentation (comma-separated; see 'help' for list).
Environment variables:
SPICY_PATH Colon-separated list of directories to search for modules. In contrast to --library-path, setting this variable overrides the built-in paths.
Inputs can be .hlt, .spicy, *.hlto.