Releases: zeek/spicy
v1.8.2
-
GH-1571: Remove trimming inside individual chunks.
Trimming Chunks (always from the left) causes a lot of internal work with only limited benefit since we manage visibility with stream::Views on top of Chunks anyway.
This patch removes trimming inside Chunks so now any trimming only removes Chunks from Chains, but does not internally change individual Chunks anymore. This might lead to slightly increased memory use, but callers usually have that data in memory anyway.
-
GH-1549: GH-1554: Fix potential infinite loop when trimming data before stream.
Previously we would trigger an infinite loop if one tried to trim before the head chunk of a stream. In practice this seems not to have been an issue, due to #1549 and us emitting far fewer calls to trim than possible.
This patch adds an explicit check whether we need to trim anything, and exits the low-level function early for such cases.
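The early-exit check can be pictured with a small Python model. This is only a sketch under assumptions: the Chunk and Chain classes below are hypothetical stand-ins, not Spicy's C++ runtime types, and they also model the chunk-granular trimming described for GH-1571.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Chunk:
    offset: int  # absolute stream offset of the chunk's first byte
    data: bytes

    @property
    def end(self) -> int:
        return self.offset + len(self.data)


class Chain:
    """Hypothetical stand-in for a stream's chain of chunks."""

    def __init__(self) -> None:
        self.chunks = deque()  # holds Chunk objects, head first

    def append(self, data: bytes) -> None:
        start = self.chunks[-1].end if self.chunks else 0
        self.chunks.append(Chunk(start, data))

    def trim(self, offset: int) -> int:
        """Drop whole chunks that end at or before `offset`; return the count."""
        # Explicit early exit: nothing to do if the chain is empty or the
        # requested offset lies at or before the head chunk.
        if not self.chunks or offset <= self.chunks[0].offset:
            return 0

        removed = 0
        # Chunks are never modified internally; they are only unlinked.
        while self.chunks and self.chunks[0].end <= offset:
            self.chunks.popleft()
            removed += 1
        return removed
```

A trim request that points into the middle of a chunk leaves that chunk untouched, which is the memory-for-work trade-off the GH-1571 entry mentions.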
-
GH-1550: Replace recursive deletion with explicit loop to avoid stack overflow.
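As a rough illustration of why an explicit loop helps, here is a Python analogue. The Node type and both functions are hypothetical; the actual fix lives in Spicy's C++ runtime, where deep recursion exhausts the native stack rather than hitting an interpreter limit.

```python
class Node:
    """Hypothetical singly linked node, standing in for a linked chunk."""

    def __init__(self, next_node=None):
        self.next = next_node


def build_chain(length):
    head = None
    for _ in range(length):
        head = Node(head)
    return head


def release_recursive(node):
    """Unlink nodes recursively; call depth grows with the chain length."""
    if node is None:
        return 0
    count = release_recursive(node.next)
    node.next = None
    return count + 1


def release_iterative(node):
    """Unlink nodes with an explicit loop; call depth stays constant."""
    count = 0
    while node is not None:
        following = node.next
        node.next = None
        node = following
        count += 1
    return count
```

The recursive variant works for short chains but raises RecursionError in Python once the chain is deeper than the interpreter's recursion limit, while the loop handles chains of any length.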
-
GH-1549: Add feature guards to accesses of a unit's __position.
Access of __position triggers random access functionality. In order to distinguish our internal uses from accesses due to user code, most accesses in our generated code should be guarded with a feature constant (if or ternary).
In this patch we add proper guards for a couple of instances where we did not do that correctly. That mishap caused all units with containers to be random access (even the root unit), which in turn could have led to, e.g., unbounded memory growth, runtime overhead due to generation and execution of unneeded code, or expensive cleanup on very large untrimmed inputs.
-
Artificially limit the number of open files.
This works around a silent failure in reproc where it would refuse to run on systems with huge rlimits for the number of open files. We have seen this hit on huge production boxes.
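A minimal sketch of the general technique, using Python's standard resource module on a Unix-like system. The cap value and the function name are assumptions made for illustration; the release notes do not state which limit Spicy applies or where exactly it applies it.

```python
import resource

# Hypothetical cap; not taken from the Spicy sources.
MAX_OPEN_FILES = 4096


def clamp_open_files(limit=MAX_OPEN_FILES):
    """Lower the soft RLIMIT_NOFILE to at most `limit`; return the soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY may be represented as a negative number, so test for it
    # explicitly instead of relying on a numeric comparison.
    if soft == resource.RLIM_INFINITY or soft > limit:
        # Only the soft limit is lowered; the hard limit stays untouched.
        resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))
        soft = limit
    return soft
```

Child processes inherit the lowered soft limit, so calling such a helper before launching a subprocess keeps the child from seeing an extremely large value.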
-
Add begin to parser state.
This patch adds the current begin position to the parser state, and makes the corresponding changes to generated parser functions so it is passed down.
We already modelled the semantic beginning of the input in the unit, but had no reliable way to keep this up-to-date across non-unit contexts like &parse-from. This would then for certain setups lead to generated code where input and position would point to different inputs, which in turn caused offset (modelled as position - input) to be incorrect.
-
Expand validator error message.
-
Disable a few newer clang-tidy categories.
The options disabled here are triggered in newer versions of clang-tidy.
-
Drop -noall_load linker option.
We added this linker option on macOS. This option was already obsolete, e.g., in the ld manpage:
-noall_load This is the default. This option is obsolete.
Newer versions of Xcode do not know this option anymore and instead generate a hard error.
-
Declare Spicy pygments extension as parallel-safe.
We previously would not declare that the Spicy pygments highlighter is safe to execute in parallel (reading or writing of sources). Sphinx then assumed that the extension was not safe to run in parallel and instead ran jobs sequentially.
This patch declares the extension as able to execute in parallel. Since the extension does not manage any external state this is safe.
-
Use find_package(Python) with version.
Zeek's configure sets Python_EXECUTABLE as a hint, but Spicy is using find_package(Python3) and would only use Python3_EXECUTABLE as a hint. This results in Spicy finding a different (the default) Python executable when configuring Zeek with --with-python=/opt/custom/bin/python3.
Switch Spicy over to use find_package(Python) and add the minimum version so it knows to look for Python3.
v1.8.1
v1.8.0
New Functionality
Add new skip keyword to let unit items efficiently skip over uninteresting data.
For cases where your parser just needs to skip over some data, without needing access to its content, Spicy provides a skip keyword to prefix corresponding fields with:
    module Test;

    public type Foo = unit {
        x: int8;
        : skip bytes &size=5;
        y: int8;
        on %done { print self; }
    };
skip works for all kinds of fields but is particularly efficient with bytes fields, for which it will generate optimized code avoiding the overhead of storing any data.
skip fields may have conditions and hooks attached, like any other fields. However, they do not support $$ in expressions and hooks.
For readability, a skip field may be named (e.g., padding: skip bytes &size=3;), but even with a name, its value cannot be accessed.
skip fields extend the support previously provided by void fields with attributes, which are now deprecated.
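To make the idea of skipping without storing concrete, here is a hand-written Python analogue of the Foo unit above. It is only an illustration of the concept under assumptions: the function name is hypothetical and this is not the code Spicy generates.

```python
import io
import struct


def parse_foo(stream):
    """Parse x (int8), skip 5 bytes, then parse y (int8)."""
    (x,) = struct.unpack("b", stream.read(1))
    # Advance past the uninteresting bytes without reading or keeping them.
    stream.seek(5, io.SEEK_CUR)
    (y,) = struct.unpack("b", stream.read(1))
    return {"x": x, "y": y}
```

For example, parse_foo(io.BytesIO(bytes([1, 0, 0, 0, 0, 0, 2]))) yields x = 1 and y = 2; the five bytes in between are never copied into a value.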
Add runtime profiling infrastructure.
This adds an option --enable-profiling to the HILTI and Spicy compilers. Use of the option does two things: (1) it sets a flag enabling inserting additional profiling instrumentation into generated C++ code, and (2) it enables using instrumentation for recording profiling information during execution of the compiled code, including dumping out a profiling report at the end. The profiling information collected includes time spent in HILTI functions as well as for parsing Spicy units and unit fields.
Changed Functionality
Optimizations for improved runtime performance.
This release contains a number of changes to improve the runtime performance of generated parsers. This includes tweaks for generating more performant code for parsers, low-level optimizations of types in the runtime support library, as well as fine-tuning of parser execution at runtime.
- Do not force locale on users of libhilti.
- Avoid expensive checked iterator for internal Bytes iteration.
- GH-1089: Allow to use offset() without enabling full random-access support.
- GH-1394: Fix C++ normalization of generated enum values.
- Disallow using $$ with anonymous containers.
Bug fixes
- GH-1386: Prevent internal error when passed invalid context.
- Fix potential use-after-move bug.
- GH-1390: Initialize Bytes internal control block for all constructors.
- GH-1396: Fix regex performance regression introduced by constant folding.
- GH-1399: Guard access to unit _filters member with feature flag.
- GH-1421: Store numerical offset in units instead of iterator for position.
- GH-1436: Make sure Bytes::sub only throws HILTI exceptions.
- GH-1447: Do not forcibly make strong_ref in function parameters immutable.
- GH-1452: Allow resolving of unit parameters before self is fully resolved.
- Make sure Spicy runtime config is initialized after spicy::rt::init.
- Adjustments for building with GCC-13.
Documentation
- Document how to check whether an optional value is set.
- Preserve indentation when extracting comments in doc generation.
- Fix docs for long-form of -x flag to spicyc.
v1.5.4
- GH-1436: Make sure Bytes::sub only throws HILTI exceptions.
- Allow building with gcc-13.
- Allow optimizer to remove unused filter functionality in units.
- Avoid expensive checked iterator for internal Bytes iteration.
- GH-1390: Initialize Bytes internal control block for all constructors.
- Do not force locale on users of libhilti.
- Fix potential use-after-move bug.
- GH-1310: Fix ASAN false positive with GCC.
- Skip clang-specific ASAN flags with other compilers.
- Don't instantiate a debug logger if we aren't going to debug log.
- Simplify extract methods.
- Shortcut some offset computations.
- GH-1345: Apply alternative fix for #1345.
- Make printParserState cheaper to call if debug logging is disabled.
- GH-1367: Use unique filename for all object files generated during JIT.
- Fix code generation for -X flow or -X trace.
- Remove potential race during JIT when using HILTI_CXX_COMPILER_LAUNCHER.
v1.7.0
New Functionality
- Support Zeek-style documentation strings in Spicy source code.
- Provide ability for host applications to initiate runtime's module-pre-init phase manually.
- Add DPD-style spicy::accept_input() and spicy::decline_input().
- Add driver option to output full set of generated C++ files.
- GH-1123: Support arbitrary expression as argument to type constructors, such as interval(...).
Changed Functionality
- Search HILTI_CXX_INCLUDE_DIRS paths before default include paths.
- Search user module paths before system paths.
- Streamline runtime exception hierarchy.
- Fix bug in cast from real to interval.
- GH-1326: Generate proper runtime types for enums.
- GH-1330: Reject uses of imported module IDs as expression.
Bug fixes
- GH-1310: Fix ASAN false positive with GCC.
- GH-1345: Improve runtime performance of stream iteration.
- GH-1367: Use unique filename for all object files generated during JIT.
- Remove potential race during JIT when using HILTI_CXX_COMPILER_LAUNCHER.
- GH-1349: Fix incremental regexp matching for potentially empty results.
v1.6.1
Bug fixes
- GH-1356: Add stringification to UnitContext.
- GH-1357: Remove potential race during JIT when using HILTI_CXX_COMPILER_LAUNCHER.
- GH-1349: Fix incremental regexp matching for potentially empty results.
- Fix test Backtrace.comparison for aggressive constant folding.
- Install full nlohmann JSON header as well.
- GH-1345: Fix pathological performance for _haveEod on highly chunked streams.
- Remove performance pessimization introduced in 84fb4f21f494.
v1.5.3
Bug fixes
- GH-1349: Fix incremental regexp matching for potentially empty results.
- Install full nlohmann JSON header as well.
- GH-1345: Fix pathological performance for _haveEod on highly chunked streams.
- cmake/FindGoldLinker: Do not use gold by default
- cmake/FindGoldLinker: Put -fuse-ld=gold before existing flags
- GH-1303: Fix incorrect offset computation in advanceToNextData.
v1.6.0
New Functionality
-
GH-1249: Allow combining &eod with &until or &until-including.
-
GH-1251: When decoding bytes into a string using a given character set, allow caller to control error handling.
All methods taking a charset parameter now take an additional enum selecting one of three possible error handling strategies in case a character can't be decoded/represented: STRICT throws an error, IGNORE skips the problematic character and proceeds with the next, and REPLACE replaces the problematic character with a safe substitute. REPLACE is the default everywhere now, so that by default no errors are triggered.
This comes with an additional functional change for the ASCII encoding: we now consistently sanitize characters that ASCII can't represent when in REPLACE/IGNORE modes (and, hence, by default), and trigger errors in STRICT mode. Previously, we'd sometimes let them through, and never triggered any errors. This also fixes a bug with the ASCII encoding sometimes turning a non-printable character into multiple repeated substitutes.
-
GH-1294: Add library function to parse an address from string or bytes.
-
HLTO files now perform a version check when loaded.
We previously would potentially allow building a HLTO file against one version of the Spicy runtime, and then load it with a different version. If exposed symbols matched, loading might have succeeded, but could still have led to subtle bugs at runtime.
We now embed a runtime version string in HLTO files and reject loading HLTO files into a different runtime version. We require an exact version match.
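The embed-and-compare idea can be sketched as follows. The container layout (a length-prefixed JSON header), the version string, and all names are assumptions made for illustration; this is not the actual HLTO file format.

```python
import json

RUNTIME_VERSION = "1.6.0"  # hypothetical version of the running runtime


class VersionMismatch(Exception):
    pass


def write_artifact(payload, version=RUNTIME_VERSION):
    """Prefix the payload with a small header recording the build version."""
    header = json.dumps({"runtime_version": version}).encode("utf-8")
    return len(header).to_bytes(4, "big") + header + payload


def load_artifact(blob, runtime_version=RUNTIME_VERSION):
    """Return the payload, rejecting artifacts built for another version."""
    header_len = int.from_bytes(blob[:4], "big")
    header = json.loads(blob[4:4 + header_len].decode("utf-8"))
    built_for = header["runtime_version"]
    # Exact match only: even a patch-level difference is rejected.
    if built_for != runtime_version:
        raise VersionMismatch(
            f"built for runtime {built_for}, running {runtime_version}")
    return blob[4 + header_len:]
```

Requiring equality rather than a compatible range keeps the check simple and avoids having to reason about which runtime changes are safe across versions.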
-
New pack and unpack operators.
These provide low-level primitives for transforming a value into, or out of, a binary representation; see the docs for details.
Changed Functionality
-
GH-1236: Add support for adding link dependencies via --cxx-link.
-
GH-1285: C++ identifiers referenced in &cxxname are now automatically interpreted to be in the global namespace.
-
Synchronization-related debug messages are now logged to the spicy-verbose stream. We added logging of successful synchronization.
-
Downgrade required Flex version. We previously required at least flex-2.6.0; we can now build against flex-2.5.37.
-
Improve C++ caching during JIT.
We improved caching behavior via HILTI_CXX_COMPILER_LAUNCHER if the configuration of spicyc was changed without changing the C++ file produced during JIT.
-
hilti::rt::isDebugVersion has been removed.
-
The -O | --optimize flag has been removed from command line tools.
This was already a no-op without observable side-effects.
-
GH-1311: Reject use of context() unit method if unit does not declare a context with %context.
-
GH-1319: Unsupported unit variable attributes are now rejected.
-
GH-1299: Add validator for bitfield field ranges.
-
We now reject uses of self as an ID.
-
GH-1233: Reject key types for maps that can't be sorted.
-
Fix validator for field &default expression types for constness.
When checking types of field &default expressions we previously would also consider their constness. This breaks, e.g., cases where the used expression is not a LHS like the field the &default is defined for:
    type X = unit { var x: bytes = b"" + a; };
We now do not consider constness in the type check anymore. Since fields are never const, this allows us to set a &default with constant expressions as well.
Bug fixes
- GH-1231: Add special handling for potential advance failure in trial mode.
- GH-1115, GH-1196: Explicitly type temporary value used by &max_size logic.
- GH-1143, GH-1220: Add coercion on assignment for optionals that only differ in constness of their inner types.
- GH-1230: Add coercion to default argument of map::get.
- GH-1234, GH-1238: Fix assertions with anonymous struct constructor.
- GH-1248: Fix stop for unbounded loop.
- GH-1250: Fix internal errors when seeing unsupported character classes in regular expression.
- GH-1170: Fix contexts not allowing being passed inout.
- GH-1266: Fix wrong type for Spicy-side self expression.
- GH-1261: Fix inability to access unit fields through self in &convert expressions.
- GH-1267: Install only needed headers from bundled SafeInt library.
- GH-1227: Fix code generation when a module's file could be imported through different means.
- GH-1273: Remove bundled code licensed under CPOL license.
- GH-1303: Fix potentially late synchronization when jumping over gaps during synchronization.
- Do not force gold linker with user-provided linker flags or when built as a CMake subproject.
- Improve efficiency of startsWith for long inputs.
Documentation
- The documentation now reflects Zeek package manager Spicy feature templates.
- The documentation for bitfields was clarified.
- Documentation for casts from integers to boolean was added.
- We added documentation for how to expose custom C++ code in Spicy.
- Update doc link to commits mailing list.
- Clarify that %context can only be used in top-level units.
- Clarify that &until consumes the delimiter.
- GH-1240: Clarify docs on SPICY_VERSION.
- Add FAQ item on source locations.
- Add example for use of the ?. operator.
v1.5.2
v1.5.1
Bug fixes
- GH-1248: Fix stop for unbounded loop.
- GH-1230: Add coercion to default argument of map::get.
- GH-1143, GH-1220: Add coercion on assignment for optionals that only differ in constness of their inner types.
- GH-1196: Fix validator for field &default expression types for constness.
- GH-1227: Fix code generation when a module's file could be imported through different means.