diff --git a/docs/language.rst b/docs/language.rst
index 5e0c8eab03520725114ea0edd2dee3f1261cbd53..50b59616ffcf021381269e4aebc424c5917989e8 100644
--- a/docs/language.rst
+++ b/docs/language.rst
@@ -1,76 +1,166 @@
-Language reference
-==================
+.. include:: <isonum.txt>
 
-This document describes the syntax and semantics of the custom language used in ransack.
+Language reference
+******************
+This document describes the syntax and semantics of the custom language used in **ransack**.
 
 .. contents:: Table of Contents
     :depth: 2
     :local:
 
-Basic Data Types
-----------------
+Basic data types
+================
 
-- **Number**: Decimal or scientific notation.
+Number
+------
+Numbers can be represented as integers, floating-point values, or in scientific notation.
 
-  - ``42``, ``3.14``, ``1e-5``
+Examples:
+``27``, ``-15``, ``3.14``, ``2e25``, ``2.15e-17``
 
-- **String**: Enclosed in single or double quotes.
+String
+------
+A sequence of characters enclosed in single or double quotes.
 
-  - ``"hello"``, ``'world'``
+Examples:
+``"hello"``, ``'world'``
 
-- **Variable**: Alphanumeric with optional dots, underscores, or hyphens.
+Datetime
+--------
+Combination of date and time, optionally joined by ``T``.
 
-  - ``Source.IP4``, ``.Description``
+- **Date**: ``YYYY-MM-DD``
 
-- **Datetime**: Combination of date and time, optionally joined by "T".
+- **Time**: ``HH:MM:SS[.fraction][Z|(+|-)HH:MM]``
 
-  - Full: ``2025-04-11T14:30:00``, ``2025-04-11 14:30:00Z``
-  - Date only: ``2025-04-11`` (interpreted as datetime with zeroed time)
+If the time component is omitted, it defaults to midnight.
 
-- **Timedelta**: Duration formatted as ``[D]HH:MM:SS``
+Examples:
+``2025-04-11T14:30:00``, ``2025-04-11 14:30:00Z``, ``2025-04-11``,
+``2025-04-11T14:30:00+07:00``
 
-  - ``1D12:00:00``, ``23:59:59``
+Timedelta
+---------
+Represents a time interval. An optional number of days (``d`` or ``D``) may precede the required
+time component (``HH:MM:SS``).
 
-- **IPv4**:
+Examples:
+``1D12:00:00``, ``23:59:59``
 
-  - Single: ``192.168.1.1``
-  - Range: ``192.168.1.1-192.168.1.100``
-  - CIDR: ``192.168.1.0/24``
+IPv4 & IPv6
+-----------
+Supports individual addresses, ranges, and CIDR notation.
 
-- **IPv6**:
+Examples:
+``192.168.1.1``, ``192.168.1.1-192.168.1.100``, ``192.168.1.0/24``
+``2001:db8::1``, ``2001:db8::1-2001:db8::ff``, ``2001:db8::/64``
 
-  - Single: ``2001:db8::1``
-  - Range: ``2001:db8::1-2001:db8::ff``
-  - CIDR: ``2001:db8::/64``
+.. warning::
+    IP addresses in quotation marks (e.g., ``"192.168.0.1"``) are recognized as strings.
+    That is, they are no longer recognized as IP addresses!
 
-Collections
------------
+Variables
+=========
+Alphanumeric with optional dots, underscores, or hyphens.
 
-- **List**: Comma-separated values in square brackets.
+Examples:
+``Source.IP4``, ``.Description``
 
-  - ``[1, 2, 3.0]``, ``[192.168.0.1, 192.168.0.0/24]``
+Context
+-------
+
+The parser accepts an optional **context** dictionary at initialization, allowing users to define static variables that can be referenced in queries.
 
-- **Range**:
+Context variables:
 
-  - ``1..1024``, ``1.0 .. 0``, ``2025-01-01 .. 2025-12-31``
+- Are separate from input data.
+- Take precedence over similarly named data variables.
+- Allow for reusable query logic.
 
-Arithmetic Operators
---------------------
+To resolve ambiguity, a prefixing convention is used:
 
-- ``+`` : Addition
-- ``-`` : Subtraction / Negation (unary)
-- ``*`` : Multiplication
-- ``/`` : Division
-- ``%`` : Modulo
+- Variables prefixed with a dot (e.g., ``.foo``) are explicitly taken from the input data.
+- Variables without a prefix (e.g., ``foo``) are looked up in the context if available.
 
-Logical Operators
------------------
+Example:
+
+.. code-block:: python
+
+    ctx = {"threshold": 100}
+    parser = Parser(context=ctx)
+    parser.parse("threshold > .load")  # Context variable vs data variable
+
+
+Functions
+=========
+
+Functions use parentheses and comma-separated arguments.
 
-- ``and`` / ``&&`` : Logical AND
-- ``or`` / ``||`` : Logical OR
-- ``not`` / ``!`` : Logical NOT
 
-Comparison Operators
+For the full list and documentation of available functions, see :doc:`function`.
+
+Functions return either a basic data type or a collection.
+
+Examples:
+``len(Source.IP4)``, ``now()``
+
+Collections
+===========
+
+List
+----
+Comma-separated values in square brackets.
+
+Examples:
+``[1, 2, 3.0]``, ``[192.168.0.1, 192.168.0.0/24]``
+
+Range
+-----
+Defines an inclusive sequence of numbers, datetimes, or IP addresses.
+
+Examples:
+``1..1024``, ``1.0 .. 0``, ``2025-01-01 .. 2025-12-31``
+
+Operators
+=========
+
+Arithmetic operators
+--------------------
+
++------------+-----------------------------+----------------+------------------------------------------+
+| Operator   | Operand Types               | Result Type    | Description                              |
++============+=============================+================+==========================================+
+| ``+``      | ``Number + Number``         | ``Number``     | Standard numeric addition                |
++            +-----------------------------+----------------+------------------------------------------+
+|            | ``Datetime + Timedelta``    | ``Datetime``   | Adds duration to datetime                |
++            +-----------------------------+----------------+------------------------------------------+
+|            | ``Timedelta + Timedelta``   | ``Timedelta``  | Adds two time intervals                  |
++------------+-----------------------------+----------------+------------------------------------------+
+| ``-``      | ``Number - Number``         | ``Number``     | Standard numeric subtraction             |
++            +-----------------------------+----------------+------------------------------------------+
+|            | ``Datetime - Timedelta``    | ``Datetime``   | Subtracts duration from datetime         |
++            +-----------------------------+----------------+------------------------------------------+
+|            | ``Timedelta - Timedelta``   | ``Timedelta``  | Subtracts two time intervals             |
++            +-----------------------------+----------------+------------------------------------------+
+|            | ``Datetime - Datetime``     | ``Timedelta``  | Time difference between two datetimes    |
++------------+-----------------------------+----------------+------------------------------------------+
+| ``*``      | ``Timedelta * Number``      | ``Timedelta``  | Scales a time interval                   |
++            +-----------------------------+----------------+------------------------------------------+
+|            | ``Number * Number``         | ``Number``     | Standard numeric multiplication          |
++------------+-----------------------------+----------------+------------------------------------------+
+| ``/``      | ``Timedelta / Timedelta``   | ``Number``     | Ratio of two time intervals              |
++            +-----------------------------+----------------+------------------------------------------+
+|            | ``Number / Number``         | ``Number``     | Standard numeric division                |
++------------+-----------------------------+----------------+------------------------------------------+
+| ``%``      | ``Timedelta % Timedelta``   | ``Timedelta``  | Remainder of timedelta division          |
++            +-----------------------------+----------------+------------------------------------------+
+|            | ``Number % Number``         | ``Number``     | Standard numeric modulo                  |
++------------+-----------------------------+----------------+------------------------------------------+
+
+- Basic data type vs. collection: Operation is applied element-wise.
+
+- Collection vs. collection: **Disallowed**.
+
+Comparison operators
 --------------------
 
 - ``=`` : Loose equality
@@ -78,39 +168,82 @@ Comparison Operators
 - ``>`` / ``>=`` : Greater than / Greater than or equal
 - ``<`` / ``<=`` : Less than / Less than or equal
 - ``like`` / ``LIKE`` : Pattern matching
-- ``in`` / ``IN`` : Membership test
-- ``contains`` / ``CONTAINS`` : Collection containment
 
-Special Operators
+Comparisons are valid for numbers, datetimes, timedeltas, and IP addresses.
+
+**Collections**: True if **any** element satisfies the comparison.
+
+Comparing two collections is treated as:
+“Does **any element in the left collection** match **any element in the right collection**?”
+
+.. warning::
+
+    The ``==`` operator now represents *strong equality*, in contrast to the older behavior where it also matched elements within collections.
+
+    To support collection-aware comparisons, a new ``=`` operator has been introduced. This operator returns ``True`` if *any* element in the collection satisfies the comparison — mirroring the behavior of ``>``, ``>=``, ``<``, and ``<=``.
+
+    For example::
+
+        1 == [1, 2]  # False
+        1 = [1, 2]   # True
+
+    This change promotes consistency across all comparison operators, but be aware that it may lead to different results if you previously relied on ``==`` for implicit collection matching.
+
+Special operators
 -----------------
 
-- ``??`` : Existence check or default fallback
+in operator
+^^^^^^^^^^^
+Tests membership. Returns true if the left-hand element exists in the right-hand side
+(collection or IP/CIDR range). Recurses into nested collections.
 
-  - ``foo ?? "default"``, ``Source.Port??[]``, ``Description??``
+contains operator
+^^^^^^^^^^^^^^^^^
+String containment check.
 
-- ``.`` : Concatenation
+Example:
+``"abcdef" contains "abc"`` |rarr| ``True``
 
-  - ``'abc'.'def'`` , ``[1, 2, 3] . [4, 5, 6]``
+.. warning::
+    ``"abc" in ["abcdef", "cccabcddd"]`` |rarr| ``False``
 
-Functions
----------
+?? operator
+^^^^^^^^^^^
+Existence/default fallback operator.
 
-Functions are called with parentheses. Arguments are comma-separated:
+- Unary: ``var??`` — checks if a variable exists.
 
-.. code-block:: text
+- Binary: ``var ?? default`` — returns ``default`` if ``var`` is not found.
+
+Examples:
+``Description??``, ``foo ?? "default"``, ``Source.Port??[]``
 
     func_name(arg1, arg2)
 
+\. (Concatenation)
+^^^^^^^^^^^^^^^^^^
+Used to concatenate strings or merge lists.
 Examples:
+``'abc'.'def'`` , ``[1, 2, 3] . [4, 5, 6]``
 
-.. code-block:: text
+\.. (Range)
+^^^^^^^^^^^
+Defines a range between two values (inclusive).
 
-    len(Source.IP4)
-    now()
+Examples:
+``1 .. 13``, ``1..Source.Port``, ``192.168.0.1..192.168.0.255``
 
+Logical operators
+-----------------
 
-Evaluation Order (Precedence)
------------------------------
+- ``and`` / ``&&`` : Logical AND
+- ``or`` / ``||`` : Logical OR
+- ``not`` / ``!`` : Logical NOT
+
+Return a boolean.
+
+
+Evaluation order (precedence)
+=============================
 
 From highest to lowest:
 
@@ -119,19 +252,19 @@ From highest to lowest:
 3. Existence ``??``
 4. Multiplicative: ``*``, ``/``, ``%``
 5. Additive: ``+``, ``-``
-6. Concatenation: ``.`` and Ranges: ``..``
-7. Comparison: ``=, ==, >, <, >=, <=, like, in, contains``
+6. Concatenation / Range: ``.``, ``..``
+7. Comparison: ``=``, ``==``, ``>``, ``<``, ``>=``, ``<=``, ``like``, ``in``, ``contains``
 8. Logical NOT: ``!``, ``not``
 9. Logical AND: ``&&``, ``and``
 10. Logical OR: ``||``, ``or``
 
 
 Examples
---------
+========
 
 .. code-block:: text
 
     (3 + 4) * 2
     "tcp" in Source.Proto??[]
     not (Format == "IDEA0")
-    Source.IP4 = 192.168.0.1 or 10.0.0.0/8 in Source.IP4
+    Source.IP4 = 192.168.0.1 or Source.IP4 in 10.0.0.0/8
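The difference between strict ``==`` and any-element ``=`` that this patch documents can be modelled in a few lines of plain Python. This is only a sketch of the documented behaviour for numbers and flat lists, not ransack's implementation. The helper names ``as_list``, ``loose_eq`` and ``strict_eq`` are hypothetical, and treating "strong equality" as Python's own ``==`` is an assumption.

```python
def as_list(value):
    """Wrap a scalar in a list so scalars and collections are handled alike."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def loose_eq(left, right):
    """Model of ``=``: true if any left element equals any right element."""
    return any(l == r for l in as_list(left) for r in as_list(right))


def strict_eq(left, right):
    """Model of ``==``: the operands must be equal as whole values (assumed)."""
    return left == right


print(strict_eq(1, [1, 2]))       # False, mirrors `1 == [1, 2]`
print(loose_eq(1, [1, 2]))        # True, mirrors `1 = [1, 2]`
print(loose_eq([1, 5], [5, 9]))   # True, both collections contain 5
```

Wrapping scalars into one-element lists lets a single ``any(...)`` expression cover scalar against scalar, scalar against collection, and collection against collection, which matches the "any element on the left matches any element on the right" wording in the patch.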