PineScript v6 Compiler

A custom compilation pipeline that transforms PineScript v6 source code into vectorized Python functions in three stages (tokenizer, parser, code generator) and packages the result as a TransformedStrategy. This is not a general-purpose PineScript interpreter: it is optimized for the rigid, predictable structure generated by the visual strategy builder. The compiler handles the subset of PineScript v6 needed for signal-based strategies: strategy() declarations, input.*() parameters, ta.* indicator calls, boolean expressions, and if blocks with strategy.entry()/strategy.close()/strategy.exit().

Compilation Pipeline

flowchart TB
    A[".pine Source"] --> B["Tokenizer"]
    B --> C["Token Stream"]
    C --> D["Parser"]
    D --> E["AST"]
    E --> F["Code Generator"]
    F --> G["Python Source<br/>_compute + _compute_fast"]
    G --> H["TransformedStrategy"]

flowchart TB
    subgraph Input
        SRC["strategy('MACD Cross', overlay=true)<br/>fast = input.int(12, 'Fast')<br/>[m, s, _] = ta.macd(close, fast, 26, 9)<br/>if ta.crossover(m, s)<br/>    strategy.entry('Long', strategy.long)"]
    end

    subgraph Stage1["Stage 1 — Tokenizer"]
        TOK["KEYWORD:strategy LPAREN STRING:'MACD Cross' ...<br/>IDENT:fast ASSIGN KEYWORD:input DOT IDENT:int ...<br/>LBRACKET IDENT:m COMMA IDENT:s COMMA ...<br/>KEYWORD:if IDENT:ta DOT IDENT:crossover ...<br/>INDENT KEYWORD:strategy DOT IDENT:entry ..."]
    end

    subgraph Stage2["Stage 2 — Parser"]
        AST["Program<br/>├── StrategyDecl(name='MACD Cross', overlay=true)<br/>├── InputDecl(fast, int, default=12)<br/>├── Assignment([m, s, _] = ta.macd(...))<br/>└── IfBlock<br/>    ├── condition: ta.crossover(m, s)<br/>    └── body: strategy.entry('Long', long)"]
    end

    subgraph Stage3["Stage 3 — Code Generator"]
        PY["def _compute(df, params):<br/>    _close = df['close']<br/>    fast = params.get('Fast', 12)<br/>    (m, s, _) = ta.macd(_close, fast, 26, 9)<br/>    long_entry = ta.crossover(m, s).fillna(False)<br/>    return long_entry, long_exit, short_entry, short_exit"]
    end

    subgraph Output
        OBJ["TransformedStrategy<br/>name = 'MACD Cross'<br/>compute = _compute<br/>warmup = 52<br/>inputs = {Fast: IntInput(12)}"]
    end

    Input --> Stage1 --> Stage2 --> Stage3 --> Output

Stage 1: Tokenizer

File: backtest/pine/tokens.py

The tokenizer converts raw PineScript source into a flat stream of typed tokens. It handles several PineScript-specific behaviors:

Behavior	Description	Example
Comment stripping	Removes `//` line comments, preserving `//` inside string literals	`rsi = ta.rsi(close, 14) // lookback` → strips comment
Continuation joining	Joins lines with unbalanced parentheses into a single logical line	`[a, b, c] = ta.macd(close,` `12, 26, 9)` → one line
Indent tracking	Emits `INDENT`/`DEDENT` tokens for PineScript's if-block structure	`if condition` `strategy.entry(...)`
Keyword recognition	Distinguishes keywords (`if`, `and`, `or`, `not`, `true`, `false`, `strategy`, `input`) from identifiers	`if` → `KEYWORD`, `rsi_val` → `IDENT`
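The first two behaviors in the table can be sketched in a few lines. This is a minimal illustration, not the code in backtest/pine/tokens.py: the helper names `strip_comment` and `join_continuations` are hypothetical, escape sequences inside strings are not handled, and bracket counting assumes string literals contain no unbalanced brackets.

```python
def strip_comment(line: str) -> str:
    """Remove a trailing // comment, ignoring // inside quoted strings."""
    quote = None  # the quote character we are currently inside, if any
    for i, ch in enumerate(line):
        if quote:
            if ch == quote:
                quote = None  # closing quote found
        elif ch in ('"', "'"):
            quote = ch
        elif line.startswith("//", i):
            return line[:i].rstrip()
    return line.rstrip()


def join_continuations(lines: list[str]) -> list[str]:
    """Merge physical lines until parentheses and brackets balance."""
    logical, buffer, depth = [], "", 0
    for raw in lines:
        text = strip_comment(raw)
        opened = sum(text.count(c) for c in "([")
        closed = sum(text.count(c) for c in ")]")
        depth += opened - closed
        # Keep the first line's indentation; strip it from continuation lines.
        buffer = f"{buffer} {text.strip()}" if buffer else text
        if depth <= 0:
            logical.append(buffer)
            buffer, depth = "", 0
    if buffer:  # unterminated continuation at end of input
        logical.append(buffer)
    return logical
```

Comments are stripped before brackets are counted, so a parenthesis inside a comment cannot keep a logical line open.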

Token Types

NUMBER      12, 3.14, 0.5
STRING      "MACD Cross", 'Long'
IDENT       rsi_val, macdLine, fast_length
KEYWORD     if, and, or, not, true, false, strategy, input
DOT         .
COMMA       ,
ASSIGN      =
LPAREN (    RPAREN )
LBRACKET [  RBRACKET ]
COMPARE     >, <, >=, <=, ==, !=
OPERATOR    +, -, *, /
INDENT      (indentation increase)
DEDENT      (indentation decrease)
NEWLINE     (logical line boundary)
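As a rough illustration of how the token types above could be recognized, the following sketch tokenizes a single logical line with one combined regular expression. It is an assumption-laden simplification: `tokenize_line`, `TOKEN_SPEC`, and `KEYWORDS` are hypothetical names, and INDENT/DEDENT/NEWLINE emission is left out because it depends on state across lines.

```python
import re

KEYWORDS = {"if", "and", "or", "not", "true", "false", "strategy", "input"}

# Order matters: COMPARE must come before ASSIGN so "==" is not split into "=" "=".
TOKEN_SPEC = [
    ("NUMBER",   r"\d+\.\d+|\d+"),
    ("STRING",   r"\"[^\"]*\"|'[^']*'"),
    ("IDENT",    r"[A-Za-z_]\w*"),
    ("COMPARE",  r">=|<=|==|!=|>|<"),
    ("ASSIGN",   r"="),
    ("OPERATOR", r"[+\-*/]"),
    ("DOT",      r"\."),
    ("COMMA",    r","),
    ("LPAREN",   r"\("),
    ("RPAREN",   r"\)"),
    ("LBRACKET", r"\["),
    ("RBRACKET", r"\]"),
    ("SKIP",     r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize_line(line: str) -> list[tuple[str, str]]:
    """Return (type, text) pairs for one logical line."""
    tokens, pos = [], 0
    while pos < len(line):
        match = MASTER.match(line, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at column {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue  # whitespace carries no meaning inside a line
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens
```

For example, `tokenize_line("fast = input.int(12, 'Fast')")` starts with `("IDENT", "fast")`, `("ASSIGN", "=")`, `("KEYWORD", "input")`.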

Stage 2: Parser

File: backtest/pine/parser.py

A recursive descent parser that consumes the token stream and produces an Abstract Syntax Tree. The parser recognizes the following PineScript constructs:

AST Node	PineScript Construct	Example
`StrategyDecl`	`strategy()` declaration	`strategy("Name", overlay=true, initial_capital=10000)`
`InputDecl`	`input.*()` parameter definition	`fast = input.int(12, "Fast Length")`
`Assignment`	Variable assignment (single or tuple destructuring)	`rsi = ta.rsi(close, 14)` or `[m, s, h] = ta.macd(...)`
`IfBlock`	`if` block with `strategy.entry/close/exit`	`if longCond` `strategy.entry("Long", strategy.long)`
`FunctionCall`	`ta.`, `math.`, `nz()`, `na()` calls	`ta.crossover(macdLine, signalLine)`
`BinaryOp`	Boolean and arithmetic expressions	`rsi > 70 and macd > 0`
`UnaryOp`	Negation	`not condition`
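To make the recursive descent idea concrete, here is a minimal sketch that parses only boolean expressions over comparisons. It is not the parser in backtest/pine/parser.py: the `ExprParser` class and the `Number` node are hypothetical, arithmetic and function calls are omitted, and the precedence chain (`or`, then `and`, then `not`, then comparison) is an assumption chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Ident:
    name: str


@dataclass
class Number:
    value: float


@dataclass
class UnaryOp:
    op: str
    operand: object


@dataclass
class BinaryOp:
    left: object
    op: str
    right: object


class ExprParser:
    """Parses a list of (kind, text) tokens; one method per precedence level."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse_or(self):  # lowest precedence
        node = self.parse_and()
        while self.peek() == ("KEYWORD", "or"):
            self.take()
            node = BinaryOp(node, "or", self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_not()
        while self.peek() == ("KEYWORD", "and"):
            self.take()
            node = BinaryOp(node, "and", self.parse_not())
        return node

    def parse_not(self):
        if self.peek() == ("KEYWORD", "not"):
            self.take()
            return UnaryOp("not", self.parse_not())
        return self.parse_compare()

    def parse_compare(self):
        node = self.parse_primary()
        if self.peek()[0] == "COMPARE":
            op = self.take()[1]
            node = BinaryOp(node, op, self.parse_primary())
        return node

    def parse_primary(self):  # highest precedence
        kind, text = self.take()
        if kind == "NUMBER":
            return Number(float(text))
        if kind == "IDENT":
            return Ident(text)
        if kind == "LPAREN":
            node = self.parse_or()
            if self.take()[0] != "RPAREN":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")
```

Each method handles one precedence level and delegates to the next tighter level, which is what gives `rsi > 70 and macd > 0` the shape `BinaryOp(BinaryOp(...), "and", BinaryOp(...))`.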

AST Structure Example

For a simple RSI strategy:

Program(
    strategy=StrategyDecl(
        name="RSI Overbought/Oversold",
        settings={"overlay": True, "initial_capital": 10000}
    ),
    inputs=[
        InputDecl(var="length", type="int", default=14, title="RSI Length"),
        InputDecl(var="upper", type="int", default=70, title="Overbought"),
        InputDecl(var="lower", type="int", default=30, title="Oversold"),
    ],
    assignments=[
        Assignment(var="rsi_val", expr=FunctionCall("ta.rsi", [Ident("close"), Ident("length")])),
    ],
    blocks=[
        IfBlock(
            condition=BinaryOp(Ident("rsi_val"), "<", Ident("lower")),
            body=[StrategyAction("entry", "Long", "strategy.long")]
        ),
        IfBlock(
            condition=BinaryOp(Ident("rsi_val"), ">", Ident("upper")),
            body=[StrategyAction("entry", "Short", "strategy.short")]
        ),
    ]
)

Stage 3: Code Generator

File: backtest/pine/codegen.py

The code generator walks the AST and emits two Python functions:

`_compute(df, params)`: Pandas Path

Operates on the full DataFrame and returns four boolean Series, one value per bar. Used in standard backtesting mode, where the strategy runs once over the entire dataset.

def _compute(df, params):
    _open  = df['open']
    _high  = df['high']
    _low   = df['low']
    _close = df['close']
    _volume = df['volume']

    length = params.get('RSI Length', 14)
    upper  = params.get('Overbought', 70)
    lower  = params.get('Oversold', 30)

    rsi_val = ta.rsi(_close, length)

    long_entry  = (rsi_val < lower).fillna(False)
    long_exit   = pd.Series(False, index=df.index)
    short_entry = (rsi_val > upper).fillna(False)
    short_exit  = pd.Series(False, index=df.index)

    return long_entry, long_exit, short_entry, short_exit

`_compute_fast(opens, highs, lows, closes, volumes, params)`: NumPy Fast Path

Scalar-only operations on raw NumPy arrays. Returns 4 scalar booleans for the last bar only. Used by the magnifier's inner loop where compute is called potentially thousands of times per backtest — once per sub-bar.

def _compute_fast(opens, highs, lows, closes, volumes, params):
    length = params.get('RSI Length', 14)
    upper  = params.get('Overbought', 70)
    lower  = params.get('Oversold', 30)

    rsi_val = ta_fast.rsi(closes, length)

    long_entry  = rsi_val < lower
    long_exit   = False
    short_entry = rsi_val > upper
    short_exit  = False

    return long_entry, long_exit, short_entry, short_exit

Key Transformations

The code generator performs several critical translations to bridge PineScript semantics with vectorized Python:

Price Builtins → DataFrame Columns

PineScript	Generated Python	Reason
`close`	`_close` (alias for `df['close']`)	Consistency with `_open`
`open`	`_open` (alias for `df['open']`)	`open` is a Python builtin
`high`	`_high`	Consistency
`low`	`_low`	Consistency
`volume`	`_volume`	Consistency
`hlc3`	`(_high + _low + _close) / 3`	Derived price source
`ohlc4`	`(_open + _high + _low + _close) / 4`	Derived price source

Implicit Argument Injection

PineScript's ta.atr(14) implicitly uses high, low, close. The generated Python must make these explicit:

IMPLICIT_ARGS = {
    "atr":        ("_high", "_low", "_close"),
    "supertrend": ("_high", "_low", "_close"),
    "sar":        ("_high", "_low"),
    "dmi":        ("_high", "_low", "_close"),
    "obv":        ("_close", "_volume"),
    "mfi":        ("_high", "_low", "_close", "_volume"),
    "vwap":       ("_high", "_low", "_close", "_volume"),
    "ad":         ("_high", "_low", "_close", "_volume"),
    "wad":        ("_high", "_low", "_close"),
}

PineScript	Generated Python
`ta.atr(14)`	`ta.atr(_high, _low, _close, 14)`
`ta.rsi(close, 14)`	`ta.rsi(_close, 14)`
`ta.obv()`	`ta.obv(_close, _volume)`
`ta.supertrend(3, 10)`	`ta.supertrend(_high, _low, _close, 3, 10)`
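The rewrite shown in the table can be sketched as a small string-emitting helper. The function name `emit_ta_call` and the `PRICE_ALIASES` dictionary are hypothetical, and only a subset of the IMPLICIT_ARGS entries is repeated here to keep the example short.

```python
# Subset of the IMPLICIT_ARGS table above, repeated so the sketch is self-contained.
IMPLICIT_ARGS = {
    "atr":        ("_high", "_low", "_close"),
    "supertrend": ("_high", "_low", "_close"),
    "obv":        ("_close", "_volume"),
}

# Price builtins are renamed to their underscore-prefixed aliases.
PRICE_ALIASES = {
    "open": "_open", "high": "_high", "low": "_low",
    "close": "_close", "volume": "_volume",
}


def emit_ta_call(name: str, args: list[str]) -> str:
    """Render a ta.* call with implicit price series prepended."""
    explicit = [PRICE_ALIASES.get(arg, arg) for arg in args]
    full_args = list(IMPLICIT_ARGS.get(name, ())) + explicit
    return f"ta.{name}({', '.join(full_args)})"
```

Functions without an IMPLICIT_ARGS entry, such as `rsi`, pass through with only the price alias substitution applied.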

Boolean Operators

PineScript	Generated Python	Reason
`a and b`	`(a) & (b)`	pandas Series require bitwise `&`, not `and`
`a or b`	`(a) | (b)`	pandas Series require bitwise `|`, not `or`
`not a`	`~(a)`	Bitwise NOT for Series

Parenthesization is critical

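Python's & and | bind more tightly than comparison operators, so without explicit parentheses rsi > 70 & macd > 0 is parsed as rsi > (70 & macd) > 0 rather than (rsi > 70) & (macd > 0). Between the two bitwise operators, a & b | c groups as (a & b) | c. The codegen wraps every operand so that grouping never depends on precedence: (a) & (b), (a) | (b).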
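One precedence hazard that wrapping avoids is that `&` binds more tightly than comparison operators. The sketch below uses Python's own `ast` module to show how an unwrapped expression is parsed, next to a hypothetical `emit_bool` helper that wraps both operands; the helper is illustrative and not taken from codegen.py.

```python
import ast


def emit_bool(op: str, left: str, right: str) -> str:
    """Translate a PineScript and/or into a fully parenthesized bitwise expression."""
    python_op = {"and": "&", "or": "|"}[op]
    return f"({left}) {python_op} ({right})"


# Unwrapped: Python parses this as the chained comparison rsi > (70 & macd) > 0.
unwrapped = ast.parse("rsi > 70 & macd > 0", mode="eval").body
is_chained = isinstance(unwrapped, ast.Compare) and len(unwrapped.ops) == 2
middle_operand = unwrapped.comparators[0]  # the BinOp node for (70 & macd)

# Wrapped: the top-level node is the bitwise AND of two comparisons.
wrapped_source = emit_bool("and", "rsi > 70", "macd > 0")
wrapped = ast.parse(wrapped_source, mode="eval").body
```

Inspecting the parse tree rather than evaluating the expression keeps the demonstration independent of pandas.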

Other Translations

PineScript	Generated Python
`math.abs(x)`	`np.abs(x)`
`math.max(a, b)`	`np.maximum(a, b)`
`math.min(a, b)`	`np.minimum(a, b)`
`math.sqrt(x)`	`np.sqrt(x)`
`nz(x)`	`x.fillna(0)`
`na(x)`	`x.isna()`
`[a, b, c] = f()`	`(a, b, c) = f()`
`true` / `false`	`True` / `False`
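A table-driven emitter is one plausible way to implement these translations. The sketch below is an assumption about structure, not the actual codegen: `translate_call`, `translate_tuple_target`, and `MATH_MAP` are hypothetical names, and wrapping non-identifier arguments in parentheses before a method call is an extra precaution the table does not show.

```python
MATH_MAP = {
    "math.abs": "np.abs",
    "math.max": "np.maximum",
    "math.min": "np.minimum",
    "math.sqrt": "np.sqrt",
}


def _receiver(expr: str) -> str:
    """Parenthesize compound expressions so a method call applies to the whole thing."""
    return expr if expr.isidentifier() else f"({expr})"


def translate_call(name: str, args: list[str]) -> str:
    """Map a PineScript builtin call (arguments already rendered) to Python source."""
    if name in MATH_MAP:
        return f"{MATH_MAP[name]}({', '.join(args)})"
    if name == "nz":
        return f"{_receiver(args[0])}.fillna(0)"
    if name == "na":
        return f"{_receiver(args[0])}.isna()"
    raise KeyError(f"no translation for {name}")


def translate_tuple_target(target: str) -> str:
    """Turn a PineScript destructuring target like [a, b, c] into (a, b, c)."""
    return "(" + target.strip()[1:-1] + ")"
```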

NaN Safety

Every signal condition is wrapped with .fillna(False). Indicators return NaN during their warmup period (e.g., the first 14 bars for RSI-14), and NaN must never propagate as a True signal.
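The wrapping rule can be sketched as the final step of signal emission. This is a hypothetical helper (`emit_signals`) that works on already-rendered condition strings; combining several conditions for the same signal with `|` is an assumption made for the example, not documented behavior.

```python
SIGNALS = ("long_entry", "long_exit", "short_entry", "short_exit")


def emit_signals(blocks: list[tuple[str, str]]) -> list[str]:
    """blocks holds (condition_source, signal_name) pairs taken from if blocks."""
    lines = []
    for signal in SIGNALS:
        conditions = [cond for cond, name in blocks if name == signal]
        if not conditions:
            # No if block drives this signal: emit an all-False Series.
            lines.append(f"{signal} = pd.Series(False, index=df.index)")
            continue
        if len(conditions) == 1:
            combined = conditions[0]
        else:
            combined = " | ".join(f"({cond})" for cond in conditions)
        # Every emitted condition gets the NaN guard.
        lines.append(f"{signal} = ({combined}).fillna(False)")
    return lines
```

Given the RSI example, this yields the same four assignment lines that appear in the generated `_compute` body.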

Why Two Compute Paths

Path	Used By	Input	Output	Overhead
`_compute()`	Standard backtest	`pd.DataFrame` (full dataset)	4 × `pd.Series` (boolean)	DataFrame allocation, index alignment
`_compute_fast()`	Magnifier inner loop	5 × `np.ndarray` (raw arrays)	4 × `bool` (scalars)	Minimal — pure NumPy

The magnifier recomputes signals on every sub-bar — potentially 10,000+ calls per backtest (1,000 chart bars × 10 sub-bars each). At that volume, pandas DataFrame overhead dominates:

Standard path:  ~2ms per call × 10,000 = 20 seconds
Fast path:      ~0.1ms per call × 10,000 = 1 second

The fast path eliminates pandas entirely — raw NumPy arrays in, scalar booleans out. When _compute_fast is not available (complex strategies with unsupported operations), the magnifier falls back to _compute with a performance penalty.
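The cost comparison is simple arithmetic and can be reproduced with a short helper. The function name is hypothetical, and the per-call timings are the approximate figures quoted above, not measurements made here.

```python
def estimated_seconds(chart_bars: int, sub_bars_per_bar: int, ms_per_call: float) -> float:
    """Total signal-computation time for one magnified backtest, in seconds."""
    calls = chart_bars * sub_bars_per_bar  # one compute call per sub-bar
    return calls * ms_per_call / 1000.0


standard_path = estimated_seconds(1_000, 10, 2.0)   # pandas path at ~2 ms per call
fast_path = estimated_seconds(1_000, 10, 0.1)       # NumPy path at ~0.1 ms per call
speedup = standard_path / fast_path
```

With these inputs the ratio between the two paths is about 20×, and it scales linearly with the number of sub-bars.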

Supported Indicators

The TA indicator library (backtest/ta.py) implements 37 indicators as static methods on a ta class. All accept pd.Series inputs and return pd.Series (multi-output indicators such as ta.macd return a tuple of Series), use vectorized operations (no Python loops), and are verified against TradingView output.

Trend

Indicator	Function	Parameters
Simple Moving Average	`ta.sma(source, length)`	source, period
Exponential Moving Average	`ta.ema(source, length)`	source, period
Weighted Moving Average	`ta.wma(source, length)`	source, period
Volume-Weighted MA	`ta.vwma(source, volume, length)`	source, volume, period
Hull Moving Average	`ta.hma(source, length)`	source, period
Running Moving Average	`ta.rma(source, length)`	source, period
Arnaud Legoux MA	`ta.alma(source, length, offset, sigma)`	source, period, offset, sigma
Symmetrically-Weighted MA	`ta.swma(source)`	source
SuperTrend	`ta.supertrend(high, low, close, factor, period)`	factor, ATR period

Momentum

Indicator	Function	Parameters
Relative Strength Index	`ta.rsi(source, length)`	source, period
MACD	`ta.macd(source, fast, slow, signal)`	source, fast/slow/signal periods
Stochastic	`ta.stoch(high, low, close, k, d, smooth)`	K period, D period, smoothing
Commodity Channel Index	`ta.cci(high, low, close, length)`	period
Money Flow Index	`ta.mfi(high, low, close, volume, length)`	period
Chande Momentum Oscillator	`ta.cmo(source, length)`	source, period
Rate of Change	`ta.roc(source, length)`	source, period
True Strength Index	`ta.tsi(source, long, short)`	source, long/short periods
Momentum	`ta.mom(source, length)`	source, period
Williams %R	`ta.wpr(high, low, close, length)`	period
Percent Rank	`ta.percentrank(source, length)`	source, period

Volatility

Indicator	Function	Parameters
Average True Range	`ta.atr(high, low, close, length)`	period
Bollinger Bands	`ta.bb(source, length, mult)`	source, period, multiplier
Bollinger Band Width	`ta.bbw(source, length, mult)`	source, period, multiplier
Keltner Channel	`ta.kc(high, low, close, length, mult)`	period, multiplier
Keltner Channel Width	`ta.kcw(high, low, close, length, mult)`	period, multiplier
Directional Movement Index	`ta.dmi(high, low, close, length)`	period
Standard Deviation	`ta.stdev(source, length)`	source, period
Parabolic SAR	`ta.sar(high, low, start, inc, max)`	start, increment, max
Center of Gravity	`ta.cog(source, length)`	source, period

Volume

Indicator	Function	Parameters
On-Balance Volume	`ta.obv(close, volume)`	—
Accumulation/Distribution	`ta.ad(high, low, close, volume)`	—
Price Volume Trend	`ta.pvt(close, volume)`	—
Williams A/D	`ta.wad(high, low, close)`	—
VWAP	`ta.vwap(high, low, close, volume)`	—

Utility

Indicator	Function	Parameters
Highest	`ta.highest(source, length)`	source, period
Lowest	`ta.lowest(source, length)`	source, period
Change	`ta.change(source, length)`	source, period
Median	`ta.median(source, length)`	source, period
Range	`ta.range(high, low)`	—
Linear Regression	`ta.linreg(source, length, offset)`	source, period, offset
Rising	`ta.rising(source, length)`	source, period
Falling	`ta.falling(source, length)`	source, period
Cumulative Sum	`ta.cum(source)`	source

Cross Detection

Function	Returns `True` When
`ta.crossover(a, b)`	`a` crosses above `b` (`a > b` and `a.shift(1) <= b.shift(1)`)
`ta.crossunder(a, b)`	`a` crosses below `b` (`a < b` and `a.shift(1) >= b.shift(1)`)
`ta.cross(a, b)`	Either crossover or crossunder
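To make the shift semantics concrete, here is a plain-list sketch of the two conditions. It is for illustration only: the library itself operates on pd.Series without Python loops, whereas this version loops over lists so it can run without pandas.

```python
def crossover(a: list[float], b: list[float]) -> list[bool]:
    """True where a is above b on this bar and was at or below b on the previous bar."""
    result = [False]  # the first bar has no previous bar to compare against
    for i in range(1, len(a)):
        # Comparisons involving NaN evaluate to False, so warmup bars never signal.
        result.append(a[i] > b[i] and a[i - 1] <= b[i - 1])
    return result


def crossunder(a: list[float], b: list[float]) -> list[bool]:
    """True where a is below b on this bar and was at or above b on the previous bar."""
    result = [False]
    for i in range(1, len(a)):
        result.append(a[i] < b[i] and a[i - 1] >= b[i - 1])
    return result


def cross(a: list[float], b: list[float]) -> list[bool]:
    """True on bars where either a crossover or a crossunder occurs."""
    return [up or down for up, down in zip(crossover(a, b), crossunder(a, b))]
```

Because the previous-bar comparison is non-strict, a series that touches the other and then moves through it still registers as a cross on the bar where it breaks away.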

Compiled Strategy Object

The final output of the compilation pipeline:

from dataclasses import dataclass
from typing import Callable

@dataclass
class TransformedStrategy:
    name: str                       # From strategy("name", ...)
    inputs: dict[str, InputParam]   # {paramTitle: IntInput(default=12, ...), ...}
    compute: Callable               # (df, params) -> (le, lx, se, sx)
    compute_fast: Callable | None   # (opens, highs, lows, closes, vols, params) -> 4 bools
    warmup: int                     # max(all_indicator_periods) * 2
    source_code: str                # Original PineScript source
    generated_code: str             # Generated Python (for debugging)
    settings: dict                  # {initial_capital, commission, slippage}

Warmup Calculation

The compiler scans all indicator period arguments and sets warmup = max(periods) * 2. This is conservative — EMA technically needs infinite history, but 2× the longest period is practical. The backtester skips the first warmup bars to avoid NaN-contaminated signals.
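The rule reduces to a one-line function. The name `compute_warmup` is hypothetical, and returning 0 when a strategy uses no indicator periods is an assumption added so the example handles the empty case.

```python
def compute_warmup(periods: list[int]) -> int:
    """Warmup bars: twice the longest indicator period found in the strategy."""
    return max(periods, default=0) * 2


macd_warmup = compute_warmup([12, 26, 9])  # longest period is 26
rsi_warmup = compute_warmup([14])
```

For the MACD example earlier in this page, the periods 12, 26, and 9 give a warmup of 52 bars, matching the value shown in the pipeline diagram.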

Security

The generated Python executes via exec() in a restricted namespace:

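# ta_fast is included because the generated _compute_fast references it.
namespace = {"ta": ta, "ta_fast": ta_fast, "pd": pd, "np": np}
exec(generated_source, namespace)
compute_fn = namespace["_compute"]
compute_fast_fn = namespace.get("_compute_fast")  # None if no fast path was emitted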

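The namespace deliberately excludes os, sys, subprocess, importlib, and all other modules that could enable filesystem or network access. This limits what generated code can reach by name, but it is not a sandbox on its own: exec() inserts __builtins__ into the namespace automatically, so __import__ is still reachable unless builtins are overridden. The actual safeguard is that the compiler only generates code from the builder's constrained PineScript subset and does not accept arbitrary user code.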

Untrusted Input

If the compiler is ever exposed to arbitrary PineScript from untrusted users (beyond the builder's constrained output), additional sandboxing is required: RestrictedPython, subprocess isolation, or WASM execution. The current exec() approach is safe only because the builder generates a predictable, auditable subset of PineScript.
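The limits of a namespace-only restriction can be checked directly with standard Python behavior. The snippet below is a small demonstration, not the project's code; the generated source string is a stand-in, and emptying `__builtins__` is shown only to illustrate the mechanism, since it still does not amount to a sandbox for untrusted input.

```python
# Stand-in for compiler output; the real generated source is much longer.
generated_source = "def _compute(x):\n    return x + 1\n"

namespace = {}
exec(generated_source, namespace)
compute_fn = namespace["_compute"]

# exec() adds __builtins__ to any globals dict that lacks it,
# so import machinery is reachable from inside the executed code.
has_builtins = "__builtins__" in namespace

# Supplying an empty __builtins__ makes a plain import statement fail,
# because the import statement looks up __import__ in builtins.
locked_namespace = {"__builtins__": {}}
try:
    exec("import os", locked_namespace)
    import_blocked = False
except ImportError:
    import_blocked = True
```

Overriding builtins blocks the obvious route, but object introspection can still reach restricted functionality, which is why process-level isolation is the safer option for untrusted scripts.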

PineScript Subset: In Scope vs. Out of Scope

In Scope	Out of Scope
`strategy()` declaration	`for` / `while` loops (hard to vectorize)
`input.int()`, `input.float()`, `input.bool()`, `input.string()`	`var` / `varip` (persistent state)
Variable assignments	`request.security()` (multi-timeframe)
Tuple destructuring `[a, b, c] = f()`	`plot()`, `plotshape()` (visual only)
`ta.*` indicator calls (37 indicators)	User-defined functions
Boolean expressions (`and`, `or`, `not`)	`array.` / `matrix.` types
`if` blocks with `strategy.entry/close/exit`	`switch` and ternary (`?:`) expressions
`math.*` functions	String manipulation
`nz()`, `na()`	Type casting

File Map

Concept	File
Public API (`transform_pinescript`)	`backtest/pine/__init__.py`
Tokenizer	`backtest/pine/tokens.py`
Recursive descent parser	`backtest/pine/parser.py`
AST node definitions	`backtest/pine/ast_nodes.py`
Code generator (AST → Python)	`backtest/pine/codegen.py`
TA indicator library (37 indicators)	`backtest/ta.py`
TransformedStrategy dataclass	`backtest/strategy.py`
Input parameter types	`backtest/strategy.py` (`IntInput`, `FloatInput`, etc.)
Sample `.pine` strategies	`backtest/strategies/`