
PineScript v6 Compiler

A custom 4-stage compilation pipeline that transforms PineScript v6 source code into vectorized Python functions. This is not a general-purpose PineScript interpreter — it is optimized for the rigid, predictable structure generated by the visual strategy builder. The compiler handles the subset of PineScript v6 needed for signal-based strategies: strategy() declarations, input.*() parameters, ta.* indicator calls, boolean expressions, and if blocks with strategy.entry()/strategy.close().


Compilation Pipeline

flowchart TB
    A[".pine Source"] --> B["Tokenizer"]
    B --> C["Token Stream"]
    C --> D["Parser"]
    D --> E["AST"]
    E --> F["Code Generator"]
    F --> G["Python Source<br/>_compute + _compute_fast"]
    G --> H["TransformedStrategy"]

flowchart TB
    subgraph Input
        SRC["strategy('MACD Cross', overlay=true)<br/>fast = input.int(12, 'Fast')<br/>[m, s, _] = ta.macd(close, fast, 26, 9)<br/>if ta.crossover(m, s)<br/>    strategy.entry('Long', strategy.long)"]
    end

    subgraph Stage1["Stage 1 — Tokenizer"]
        TOK["KEYWORD:strategy LPAREN STRING:'MACD Cross' ...<br/>IDENT:fast ASSIGN KEYWORD:input DOT IDENT:int ...<br/>LBRACKET IDENT:m COMMA IDENT:s COMMA ...<br/>KEYWORD:if IDENT:ta DOT IDENT:crossover ...<br/>INDENT IDENT:strategy DOT IDENT:entry ..."]
    end

    subgraph Stage2["Stage 2 — Parser"]
        AST["Program<br/>├── StrategyDecl(name='MACD Cross', overlay=true)<br/>├── InputDecl(fast, int, default=12)<br/>├── Assignment([m, s, _] = ta.macd(...))<br/>└── IfBlock<br/>    ├── condition: ta.crossover(m, s)<br/>    └── body: strategy.entry('Long', long)"]
    end

    subgraph Stage3["Stage 3 — Code Generator"]
        PY["def _compute(df, params):<br/>    _close = df['close']<br/>    fast = params.get('Fast', 12)<br/>    (m, s, _) = ta.macd(_close, fast, 26, 9)<br/>    long_entry = ta.crossover(m, s).fillna(False)<br/>    return long_entry, long_exit, short_entry, short_exit"]
    end

    subgraph Output
        OBJ["TransformedStrategy<br/>name = 'MACD Cross'<br/>compute = _compute<br/>warmup = 52<br/>inputs = {Fast: IntInput(12)}"]
    end

    Input --> Stage1 --> Stage2 --> Stage3 --> Output

Stage 1: Tokenizer

File: backtest/pine/tokens.py

The tokenizer converts raw PineScript source into a flat stream of typed tokens. It handles several PineScript-specific behaviors:

| Behavior | Description | Example |
| --- | --- | --- |
| Comment stripping | Removes `//` line comments, preserving `//` inside string literals | `rsi = ta.rsi(close, 14) // lookback` → strips comment |
| Continuation joining | Joins lines with unbalanced parentheses into a single logical line | `[a, b, c] = ta.macd(close,`<br/>`12, 26, 9)` → one line |
| Indent tracking | Emits INDENT/DEDENT tokens for PineScript's if-block structure | `if condition`<br/>`    strategy.entry(...)` |
| Keyword recognition | Distinguishes keywords (`if`, `and`, `or`, `not`, `true`, `false`) from identifiers | `if` → KEYWORD, `rsi_val` → IDENT |
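
The first two behaviors in the table can be sketched in a few lines of Python. This is an illustrative sketch, not the code in backtest/pine/tokens.py: the function names `strip_comment` and `join_continuations` are hypothetical, and the bracket counting is simplified (it does not ignore brackets that appear inside string literals).

```python
def strip_comment(line: str) -> str:
    """Remove a trailing // comment, ignoring // inside string literals."""
    quote = None  # quote character of the string we are currently inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif line.startswith("//", i):
            return line[:i].rstrip()
        i += 1
    return line.rstrip()


def join_continuations(lines: list[str]) -> list[str]:
    """Merge physical lines until parentheses and brackets are balanced."""
    logical, buffer, depth = [], "", 0
    for raw in lines:
        line = strip_comment(raw)
        if depth > 0:
            buffer += " " + line.strip()  # continuation of an open call
        else:
            buffer = line                 # start of a new logical line
        # Simplification: brackets inside string literals are counted too.
        depth += sum(line.count(c) for c in "([") - sum(line.count(c) for c in ")]")
        if depth <= 0:
            logical.append(buffer)
            buffer, depth = "", 0
    if buffer:
        logical.append(buffer)  # unterminated call: keep what we have
    return logical


print(join_continuations(["[a, b, c] = ta.macd(close,", "     12, 26, 9) // MACD"]))
# ['[a, b, c] = ta.macd(close, 12, 26, 9)']
```

Stripping comments before joining means a comment in the middle of a multi-line call cannot swallow the continuation.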

Token Types

NUMBER      12, 3.14, 0.5
STRING      "MACD Cross", 'Long'
IDENT       rsi_val, macdLine, fast_length
KEYWORD     if, and, or, not, true, false, strategy, input
DOT         .
COMMA       ,
ASSIGN      =
LPAREN (    RPAREN )
LBRACKET [  RBRACKET ]
COMPARE     >, <, >=, <=, ==, !=
OPERATOR    +, -, *, /
INDENT      (indentation increase)
DEDENT      (indentation decrease)
NEWLINE     (logical line boundary)
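
One common way to produce these token types for a single logical line is a master regular expression with named groups. The sketch below assumes that approach; the names `TOKEN_SPEC` and `tokenize_line` are hypothetical, the keyword set is copied from the list above, and INDENT/DEDENT/NEWLINE handling is left out for brevity.

```python
import re

KEYWORDS = {"if", "and", "or", "not", "true", "false", "strategy", "input"}

# Order matters: COMPARE must come before ASSIGN so "==" is not read as two "=".
TOKEN_SPEC = [
    ("NUMBER",   r"\d+\.\d+|\d+"),
    ("STRING",   r"\"[^\"]*\"|'[^']*'"),
    ("IDENT",    r"[A-Za-z_]\w*"),
    ("COMPARE",  r">=|<=|==|!=|>|<"),
    ("ASSIGN",   r"="),
    ("OPERATOR", r"[+\-*/]"),
    ("DOT",      r"\."),
    ("COMMA",    r","),
    ("LPAREN",   r"\("),
    ("RPAREN",   r"\)"),
    ("LBRACKET", r"\["),
    ("RBRACKET", r"\]"),
    ("SKIP",     r"[ \t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize_line(line):
    """Return a list of (kind, text) pairs for one logical line."""
    tokens, pos = [], 0
    while pos < len(line):
        m = MASTER.match(line, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at column {pos}")
        kind, text = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue  # whitespace inside a line carries no meaning
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"  # reclassify reserved words
        tokens.append((kind, text))
    return tokens


print(tokenize_line("rsi = ta.rsi(close, 14)"))
```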

Stage 2: Parser

File: backtest/pine/parser.py

A recursive descent parser that consumes the token stream and produces an Abstract Syntax Tree. The parser recognizes the following PineScript constructs:

| AST Node | PineScript Construct | Example |
| --- | --- | --- |
| StrategyDecl | strategy() declaration | `strategy("Name", overlay=true, initial_capital=10000)` |
| InputDecl | input.*() parameter definition | `fast = input.int(12, "Fast Length")` |
| Assignment | Variable assignment (single or tuple destructuring) | `rsi = ta.rsi(close, 14)` or `[m, s, h] = ta.macd(...)` |
| IfBlock | if block with strategy.entry/close/exit | `if longCond`<br/>`    strategy.entry("Long", strategy.long)` |
| FunctionCall | ta.*, math.*, nz(), na() calls | `ta.crossover(macdLine, signalLine)` |
| BinaryOp | Boolean and arithmetic expressions | `rsi > 70 and macd > 0` |
| UnaryOp | Negation | `not condition` |

AST Structure Example

For a simple RSI strategy:

Program(
    strategy=StrategyDecl(
        name="RSI Overbought/Oversold",
        settings={"overlay": True, "initial_capital": 10000}
    ),
    inputs=[
        InputDecl(var="length", type="int", default=14, title="RSI Length"),
        InputDecl(var="upper", type="int", default=70, title="Overbought"),
        InputDecl(var="lower", type="int", default=30, title="Oversold"),
    ],
    assignments=[
        Assignment(var="rsi_val", expr=FunctionCall("ta.rsi", [Ident("close"), Ident("length")])),
    ],
    blocks=[
        IfBlock(
            condition=BinaryOp(Ident("rsi_val"), "<", Ident("lower")),
            body=[StrategyAction("entry", "Long", "strategy.long")]
        ),
        IfBlock(
            condition=BinaryOp(Ident("rsi_val"), ">", Ident("upper")),
            body=[StrategyAction("entry", "Short", "strategy.short")]
        ),
    ]
)
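
To illustrate the recursive descent technique, here is a minimal sketch that parses only boolean expressions into nested tuples shaped like the BinaryOp/UnaryOp nodes above. It is not the parser in backtest/pine/parser.py: the `ExprParser` class, the `lex` helper, and the tuple representation are assumptions made for the example. Each grammar rule becomes one method, and precedence falls out of which method calls which (`or` lowest, then `and`, then `not`, then comparisons).

```python
import re


def lex(src):
    """Tiny stand-in lexer: comparison operators, parentheses, numbers, words."""
    return re.findall(r">=|<=|==|!=|[<>()]|\d+(?:\.\d+)?|\w+", src)


class ExprParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_or(self):  # or_expr := and_expr ("or" and_expr)*
        node = self.parse_and()
        while self.peek() == "or":
            self.advance()
            node = ("BinaryOp", node, "or", self.parse_and())
        return node

    def parse_and(self):  # and_expr := not_expr ("and" not_expr)*
        node = self.parse_not()
        while self.peek() == "and":
            self.advance()
            node = ("BinaryOp", node, "and", self.parse_not())
        return node

    def parse_not(self):  # not_expr := "not" not_expr | comparison
        if self.peek() == "not":
            self.advance()
            return ("UnaryOp", "not", self.parse_not())
        return self.parse_comparison()

    def parse_comparison(self):  # comparison := atom (COMPARE atom)?
        left = self.parse_atom()
        if self.peek() in (">", "<", ">=", "<=", "==", "!="):
            op = self.advance()
            return ("BinaryOp", left, op, self.parse_atom())
        return left

    def parse_atom(self):  # atom := NUMBER | IDENT | "(" or_expr ")"
        tok = self.advance()
        if tok is None:
            raise SyntaxError("unexpected end of expression")
        if tok == "(":
            node = self.parse_or()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok[0].isdigit():
            return ("Number", float(tok))
        return ("Ident", tok)


tree = ExprParser(lex("rsi > 70 and macd > 0")).parse_or()
print(tree)
```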

Stage 3: Code Generator

File: backtest/pine/codegen.py

The code generator walks the AST and emits two Python functions:

_compute(df, params) — Pandas Path

Full DataFrame operations. Used by standard backtesting mode where the strategy runs once over the entire dataset.

def _compute(df, params):
    _open  = df['open']
    _high  = df['high']
    _low   = df['low']
    _close = df['close']
    _volume = df['volume']

    length = params.get('RSI Length', 14)
    upper  = params.get('Overbought', 70)
    lower  = params.get('Oversold', 30)

    rsi_val = ta.rsi(_close, length)

    long_entry  = (rsi_val < lower).fillna(False)
    long_exit   = pd.Series(False, index=df.index)
    short_entry = (rsi_val > upper).fillna(False)
    short_exit  = pd.Series(False, index=df.index)

    return long_entry, long_exit, short_entry, short_exit

_compute_fast(opens, highs, lows, closes, volumes, params) — NumPy Fast Path

Scalar-only operations on raw NumPy arrays. Returns 4 scalar booleans for the last bar only. Used by the magnifier's inner loop where compute is called potentially thousands of times per backtest — once per sub-bar.

def _compute_fast(opens, highs, lows, closes, volumes, params):
    length = params.get('RSI Length', 14)
    upper  = params.get('Overbought', 70)
    lower  = params.get('Oversold', 30)

    rsi_val = ta_fast.rsi(closes, length)

    long_entry  = rsi_val < lower
    long_exit   = False
    short_entry = rsi_val > upper
    short_exit  = False

    return long_entry, long_exit, short_entry, short_exit

Key Transformations

The code generator performs several critical translations to bridge PineScript semantics with vectorized Python:

Price Builtins → DataFrame Columns

| PineScript | Generated Python | Reason |
| --- | --- | --- |
| `close` | `_close` (alias for `df['close']`) | Consistency with `_open` |
| `open` | `_open` (alias for `df['open']`) | `open` is a Python builtin |
| `high` | `_high` | Consistency |
| `low` | `_low` | Consistency |
| `volume` | `_volume` | Consistency |
| `hlc3` | `(_high + _low + _close) / 3` | Derived price source |
| `ohlc4` | `(_open + _high + _low + _close) / 4` | Derived price source |

Implicit Argument Injection

PineScript's ta.atr(14) implicitly uses high, low, close. The generated Python must make these explicit:

IMPLICIT_ARGS = {
    "atr":        ("_high", "_low", "_close"),
    "supertrend": ("_high", "_low", "_close"),
    "sar":        ("_high", "_low"),
    "dmi":        ("_high", "_low", "_close"),
    "obv":        ("_close", "_volume"),
    "mfi":        ("_high", "_low", "_close", "_volume"),
    "vwap":       ("_high", "_low", "_close", "_volume"),
    "ad":         ("_high", "_low", "_close", "_volume"),
    "wad":        ("_high", "_low", "_close"),
}

| PineScript | Generated Python |
| --- | --- |
| `ta.atr(14)` | `ta.atr(_high, _low, _close, 14)` |
| `ta.rsi(close, 14)` | `ta.rsi(_close, 14)` |
| `ta.obv()` | `ta.obv(_close, _volume)` |
| `ta.supertrend(3, 10)` | `ta.supertrend(_high, _low, _close, 3, 10)` |
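
The rewrite in this table amounts to prepending the implicit series and aliasing any price builtins among the explicit arguments. A minimal sketch, assuming arguments arrive as already-rendered strings (the helper name `emit_ta_call` and the `PRICE_ALIASES` dict are hypothetical, and only a subset of IMPLICIT_ARGS is repeated here):

```python
# Subset of the IMPLICIT_ARGS table, repeated so the example is self-contained.
IMPLICIT_ARGS = {
    "atr":        ("_high", "_low", "_close"),
    "supertrend": ("_high", "_low", "_close"),
    "obv":        ("_close", "_volume"),
}

# Hypothetical alias map mirroring the price-builtin table.
PRICE_ALIASES = {
    "open": "_open", "high": "_high", "low": "_low",
    "close": "_close", "volume": "_volume",
}


def emit_ta_call(func, args):
    """Render a ta.* call with implicit series first, then the explicit args."""
    explicit = [PRICE_ALIASES.get(a, a) for a in args]
    implicit = list(IMPLICIT_ARGS.get(func, ()))
    return f"ta.{func}({', '.join(implicit + explicit)})"


print(emit_ta_call("atr", ["14"]))           # ta.atr(_high, _low, _close, 14)
print(emit_ta_call("rsi", ["close", "14"]))  # ta.rsi(_close, 14)
print(emit_ta_call("obv", []))               # ta.obv(_close, _volume)
```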

Boolean Operators

| PineScript | Generated Python | Reason |
| --- | --- | --- |
| `a and b` | `(a) & (b)` | pandas Series require bitwise `&`, not `and` |
| `a or b` | `(a) \| (b)` | pandas Series require bitwise `\|`, not `or` |
| `not a` | `~(a)` | Bitwise NOT for Series |

Parenthesization is critical

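Python's bitwise operators bind more tightly than comparisons, so without explicit parentheses rsi > 70 & macd > 0 parses as the chained comparison rsi > (70 & macd) > 0 rather than (rsi > 70) & (macd > 0). (Between the bitwise operators themselves, & binds more tightly than |, so a & b | c is evaluated as (a & b) | c.) The codegen wraps every operand: (a) & (b), (a) | (b).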
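
The effect of wrapping comparison operands can be checked with Python's own parser. The snippet below uses the standard `ast` module to compare how an unwrapped and a wrapped condition are parsed; the variable names `rsi` and `macd` are placeholders.

```python
import ast

# Unwrapped: & binds tighter than >, so Python sees one chained comparison,
# rsi > (70 & macd) > 0, with two comparison operators.
unwrapped = ast.parse("rsi > 70 & macd > 0", mode="eval").body

# Wrapped, as the codegen emits it: a bitwise AND of two separate comparisons.
wrapped = ast.parse("(rsi > 70) & (macd > 0)", mode="eval").body

print(type(unwrapped).__name__, len(unwrapped.ops))        # Compare 2
print(type(wrapped).__name__, type(wrapped.op).__name__)   # BinOp BitAnd
```

On pandas Series the unwrapped form typically fails outright or produces a meaningless result, which is why wrapping every operand is the safer default for generated code.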

Other Translations

| PineScript | Generated Python |
| --- | --- |
| `math.abs(x)` | `np.abs(x)` |
| `math.max(a, b)` | `np.maximum(a, b)` |
| `math.min(a, b)` | `np.minimum(a, b)` |
| `math.sqrt(x)` | `np.sqrt(x)` |
| `nz(x)` | `x.fillna(0)` |
| `na(x)` | `x.isna()` |
| `[a, b, c] = f()` | `(a, b, c) = f()` |
| `true` / `false` | `True` / `False` |

NaN Safety

Every signal condition is wrapped with .fillna(False). Indicators return NaN during their warmup period (e.g., the first 14 bars for RSI-14), and NaN must never propagate as a True signal.


Why Two Compute Paths

| Path | Used By | Input | Output | Overhead |
| --- | --- | --- | --- | --- |
| `_compute()` | Standard backtest | pd.DataFrame (full dataset) | 4 × pd.Series (boolean) | DataFrame allocation, index alignment |
| `_compute_fast()` | Magnifier inner loop | 5 × np.ndarray (raw arrays) | 4 × bool (scalars) | Minimal (pure NumPy) |

The magnifier recomputes signals on every sub-bar — potentially 10,000+ calls per backtest (1,000 chart bars × 10 sub-bars each). At that volume, pandas DataFrame overhead dominates:

Standard path:  ~2ms per call × 10,000 = 20 seconds
Fast path:      ~0.1ms per call × 10,000 = 1 second

The fast path eliminates pandas entirely — raw NumPy arrays in, scalar booleans out. When _compute_fast is not available (complex strategies with unsupported operations), the magnifier falls back to _compute with a performance penalty.


Supported Indicators

The TA indicator library (backtest/ta.py) implements 37 indicators as static methods on a ta class. All accept and return pd.Series, use vectorized operations (no Python loops), and are verified against TradingView output.

Trend

| Indicator | Function | Parameters |
| --- | --- | --- |
| Simple Moving Average | `ta.sma(source, length)` | source, period |
| Exponential Moving Average | `ta.ema(source, length)` | source, period |
| Weighted Moving Average | `ta.wma(source, length)` | source, period |
| Volume-Weighted MA | `ta.vwma(source, volume, length)` | source, volume, period |
| Hull Moving Average | `ta.hma(source, length)` | source, period |
| Running Moving Average | `ta.rma(source, length)` | source, period |
| Arnaud Legoux MA | `ta.alma(source, length, offset, sigma)` | source, period, offset, sigma |
| Symmetrically-Weighted MA | `ta.swma(source)` | source |
| SuperTrend | `ta.supertrend(high, low, close, factor, period)` | factor, ATR period |

Momentum

| Indicator | Function | Parameters |
| --- | --- | --- |
| Relative Strength Index | `ta.rsi(source, length)` | source, period |
| MACD | `ta.macd(source, fast, slow, signal)` | source, fast/slow/signal periods |
| Stochastic | `ta.stoch(high, low, close, k, d, smooth)` | K period, D period, smoothing |
| Commodity Channel Index | `ta.cci(high, low, close, length)` | period |
| Money Flow Index | `ta.mfi(high, low, close, volume, length)` | period |
| Chande Momentum Oscillator | `ta.cmo(source, length)` | source, period |
| Rate of Change | `ta.roc(source, length)` | source, period |
| True Strength Index | `ta.tsi(source, long, short)` | source, long/short periods |
| Momentum | `ta.mom(source, length)` | source, period |
| Williams %R | `ta.wpr(high, low, close, length)` | period |
| Percent Rank | `ta.percentrank(source, length)` | source, period |

Volatility

| Indicator | Function | Parameters |
| --- | --- | --- |
| Average True Range | `ta.atr(high, low, close, length)` | period |
| Bollinger Bands | `ta.bb(source, length, mult)` | source, period, multiplier |
| Bollinger Band Width | `ta.bbw(source, length, mult)` | source, period, multiplier |
| Keltner Channel | `ta.kc(high, low, close, length, mult)` | period, multiplier |
| Keltner Channel Width | `ta.kcw(high, low, close, length, mult)` | period, multiplier |
| Directional Movement Index | `ta.dmi(high, low, close, length)` | period |
| Standard Deviation | `ta.stdev(source, length)` | source, period |
| Parabolic SAR | `ta.sar(high, low, start, inc, max)` | start, increment, max |
| Center of Gravity | `ta.cog(source, length)` | source, period |

Volume

| Indicator | Function | Parameters |
| --- | --- | --- |
| On-Balance Volume | `ta.obv(close, volume)` | |
| Accumulation/Distribution | `ta.ad(high, low, close, volume)` | |
| Price Volume Trend | `ta.pvt(close, volume)` | |
| Williams A/D | `ta.wad(high, low, close)` | |
| VWAP | `ta.vwap(high, low, close, volume)` | |

Utility

| Indicator | Function | Parameters |
| --- | --- | --- |
| Highest | `ta.highest(source, length)` | source, period |
| Lowest | `ta.lowest(source, length)` | source, period |
| Change | `ta.change(source, length)` | source, period |
| Median | `ta.median(source, length)` | source, period |
| Range | `ta.range(high, low)` | |
| Linear Regression | `ta.linreg(source, length, offset)` | source, period, offset |
| Rising | `ta.rising(source, length)` | source, period |
| Falling | `ta.falling(source, length)` | source, period |
| Cumulative Sum | `ta.cum(source)` | source |

Cross Detection

| Function | Returns True When |
| --- | --- |
| `ta.crossover(a, b)` | a crosses above b (`a > b and a.shift(1) <= b.shift(1)`) |
| `ta.crossunder(a, b)` | a crosses below b (`a < b and a.shift(1) >= b.shift(1)`) |
| `ta.cross(a, b)` | Either crossover or crossunder |
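
The cross conditions compare the current bar with the previous one. The sketch below spells that out on plain Python lists so it runs without pandas; it is an illustration of the definitions, not the vectorized implementation in backtest/ta.py, and the first bar is reported as False because it has no predecessor.

```python
def crossover(a, b):
    """True where a is above b now and was at or below b on the previous bar."""
    out = [False] * len(a)  # bar 0 has no previous bar, so it can never cross
    for i in range(1, len(a)):
        out[i] = a[i] > b[i] and a[i - 1] <= b[i - 1]
    return out


def crossunder(a, b):
    """True where a is below b now and was at or above b on the previous bar."""
    out = [False] * len(a)
    for i in range(1, len(a)):
        out[i] = a[i] < b[i] and a[i - 1] >= b[i - 1]
    return out


def cross(a, b):
    """True where either a crossover or a crossunder occurs."""
    return [up or down for up, down in zip(crossover(a, b), crossunder(a, b))]


fast = [1, 2, 3, 2, 1]
slow = [2, 2, 2, 2, 2]
print(crossover(fast, slow))   # [False, False, True, False, False]
print(crossunder(fast, slow))  # [False, False, False, False, True]
```

Note that touching the other series (bar 1, where both equal 2) does not count as a cross; the cross is reported on the bar where the strict inequality first holds.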

Compiled Strategy Object

The final output of the compilation pipeline:

@dataclass
class TransformedStrategy:
    name: str                       # From strategy("name", ...)
    inputs: dict[str, InputParam]   # {paramTitle: IntInput(default=12, ...), ...}
    compute: Callable               # (df, params) -> (le, lx, se, sx)
    compute_fast: Callable | None   # (opens, highs, lows, closes, vols, params) -> 4 bools
    warmup: int                     # max(all_indicator_periods) * 2
    source_code: str                # Original PineScript source
    generated_code: str             # Generated Python (for debugging)
    settings: dict                  # {initial_capital, commission, slippage}

Warmup Calculation

The compiler scans all indicator period arguments and sets warmup = max(periods) * 2. This is conservative for window-based indicators such as SMA, which need only `length` bars. An EMA technically depends on all prior history, so twice the longest period serves as a practical cutoff. The backtester skips the first warmup bars to avoid NaN-contaminated signals.
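
As a worked example of the rule, the following sketch computes the warmup for MACD(12, 26, 9), which gives the 52 bars shown in the pipeline diagram. The function name `compute_warmup` is hypothetical, and how the real compiler collects the period arguments from the AST is not shown here.

```python
def compute_warmup(periods, factor=2):
    """Return the number of leading bars to skip: factor times the longest period."""
    return max(periods, default=0) * factor


# MACD(12, 26, 9): the slow period (26) dominates.
warmup = compute_warmup([12, 26, 9])
print(warmup)  # 52

# The backtester would then ignore signals on the first `warmup` bars.
signals = [False] * 100
tradable = signals[warmup:]
print(len(tradable))  # 48
```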


Security

The generated Python executes via exec() in a restricted namespace:

namespace = {"ta": ta, "pd": pd, "np": np}
exec(generated_source, namespace)
compute_fn = namespace["_compute"]

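The namespace deliberately omits os, sys, subprocess, importlib, and similar modules. This limits what generated code can reach by name, but it is not a sandbox on its own: exec() adds __builtins__ to any globals dict that lacks it, so __import__ and open() remain callable, and pd and np themselves expose file I/O. The actual safeguard is that the compiler only generates code from the builder's constrained PineScript subset; it does not accept arbitrary user code.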

Untrusted Input

If the compiler is ever exposed to arbitrary PineScript from untrusted users (beyond the builder's constrained output), additional sandboxing is required: RestrictedPython, subprocess isolation, or WASM execution. The current exec() approach is safe only because the builder generates a predictable, auditable subset of PineScript.
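
To see why the namespace alone is not a security boundary, the sketch below shows that exec() injects `__builtins__` into a globals dict that lacks it, and that supplying an empty `__builtins__` removes `__import__` from the executed code. The `_probe` function is a made-up stand-in for generated code. Emptying the builtins is only a small hardening step, not a sandbox: attribute access on any object passed in (including pd and np) remains available, so the isolation options listed above still apply.

```python
probe_source = "def _probe():\n    return __import__('os').name\n"

# Default behavior: exec() inserts __builtins__, so imports are still possible.
open_ns = {}
exec(probe_source, open_ns)
print("__builtins__" in open_ns)   # True
print(open_ns["_probe"]())         # the os.name string, e.g. 'posix' or 'nt'

# Hardened variant: with an explicit empty __builtins__, __import__ is undefined.
locked_ns = {"__builtins__": {}}
exec(probe_source, locked_ns)
try:
    locked_ns["_probe"]()
    blocked = False
except NameError:
    blocked = True
print(blocked)                     # True
```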


PineScript Subset — In Scope vs. Out of Scope

| In Scope | Out of Scope |
| --- | --- |
| strategy() declaration | for / while loops (hard to vectorize) |
| input.int(), input.float(), input.bool(), input.string() | var / varip (persistent state) |
| Variable assignments | request.security() (multi-timeframe) |
| Tuple destructuring `[a, b, c] = f()` | plot(), plotshape() (visual only) |
| ta.* indicator calls (37 indicators) | User-defined functions |
| Boolean expressions (and, or, not) | array.* / matrix.* types |
| if blocks with strategy.entry/close/exit | switch / ternary expressions |
| math.* functions | String manipulation |
| nz(), na() | Type casting |

File Map

| Concept | File |
| --- | --- |
| Public API (transform_pinescript) | backtest/pine/__init__.py |
| Tokenizer | backtest/pine/tokens.py |
| Recursive descent parser | backtest/pine/parser.py |
| AST node definitions | backtest/pine/ast_nodes.py |
| Code generator (AST → Python) | backtest/pine/codegen.py |
| TA indicator library (37 indicators) | backtest/ta.py |
| TransformedStrategy dataclass | backtest/strategy.py |
| Input parameter types | backtest/strategy.py (IntInput, FloatInput, etc.) |
| Sample .pine strategies | backtest/strategies/ |