PineScript v6 Compiler¶
A custom 4-stage compilation pipeline that transforms PineScript v6 source code into vectorized Python functions. This is not a general-purpose PineScript interpreter — it is optimized for the rigid, predictable structure generated by the visual strategy builder. The compiler handles the subset of PineScript v6 needed for signal-based strategies: strategy() declarations, input.*() parameters, ta.* indicator calls, boolean expressions, and if blocks with strategy.entry()/strategy.close().
Compilation Pipeline¶
flowchart TB
A[".pine Source"] --> B["Tokenizer"]
B --> C["Token Stream"]
C --> D["Parser"]
D --> E["AST"]
E --> F["Code Generator"]
F --> G["Python Source<br/>_compute + _compute_fast"]
G --> H["TransformedStrategy"]
flowchart TB
subgraph Input
SRC["strategy('MACD Cross', overlay=true)<br/>fast = input.int(12, 'Fast')<br/>[m, s, _] = ta.macd(close, fast, 26, 9)<br/>if ta.crossover(m, s)<br/> strategy.entry('Long', strategy.long)"]
end
subgraph Stage1["Stage 1 — Tokenizer"]
TOK["KEYWORD:strategy LPAREN STRING:'MACD Cross' ...<br/>IDENT:fast ASSIGN KEYWORD:input DOT IDENT:int ...<br/>LBRACKET IDENT:m COMMA IDENT:s COMMA ...<br/>KEYWORD:if IDENT:ta DOT IDENT:crossover ...<br/>INDENT KEYWORD:strategy DOT IDENT:entry ..."]
end
subgraph Stage2["Stage 2 — Parser"]
AST["Program<br/>├── StrategyDecl(name='MACD Cross', overlay=true)<br/>├── InputDecl(fast, int, default=12)<br/>├── Assignment([m, s, _] = ta.macd(...))<br/>└── IfBlock<br/> ├── condition: ta.crossover(m, s)<br/> └── body: strategy.entry('Long', long)"]
end
subgraph Stage3["Stage 3 — Code Generator"]
PY["def _compute(df, params):<br/> _close = df['close']<br/> fast = params.get('Fast', 12)<br/> (m, s, _) = ta.macd(_close, fast, 26, 9)<br/> long_entry = ta.crossover(m, s).fillna(False)<br/> return long_entry, long_exit, short_entry, short_exit"]
end
subgraph Output
OBJ["TransformedStrategy<br/>name = 'MACD Cross'<br/>compute = _compute<br/>warmup = 52<br/>inputs = {Fast: IntInput(12)}"]
end
Input --> Stage1 --> Stage2 --> Stage3 --> Output
Stage 1: Tokenizer¶
File: backtest/pine/tokens.py
The tokenizer converts raw PineScript source into a flat stream of typed tokens. It handles several PineScript-specific behaviors:
| Behavior | Description | Example |
|---|---|---|
| Comment stripping | Removes // line comments, preserving // inside string literals | rsi = ta.rsi(close, 14) // lookback → strips comment |
| Continuation joining | Joins lines with unbalanced parentheses into a single logical line | [a, b, c] = ta.macd(close,<br/>12, 26, 9) → one line |
| Indent tracking | Emits INDENT/DEDENT tokens for PineScript's if-block structure | if condition<br/>strategy.entry(...) |
| Keyword recognition | Distinguishes keywords (if, and, or, not, true, false) from identifiers | if → KEYWORD, rsi_val → IDENT |
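The first two behaviors can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code in backtest/pine/tokens.py: the function names `strip_line_comment` and `join_continuations` are hypothetical, and bracket counting is simplified (brackets inside string literals are counted too).

```python
def strip_line_comment(line: str) -> str:
    """Remove a trailing // comment, ignoring // inside string literals."""
    quote = None  # quote character of the string we are currently inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif line.startswith("//", i):
            return line[:i].rstrip()
        i += 1
    return line.rstrip()


def join_continuations(lines: list[str]) -> list[str]:
    """Merge physical lines until parentheses and brackets balance."""
    logical, pending, depth = [], "", 0
    for raw in lines:
        line = strip_line_comment(raw)
        # Simplification: brackets inside string literals are counted as well.
        depth += sum(line.count(c) for c in "([") - sum(line.count(c) for c in ")]")
        # Keep the first line's indentation; strip it from continuation lines.
        pending = f"{pending} {line.strip()}" if pending else line
        if depth <= 0:
            logical.append(pending)
            pending, depth = "", 0
    if pending:  # unterminated bracket at end of input
        logical.append(pending)
    return logical
```

Keeping the indentation of the first physical line matters because the indent-tracking step that follows still needs it to decide whether to emit INDENT or DEDENT.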
Token Types¶
NUMBER 12, 3.14, 0.5
STRING "MACD Cross", 'Long'
IDENT rsi_val, macdLine, fast_length
KEYWORD if, and, or, not, true, false, strategy, input
DOT .
COMMA ,
ASSIGN =
LPAREN ( RPAREN )
LBRACKET [ RBRACKET ]
COMPARE >, <, >=, <=, ==, !=
OPERATOR +, -, *, /
INDENT (indentation increase)
DEDENT (indentation decrease)
NEWLINE (logical line boundary)
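A tokenizer for these token types can be sketched with a single regular expression plus an indentation stack. The sketch below is an assumption about how such a tokenizer might look, not the project's implementation: `TOKEN_RE`, `tokenize_line`, and `tokenize` are hypothetical names, string escapes are not handled, and only space indentation is measured.

```python
import re

KEYWORDS = {"if", "and", "or", "not", "true", "false", "strategy", "input"}

# Order matters: two-character comparisons are tried before the single "=".
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+\.\d+|\d+)
  | (?P<STRING>"[^"]*"|'[^']*')
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<COMPARE>>=|<=|==|!=|>|<)
  | (?P<ASSIGN>=)
  | (?P<OPERATOR>[+\-*/])
  | (?P<DOT>\.)
  | (?P<COMMA>,)
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<LBRACKET>\[)
  | (?P<RBRACKET>\])
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize_line(line: str) -> list[tuple[str, str]]:
    """Split one logical line into (type, text) pairs."""
    tokens, pos = [], 0
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at column {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "NAME":  # decide between keyword and identifier
            kind = "KEYWORD" if text in KEYWORDS else "IDENT"
        tokens.append((kind, text))
    return tokens


def tokenize(lines: list[str]) -> list[tuple[str, str]]:
    """Tokenize logical lines, emitting INDENT/DEDENT/NEWLINE around them."""
    tokens, indents = [], [0]
    for line in lines:
        if not line.strip():
            continue  # blank lines carry no tokens
        width = len(line) - len(line.lstrip(" "))
        if width > indents[-1]:
            indents.append(width)
            tokens.append(("INDENT", ""))
        while width < indents[-1]:
            indents.pop()
            tokens.append(("DEDENT", ""))
        tokens.extend(tokenize_line(line))
        tokens.append(("NEWLINE", ""))
    # Close any blocks still open at end of input.
    tokens.extend(("DEDENT", "") for _ in indents[1:])
    return tokens
```

For example, `tokenize_line('fast = input.int(12, "Fast")')` starts with `("IDENT", "fast"), ("ASSIGN", "="), ("KEYWORD", "input"), ("DOT", "."), ("IDENT", "int")`.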
Stage 2: Parser¶
File: backtest/pine/parser.py
A recursive descent parser that consumes the token stream and produces an Abstract Syntax Tree. The parser recognizes the following PineScript constructs:
| AST Node | PineScript Construct | Example |
|---|---|---|
| StrategyDecl | strategy() declaration | strategy("Name", overlay=true, initial_capital=10000) |
| InputDecl | input.*() parameter definition | fast = input.int(12, "Fast Length") |
| Assignment | Variable assignment (single or tuple destructuring) | rsi = ta.rsi(close, 14) or [m, s, h] = ta.macd(...) |
| IfBlock | if block with strategy.entry/close/exit | if longCond<br/>strategy.entry("Long", strategy.long) |
| FunctionCall | ta.*, math.*, nz(), na() calls | ta.crossover(macdLine, signalLine) |
| BinaryOp | Boolean and arithmetic expressions | rsi > 70 and macd > 0 |
| UnaryOp | Negation | not condition |
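To make the recursive descent idea concrete, here is a minimal sketch of an expression parser that builds BinaryOp and UnaryOp nodes from (type, text) token pairs. It is an assumption-laden illustration rather than the contents of backtest/pine/parser.py: the `ExprParser` class and the `Number` node are hypothetical, arithmetic and function calls are omitted, and `not` is treated as a tight-binding unary operator, which is my reading of Pine's precedence rules.

```python
from dataclasses import dataclass


@dataclass
class Ident:
    name: str


@dataclass
class Number:
    value: float


@dataclass
class UnaryOp:
    op: str
    operand: object


@dataclass
class BinaryOp:
    left: object
    op: str
    right: object


class ExprParser:
    """One method per precedence level, lowest first: or, and, comparison, not, atom."""

    def __init__(self, tokens):
        self.tokens = tokens  # list of (type, text) pairs
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_or(self):
        node = self.parse_and()
        while self.peek() == ("KEYWORD", "or"):
            self.advance()
            node = BinaryOp(node, "or", self.parse_and())
        return node

    def parse_and(self):
        node = self.parse_comparison()
        while self.peek() == ("KEYWORD", "and"):
            self.advance()
            node = BinaryOp(node, "and", self.parse_comparison())
        return node

    def parse_comparison(self):
        node = self.parse_unary()
        if self.peek()[0] == "COMPARE":
            op = self.advance()[1]
            node = BinaryOp(node, op, self.parse_unary())
        return node

    def parse_unary(self):
        if self.peek() == ("KEYWORD", "not"):
            self.advance()
            return UnaryOp("not", self.parse_unary())
        return self.parse_atom()

    def parse_atom(self):
        kind, text = self.advance()
        if kind == "NUMBER":
            return Number(float(text))
        if kind == "IDENT":
            return Ident(text)
        if kind == "LPAREN":
            node = self.parse_or()
            if self.advance()[0] != "RPAREN":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")
```

Each grammar level calls the next tighter level for its operands, which is what gives `and` priority over `or` without any explicit precedence table.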
AST Structure Example¶
For a simple RSI strategy:
Program(
    strategy=StrategyDecl(
        name="RSI Overbought/Oversold",
        settings={"overlay": True, "initial_capital": 10000}
    ),
    inputs=[
        InputDecl(var="length", type="int", default=14, title="RSI Length"),
        InputDecl(var="upper", type="int", default=70, title="Overbought"),
        InputDecl(var="lower", type="int", default=30, title="Oversold"),
    ],
    assignments=[
        Assignment(var="rsi_val", expr=FunctionCall("ta.rsi", [Ident("close"), Ident("length")])),
    ],
    blocks=[
        IfBlock(
            condition=BinaryOp(Ident("rsi_val"), "<", Ident("lower")),
            body=[StrategyAction("entry", "Long", "strategy.long")]
        ),
        IfBlock(
            condition=BinaryOp(Ident("rsi_val"), ">", Ident("upper")),
            body=[StrategyAction("entry", "Short", "strategy.short")]
        ),
    ]
)
Stage 3: Code Generator¶
File: backtest/pine/codegen.py
The code generator walks the AST and emits two Python functions:
_compute(df, params) — Pandas Path¶
Full DataFrame operations. Used by standard backtesting mode where the strategy runs once over the entire dataset.
def _compute(df, params):
    _open = df['open']
    _high = df['high']
    _low = df['low']
    _close = df['close']
    _volume = df['volume']
    length = params.get('RSI Length', 14)
    upper = params.get('Overbought', 70)
    lower = params.get('Oversold', 30)
    rsi_val = ta.rsi(_close, length)
    long_entry = (rsi_val < lower).fillna(False)
    long_exit = pd.Series(False, index=df.index)
    short_entry = (rsi_val > upper).fillna(False)
    short_exit = pd.Series(False, index=df.index)
    return long_entry, long_exit, short_entry, short_exit
_compute_fast(opens, highs, lows, closes, volumes, params) — NumPy Fast Path¶
Scalar-only operations on raw NumPy arrays. Returns 4 scalar booleans for the last bar only. Used by the magnifier's inner loop where compute is called potentially thousands of times per backtest — once per sub-bar.
def _compute_fast(opens, highs, lows, closes, volumes, params):
    length = params.get('RSI Length', 14)
    upper = params.get('Overbought', 70)
    lower = params.get('Oversold', 30)
    rsi_val = ta_fast.rsi(closes, length)
    long_entry = rsi_val < lower
    long_exit = False
    short_entry = rsi_val > upper
    short_exit = False
    return long_entry, long_exit, short_entry, short_exit
Key Transformations¶
The code generator performs several critical translations to bridge PineScript semantics with vectorized Python:
Price Builtins → DataFrame Columns¶
| PineScript | Generated Python | Reason |
|---|---|---|
| close | _close (alias for df['close']) | Avoid Python builtin shadowing |
| open | _open (alias for df['open']) | open is a Python builtin |
| high | _high | Consistency |
| low | _low | Consistency |
| volume | _volume | Consistency |
| hlc3 | (_high + _low + _close) / 3 | Derived price source |
| ohlc4 | (_open + _high + _low + _close) / 4 | Derived price source |
Implicit Argument Injection¶
PineScript's ta.atr(14) implicitly uses high, low, close. The generated Python must make these explicit:
IMPLICIT_ARGS = {
    "atr": ("_high", "_low", "_close"),
    "supertrend": ("_high", "_low", "_close"),
    "sar": ("_high", "_low"),
    "dmi": ("_high", "_low", "_close"),
    "obv": ("_close", "_volume"),
    "mfi": ("_high", "_low", "_close", "_volume"),
    "vwap": ("_high", "_low", "_close", "_volume"),
    "ad": ("_high", "_low", "_close", "_volume"),
    "wad": ("_high", "_low", "_close"),
}
| PineScript | Generated Python |
|---|---|
| ta.atr(14) | ta.atr(_high, _low, _close, 14) |
| ta.rsi(close, 14) | ta.rsi(_close, 14) |
| ta.obv() | ta.obv(_close, _volume) |
| ta.supertrend(3, 10) | ta.supertrend(_high, _low, _close, 3, 10) |
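The injection step amounts to a dictionary lookup followed by string assembly. The sketch below reproduces the four rows of the table above; `emit_ta_call` and `PRICE_BUILTINS` are hypothetical names, the mapping is trimmed to three entries for brevity, and arguments are assumed to arrive as already-rendered text.

```python
# Trimmed copy of the mapping shown above (illustration only).
IMPLICIT_ARGS = {
    "atr": ("_high", "_low", "_close"),
    "obv": ("_close", "_volume"),
    "supertrend": ("_high", "_low", "_close"),
}

# Price builtins are renamed to their underscore-prefixed aliases.
PRICE_BUILTINS = {
    "open": "_open",
    "high": "_high",
    "low": "_low",
    "close": "_close",
    "volume": "_volume",
}


def emit_ta_call(name: str, args: list[str]) -> str:
    """Render a ta.* call, renaming price builtins and prepending implicit series."""
    renamed = [PRICE_BUILTINS.get(arg, arg) for arg in args]
    full_args = list(IMPLICIT_ARGS.get(name, ())) + renamed
    return f"ta.{name}({', '.join(full_args)})"


print(emit_ta_call("atr", ["14"]))          # ta.atr(_high, _low, _close, 14)
print(emit_ta_call("rsi", ["close", "14"]))  # ta.rsi(_close, 14)
```

Indicators that are absent from the mapping, such as ta.rsi, pass through with only the builtin renaming applied.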
Boolean Operators¶
| PineScript | Generated Python | Reason |
|---|---|---|
| a and b | (a) & (b) | pandas Series require bitwise &, not and |
| a or b | (a) \| (b) | pandas Series require bitwise \|, not or |
| not a | ~(a) | Bitwise NOT for Series |
Parenthesization is critical
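In Python, & and | bind more tightly than comparison operators, so without explicit parentheses rsi > 70 & macd > 0 is parsed as the chained comparison rsi > (70 & macd) > 0 rather than (rsi > 70) & (macd > 0). The codegen therefore wraps every operand: (a) & (b), (a) | (b).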
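The wrapping rule can be sketched as a small recursive emitter. The nested-tuple expression format and the `emit` function are assumptions made for this illustration; the real code generator walks AST nodes. Plain integers stand in for Series so the grouping effect can be checked without pandas.

```python
PY_BINARY = {"and": "&", "or": "|"}


def emit(node) -> str:
    """Render a nested-tuple expression as Python source, parenthesizing every operand."""
    if isinstance(node, str):      # identifier or literal, already Python text
        return node
    if node[0] == "not":           # ("not", operand)
        return f"~({emit(node[1])})"
    left, op, right = node         # (left, op, right)
    py_op = PY_BINARY.get(op, op)  # comparisons pass through unchanged
    return f"({emit(left)}) {py_op} ({emit(right)})"


tree = (("rsi", ">", "70"), "and", ("macd", ">", "0"))
source = emit(tree)  # "((rsi) > (70)) & ((macd) > (0))"

# Integers stand in for Series: both conditions are true for these values.
values = {"rsi": 75, "macd": 1}
wrapped = eval(source, {}, values)               # True
bare = eval("rsi > 70 & macd > 0", {}, values)   # False: parsed as rsi > (70 & macd) > 0
```

The bare form evaluates `70 & macd` first, which is why the two results differ even though both comparisons hold.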
Other Translations¶
| PineScript | Generated Python |
|---|---|
| math.abs(x) | np.abs(x) |
| math.max(a, b) | np.maximum(a, b) |
| math.min(a, b) | np.minimum(a, b) |
| math.sqrt(x) | np.sqrt(x) |
| nz(x) | x.fillna(0) |
| na(x) | x.isna() |
| [a, b, c] = f() | (a, b, c) = f() |
| true / false | True / False |
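These translations are mostly table lookups. The following sketch shows one way to express them; `CALL_MAP`, `METHOD_MAP`, `emit_call`, and `emit_destructuring` are hypothetical names, and arguments are assumed to be already-rendered Python text. One detail goes beyond the table: when the argument to nz() or na() is not a bare identifier, the sketch parenthesizes it so the method applies to the whole expression.

```python
CALL_MAP = {
    "math.abs": "np.abs",
    "math.max": "np.maximum",
    "math.min": "np.minimum",
    "math.sqrt": "np.sqrt",
}
METHOD_MAP = {"nz": "fillna(0)", "na": "isna()"}  # function call becomes a method call
LITERALS = {"true": "True", "false": "False"}


def emit_call(name: str, args: list[str]) -> str:
    """Translate one PineScript call whose arguments are already Python text."""
    if name in METHOD_MAP:
        # Parenthesize compound expressions so the method binds to all of it.
        target = args[0] if args[0].isidentifier() else f"({args[0]})"
        return f"{target}.{METHOD_MAP[name]}"
    return f"{CALL_MAP.get(name, name)}({', '.join(args)})"


def emit_destructuring(names: list[str], call: str) -> str:
    """[a, b, c] = f() becomes (a, b, c) = f()."""
    return f"({', '.join(names)}) = {call}"
```

np.maximum and np.minimum are the element-wise NumPy functions, which is why they are used here instead of Python's built-in max and min.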
NaN Safety¶
Every signal condition is wrapped with .fillna(False). Indicators return NaN during their warmup period (e.g., the first 14 bars for RSI-14), and NaN must never propagate as a True signal.
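The scalar side of the same concern can be illustrated with the standard library alone. In Python a NaN is truthy, so a NaN that reaches a boolean context would read as a signal. The helper `signal_or_false` below is hypothetical (the generated fast path shown earlier does not contain it); it sketches a scalar analogue of .fillna(False).

```python
import math


def signal_or_false(value) -> bool:
    """Scalar analogue of .fillna(False): a NaN signal becomes False."""
    if isinstance(value, float) and math.isnan(value):
        return False
    return bool(value)


warmup_rsi = math.nan           # indicator value during warmup

nan_is_truthy = bool(warmup_rsi)        # True: NaN is a nonzero float
comparison = warmup_rsi < 30            # False: comparisons with NaN are always False
guarded = signal_or_false(warmup_rsi)   # False
```

Direct comparisons against NaN already evaluate to False, so the guard matters mainly where a NaN value itself is used as a signal.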
Why Two Compute Paths¶
| Path | Used By | Input | Output | Overhead |
|---|---|---|---|---|
| _compute() | Standard backtest | pd.DataFrame (full dataset) | 4 × pd.Series (boolean) | DataFrame allocation, index alignment |
| _compute_fast() | Magnifier inner loop | 5 × np.ndarray (raw arrays) | 4 × bool (scalars) | Minimal (pure NumPy) |
The magnifier recomputes signals on every sub-bar, potentially 10,000+ calls per backtest (1,000 chart bars × 10 sub-bars each). At that volume, pandas DataFrame overhead dominates.
The fast path eliminates pandas entirely — raw NumPy arrays in, scalar booleans out. When _compute_fast is not available (complex strategies with unsupported operations), the magnifier falls back to _compute with a performance penalty.
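The fallback can be sketched as a single dispatch function. Everything here is a hypothetical stand-in: `last_bar_signals`, `_fast`, and `_full` are illustrative names, and plain lists replace NumPy arrays and pandas Series so the example runs on the standard library alone (with real Series the last element would be read with .iloc[-1]).

```python
from types import SimpleNamespace


def last_bar_signals(strategy, bars: dict, params: dict) -> tuple:
    """Return the four last-bar signals, preferring compute_fast when it exists."""
    if strategy.compute_fast is not None:
        return strategy.compute_fast(
            bars["open"], bars["high"], bars["low"], bars["close"], bars["volume"], params
        )
    # Fallback: run the full-series path, then keep only the final element of each signal.
    return tuple(bool(s[-1]) for s in strategy.compute(bars, params))


# Hypothetical stand-ins for the two generated functions of one strategy.
def _fast(opens, highs, lows, closes, volumes, params):
    return closes[-1] < params["lower"], False, False, False


def _full(bars, params):
    n = len(bars["close"])
    long_entry = [c < params["lower"] for c in bars["close"]]
    return long_entry, [False] * n, [False] * n, [False] * n


with_fast = SimpleNamespace(compute=_full, compute_fast=_fast)
without_fast = SimpleNamespace(compute=_full, compute_fast=None)

bars = {"open": [1, 2], "high": [1, 2], "low": [1, 2], "close": [40, 25], "volume": [1, 1]}
params = {"lower": 30}
```

Both strategies yield the same last-bar result; the fallback simply computes every bar's signal before discarding all but the last, which is where the performance penalty comes from.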
Supported Indicators¶
The TA indicator library (backtest/ta.py) implements 37 indicators as static methods on a ta class. All accept pd.Series inputs and return pd.Series (multi-output indicators such as ta.macd return a tuple of Series), use vectorized operations (no Python loops), and are verified against TradingView output.
Trend¶
| Indicator | Function | Parameters |
|---|---|---|
| Simple Moving Average | ta.sma(source, length) | source, period |
| Exponential Moving Average | ta.ema(source, length) | source, period |
| Weighted Moving Average | ta.wma(source, length) | source, period |
| Volume-Weighted MA | ta.vwma(source, volume, length) | source, volume, period |
| Hull Moving Average | ta.hma(source, length) | source, period |
| Running Moving Average | ta.rma(source, length) | source, period |
| Arnaud Legoux MA | ta.alma(source, length, offset, sigma) | source, period, offset, sigma |
| Symmetrically-Weighted MA | ta.swma(source) | source |
| SuperTrend | ta.supertrend(high, low, close, factor, period) | factor, ATR period |
Momentum¶
| Indicator | Function | Parameters |
|---|---|---|
| Relative Strength Index | ta.rsi(source, length) | source, period |
| MACD | ta.macd(source, fast, slow, signal) | source, fast/slow/signal periods |
| Stochastic | ta.stoch(high, low, close, k, d, smooth) | K period, D period, smoothing |
| Commodity Channel Index | ta.cci(high, low, close, length) | period |
| Money Flow Index | ta.mfi(high, low, close, volume, length) | period |
| Chande Momentum Oscillator | ta.cmo(source, length) | source, period |
| Rate of Change | ta.roc(source, length) | source, period |
| True Strength Index | ta.tsi(source, long, short) | source, long/short periods |
| Momentum | ta.mom(source, length) | source, period |
| Williams %R | ta.wpr(high, low, close, length) | period |
| Percent Rank | ta.percentrank(source, length) | source, period |
Volatility¶
| Indicator | Function | Parameters |
|---|---|---|
| Average True Range | ta.atr(high, low, close, length) | period |
| Bollinger Bands | ta.bb(source, length, mult) | source, period, multiplier |
| Bollinger Band Width | ta.bbw(source, length, mult) | source, period, multiplier |
| Keltner Channel | ta.kc(high, low, close, length, mult) | period, multiplier |
| Keltner Channel Width | ta.kcw(high, low, close, length, mult) | period, multiplier |
| Directional Movement Index | ta.dmi(high, low, close, length) | period |
| Standard Deviation | ta.stdev(source, length) | source, period |
| Parabolic SAR | ta.sar(high, low, start, inc, max) | start, increment, max |
| Center of Gravity | ta.cog(source, length) | source, period |
Volume¶
| Indicator | Function | Parameters |
|---|---|---|
| On-Balance Volume | ta.obv(close, volume) | none |
| Accumulation/Distribution | ta.ad(high, low, close, volume) | none |
| Price Volume Trend | ta.pvt(close, volume) | none |
| Williams A/D | ta.wad(high, low, close) | none |
| VWAP | ta.vwap(high, low, close, volume) | none |
Utility¶
| Indicator | Function | Parameters |
|---|---|---|
| Highest | ta.highest(source, length) | source, period |
| Lowest | ta.lowest(source, length) | source, period |
| Change | ta.change(source, length) | source, period |
| Median | ta.median(source, length) | source, period |
| Range | ta.range(high, low) | none |
| Linear Regression | ta.linreg(source, length, offset) | source, period, offset |
| Rising | ta.rising(source, length) | source, period |
| Falling | ta.falling(source, length) | source, period |
| Cumulative Sum | ta.cum(source) | source |
Cross Detection¶
| Function | Returns True When |
|---|---|
| ta.crossover(a, b) | a crosses above b (a > b and a.shift(1) <= b.shift(1)) |
| ta.crossunder(a, b) | a crosses below b (a < b and a.shift(1) >= b.shift(1)) |
| ta.cross(a, b) | Either crossover or crossunder |
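For a last-bar-only check, the shift(1) comparison reduces to looking at the final two values of each series. The functions below are a hypothetical scalar sketch of the three definitions above, written against plain sequences; they are not the ta or ta_fast implementations.

```python
import math


def crossover_last(a, b) -> bool:
    """a crosses above b on the final bar: a[-1] > b[-1] and a[-2] <= b[-2]."""
    if len(a) < 2 or len(b) < 2:
        return False  # no previous bar to compare against
    return a[-1] > b[-1] and a[-2] <= b[-2]


def crossunder_last(a, b) -> bool:
    """a crosses below b on the final bar: a[-1] < b[-1] and a[-2] >= b[-2]."""
    if len(a) < 2 or len(b) < 2:
        return False
    return a[-1] < b[-1] and a[-2] >= b[-2]


def cross_last(a, b) -> bool:
    """Either direction."""
    return crossover_last(a, b) or crossunder_last(a, b)


fast_ma = [1.0, 3.0]
slow_ma = [2.0, 2.0]
crossed_up = crossover_last(fast_ma, slow_ma)            # True: 1 <= 2, then 3 > 2
still_warming = crossover_last([math.nan, 3.0], slow_ma)  # False: NaN comparisons are False
```

Because any comparison involving NaN is False, a NaN on the previous bar suppresses the signal instead of triggering it.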
Compiled Strategy Object¶
The final output of the compilation pipeline:
from dataclasses import dataclass
from typing import Callable

@dataclass
class TransformedStrategy:
    name: str                      # From strategy("name", ...)
    inputs: dict[str, InputParam]  # {paramTitle: IntInput(default=12, ...), ...}
    compute: Callable              # (df, params) -> (le, lx, se, sx)
    compute_fast: Callable | None  # (opens, highs, lows, closes, vols, params) -> 4 bools
    warmup: int                    # max(all_indicator_periods) * 2
    source_code: str               # Original PineScript source
    generated_code: str            # Generated Python (for debugging)
    settings: dict                 # {initial_capital, commission, slippage}
Warmup Calculation
The compiler scans all indicator period arguments and sets warmup = max(periods) * 2. This is conservative — EMA technically needs infinite history, but 2× the longest period is practical. The backtester skips the first warmup bars to avoid NaN-contaminated signals.
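The rule and its EMA rationale can be checked numerically. In this sketch `compute_warmup` and `ema_residual_weight` are hypothetical helpers, and the EMA smoothing factor is assumed to be the standard 2 / (length + 1). The residual weight is the share of an EMA's value still attributable to data from before the warmup window.

```python
def compute_warmup(periods, multiplier: int = 2) -> int:
    """warmup = max(periods) * 2; zero when the strategy uses no indicator periods."""
    return max(periods, default=0) * multiplier


def ema_residual_weight(length: int, bars: int) -> float:
    """Weight an EMA still places on values older than `bars` bars."""
    alpha = 2 / (length + 1)  # standard EMA smoothing factor (assumed)
    return (1 - alpha) ** bars


# MACD(12, 26, 9): the longest period is 26, so warmup is 52 bars.
macd_warmup = compute_warmup([12, 26, 9])

# After those 52 bars, a 26-period EMA retains under 2% weight on older data.
residual = ema_residual_weight(26, macd_warmup)
```

Under that assumption, doubling the longest period leaves roughly 2% of the EMA's weight on pre-warmup data, which is the sense in which the rule is practical without being exact.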
Security¶
The generated Python executes via exec() in a restricted namespace:
namespace = {"ta": ta, "pd": pd, "np": np}
exec(generated_source, namespace)
compute_fn = namespace["_compute"]
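The namespace exposes only ta, pd, and np by name and deliberately leaves out os, sys, subprocess, importlib, and other modules that could enable filesystem or network access. This is not a sandbox on its own: exec() adds __builtins__ to any globals dictionary that lacks that key, so built-ins such as open and __import__ remain reachable. The safety of this design rests on the compiler generating code only from the builder's constrained PineScript subset; it does not accept arbitrary user code.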
Untrusted Input
If the compiler is ever exposed to arbitrary PineScript from untrusted users (beyond the builder's constrained output), additional sandboxing is required: RestrictedPython, subprocess isolation, or WASM execution. The current exec() approach is safe only because the builder generates a predictable, auditable subset of PineScript.
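The limits of a namespace-only restriction are easy to demonstrate. In the sketch below, `run_generated` is a hypothetical wrapper around exec(). It shows that Python injects `__builtins__` into a globals dictionary that lacks it, and that supplying an empty mapping blocks import statements. Even the second form should be read as hardening rather than isolation, since object introspection can still reach a great deal from inside exec().

```python
from typing import Optional


def run_generated(source: str, names: dict, builtins: Optional[dict] = None) -> dict:
    """exec() source in a fresh namespace; optionally replace the built-ins it can see."""
    namespace = dict(names)
    if builtins is not None:
        namespace["__builtins__"] = builtins
    exec(source, namespace)
    return namespace


# 1. Default behavior: exec() injects __builtins__, so len() works without being listed.
ns = run_generated("n = len('abc')", {})
default_has_builtins = "__builtins__" in ns and ns["n"] == 3

# 2. With an empty __builtins__ mapping, import statements fail.
try:
    run_generated("import os", {}, builtins={})
    import_blocked = False
except ImportError:
    import_blocked = True

# 3. Names passed in explicitly remain usable either way.
ns = run_generated("y = x + 1", {"x": 41}, builtins={})
explicit_name_ok = ns["y"] == 42
```

This is why the options listed above (RestrictedPython, subprocess isolation, or WASM execution) are process-level or interpreter-level measures instead of a smaller namespace.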
PineScript Subset — In Scope vs. Out of Scope¶
| In Scope | Out of Scope |
|---|---|
| strategy() declaration | for / while loops (hard to vectorize) |
| input.int(), input.float(), input.bool(), input.string() | var / varip (persistent state) |
| Variable assignments | request.security() (multi-timeframe) |
| Tuple destructuring [a, b, c] = f() | plot(), plotshape() (visual only) |
| ta.* indicator calls (37 indicators) | User-defined functions |
| Boolean expressions (and, or, not) | array.* / matrix.* types |
| if blocks with strategy.entry/close/exit | switch / ternary expressions |
| math.* functions | String manipulation |
| nz(), na() | Type casting |
File Map¶
| Concept | File |
|---|---|
| Public API (transform_pinescript) | backtest/pine/__init__.py |
| Tokenizer | backtest/pine/tokens.py |
| Recursive descent parser | backtest/pine/parser.py |
| AST node definitions | backtest/pine/ast_nodes.py |
| Code generator (AST → Python) | backtest/pine/codegen.py |
| TA indicator library (37 indicators) | backtest/ta.py |
| TransformedStrategy dataclass | backtest/strategy.py |
| Input parameter types | backtest/strategy.py (IntInput, FloatInput, etc.) |
| Sample .pine strategies | backtest/strategies/ |