samlogic.org / lab open notes

hl-bot · how it thinks.

A Hyperliquid perp trading bot. Mostly trend-following on majors, with a mean-reverting sleeve and a daily mining loop that promotes new buckets to live. Written up so the curious (quants, traders, builders) can tell me where the edge is fragile.

Last updated 2026-04-27 · Reading time ~12 min · By @samlogic

00 Read this first

Frame

This is a working bot, not a finished one. It runs live on a small VPS and writes both shadow and live trades to a single SQLite ledger. I am sharing the thinking, not the keys.

What I want from anyone reading: a sparring read. The "Open questions" section (10) is where I think the edge is most fragile. I would rather you tear those up than be polite about anything.

Out of scope: code style, repo layout, infra. In scope: edge, signal, friction, sizing, regime, sample size, anything you would test first.

01 Edge thesis

Hyperliquid is a perps DEX with thin retail flow on tail listings, fast funding-rate swings, and a maker rebate that most of the on-chain crowd ignores. The bot bets that directional setups (momentum after consolidation, mean reversion after exhaustion) clear realistic friction (4.5 bps taker, 5 bps slip, hourly funding) more often on this venue than on a CEX of comparable depth, because the marginal participant is less informed and less hedged. That is the prior. The job of every iteration is to test it harder.
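
To make "realistic friction" concrete, here is a rough round-trip cost sketch in Python. It is not the bot's code. It assumes both entry and exit pay the taker fee and the fixed slip, and that longs pay funding when the rate is positive. The function name, the 4-hour hold, and the funding rate in the example are all made up for illustration.

```python
TAKER_FEE_BPS = 4.5   # per side, from the thesis paragraph
SLIP_BPS = 5.0        # per side, fixed

def round_trip_friction_frac(hours_held: float, funding_rate_hourly: float, side: str) -> float:
    """Total friction as a fraction of notional (positive = cost).

    Assumption: entry and exit are both taker fills with fixed slip.
    Assumption: positive funding is paid by longs and received by shorts.
    """
    per_side = (TAKER_FEE_BPS + SLIP_BPS) / 10_000
    sign = 1.0 if side == "LONG" else -1.0
    funding = sign * funding_rate_hourly * hours_held
    return 2 * per_side + funding

# Hypothetical trade: long held 4 hours at 0.00125% funding per hour.
cost = round_trip_friction_frac(hours_held=4, funding_rate_hourly=0.0000125, side="LONG")
print(f"round-trip friction: {cost * 1e4:.1f} bps of notional")
```

Under these assumptions a setup has to clear roughly 19 bps of fee and slip before funding is even counted, which is the bar every directional signal is being tested against.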

02 What's running

03 The loop, one tick

1. BTC regime check          → Long | Short | Neutral (single asset)
2. Account state             → equity, drawdown, daily PnL
   ├─ drawdown kill switch?   yes → flatten + halt
   └─ daily loss halt?        yes → no new entries
3. Manage open positions     → stop / target / partial-TP
                              → BE move + trail (v4 only)
                              → liquidation check
4. Market snapshot           → funding rates + mark prices
5. Per-coin scan             (every coin, every tick)
   ├─ momentum signal?        record shadow if new candle
   │                          score × regime multiplier
   ├─ mean_reversion signal?  record shadow if new candle
   │                          score × regime multiplier
   └─ best signal wins
6. Live gate                 score ≥ threshold AND
                             matches live_candidates.json bucket
                             AND not shadow_only
7. Execute                   IOC limit, size = compute_size(risk_fraction)
                             reduce-only stop posted as backup order

Two things worth flagging. One: regime is a scoring input, not a hard gate. Mean reversion still runs in trend regimes, just with a 0.6 multiplier on score. Two: shadow inserts always happen, even when live is gated off, because the training loop runs on shadow data.
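
Steps 5 to 7 can be sketched in a few lines of Python. This is illustrative only: the `Signal` fields, the 0.70 threshold, and the in-memory lists are assumptions, and the real bot is Rust writing to SQLite. The bucket tuple mirrors the (strategy, side, session, regime, vol_bucket) grouping used by the mining loop.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Signal:
    strategy: str
    side: str
    session: str
    regime: str
    vol_bucket: str
    score: float            # already multiplied by the regime multiplier
    shadow_only: bool = False

def passes_live_gate(sig: Signal, live_buckets: set, threshold: float) -> bool:
    """All three conditions from step 6 must hold."""
    bucket = (sig.strategy, sig.side, sig.session, sig.regime, sig.vol_bucket)
    return sig.score >= threshold and bucket in live_buckets and not sig.shadow_only

def on_signal(sig, live_buckets, threshold, shadow_log, live_log):
    shadow_log.append(sig)  # shadow insert happens regardless of the live gate
    if passes_live_gate(sig, live_buckets, threshold):
        live_log.append(sig)

# Hypothetical contents of live_candidates.json, flattened to tuples.
live_buckets = {("momentum", "LONG", "US_MAIN", "Long", "mid")}
shadow_log, live_log = [], []

on_signal(Signal("momentum", "LONG", "US_MAIN", "Long", "mid", 0.78),
          live_buckets, 0.70, shadow_log, live_log)
on_signal(Signal("momentum", "LONG", "ASIA", "Long", "mid", 0.90),
          live_buckets, 0.70, shadow_log, live_log)

print(len(shadow_log), "shadow rows,", len(live_log), "live candidates")
```

The second signal has the higher score but no matching bucket, so it is recorded in shadow and never reaches execution.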

04 Strategies

momentum

strategy.rs

EMA8/21 cross on 15m, confirmed by 20-bar breakout, RSI zone, volume spike, ATR floor, ADX ≥ 25 with rising slope, funding filter.

atr_stop     1.5×
atr_target   3.0×
rr           1:2
adx_min      25
vol_mult     1.5
rsi_long     40-80 (loose), with directional confirmation
Shadow: Same logic, ADX ≥ 15, vol ≥ 1.0×, RSI 40-80 LONG / 20-60 SHORT. Wider net for ML training data.

mean_reversion

strategy.rs

RSI extreme + price outside Bollinger Band + tight ATR stop. Skipped when bands are expanding (BB width > 5%, that is a trend, not a reversion).

atr_stop        1.0×
atr_target      2.0×
rr              1:2
rsi_oversold    30
rsi_overbought  70
bb_width_max    5%
vol_max         2.5× (skip if spiking)
Shadow: RSI 42 / 58 (approach to extreme), BB width ≤ 8%.

ema_pullback

live_candidates.json

Renamed from ema_cross when entry semantics changed (commit 95aa680). Now requires pullback to EMA, not raw cross. Promoted via the alpha-mining loop; lives in live_candidates.json gated by (session, regime, vol_bucket).

context      Currently active in EU_OPEN and US_MAIN sessions, low-vol bucket (vol_ratio < 0.6).
Shadow: Same.

trend_alpha

live_candidates.json

4H trend-following bucket on ETH / SOL / AVAX. Promoted from the mining loop, not coded as a first-class strategy in strategy.rs.

context      Live on the 4H timeframe basket. Recent commit history shows iteration on trail config (3R → 2R → revert).
Shadow: Same.

Real-vs-shadow split is the most important architectural choice in here. Shadow casts a wide net so the training loop has signal density. Live tightens every threshold. It is the cheapest way I have found to train on more data than I trade on without contaminating live PnL.
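
A minimal sketch of the split, using only two of the momentum filters (ADX and volume) with the live and shadow values listed above. The class and function names are hypothetical, and the real gate in strategy.rs checks more conditions than this.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MomentumThresholds:
    adx_min: float
    vol_mult: float

# Values from the momentum card: live is tight, shadow is wide.
LIVE = MomentumThresholds(adx_min=25, vol_mult=1.5)
SHADOW = MomentumThresholds(adx_min=15, vol_mult=1.0)

def clears(features: dict, th: MomentumThresholds) -> bool:
    """True if the setup passes the ADX and volume floors for this threshold set."""
    return features["adx"] >= th.adx_min and features["vol_ratio"] >= th.vol_mult

# Hypothetical setup: decent but not strong trend, mild volume pickup.
setup = {"adx": 19.0, "vol_ratio": 1.2}

print("shadow:", clears(setup, SHADOW))
print("live:  ", clears(setup, LIVE))
```

The same setup becomes a training row without ever being eligible for a live order, which is the whole point of running two threshold sets over one code path.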

05 Regime and scoring

Regime is computed off BTC alone (1H bias plus a longer-window classifier in strategy::btc_regime) and applied as a scalar multiplier on the score of every candidate signal across the basket:

                        momentum     mean_reversion
Regime::Long/Short      × 1.0        × 0.6
Regime::Neutral         × 0.7        × 1.0

This is a soft preference, not a filter. Mean reversion can still beat momentum on score in a trend regime if the setup is clean enough. The regime multiplier just makes that uphill. The honest critique I expect is that single-asset BTC regime is a coarse proxy, and that alts do their own thing during liq cascades. I have left this on the open-questions list.
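
The multiplier table and the "best signal wins" rule can be sketched as below. The raw scores are invented for the example, the helper names are hypothetical, and the session multiplier that also feeds the composite score is left out.

```python
# Multipliers from the table above. "trend" covers both Long and Short regimes.
REGIME_MULT = {
    ("trend", "momentum"): 1.0,
    ("trend", "mean_reversion"): 0.6,
    ("neutral", "momentum"): 0.7,
    ("neutral", "mean_reversion"): 1.0,
}

def regime_class(regime: str) -> str:
    return "neutral" if regime == "Neutral" else "trend"

def best_signal(raw_scores: dict, regime: str):
    """Apply the regime multiplier to each raw score and return the winner."""
    cls = regime_class(regime)
    adjusted = {name: raw * REGIME_MULT[(cls, name)] for name, raw in raw_scores.items()}
    winner = max(adjusted, key=adjusted.get)
    return winner, adjusted[winner]

# In a Long regime, a very clean reversion setup can still beat a weak momentum one.
print(best_signal({"momentum": 0.50, "mean_reversion": 0.90}, "Long"))
# With a stronger momentum setup, the 0.6 haircut decides it.
print(best_signal({"momentum": 0.70, "mean_reversion": 0.90}, "Long"))
```

Because the regime only rescales scores, nothing is ever excluded outright. A hard gate would instead drop the mean_reversion key entirely in trend regimes.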

06 Friction model evolution

Every shadow row is tagged with a sim_model_version string so analyses can be re-run against any single friction model without accidentally mixing models. The current default is v4, mirroring the live engine. Pre-v4 paper stats are not directly comparable.

legacy · pre-2026-04 · legacy_pre_friction_v2
  Added    First Rust port of the Python prototype. Toy execution model, no realistic costs.
  Why      Get a Rust loop running end-to-end and recording shadow trades to a single SQLite file.
  Learned  PnL plotted in shadow was meaningfully better than what live trading produced. Friction was the lurking variable.

v2 · 2026-04 · friction_v2_fee4p5_slip5_funding_hourly
  Added    Taker fee 4.5 bps, fixed slippage 5 bps, hourly funding cost. Sim entries and exits use these.
  Why      Stop calling shadow PnL "edge" until it survives realistic execution costs.
  Learned  Most legacy "winners" were costs in disguise. Edge survival rate dropped roughly to a third.

v3 · 2026-04 · friction_v3_setup_dedup_fee4p5_slip5_funding_hourly
  Added    Setup-key dedup. One economic setup, even if multiple strategy tags fire on the same candle, becomes one row in training data.
  Why      Earlier records double-counted: momentum and ema_cross would both fire on the same breakout, and the trainer treated them as two independent observations.
  Learned  Effective sample size shrank, which made some "high-conf" buckets clearly overfit on rebased data.

v4 · 2026-04 (current) · friction_v4_liq_partial_trail_fee4p5_slip5_funding_hourly
  Added    Liquidation tracking, partial-TP at 1.5R for 50% of size, stop moves to breakeven on partial, trail kicks in further out.
  Why      Live engine had partial-TP and trail; shadow did not. Paper and live diverged on every winning trade.
  Learned  v4 mirrors live, so paper and live now share the same friction model. Pre-v4 paper stats are not comparable. Last two weeks of commits are still iterating on partial vs no-partial (Option A revert in 4e08a53).
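
The v3 dedup step can be sketched as follows. The actual setup key is not spelled out here, so the (coin, side, candle_ts) key is an assumption, as is the tie-break of keeping the highest-scoring row.

```python
def dedup_setups(rows: list) -> list:
    """Collapse rows that describe the same economic setup into one.

    Assumed setup key: (coin, side, candle_ts).
    Assumed tie-break: keep the row with the highest score.
    """
    best = {}
    for row in rows:
        key = (row["coin"], row["side"], row["candle_ts"])
        if key not in best or row["score"] > best[key]["score"]:
            best[key] = row
    return list(best.values())

# Two strategy tags firing on the same breakout candle, plus one unrelated setup.
rows = [
    {"coin": "ETH", "side": "LONG", "candle_ts": "2026-04-26T14:30:00Z", "strategy": "momentum", "score": 0.78},
    {"coin": "ETH", "side": "LONG", "candle_ts": "2026-04-26T14:30:00Z", "strategy": "ema_cross", "score": 0.71},
    {"coin": "SOL", "side": "SHORT", "candle_ts": "2026-04-26T14:30:00Z", "strategy": "mean_reversion", "score": 0.66},
]

deduped = dedup_setups(rows)
print(len(rows), "raw rows ->", len(deduped), "training rows")
```

Counting the ETH breakout once is what shrinks the effective sample size: the trainer no longer sees two correlated rows as independent evidence.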

07 Risk and exit management

08 Alpha mining loop

Every day at 03:00 UTC a Python loop on the same VPS reads the rolling shadow ledger, groups closed trades by (strategy, side, session, regime, vol_bucket), computes rolling-Sharpe over a window, and writes the surviving tuples to live_candidates.json. The bot reloads that file at startup. A second cron at 03:15 UTC takes any auto-implementable ideas from a queue and turns them into PRs.

The promotion gate is the part I am least confident in. Sample sizes per bucket are small and parameter counts are not. I have a hand-wavy "Sharpe over rolling window" rule rather than an adversarial one, and that is open question 8.
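
A minimal sketch of a promotion gate of this shape, with a minimum sample size added and the t-stat of the mean reported alongside Sharpe. The thresholds (30 trades, per-trade Sharpe of 0.2), the output schema, and the sample data are all hypothetical, and the rolling window is omitted: this scores whatever list of closed trades it is given.

```python
import json
import math
from collections import defaultdict
from statistics import mean, stdev

MIN_TRADES = 30    # hypothetical minimum sample size per bucket
MIN_SHARPE = 0.2   # hypothetical per-trade Sharpe floor

def promote(trades: list) -> list:
    """Group closed shadow trades by bucket and keep buckets that clear both floors."""
    buckets = defaultdict(list)
    for t in trades:
        key = (t["strategy"], t["side"], t["session"], t["regime"], t["vol_bucket"])
        buckets[key].append(t["sim_pnl_frac"])

    survivors = []
    for key, pnls in buckets.items():
        n = len(pnls)
        if n < MIN_TRADES:
            continue                      # too few trades to say anything
        sd = stdev(pnls)
        if sd == 0:
            continue                      # degenerate bucket, Sharpe undefined
        sharpe = mean(pnls) / sd          # per-trade Sharpe, not annualized
        if sharpe >= MIN_SHARPE:
            survivors.append({
                "bucket": list(key),
                "n": n,
                "sharpe": round(sharpe, 3),
                "t_stat": round(sharpe * math.sqrt(n), 2),  # mean / (sd / sqrt(n))
            })
    return survivors

# Made-up data: one bucket with 40 trades, one with only 5.
base = {"side": "LONG", "regime": "Long", "vol_bucket": "low"}
trades = (
    [{**base, "strategy": "ema_pullback", "session": "EU_OPEN", "sim_pnl_frac": p}
     for p in [0.02, -0.01] * 20]
    + [{**base, "strategy": "momentum", "session": "ASIA", "sim_pnl_frac": p}
       for p in [0.03, 0.02, 0.04, 0.03, 0.05]]
)

survivors = promote(trades)
print(json.dumps(survivors, indent=2))
```

The 5-trade bucket has the better-looking returns but is dropped on sample size alone. Reporting n and t-stat next to Sharpe makes it easier to see how thin a promoted bucket really is.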

09 Live data

Pulls from https://hl-cot.samlogic.org on page load. Auth-gated by Cloudflare Access; if a panel says "auth required", open https://hl-cot.samlogic.org/health once to sign in, then refresh.

Live panels (loaded on the page, not reproduced here): Open positions, 7-day performance, Last 5 closed trades, Latest decisions (open + closed).

Each closed-trade row answers three questions: which strategy fired, what the bot saw at decision time, and what happened. The strategy field is the link from "this trade" to "this paragraph above", so you can always trace a winner or a loser back to the rules that produced it.

ts            2026-04-26T14:32:18Z
coin          ETH
side          LONG
strategy      momentum                ← which strategy decided
regime        Long                    ← BTC regime at decision
score         0.78                    ← composite after regime + session mults
entry_px      3247.10
stop_px       3198.40                 ← 1.5× ATR below
target_px     3344.50                 ← 3.0× ATR above (1:2 R:R)
exit_reason   target                  ← target | stop | trail_stop |
                                        liquidation | time_exit
sim_pnl_frac  +0.0186
features      { rsi: 62.1, ema_spread_pct: 0.0041, vol_ratio: 1.74,
                atr_pct: 0.0028, adx: 31.2, adx_slope: 0.45,
                bb_width: 0.038, funding_rate: 0.00018,
                session_label: "US_MAIN", regime: "Long" }

No free-text rationale, but (strategy, regime, score, features) is sufficient to reproduce the "why" deterministically. If you see a trade that looks wrong, the strategy tag points you to which gate let it through.
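
As a small check of that claim, the stop and target in the example row can be recomputed from the momentum ATR multiples. The ATR value is not in the row, so it is backed out from the stop distance here; the function name is hypothetical.

```python
def atr_levels(entry_px: float, atr: float, side: str, stop_mult: float, target_mult: float):
    """Stop and target prices placed at ATR multiples from entry."""
    direction = 1 if side == "LONG" else -1
    stop = entry_px - direction * stop_mult * atr
    target = entry_px + direction * target_mult * atr
    return round(stop, 2), round(target, 2)

# ATR implied by the example row: stop sits 1.5x ATR below entry.
implied_atr = (3247.10 - 3198.40) / 1.5

stop_px, target_px = atr_levels(3247.10, implied_atr, "LONG", stop_mult=1.5, target_mult=3.0)
print("stop", stop_px, "target", target_px)
```

The recomputed target matches the row's target_px, so the 1.5× and 3.0× multiples are consistent with the logged prices.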

10 Open questions

Things I think are weak. Ranked roughly by my own confidence that they matter, in descending order. Tear up freely.

01 Single-asset regime classifier on BTC

Regime::Long/Short/Neutral is computed from BTC alone, then applied as a scoring multiplier across the basket. Alts decorrelate during liq cascades. Does this hold up on SOL/DOGE during BTC chop, or is the multiplier numerically right and economically wrong?

02 Partial-TP at 1.5R, then BE+trail

Recent commits flip-flop: Option A disabled partial entirely with a looser 3R/1.5R trail, then reverted to partial=0.5 at 2R trail. Is partial structurally negative EV in trend regimes (you cap the right tail), or just noisy on the current sample? Should partial be regime-conditional?

03 Slippage as a fixed 5 bps

Same slip applied to BTC ($M of depth within 5 bps) and DOGE ($k of depth within 5 bps). Probably too generous on tail names, too punitive on majors. Worth a per-coin or per-depth model? Or is constant slip honest enough that the basket averages out?

04 ADX rising-over-3-bars filter

Adds 45 minutes of latency on the 15m before momentum can fire. Walk-forward shows it filters losers, but is the filter cutting fresh trends with the bad ones?

05 Concurrent correlated trades

Momentum on ETH + SOL + AVAX in a trend regime is functionally one trade with three legs. risk_fraction is per-position, not per-cluster. Should sizing penalize implied correlation, or is the basket a feature?

06 Funding filter thresholds

Headwind cap at 0.05%/8h, tailwind bonus floor at 0.01%/8h with bonus capped at 20 pts. The bonus floor is small enough to be noise on most pairs. Is funding a real signal or a costs-only term?

07 No book-side signals in the live path

book.rs and mm.rs exist but I have not yet wired their output into scoring. Closed-candle features are most of the signal. A ranked list of which microstructure features are worth the engineering cost is more useful than a yes/no.

08 Promotion gate from shadow to live

Daily mining loop promotes (strategy, side, session, regime, vol_bucket) tuples to live_candidates.json. The gate is rolling-Sharpe over a window. Sample size and alpha threshold are not yet adversarial. What is the right minimum n and t-stat before any bucket touches real money?

11 What I'd value from you

  1. Read the open questions and tell me which are real and which are paper tigers.
  2. Anywhere you see structural overfitting (parameter counts vs effective sample size), flag it.
  3. If you had two weeks on this, what would you test first and why?
  4. Anything I am not asking that I should be?

Notes via @samlogic on X are easiest. I keep this page updated, so a sharp comment one week can change a section the next.