Kronos: A Foundation Model for the Language of Financial Markets


TL;DR

  • What it solves: Kronos converts OHLCV K-line sequences into discrete tokens and predicts future K-line sequences.
  • How to use: Install requirements.txt, load the tokenizer and model from Hugging Face, and call KronosPredictor.predict.
  • Best for: Quant engineers and researchers prototyping forecasting and backtesting pipelines.
  • Quick win: Run examples/prediction_example.py to get a 120-step forecast in minutes.
  • Caveat: Finetuning needs Qlib and careful data preparation; production requires model conversion and latency testing.

Situation

You have raw K-line CSVs from exchanges, inconsistent timestamps, and a short deadline to produce a forecast series for backtesting. Typical baselines return a single number or a naive rolling mean. Kronos gives you a tokenizer and an autoregressive model trained on global exchange data, so you can produce coherent multi-step forecasts instead. Think of K-line sequences as sentences in a noisy newspaper: the tokenizer is the stenographer that turns each paragraph into tokens, and Kronos is the editor who drafts the next few paragraphs of market moves.

What Kronos is

Kronos is a family of decoder-only foundation models pre-trained on hierarchical discrete tokens derived from OHLCV K-lines. The repo provides tokenizer code, model classes, pre-trained checkpoints on the Hugging Face Hub, a KronosPredictor wrapper for convenient inference, and finetuning scripts that use Qlib for data preparation.

In short: this repository provides a specialized tokenizer and pre-trained autoregressive models for K-line forecasting so that quant engineers and researchers can prototype multi-step forecasts and run quick backtests without building a full training pipeline from scratch.

Quickstart

  1. Create a Python 3.10+ virtualenv.
  2. Install deps:
pip install -r requirements.txt
  3. Minimal inference (example):
from model import Kronos, KronosTokenizer, KronosPredictor

# load tokenizer and model from Hugging Face
tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
model = Kronos.from_pretrained("NeoQuasar/Kronos-small")

# prepare predictor
predictor = KronosPredictor(model, tokenizer, max_context=512)

# prepare x_df, x_timestamp, y_timestamp according to repo examples
pred_df = predictor.predict(
    df=x_df,
    x_timestamp=x_ts,
    y_timestamp=y_ts,
    pred_len=120,
    T=1.0,
    top_p=0.9,
    sample_count=1
)

print(pred_df.head())

Example output (pred_df.head()):

# index                 open    high    low     close  volume
2024-01-01 00:00:00     100.50  101.20  100.10  100.90  12345
2024-01-02 00:00:00     100.90  102.00  100.50  101.80  15000
2024-01-03 00:00:00     101.80  103.30  101.50  102.90  14000
2024-01-04 00:00:00     102.90  104.10  102.30  103.70  13000
2024-01-05 00:00:00     103.70  104.00  103.20  103.50  12500

Result: the predictor returns a pandas DataFrame indexed by future timestamps containing forecasted OHLC (and optional volume/amount). The sample above is a small illustration of the DataFrame shape and types you should expect.
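The x_df, x_timestamp, and y_timestamp inputs above are prepared "according to repo examples"; here is a minimal sketch of building them from a bar DataFrame with pandas. The column names, the lookback of 400, and the 5-minute frequency are assumptions for illustration, not requirements of the repo:

```python
import pandas as pd

def prepare_inputs(df, lookback=400, pred_len=120, freq="5min"):
    """Split a K-line frame into model context (x) and future timestamps (y).

    Column names here are assumptions; match them to your own CSV schema.
    """
    df = df.sort_values("timestamp").reset_index(drop=True)
    x_df = df[["open", "high", "low", "close", "volume"]].iloc[-lookback:]
    x_ts = pd.to_datetime(df["timestamp"]).iloc[-lookback:]
    # future timestamps continue past the last bar at the same frequency
    y_ts = pd.Series(
        pd.date_range(start=x_ts.iloc[-1], periods=pred_len + 1, freq=freq)[1:]
    )
    return x_df, x_ts, y_ts

# synthetic 5-minute bars, just to illustrate the expected shapes
bars = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=600, freq="5min"),
    "open": 100.0, "high": 101.0, "low": 99.0, "close": 100.5, "volume": 1000,
})
x_df, x_ts, y_ts = prepare_inputs(bars)
```

The three return values map onto the `df`, `x_timestamp`, and `y_timestamp` arguments of `KronosPredictor.predict` shown above.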

Data flow: CSV -> tokenizer -> model -> forecast DataFrame.

Where it fits

  • Prototype forecasting, research, and backtest workflows.
  • Use Kronos-mini/small/base depending on GPU or CPU budget.
  • max_context for most released models is 512; longer sequences require chunking.
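The 512-step context cap in the list above implies a sliding-window pattern for longer histories and for backtests. A minimal walk-forward sketch, with a naive stub standing in for KronosPredictor.predict (the stub, function names, and default values are illustrative assumptions, not the repo's API):

```python
# Walk-forward sketch: trim each context window so it never exceeds the
# model's max_context (512 for most released checkpoints), then collect one
# multi-step forecast per stride.
def walk_forward(series, forecast_fn, lookback=512, pred_len=120, stride=120):
    forecasts = []
    for end in range(lookback, len(series) - pred_len + 1, stride):
        context = series[end - lookback:end]  # never longer than lookback
        forecasts.append(forecast_fn(context, pred_len))
    return forecasts

def naive_forecast(context, pred_len):
    # placeholder for the real predictor: repeat the last observed price
    return [context[-1]] * pred_len

prices = [100.0 + 0.01 * i for i in range(1000)]
preds = walk_forward(prices, naive_forecast)
```

In a real pipeline, `forecast_fn` would wrap `predictor.predict` with the DataFrame/timestamp plumbing from the Quickstart; the windowing logic stays the same.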

Metaphor: treat Kronos as a forecasting newsroom: it reads recent 'sentences' of price action and writes the next few lines. The tokenizer is the stenographer; the model is the editor deciding the headline.

Reality check

  • predict_batch requires consistent lookback and pred_len across series.
  • Finetuning examples rely on Qlib and assume prepared data; follow finetune/config.py.
  • For production latency, convert models to ONNX/torchscript and benchmark.
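The predict_batch constraint in the first bullet is cheap to enforce before touching the model. A pure-Python pre-flight check (the function name and input conventions are my own sketch, not the repo's API):

```python
# Validate that every series in a batch shares one lookback length and one
# pred_len, as batched inference requires. `batch` is a list of sequences;
# `pred_lens` is a parallel list of forecast horizons.
def check_batch_inputs(batch, pred_lens):
    lookbacks = {len(s) for s in batch}
    horizons = set(pred_lens)
    if len(lookbacks) != 1 or len(horizons) != 1:
        raise ValueError(
            f"inconsistent batch: lookbacks={sorted(lookbacks)}, "
            f"pred_lens={sorted(horizons)}"
        )
    return lookbacks.pop(), horizons.pop()
```

Failing fast here is cheaper than discovering a shape mismatch after tokenization, especially on GPU.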

💡 Tip: prefer safetensors checkpoints and test the pipeline on a small subset before full-scale finetuning to catch data-shape mismatches early.

⚠️ Warning: downloading model weights and tokenizers from Hugging Face requires network access and disk space; validate checksums and prefer trusted model IDs when running in production or audit-sensitive environments.

Quick checklist

  1. Python 3.10+ virtualenv
  2. pip install -r requirements.txt
  3. Run python examples/prediction_example.py
  4. If finetuning: prepare Qlib dataset and run torchrun scripts as documented

shiyu-coder/Kronos · MIT license · 19,079 stars

Hoang Yell


A software developer and technical storyteller. I spend my time exploring the most interesting open-source repositories on GitHub and presenting them as accessible stories for everyone.