# Funding Rates Dataset — historical perp funding data across 20 exchanges

Compiled on 2026-04-10 04:05 UTC.

## Coverage

- **Exchanges**: 20 (Binance, Bybit, OKX, Bitget, MEXC, Hyperliquid, Gate.io, dYdX v4, Bitfinex, Kraken Futures, BingX, BitMEX, KuCoin Futures, HTX, Phemex, BTSE, Coinbase International, Deribit, WhiteBit, Backpack)
- **Total rows**: 3,828,133
- **Date range**: 2026-04-07 19:23 UTC → 2026-04-10 03:59 UTC
- **Time span**: 56.6 hours
- **Refresh cadence**: every 5 minutes (one row per `(exchange, symbol)` per cycle, conditional on the exchange API responding)

## Schema

Every row is a single observation: one funding rate quote on one exchange for one symbol at one point in time.

| Column | Type | Description |
|---|---|---|
| `exchange` | string | Lowercase short name (e.g. `binance`, `kraken`, `coinbase_intx`) |
| `symbol` | string | Always normalized to `{BASE}USDT` form regardless of how the upstream labels it |
| `base` | string | Base coin in canonical form (XBT → BTC, etc.) |
| `funding_rate` | float | Decimal funding rate per period (e.g. `0.0001` = 0.01%) — NOT annualized, NOT a percentage |
| `funding_interval_hours` | int | Period length in hours: 1, 4, 8, or 24 |
| `next_funding_time` | int | Unix seconds, 0 if unknown |
| `mark_price` | float | Mark price in USD per unit base |
| `volume_24h_usd` | float | Trailing 24h traded volume, in USD |
| `fetched_at` | int | Unix seconds when WE fetched it (not when the upstream computed it) |
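
Because `funding_rate` is quoted per period and the period length varies by symbol, a common first step is to put rates on a single time basis. The sketch below shows one simple way to do that: a non-compounded annualization using `funding_interval_hours`. The function names are hypothetical and are not part of the dataset or the collector; the 365-day year and the absence of compounding are assumptions.

```python
def annualized_rate(funding_rate: float, funding_interval_hours: int) -> float:
    """Simple (non-compounded) annualized rate from a per-period funding rate."""
    if funding_interval_hours <= 0:
        raise ValueError("funding_interval_hours must be positive")
    # Assumption: 365-day year, funding charged every period at the same rate.
    periods_per_year = (24 / funding_interval_hours) * 365
    return funding_rate * periods_per_year


def annualized_from_row(row: dict) -> float:
    """Same calculation for a csv.DictReader row, where every value is a string."""
    return annualized_rate(
        float(row["funding_rate"]), int(row["funding_interval_hours"])
    )


# 0.01% per 8h period: 3 periods per day * 365 days, roughly 10.95% per year
print(annualized_rate(0.0001, 8))
```

The same per-period quote implies a very different annual figure on a 1h symbol than on an 8h symbol, which is why the interval column matters for any cross-venue comparison.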

## Files in this archive

| File | Compressed size | Rows |
|---|---|---|
| `funding_rates_combined.csv.gz` | 72.9 MB | 3,828,133 |
| `funding_rates_backpack.csv.gz` | 478 KB | 26,296 |
| `funding_rates_binance.csv.gz` | 9.0 MB | 433,490 |
| `funding_rates_bingx.csv.gz` | 5.6 MB | 314,825 |
| `funding_rates_bitfinex.csv.gz` | 496 KB | 29,456 |
| `funding_rates_bitget.csv.gz` | 6.2 MB | 333,196 |
| `funding_rates_bitmex.csv.gz` | 247 KB | 22,860 |
| `funding_rates_btse.csv.gz` | 1.0 MB | 69,131 |
| `funding_rates_bybit.csv.gz` | 6.6 MB | 350,748 |
| `funding_rates_coinbase_intx.csv.gz` | 640 KB | 67,716 |
| `funding_rates_deribit.csv.gz` | 80 KB | 6,840 |
| `funding_rates_dydx.csv.gz` | 965 KB | 75,144 |
| `funding_rates_gateio.csv.gz` | 6.0 MB | 395,457 |
| `funding_rates_htx.csv.gz` | 1.3 MB | 109,719 |
| `funding_rates_hyperliquid.csv.gz` | 1.9 MB | 141,064 |
| `funding_rates_kraken.csv.gz` | 3.6 MB | 164,153 |
| `funding_rates_kucoin.csv.gz` | 6.0 MB | 281,818 |
| `funding_rates_mexc.csv.gz` | 8.7 MB | 470,290 |
| `funding_rates_okx.csv.gz` | 2.8 MB | 178,193 |
| `funding_rates_phemex.csv.gz` | 3.2 MB | 251,015 |
| `funding_rates_whitebit.csv.gz` | 1.4 MB | 106,722 |

## Per-exchange row counts

| Exchange | Rows | Distinct symbols |
|---|---|---|
| backpack | 26,296 | 76 |
| binance | 433,490 | 670 |
| bingx | 314,825 | 634 |
| bitfinex | 29,456 | 56 |
| bitget | 333,196 | 544 |
| bitmex | 22,860 | 46 |
| btse | 69,131 | 160 |
| bybit | 350,748 | 544 |
| coinbase_intx | 67,716 | 171 |
| deribit | 6,840 | 18 |
| dydx | 75,144 | 124 |
| gateio | 395,457 | 644 |
| htx | 109,719 | 219 |
| hyperliquid | 141,064 | 229 |
| kraken | 164,153 | 322 |
| kucoin | 281,818 | 562 |
| mexc | 470,290 | 764 |
| okx | 178,193 | 288 |
| phemex | 251,015 | 508 |
| whitebit | 106,722 | 294 |
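
The table above can be recomputed from the combined file as a sanity check after download. This is a minimal sketch using only the standard library; `summarize` and `summarize_file` are hypothetical helper names, and the file path is the one listed in the archive table.

```python
import csv
import gzip
from collections import defaultdict


def summarize(rows):
    """Return {exchange: (row_count, distinct_symbol_count)} for an iterable of row dicts."""
    counts = defaultdict(int)
    symbols = defaultdict(set)
    for row in rows:
        counts[row["exchange"]] += 1
        symbols[row["exchange"]].add(row["symbol"])
    return {ex: (counts[ex], len(symbols[ex])) for ex in sorted(counts)}


def summarize_file(path="funding_rates_combined.csv.gz"):
    """Stream the gzipped CSV once and summarize it without loading it into memory."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as f:
        return summarize(csv.DictReader(f))
```

Calling `summarize_file()` and comparing the result with the table is a quick way to confirm that the archive was downloaded completely.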

## Quick start

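```python
import csv
import gzip

# Stream the combined archive row by row instead of loading it into memory.
# csv.DictReader yields every value as a string, so convert before doing math.
with gzip.open("funding_rates_combined.csv.gz", "rt", encoding="utf-8", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        if row["exchange"] == "binance" and row["base"] == "BTC":
            print(row["funding_rate"], row["fetched_at"])
```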

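```python
# With pandas (loads all ~3.8M rows into memory)
import pandas as pd

df = pd.read_csv("funding_rates_combined.csv.gz", compression="gzip")
btc = df[df["base"] == "BTC"]
# Summary statistics of the per-period funding rate, one line per exchange
print(btc.groupby("exchange")["funding_rate"].describe())
```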
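
The examples above print raw per-period rates. Since funding intervals differ between venues and symbols, a like-for-like comparison needs the rate divided by `funding_interval_hours`. The following is a sketch of one way to do that for the most recent observation per exchange; `latest_hourly_rates` is a hypothetical helper, and it assumes the interval column parses as a positive integer.

```python
def latest_hourly_rates(rows, base):
    """Map exchange -> hourly-equivalent funding rate from the newest row for `base`."""
    latest = {}  # exchange -> (fetched_at, hourly_rate)
    for row in rows:
        if row["base"] != base:
            continue
        ts = int(row["fetched_at"])
        exchange = row["exchange"]
        if exchange not in latest or ts > latest[exchange][0]:
            hourly = float(row["funding_rate"]) / int(row["funding_interval_hours"])
            latest[exchange] = (ts, hourly)
    return {exchange: rate for exchange, (_, rate) in latest.items()}


def widest_spread(hourly_rates):
    """Return (lowest_exchange, highest_exchange, spread) or None if fewer than two venues."""
    if len(hourly_rates) < 2:
        return None
    low = min(hourly_rates, key=hourly_rates.get)
    high = max(hourly_rates, key=hourly_rates.get)
    return low, high, hourly_rates[high] - hourly_rates[low]
```

`rows` can be any iterable of dicts, for example a `csv.DictReader` over the combined file, so the whole archive can be scanned in a single streaming pass.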

## Notes on quality

- **Funding rate sign convention**: positive means longs pay shorts, negative means shorts pay longs. This is the usual convention for perpetual swaps on major exchanges.
- **Per-symbol funding interval**: about 30% of perps on Binance and Bybit settle every 4h instead of 8h, and Hyperliquid is entirely 1h. The `funding_interval_hours` column reflects each symbol's actual interval as fetched from the upstream `fundingInfo` / `instruments-info` / equivalent endpoint.
- **Symbol normalization**: every row uses `{BASE}USDT` form for consistency. Cross-exchange queries (e.g. "BTC across all venues") work without alias mapping.
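- **Volume and unit normalization**: `volume_24h_usd` is the 24h traded volume converted to USD from each exchange's native units. Four normalization bugs were caught during integration; the first two affect volume, the third affects `funding_rate`, and the fourth affects `base`: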
  - OKX `volCcy24h` is in BASE coin units for SWAPs (not USDT) → multiply by `last`
  - BTSE `volume` is contract count (not base units) → multiply by `contractSize`
  - Kraken `fundingRate` is USD-per-contract per period (not decimal) → divide by `mark_price`
  - BitMEX uses XBT for Bitcoin internally → normalize to BTC
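
The four fixes listed above can be sketched as small conversion helpers. The upstream field names (`volCcy24h`, `last`, `volume`, `contractSize`, `fundingRate`) come from the notes; the function names and the alias table are hypothetical and are not taken from `collector.py`.

```python
# Hypothetical helpers illustrating the four normalization fixes.
# They are a sketch of the arithmetic only, not the collector's actual code.

BASE_ALIASES = {"XBT": "BTC"}  # only the alias named in the notes


def okx_swap_volume_usd(vol_ccy_24h: float, last: float) -> float:
    """OKX SWAP volCcy24h is in base-coin units, so price it with `last`."""
    return vol_ccy_24h * last


def btse_volume_base(volume_contracts: float, contract_size: float) -> float:
    """BTSE volume is a contract count; contractSize converts it to base units."""
    return volume_contracts * contract_size


def kraken_rate_decimal(funding_rate_usd: float, mark_price: float) -> float:
    """Kraken fundingRate is USD per contract per period; divide by mark price."""
    if mark_price <= 0:
        raise ValueError("mark_price must be positive")
    return funding_rate_usd / mark_price


def canonical_base(base: str) -> str:
    """Map an exchange-specific ticker to the canonical base coin."""
    ticker = base.upper()
    return BASE_ALIASES.get(ticker, ticker)
```

Note that the BTSE helper stops at base units, as the note does; turning that into a USD figure would need a price, which is an extra step not described here.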

## Limitations

- This is a **point-in-time snapshot** dataset, not a true streaming feed. Data is collected by polling every 5 minutes, so any change between two polls is not captured.
- The `fetched_at` timestamp is when WE polled the exchange, not when the upstream itself updated the rate. The actual funding rate may have been computed slightly earlier.
- Coverage starts from when the collector was first deployed (2026-04-07 19:23 UTC). There is no historical backfill.
- No order book depth data, no liquidations, no spot prices. This is a focused dataset.
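
Because rows are only written when an exchange API responds, individual `(exchange, symbol)` series may skip cycles. The sketch below shows one way to locate such gaps from `fetched_at` values. `find_gaps` is a hypothetical helper; the 300-second cycle matches the stated refresh cadence, while the 1.5x tolerance is an arbitrary assumption to absorb polling jitter.

```python
def find_gaps(timestamps, cycle_seconds=300, tolerance=1.5):
    """Return (previous, next) pairs of Unix seconds that are further apart than expected."""
    ordered = sorted(set(int(t) for t in timestamps))
    limit = cycle_seconds * tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]


# Polls at 0, 5 and 10 minutes, then nothing until minute 25
print(find_gaps([0, 300, 600, 1500, 1800]))  # [(600, 1500)]
```

Feeding it the `fetched_at` values of a single exchange and symbol gives the windows in which that series has no observations, which is useful before resampling or computing time-weighted averages.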

## License

Data: CC BY 4.0 (free to use, share, modify, with attribution to "Funding Finder").
The methodology README and the build script: MIT.

## Citation

If you use this dataset in academic or commercial research, please cite:

> Funding Finder. *Cross-exchange perpetual funding rate dataset*. 2026-04. http://178.104.60.252:8083/datasets/

## Source code

The collector that produces this dataset is open source: a single Python file (`collector.py`, ~1500 lines) that polls 20 exchanges every 5 minutes and writes to a SQLite table. Available as a downloadable tarball at `http://178.104.60.252:8083/downloads/funding-collector-0.4.3.tar.gz` (will move to GitHub when credentials arrive).
