Source

This dataset is a processed version of high-frequency futures data from the Chinese commodity market, covering a one-year period from 2022-08-01 to 2023-08-01. The original raw data is described in the paper [1].

Access to the raw data can be requested by contacting the authors directly.

File: price_df_v2.parquet

This file contains millisecond-level price data for the most liquid futures contracts of selected commodity products.

Columns

Column Name           Description
Timeindex             Millisecond-level timestamp index
asset.{product}_0     Price of the most liquid futures contract for the corresponding {product}
info.segment_index    Segment identifier used to separate trading periods
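
As a minimal sketch of how the column naming scheme might be used, the Python snippet below loads the file with pandas and extracts the {product} identifiers from the asset.{product}_0 column names. The helper names load_prices and product_ids are hypothetical and not part of the dataset. The snippet assumes pandas with a parquet engine such as pyarrow is installed, and it does not assume whether the timestamp is stored as the DataFrame index or as a regular column.

```python
import pandas as pd

ASSET_PREFIX = "asset."
ASSET_SUFFIX = "_0"


def load_prices(path="price_df_v2.parquet"):
    """Read the price file into a DataFrame (assumes a parquet engine is installed)."""
    return pd.read_parquet(path)


def product_ids(columns):
    """Return the {product} part of every column named asset.{product}_0."""
    return [
        col[len(ASSET_PREFIX):-len(ASSET_SUFFIX)]
        for col in columns
        if col.startswith(ASSET_PREFIX) and col.endswith(ASSET_SUFFIX)
    ]
```

Calling product_ids(load_prices().columns) would then list the identifiers to look up in Table 3 of the Supplementary Material of [1].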

Notes:

  • The info.segment_index column is useful for avoiding calculations (e.g., returns) that span different trading segments, such as overnight gaps or session breaks.
  • The corresponding commodities for each {product} identifier are provided in Table 3 of the Supplementary Material of paper [1].
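
The first note can be illustrated with a short Python sketch that computes log returns separately within each trading segment, so that no return is taken across an overnight gap or session break. This is one possible approach, not necessarily the one used in [1]. The function name segment_log_returns is hypothetical, and the code assumes that every asset column starts with "asset." and holds strictly positive prices.

```python
import numpy as np
import pandas as pd

SEGMENT_COL = "info.segment_index"


def segment_log_returns(price_df, segment_col=SEGMENT_COL):
    """Log returns per asset column, differenced only within each segment."""
    asset_cols = [c for c in price_df.columns if c.startswith("asset.")]
    log_prices = np.log(price_df[asset_cols])
    # Group rows by segment so the first row of every segment yields NaN
    # instead of a return measured against the previous segment's last price.
    segments = price_df[segment_col].to_numpy()
    return log_prices.groupby(segments).diff()
```

The first observation of each segment comes out as NaN, which can be dropped before further analysis.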

References

[1] Jiang, J., Richards, J., Huser, R., & Bolin, D. (2025). The Efficient Tail Hypothesis: An Extreme Value Perspective on Market Efficiency. Journal of Business & Economic Statistics, to appear.