# Einsum Benchmark

Licensed under CC BY 4.0.

## Quick Start
To install the base package, Python 3.10 or higher is required.

```shell
pip install einsum_benchmark
```
Afterwards, you can load and run an instance like this:

```python
import opt_einsum as oe
import einsum_benchmark

instance = einsum_benchmark.instances["qc_circuit_n49_m14_s9_e6_pEFGH_simplified"]

opt_size_path_meta = instance.paths.opt_size
print("Size optimized path")
print("log10[FLOPS]:", round(opt_size_path_meta.flops, 2))
print("log2[SIZE]:", round(opt_size_path_meta.size, 2))

result = oe.contract(
    instance.format_string, *instance.tensors, optimize=opt_size_path_meta.path
)
print("sum[OUTPUT]:", result.sum(), instance.result_sum)
```
Alternatively, download and unzip the dataset from Zenodo, then install opt_einsum and numpy:

```shell
pip install opt_einsum numpy
```
Now you can load and run an instance like this:
```python
import opt_einsum as oe
import pickle
import numpy as np

if __name__ == "__main__":
    with open(
        "./instances/qc_circuit_n49_m14_s9_e6_pEFGH_simplified.pkl", "rb"
    ) as file:
        format_string, tensors, path_meta, sum_output = pickle.load(file)

    # path optimized for minimal intermediate size
    path, size_log2, flops_log10, min_density, avg_density = path_meta[0]
    # the path optimized for minimal total flops is stored under key 1 in path_meta
    print("Size optimized path")
    print("log10[FLOPS]:", round(flops_log10, 2))
    print("log2[SIZE]:", round(size_log2, 2))
    result = oe.contract(format_string, *tensors, optimize=path)
    print("sum[OUTPUT]:", np.sum(result), sum_output)
```
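The precomputed paths shipped with each instance play the same role as the contraction orders that NumPy's `einsum_path` computes: they fix the pairwise contraction order before execution, so the expensive path search is not repeated at contraction time. A minimal self-contained sketch of this pattern on toy tensors (using plain NumPy rather than the benchmark's pickle files):

```python
import numpy as np

# Toy contraction: a chain matrix product whose intermediate sizes differ
# greatly depending on the contraction order.
a = np.random.rand(10, 100)
b = np.random.rand(100, 5)
c = np.random.rand(5, 50)

# einsum_path returns a contraction order plus a human-readable cost report.
path, report = np.einsum_path("ij,jk,kl->il", a, b, c, optimize="optimal")

# Reusing the precomputed path skips the path search on subsequent calls.
result = np.einsum("ij,jk,kl->il", a, b, c, optimize=path)
assert np.allclose(result, a @ b @ c)
```

The benchmark's stored paths serve the same purpose for much larger expressions, where searching for a good order from scratch can dominate the total runtime.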
For more information, please check out Getting Started.
## Why was this benchmark compiled?
Modern artificial intelligence and machine learning workflows rely on efficient tensor libraries. Current einsum libraries are tuned to efficiently execute tensor expressions with only a few, relatively large, dense, floating-point tensors. However, practical applications of einsum cover a much broader range of tensor expressions than those that can currently be executed efficiently. For this reason, we have created a benchmark dataset that encompasses this broad range of tensor expressions, giving future einsum implementations a suite to build upon and be evaluated against.
## Overview of the instances in the dataset
The benchmark dataset consists of 168 einsum problems divided into seven categories. A hyperedge is a contraction index that is shared by more than two tensors. Hadamard products are element-wise multiplications of two tensors with identical dimensions. Repeated indices within a single tensor denote either a tensor trace or a tensor diagonal, depending on the indices of the output tensor.
| Category | Problems | Tensors | Hyperedges | Hadamards | Traces |
|---|---|---|---|---|---|
| Graphical models | 10 | 125—3,692 | | | |
| Tensor network language models | 25 | 38—178 | | | |
| Model counting | 50 | 331—579,972 | | | |
| Quantum computing | 32 | 202—17,136 | | | |
| Random problems | 16 | 53—1,668 | | | |
| Structural problems | 21 | 26—2,000 | | | |
| Weighted model counting | 14 | 358—230,848 | | | |
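The three structural features above can be illustrated with plain `np.einsum` on toy tensors; a minimal sketch:

```python
import numpy as np

# Hyperedge: index k is shared by three tensors, not just two.
a = np.random.rand(2, 3)
b = np.random.rand(3, 4)
c = np.random.rand(3)
hyper = np.einsum("ik,kj,k->ij", a, b, c)
assert np.allclose(hyper, (a * c) @ b)

# Hadamard product: element-wise multiplication of same-shape tensors.
x = np.random.rand(5, 5)
y = np.random.rand(5, 5)
hadamard = np.einsum("ij,ij->ij", x, y)
assert np.allclose(hadamard, x * y)

# Trace: a repeated index within one tensor, summed out of the output.
m = np.random.rand(4, 4)
trace = np.einsum("ii->", m)
assert np.isclose(trace, np.trace(m))

# Diagonal: the same repeated index, but kept in the output.
diag = np.einsum("ii->i", m)
assert np.allclose(diag, np.diag(m))
```

Expressions containing these features are exactly the cases where the pairwise-matrix-multiplication fast path of current einsum backends does not apply directly.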