How do I add a test dataset and benchmark that include multiple special metrics? #3305
Dear MTEB Team,

I would like to add the PosIR benchmark to MTEB in the near future. Details about this benchmark are available in this PR: #3147. PosIR defines at least two main metrics:

Given this, we would like the leaderboard to surface both metrics with the following columns:

Would MTEB support this change? Is there a straightforward way to implement it? Should I also update the MTEB leaderboard UI code?

Thank you for your guidance.

Best regards,
Replies: 1 comment 6 replies
With the recent addition of RTEB, we began allowing custom summary tables. This has already been used by multiple benchmarks, including MIEB, RTEB, and HUME.
You can see an example of how it is implemented here:
mteb/mteb/benchmarks/benchmark.py
Line 113 in d2c704c
I think it should cover your use case.
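
To make the idea of a summary table with two metric columns concrete, here is a rough, library-independent sketch. It does not use MTEB's actual API: the function name `build_summary_table`, the row layout, and the metric names `metric_a` and `metric_b` are all hypothetical placeholders, not PosIR's real metrics. The linked file shows how the real hook is wired up.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-task results. "metric_a" and "metric_b" are placeholder
# names, not the actual PosIR metrics.
rows = [
    {"model": "model-x", "task": "task-1", "metric_a": 0.50, "metric_b": 0.25},
    {"model": "model-x", "task": "task-2", "metric_a": 0.75, "metric_b": 0.25},
    {"model": "model-y", "task": "task-1", "metric_a": 0.25, "metric_b": 0.50},
    {"model": "model-y", "task": "task-2", "metric_a": 0.25, "metric_b": 0.50},
]


def build_summary_table(rows, metrics):
    """Average each metric over tasks, yielding one summary row per model.

    Illustrative helper only; not part of MTEB.
    """
    per_model = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for metric in metrics:
            per_model[row["model"]][metric].append(row[metric])

    table = []
    for model, by_metric in per_model.items():
        entry = {"model": model}
        for metric in metrics:
            # One leaderboard column per metric.
            entry[f"mean_{metric}"] = mean(by_metric[metric])
        table.append(entry)

    # Rank by the first metric, best model first.
    table.sort(key=lambda e: e[f"mean_{metrics[0]}"], reverse=True)
    return table


summary = build_summary_table(rows, ["metric_a", "metric_b"])
for entry in summary:
    print(entry)
```

Each model ends up with one column per metric, which is the shape the question asks the leaderboard to surface. How ranking should combine the two metrics is a design choice for the benchmark authors; sorting by the first metric here is only an assumption.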