Some questions about the results in Table 5

#17

by begonie - opened Mar 25, 2025

Discussion

begonie

Mar 25, 2025


edited Mar 25, 2025

I am trying to reproduce the evaluation metrics reported for gte-multilingual-reranker-base.

Retrieval model: gte-multilingual-base
Ranker model: gte-multilingual-reranker-base
Dataset: MLDR (nDCG@10 [13])
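
For reference, here is a minimal sketch of how nDCG@10 is commonly computed, assuming linear gain and a log2 rank discount. The function names `dcg_at_k` and `ndcg_at_k` and the toy document ids are illustrative only. They are not part of FlagEmbedding, and the evaluation script may differ in details.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance labels."""
    # Position 0 is rank 1, so the discount is log2(rank + 1) = log2(index + 2).
    return sum(rel / math.log2(idx + 2) for idx, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """nDCG@k for one query.

    ranked_doc_ids: document ids in the order the system returned them.
    qrels: dict mapping document id -> relevance label (missing ids count as 0).
    """
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    ideal = sorted(qrels.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


# Toy example: a single relevant document, ranked first vs. second.
qrels = {"d1": 1}
print(ndcg_at_k(["d1", "d2"], qrels))
print(ndcg_at_k(["d2", "d1"], qrels))
```

With a single relevant document per query, moving it from rank 1 to rank 2 already scales the score by 1/log2(3), so small ranking differences from the reranker can shift the average noticeably.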

CMD:

python -m FlagEmbedding.evaluation.mldr \
    --eval_name mldr \
    --dataset_dir ./mldr/data \
    --dataset_names ar de en es fr hi it ja ko pt ru th zh \
    --splits test \
    --corpus_embd_save_dir ./mldr/corpus_embd \
    --output_dir ./mldr/search_results \
    --search_top_k 1000 \
    --rerank_top_k 100 \
    --overwrite False \
    --k_values 10 100 \
    --eval_output_method markdown \
    --eval_output_path ./mldr/mldr_eval_results.md \
    --eval_metrics ndcg_at_10 \
    --embedder_name_or_path Alibaba-NLP/gte-multilingual-base \
    --reranker_name_or_path Alibaba-NLP/gte-multilingual-reranker-base \
    --embedder_passage_max_length 8192 \
    --reranker_max_length 8192 \
    --trust_remote_code True \
    --embedder_batch_size 64 \
    --reranker_batch_size 64

Result:

Model	Reranker	average	ar-test	de-test	en-test	es-test	fr-test	hi-test	it-test	ja-test	ko-test	pt-test	ru-test	th-test	zh-test
gte-multilingual-base	gte-multilingual-reranker-base	72.875	77.082	68.048	69.663	94.798	88.294	65.428	82.078	67.169	70.880	88.400	83.732	47.039	44.763
gte-multilingual-base	NoReranker	56.602	54.981	55.155	51.032	81.228	76.218	45.197	66.926	52.053	46.773	79.298	64.037	35.472	27.461

I have a question. The gte-multilingual-base average of 56.6 matches Table 5. However, after adding gte-multilingual-reranker-base, the average is only 72.875, which does not match the 78.7 reported in the paper. Is there something wrong with my usage?
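
As a sanity check on the numbers above, the following sketch recomputes both averages from the per-language scores in my results and shows the gap to the paper's 78.7. It assumes the reported average is a plain unweighted mean over the 13 languages. The variable names are my own.

```python
from statistics import mean

LANGS = ["ar", "de", "en", "es", "fr", "hi", "it", "ja", "ko", "pt", "ru", "th", "zh"]

# Per-language nDCG@10 copied from the result table above.
with_reranker = [77.082, 68.048, 69.663, 94.798, 88.294, 65.428, 82.078,
                 67.169, 70.880, 88.400, 83.732, 47.039, 44.763]
no_reranker = [54.981, 55.155, 51.032, 81.228, 76.218, 45.197, 66.926,
               52.053, 46.773, 79.298, 64.037, 35.472, 27.461]

PAPER_RERANKED_AVG = 78.7  # reranked average quoted from the paper

# Assumption: the average column is an unweighted mean over languages.
avg_reranked = mean(with_reranker)
avg_retrieval = mean(no_reranker)
gap = PAPER_RERANKED_AVG - avg_reranked

# The three languages with the lowest reranked scores.
weakest = sorted(zip(LANGS, with_reranker), key=lambda pair: pair[1])[:3]

print(f"retrieval only: {avg_retrieval:.3f}")
print(f"with reranker:  {avg_reranked:.3f}")
print(f"gap to paper:   {gap:.3f}")
print("lowest reranked scores:", weakest)
```

The recomputed means match the average column, so the difference of about 5.8 points comes from the per-language scores themselves and not from how the average is taken.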

begonie changed discussion title from Table 5 中结果的一些疑问 to Some questions about the results in Table 5 Mar 25, 2025
