Simple experiment
→ simply remove these biased tokens (e.g., high-frequency subwords and punctuation) → use the average of the remaining token embeddings as the sentence representation (see the sketch after this block)
Conclusion: avoiding embedding bias can improve the performance of sentence representations.
Limitation: manually removing embedding biases is labor-intensive.
→ Solution: reformulate sentence representation as a fill-in-the-blank problem using different prompts.
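A minimal sketch of the debiased-averaging experiment, assuming bert-base-uncased and an illustrative `BIASED_TOKENS` set (the actual biased tokens would come from frequency statistics, and the same filtering can also be applied to static token embeddings):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

# Hypothetical bias list: punctuation plus a few high-frequency subwords.
# Illustrative only; a real list would be derived from token frequency statistics.
BIASED_TOKENS = {".", ",", "!", "?", "the", "a", "an", "of", "[CLS]", "[SEP]"}

def debiased_avg_embedding(sentence: str) -> torch.Tensor:
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]          # (seq_len, dim)
    tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    keep = [i for i, t in enumerate(tokens) if t not in BIASED_TOKENS]
    # Sentence representation = average of the remaining (non-biased) token embeddings.
    return hidden[keep].mean(dim=0)
```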
(1) Contrastive-learning-based methods → construct positive sentence pairs.
(2) Anisotropy: BERT-flow, BERT-whitening → reduce the anisotropy by post-processing the sentence embeddings from the original BERT.
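For reference, the post-processing idea behind whitening can be sketched as follows (a minimal numpy version, not the official BERT-whitening code): center the sentence embeddings and decorrelate their dimensions so the embedding space becomes roughly isotropic.

```python
import numpy as np

def whiten(embeddings: np.ndarray) -> np.ndarray:
    """Whiten an (n_sentences, dim) matrix of sentence embeddings.

    Centering plus decorrelating the dimensions makes the embedding
    distribution roughly isotropic, which is the goal of this line of work.
    """
    mu = embeddings.mean(axis=0, keepdims=True)
    cov = np.cov(embeddings - mu, rowvar=False)      # (dim, dim) covariance
    u, s, _ = np.linalg.svd(cov)
    w = u @ np.diag(1.0 / np.sqrt(s))                # whitening transform
    return (embeddings - mu) @ w
```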
Analysis of the effect of BERT layers → compare two sentence embeddings:
(1) averaging static token embeddings (input of BERT)
(2) averaging the last layer (output of BERT)
→ measure the sentence-level anisotropy
: the anisotropy of sentence embeddings is computed from the cosine similarity between sentence embeddings (see the sketch below)
→ the closer to 1, the more anisotropic
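A minimal sketch of this comparison, assuming bert-base-uncased and a tiny placeholder sentence list (the paper measures over a full corpus): build both kinds of sentence embeddings, then estimate sentence-level anisotropy as the average cosine similarity over all sentence pairs.

```python
import torch
from itertools import combinations
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def sentence_embedding(sentence: str, use_last_layer: bool) -> torch.Tensor:
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        if use_last_layer:
            # (2) average of the last transformer layer (output of BERT)
            hidden = model(**inputs).last_hidden_state[0]
        else:
            # (1) average of the static token embeddings (input of BERT, no layers applied)
            hidden = model.get_input_embeddings()(inputs["input_ids"])[0]
    return hidden.mean(dim=0)

def sentence_anisotropy(embeddings: list[torch.Tensor]) -> float:
    # Average cosine similarity over all sentence pairs; closer to 1 = more anisotropic.
    sims = [torch.cosine_similarity(a, b, dim=0).item()
            for a, b in combinations(embeddings, 2)]
    return sum(sims) / len(sims)

sentences = ["a man is playing a guitar", "a woman is cooking", "the cat sleeps on the sofa"]
for use_last in (False, True):
    embs = [sentence_embedding(s, use_last) for s in sentences]
    print("last layer" if use_last else "static", sentence_anisotropy(embs))
```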
bert-base-uncased, roberta-base: the BERT layers harm sentence embedding performance (the last-layer average underperforms the static-token average)
Also, the performance degradation from the BERT layers is NOT due to sentence-level anisotropy
→ Reason: the last-layer average is actually more isotropic than the static token average