1. Abstract & Introduction

Simple Contrastive Sentence Embedding Framework

Unsupervised method

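⇒ The same sentence is fed through the encoder twice with dropout applied → the two resulting embeddings = positive pair

⇒ Other sentences are fed in as negatives, and the model has to predict the positive among them

⇒ Dropout works as minimal data augmentation

⇒ Removing dropout leads to representation collapse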
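
The unsupervised steps above can be sketched in a few lines. This is only an illustration, not the paper's implementation: the "encoder" here is nothing more than dropout applied to fixed feature vectors, and the names `dropout`, `cosine`, `simcse_unsup_loss`, as well as the defaults `p=0.1` and `temperature=0.05`, are assumptions chosen for the sketch.

```python
import math
import random

def dropout(vec, p, rng):
    """Inverted dropout: zero each element with probability p, rescale the rest."""
    return [0.0 if rng.random() < p else x / (1.0 - p) for x in vec]

def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-12)

def simcse_unsup_loss(features, p=0.1, temperature=0.05, seed=0):
    """Cross-entropy over in-batch similarities (hypothetical helper).

    features: list of sentence feature vectors (stand-in for encoder inputs).
    """
    rng = random.Random(seed)
    # Two passes over the same inputs with independent dropout masks.
    z1 = [dropout(f, p, rng) for f in features]
    z2 = [dropout(f, p, rng) for f in features]
    n = len(features)
    total = 0.0
    for i in range(n):
        # Row i: similarity of view 1 of sentence i to view 2 of every sentence.
        logits = [cosine(z1[i], z2[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # The positive is the same sentence (index i); the rest are negatives.
        total += log_denom - logits[i]
    return total / n

if __name__ == "__main__":
    rng = random.Random(1)
    batch = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(4)]
    print(simcse_unsup_loss(batch))
```

With `p=0` both views of a sentence are identical, which corresponds to the no-dropout setting mentioned above: the positive pair then carries no augmentation noise at all.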

Supervised method

Uses NLI data

Evidence of performance

(1) alignment between semantically-related positive pairs

(2) improve uniformity
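
The two properties above are commonly measured with the alignment and uniformity metrics on L2-normalized embeddings. The sketch below uses the usual definitions (mean squared distance between positive pairs, and the log of the mean Gaussian potential over all pairs with `t=2`); the function names and the choice of `t` are assumptions made for illustration, and the notes themselves do not specify the exact formulas.

```python
import math

def l2_normalize(v):
    """Scale a nonzero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def alignment(pairs):
    """Mean squared distance between normalized positive pairs (lower is better)."""
    return sum(sq_dist(l2_normalize(a), l2_normalize(b)) for a, b in pairs) / len(pairs)

def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over distinct pairs (lower is better)."""
    z = [l2_normalize(e) for e in embeddings]
    vals = [math.exp(-t * sq_dist(z[i], z[j]))
            for i in range(len(z)) for j in range(i + 1, len(z))]
    return math.log(sum(vals) / len(vals))

if __name__ == "__main__":
    spread = [[1.0, 0.0], [-1.0, 0.0]]      # opposite points on the unit circle
    collapsed = [[1.0, 0.0], [1.0, 0.0]]    # all embeddings identical
    print(uniformity(spread), uniformity(collapsed))
```

A collapsed embedding space gives the worst possible uniformity under this definition (all pairwise distances are zero), while well-spread embeddings push the value down.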

contrastive learning objective “flattens” the singular value distribution of the sentence embedding space

Evaluation