I am using Sentence Transformers for semantic search, but sometimes it fails to capture the contextual meaning and returns wrong results, e.g. BERT issues with Italian context/semantic search.
By default the sentence embedding is a vector with 768 dimensions. How can I increase that dimension so that the model understands contextual meaning more deeply?
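As background: the embedding size is fixed by the model architecture, and search quality comes from how vectors are compared rather than from their raw dimensionality. A minimal sketch of the cosine-similarity comparison that underlies semantic search, using toy vectors (the numbers are illustrative, not output of any real model):

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity: dot product divided by the product of L2 norms.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" (real BERT vectors have 768 dimensions).
emb_a = [0.2, -0.4, 0.1, 0.9]
emb_b = [0.25, -0.35, 0.0, 0.8]
print(cosine_sim(emb_a, emb_b))
```

Two sentences with similar meanings should map to vectors with high cosine similarity; this is the score a semantic search ranks by.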
Code:
# Load the BERT Model
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('bert-base-nli-mean-tokens')
# Setup a Corpus
# A corpus is a list with documents split by sentences.
sentences = ['Absence of sanity',
'Lack of saneness',
'A man is eating food.',
'A man is eating a piece of bread.',
'The girl is carrying a baby.',
'A man is riding a horse.',
'A woman is playing violin.',
'Two men pushed carts through the woods.',
'A man is riding a white horse on an enclosed ground.',
'A monkey is playing drums.',
'A cheetah is running behind its prey.']
# Each sentence is encoded as a 1-D vector with 768 dimensions
sentence_embeddings = model.encode(sentences) ### how to increase the vector dimension?
print('Sample BERT embedding vector - length', len(sentence_embeddings[0]))
print('Sample BERT embedding vector - note includes negative values', sentence_embeddings[0])
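To turn embeddings like the ones above into a semantic search, each corpus vector is scored against a query vector and the corpus is sorted by score. A minimal NumPy sketch of that ranking step; toy random vectors stand in for `model.encode()` output so the snippet runs standalone (real vectors would be 768-dimensional):

```python
import numpy as np

def semantic_search(query_vec, corpus_vecs, top_k=3):
    # L2-normalize so that dot products equal cosine similarities.
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q
    # Indices of the top_k highest-scoring corpus entries, best first.
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]

# Toy stand-ins for model.encode(sentences) output.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(5, 8))
query = corpus[2] + 0.01 * rng.normal(size=8)  # near-duplicate of entry 2
print(semantic_search(query, corpus))
```

The best-scoring index should be the near-duplicate entry. The `sentence_transformers.util` module ships a `semantic_search` helper that performs the same ranking on real embeddings.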