Goal: run this auto-labeling notebook on AWS SageMaker Jupyter Labs.
Kernels tried: conda_pytorch_p36, conda_python3, conda_amazonei_mxnet_p27.
! pip install farm-haystack -q
# Install the latest master of Haystack
!pip install grpcio-tools==1.34.1 -q
!pip install git+https://github.com/deepset-ai/haystack.git -q
!wget --no-check-certificate https://dl.xpdfreader.com/xpdf-tools-linux-4.03.tar.gz
!tar -xvf xpdf-tools-linux-4.03.tar.gz && sudo cp xpdf-tools-linux-4.03/bin64/pdftotext /usr/local/bin
I am using Haystack to run search queries. When I write documents to the document store, I unfortunately get this error. Here is my code:
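import pandas as pd
# Import paths assume Haystack 0.x; in 1.x these classes are exposed
# from haystack.document_stores and haystack.nodes instead.
from haystack.document_store.elasticsearch import ElasticsearchDocumentStore
from haystack.retriever.dense import DensePassageRetriever

if __name__ == "__main__":
    # Connect to a local Elasticsearch instance and use the 'aurelius' index
    document_store = ElasticsearchDocumentStore(
        host='localhost',
        username='', password='',
        index='aurelius'
    )
    df = pd.read_csv('news.csv')
    print(df.columns)
    # One document dict per row of the 'Text' column
    data_json = [{
        'text': text,
        'meta': {
            'source': 'news'
        }
    } for text in df['Text'].values]
    document_store.write_documents(data_json)
    retriever_elastic = DensePassageRetriever(
        document_store=document_store,
        query_embedding_model='facebook/dpr-question_encoder-single-nq-base',
        passage_embedding_model='facebook/dpr-ctx_encoder-single-nq-base',
        embed_title=True
    )
    # Compute and store DPR embeddings for the indexed documents
    document_store.update_embeddings(retriever=retriever_elastic)
    print(retriever_elastic.retrieve("german business confidence slides german business confidence fell in february knocking hopes of a speedy recovery in europe s largest economy. "))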
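The error message itself is not shown, so the following is not a diagnosis. It is only a minimal sketch of the document-building step from the code above, written with the standard-library csv module so that it runs without pandas or Elasticsearch. The helper name rows_to_documents and the sample data are hypothetical; the 'text' and 'meta' keys and the 'Text' column are taken from the question. Skipping blank cells is an assumption about what might be worth ruling out before calling write_documents.

```python
import csv
import io


def rows_to_documents(csv_file, text_column="Text", source="news"):
    """Build document dicts in the question's format, skipping rows with no text.

    Hypothetical helper: mirrors the list comprehension in the question,
    but drops cells that are missing or contain only whitespace.
    """
    documents = []
    for row in csv.DictReader(csv_file):
        text = (row.get(text_column) or "").strip()
        if not text:
            continue  # nothing to index for this row
        documents.append({"text": text, "meta": {"source": source}})
    return documents


# Made-up sample standing in for news.csv
sample = io.StringIO("Text\nfirst article\n   \nsecond article\n")
docs = rows_to_documents(sample)
print(len(docs), "documents ready to write")
```

The resulting list could then be passed to document_store.write_documents in place of data_json, assuming the same Haystack version as in the question.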