Similarity search with BERT + ES7 (pending testing and an update to the Chinese BERT pretrained model)

Updated: 2023-05-30 10:05:38

version: '3.7'
services:
  # web:
  #   build: ./web
  #   ports:
  #     - "5000:5000"
  #   environment:
  #     - INDEX_NAME
  #   depends_on:
  #     - elasticsearch
  #     - bertserving
  #   deploy:
  #     resources:
  #       limits:
  #         memory: 500M
  elasticsearch:
    # the registry host in front of this image path is missing in the source
    image: /elasticsearch/elasticsearch:7.7.1
    ports:
      - "9200:9200"
    volumes:
      - es-data:/usr/share/elasticsearch/data
    tty: true
    environment:
    deploy:
      resources:
        limits:
          memory: 1G
  bertserving:
    build: ./bertserving
    ports:
      - "5555:5555"
      - "5556:5556"
    environment:
      - PATH_MODEL=${PATH_MODEL}
    volumes:
      - "${PATH_MODEL}:/model"
    deploy:
      resources:
        limits:
          memory: 8G  # bert-serving needs a lot of memory at runtime
volumes:
  es-data:
    driver: local
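export PATH_MODEL=./cased_L-12_H-768_A-12
PATH_MODEL is the path to the BERT model directory; it is mounted into the bertserving container as /model.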
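Before starting the containers, it may help to confirm that PATH_MODEL points at an unpacked BERT checkpoint. The sketch below is not part of the original setup: the helper name `missing_model_files` is hypothetical, and the expected file names are an assumption based on the layout of Google's published BERT checkpoints (`bert_config.json`, `vocab.txt`, `bert_model.ckpt.*`).

```python
import os
from pathlib import Path

# Assumption: file names follow Google's released BERT checkpoints
REQUIRED_FILES = ("bert_config.json", "vocab.txt")
CHECKPOINT_PREFIX = "bert_model.ckpt"


def missing_model_files(model_dir):
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    if not root.is_dir():
        # The directory itself is missing, so report only that
        return [str(root)]
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # Checkpoint weights are split over several files sharing one prefix
    if not any(p.name.startswith(CHECKPOINT_PREFIX) for p in root.iterdir()):
        missing.append(CHECKPOINT_PREFIX + ".*")
    return missing


if __name__ == "__main__":
    path = os.environ.get("PATH_MODEL")
    if path is None:
        print("PATH_MODEL is not set")
    else:
        problems = missing_model_files(path)
        print("model directory looks complete" if not problems else f"missing: {problems}")
```

Running it in the same shell as the `export` line checks the directory that docker-compose will mount.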
2. Create the ES index mapping
Run in Kibana:
PUT /quotes
{
  "settings": {
    "index": {"number_of_shards": "1", "number_of_replicas": "0"}
  },
  "mappings": {
    "properties": {
      "quote": {"type": "text"},
      "vector": {"type": "dense_vector", "dims": 768}
    }
  }
}
This creates an index named quotes with two fields: quote (the text) and vector (its 768-dimensional BERT embedding).
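Because the mapping fixes vector at 768 dimensions, a document whose vector has a different length should be rejected at index time. A minimal pre-check, assuming a hypothetical helper named `validate_doc` (not from the original post), might look like this:

```python
import json

# Mirrors the index body sent with PUT /quotes
QUOTES_INDEX = {
    "settings": {"index": {"number_of_shards": "1", "number_of_replicas": "0"}},
    "mappings": {
        "properties": {
            "quote": {"type": "text"},
            "vector": {"type": "dense_vector", "dims": 768},
        }
    },
}


def validate_doc(doc, index_body=QUOTES_INDEX):
    """Raise ValueError if doc does not fit the quote/vector mapping."""
    dims = index_body["mappings"]["properties"]["vector"]["dims"]
    if not isinstance(doc.get("quote"), str):
        raise ValueError("quote must be a string")
    vector = doc.get("vector")
    if not isinstance(vector, list) or len(vector) != dims:
        raise ValueError(f"vector must be a list of {dims} numbers")
    if not all(isinstance(x, (int, float)) for x in vector):
        raise ValueError("vector must contain only numbers")
    return doc


if __name__ == "__main__":
    print(json.dumps(QUOTES_INDEX, indent=2))
    validate_doc({"quote": "can i help?", "vector": [0.0] * 768})
```

Checking on the client side gives a clearer error than a failed bulk request, at the cost of duplicating the mapping in code.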
3. Process the data with BERT and import it into the index
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk
import numpy as np
from bert_serving.client import BertClient

bc = BertClient()
es = Elasticsearch([{'host': 'localhost', 'port': 9200}])

def getQuotes():
    # One quote per line; the file name after Downloads/ is missing in the source
    with open('/Users/linxier/Downloads/', 'r') as f:
        for line in f:
            quote = line.strip().lower()
            print(quote)
            # Skip blank lines and anything over 510 words (510 IS THE MAX)
            if 0 < len(quote.split()) <= 510:
                # encode() takes a list of strings and returns one vector per string
                vector = bc.encode([quote])[0].tolist()
                yield {
                    "quote": quote,
                    "vector": vector
                }

bulk(client=es, actions=getQuotes(), index="quotes", chunk_size=1000, request_timeout=120)

4. Run the similarity search
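The query in this step scores each document with `cosineSimilarity(params.inQuiry_vector, doc['vector']) + 1.0`. Cosine similarity lies in [-1, 1], and script_score does not allow negative scores, so adding 1.0 shifts the range to [0, 2]. As a rough pure-Python sketch of the same arithmetic (the function names `cosine_similarity` and `es_style_score` are illustrative, not an Elasticsearch API):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def es_style_score(query_vector, doc_vector):
    """Shift cosine similarity from [-1, 1] to [0, 2], as the script does."""
    return cosine_similarity(query_vector, doc_vector) + 1.0


if __name__ == "__main__":
    print(es_style_score([1.0, 0.0], [1.0, 0.0]))   # same direction
    print(es_style_score([1.0, 0.0], [0.0, 1.0]))   # orthogonal
    print(es_style_score([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

Since the shift is a constant, it changes the reported scores but not the order of the hits.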
from bert_serving.client import BertClient
from elasticsearch import Elasticsearch

bc = BertClient()
client = Elasticsearch([{'host': 'localhost', 'port': 9200}])

def findRelevantHits(inQuiry):
    # Encode the query text; encode() expects a list of strings
    inQuiry_vector = bc.encode([inQuiry])[0].tolist()
    queries = {
        'bert': {
            "script_score": {
                "query": {
                    "match_all": {}
                },
                "script": {
                    "source": "cosineSimilarity(params.inQuiry_vector, doc['vector']) + 1.0",
                    "params": {
                        "inQuiry_vector": inQuiry_vector
                    }
                }
            }
        },
        'mlt': {
            "more_like_this": {
                "fields": ["quote"],
                "like": inQuiry,
                "min_term_freq": 1,
                "max_query_terms": 50,
                "min_doc_freq": 1
            }
        }
    }
    result = {'bert': [], 'mlt': []}
    for metric, query in queries.items():
        body = {"query": query, "size": 10, "_source": ["quote"]}
        response = client.search(index='quotes', body=body)
        # Store the hits under the method that produced them
        result[metric] = [a['_source']['quote'] for a in response['hits']['hits']]
    return result

inQuiry = "could i help you"
result = findRelevantHits(inQuiry.strip().lower())
print(result)
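cosineSimilarity is used as the similarity measure, and the BERT-based ranking is compared against Elasticsearch's built-in mlt (more_like_this) query. Result: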
{'bert': ['can i help?',
          'could you take a picture for me?',
          'the telephone is ringing, would you answer it, please?',
          'do you have some change?',
          'i hope you have a good time on your trip.',
          "i'd like a bowl of tamoto soup, please.",
          'what would you like to eat?',
          "intelligent life on other planets? i'm not even sure there is on earth!",
          'if we can only encounter each other rather than stay with each other,then i wish we had never encountered.',
          'i would like weeping with the smile rather than repenting with the cry,when my heart is broken ,is it needed to fix?'],
 'mlt': ['can i help?',
         'could you take a picture for me?',
         'i hope you have a good time on your trip.',
         "you will have it if it belongs to you,whereas you don't kvetch for it if it doesn't appear in your life.",
         'do you have some change?',
         'what would you like to eat?',
         'if we can only encounter each other rather than stay with each other,then i wish we had never encountered.',
         'i would like weeping with the smile rather than repenting with the cry,when my heart is broken ,is it needed to fix?',
         'the telephone is ringing, would you answer it, please?',
         'you have your choice of three flavors of ice cream.']}
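One rough way to see how much the two rankings agree is to measure their overlap. The sketch below takes the top five hits of each method from the output above and computes the shared hits and the Jaccard index. The helper `overlap` is hypothetical, and overlap alone says nothing about which ranking is more accurate.

```python
def overlap(first, second):
    """Return (hits shared by both lists, in first's order; Jaccard index)."""
    second_set = set(second)
    shared = [hit for hit in first if hit in second_set]
    union = set(first) | second_set
    jaccard = len(set(shared)) / len(union) if union else 0.0
    return shared, jaccard


# Top five hits per method, copied from the result printed above
bert_top5 = [
    'can i help?',
    'could you take a picture for me?',
    'the telephone is ringing, would you answer it, please?',
    'do you have some change?',
    'i hope you have a good time on your trip.',
]
mlt_top5 = [
    'can i help?',
    'could you take a picture for me?',
    'i hope you have a good time on your trip.',
    "you will have it if it belongs to you,whereas you don't kvetch for it if it doesn't appear in your life.",
    'do you have some change?',
]

if __name__ == "__main__":
    shared, jaccard = overlap(bert_top5, mlt_top5)
    print(len(shared), "shared hits; Jaccard index:", round(jaccard, 3))
```

Four of the five top hits are shared here, so for this query the two methods differ mainly in ordering and in one result each.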
