What is everyone using as a vector database these days, and which one looks like the future trend?

2023-05-06 12:42:25 +08:00
 baoyinhe
1480 views
Node: Q&A
4 replies
vivisidea
2023-05-06 14:38:43 +08:00
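I happened to be reading an article on another site about using OpenAI embeddings for retrieval, and it covers this.

Vector database options include:

Pinecone, a fully managed vector database
Weaviate, an open-source vector search engine
Redis, as a vector database
Qdrant, a vector search engine
Milvus, a vector database for scalable similarity search
Chroma, an open-source embedding store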

source
https://zhuanlan.zhihu.com/p/619233637
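
For anyone new to the topic, here is a rough sketch of the core operation all of the stores listed above provide: given a query embedding, return the stored items whose embeddings are most similar. This is a brute-force scan in plain Python, not how any of those products is implemented (they typically use approximate indexes to avoid scanning everything). The names `cosine_similarity`, `top_k`, `store`, and the 3-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the ids of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy "embeddings" (hypothetical); real ones have hundreds or thousands of dimensions.
store = {
    "doc_redis": [0.9, 0.1, 0.0],
    "doc_milvus": [0.8, 0.2, 0.1],
    "doc_cooking": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

print(top_k(query, store, k=2))  # ['doc_redis', 'doc_milvus']
```

The scan costs one similarity computation per stored vector, which is why dedicated vector stores matter once the collection gets large.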
baoyinhe
2023-05-06 16:46:38 +08:00
@vivisidea #1 That's useful.
lchynn
2023-05-06 17:00:16 +08:00
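I'm curious: Redis is an in-memory database, so if you use it to store vectors, what happens when memory eventually runs out and it OOMs? Can a production workload tolerate losing its vectorized data? In the GPT embedding example, wouldn't that mean the business data you run similarity searches over is simply gone?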
vivisidea
2023-05-06 17:17:03 +08:00
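@lchynn I haven't looked closely at how Redis implements vector search, but in key-value use cases you can generally control the key eviction policy. For example, the noeviction policy rejects new writes once the limit is hit, while an LRU policy evicts the least recently used keys.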

https://redis.io/docs/reference/eviction/

The exact behavior Redis follows when the maxmemory limit is reached is configured using the maxmemory-policy configuration directive.

The following policies are available:

noeviction: New values aren’t saved when memory limit is reached. When a database uses replication, this applies to the primary database
allkeys-lru: Keeps most recently used keys; removes least recently used (LRU) keys
allkeys-lfu: Keeps frequently used keys; removes least frequently used (LFU) keys
volatile-lru: Removes least recently used keys with the expire field set to true.
volatile-lfu: Removes least frequently used keys with the expire field set to true.
allkeys-random: Randomly removes keys to make space for the new data added.
volatile-random: Randomly removes keys with expire field set to true.
volatile-ttl: Removes keys with expire field set to true and the shortest remaining time-to-live (TTL) value.
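
To make the difference between the two policies mentioned above concrete, here is a toy simulation in Python. It is only a sketch under loud assumptions: `ToyStore` is a hypothetical class, it caps the number of keys rather than bytes of memory, and it uses exact LRU ordering, whereas real Redis enforces `maxmemory` in bytes and uses an approximated LRU. It does not talk to a Redis server.

```python
from collections import OrderedDict


class ToyStore:
    """Toy key-value store capped by key count (real Redis caps by memory size)."""

    def __init__(self, max_keys, policy="noeviction"):
        if policy not in ("noeviction", "allkeys-lru"):
            raise ValueError(f"unsupported policy: {policy}")
        self.max_keys = max_keys
        self.policy = policy
        self.data = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # a read counts as a use
        return self.data[key]

    def set(self, key, value):
        """Return True if the write was stored, False if it was rejected."""
        if key in self.data:
            self.data[key] = value
            self.data.move_to_end(key)
            return True
        if len(self.data) >= self.max_keys:
            if self.policy == "noeviction":
                return False  # reject the new write, keep existing data
            self.data.popitem(last=False)  # evict the least recently used key
        self.data[key] = value
        return True


# noeviction: existing vectors survive, the new write is refused.
strict = ToyStore(max_keys=2, policy="noeviction")
strict.set("vec:1", [0.1])
strict.set("vec:2", [0.2])
rejected_write = strict.set("vec:3", [0.3])
print(rejected_write, list(strict.data))  # False ['vec:1', 'vec:2']

# allkeys-lru: the write succeeds, but the least recently used vector is lost.
lru = ToyStore(max_keys=2, policy="allkeys-lru")
lru.set("vec:1", [0.1])
lru.set("vec:2", [0.2])
lru.get("vec:1")  # touch vec:1 so vec:2 becomes the eviction candidate
lru.set("vec:3", [0.3])
print(list(lru.data))  # ['vec:1', 'vec:3']
```

So with noeviction nothing already stored is dropped, but the application has to handle failed writes; with an LRU policy writes keep succeeding at the cost of silently losing older entries. On a real server the policy is chosen with the maxmemory-policy directive quoted above.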

https://www.v2ex.com/t/937821
