
Error when syncing data between Elasticsearch and MongoDB

  necpowman · 2016-09-11 15:11:11 +08:00 · 3147 views

    The contents of mongo-connector.log are:

    OperationFailed: TransportError(404, u'{"_index":"nicovideo","_type":"posts","_id":"27131240","found":false}')
    2016-09-10 00:01:36,787 [ERROR] mongo_connector.oplog_manager:324 - Unable to process oplog document {u'h': -4790094769725122799L, u'ts': Timestamp(1473480246, 2), u'o': {u'$set': {u'view_count': 22, u'__v': 5, u'urls': [{u'cookie': u'sm29618346:1473480091:1473480091:0747757f18bebe4a:1', u'vip': False, u'type': u'mp4', u'get_at': datetime.datetime(2016, 9, 10, 4, 4, 6, 945000), u'value': u'http://smile-fnl11.nicovideo.jp/smile?m=29618346.70461'}]}}, u't': 2L, u'v': 2, u'ns': u'nicovideo.posts', u'o2': {u'_id': 29618346}, u'op': u'u'}
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/mongo_connector/oplog_manager.py", line 310, in run
        ns, timestamp)
      File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 43, in wrapped
        reraise(new_type, exc_value, exc_tb)
      File "/usr/local/lib/python2.7/dist-packages/mongo_connector/util.py", line 32, in wrapped
        return f(*args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/mongo_connector/doc_managers/elastic2_doc_manager.py", line 161, in update
        id=u(document_id))
      File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 69, in _wrapped
        return func(*args, params=params, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 330, in get
        doc_type, id), params=params)
      File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 307, in perform_request
        status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
      File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_urllib3.py", line 93, in perform_request
        self._raise_error(response.status, raw_data)
      File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/base.py", line 105, in _raise_error
        raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
    OperationFailed: TransportError(404, u'{"_index":"nicovideo","_type":"posts","_id":"29618346","found":false}')
    

    It looks as if the data that the oplog entry refers to cannot be found.
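
    To make that reading of the traceback concrete: the `update` call in `elastic2_doc_manager.py` goes through `get`, and `get` is what comes back with the 404. Below is a minimal Python sketch of that failure mode, using an in-memory dict as a stand-in for the index. The names `NotFound`, `apply_set`, `apply_set_or_create` and `fetch_full_doc` are hypothetical and are not part of mongo-connector or the Elasticsearch client; the fallback is only one possible way to handle it, not a confirmed fix.

```python
class NotFound(Exception):
    """Stand-in for the 404 TransportError seen in the log."""


def apply_set(store, doc_id, update_spec):
    """Apply a MongoDB-style {'$set': {...}} to a document already in `store`.

    Raises NotFound when the target document is missing, which mirrors
    the situation in the log: the oplog entry is an update ('op': 'u'),
    but the target index has no document with that _id.
    """
    if doc_id not in store:
        raise NotFound(doc_id)
    store[doc_id].update(update_spec.get("$set", {}))
    return store[doc_id]


def apply_set_or_create(store, doc_id, update_spec, fetch_full_doc):
    """Try the partial update; on NotFound, copy the full source document.

    `fetch_full_doc` is an assumed callable that returns the current
    document from the source database, or None if it no longer exists.
    """
    try:
        return apply_set(store, doc_id, update_spec)
    except NotFound:
        full = fetch_full_doc(doc_id)
        if full is None:
            return None  # gone at the source as well, nothing to sync
        store[doc_id] = dict(full)
        return store[doc_id]
```

    Re-reading the whole document from the source avoids depending on the target already holding an earlier copy, at the cost of one extra read per missing document.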

    I opened an issue on GitHub but nobody has replied, so I'm asking fellow V2EX users for help.

    8 replies    2016-09-14 14:48:29 +08:00
        1
    Nexvar  
       2016-09-11 15:22:02 +08:00 via Android
    Bumping this for you.
        2
    yeasy  
       2016-09-11 22:15:57 +08:00
    The connector approach isn't very stable. I'd suggest implementing the sync yourself.
        3
    Nexvar  
       2016-09-12 01:00:35 +08:00 via Android
    @yeasy Any ideas on how to go about implementing it yourself?
        4
    dangyuluo  
       2016-09-12 08:28:10 +08:00
    I wrote a program that reads the data from the database into ES every day at 5 a.m.
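        5
    yybeta  
       2016-09-12 08:59:46 +08:00 via Android
    I also tried the river plugin, and it didn't work very well. If your mongo documents have a timestamp field, you can write a script that periodically checks for changes and syncs them to es incrementally. That's how I did it.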
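
    A minimal Python sketch of that timestamp-based incremental sync, under some assumptions: the field name `updated_at`, the function names, and the idea of keeping a `last_sync` checkpoint are all illustrative and not taken from yybeta's script. To stay self-contained it works on plain dicts and only builds the newline-delimited body for the Elasticsearch `_bulk` endpoint; reading from MongoDB and sending the body over HTTP are left out.

```python
import json
from datetime import datetime


def changed_since(docs, last_sync, ts_field="updated_at"):
    """Return documents whose timestamp is newer than last_sync, oldest first."""
    newer = [d for d in docs if d[ts_field] > last_sync]
    return sorted(newer, key=lambda d: d[ts_field])


def to_bulk_body(docs, index, doc_type=None, ts_field="updated_at"):
    """Build a newline-delimited _bulk body: one action line, then one source line."""
    lines = []
    for doc in docs:
        meta = {"_index": index, "_id": str(doc["_id"])}
        if doc_type:
            # The log shows "_type":"posts"; newer Elasticsearch releases
            # no longer use types, so this stays optional.
            meta["_type"] = doc_type
        source = {k: v for k, v in doc.items() if k != "_id"}
        source[ts_field] = source[ts_field].isoformat()  # datetime is not JSON-serializable
        lines.append(json.dumps({"index": meta}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n" if lines else ""


def sync_once(docs, last_sync, index, doc_type=None, ts_field="updated_at"):
    """One polling pass: returns (bulk_body, new_checkpoint)."""
    batch = changed_since(docs, last_sync, ts_field)
    body = to_bulk_body(batch, index, doc_type, ts_field)
    checkpoint = batch[-1][ts_field] if batch else last_sync
    return body, checkpoint
```

    The sketch uses the `index` bulk action, which creates or replaces the whole document, so it does not require the document to exist in es beforehand. Deletions in mongo are not detected by a timestamp scan and would need separate handling.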
        6
    necpowman  
    OP
       2016-09-12 14:29:49 +08:00
    @dangyuluo @yybeta
    Could either of you share some code I could use as a reference? I stayed up all night writing an import script, and when I tested it today it didn't work...
        7
    yybeta  
       2016-09-12 17:49:27 +08:00
    @necpowman Sent, and I've @-mentioned you.
        8
    yeasy  
       2016-09-14 14:48:29 +08:00
    The approaches are all more or less the same.
    Export from mongo as JSON, then import into es.
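
    As a rough Python sketch of the export-then-import idea, assuming the export is a file with one JSON document per line (the default output shape of `mongoexport`): the helper below turns those lines into chunked bodies for the Elasticsearch `_bulk` endpoint. The function names and the chunk size are illustrative; posting each chunk to es is left out, and extended-JSON wrappers other than `$oid` (for example `$date`) are passed through unchanged.

```python
import json


def id_to_str(raw_id):
    """Accept a plain _id (e.g. 29618346) or an extended-JSON {"$oid": "..."}."""
    if isinstance(raw_id, dict) and "$oid" in raw_id:
        return raw_id["$oid"]
    return str(raw_id)


def bulk_chunks(json_lines, index, chunk_size=500):
    """Yield _bulk request bodies, each covering at most chunk_size documents."""
    buf = []
    count = 0
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        doc = json.loads(line)
        doc_id = id_to_str(doc.pop("_id"))  # _id goes in the action line, not the source
        buf.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        buf.append(json.dumps(doc))
        count += 1
        if count == chunk_size:
            yield "\n".join(buf) + "\n"
            buf, count = [], 0
    if buf:
        yield "\n".join(buf) + "\n"
```

    With a file on disk this could be driven as `for body in bulk_chunks(open("posts.json"), "nicovideo"): ...`, where `posts.json` is a hypothetical export file name. Chunking keeps each request to a bounded size rather than sending the whole collection in one call.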