This repository was archived by the owner on Dec 28, 2023. It is now read-only.

crawlera_fetch middleware doesn't work with @inline_requests #22

@GeorgeA92

Description

The crawlera_fetch middleware doesn't work in spider callbacks decorated with scrapy-inline-requests' @inline_requests decorator.
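
A minimal reproduction sketch (spider name, URLs and yielded fields are placeholders; it assumes the crawlera-fetch downloader middleware is enabled in the project's DOWNLOADER_MIDDLEWARES):

# Hypothetical reproduction sketch, not the actual project spider.
import scrapy
from inline_requests import inline_requests


class TestSpider(scrapy.Spider):
    name = "test_spider"
    start_urls = ["https://website.example/product"]

    @inline_requests
    def parse(self, response):
        # inline_requests swaps this request's callback for a functools.partial
        # around RequestGenerator._handleSuccess, which the middleware later
        # fails to serialize with request_to_dict() (see the traceback below).
        stock_response = yield scrapy.Request("https://website.example/stock")
        yield {"url": response.url, "stock_status": stock_response.status}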

Log output
2021-12-03 20:03:07 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: bvbot)
2021-12-03 20:03:08 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.8.11 (default, Aug  6 2021, 09:57:55) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 21.0.0 (OpenSSL 1.1.1l  24 Aug 2021), cryptography 3.4.7, Platform Windows-10-10.0.19043-SP0
2021-12-03 20:03:08 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2021-12-03 20:03:10 [scrapy.crawler] INFO: Overridden settings:
...
2021-12-03 20:03:19 [scrapy.core.engine] INFO: Spider opened
2021-12-03 20:03:20 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-12-03 20:03:20 [crawlera-fetch-middleware] INFO: Using Crawlera Fetch API at http://cm-58.scrapinghub.com:8010/fetch/v2/ with apikey *****
2021-12-03 20:03:20 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
....
2021-12-03 20:04:13 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://website... > (referer: None latency: 0.00)
2021-12-03 20:04:14 [scrapy.core.scraper] ERROR: Error downloading <GET https://website... >
Traceback (most recent call last):
  File "C:\Users\...\envs\...\lib\site-packages\twisted\internet\defer.py", line 1661, in _inlineCallbacks
    result = current_context.run(gen.send, result)
  File "C:\Users\...\envs\...\lib\site-packages\scrapy\core\downloader\middleware.py", line 36, in process_request
    response = yield deferred_from_coro(method(request=request, spider=spider))
  File "C:\Users\...\envs\...\lib\site-packages\crawlera_fetch\middleware.py", line 142, in process_request
    "original_request": request_to_dict(request, spider=spider),
  File "C:\Users\...\envs\...\lib\site-packages\scrapy\utils\reqser.py", line 19, in request_to_dict
    cb = _find_method(spider, cb)
  File "C:\Users\...\envs\...\lib\site-packages\scrapy\utils\reqser.py", line 87, in _find_method
    raise ValueError(f"Function {func} is not an instance method in: {obj}")
ValueError: Function functools.partial(<bound method RequestGenerator._handleSuccess of <inline_requests.generator.RequestGenerator object at 0x000001525EFCF4F0>>, generator=<generator object TestSpider.parse_product at 0x000001525EF63CF0>) is not an instance method in: <TestSpider 'test_spider' at 0x1525d8b1490>
2021-12-03 20:04:14 [scrapy.core.scraper] ERROR: Spider error processing <GET https://website... > (referer: https://website...  )
Traceback (most recent call last):
  File "C:\Users\...\envs\...\lib\site-packages\twisted\internet\defer.py", line 858, in _runCallbacks
    current.result = callback(  # type: ignore[misc]
  File "C:\Users\...\envs\...\lib\site-packages\inline_requests\generator.py", line 107, in _handleFailure
    ret = failure.throwExceptionIntoGenerator(generator)
  File "C:\Users\...\proj\spiders\test.py", line 280, in parse
    stock_response = yield Request(
  File "C:\Users\...\envs\...\lib\site-packages\twisted\internet\defer.py", line 1661, in _inlineCallbacks
    result = current_context.run(gen.send, result)
  File "C:\Users\...\envs\...\lib\site-packages\scrapy\core\downloader\middleware.py", line 36, in process_request
    response = yield deferred_from_coro(method(request=request, spider=spider))
  File "C:\Users\...\envs\...\lib\site-packages\crawlera_fetch\middleware.py", line 142, in process_request
    "original_request": request_to_dict(request, spider=spider),
  File "C:\Users\...\envs\...\lib\site-packages\scrapy\utils\reqser.py", line 19, in request_to_dict
    cb = _find_method(spider, cb)
  File "C:\Users\...\envs\...\lib\site-packages\scrapy\utils\reqser.py", line 87, in _find_method
    raise ValueError(f"Function {func} is not an instance method in: {obj}")
ValueError: Function functools.partial(<bound method RequestGenerator._handleSuccess of <inline_requests.generator.RequestGenerator object at 0x000001525EFCF4F0>>, generator=<generator object TestSpider.parse at 0x000001525EF63CF0>) is not an instance method in: <TestSpider 'test_spider' at 0x1525d8b1490>
2021-12-03 20:04:14 [scrapy.core.engine] INFO: Closing spider (finished)
2021-12-03 20:04:14 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
...
2021-12-03 20:04:14 [scrapy.core.engine] INFO: Spider closed (finished)
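
The root cause is visible at the bottom of each traceback: scrapy.utils.reqser.request_to_dict() (which the middleware calls to build "original_request") expects a callable callback to be a bound method of the spider, while @inline_requests replaces the callback with a functools.partial around RequestGenerator._handleSuccess. A standalone sketch of the same failure outside the middleware (hypothetical URL, assumes the Scrapy 2.5 API shown in the log above):

# Demonstration sketch of the serialization failure itself; not part of the
# crawlera-fetch middleware, and the URL is a placeholder.
from functools import partial

import scrapy
from scrapy.utils.reqser import request_to_dict


class TestSpider(scrapy.Spider):
    name = "test_spider"

    def parse(self, response):
        pass


spider = TestSpider()

# A callback that is a bound spider method serializes without issues:
request_to_dict(
    scrapy.Request("https://website.example", callback=spider.parse),
    spider=spider,
)

# A functools.partial callback, like the one installed by @inline_requests,
# raises the ValueError seen in the log:
request_to_dict(
    scrapy.Request("https://website.example", callback=partial(spider.parse)),
    spider=spider,
)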
