This repository was archived by the owner on Dec 28, 2023. It is now read-only.
crawlera_fetch middleware doesn't work with @inline_requests #22
Description
The crawlera_fetch middleware doesn't work on spider callbacks decorated with scrapy-inline-requests' @inline_requests.
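For context, a minimal sketch of the kind of spider that triggers this (spider name and URLs are placeholders, not the actual project code), assuming the crawlera-fetch downloader middleware is enabled in the project's DOWNLOADER_MIDDLEWARES:

```python
import scrapy
from inline_requests import inline_requests


class TestSpider(scrapy.Spider):
    # Placeholder spider; the crawlera-fetch downloader middleware is assumed
    # to be enabled in the project settings.
    name = "test_spider"
    start_urls = ["https://example.com"]  # placeholder URL

    @inline_requests
    def parse(self, response):
        # The decorator wraps this generator so that yielded Requests get a
        # functools.partial callback pointing at inline_requests' internal
        # RequestGenerator instead of a bound spider method.
        stock_response = yield scrapy.Request("https://example.com/stock")
        yield {"status": stock_response.status}
```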
Log output
2021-12-03 20:03:07 [scrapy.utils.log] INFO: Scrapy 2.5.0 started (bot: bvbot)
2021-12-03 20:03:08 [scrapy.utils.log] INFO: Versions: lxml 4.6.3.0, libxml2 2.9.10, cssselect 1.1.0, parsel 1.6.0, w3lib 1.22.0, Twisted 21.7.0, Python 3.8.11 (default, Aug 6 2021, 09:57:55) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 21.0.0 (OpenSSL 1.1.1l 24 Aug 2021), cryptography 3.4.7, Platform Windows-10-10.0.19043-SP0
2021-12-03 20:03:08 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2021-12-03 20:03:10 [scrapy.crawler] INFO: Overridden settings:
...
2021-12-03 20:03:19 [scrapy.core.engine] INFO: Spider opened
2021-12-03 20:03:20 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2021-12-03 20:03:20 [crawlera-fetch-middleware] INFO: Using Crawlera Fetch API at http://cm-58.scrapinghub.com:8010/fetch/v2/ with apikey *****
2021-12-03 20:03:20 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
....
2021-12-03 20:04:13 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://website... > (referer: None latency: 0.00)
2021-12-03 20:04:14 [scrapy.core.scraper] ERROR: Error downloading <GET https://website... >
Traceback (most recent call last):
File "C:\Users\...\envs\...\lib\site-packages\twisted\internet\defer.py", line 1661, in _inlineCallbacks
result = current_context.run(gen.send, result)
File "C:\Users\...\envs\...\lib\site-packages\scrapy\core\downloader\middleware.py", line 36, in process_request
response = yield deferred_from_coro(method(request=request, spider=spider))
File "C:\Users\...\envs\...\lib\site-packages\crawlera_fetch\middleware.py", line 142, in process_request
"original_request": request_to_dict(request, spider=spider),
File "C:\Users\...\envs\...\lib\site-packages\scrapy\utils\reqser.py", line 19, in request_to_dict
cb = _find_method(spider, cb)
File "C:\Users\...\envs\...\lib\site-packages\scrapy\utils\reqser.py", line 87, in _find_method
raise ValueError(f"Function {func} is not an instance method in: {obj}")
ValueError: Function functools.partial(<bound method RequestGenerator._handleSuccess of <inline_requests.generator.RequestGenerator object at 0x000001525EFCF4F0>>, generator=<generator object TestSpider.parse_product at 0x000001525EF63CF0>) is not an instance method in: <TestSpider 'test_spider' at 0x1525d8b1490>
2021-12-03 20:04:14 [scrapy.core.scraper] ERROR: Spider error processing <GET https://website... > (referer: https://website... )
Traceback (most recent call last):
File "C:\Users\...\envs\...\lib\site-packages\twisted\internet\defer.py", line 858, in _runCallbacks
current.result = callback( # type: ignore[misc]
File "C:\Users\...\envs\...\lib\site-packages\inline_requests\generator.py", line 107, in _handleFailure
ret = failure.throwExceptionIntoGenerator(generator)
File "C:\Users\...\proj\spiders\test.py", line 280, in parse
stock_response = yield Request(
File "C:\Users\...\envs\...\lib\site-packages\twisted\internet\defer.py", line 1661, in _inlineCallbacks
result = current_context.run(gen.send, result)
File "C:\Users\...\envs\...\lib\site-packages\scrapy\core\downloader\middleware.py", line 36, in process_request
response = yield deferred_from_coro(method(request=request, spider=spider))
File "C:\Users\...\envs\...\lib\site-packages\crawlera_fetch\middleware.py", line 142, in process_request
"original_request": request_to_dict(request, spider=spider),
File "C:\Users\...\envs\...\lib\site-packages\scrapy\utils\reqser.py", line 19, in request_to_dict
cb = _find_method(spider, cb)
File "C:\Users\...\envs\...\lib\site-packages\scrapy\utils\reqser.py", line 87, in _find_method
raise ValueError(f"Function {func} is not an instance method in: {obj}")
ValueError: Function functools.partial(<bound method RequestGenerator._handleSuccess of <inline_requests.generator.RequestGenerator object at 0x000001525EFCF4F0>>, generator=<generator object TestSpider.parse at 0x000001525EF63CF0>) is not an instance method in: <TestSpider 'test_spider' at 0x1525d8b1490>
2021-12-03 20:04:14 [scrapy.core.engine] INFO: Closing spider (finished)
2021-12-03 20:04:14 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
...
2021-12-03 20:04:14 [scrapy.core.engine] INFO: Spider closed (finished)
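The failure in the traceback can be reproduced outside the middleware: request_to_dict() (which crawlera_fetch calls in middleware.py, line 142) only accepts callbacks that are bound methods of the spider, while @inline_requests swaps the callback for a functools.partial. A minimal sketch, using a placeholder spider and URL rather than anything from the project:

```python
import functools

import scrapy
from scrapy.utils.reqser import request_to_dict  # location per the Scrapy 2.5 traceback


class TestSpider(scrapy.Spider):
    name = "test_spider"

    def parse(self, response):
        pass


spider = TestSpider()

# A plain bound-method callback serializes fine:
ok = scrapy.Request("https://example.com", callback=spider.parse)
print(request_to_dict(ok, spider=spider)["callback"])  # -> "parse"

# A functools.partial callback, like the one @inline_requests installs,
# is rejected by scrapy.utils.reqser._find_method():
broken = scrapy.Request("https://example.com", callback=functools.partial(spider.parse))
request_to_dict(broken, spider=spider)  # ValueError: ... is not an instance method
```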