[BUG] Tieba search reports that the IP is blocked, but I can open Tieba pages normally in my local browser #737

@10xjzheng

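🐛 Problem description

Searching Tieba reports that my IP has been blocked, but I can actually open Tieba pages normally in a browser on the same machine.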

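📝 Steps to reproduce

  1. Run the command: uv run main.py --platform tieba --lt qrcode --type search --save_data_option db
  2. Console output (the Chinese error message in the log reads: "Maximum retry count reached, the IP has been blocked, please try switching to a new IP proxy"):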
    2025-10-08 17:19:58 MediaCrawler INFO (core.py:100) - [BaiduTieBaCrawler.search] Begin search baidu tieba keywords
    2025-10-08 17:19:58 MediaCrawler INFO (core.py:109) - [BaiduTieBaCrawler.search] Current search keyword: 编程副业
    2025-10-08 17:19:58 MediaCrawler INFO (core.py:121) - [BaiduTieBaCrawler.search] search tieba keyword: 编程副业, page: 1
    2025-10-08 17:20:31 MediaCrawler ERROR (client.py:106) - [BaiduTieBaClient.get] 达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: RetryError[<Future at 0x150251590 state=finished raised ConnectTimeout>]
    2025-10-08 17:20:31 MediaCrawler ERROR (core.py:151) - [BaiduTieBaCrawler.search] Search keywords error, current page: 1, current keyword: 编程副业, err: [BaiduTieBaClient.get] 达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: RetryError[<Future at 0x150251590 state=finished raised ConnectTimeout>]
    2025-10-08 17:20:31 MediaCrawler INFO (core.py:109) - [BaiduTieBaCrawler.search] Current search keyword: 编程兼职
    2025-10-08 17:20:31 MediaCrawler INFO (core.py:121) - [BaiduTieBaCrawler.search] search tieba keyword: 编程兼职, page: 1
    2025-10-08 17:21:03 MediaCrawler ERROR (client.py:106) - [BaiduTieBaClient.get] 达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: RetryError[<Future at 0x1502ad910 state=finished raised ConnectTimeout>]
    2025-10-08 17:21:03 MediaCrawler ERROR (core.py:151) - [BaiduTieBaCrawler.search] Search keywords error, current page: 1, current keyword: 编程兼职, err: [BaiduTieBaClient.get] 达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: RetryError[<Future at 0x1502ad910 state=finished raised ConnectTimeout>]
    2025-10-08 17:21:03 MediaCrawler INFO (core.py:92) - [BaiduTieBaCrawler.start] Tieba Crawler finished ...
    2025-10-08 17:21:03 asyncio ERROR (base_events.py:1785) - Task was destroyed but it is pending!
    task: <Task cancelling name='Task-1' coro=<ExpiringLocalCache._start_clear_cron() running at /Users/mrzheng/Applications/github-projects/MediaCrawler/cache/local_cache.py:119> wait_for=<Future cancelled>>
    2025-10-08 17:21:03 asyncio ERROR (base_events.py:1785) - Task was destroyed but it is pending!
    task: <Task cancelling name='Task-2' coro=<ExpiringLocalCache._start_clear_cron() running at /Users/mrzheng/Applications/github-projects/MediaCrawler/cache/local_cache.py:119> wait_for=<Future cancelled>>
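One detail in the console output above is worth pulling out: the message says the IP is blocked, but the exception wrapped inside RetryError[...] is a ConnectTimeout, which means the connection was never established rather than the server returning a block response. The sketch below extracts that wrapped exception name from log text. It uses only the standard library; the function name `underlying_exceptions` and the `SAMPLE` constant are hypothetical, and the regular expression is written against the exact format seen in this log, not against any documented format.

```python
import re
from collections import Counter

# Pattern for the wrapper seen in the log:
#   RetryError[<Future at 0x... state=finished raised SomeException>]
RAISED_RE = re.compile(
    r"RetryError\[<Future at 0x[0-9a-fA-F]+ state=finished raised (\w+)>\]"
)


def underlying_exceptions(log_text: str) -> Counter:
    """Count the exception class names wrapped by RetryError in a log."""
    return Counter(RAISED_RE.findall(log_text))


# Two of the error lines from the console output, copied verbatim.
SAMPLE = (
    "2025-10-08 17:20:31 MediaCrawler ERROR (client.py:106) - [BaiduTieBaClient.get] "
    "达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: "
    "RetryError[<Future at 0x150251590 state=finished raised ConnectTimeout>]\n"
    "2025-10-08 17:21:03 MediaCrawler ERROR (client.py:106) - [BaiduTieBaClient.get] "
    "达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: "
    "RetryError[<Future at 0x1502ad910 state=finished raised ConnectTimeout>]\n"
)

if __name__ == "__main__":
    print(underlying_exceptions(SAMPLE))
```

If every wrapped exception is a ConnectTimeout, as it is here, the "IP blocked" wording may simply be the client's generic message for any exhausted retry, so a network-level cause is at least as plausible as an actual block.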

Without being logged in, I can open Tieba and browse its pages normally on this machine:

Image

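💻 Environment

  • Operating system: macOS
  • Python version: 3.9.6
  • Using an IP proxy: No
  • Using VPN software to bypass the firewall: No
  • Target platform (Douyin/Xiaohongshu/Weibo, etc.): Baidu Tieba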
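Given this environment (no proxy, no VPN, browser works), one way to narrow the problem down is to check whether Python on the same machine can open a TCP connection to the Tieba host at all. The following is a minimal sketch using only the standard library. The helper name `can_connect` is hypothetical, and the host tieba.baidu.com with port 443 is an assumption about the endpoint the crawler talks to; the issue itself does not state the URL.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Assumed endpoint; replace with the host the crawler actually requests.
    print(can_connect("tieba.baidu.com", 443))
```

If this returns False while the browser loads the site, the difference is likely in how the two reach the network (for example, a system-level proxy setting the browser honors), not in Tieba blocking the IP. If it returns True, the timeout is more likely specific to the crawler's HTTP client configuration.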

📋 Error log

2025-10-08 17:19:58 MediaCrawler INFO (core.py:100) - [BaiduTieBaCrawler.search] Begin search baidu tieba keywords
2025-10-08 17:19:58 MediaCrawler INFO (core.py:109) - [BaiduTieBaCrawler.search] Current search keyword: 编程副业
2025-10-08 17:19:58 MediaCrawler INFO (core.py:121) - [BaiduTieBaCrawler.search] search tieba keyword: 编程副业, page: 1
2025-10-08 17:20:31 MediaCrawler ERROR (client.py:106) - [BaiduTieBaClient.get] 达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: RetryError[<Future at 0x150251590 state=finished raised ConnectTimeout>]
2025-10-08 17:20:31 MediaCrawler ERROR (core.py:151) - [BaiduTieBaCrawler.search] Search keywords error, current page: 1, current keyword: 编程副业, err: [BaiduTieBaClient.get] 达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: RetryError[<Future at 0x150251590 state=finished raised ConnectTimeout>]
2025-10-08 17:20:31 MediaCrawler INFO (core.py:109) - [BaiduTieBaCrawler.search] Current search keyword: 编程兼职
2025-10-08 17:20:31 MediaCrawler INFO (core.py:121) - [BaiduTieBaCrawler.search] search tieba keyword: 编程兼职, page: 1
2025-10-08 17:21:03 MediaCrawler ERROR (client.py:106) - [BaiduTieBaClient.get] 达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: RetryError[<Future at 0x1502ad910 state=finished raised ConnectTimeout>]
2025-10-08 17:21:03 MediaCrawler ERROR (core.py:151) - [BaiduTieBaCrawler.search] Search keywords error, current page: 1, current keyword: 编程兼职, err: [BaiduTieBaClient.get] 达到了最大重试次数,IP已经被Block,请尝试更换新的IP代理: RetryError[<Future at 0x1502ad910 state=finished raised ConnectTimeout>]
2025-10-08 17:21:03 MediaCrawler INFO (core.py:92) - [BaiduTieBaCrawler.start] Tieba Crawler finished ...
2025-10-08 17:21:03 asyncio ERROR (base_events.py:1785) - Task was destroyed but it is pending!
task: <Task cancelling name='Task-1' coro=<ExpiringLocalCache._start_clear_cron() running at /Users/mrzheng/Applications/github-projects/MediaCrawler/cache/local_cache.py:119> wait_for=<Future cancelled>>
2025-10-08 17:21:03 asyncio ERROR (base_events.py:1785) - Task was destroyed but it is pending!
task: <Task cancelling name='Task-2' coro=<ExpiringLocalCache._start_clear_cron() running at /Users/mrzheng/Applications/github-projects/MediaCrawler/cache/local_cache.py:119> wait_for=<Future cancelled>>
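The two "Task was destroyed but it is pending!" lines at the end of the log are a separate symptom from the timeout: asyncio prints this when a task is still pending at the moment it is garbage-collected, typically because the event loop shut down without the task being cancelled and awaited. The sketch below shows one common shutdown pattern for a periodic background task. The class `ExpiringCacheSketch` is a hypothetical stand-in; I have not checked how MediaCrawler's ExpiringLocalCache actually manages its cleanup task, so this is not a patch for that code.

```python
import asyncio
from typing import Optional


class ExpiringCacheSketch:
    """Hypothetical cache with a periodic cleanup task (illustration only)."""

    def __init__(self, interval: float = 0.01) -> None:
        self._interval = interval
        self._task: Optional[asyncio.Task] = None

    async def _clear_cron(self) -> None:
        # Loop forever; real code would evict expired entries on each tick.
        while True:
            await asyncio.sleep(self._interval)

    def start(self) -> None:
        # Must be called from inside a running event loop.
        self._task = asyncio.get_running_loop().create_task(self._clear_cron())

    async def close(self) -> None:
        # Cancel the background task and wait until it has really finished.
        if self._task is None:
            return
        self._task.cancel()
        try:
            await self._task
        except asyncio.CancelledError:
            pass
        self._task = None


async def main() -> bool:
    cache = ExpiringCacheSketch()
    cache.start()
    await asyncio.sleep(0.03)   # let the cleanup loop tick a few times
    task = cache._task
    await cache.close()         # explicit shutdown before the loop exits
    return task.cancelled()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The key step is awaiting the task after calling cancel(), so that the cancellation is fully processed before the event loop closes.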

📷 Error screenshot

Image

Labels: bug