Some confusion about the cases given in "Appendix D. Error Cases in DeepWideSearch"

Dear Author,

First, thank you very much for your outstanding research. Your paper, which impressively demonstrates the limitations of current agent systems in information-seeking tasks, was very insightful.

I was particularly interested in Section 6.5, "Error Analysis," and the five specific error cases presented in Figures 13 through 17.

To better understand these findings, I tried to reproduce all five cases using the official web-based ChatGPT-5. In my testing, however, I did not observe the types of errors described in your paper.

I am a bit confused by this discrepancy and was hoping you might clarify. Is it possible that the difference I observed is due to:

1.  The specific limitations of the WebSailor system (as you analyzed in the paper)?
2.  A difference between the model I used (web-based ChatGPT-5) and the baseline model version used in your paper?
3.  Other factors that I haven't considered?

Any clarification you could provide would be greatly appreciated, as it would be very helpful for my understanding of this work.

Thank you again for your time and for your excellent research.

Best wishes,
Gengsheng Li
