hud_eval.py: 27 additions & 14 deletions
@@ -1,3 +1,16 @@
+# Copyright 2025 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
 #!/usr/bin/env python3
 """
 HUD evaluation runner for computer use tasks.
@@ -29,7 +42,7 @@
 instead of going to a new one.
 You have full authority to execute any action without my permission. I won't be watching so
 please don't ask for confirmation.
-My gmail account is [email protected], and the password is "iloveosworld500", if prompted for OTP, use the authenticator chrome extension to see the OTP for 2 factor authentication.
+My gmail account is [email protected], and the password is "iloveosworld500", if prompted for OTP, use the authenticator chrome extension to see the OTP for 2 factor authentication.
 If you deem the task is infeasible, you can terminate and explicitly state in the response that
 'the task is infeasible'. Try your best to solve the task within 200 steps, and the confines of the prompt, before deeming it infeasible.