Question About Abstract State and Correct Elements #13

Open
jim850223 opened this issue Dec 20, 2024 · 5 comments

@jim850223

Dear Authors,

Thank you for your excellent work on SYNAPSE and for providing the corresponding paper.

I have a question regarding Section 3.2 of the paper, where you mentioned that to abstract the state, "we set k to 3 and 5 for the previous and current observations, respectively." My question is related to how you handle the inclusion of correct elements in the abstract state, especially in the context of Mind2Web.

As stated in the Mind2Web dataset, "each step of the task is evaluated independently with the ground truth action history provided." If the correct element is not included in the previous abstract state, do you append (concatenate) the correct element to the previous abstract state to ensure it is available for the next step? Alternatively, if you don't append the correct element, does this mean that the agent still records the action as taken even if the correct element was not part of the abstract state?

I appreciate your insights and look forward to your response.

Thank you again for your contributions!
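
The scenario in this question can be made concrete with a minimal sketch. It assumes each candidate element carries a ranker score and that the abstract state keeps only the k highest-scoring candidates. The `score` field, the toy data, and both helper functions are hypothetical illustrations and are not taken from the SYNAPSE code.

```python
from typing import Dict, List, Set


def top_k_candidates(scored: List[Dict], k: int) -> List[Dict]:
    """Keep the k candidates with the highest (hypothetical) 'score' value."""
    return sorted(scored, key=lambda c: c["score"], reverse=True)[:k]


def contains_positive(candidates: List[Dict], positive_ids: Set[str]) -> bool:
    """Check whether any kept candidate is a ground-truth element."""
    return any(c["backend_node_id"] in positive_ids for c in candidates)


# Toy ranked candidates; the ground-truth element "d" is ranked fourth.
candidates = [
    {"backend_node_id": "a", "score": 0.9},
    {"backend_node_id": "b", "score": 0.8},
    {"backend_node_id": "c", "score": 0.7},
    {"backend_node_id": "d", "score": 0.6},
    {"backend_node_id": "e", "score": 0.5},
    {"backend_node_id": "f", "score": 0.4},
]
positive_ids = {"d"}

previous_state = top_k_candidates(candidates, k=3)  # k for the previous observation
current_state = top_k_candidates(candidates, k=5)   # k for the current observation

in_previous = contains_positive(previous_state, positive_ids)
in_current = contains_positive(current_state, positive_ids)
```

With this toy data the ground-truth element falls outside the top 3 but inside the top 5, which is the case the question asks about.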

@ltzheng
Owner

ltzheng commented Dec 21, 2024

See here for the code for getting the top k observations.

@jim850223
Author

Thank you for the answer; I understand it now.
However, it raises another question:
Since pos_candidates can contain multiple valid candidates, comparing the prediction only against the first candidate may miss valid matches when the predicted element is a pos_candidate other than the first. Does the system account for this problem? If so, how do you solve it?
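
To illustrate the concern, here is a minimal sketch contrasting the two matching strategies. It assumes each entry of `pos_candidates` is a dict with a `backend_node_id` key. The function names and the toy IDs are hypothetical and do not come from the SYNAPSE code.

```python
from typing import Dict, List


def matches_first_only(pred_id: str, pos_candidates: List[Dict]) -> bool:
    """Count a prediction as correct only if it equals the first positive candidate."""
    return bool(pos_candidates) and pred_id == pos_candidates[0]["backend_node_id"]


def matches_any(pred_id: str, pos_candidates: List[Dict]) -> bool:
    """Count a prediction as correct if it equals any positive candidate."""
    return any(pred_id == c["backend_node_id"] for c in pos_candidates)


# Toy example: a button and the text span inside it are both acceptable.
pos_candidates = [
    {"backend_node_id": "101"},  # e.g. the button
    {"backend_node_id": "102"},  # e.g. the span nested in the button
]

first_only_result = matches_first_only("102", pos_candidates)
any_result = matches_any("102", pos_candidates)
```

Here a prediction of the second acceptable element is rejected by the first-only check and accepted by the any-match check.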

@ltzheng
Owner

ltzheng commented Dec 27, 2024

Can you provide an example? I think in Mind2Web there is only one correct element.

@jim850223
Author

Appendix C.1 (Evaluation) of the Mind2Web paper says:

"One complication that arises during evaluation on real-world websites is that multiple elements on a webpage may induce the same effect. For instance, a button might house a text span within it, both of which, when clicked, yield identical results. To enhance the robustness of our evaluation, we employ heuristics to detect elements equivalent to the ground truth. We first examine the ancestors of the labeled element to identify potential higher-level elements acceptable for the current action. We employ a straightforward heuristic that locates the nearest clickable element to the ground truth, including itself. After identifying the top-level acceptable element, we include all its visible descendants that are located within its post-rendering bounding box as acceptable as well. Manual checking on 100 instances where the heuristic identifies a top-level element other than the ground truth confirms the validity of the approach. For both training and evaluation stages, all acceptable elements are considered positive."

@ltzheng
Owner

ltzheng commented Dec 29, 2024

But after state abstraction, the clean observations given to the LLM contain only one positive element.
