wf_httpbin_form_post • v1 • httpbin.org • easy
Submit the HTTPBin form and return the echoed submission payload
timeout 300s • started Apr 21, 10:43 AM
Cohort status
completed
Attempts
1
Pass rate
--
Live attempts
0
Cohort verdicts
TinyFish: n/a
Task prompt
Benchmark case: Submit the HTTPBin form and return the echoed submission payload

Goal: Open the form at https://httpbin.org/forms/post, fill it with the provided input data, submit it, and return the echoed submitted_form object.

Preferred start URL: https://httpbin.org/forms/post

Input guidance: Use these exact values when filling the form.

Structured input data:
{
  "comments": "Leave at reception",
  "custemail": "ada@example.com",
  "custname": "Ada Lovelace",
  "custtel": "+1-415-555-0136",
  "delivery": "19:45",
  "size": "medium",
  "topping": "bacon"
}

Return a final JSON object matching this schema:
{
  "type": "object",
  "required": [
    "submitted_form"
  ],
  "properties": {
    "submission_url": {
      "type": "string",
      "format": "uri",
      "optional": true
    },
    "submitted_form": {
      "type": "object",
      "required": [
        "custname",
        "custtel",
        "custemail",
        "size",
        "topping",
        "delivery",
        "comments"
      ],
      "properties": {
        "comments": {
          "type": "string"
        },
        "custemail": {
          "type": "string"
        },
        "custname": {
          "type": "string"
        },
        "custtel": {
          "type": "string"
        },
        "delivery": {
          "type": "string"
        },
        "size": {
          "type": "string"
        },
        "topping": {
          "type": "string"
        }
      }
    }
  }
}

Work the website directly. Prefer exact values over paraphrases. Return only the final structured answer when done.
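At the HTTP level, submitting this form amounts to a URL-encoded POST whose response echoes the fields back. The sketch below illustrates that exchange; it is not how the agent worked, since the task asks for the website to be driven directly. It assumes the form posts to https://httpbin.org/post (the `submission_url` reported for this run) and that the echo appears under a `form` key in the response JSON, as the agent's notes describe. The helper names `build_request`, `to_answer`, and `submit` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Input values copied from the task prompt.
FORM_DATA = {
    "comments": "Leave at reception",
    "custemail": "ada@example.com",
    "custname": "Ada Lovelace",
    "custtel": "+1-415-555-0136",
    "delivery": "19:45",
    "size": "medium",
    "topping": "bacon",
}

# Assumption: the form's action target, taken from the run's submission_url.
SUBMIT_URL = "https://httpbin.org/post"


def build_request(form, url=SUBMIT_URL):
    """Build a URL-encoded POST request without sending it."""
    body = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def to_answer(echo, url=SUBMIT_URL):
    """Map an echo response (assumed to carry a 'form' key) to the output shape."""
    return {"submission_url": url, "submitted_form": echo["form"]}


def submit(form):
    """Send the request and return the structured answer (needs network access)."""
    with urllib.request.urlopen(build_request(form), timeout=30) as resp:
        return to_answer(json.load(resp))
```

Calling `submit(FORM_DATA)` performs the live request; `build_request` and `to_answer` can be exercised offline.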
Benchmark contract
Frozen case metadata for this cohort
This is the versioned case payload the arena executed and later submitted to tinyeval.
Case ID
wf_httpbin_form_post
Version
1
Domain
httpbin.org
Difficulty
easy
Timeout
300s
Entrypoint
https://httpbin.org/forms/post
Structured input data
{
  "comments": "Leave at reception",
  "custemail": "ada@example.com",
  "custname": "Ada Lovelace",
  "custtel": "+1-415-555-0136",
  "delivery": "19:45",
  "size": "medium",
  "topping": "bacon"
}
Output schema
{
  "type": "object",
  "required": [
    "submitted_form"
  ],
  "properties": {
    "submission_url": {
      "type": "string",
      "format": "uri",
      "optional": true
    },
    "submitted_form": {
      "type": "object",
      "required": [
        "custname",
        "custtel",
        "custemail",
        "size",
        "topping",
        "delivery",
        "comments"
      ],
      "properties": {
        "comments": {
          "type": "string"
        },
        "custemail": {
          "type": "string"
        },
        "custname": {
          "type": "string"
        },
        "custtel": {
          "type": "string"
        },
        "delivery": {
          "type": "string"
        },
        "size": {
          "type": "string"
        },
        "topping": {
          "type": "string"
        }
      }
    }
  }
}
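A hand-rolled check of an answer against this schema could look like the sketch below. It covers only what the schema states (a required `submitted_form` object, seven required string fields, and a string `submission_url` when present). Note that `"optional": true` is not a standard JSON Schema keyword; optionality is normally expressed by leaving a property out of `required`, so the sketch simply treats `submission_url` as not required. The function name `validate_output` is hypothetical, and this is not the arena's scoring logic.

```python
REQUIRED_FORM_FIELDS = [
    "custname", "custtel", "custemail", "size", "topping", "delivery", "comments",
]


def validate_output(answer):
    """Return a list of problems; an empty list means the answer fits the schema."""
    if not isinstance(answer, dict):
        return ["answer must be an object"]
    errors = []
    # submission_url is not in the top-level 'required' list, so it may be absent.
    if "submission_url" in answer and not isinstance(answer["submission_url"], str):
        errors.append("submission_url must be a string")
    form = answer.get("submitted_form")
    if not isinstance(form, dict):
        errors.append("submitted_form is required and must be an object")
        return errors
    for field in REQUIRED_FORM_FIELDS:
        if field not in form:
            errors.append(f"submitted_form.{field} is missing")
        elif not isinstance(form[field], str):
            errors.append(f"submitted_form.{field} must be a string")
    return errors
```

An answer carrying all seven string fields yields an empty list, with or without `submission_url`.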
TinyFish
Eval status: n_a
completed
Success
--
Latency
12.80s
Cost
$0.00
Tokens
10,205
Steps
7
Current URL
https://httpbin.org/forms/post
Submitted output
Exact structured payload submitted to tinyeval for scoring.
{
  "submission_url": "https://httpbin.org/post",
  "submitted_form": {
    "comments": "Leave at reception",
    "custemail": "ada@example.com",
    "custname": "Ada Lovelace",
    "custtel": "+1-415-555-0136",
    "delivery": "19:45",
    "size": "medium",
    "topping": "bacon"
  }
}
Output
{"submission_url": "https://httpbin.org/post", "submitted_form": {"comments": "Leave at reception", "custemail": "ada@example.com", "custname": "Ada Lovelace", "custtel": "+1-415-555-0136", "delivery": "19:45", "size": "medium", "topping": "bacon"}}
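Because the evaluation for this run is unavailable, one way to sanity-check the output by hand is to compare the echoed form with the structured input field by field. The sketch below does that with exact string comparison, in line with the prompt's "prefer exact values" guidance. The helper name `diff_fields` is hypothetical, and this is not the evaluator's actual rule set.

```python
import json

# Structured input data from the task prompt.
EXPECTED = {
    "comments": "Leave at reception",
    "custemail": "ada@example.com",
    "custname": "Ada Lovelace",
    "custtel": "+1-415-555-0136",
    "delivery": "19:45",
    "size": "medium",
    "topping": "bacon",
}

# The run's raw output string, copied from the line above.
RAW_OUTPUT = (
    '{"submission_url": "https://httpbin.org/post", "submitted_form": '
    '{"comments": "Leave at reception", "custemail": "ada@example.com", '
    '"custname": "Ada Lovelace", "custtel": "+1-415-555-0136", '
    '"delivery": "19:45", "size": "medium", "topping": "bacon"}}'
)


def diff_fields(expected, actual):
    """Map each mismatching key to an (expected, actual) pair; None marks absence."""
    keys = sorted(set(expected) | set(actual))
    return {
        key: (expected.get(key), actual.get(key))
        for key in keys
        if expected.get(key) != actual.get(key)
    }


mismatches = diff_fields(EXPECTED, json.loads(RAW_OUTPUT)["submitted_form"])
print(mismatches or "all fields match the input exactly")
```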
Agent steps
13 logged entries across 7 steps.
1. set_value
Apr 21, 10:43 AM
Fill out the pizza order form.
https://httpbin.org/forms/post
1. set_value
Apr 21, 10:43 AM
page_map: 572 chars
https://httpbin.org/forms/post
2. click
Apr 21, 10:43 AM
Select medium size pizza
https://httpbin.org/forms/post
2. click
Apr 21, 10:43 AM
page_map: 473 chars
https://httpbin.org/forms/post
3. set_value
Apr 21, 10:43 AM
Select topping and set delivery time and comments
https://httpbin.org/forms/post
3. set_value
Apr 21, 10:43 AM
page_map: 375 chars
https://httpbin.org/forms/post
4. click
Apr 21, 10:43 AM
Select bacon topping
https://httpbin.org/forms/post
4. click
Apr 21, 10:43 AM
page_map: 483 chars
https://httpbin.org/forms/post
5. set_value
Apr 21, 10:43 AM
Set delivery time and comments
https://httpbin.org/forms/post
5. set_value
Apr 21, 10:43 AM
page_map: 517 chars
https://httpbin.org/forms/post
6. click
Apr 21, 10:43 AM
Submit the form
https://httpbin.org/forms/post
6. click
Apr 21, 10:43 AM
page_map: 1570 chars
https://httpbin.org/forms/post
7. end
Apr 21, 10:43 AM
{"error":"none","notes":"The form has been successfully submitted, and the echoed data is displayed on the page. I am extracting the 'form' object from the response JSON and ret...
https://httpbin.org/forms/post
Evaluation
N/A
Evaluation unavailable at this time.
n_a
Failure reason
--
Failed checkpoint
--
Checkpoint summary
0 passed • 0 failed
Evaluated
Apr 21, 10:43 AM
Raw evaluation payload
{
  "submit": {
    "eval_id": "run_53f3963eb01eaacf",
    "run_id": "run_53f3963eb01eaacf",
    "status": "n_a",
    "note": "Evaluation unavailable at this time."
  },
  "detail": {
    "eval": {
      "run_id": "run_53f3963eb01eaacf",
      "case_id": "wf_httpbin_form_post",
      "case_version": 1,
      "status": "n_a",
      "agent_name": "TinyFish",
      "last_evaluation": {
        "event_id": "evt_27a1c66453596f3a",
        "status": "n_a",
        "note": "Evaluation unavailable at this time.",
        "evaluated_at": "2026-04-21T10:43:58.25707791Z"
      },
      "created_at": "2026-04-21T10:43:48.246264211Z",
      "updated_at": "2026-04-21T10:43:58.257078814Z",
      "completed_at": "2026-04-21T10:43:58.257078814Z"
    },
    "run": {
      "run_id": "run_53f3963eb01eaacf",
      "case_id": "wf_httpbin_form_post",
      "case_version": 1,
      "status": "n_a",
      "agent_name": "TinyFish",
      "last_evaluation": {
        "event_id": "evt_27a1c66453596f3a",
        "status": "n_a",
        "note": "Evaluation unavailable at this time.",
        "evaluated_at": "2026-04-21T10:43:58.25707791Z"
      },
      "created_at": "2026-04-21T10:43:48.246264211Z",
      "updated_at": "2026-04-21T10:43:58.257078814Z",
      "completed_at": "2026-04-21T10:43:58.257078814Z"
    },
    "case": {
      "case_id": "wf_httpbin_form_post",
      "version": 1,
      "status": "active",
      "created_at": "2026-04-09T00:00:00Z",
      "updated_at": "2026-04-09T00:00:00Z",
      "title": "Submit the HTTPBin form and return the echoed submission payload",
      "task_description": "Open the form at https://httpbin.org/forms/post, fill it with the provided input data, submit it, and return the echoed submitted_form object.",
      "domains": [
        "httpbin.org"
      ],
      "tags": [
        "forms",
        "httpbin",
        "submission",
        "echo"
      ],
      "difficulty": "easy",
      "locale": "en-US",
      "input_spec": {
        "description": "Use these exact values when filling the form.",
        "schema": {
          "type": "object",
          "required": [
            "custname",
            "custtel",
            "custemail",
            "size",
            "topping",
            "delivery",
            "comments"
          ],
          "properties": {
            "comments": {
              "type": "string"
            },
            "custemail": {
              "type": "string"
            },
            "custname": {
              "type": "string"
            },
            "custtel": {
              "type": "string"
            },
            "delivery": {
              "type": "string"
            },
            "size": {
              "type": "string"
            },
            "topping": {
              "type": "string"
            }
          }
        },
        "data": {
          "comments": "Leave at reception",
          "custemail": "ada@example.com",
          "custname": "Ada Lovelace",
          "custtel": "+1-415-555-0136",
          "delivery": "19:45",
          "size": "medium",
          "topping": "bacon"
        }
      },
      "output_schema": {
        "type": "object",
        "required": [
          "submitted_form"
        ],
        "properties": {
          "submission_url": {
            "type": "string",
            "format": "uri",
            "optional": true
          },
          "submitted_form": {
            "type": "object",
            "required": [
              "custname",
              "custtel",
              "custemail",
              "size",
              "topping",
              "delivery",
              "comments"
            ],
            "properties": {
              "comments": {
                "type": "string"
              },
              "custemail": {
                "type": "string"
              },
              "custname": {
                "type": "string"
              },
              "custtel": {
                "type": "string"
              },
              "delivery": {
                "type": "string"
              },
              "size": {
                "type": "string"
              },
              "topping": {
                "type": "string"
              }
            }
          }
        }
      },
      "entrypoint": {
        "mode": "url",
        "url": "https://httpbin.org/forms/post"
      },
      "run_policy": {
        "timeout_seconds": 300
      }
    },
    "events": [
      {
        "event_id": "evt_27a1c66453596f3a",
        "run_id": "run_53f3963eb01eaacf",
        "event_type": "final",
        "payload": {
          "event_type": "final",
          "timestamp": "2026-04-21T10:43:48.249798452Z",
          "page_state": {},
          "intermediate_data": {
            "raw_submission": {
              "submission_url": "https://httpbin.org/post",
              "submitted_form": {
                "comments": "Leave at reception",
                "custemail": "ada@example.com",
                "custname": "Ada Lovelace",
                "custtel": "+1-415-555-0136",
                "delivery": "19:45",
                "size": "medium",
                "topping": "bacon"
              }
            }
          },
          "final_output": {
            "submission_url": "https://httpbin.org/post",
            "submitted_form": {
              "comments": "Leave at reception",
              "custemail": "ada@example.com",
              "custname": "Ada Lovelace",
              "custtel": "+1-415-555-0136",
              "delivery": "19:45",
              "size": "medium",
              "topping": "bacon"
            }
          },
          "artifacts": {},
          "agent": {}
        },
        "created_at": "2026-04-21T10:43:48.250001499Z"
      }
    ],
    "timeout_seconds": 300,
    "deadline_at": "2026-04-21T10:48:48.246264211Z"
  }
}
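The timestamps in this payload carry up to nine fractional digits, more than Python's `datetime` can hold. The sketch below truncates them to microseconds and uses them to check two relationships visible in the record: `deadline_at` equals `created_at` plus `timeout_seconds`, and the evaluation record was completed roughly ten seconds after creation. The helper name `parse_utc` is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

_TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z$")


def parse_utc(ts):
    """Parse an RFC 3339 UTC timestamp, truncating the fraction to microseconds."""
    match = _TS.match(ts)
    if match is None:
        raise ValueError(f"unsupported timestamp: {ts!r}")
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    fraction = (match.group(2) or "")[:6].ljust(6, "0")
    return base.replace(tzinfo=timezone.utc) + timedelta(microseconds=int(fraction))


# Values copied from the payload above.
created = parse_utc("2026-04-21T10:43:48.246264211Z")
completed = parse_utc("2026-04-21T10:43:58.257078814Z")
deadline = parse_utc("2026-04-21T10:48:48.246264211Z")
timeout_seconds = 300

elapsed = completed - created
print(f"record lifetime: {elapsed.total_seconds():.3f}s")
print("deadline matches timeout:", deadline - created == timedelta(seconds=timeout_seconds))
```

Truncation rather than rounding keeps the comparison simple; sub-microsecond differences are dropped.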
Provider terminal payload
Raw terminal result retained from the provider adapter.
{
  "answer": "{\"submission_url\": \"https://httpbin.org/post\", \"submitted_form\": {\"comments\": \"Leave at reception\", \"custemail\": \"ada@example.com\", \"custname\": \"Ada Lovelace\", \"custtel\": \"+1-415-555-0136\", \"delivery\": \"19:45\", \"size\": \"medium\", \"topping\": \"bacon\"}}",
  "data": {
    "error": "none",
    "notes": "The form has been successfully submitted, and the echoed data is displayed on the page. I am extracting the 'form' object from the response JSON and returning it.",
    "result": "{\"submission_url\": \"https://httpbin.org/post\", \"submitted_form\": {\"comments\": \"Leave at reception\", \"custemail\": \"ada@example.com\", \"custname\": \"Ada Lovelace\", \"custtel\": \"+1-415-555-0136\", \"delivery\": \"19:45\", \"size\": \"medium\", \"topping\": \"bacon\"}}",
    "success": true
  },
  "durationMs": 12798,
  "error": "none",
  "inputTokens": 9612,
  "llmDurationMs": 9505,
  "model": "gemini-3-flash-preview",
  "notes": "The form has been successfully submitted, and the echoed data is displayed on the page. I am extracting the 'form' object from the response JSON and returning it.",
  "outputTokens": 593,
  "recordingId": "ff4d7f62-6fbb-4b6a-9ffe-92d8d56bded6",
  "result": "{\"submission_url\": \"https://httpbin.org/post\", \"submitted_form\": {\"comments\": \"Leave at reception\", \"custemail\": \"ada@example.com\", \"custname\": \"Ada Lovelace\", \"custtel\": \"+1-415-555-0136\", \"delivery\": \"19:45\", \"size\": \"medium\", \"topping\": \"bacon\"}}",
  "success": true,
  "thinkingTokens": 0,
  "totalSteps": 7,
  "totalTokens": 10205,
  "type": "done"
}
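In this payload the `result` field is a JSON document stored as a string, so it needs a second `json.loads` pass before the form fields can be read. The sketch below decodes it and cross-checks the numeric fields against the summary shown earlier on the page (tokens and latency). Only a subset of the payload is reproduced, and the function name `summarize` is hypothetical.

```python
import json

# Subset of the provider terminal payload above; 'result' is a JSON-encoded string.
terminal = {
    "durationMs": 12798,
    "inputTokens": 9612,
    "outputTokens": 593,
    "thinkingTokens": 0,
    "totalTokens": 10205,
    "totalSteps": 7,
    "result": (
        '{"submission_url": "https://httpbin.org/post", "submitted_form": '
        '{"comments": "Leave at reception", "custemail": "ada@example.com", '
        '"custname": "Ada Lovelace", "custtel": "+1-415-555-0136", '
        '"delivery": "19:45", "size": "medium", "topping": "bacon"}}'
    ),
}


def summarize(payload):
    """Decode the nested result and derive the headline metrics."""
    token_sum = (
        payload["inputTokens"] + payload["outputTokens"] + payload["thinkingTokens"]
    )
    return {
        "answer": json.loads(payload["result"]),  # second decoding pass
        "latency_seconds": round(payload["durationMs"] / 1000, 2),
        "tokens_consistent": token_sum == payload["totalTokens"],
    }


summary = summarize(terminal)
print(summary["latency_seconds"], summary["tokens_consistent"])
```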