wf_httpbin_form_post • v1 • httpbin.org • easy
Submit the HTTPBin form and return the echoed submission payload
timeout 300s • started Apr 21, 10:46 AM
Cohort status
completed
Attempts
1
Pass rate
--
Live attempts
0
Cohort verdicts
TinyFish: n/a
Task prompt
Benchmark case: Submit the HTTPBin form and return the echoed submission payload

Goal: Open the form at https://httpbin.org/forms/post, fill it with the provided input data, submit it, and return the echoed submitted_form object.

Preferred start URL: https://httpbin.org/forms/post

Input guidance: Use these exact values when filling the form.

Structured input data:
{
  "comments": "Leave at reception",
  "custemail": "ada@example.com",
  "custname": "Ada Lovelace",
  "custtel": "+1-415-555-0136",
  "delivery": "19:45",
  "size": "medium",
  "topping": "bacon"
}

Return a final JSON object matching this schema:
{
  "type": "object",
  "required": [
    "submitted_form"
  ],
  "properties": {
    "submission_url": {
      "type": "string",
      "format": "uri",
      "optional": true
    },
    "submitted_form": {
      "type": "object",
      "required": [
        "custname",
        "custtel",
        "custemail",
        "size",
        "topping",
        "delivery",
        "comments"
      ],
      "properties": {
        "comments": {
          "type": "string"
        },
        "custemail": {
          "type": "string"
        },
        "custname": {
          "type": "string"
        },
        "custtel": {
          "type": "string"
        },
        "delivery": {
          "type": "string"
        },
        "size": {
          "type": "string"
        },
        "topping": {
          "type": "string"
        }
      }
    }
  }
}

Work the website directly. Prefer exact values over paraphrases. Return only the final structured answer when done.
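For reference, a browser submitting this form would typically send the fields as an application/x-www-form-urlencoded POST body. The sketch below shows that encoding with Python's standard library. It assumes the form uses the default encoding, which the prompt does not state; `FORM_DATA` and `encode_form` are illustrative names, and no network request is made.

```python
from urllib.parse import parse_qs, urlencode

# Input values copied from the structured input data in the prompt.
FORM_DATA = {
    "comments": "Leave at reception",
    "custemail": "ada@example.com",
    "custname": "Ada Lovelace",
    "custtel": "+1-415-555-0136",
    "delivery": "19:45",
    "size": "medium",
    "topping": "bacon",
}


def encode_form(fields: dict) -> str:
    """Encode fields as an application/x-www-form-urlencoded body (assumed encoding)."""
    return urlencode(fields)


body = encode_form(FORM_DATA)

# Decoding the body again should recover the exact input values,
# including the leading "+" in the phone number.
decoded = {name: values[0] for name, values in parse_qs(body).items()}
```

The round trip matters mainly for `custtel`: a literal "+" has to be percent-encoded, otherwise it would be read back as a space.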
Benchmark contract
Frozen case metadata for this cohort
This is the versioned case payload the arena executed and later submitted to tinyeval.
Case ID
wf_httpbin_form_post
Version
1
Domain
httpbin.org
Difficulty
easy
Timeout
300s
Entrypoint
https://httpbin.org/forms/post
Structured input data
{
  "comments": "Leave at reception",
  "custemail": "ada@example.com",
  "custname": "Ada Lovelace",
  "custtel": "+1-415-555-0136",
  "delivery": "19:45",
  "size": "medium",
  "topping": "bacon"
}
Output schema
{
  "type": "object",
  "required": [
    "submitted_form"
  ],
  "properties": {
    "submission_url": {
      "type": "string",
      "format": "uri",
      "optional": true
    },
    "submitted_form": {
      "type": "object",
      "required": [
        "custname",
        "custtel",
        "custemail",
        "size",
        "topping",
        "delivery",
        "comments"
      ],
      "properties": {
        "comments": {
          "type": "string"
        },
        "custemail": {
          "type": "string"
        },
        "custname": {
          "type": "string"
        },
        "custtel": {
          "type": "string"
        },
        "delivery": {
          "type": "string"
        },
        "size": {
          "type": "string"
        },
        "topping": {
          "type": "string"
        }
      }
    }
  }
}
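The schema above can be checked by hand with a few lines of Python. This is a minimal sketch, not the validator tinyeval uses; `validate_output` and `REQUIRED_FIELDS` are hypothetical names. Note that `"optional": true` is not a keyword in current JSON Schema drafts, so a standard validator would likely ignore it; `submission_url` is optional here simply because it is absent from the top-level `required` list.

```python
# Required keys of "submitted_form", in the order the schema lists them.
REQUIRED_FIELDS = [
    "custname", "custtel", "custemail", "size", "topping", "delivery", "comments",
]


def validate_output(output) -> list:
    """Return a list of problems; an empty list means the output fits the schema."""
    if not isinstance(output, dict):
        return ["output is not an object"]
    form = output.get("submitted_form")
    if not isinstance(form, dict):
        return ["submitted_form is missing or not an object"]
    errors = []
    for name in REQUIRED_FIELDS:
        if name not in form:
            errors.append(f"missing field: {name}")
        elif not isinstance(form[name], str):
            errors.append(f"field is not a string: {name}")
    # submission_url may be omitted, but must be a string when present.
    url = output.get("submission_url")
    if url is not None and not isinstance(url, str):
        errors.append("submission_url is not a string")
    return errors


example = {
    "submission_url": "https://httpbin.org/post",
    "submitted_form": {
        "comments": "Leave at reception",
        "custemail": "ada@example.com",
        "custname": "Ada Lovelace",
        "custtel": "+1-415-555-0136",
        "delivery": "19:45",
        "size": "medium",
        "topping": "bacon",
    },
}
problems = validate_output(example)
```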
TinyFish
Eval status: n_a
completed
Success
--
Latency
11.73s
Cost
$0.00
Tokens
8,859
Steps
6
Current URL
https://httpbin.org/forms/post
Submitted output
Exact structured payload submitted to tinyeval for scoring.
{
  "submission_url": "https://httpbin.org/post",
  "submitted_form": {
    "comments": "Leave at reception",
    "custemail": "ada@example.com",
    "custname": "Ada Lovelace",
    "custtel": "+1-415-555-0136",
    "delivery": "19:45",
    "size": "medium",
    "topping": "bacon"
  }
}
Output
{
  "submission_url": "https://httpbin.org/post",
  "submitted_form": {
    "comments": "Leave at reception",
    "custemail": "ada@example.com",
    "custname": "Ada Lovelace",
    "custtel": "+1-415-555-0136",
    "delivery": "19:45",
    "size": "medium",
    "topping": "bacon"
  }
}
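Because the evaluation for this run is reported as n/a, the output can be compared against the input by hand. The sketch below is one way to do that under the prompt's "prefer exact values" rule; `diff_fields` is a hypothetical helper, not part of the arena or tinyeval.

```python
# Values the task asked the agent to enter.
expected = {
    "comments": "Leave at reception",
    "custemail": "ada@example.com",
    "custname": "Ada Lovelace",
    "custtel": "+1-415-555-0136",
    "delivery": "19:45",
    "size": "medium",
    "topping": "bacon",
}

# The "submitted_form" object from the output above.
submitted_form = {
    "comments": "Leave at reception",
    "custemail": "ada@example.com",
    "custname": "Ada Lovelace",
    "custtel": "+1-415-555-0136",
    "delivery": "19:45",
    "size": "medium",
    "topping": "bacon",
}


def diff_fields(expected: dict, actual: dict) -> dict:
    """Map each differing field to an (expected, actual) pair; exact string match."""
    names = sorted(set(expected) | set(actual))
    return {
        name: (expected.get(name), actual.get(name))
        for name in names
        if expected.get(name) != actual.get(name)
    }


mismatches = diff_fields(expected, submitted_form)
```

An empty `mismatches` dictionary means every field was echoed back exactly as entered.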
Live viewer
Viewer URL retained for this attempt. Availability depends on the underlying provider session.
Agent steps
11 logged entries across 6 numbered steps; steps 1-5 each appear twice (an action description, then a page_map size).
1. set_value
Apr 21, 10:46 AM
Fill form fields with provided data
https://httpbin.org/forms/post
1. set_value
Apr 21, 10:46 AM
page_map: 572 chars
https://httpbin.org/forms/post
2. click
Apr 21, 10:46 AM
Select pizza size and toppings
https://httpbin.org/forms/post
2. click
Apr 21, 10:46 AM
page_map: 473 chars
https://httpbin.org/forms/post
3. click
Apr 21, 10:46 AM
Select bacon topping
https://httpbin.org/forms/post
3. click
Apr 21, 10:46 AM
page_map: 483 chars
https://httpbin.org/forms/post
4. set_value
Apr 21, 10:46 AM
Set delivery time and instructions
https://httpbin.org/forms/post
4. set_value
Apr 21, 10:46 AM
page_map: 517 chars
https://httpbin.org/forms/post
5. click
Apr 21, 10:46 AM
Submit the form
https://httpbin.org/forms/post
5. click
Apr 21, 10:46 AM
page_map: 1570 chars
https://httpbin.org/forms/post
6. end
Apr 21, 10:46 AM
{"error":"none","notes":"The form was successfully submitted and the response contains the echoed 'form' object with the expected values.","result":"{\n \"submission_url\": \"h...
https://httpbin.org/forms/post
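The timeline above has 11 entries but only 6 distinct step numbers. The sketch below groups the entries by step number to reconcile the two counts. Reading the second entry of each pair as a page_map observation for the same action is an interpretation, not something the dashboard states; `ENTRIES` and `group_steps` are illustrative names, and the detail for step 6 is a placeholder because the log truncates it.

```python
# (step number, action, detail) for each logged entry, in log order.
ENTRIES = [
    (1, "set_value", "Fill form fields with provided data"),
    (1, "set_value", "page_map: 572 chars"),
    (2, "click", "Select pizza size and toppings"),
    (2, "click", "page_map: 473 chars"),
    (3, "click", "Select bacon topping"),
    (3, "click", "page_map: 483 chars"),
    (4, "set_value", "Set delivery time and instructions"),
    (4, "set_value", "page_map: 517 chars"),
    (5, "click", "Submit the form"),
    (5, "click", "page_map: 1570 chars"),
    (6, "end", "(final result JSON, truncated in the log)"),
]


def group_steps(entries):
    """Collapse log entries into one record per step number, keeping log order."""
    steps = {}
    for number, action, detail in entries:
        record = steps.setdefault(number, {"action": action, "details": []})
        record["details"].append(detail)
    return steps


steps = group_steps(ENTRIES)
entry_count = sum(len(record["details"]) for record in steps.values())
```

Grouped this way, the log matches the step count of 6 reported for the attempt.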
Evaluation
N/A
Evaluation unavailable at this time.
n_a
Failure reason
--
Failed checkpoint
--
Checkpoint summary
0 passed • 0 failed
Evaluated
Apr 21, 10:46 AM
Raw evaluation payload
{
  "submit": {
    "eval_id": "run_9a68eda4d0a40f1f",
    "run_id": "run_9a68eda4d0a40f1f",
    "status": "n_a",
    "note": "Evaluation unavailable at this time."
  },
  "detail": {
    "eval": {
      "run_id": "run_9a68eda4d0a40f1f",
      "case_id": "wf_httpbin_form_post",
      "case_version": 1,
      "status": "n_a",
      "agent_name": "TinyFish",
      "last_evaluation": {
        "event_id": "evt_7a6885df43a85165",
        "status": "n_a",
        "note": "Evaluation unavailable at this time.",
        "evaluated_at": "2026-04-21T10:46:45.51733959Z"
      },
      "created_at": "2026-04-21T10:46:45.185298833Z",
      "updated_at": "2026-04-21T10:46:45.517340376Z",
      "completed_at": "2026-04-21T10:46:45.517340376Z"
    },
    "run": {
      "run_id": "run_9a68eda4d0a40f1f",
      "case_id": "wf_httpbin_form_post",
      "case_version": 1,
      "status": "n_a",
      "agent_name": "TinyFish",
      "last_evaluation": {
        "event_id": "evt_7a6885df43a85165",
        "status": "n_a",
        "note": "Evaluation unavailable at this time.",
        "evaluated_at": "2026-04-21T10:46:45.51733959Z"
      },
      "created_at": "2026-04-21T10:46:45.185298833Z",
      "updated_at": "2026-04-21T10:46:45.517340376Z",
      "completed_at": "2026-04-21T10:46:45.517340376Z"
    },
    "case": {
      "case_id": "wf_httpbin_form_post",
      "version": 1,
      "status": "active",
      "created_at": "2026-04-09T00:00:00Z",
      "updated_at": "2026-04-09T00:00:00Z",
      "title": "Submit the HTTPBin form and return the echoed submission payload",
      "task_description": "Open the form at https://httpbin.org/forms/post, fill it with the provided input data, submit it, and return the echoed submitted_form object.",
      "domains": [
        "httpbin.org"
      ],
      "tags": [
        "forms",
        "httpbin",
        "submission",
        "echo"
      ],
      "difficulty": "easy",
      "locale": "en-US",
      "input_spec": {
        "description": "Use these exact values when filling the form.",
        "schema": {
          "type": "object",
          "required": [
            "custname",
            "custtel",
            "custemail",
            "size",
            "topping",
            "delivery",
            "comments"
          ],
          "properties": {
            "comments": {
              "type": "string"
            },
            "custemail": {
              "type": "string"
            },
            "custname": {
              "type": "string"
            },
            "custtel": {
              "type": "string"
            },
            "delivery": {
              "type": "string"
            },
            "size": {
              "type": "string"
            },
            "topping": {
              "type": "string"
            }
          }
        },
        "data": {
          "comments": "Leave at reception",
          "custemail": "ada@example.com",
          "custname": "Ada Lovelace",
          "custtel": "+1-415-555-0136",
          "delivery": "19:45",
          "size": "medium",
          "topping": "bacon"
        }
      },
      "output_schema": {
        "type": "object",
        "required": [
          "submitted_form"
        ],
        "properties": {
          "submission_url": {
            "type": "string",
            "format": "uri",
            "optional": true
          },
          "submitted_form": {
            "type": "object",
            "required": [
              "custname",
              "custtel",
              "custemail",
              "size",
              "topping",
              "delivery",
              "comments"
            ],
            "properties": {
              "comments": {
                "type": "string"
              },
              "custemail": {
                "type": "string"
              },
              "custname": {
                "type": "string"
              },
              "custtel": {
                "type": "string"
              },
              "delivery": {
                "type": "string"
              },
              "size": {
                "type": "string"
              },
              "topping": {
                "type": "string"
              }
            }
          }
        }
      },
      "entrypoint": {
        "mode": "url",
        "url": "https://httpbin.org/forms/post"
      },
      "run_policy": {
        "timeout_seconds": 300
      }
    },
    "events": [
      {
        "event_id": "evt_7a6885df43a85165",
        "run_id": "run_9a68eda4d0a40f1f",
        "event_type": "final",
        "payload": {
          "event_type": "final",
          "timestamp": "2026-04-21T10:46:45.188733741Z",
          "page_state": {},
          "intermediate_data": {
            "raw_submission": {
              "submission_url": "https://httpbin.org/post",
              "submitted_form": {
                "comments": "Leave at reception",
                "custemail": "ada@example.com",
                "custname": "Ada Lovelace",
                "custtel": "+1-415-555-0136",
                "delivery": "19:45",
                "size": "medium",
                "topping": "bacon"
              }
            }
          },
          "final_output": {
            "submission_url": "https://httpbin.org/post",
            "submitted_form": {
              "comments": "Leave at reception",
              "custemail": "ada@example.com",
              "custname": "Ada Lovelace",
              "custtel": "+1-415-555-0136",
              "delivery": "19:45",
              "size": "medium",
              "topping": "bacon"
            }
          },
          "artifacts": {},
          "agent": {}
        },
        "created_at": "2026-04-21T10:46:45.188926756Z"
      }
    ],
    "timeout_seconds": 300,
    "deadline_at": "2026-04-21T10:51:45.185298833Z"
  }
}
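The timestamps in the payload above are consistent with the run policy: `deadline_at` is `created_at` plus `timeout_seconds`. A minimal sketch of that check follows. `parse_ts` is a hypothetical helper that trims the nanosecond fractions to microseconds, since Python's `datetime` does not store finer resolution.

```python
import re
from datetime import datetime, timedelta


def parse_ts(ts: str) -> datetime:
    """Parse an RFC 3339 style UTC timestamp, keeping at most microseconds."""
    # Drop fractional digits beyond the sixth, then spell out the UTC offset.
    trimmed = re.sub(r"(\.\d{6})\d+", r"\1", ts)
    return datetime.fromisoformat(trimmed.replace("Z", "+00:00"))


# Values copied from the raw evaluation payload.
created_at = parse_ts("2026-04-21T10:46:45.185298833Z")
evaluated_at = parse_ts("2026-04-21T10:46:45.51733959Z")
deadline_at = parse_ts("2026-04-21T10:51:45.185298833Z")
timeout_seconds = 300

# The deadline should sit exactly one timeout after creation.
deadline_matches = deadline_at - created_at == timedelta(seconds=timeout_seconds)

# Time between the eval record being created and the n_a verdict being written.
eval_lag_seconds = (evaluated_at - created_at).total_seconds()
```

The lag is well under a second, which fits an evaluation that returned n_a without scoring any checkpoints.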
Provider terminal payload
Raw terminal result retained from the provider adapter.
{
  "answer": "{\n  \"submission_url\": \"https://httpbin.org/post\",\n  \"submitted_form\": {\n    \"comments\": \"Leave at reception\",\n    \"custemail\": \"ada@example.com\",\n    \"custname\": \"Ada Lovelace\",\n    \"custtel\": \"+1-415-555-0136\",\n    \"delivery\": \"19:45\",\n    \"size\": \"medium\",\n    \"topping\": \"bacon\"\n  }\n}",
  "data": {
    "error": "none",
    "notes": "The form was successfully submitted and the response contains the echoed 'form' object with the expected values.",
    "result": "{\n  \"submission_url\": \"https://httpbin.org/post\",\n  \"submitted_form\": {\n    \"comments\": \"Leave at reception\",\n    \"custemail\": \"ada@example.com\",\n    \"custname\": \"Ada Lovelace\",\n    \"custtel\": \"+1-415-555-0136\",\n    \"delivery\": \"19:45\",\n    \"size\": \"medium\",\n    \"topping\": \"bacon\"\n  }\n}",
    "success": true
  },
  "durationMs": 11729,
  "error": "none",
  "inputTokens": 8257,
  "llmDurationMs": 8734,
  "model": "gemini-3-flash-preview",
  "notes": "The form was successfully submitted and the response contains the echoed 'form' object with the expected values.",
  "outputTokens": 602,
  "recordingId": "f5f20023-4318-4c39-a2f1-107a10c2c48b",
  "result": "{\n  \"submission_url\": \"https://httpbin.org/post\",\n  \"submitted_form\": {\n    \"comments\": \"Leave at reception\",\n    \"custemail\": \"ada@example.com\",\n    \"custname\": \"Ada Lovelace\",\n    \"custtel\": \"+1-415-555-0136\",\n    \"delivery\": \"19:45\",\n    \"size\": \"medium\",\n    \"topping\": \"bacon\"\n  }\n}",
  "success": true,
  "thinkingTokens": 0,
  "totalSteps": 6,
  "totalTokens": 8859,
  "type": "done"
}
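In the terminal payload, `result` is a JSON document stored inside a string, so it needs a second decoding pass before its fields can be read. The sketch below decodes it and cross-checks the counters against the summary shown earlier (8,859 tokens, 11.73s). Only the fields used are reproduced, and the variable names are illustrative.

```python
import json

# Subset of the provider terminal payload; "result" is a JSON string, as in the source.
payload = {
    "result": (
        '{\n  "submission_url": "https://httpbin.org/post",\n'
        '  "submitted_form": {\n'
        '    "comments": "Leave at reception",\n'
        '    "custemail": "ada@example.com",\n'
        '    "custname": "Ada Lovelace",\n'
        '    "custtel": "+1-415-555-0136",\n'
        '    "delivery": "19:45",\n'
        '    "size": "medium",\n'
        '    "topping": "bacon"\n  }\n}'
    ),
    "durationMs": 11729,
    "inputTokens": 8257,
    "outputTokens": 602,
    "totalTokens": 8859,
}

# Second decoding pass: turn the embedded string into a dictionary.
result = json.loads(payload["result"])

# Input and output tokens should add up to the reported total.
tokens_add_up = (
    payload["inputTokens"] + payload["outputTokens"] == payload["totalTokens"]
)

# Latency in seconds, rounded the way the summary displays it.
latency_seconds = round(payload["durationMs"] / 1000, 2)
```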