feat(spec): add optional inputFields and outputFields to AgentSkill
#814
base: main
Conversation
Summary of Changes
Hello @inesmcm26, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request enhances the Agent Skill model by adding optional input and output fields, allowing agents to define structured data formats beyond simple mime types. This addition enables clients to build more precise interfaces, improve validation, and support complex skills with multiple inputs and outputs.
Highlights
- Feature: AgentSkill Input/Output Fields: Introduces optional inputFields and outputFields to the AgentSkill model, enabling structured, named input and output field definitions with type and schema information (see the sketch after this list).
- New: FieldDefinition Model: Defines a FieldDefinition model with attributes like name, kind (text, file, data), mimeTypes, schema, description, and optional to provide detailed descriptions of input and output fields.
- Enhancement: Documentation: Updates documentation to reflect the new inputFields and outputFields in AgentSkill and introduces the FieldDefinition object.
- Update: Examples: Updates the example AgentSkill definitions to include inputFields and outputFields to demonstrate the new feature.
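For orientation, here is a minimal TypeScript sketch of what the proposed objects could look like. The attribute names and optionality are inferred from the highlights above; the exact definitions in the PR may differ.

// Hypothetical sketch based on the highlights above; not the authoritative spec types.

// Kinds of content a field can carry.
type FieldKind = "text" | "file" | "data";

// Assumed shape of the proposed FieldDefinition object.
interface FieldDefinition {
  name: string;                     // e.g. "document"
  kind: FieldKind;                  // text, file, or data
  mimeTypes?: string[];             // media types accepted or produced for this field
  schema?: Record<string, unknown>; // JSON Schema for structured "data" fields
  description?: string;             // human-readable explanation of the field
  optional?: boolean;               // whether the field may be omitted
}

// AgentSkill extended with the proposed optional lists (other existing fields kept minimal).
interface AgentSkill {
  id: string;
  name: string;
  description: string;
  tags: string[];
  examples?: string[];
  inputModes?: string[];            // existing: supported input media types
  outputModes?: string[];           // existing: supported output media types
  inputFields?: FieldDefinition[];  // new: named, typed inputs
  outputFields?: FieldDefinition[]; // new: named, typed outputs
}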
Code Review
This pull request introduces inputFields and outputFields to the AgentSkill model, providing a way to define structured inputs and outputs. My review identified a critical issue in the protobuf definition, which is missing the new FieldDefinition message, and high/medium severity issues in the documentation, mainly concerning an incorrect example. Addressing these will improve the clarity and correctness of the specification.
Makes sense to me, @kthota-g can you review?
Currently, the examples field is the mechanism to hint to the client about the expected input types. An example is here: https://a2aproject.github.io/A2A/latest/specification/#56-sample-agent-card. If any attributes needed for the agent to complete the task are missing, the agent can respond with input-required to negotiate the required inputs. Also, different clients can request the output in varying formats and can specify the schema for the output: https://a2aproject.github.io/A2A/latest/specification/#97-structured-data-exchange-requesting-and-providing-json
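To illustrate the current mechanism, here is a rough TypeScript sketch (reusing the AgentSkill shape sketched earlier; all identifiers and values are invented) of a skill that hints at its inputs only through examples and media types:

const exchangeRateSkill: AgentSkill = {
  id: "convert-currency",
  name: "Currency Exchange Rates",
  description: "Looks up exchange rates between two currencies.",
  tags: ["currency"],
  // Examples are the current hint about what inputs the skill expects.
  examples: ["What is the exchange rate between USD and GBP?"],
  inputModes: ["text/plain"],
  // A client that wants structured output can request JSON and describe the
  // desired schema in its message, per the structured data exchange section
  // linked above.
  outputModes: ["application/json"],
};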
Could the Any message be used here?

message Any {
// A URL/resource name that uniquely identifies the type of the serialized
// protocol buffer message. This string must contain at least
// one "/" character. The last segment of the URL's path must represent
// the fully qualified name of the type (as in
// `path/google.protobuf.Duration`). The name should be in a canonical form
// (e.g., leading "." is not accepted).
//
// In practice, teams usually precompile into the binary all types that they
// expect it to use in the context of Any. However, for URLs which use the
// scheme `http`, `https`, or no scheme, one can optionally set up a type
// server that maps type URLs to message definitions as follows:
//
// * If no scheme is provided, `https` is assumed.
// * An HTTP GET on the URL must yield a [google.protobuf.Type][]
// value in binary format, or produce an error.
// * Applications are allowed to cache lookup results based on the
// URL, or have them precompiled into a binary to avoid any
// lookup. Therefore, binary compatibility needs to be preserved
// on changes to types. (Use versioned type names to manage
// breaking changes.)
//
// Note: this functionality is not currently available in the official
// protobuf release, and it is not used for type URLs beginning with
// type.googleapis.com. As of May 2023, there are no widely used type server
// implementations and no plans to implement one.
//
// Schemes other than `http`, `https` (or the empty scheme) might be
// used with implementation specific semantics.
//
string type_url = 1;
// Must be a valid serialized protocol buffer of the above specified type.
bytes value = 2;
}

In line with how Google uses common (well-known and common) protos to communicate well-understood data structures, the Part definition could then be extended to simplify the handling of these:

// Part represents a container for a section of communication content.
// Parts can be purely textual, some sort of file (image, video, etc) or
// a structured data blob (i.e. JSON).
message Part {
oneof part {
string text = 1;
FilePart file = 2;
DataPart data = 3;
}
}
// FilePart represents the different ways files can be provided. If files are
// small, directly feeding the bytes is supported via file_with_bytes. If the
// file is large, the agent should read the content as appropriate directly
// from the file_with_uri source.
message FilePart {
oneof file {
string file_with_uri = 1;
bytes file_with_bytes = 2;
}
string mime_type = 3;
}
// DataPart represents a structured blob. This is most commonly a JSON payload.
message DataPart {
google.protobuf.Struct data = 1;
google.protobuf.Any input = 2;
google.protobuf.Any output = 3;
}

Or even this:

// DataPart represents a structured object. This is most commonly a JSON payload.
message DataPart {
google.protobuf.Any data = 1;
}
While I agree that there is some opportunity for improvement in how the inputs and outputs of skills can be described, I think the current approach of providing media type identifiers and examples is a good compromise. Trying to define a new schema language for input parts is a non-trivial task, and one that no LLM is going to understand implicitly. As crude as it appears, the use of examples is quite effective with LLMs. I sympathize with the desire here, but I don't think the proposed solution will make the situation better. It does, however, raise the question of whether there should be a metadata property in skills so that extensions could be defined to enable more strictly controlled interactions.
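To make that last suggestion concrete, here is a purely hypothetical TypeScript sketch of a skill whose stricter input/output contracts are carried by an extension through a generic metadata property; none of these property names or URIs exist in the current spec:

// Hypothetical: an extension (identified by an invented URI) attaches JSON Schemas
// for inputs and outputs via a generic metadata bag, leaving the core AgentSkill
// definition unchanged.
const invoiceSkill = {
  id: "invoice-extractor",
  name: "Invoice Extractor",
  description: "Extracts structured fields from uploaded invoices.",
  tags: ["documents"],
  inputModes: ["application/pdf"],
  outputModes: ["application/json"],
  metadata: {
    "urn:example:io-contract/v1": {
      input: {
        type: "object",
        properties: { invoice: { type: "string", format: "uri" } },
        required: ["invoice"],
      },
      output: {
        type: "object",
        properties: { total: { type: "number" }, currency: { type: "string" } },
      },
    },
  },
};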
Description
This PR introduces two new optional fields to the AgentSkill model - inputFields and outputFields - allowing agents to define structured, named input and output fields with detailed type and schema information. These fields complement the existing inputModes and outputModes, which specify supported mime types.

Problem
Currently, agent skills declare only inputModes and outputModes as lists of mime types. While this indicates supported data formats (e.g., "text/plain", "image/png"), it does not convey the structure or semantics of the input/output data, such as named fields or their types. In some use cases, this makes it difficult for clients to build precise interfaces, validate inputs ahead of time, or drive skills with multiple inputs and outputs.

Solution
This PR adds a new optional layer by defining a FieldDefinition model, which lets agents declare each field's name, its kind (text, file, data), accepted mime types, an optional schema, a description, and whether the field is optional. The AgentSkill model is extended with two new optional lists: inputFields and outputFields. These provide a semantic, machine-readable description of the skill interface, while retaining backward compatibility by keeping inputModes and outputModes. An illustrative example is sketched below.

Benefits
Clients get a machine-readable description of a skill's interface, making it easier to build precise interfaces, improve validation, and support complex skills with multiple inputs and outputs.
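For illustration, a skill using the proposed fields might be declared roughly as follows; the shape mirrors the FieldDefinition sketch earlier in this thread, and every identifier and value is invented rather than copied from the PR's updated examples:

const summarizerSkill: AgentSkill = {
  id: "document-summarizer",
  name: "Document Summarizer",
  description: "Summarizes an uploaded document into a short abstract.",
  tags: ["documents", "summarization"],
  inputModes: ["application/pdf", "text/plain"],
  outputModes: ["application/json"],
  inputFields: [
    {
      name: "document",
      kind: "file",
      mimeTypes: ["application/pdf", "text/plain"],
      description: "The document to summarize.",
    },
    {
      name: "maxSentences",
      kind: "data",
      optional: true,
      schema: { type: "integer", minimum: 1 },
      description: "Upper bound on the number of sentences in the summary.",
    },
  ],
  outputFields: [
    {
      name: "summary",
      kind: "data",
      schema: { type: "object", properties: { text: { type: "string" } } },
      description: "The generated summary.",
    },
  ],
};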
Closes #813
Idea emerged from discussion in: #462