feat: Modify endpoints for OpenAPI compatibility #1340
base: develop
Conversation
Force-pushed from e5ac825 to f494cf2
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #1340 +/- ##
===========================================
+ Coverage 71.66% 71.70% +0.04%
===========================================
Files 171 171
Lines 17015 17071 +56
===========================================
+ Hits 12193 12240 +47
- Misses 4822 4831 +9
Flags with carried forward coverage won't be shown.
Please don't merge; we want to refactor this a little more.
nemoguardrails/server/api.py
Outdated
    default=None,
    description="A state object that should be used to continue the interaction.",
)
# Standard OpenAI completion parameters
Let's bring the OpenAI schema into a separate file, perhaps server/schemas/openai.
nemoguardrails/server/api.py
Outdated
    default=None,
    description="Top-p sampling parameter.",
)
stop: Optional[str] = Field(
stop needs to be Optional[Union[str, List[str]]]
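To illustrate why the broader type matters, here is a small hedged helper (not from the PR) that normalizes the two shapes OpenAI's API accepts for stop sequences:

```python
from typing import List, Optional, Union


def normalize_stop(stop: Optional[Union[str, List[str]]]) -> List[str]:
    """Normalize an OpenAI-style `stop` value.

    OpenAI's chat completions API accepts either a single stop string or a
    list of stop strings, which is why the field must be typed as
    Optional[Union[str, List[str]]] rather than Optional[str].
    """
    if stop is None:
        return []
    if isinstance(stop, str):
        return [stop]
    return list(stop)


print(normalize_stop("\n"))           # ['\n']
print(normalize_stop(["\n", "END"]))  # ['\n', 'END']
```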
nemoguardrails/server/api.py
Outdated
index: Optional[int] = Field(
    default=None, description="The index of the choice in the list of choices."
)
messages: Optional[dict] = Field(
typo: messages needs to be message (no s)
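As a minimal sketch of the corrected shape (an illustration, not the PR's actual class): each choice carries a singular `message` object, never a `messages` list.

```python
def make_choice(index: int, content: str, finish_reason: str = "stop") -> dict:
    """Build one entry of an OpenAI-style `choices` array.

    Note the singular `message` key: each choice holds exactly one
    message object, not a `messages` list.
    """
    return {
        "index": index,
        "message": {"role": "assistant", "content": content},
        "finish_reason": finish_reason,
    }


choice = make_choice(0, "Hello!")
print(choice["message"]["content"])  # Hello!
```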
Force-pushed from 2ec47ee to 96f759f
Greptile Overview
Greptile Summary
This PR adds OpenAI API compatibility to NeMo Guardrails by introducing a new /v1/models endpoint and modifying /v1/chat/completions to return OpenAI-compatible response structures.
Key Changes:
- Created nemoguardrails/server/schemas/openai.py with Pydantic models for OpenAI-compatible requests and responses
- Modified RequestBody to inherit from OpenAIRequestFields, mapping the model parameter to config_id for backward compatibility
- Updated ResponseBody to include OpenAI standard fields (id, object, created, choices) while maintaining NeMo-specific extensions (state, llm_output, log)
- Added a /v1/models endpoint that lists available guardrails configurations in OpenAI model format
- Implemented Colang 2.x state serialization support in runtime.py
- Updated all tests to verify the new response structure
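A request against the modified endpoint might look like the following sketch; the config name "my_config" and the parameter values are illustrative assumptions, not taken from the PR. The server maps `model` onto its internal `config_id`.

```python
import json

# Hypothetical OpenAI-compatible request body for /v1/chat/completions.
# "my_config" stands in for a guardrails configuration id.
payload = {
    "model": "my_config",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.2,
    "max_tokens": 128,
}

print(json.dumps(payload, indent=2))
```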
Issues Found:
- Critical: OpenAI parameters (max_tokens, temperature, etc.) are only applied when thread_id is present, due to incorrect indentation (nemoguardrails/server/api.py:460-472)
- Syntax Error: Test assertion uses the wrong response path in tests/test_threads.py:143
Confidence Score: 2/5
- This PR should NOT be merged until the critical indentation bug is fixed - OpenAI parameters won't work without thread_id
- Score reflects a critical logic error that breaks OpenAI parameter functionality for non-threaded requests, plus a test syntax error that will cause test failures
- Pay immediate attention to nemoguardrails/server/api.py (lines 460-472 need unindenting) and tests/test_threads.py (line 143 needs correction)
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| nemoguardrails/server/schemas/openai.py | 5/5 | New file defining OpenAI-compatible schemas. Well-structured with proper Pydantic models for requests and responses. |
| nemoguardrails/server/api.py | 2/5 | Adds /v1/models endpoint and modifies /v1/chat/completions for OpenAI compatibility. Critical bug: OpenAI parameters only applied when using thread_id due to incorrect indentation. |
| tests/test_threads.py | 3/5 | Updates thread tests for new response format. Contains syntax error on line 143 with incorrect response path. |
Sequence Diagram
sequenceDiagram
participant Client
participant API as /v1/chat/completions
participant RequestBody
participant DataStore
participant LLMRails
participant ResponseBody
Client->>API: POST with model/config_id + messages
API->>RequestBody: Validate request
RequestBody->>RequestBody: Map model → config_id
RequestBody->>RequestBody: Apply OpenAI parameters
alt thread_id provided
API->>DataStore: Fetch thread messages
DataStore-->>API: Previous messages
API->>API: Prepend thread history
end
alt streaming enabled
API->>LLMRails: generate_async (streaming)
LLMRails-->>Client: StreamingResponse
else non-streaming
API->>LLMRails: generate_async
LLMRails-->>API: GenerationResponse
alt thread_id provided
API->>DataStore: Save updated thread
end
API->>ResponseBody: Build OpenAI response
ResponseBody->>ResponseBody: Add choices, message, id, created
ResponseBody->>ResponseBody: Include NeMo extensions (state, log)
ResponseBody-->>Client: JSON response
end
6 files reviewed, 2 comments
nemoguardrails/server/api.py
Outdated
generation_options = body.options
if body.max_tokens:
    generation_options.max_tokens = body.max_tokens
if body.temperature is not None:
    generation_options.temperature = body.temperature
if body.top_p is not None:
    generation_options.top_p = body.top_p
if body.stop:
    generation_options.stop = body.stop
if body.presence_penalty is not None:
    generation_options.presence_penalty = body.presence_penalty
if body.frequency_penalty is not None:
    generation_options.frequency_penalty = body.frequency_penalty
logic: OpenAI parameters only applied when using thread_id - these lines are incorrectly indented inside the if body.thread_id: block (starts at line 428). Move this block outside and before the streaming check to ensure parameters work for all requests.
Suggested change (move the same block out of the if body.thread_id: branch, before the streaming check):
generation_options = body.options
if body.max_tokens:
    generation_options.max_tokens = body.max_tokens
if body.temperature is not None:
    generation_options.temperature = body.temperature
if body.top_p is not None:
    generation_options.top_p = body.top_p
if body.stop:
    generation_options.stop = body.stop
if body.presence_penalty is not None:
    generation_options.presence_penalty = body.presence_penalty
if body.frequency_penalty is not None:
    generation_options.frequency_penalty = body.frequency_penalty
if (
tests/test_threads.py
Outdated
)
res = response.json()
assert res["messages"][0]["content"] == "Hello again!"
assert res["choices"]["messages"][0]["content"] == "Hello again!"
syntax: Incorrect response path - should be res["choices"][0]["message"]["content"] to match OpenAI format
Suggested change:
- assert res["choices"]["messages"][0]["content"] == "Hello again!"
+ assert res["choices"][0]["message"]["content"] == "Hello again!"
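For reference, a minimal sketch of the OpenAI response shape the corrected assertion expects (field values are illustrative): `choices` is a list, and each element holds a singular `message` dict.

```python
# Minimal sketch of an OpenAI-style chat completion response; the id and
# timestamp are made up. The correct access path is
# res["choices"][0]["message"]["content"].
res = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1700000000,
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello again!"},
            "finish_reason": "stop",
        }
    ],
}

assert res["choices"][0]["message"]["content"] == "Hello again!"
```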
Force-pushed from 96f759f to b411d61
Greptile Overview
Greptile Summary
This PR adds OpenAI API compatibility by introducing a new /v1/models endpoint and modifying /v1/chat/completions to return OpenAI-compatible response structures. The changes include new schema definitions, test updates, and enhanced state handling for Colang 2.x.
Critical Issues Found:
- Import/syntax errors: the old ResponseBody class was not removed, causing conflicts with the new OpenAI-compatible version and referencing missing imports (BaseModel, GenerationLog)
- Logic bugs: OpenAI parameters (max_tokens, temperature, etc.) are set but never used; both streaming and non-streaming paths pass body.options instead of the modified generation_options
- Test error: incorrect array indexing in test_threads.py line 143
Major Changes:
- Added a /v1/models endpoint that lists available guardrails configurations in OpenAI format
- Restructured the response format to include a choices array with OpenAI-compatible structure while maintaining backward compatibility with NeMo-Guardrails fields (state, llm_output, etc.)
- Enhanced the Colang 2.x runtime to deserialize state from API calls
- Updated all tests to use the new response structure
Confidence Score: 1/5
- This PR has critical syntax errors that will prevent the module from loading and logic bugs that break core functionality
- Score reflects three critical issues: (1) syntax errors from conflicting ResponseBody classes and missing imports will cause immediate module load failures, (2) logic bug where OpenAI parameters are set but never used means the API won't work as intended, (3) test syntax error. These must be fixed before merge.
- Pay immediate attention to nemoguardrails/server/api.py (has syntax and logic errors) and tests/test_threads.py (has a syntax error)
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| nemoguardrails/server/api.py | 1/5 | Critical syntax errors (old ResponseBody class conflicts with import) and logic bugs (OpenAI parameters not applied to generation calls) |
| nemoguardrails/server/schemas/openai.py | 5/5 | New OpenAI schema definitions - well structured with proper field descriptions and backward compatibility fields |
| tests/test_threads.py | 2/5 | Updated to new response format but has incorrect array indexing on line 143 |
Sequence Diagram
sequenceDiagram
participant Client
participant API as FastAPI Server
participant Rails as LLMRails
participant Datastore
participant LLM
Client->>API: POST /v1/chat/completions
Note over API: RequestBody with model/config_id mapping
API->>API: Validate thread_id (if provided)
alt Thread ID provided
API->>Datastore: get("thread-{thread_id}")
Datastore-->>API: Previous messages
API->>API: Prepend thread messages
end
API->>API: Set generation_options from OpenAI params
Note over API: max_tokens, temperature, top_p, etc.
alt Streaming mode
API->>Rails: generate_async(messages, streaming_handler, options)
Rails->>LLM: Generate with streaming
LLM-->>API: Stream tokens
API-->>Client: StreamingResponse
else Non-streaming mode
API->>Rails: generate_async(messages, options, state)
Rails->>LLM: Generate completion
LLM-->>Rails: Response
Rails-->>API: GenerationResponse
alt Thread ID provided
API->>Datastore: set("thread-{thread_id}", updated messages)
end
API->>API: Build OpenAI-compatible ResponseBody
Note over API: Add choices array with message
API-->>Client: ResponseBody (OpenAI format + extensions)
end
Client->>API: GET /v1/models
API->>API: List config directories
API->>API: Convert to OpenAI Model format
API-->>Client: ModelsResponse
Additional Comments (3)
- nemoguardrails/server/api.py, lines 263-281: syntax: the old ResponseBody class should be removed; it conflicts with the new OpenAI-compatible ResponseBody imported from openai.py (line 43). This class also references BaseModel and GenerationLog, which are no longer imported, causing NameErrors when the module loads.
- nemoguardrails/server/api.py, line 538: logic: OpenAI parameters are not applied in streaming mode; should use generation_options instead of body.options
- nemoguardrails/server/api.py, line 546: logic: OpenAI parameters are not applied; should use generation_options instead of body.options
6 files reviewed, 4 comments
tests/test_threads.py
Outdated
)
res = response.json()
assert res["messages"][0]["content"] == "Hello again!"
assert res["choices"]["message"][0]["content"] == "Hello again!"
syntax: Wrong array index placement - choices is an array, not a dict
Suggested change:
- assert res["choices"]["message"][0]["content"] == "Hello again!"
+ assert res["choices"][0]["message"]["content"] == "Hello again!"
Force-pushed from b411d61 to 8d809d4
Greptile Overview
Greptile Summary
This PR successfully refactors the NeMo Guardrails server to support OpenAI-compatible API endpoints.
Key Changes:
- Adds a new /v1/models endpoint that lists available guardrails configurations in OpenAI model format
- Refactors the /v1/chat/completions response to use OpenAI's choices array structure with Choice objects
- Maps OpenAI request fields (model, max_tokens, temperature, top_p, stop, presence_penalty, frequency_penalty) to internal llm_params
- Maintains backward compatibility by including NeMo-specific fields (state, llm_output, output_data, log) in responses
- Enables Colang 2.x state deserialization from dict format for stateful conversations
- Updates all tests to validate the new response structure
Architecture:
The PR introduces a clean separation of concerns by creating nemoguardrails/server/schemas/openai.py with OpenAI-specific models, while extending RequestBody to inherit from OpenAIRequestFields. The model field is automatically mapped to config_id via a validator, and OpenAI parameters are converted to llm_params before passing to the LLM generation.
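The model-to-config_id mapping can be illustrated with a plain function; the PR implements this as a Pydantic validator on RequestBody, so this is only a sketch of the same rule: `model` fills in `config_id` when the caller didn't set one explicitly.

```python
def apply_model_alias(body: dict) -> dict:
    """Sketch of the `model` -> `config_id` mapping described above.

    `config_id` set explicitly by the caller always wins; `model` is
    used only as a fallback alias.
    """
    out = dict(body)
    if out.get("model") and not out.get("config_id"):
        out["config_id"] = out["model"]
    return out


print(apply_model_alias({"model": "my_config"}))
# {'model': 'my_config', 'config_id': 'my_config'}
```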
Confidence Score: 5/5
- Safe to merge - well-structured refactoring with comprehensive test coverage and backward compatibility
- The implementation is clean and follows OpenAI's API specification correctly. Previous indentation issues have been resolved. All tests are updated appropriately, and the changes maintain backward compatibility by keeping NeMo-specific fields in responses.
- No files require special attention - all changes are well-implemented
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| nemoguardrails/server/schemas/openai.py | 5/5 | New file defining OpenAI-compatible schema classes (OpenAIRequestFields, Choice, ResponseBody, Model, ModelsResponse) - well-structured with clear field descriptions |
| nemoguardrails/server/api.py | 4/5 | Adds /v1/models endpoint and refactors /v1/chat/completions to use OpenAI-compatible response format, maps OpenAI parameters to llm_params |
| nemoguardrails/colang/v2_x/runtime/runtime.py | 5/5 | Adds state deserialization support for Colang 2.x from dict format using json_to_state - enables state continuity in API calls |
Sequence Diagram
sequenceDiagram
participant Client
participant API as FastAPI Server
participant RequestBody
participant RailsConfig
participant LLMRails
participant DataStore
participant LLM as Language Model
Client->>API: POST /v1/chat/completions
API->>RequestBody: Validate request (OpenAI params)
RequestBody->>RequestBody: Map model → config_id
RequestBody->>RequestBody: Map OpenAI params → llm_params
alt thread_id provided
API->>DataStore: Fetch thread messages
DataStore-->>API: Return thread history
API->>API: Prepend thread messages
end
API->>RailsConfig: Load config by config_id
RailsConfig-->>API: Return LLMRails instance
alt streaming enabled
API->>LLMRails: generate_async (streaming)
LLMRails->>LLM: Stream tokens
LLM-->>Client: Server-sent events
else non-streaming
API->>LLMRails: generate_async (options with llm_params)
LLMRails->>LLM: Generate with parameters
LLM-->>LLMRails: Response
LLMRails-->>API: GenerationResponse
alt thread_id provided
API->>DataStore: Save updated thread messages
end
API->>API: Build ResponseBody (OpenAI format)
API-->>Client: Return choices with message
end
Client->>API: GET /v1/models
API->>RailsConfig: List available configs
RailsConfig-->>API: config_ids
API->>API: Convert to Model objects
API-->>Client: ModelsResponse with data array
6 files reviewed, no comments
Force-pushed from 8d809d4 to 59c1644
Greptile Overview
Greptile Summary
Modified the /v1/chat/completions endpoint and added /v1/models to be OpenAI-compatible while maintaining backward compatibility with NeMo-Guardrails specific features.
Major changes:
- Introduced new OpenAI schema definitions (openai.py) with OpenAIRequestFields, ResponseBody, Choice, Model, and ModelsResponse classes
- Updated RequestBody to inherit from OpenAIRequestFields, mapping the model field to config_id for backward compatibility
- Modified the response format to match the OpenAI spec with id, object, created, model, and a choices array structure
- Added a /v1/models endpoint that lists available guardrails configurations as OpenAI-compatible models
- OpenAI parameters (max_tokens, temperature, top_p, stop, presence_penalty, frequency_penalty) are now mapped to llm_params
- Enhanced the Colang 2.x runtime to deserialize state dicts from API calls
- Updated all tests to work with the new response format
- Error responses now return an OpenAI-compatible structure with finish_reason="error"
Backward compatibility:
- NeMo-Guardrails specific fields (state, llm_output, output_data, log) are preserved in responses
- The existing config_id parameter continues to work alongside the new model parameter
- The options parameter and all existing functionality remain unchanged
Confidence Score: 5/5
- This PR is safe to merge with excellent test coverage and proper backward compatibility
- The implementation is well-designed with comprehensive test coverage across all endpoints and response formats. The code properly maintains backward compatibility by preserving all existing NeMo-Guardrails fields while adding OpenAI compatibility. The OpenAI parameter mapping is correctly implemented outside the thread_id block (as noted in previous comments). State deserialization is properly handled with appropriate error checking. All tests have been updated to reflect the new response structure.
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| nemoguardrails/server/schemas/openai.py | 5/5 | New file introducing OpenAI-compatible schema definitions for request/response formats, well-structured with proper field descriptions |
| nemoguardrails/server/api.py | 4/5 | Modified to support OpenAI compatibility: added /v1/models endpoint, updated response format to match OpenAI spec, and added parameter mapping |
| nemoguardrails/colang/v2_x/runtime/runtime.py | 5/5 | Added state deserialization from dict format to support API state continuation, properly handles serialized state from API calls |
Sequence Diagram
sequenceDiagram
participant Client
participant API as FastAPI Server
participant RequestBody
participant Rails as LLMRails
participant DataStore
participant Runtime as RuntimeV2_x
Client->>API: POST /v1/chat/completions
Note over Client,API: OpenAI-compatible request<br/>{model, messages, temperature, etc}
API->>RequestBody: Validate & parse request
Note over RequestBody: Map `model` → `config_id`<br/>Map OpenAI params → llm_params
RequestBody->>API: body.options.llm_params populated
alt thread_id provided
API->>DataStore: get(thread-{thread_id})
DataStore-->>API: Previous messages
Note over API: Prepend thread messages
end
alt state provided
API->>Runtime: Pass state dict
Runtime->>Runtime: json_to_state(state["state"])
Note over Runtime: Deserialize state for Colang 2.x
end
alt streaming enabled
API->>Rails: generate_async(messages, options, state)
Rails-->>API: StreamingResponse
API-->>Client: Server-sent events stream
else non-streaming
API->>Rails: generate_async(messages, options, state)
Rails-->>API: GenerationResponse
Note over API: Build OpenAI-compatible response<br/>{id, object, created, model, choices}
alt thread_id provided
API->>DataStore: set(thread-{thread_id}, updated_messages)
end
API-->>Client: ResponseBody with choices array
end
Client->>API: GET /v1/models
API->>API: List config directories
Note over API: Convert configs → Model objects
API-->>Client: ModelsResponse {data: [models]}
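The /v1/models flow in the diagram above can be sketched as below. Field values such as owned_by are assumptions for illustration, not taken from the PR.

```python
import time


def list_models(config_ids):
    """Sketch of a /v1/models payload: each guardrails configuration is
    exposed as an OpenAI-style model object inside a list response."""
    return {
        "object": "list",
        "data": [
            {
                "id": config_id,
                "object": "model",
                "created": int(time.time()),
                "owned_by": "nemoguardrails",  # assumed owner string
            }
            for config_id in config_ids
        ],
    }


print([m["id"] for m in list_models(["config_a", "config_b"])["data"]])
# ['config_a', 'config_b']
```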
6 files reviewed, no comments
Description
This PR modifies the existing /v1/chat/completions endpoint and adds a /v1/models endpoint to be OpenAI compatible.
Changes:
- nemoguardrails/server/api.py: introduces new classes that correspond to OpenAI's standard chat completion and models API
- tests/test_api.py, tests/test_server_calls_with_state.py & tests/test_threads: updates the message content path according to the new ResponseBody class and adds new tests for the /v1/models API
Related Issue(s)
Checklist