
Conversation


@galkleinman galkleinman commented Aug 14, 2025

Resolves #642

Important

Updated OpenAI SDK to 5.x, improved error handling, testing, and dependency management in OpenAI instrumentation.

  • Behavior:
    • Updated OpenAI SDK to version 5.12.2 in package.json.
    • Enhanced manual instrumentation in OpenAIInstrumentation to support OpenAI 5.x.
    • Improved handling of tool_calls and function_calls in instrumentation.ts.
  • Error Handling:
    • Added comprehensive error handling in instrumentation.test.ts for Polly.js.
    • Improved runtime error handling for tool_calls in instrumentation.ts.
  • Testing:
    • Removed HAR network recording fixtures in instrumentation.test.ts.
    • Updated tests to be more resilient with relaxed assertions and guarded setup.
  • Dependencies:
    • Added @types/node-fetch and node-fetch as dependencies in package.json.
    • Updated pnpm-workspace.yaml with dependency overrides for form-data and tmp.

This description was created by Ellipsis for 86d5d21. You can customize this summary. It will automatically update as commits are pushed.


Summary by CodeRabbit

  • New Features
    • Expanded compatibility with newer OpenAI API/SDK versions (v4–v5+).
  • Bug Fixes
    • More reliable streaming behavior and tool-call handling.
    • Prevented incorrect span attributes for non-function tool calls.
  • Tests
    • Updated and added HAR recordings for chat, streaming, images, and logprobs scenarios.
    • Stabilized tests with improved Polly configuration and extended timeouts for image workflows.
  • Chores
    • Dependency updates and workspace overrides to address transitive package versions (form-data, tmp).
    • Updated client/runtime integrations to align with latest OpenAI SDK.


coderabbitai bot commented Aug 14, 2025

Walkthrough

Refactors OpenAI instrumentation for v4–v5+ compatibility, adjusts promise/stream handling and tool-call guards, updates tests and Polly setup, relocates/refreshes HAR recordings under test/recordings, and bumps dependencies (OpenAI, node-fetch). Adds workspace overrides for transitive deps.

Changes

Cohort / File(s) Summary
Core instrumentation update
packages/instrumentation-openai/src/instrumentation.ts
Introduces APIPromiseType unification, updates wrapping via _thenUnwrap, broadens version range (>=4 <6), changes manuallyInstrument(module: unknown), hardens tool/function guards, and tweaks streaming tool_call handling and span emission.
Test harness adjustments
packages/instrumentation-openai/test/instrumentation.test.ts
Extends Polly config, injects node-fetch into OpenAI client, enables instrumentation, guards Polly hooks, relaxes tool-call assertions, moves image calls to imageOpenai, and increases timeouts.
Dependency and workspace overrides
packages/instrumentation-openai/package.json, pnpm-workspace.yaml
Upgrades openai devDependency to 5.12.2; adds node-fetch and @types/node-fetch; adds overrides to force form-data >=4.0.4 and tmp >=0.2.4.
Removed legacy recordings
packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/.../recording.har
Deletes multiple Polly HAR fixtures for image edit/variation/generation and streaming/chat tests from the old recordings directory.
Added/updated test recordings
packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/.../recording.har
Adds and refreshes HAR fixtures for chat/completions (incl. streaming and logprobs) and image generation; updates headers, payloads, timings for OpenAI JS 5.12.2.

Sequence Diagram(s)

sequenceDiagram
  participant App
  participant Instrumentation
  participant OpenAI_SDK
  participant Tracer

  App->>Instrumentation: manuallyInstrument(module)
  Instrumentation-->>OpenAI_SDK: Patch methods (chat/completions/images)
  App->>OpenAI_SDK: invoke API (may stream)
  Instrumentation->>Tracer: start span
  OpenAI_SDK-->>Instrumentation: APIPromiseType (wrapped via _thenUnwrap)
  alt Streaming
    loop chunks
      OpenAI_SDK-->>Instrumentation: chunk (delta/tool_calls/logprobs)
      Instrumentation->>Tracer: record events/attrs (guard non-function tools)
    end
  else Non-stream
    OpenAI_SDK-->>Instrumentation: result
    Instrumentation->>Tracer: set attributes
  end
  Instrumentation->>Tracer: end span
  Instrumentation-->>App: result/stream proxy

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • nirga
  • doronkopit5

Poem

Hoppity logs and spans align,
I nibble bytes—compat divine.
Streams now flow, with tools in tow,
Promises unwrap—onward we go!
New HAR burrows, old ones fade,
Thump-thump: a stable trace parade.
🥕✨



@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to e0c4d82 in 2 minutes and 13 seconds.
  • Reviewed 2716 lines of code in 17 files
  • Skipped 1 file when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-openai/test/instrumentation.test.ts:116
  • Draft comment:
    Token usage attributes are compared in some tests as strings (e.g., "15" in the chat test) and in others as numbers (e.g., 82 in the function-calling test). Consider normalizing attribute types to avoid type inconsistencies.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. packages/instrumentation-openai/test/instrumentation.test.ts:825
  • Draft comment:
    Hard-coded expected token counts (e.g., 1056 tokens for DALL-E 2 and 4160 tokens for DALL-E 3 HD) may be brittle. Consider using ranges or thresholds, or documenting why these exact numbers are expected, given that changes in the underlying tokenization library could affect these values.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. packages/instrumentation-openai/test/instrumentation.test.ts:182
  • Draft comment:
    Test for 'streaming chat with new API' is skipped. It might be helpful to include a comment on why it’s skipped and under what conditions it should be re-enabled.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. packages/instrumentation-openai/test/instrumentation.test.ts:676
  • Draft comment:
    The tests that check function and tool calling attributes contain several repeated assertions. Consider refactoring common span attribute validations into helper functions to reduce duplication and improve maintainability.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_5112oke7OkElsm7V

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.


@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed a7153ab in 51 seconds.
  • Reviewed 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 draft comment. View it below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-openai/src/instrumentation.ts:122
  • Draft comment:
    Consider adding a comment explaining why the version range is capped at <6. This ensures clarity on supported OpenAI SDK versions.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50%

Workflow ID: wflow_rrFVTXRPiVVxzOkr

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

🔭 Outside diff range comments (3)
packages/instrumentation-openai/src/instrumentation.ts (1)

483-497: Tool-call accumulation: assign stable fields (id, type), append only to arguments.

  • id should be assigned (not concatenated).
  • type should be assigned (already fixed here).
  • name may stream in segments, but most servers send it once. To be safe, set on first occurrence, append otherwise.
  • arguments should be concatenated, as it streams chunk by chunk.
           if (result.choices[0].message.tool_calls) {
             if (toolCall.id) {
-              result.choices[0].message.tool_calls[toolCall.index].id +=
-                toolCall.id;
+              (result.choices[0].message.tool_calls[toolCall.index] as any).id =
+                toolCall.id;
             }
             if (toolCall.type) {
               result.choices[0].message.tool_calls[toolCall.index].type =
                 toolCall.type;
             }
             if (toolCall.function?.name) {
-              (result.choices[0].message.tool_calls[toolCall.index] as any).function.name += toolCall.function.name;
+              const target = (result.choices[0].message.tool_calls[toolCall.index] as any);
+              if (!target.function.name) {
+                target.function.name = toolCall.function.name;
+              } else {
+                target.function.name += toolCall.function.name;
+              }
             }
             if (toolCall.function?.arguments) {
-              (result.choices[0].message.tool_calls[toolCall.index] as any).function.arguments += toolCall.function.arguments;
+              (result.choices[0].message.tool_calls[toolCall.index] as any).function.arguments +=
+                toolCall.function.arguments;
             }
           }
packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-function-calling_1703714575/recording.har (1)

82-85: Add a 5.x “tools” function-calling recording alongside legacy “functions”.

OpenAI 5.x uses tools: [{ type: 'function', function: { ... } }] and often the Responses API. To truly cover 5.x, add a HAR recording that exercises tools (non-streaming and streaming) so instrumentation spans/attributes are validated on the new shape.

Example request body you can capture:

{
  "model": "gpt-4o-mini",
  "messages": [{ "role": "user", "content": "What's the weather like in Boston?" }],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather in a given location",
      "parameters": {
        "type": "object",
        "properties": {
          "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" },
          "unit": { "type": "string", "enum": ["celsius", "fahrenheit"] }
        },
        "required": ["location"]
      }
    }
  }]
}

If you’ve already added such tests elsewhere, ignore this.

packages/instrumentation-openai/test/instrumentation.test.ts (1)

79-81: Ensure OpenAI v5.x Is Exercised in CI

Our inspection shows that every openai dependency is still on v4 and all HAR fixtures record “OpenAI/JS 4.x” in their user-agent. To truly validate v5 support, please address the following:

• Dependencies

  • packages/instrumentation-openai/package.json → "openai": "4.38.3"
  • packages/traceloop-sdk/package.json → "openai": "4.38.3"
  • packages/sample-app/package.json → "openai": "^4.104.0"

• HAR fixtures under
packages/instrumentation-openai/recordings still advertise user-agent v4

Recommendations:

  1. Bump openai to ^5.x in your package manifests (or introduce a v4/v5 matrix in CI).
  2. Re-record the HAR fixtures using the v5.x client so that user-agent reflects “OpenAI/JS 5.x”.
  3. Verify replay still passes under both major versions, if backward compatibility is required.
♻️ Duplicate comments (7)
packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har (2)

65-67: Prevent header-driven fixture churn (x-stainless-runtime-version).

Same note as other HARs: exclude volatile headers from Polly matching to avoid re-recording when Node/SDK varies.
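
As a sketch of that suggestion (option values are hypothetical; the actual Polly setup lives in instrumentation.test.ts), the exclusion could be expressed via matchRequestsBy. The small pure helper below mimics the effect of an exclude list so it can be sanity-checked without Polly itself:

```typescript
// Hypothetical Polly matchRequestsBy fragment: ignore volatile SDK/runtime
// headers so fixtures survive Node or OpenAI SDK version bumps.
const matchRequestsBy = {
  headers: {
    exclude: ["x-stainless-runtime-version", "x-stainless-lang", "user-agent"],
  },
};

// Approximates what header-based request matching would compare after
// the exclude list is applied (case-insensitive on header names).
function headersForMatching(
  headers: Record<string, string>,
  exclude: string[],
): Record<string, string> {
  const keep: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!exclude.includes(name.toLowerCase())) keep[name] = value;
  }
  return keep;
}

const matched = headersForMatching(
  {
    authorization: "Bearer test-token",
    "x-stainless-runtime-version": "v20.11.0",
    "content-type": "application/json",
  },
  matchRequestsBy.headers.exclude,
);
// matched keeps authorization and content-type only.
```

Note that disabling header matching entirely (headers: false) also achieves this, at the cost of matching on fewer request properties.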


90-96: Sanity-check base64 payload decodes to valid JSON.

Ensure the updated completion payload still decompresses to JSON; see the verification script in the other comment.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-function-calling_1703714575/recording.har (1)

90-96: Validate function-calling response payload integrity.

Same integrity check as other HARs; the base64 chunks should gunzip to valid JSON.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat_72904453/recording.har (2)

65-67: Exclude x-stainless-* headers from matching.

Streaming test fixtures also include volatile x-stainless-* headers; exclude them in Polly matchers to avoid re-recording.


189-193: Optional: mask set-cookie to reduce churn.

Same rationale as other HARs. If you keep it, prefer masking the value.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-calculate-correct-tokens-for-different-quality-levels_3292443992/recording.har (2)

90-96: Check JSON integrity for the first image-generation response.

Run the provided verification script to ensure the base64 payload decodes and parses cleanly.
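
A minimal sketch of such a verification, assuming the fixture stores a gzip-compressed JSON body as base64 chunks (the chunking scheme here is an assumption for illustration, not Polly's exact persistence format):

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Concatenate base64 chunks, gunzip, and JSON-parse; any corruption or
// truncation in the fixture payload surfaces as a thrown error here.
function decodeHarPayload(base64Chunks: string[]): unknown {
  const compressed = Buffer.concat(
    base64Chunks.map((chunk) => Buffer.from(chunk, "base64")),
  );
  return JSON.parse(gunzipSync(compressed).toString("utf8"));
}

// Round-trip a sample payload to exercise the check.
const body = { id: "chatcmpl-123", choices: [{ index: 0 }] };
const gz = gzipSync(Buffer.from(JSON.stringify(body)));
// Split the compressed bytes into two base64 chunks, as a fixture might.
const mid = Math.floor(gz.length / 2);
const chunks = [
  gz.subarray(0, mid).toString("base64"),
  gz.subarray(mid).toString("base64"),
];
const decoded = decodeHarPayload(chunks) as typeof body;
```

Running this over each recording's content payload is enough to catch silent fixture corruption.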


281-287: Check JSON integrity for the second image-generation response.

Same as above; verify the multi-kB payload isn’t truncated/corrupted.

🧹 Nitpick comments (9)
packages/instrumentation-openai/src/instrumentation.ts (4)

120-123: Loosening the OpenAI version range is OK; consider adding a soft upper bound.

Changing to ">=3.1.0" enables 5.x, but it may unintentionally pick future major versions with breaking APIs. If you want to be conservative, consider "<6" until you explicitly test 6.x.

-      [">=3.1.0"],
+      [">=3.1.0 <6"],

432-437: Avoid broad any; prefer a minimal structural type for the streaming accumulator.

Casting the message to any works but loses type safety. A narrow type keeps compiler help without over-constraining.

-            } as any,
+            } as unknown as {
+              role: "assistant";
+              content: string;
+              tool_calls: Array<{
+                id: string;
+                type: "function" | string;
+                function?: { name: string; arguments: string };
+              }>;
+              // function_call may appear in older tool_call variants
+              function_call?: { name: string; arguments: string };
+            },

456-467: Function-call (legacy) streaming likely needs incremental accumulation.

You only set function_call when both name and arguments are present in the same delta. In practice, arguments stream separately. Consider accumulating name/arguments across chunks as you do for tool_calls.

If supporting legacy function_call streaming matters for 5.x, mirror the tool_calls logic: initialize function_call once present, then append arguments across subsequent deltas.
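
A hedged sketch of that accumulation, using minimal stand-in types rather than the SDK's own:

```typescript
// Stand-in delta/accumulator types; the real SDK types differ.
interface FunctionCallDelta {
  name?: string;
  arguments?: string;
}

interface AccumulatedFunctionCall {
  name: string;
  arguments: string;
}

// Initialize the accumulator on the first delta that carries function_call
// data; afterwards append only what streamed in (name typically arrives
// once, arguments chunk by chunk).
function accumulateFunctionCall(
  current: AccumulatedFunctionCall | undefined,
  delta: FunctionCallDelta | undefined,
): AccumulatedFunctionCall | undefined {
  if (!delta) return current;
  const acc = current ?? { name: "", arguments: "" };
  if (delta.name) acc.name = acc.name ? acc.name + delta.name : delta.name;
  if (delta.arguments) acc.arguments += delta.arguments;
  return acc;
}

// Deltas as they might arrive: name first, then arguments in pieces.
let call: AccumulatedFunctionCall | undefined;
for (const delta of [
  { name: "get_current_weather" },
  { arguments: '{"location":' },
  { arguments: '"Boston, MA"}' },
]) {
  call = accumulateFunctionCall(call, delta);
}
// call → { name: "get_current_weather", arguments: '{"location":"Boston, MA"}' }
```

This mirrors the tool_calls aggregation above, minus the per-index bookkeeping that tool_calls need.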


702-711: Attribute emission guard is good; additionally check function existence.

To avoid potential undefined access, also ensure toolCall.function is truthy.

-              if (toolCall.type === 'function' && 'function' in toolCall) {
+              if (
+                toolCall.type === 'function' &&
+                'function' in toolCall &&
+                (toolCall as any).function
+              ) {
packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-chat_1427850401/recording.har (1)

97-107: Avoid persisting response.cookies unless tests assert on them.

The same cookie appears in both response.cookies and the set-cookie header. If tests don’t assert cookies, drop this block to reduce churn and PII surface.

Apply this diff if cookies aren’t used in assertions:

-          "cookies": [
-            {
-              "domain": ".api.openai.com",
-              "httpOnly": true,
-              "name": "_cfuvid",
-              "path": "/",
-              "sameSite": "None",
-              "secure": true,
-              "value": "ptcvgRCTCq1IFGlU5o6Np6S2Sc3_QtrOzQVs_FAUe.A-1755159619967-0.0.1.1-604800000"
-            }
-          ],
+          "cookies": [],

If you need the presence but not the values, mask them via a beforePersist hook instead.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-function-calling_1703714575/recording.har (1)

190-193: Mask set-cookie values to reduce fixture churn (optional).

If no test asserts on cookie content, mask the value to a constant to keep diffs minimal between re-recordings.

Apply:

-              "value": "_cfuvid=taZ4ULbzwwbyarAUfW9FCwZOX6yXkJ7aLgBL2DEyid4-1755159626839-0.0.1.1-604800000; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None"
+              "value": "_cfuvid=<masked>; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None"

Alternatively, mask via a Polly beforePersist hook.
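
One possible shape for that sanitization, assuming Polly hands beforePersist a HAR-like entry (the types below approximate, rather than reproduce, the @pollyjs persister shapes):

```typescript
interface HarHeader {
  name: string;
  value: string;
}

interface HarEntryLike {
  response: {
    cookies: unknown[];
    headers: HarHeader[];
  };
}

// Drop recorded cookies and mask set-cookie values so re-recordings
// don't churn on session identifiers like _cfuvid.
function sanitizeEntry(entry: HarEntryLike): HarEntryLike {
  entry.response.cookies = [];
  for (const header of entry.response.headers) {
    if (header.name.toLowerCase() === "set-cookie") {
      header.value = "<masked>";
    }
  }
  return entry;
}

// Hypothetical wiring inside the test harness:
//   polly.server.any().on("beforePersist", (_req, recording) => {
//     sanitizeEntry(recording);
//   });

const entry = sanitizeEntry({
  response: {
    cookies: [{ name: "_cfuvid", value: "session-id" }],
    headers: [
      { name: "content-type", value: "application/json" },
      { name: "set-cookie", value: "_cfuvid=abc; path=/" },
    ],
  },
});
```

Masking at persist time keeps every future re-recording diff-stable regardless of the live cookie values.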

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-calculate-correct-tokens-for-different-quality-levels_3292443992/recording.har (1)

158-165: Consider masking or dropping x-content-type-options/set-cookie in image tests.

Token-calculation tests shouldn’t depend on these response headers; masking/dropping them reduces fixture churn and file size.

Proposed minimal diff:

-            {
-              "name": "x-content-type-options",
-              "value": "nosniff"
-            },
-            {
-              "_fromType": "array",
-              "name": "set-cookie",
-              "value": "_cfuvid=duRL0kE4c3tQv8yDth8cAiYfQyZaMw8O8MpX1R4I88A-1755159649952-0.0.1.1-604800000; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None"
-            },
+            // removed volatile headers not used by the test

If any test asserts on these, skip this change.

packages/instrumentation-openai/test/instrumentation.test.ts (2)

417-420: Use strict numeric assertion for token counts.

Normalize the attribute to a number and assert strictly to avoid type drift across SDK versions.

-    assert.equal(
-      completionSpan.attributes[`${SpanAttributes.LLM_USAGE_PROMPT_TOKENS}`],
-      82,
-    );
+    assert.strictEqual(
+      Number(completionSpan.attributes[`${SpanAttributes.LLM_USAGE_PROMPT_TOKENS}`]),
+      82,
+    );

517-519: Prefer strict numeric assertion for prompt tokens.

Same rationale as above.

-    assert.equal(
-      completionSpan.attributes[`${SpanAttributes.LLM_USAGE_PROMPT_TOKENS}`],
-      82,
-    );
+    assert.strictEqual(
+      Number(completionSpan.attributes[`${SpanAttributes.LLM_USAGE_PROMPT_TOKENS}`]),
+      82,
+    );
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e240945 and e0c4d82.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (17)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-calculate-correct-tokens-for-different-quality-levels_3292443992/recording.har (12 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har (8 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-stream-chat-completion_2713905208/recording.har (7 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-chat_1427850401/recording.har (8 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har (7 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-function-calling_1703714575/recording.har (8 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-editing_4216676336/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-generation_2726543658/recording.har (6 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-variation_2237177737/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat-with-new-API_1429213105/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat_72904453/recording.har (7 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-completion_1484001317/recording.har (7 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-tool-calling_1747151373/recording.har (8 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-function_call-attributes-in-span-for-stream-completion-when-multiple-tools-cal_2278551756/recording.har (8 hunks)
  • packages/instrumentation-openai/src/instrumentation.ts (5 hunks)
  • packages/instrumentation-openai/test/instrumentation.test.ts (4 hunks)
  • pnpm-workspace.yaml (1 hunks)
💤 Files with no reviewable changes (3)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-variation_2237177737/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat-with-new-API_1429213105/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-editing_4216676336/recording.har
🧰 Additional context used
🧬 Code Graph Analysis (2)
packages/instrumentation-openai/src/instrumentation.ts (2)
packages/traceloop-sdk/src/lib/tracing/decorators.ts (1)
  • tool (276-282)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
packages/instrumentation-openai/test/instrumentation.test.ts (1)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
🔇 Additional comments (10)
packages/instrumentation-openai/src/instrumentation.ts (1)

368-381: Good guard: skip non-function tools to avoid crashes.

The additional checks around tool.type and tool.function are correct and prevent emitting invalid function attributes for non-function tools.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har (1)

64-67: No sensitive data found in HAR recordings

I re-scanned all .har files for Authorization headers, API keys, and Bearer tokens. The only hits were CORS header‐filter entries (e.g. Access-Control-Allow-Headers including “Authorization”), not actual secret values. No Authorization or secret material is persisted.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-tool-calling_1747151373/recording.har (1)

64-67: HAR updates align with tool-calling flow; cookie/header churn expected.

Looks good. Same note: confirm no sensitive headers or tokens are recorded.

Use the same scan script from the previous comment to validate this directory as well.

Also applies to: 78-79, 90-96, 99-106, 110-112, 134-136, 146-148, 174-176, 189-193, 199-201, 211-227

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-stream-chat-completion_2713905208/recording.har (1)

64-67: Streaming HAR updated with new chunking and timings; OK.

The event-stream payload changes and header differences are expected with new captures.

Also applies to: 78-79, 90-95, 104-111, 133-135, 145-147, 173-175, 188-192, 198-200, 206-222

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-chat_1427850401/recording.har (2)

90-96: HAR payload integrity verified
All recording.har fixtures in packages/instrumentation-openai/recordings/**/recording.har successfully concatenate, gunzip (where applicable), and parse as valid JSON. No silent fixture corruption detected.


65-67: Polly setup already ignores volatile headers

  • In packages/instrumentation-openai/test/instrumentation.test.ts (around lines 59–61), the Polly configuration uses
    matchRequestsBy: { headers: false }
    which disables header matching altogether—covering x-stainless-runtime-version and preventing fixture churn.

No further changes needed.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat_72904453/recording.har (1)

90-95: No direct SSE parsing in instrumentation—ignore metadata-guard suggestion.

The instrumentation code wraps the client’s streaming promise and consumes parsed chunks (via for await), without manually inspecting SSE fields like service_tier, system_fingerprint, or obfuscation. The OpenAI SDK handles SSE parsing, and instrumentation only touches the high-level chunk object (content/role, finish_reason, tool_calls). No changes needed.

Likely an incorrect or invalid review comment.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-completion_1484001317/recording.har (1)

90-95: SSE stream payload update looks consistent and replay-safe.

Chunked event-stream text ends with [DONE] and preserves the expected completion semantics. No issues from a replay standpoint.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-generation_2726543658/recording.har (1)

90-96: Updated base64 response payload is valid and size-aligned.

content.encoding is base64 and size matches the compressed payload length. Fixture is consistent for replay.

packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-function_call-attributes-in-span-for-stream-completion-when-multiple-tools-cal_2278551756/recording.har (1)

94-95: Tool call streaming chunks are well-formed and reflect incremental arguments assembly.

The sequence shows interleaved tool_calls for two functions with arguments emitted across multiple chunks, terminating with finish_reason "tool_calls". This aligns with expected 4.x/5.x streaming behavior and should exercise the aggregator logic correctly.

Comment on lines 99 to 107
"domain": ".api.openai.com",
"httpOnly": true,
"name": "_cfuvid",
"path": "/",
"sameSite": "None",
"secure": true,
"value": "FCva9.SglahJ3Gfq39cVkMsP46zAGKCEz0lcn8D3jQM-1754899594529-0.0.1.1-604800000"
"value": "iqsdkpeQwMKFjPi0Iv8KlMvb85OgwfiB1PYVE_.dNgc-1755159640202-0.0.1.1-604800000"
}
],


🛠️ Refactor suggestion

Drop cookies/set-cookie from persisted HARs.

As in other recordings, _cfuvid cookie/set-cookie adds avoidable fixture churn and carries external session identifiers. Please strip them on persist.

See my patch suggestion in the test harness to sanitize cookies and volatile headers at persist time.

Also applies to: 162-165

Comment on lines 96 to 106
"cookies": [
{
"domain": ".api.openai.com",
"expires": "2025-08-11T08:36:13.000Z",
"httpOnly": true,
"name": "__cf_bm",
"path": "/",
"sameSite": "None",
"secure": true,
"value": "7BVkI_Oa29lnPsqHulgW245EtvG64HIEl2Fp7IxZp9I-1754899573-1.0.1.1-kdd4J1X.Jma63pkY2SjVLyddqX97N48JVjPmEOLzulzD.rZiIDkxc04zSITZbl26AssMI9omM7e18ei3dnc5W6H_4xiLGmvNjB4pZIPzlOM"
},
{
"domain": ".api.openai.com",
"httpOnly": true,
"name": "_cfuvid",
"path": "/",
"sameSite": "None",
"secure": true,
"value": "iINbtFQc.yhVxOLsB58zmJ_w777tMCbQwp8i5X9MDmg-1754899573294-0.0.1.1-604800000"
"value": "oJzlXzMbl99K.ngOr11DUoLQkD4oAL_M5u1cOyV1mDM-1755159623848-0.0.1.1-604800000"
}
],
Contributor

🛠️ Refactor suggestion

Sanitize cookies in fixtures to prevent churn and avoid storing unnecessary session identifiers.

The recorded cookies and set-cookie headers (_cfuvid) will change frequently and are not used by tests. Keeping them in fixtures increases diff noise and potential for flaky updates. Recommend dropping cookies and set-cookie during recording (see test harness suggestion in instrumentation.test.ts comment).

I’ve proposed a small change to beforePersist in the test file to strip cookies and volatile response headers.

Also applies to: 205-208

🤖 Prompt for AI Agents
In
packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-completion_1484001317/recording.har
around lines 96-106 (and similarly lines 205-208), remove cookie entries and any
Set-Cookie/volatile response headers from the persisted fixture to avoid churn;
update the test harness's beforePersist hook to strip the "cookies" array and
any "set-cookie" or other session-specific headers from response objects before
writing the HAR so recordings no longer include _cfuvid or equivalent session
identifiers.

Comment on lines 96 to 106
"cookies": [
{
"domain": ".api.openai.com",
"expires": "2025-08-11T08:36:18.000Z",
"httpOnly": true,
"name": "__cf_bm",
"path": "/",
"sameSite": "None",
"secure": true,
"value": "ir.cz4nw2WPG7xc6lLT75OWqcFetJxpYTAXh9GBFbNY-1754899578-1.0.1.1-16O6NrITpUlIk9x4K3fxzvyvyQhx0_N1cbxbVtmNaCLeXW_OoN12VvQlnCpIjABB.URY8yzVuyLaZ4wLy.3YxO8NJ.wkhoYX1y4V1gpMdpA"
},
{
"domain": ".api.openai.com",
"httpOnly": true,
"name": "_cfuvid",
"path": "/",
"sameSite": "None",
"secure": true,
"value": "daCGg_fpUU3jmxytoFsYXLoU4aQ1zvDMBbYagYlQBfM-1754899578893-0.0.1.1-604800000"
"value": "SfmiOSTiNKwNXeYI28mF.CeV7jmkS4JLhmbnWLlox0I-1755159629164-0.0.1.1-604800000"
}
],
Contributor

🛠️ Refactor suggestion

Avoid persisting cookies and highly volatile response headers.

The presence of _cfuvid cookie and set-cookie increases fixture volatility and is unnecessary. Recommend sanitizing cookies and volatile headers (date, cf-ray, x-request-id, etc.) in beforePersist.

Patch for the sanitizer is proposed in the test file comment below.

Also applies to: 188-192

🤖 Prompt for AI Agents
In
packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-function_call-attributes-in-span-for-stream-completion-when-multiple-tools-cal_2278551756/recording.har
around lines 96-106 (and also lines 188-192), the HAR fixture contains volatile
cookies and response headers (e.g., the _cfuvid cookie and Set-Cookie entries,
Date, cf-ray, x-request-id) that make tests brittle; update the test suite's
beforePersist sanitizer to strip or normalize cookies and remove volatile
response headers before saving fixtures — specifically remove entries with names
like _cfuvid or any Set-Cookie header and drop/replace Date, cf-ray,
x-request-id, and similar per-response headers so the persisted HARs are stable.

Comment on lines 87 to 94
if (this.polly) {
const { server } = this.polly as Polly;
server.any().on("beforePersist", (_req, recording) => {
recording.request.headers = recording.request.headers.filter(
({ name }: { name: string }) => name !== "authorization",
);
});
}
Contributor

🛠️ Refactor suggestion

Harden Polly sanitizer (strip cookies and volatile headers) and type Mocha context.

Nice guard for this.polly. Expand the sanitizer to drop cookies and volatile headers to reduce fixture churn and avoid persisting session identifiers. Also, type the Mocha context for better TS safety.

Apply this diff:

-  if (this.polly) {
-    const { server } = this.polly as Polly;
-    server.any().on("beforePersist", (_req, recording) => {
-      recording.request.headers = recording.request.headers.filter(
-        ({ name }: { name: string }) => name !== "authorization",
-      );
-    });
-  }
+  if (this.polly) {
+    const { server } = this.polly as Polly;
+    server.any().on("beforePersist", (_req, recording) => {
+      // Drop auth headers, cookies and highly-volatile response headers to reduce fixture churn
+      const dropHeader = (name: string) =>
+        [
+          "authorization",
+          "cookie",
+          "set-cookie",
+          "date",
+          "cf-ray",
+          "x-request-id",
+          "via",
+          "x-envoy-upstream-service-time",
+          "openai-processing-ms",
+        ].includes(name.toLowerCase());
+
+      recording.request.headers = (recording.request.headers || []).filter(
+        ({ name }: { name: string }) => !dropHeader(name),
+      );
+      recording.request.cookies = [];
+
+      recording.response.headers = (recording.response.headers || []).filter(
+        ({ name }: { name: string }) => !dropHeader(name),
+      );
+      recording.response.cookies = [];
+    });
+  }

Additionally, for stricter typing, consider changing the beforeEach signature to:
TypeScript (apply outside this range):
function beforeEach(this: Mocha.Context & { polly?: Polly }) { … }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (this.polly) {
const { server } = this.polly as Polly;
server.any().on("beforePersist", (_req, recording) => {
recording.request.headers = recording.request.headers.filter(
({ name }: { name: string }) => name !== "authorization",
);
});
}
if (this.polly) {
const { server } = this.polly as Polly;
server.any().on("beforePersist", (_req, recording) => {
// Drop auth headers, cookies and highly-volatile response headers to reduce fixture churn
const dropHeader = (name: string) =>
[
"authorization",
"cookie",
"set-cookie",
"date",
"cf-ray",
"x-request-id",
"via",
"x-envoy-upstream-service-time",
"openai-processing-ms",
].includes(name.toLowerCase());
recording.request.headers = (recording.request.headers || []).filter(
({ name }: { name: string }) => !dropHeader(name),
);
recording.request.cookies = [];
recording.response.headers = (recording.response.headers || []).filter(
({ name }: { name: string }) => !dropHeader(name),
);
recording.response.cookies = [];
});
}
🤖 Prompt for AI Agents
In packages/instrumentation-openai/test/instrumentation.test.ts around lines 87
to 94, the Polly sanitizer currently only strips the "authorization" header;
update the sanitizer to also remove "cookie" and other volatile headers (e.g.
x-request-id, x-correlation-id) by filtering out headers whose lower-cased name
matches that block, and keep the existing guard for this.polly. Also add
stronger typing for the Mocha beforeEach by changing its signature (outside this
range) to: function beforeEach(this: Mocha.Context & { polly?: Polly }) so
this.polly is typed as Polly.

Comment on lines 407 to 414
const functionCallArguments = JSON.parse(
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.function_call.arguments`
]! as string,
);
assert.ok(functionCallArguments.location);
assert.ok(functionCallArguments.location.includes("Boston"));
assert.ok(
Contributor

🛠️ Refactor suggestion

Make argument parsing robust across SDK variants (string vs object).

Arguments may be emitted as a JSON string or already-object depending on SDK/version. Parse conditionally to avoid runtime errors.

-    const functionCallArguments = JSON.parse(
-      completionSpan.attributes[
-        `${SpanAttributes.LLM_COMPLETIONS}.0.function_call.arguments`
-      ]! as string,
-    );
-    assert.ok(functionCallArguments.location);
-    assert.ok(functionCallArguments.location.includes("Boston"));
+    const rawFunctionArgs =
+      completionSpan.attributes[
+        `${SpanAttributes.LLM_COMPLETIONS}.0.function_call.arguments`
+      ]!;
+    const functionCallArguments =
+      typeof rawFunctionArgs === "string" ? JSON.parse(rawFunctionArgs) : rawFunctionArgs;
+    assert.ok(functionCallArguments?.location);
+    assert.ok(String(functionCallArguments.location).includes("Boston"));
📝 Committable suggestion

Suggested change
const functionCallArguments = JSON.parse(
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.function_call.arguments`
]! as string,
);
assert.ok(functionCallArguments.location);
assert.ok(functionCallArguments.location.includes("Boston"));
assert.ok(
const rawFunctionArgs =
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.function_call.arguments`
]!;
const functionCallArguments =
typeof rawFunctionArgs === "string"
? JSON.parse(rawFunctionArgs)
: rawFunctionArgs;
assert.ok(functionCallArguments?.location);
assert.ok(String(functionCallArguments.location).includes("Boston"));
assert.ok(
🤖 Prompt for AI Agents
In packages/instrumentation-openai/test/instrumentation.test.ts around lines 407
to 414, the test assumes function call arguments are always a JSON string; make
parsing robust by checking the attribute value type before parsing: if it's a
string, JSON.parse it (with try/catch to surface parse errors), otherwise treat
it as an object; then assert on the resulting object (e.g., ensure location
exists and includes "Boston"). Ensure the code handles null/undefined safely and
fails the test with a clear message if parsing or type expectations are not met.

Comment on lines 506 to 513
const toolCallArguments = JSON.parse(
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
]! as string,
);
assert.ok(toolCallArguments.location);
assert.ok(toolCallArguments.location.includes("Boston"));
assert.ok(
Contributor

🛠️ Refactor suggestion

Harden tool call arguments parsing (string/object).

Mirror the conditional parsing to avoid failures if the SDK changes the type.

-    const toolCallArguments = JSON.parse(
-      completionSpan.attributes[
-        `${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
-      ]! as string,
-    );
-    assert.ok(toolCallArguments.location);
-    assert.ok(toolCallArguments.location.includes("Boston"));
+    const rawToolArgs0 =
+      completionSpan.attributes[
+        `${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
+      ]!;
+    const toolCallArguments =
+      typeof rawToolArgs0 === "string" ? JSON.parse(rawToolArgs0) : rawToolArgs0;
+    assert.ok(toolCallArguments?.location);
+    assert.ok(String(toolCallArguments.location).includes("Boston"));
📝 Committable suggestion

Suggested change
const toolCallArguments = JSON.parse(
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
]! as string,
);
assert.ok(toolCallArguments.location);
assert.ok(toolCallArguments.location.includes("Boston"));
assert.ok(
const rawToolArgs0 =
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
]!;
const toolCallArguments =
typeof rawToolArgs0 === "string" ? JSON.parse(rawToolArgs0) : rawToolArgs0;
assert.ok(toolCallArguments?.location);
assert.ok(String(toolCallArguments.location).includes("Boston"));
assert.ok(
🤖 Prompt for AI Agents
In packages/instrumentation-openai/test/instrumentation.test.ts around lines 506
to 513, the test assumes the tool call arguments attribute is always a JSON
string; make the parsing robust by first reading the attribute into a variable,
then if it's a string run JSON.parse on it, otherwise treat it as an object (no
parse), and then proceed with the existing assertions (checking location and
contents).

Comment on lines 600 to 607
const toolCall0Arguments = JSON.parse(
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
]! as string,
);
assert.ok(toolCall0Arguments.location);
assert.ok(toolCall0Arguments.location.includes("Boston"));
assert.strictEqual(
Contributor

🛠️ Refactor suggestion

Robust parsing for first streamed tool call arguments.

Guard against string/object variance.

-    const toolCall0Arguments = JSON.parse(
-      completionSpan.attributes[
-        `${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
-      ]! as string,
-    );
-    assert.ok(toolCall0Arguments.location);
-    assert.ok(toolCall0Arguments.location.includes("Boston"));
+    const rawToolCall0 =
+      completionSpan.attributes[
+        `${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
+      ]!;
+    const toolCall0Arguments =
+      typeof rawToolCall0 === "string" ? JSON.parse(rawToolCall0) : rawToolCall0;
+    assert.ok(toolCall0Arguments?.location);
+    assert.ok(String(toolCall0Arguments.location).includes("Boston"));
📝 Committable suggestion

Suggested change
const toolCall0Arguments = JSON.parse(
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
]! as string,
);
assert.ok(toolCall0Arguments.location);
assert.ok(toolCall0Arguments.location.includes("Boston"));
assert.strictEqual(
const rawToolCall0 =
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
]!;
const toolCall0Arguments =
typeof rawToolCall0 === "string" ? JSON.parse(rawToolCall0) : rawToolCall0;
assert.ok(toolCall0Arguments?.location);
assert.ok(String(toolCall0Arguments.location).includes("Boston"));
assert.strictEqual(
🤖 Prompt for AI Agents
In packages/instrumentation-openai/test/instrumentation.test.ts around lines 600
to 607, the test assumes the tool call arguments attribute is always a JSON
string and calls JSON.parse unguarded; update the test to handle both string and
object variants by first reading the attribute into a const, assert it's
defined, then if it's a string call JSON.parse, otherwise if it's already an
object use it directly (and handle any other unexpected types with a
fail/assert), then continue asserting location exists and contains "Boston".

Comment on lines 613 to 620
const toolCall1Arguments = JSON.parse(
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.1.arguments`
]! as string,
);
assert.ok(toolCall1Arguments.location);
assert.ok(toolCall1Arguments.location.includes("Chicago"));
});
Contributor

🛠️ Refactor suggestion

Robust parsing for second streamed tool call arguments.

Same conditional parse as above for consistency.

-    const toolCall1Arguments = JSON.parse(
-      completionSpan.attributes[
-        `${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.1.arguments`
-      ]! as string,
-    );
-    assert.ok(toolCall1Arguments.location);
-    assert.ok(toolCall1Arguments.location.includes("Chicago"));
+    const rawToolCall1 =
+      completionSpan.attributes[
+        `${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.1.arguments`
+      ]!;
+    const toolCall1Arguments =
+      typeof rawToolCall1 === "string" ? JSON.parse(rawToolCall1) : rawToolCall1;
+    assert.ok(toolCall1Arguments?.location);
+    assert.ok(String(toolCall1Arguments.location).includes("Chicago"));
📝 Committable suggestion

Suggested change
const toolCall1Arguments = JSON.parse(
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.1.arguments`
]! as string,
);
assert.ok(toolCall1Arguments.location);
assert.ok(toolCall1Arguments.location.includes("Chicago"));
});
const rawToolCall1 =
completionSpan.attributes[
`${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.1.arguments`
]!;
const toolCall1Arguments =
typeof rawToolCall1 === "string" ? JSON.parse(rawToolCall1) : rawToolCall1;
assert.ok(toolCall1Arguments?.location);
assert.ok(String(toolCall1Arguments.location).includes("Chicago"));
});
🤖 Prompt for AI Agents
In packages/instrumentation-openai/test/instrumentation.test.ts around lines 613
to 620, the code unconditionally JSON.parse's the second streamed tool call
arguments; change it to use the same conditional parsing pattern used earlier
(check the attribute exists and is a string before parsing, e.g. use a ternary
or if to set toolCall1Arguments = attribute ? JSON.parse(attribute) : undefined)
so parsing is robust and consistent, then assert on toolCall1Arguments.location
as before.

Comment on lines 4 to +6
overrides:
form-data@>=4.0.0 <4.0.4: '>=4.0.4'
tmp@<=0.2.3: '>=0.2.4'
Contributor

💡 Verification agent

🧩 Analysis chain

Workspace-level overrides look fine; consider pinning to fixed versions for determinism.

  • The overrides are sensible to avoid known issues in form-data <4.0.4 and tmp <=0.2.3.
  • For reproducibility, prefer exact versions (e.g., "4.0.4" and "0.2.4") over ranges, unless you intentionally want to float to newer patch releases.

Run this script to ensure pnpm recognizes workspace-level overrides and to spot any other occurrences of form-data/tmp that could be affected:


🏁 Script executed:

#!/bin/bash
# Show where form-data and tmp appear in lockfile and verify override effect
fd 'pnpm-lock.yaml' -x rg -n 'form-data|tmp' {}

# Sanity check: ensure overrides are set at workspace root package.json or workspace file
rg -n '"overrides"\s*:' -A 5 pnpm-workspace.yaml

Length of output: 1084


🏁 Script executed:

#!/bin/bash
# 1. Confirm that overrides are declared in pnpm-workspace.yaml
echo "---- pnpm-workspace.yaml overrides snippet ----"
rg -n '^overrides:' -A 5 pnpm-workspace.yaml || echo "No overrides block found"

# 2. Show the first 30 lines of pnpm-lock.yaml to inspect the overrides section
echo
echo "---- pnpm-lock.yaml header (first 30 lines) ----"
head -n 30 pnpm-lock.yaml

# 3. Verify that the tmp override entry exists in the lockfile
echo
echo "---- tmp override entries in pnpm-lock.yaml ----"
rg -n 'tmp@<=0.2.3:' pnpm-lock.yaml || echo "No tmp override entry found"

Length of output: 1287


Fix missing tmp override and pin to exact versions for determinism

It looks like your form-data override is applied (seen in pnpm-lock.yaml), but the tmp@<=0.2.3 override never made it into the lockfile. To ensure both packages are correctly overridden and reproducible:

• Adjust your overrides in pnpm-workspace.yaml to use exact versions and a catch-all pattern (which pnpm reliably picks up):

 pnpm-workspace.yaml
 overrides:
-  form-data@>=4.0.0 <4.0.4: '>=4.0.4'
-  tmp@<=0.2.3:           '>=0.2.4'
+  form-data@* :           '4.0.4'
+  tmp@*         :         '0.2.4'

• After updating, run pnpm install to regenerate pnpm-lock.yaml.
• Verify both overrides appear under the top-level overrides: block in pnpm-lock.yaml.

This will both guarantee your overrides take effect and lock to specific versions for full reproducibility.

📝 Committable suggestion

Suggested change
overrides:
form-data@>=4.0.0 <4.0.4: '>=4.0.4'
tmp@<=0.2.3: '>=0.2.4'
pnpm-workspace.yaml
overrides:
form-data@* : '4.0.4'
tmp@* : '0.2.4'
🤖 Prompt for AI Agents
In pnpm-workspace.yaml around lines 4 to 6, the tmp override is not being picked
up and your overrides should pin to exact versions and use a catch‑all pattern
so pnpm reliably applies them; update the overrides to use exact version pins
(e.g. form-data@4.0.4 and tmp@0.2.4) or a wildcard pattern that matches all
ranges, save the file, run pnpm install to regenerate pnpm-lock.yaml, and verify
both form-data and tmp appear under the top-level overrides block in
pnpm-lock.yaml.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🔭 Outside diff range comments (2)
packages/instrumentation-openai/src/instrumentation.ts (2)

367-381: Attribute index collision between functions[] and tools[]

You now emit tools under LLM_REQUEST_FUNCTIONS.* with their own 0-based index, which may overwrite attributes emitted for params.functions.*. Use an offset so tools follow functions, and guard undefined parameters to avoid setting undefined attributes.

Apply this diff:

           params.tools?.forEach((tool, index) => {
             if (tool.type !== 'function' || !('function' in tool) || !tool.function) {
               return;
             }

-            attributes[
-              `${SpanAttributes.LLM_REQUEST_FUNCTIONS}.${index}.name`
-            ] = tool.function.name;
-            attributes[
-              `${SpanAttributes.LLM_REQUEST_FUNCTIONS}.${index}.description`
-            ] = tool.function.description;
-            attributes[
-              `${SpanAttributes.LLM_REQUEST_FUNCTIONS}.${index}.arguments`
-            ] = JSON.stringify(tool.function.parameters);
+            attributes[
+              `${SpanAttributes.LLM_REQUEST_FUNCTIONS}.${(params.functions?.length ?? 0) + index}.name`
+            ] = tool.function.name;
+            attributes[
+              `${SpanAttributes.LLM_REQUEST_FUNCTIONS}.${(params.functions?.length ?? 0) + index}.description`
+            ] = tool.function.description;
+            attributes[
+              `${SpanAttributes.LLM_REQUEST_FUNCTIONS}.${(params.functions?.length ?? 0) + index}.arguments`
+            ] = JSON.stringify(tool.function.parameters ?? {});
           });

458-466: Streaming function_call assembly loses partial deltas

Requiring both name and arguments in the same chunk misses the common streaming pattern where they arrive separately. Build the function_call incrementally instead.

Apply this diff:

-        if (
-          chunk.choices[0]?.delta.function_call &&
-          chunk.choices[0]?.delta.function_call.arguments &&
-          chunk.choices[0]?.delta.function_call.name
-        ) {
-          // I needed to re-build the object so that Typescript will understand that `name` and `argument` are not null.
-          result.choices[0].message.function_call = {
-            name: chunk.choices[0].delta.function_call.name,
-            arguments: chunk.choices[0].delta.function_call.arguments,
-          };
-        }
+        if (chunk.choices[0]?.delta.function_call) {
+          // Build up function_call across chunks
+          if (!result.choices[0].message.function_call) {
+            result.choices[0].message.function_call = { name: "", arguments: "" };
+          }
+          if (chunk.choices[0].delta.function_call.name) {
+            result.choices[0].message.function_call.name +=
+              chunk.choices[0].delta.function_call.name;
+          }
+          if (chunk.choices[0].delta.function_call.arguments) {
+            result.choices[0].message.function_call.arguments +=
+              chunk.choices[0].delta.function_call.arguments;
+          }
+        }
🧹 Nitpick comments (2)
packages/instrumentation-openai/src/instrumentation.ts (2)

436-436: Avoid as any for the seeded streaming message (type-safety)

Casting the whole message to any hides mistakes. Consider a narrow local type for the mutable message shape to preserve type-safety.

For example:

type MutableAssistantMessage = {
  role: "assistant";
  content: string;
  tool_calls: Array<{
    id: string;
    type: "function";
    function: { name: string; arguments: string };
  }>;
  function_call?: { name: string; arguments: string };
};

Then initialize:

message: {
  role: "assistant",
  content: "",
  tool_calls: [],
} as MutableAssistantMessage

493-497: Tighten typing when mutating tool_calls; optionally avoid as any

The incremental += on function.name and function.arguments is fine for streaming. To reduce reliance on any, introduce a local MutableToolCall type and cast tool_calls once, instead of per-write casts; also ensure the entry has a function object before appending.

Example:

type MutableToolCall = { id: string; type: string; function?: { name: string; arguments: string } };
const calls = result.choices[0].message.tool_calls as unknown as MutableToolCall[];
const entry = calls[toolCall.index];
entry.function ??= { name: "", arguments: "" };
if (toolCall.function?.name) entry.function.name += toolCall.function.name;
if (toolCall.function?.arguments) entry.function.arguments += toolCall.function.arguments;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between e0c4d82 and a7153ab.

📒 Files selected for processing (1)
  • packages/instrumentation-openai/src/instrumentation.ts (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
packages/instrumentation-openai/src/instrumentation.ts (2)
packages/traceloop-sdk/src/lib/tracing/decorators.ts (1)
  • tool (276-282)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
🔇 Additional comments (3)
packages/instrumentation-openai/src/instrumentation.ts (3)

122-122: OpenAI 5.x support: version range bump LGTM — confirm wrapper targets still hold in v5

Relaxing to ">=3.1.0 <6" is aligned with the PR goal. Please double-check (in CI matrix or locally) that the prototype paths you wrap (e.g., OpenAI.Chat.Completions.prototype.create) are stable across 5.x minors; the OpenAI SDK has been evolving rapidly.
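One cheap way to catch a moved prototype path early is a defensive existence check before patching. This is a hypothetical guard, not code from this PR; `openaiModule` stands in for the module object the instrumentation receives:

```typescript
// Hypothetical defensive check before wrapping OpenAI.Chat.Completions.prototype.create.
// Returns true only when the expected prototype path exposes a callable `create`.
function canWrapChatCreate(openaiModule: any): boolean {
  const proto = openaiModule?.Chat?.Completions?.prototype;
  return typeof proto?.create === "function";
}

// Simulated module shape matching the 4.x/5.x layout this instrumentation targets.
const fakeModule = {
  Chat: {
    Completions: class {
      create() {
        return "ok";
      }
    },
  },
};

console.log(canWrapChatCreate(fakeModule)); // true
console.log(canWrapChatCreate({})); // false
```

A guard like this lets the instrumentation log a diagnostic and skip patching instead of throwing when a future 5.x minor moves the method.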


488-491: Correct fix: assign tool_call.type, don’t concatenate

Switching to assignment for type is the right behavior for streaming tool_calls.


703-711: Good guard: emit tool_calls attributes only for function tools

This prevents spurious attributes for non-function tool types and aligns with the OpenAI tool schema.

result.choices[0].message.tool_calls[
toolCall.index
].function.name += toolCall.function.name;
(result.choices[0].message.tool_calls[toolCall.index] as any).function.name += toolCall.function.name;
Member

weird that it's += no? (I know that it used to be like that)

({ name }: { name: string }) => name !== "authorization",
);
});
if (this.polly) {
Member

nit: I don't think we need this change

]! as string,
),
{ location: "Boston" },
const functionCallArguments = JSON.parse(
Member

I'd revert it - to make sure it didn't change

],
"get_current_weather",
);
assert.deepEqual(
Member

same


overrides:
form-data@>=4.0.0 <4.0.4: '>=4.0.4'
tmp@<=0.2.3: '>=0.2.4'
Member

not needed

Contributor Author

it solves a vulnerability

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 4e2a79b in 1 minute and 43 seconds. Click for details.
  • Reviewed 2854 lines of code in 13 files
  • Skipped 1 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-openai/package.json:58
  • Draft comment:
    Dependency on "openai" has been updated to version 5.12.2 to support SDK v5. This aligns with issue 642 and ensures compatibility with the new version.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, providing details about a dependency update and its purpose. It does not offer a suggestion, ask for confirmation, or highlight a potential issue.
2. packages/instrumentation-openai/test/instrumentation.test.ts:407
  • Draft comment:
    Changed the assertions for function_call arguments: using assert.deepEqual on the parsed JSON (expecting exactly { location: 'Boston' }) instead of checking with substring. This makes the test more robust and precise with the new SDK behavior.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, explaining the change made in the code. It doesn't provide a suggestion, ask for confirmation, or point out a potential issue. It violates the rule against making purely informative comments.
3. packages/instrumentation-openai/test/instrumentation.test.ts:597
  • Draft comment:
    Updated expected arguments for tool calls in tests for multiple tools. The first tool now expects { location: 'Boston, MA' } and the second { location: 'Chicago, IL' } via assert.deepEqual, reflecting the SDK v5 response structure.
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_eMJPlJPYUKuBqPhi

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed a05c4dd in 1 minute and 1 second. Click for details.
  • Reviewed 43 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-openai/src/instrumentation.ts:367
  • Draft comment:
    Refactored multi-line if-condition for checking tool validity using consistent double quotes. This is a formatting improvement with no functional change.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. packages/instrumentation-openai/src/instrumentation.ts:490
  • Draft comment:
    Improved multi-line formatting when appending tool call function name and arguments; no logic change, just enhanced readability.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
3. packages/instrumentation-openai/src/instrumentation.ts:699
  • Draft comment:
    Changed the condition to use consistent double quotes for checking toolCall type before setting span attributes. This is a straightforward formatting update.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_XF0Oh4j7U57d78nG

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
packages/instrumentation-openai/package.json (1)

58-59: Upgrade instrumentation-openai Node engine requirement to ≥18 and correct the description typo

The OpenAI SDK v5+ requires Node 18+, and your project’s .nvmrc pins Node 20—so bumping the engine floor to 18 is safe. I didn’t find any GitHub Actions workflows (or other CI configs) in .github/workflows; please verify your CI runners (including any .travis.yml or .circleci/config.yml) use Node ≥18.

• packages/instrumentation-openai/package.json
– Update engines.node to ">=18" (was ">=14")
– Fix the “Instrumentaion” → “Instrumentation” typo in description
– (Optional) Change the OpenAI SDK devDependency to ^5.12.2 to pick up non-breaking fixes

 packages/instrumentation-openai/package.json
  4c4
-  "description": "OpenAI Instrumentaion",
+  "description": "OpenAI Instrumentation",
 23,25c23,25
-  "engines": {
-    "node": ">=14"
-  },
+  "engines": {
+    "node": ">=18"
+  },
 58c58
-    "openai": "5.12.2",
+    "openai": "^5.12.2",
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a7153ab and 4e2a79b.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (13)
  • packages/instrumentation-openai/package.json (1 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-calculate-correct-tokens-for-different-quality-levels_3292443992/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-stream-chat-completion_2713905208/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-chat_1427850401/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-function-calling_1703714575/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-generation_2726543658/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat_72904453/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-completion_1484001317/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-tool-calling_1747151373/recording.har (0 hunks)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-function_call-attributes-in-span-for-stream-completion-when-multiple-tools-cal_2278551756/recording.har (0 hunks)
  • packages/instrumentation-openai/test/instrumentation.test.ts (2 hunks)
💤 Files with no reviewable changes (11)
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-stream-chat-completion_2713905208/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-generation_2726543658/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-chat_1427850401/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat_72904453/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-function_call-attributes-in-span-for-stream-completion-when-multiple-tools-cal_2278551756/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-tool-calling_1747151373/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-completion_1484001317/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-calculate-correct-tokens-for-different-quality-levels_3292443992/recording.har
  • packages/instrumentation-openai/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-function-calling_1703714575/recording.har
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/instrumentation-openai/test/instrumentation.test.ts

Contributor

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 755cd47 in 1 minute and 3 seconds. Click for details.
  • Reviewed 44 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-openai/src/instrumentation.ts:44
  • Draft comment:
    Define a local type alias APIPromiseType to support both OpenAI v4 and v5+; ensure that all promise objects provide the '_thenUnwrap' method at runtime.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
2. packages/instrumentation-openai/src/instrumentation.ts:419
  • Draft comment:
    Updated the promise type for chat streaming from APIPromise to APIPromiseType. Verify that the _thenUnwrap interface behaves consistently across SDK versions.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. packages/instrumentation-openai/src/instrumentation.ts:612
  • Draft comment:
    Changed the _wrapPromise method's promise parameter type to APIPromiseType. Confirm that the promise object’s '_thenUnwrap' method exists and works as expected with the new SDK versions.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_afJk6RzDEyOKqPoa

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🔭 Outside diff range comments (1)
packages/instrumentation-openai/src/instrumentation.ts (1)

611-618: Fallback to native Promise when _thenUnwrap is unavailable

Some environments/versions might not expose _thenUnwrap. Add a safe fallback using Promise.then to avoid runtime errors.

   private _wrapPromise<T>(
     type: "chat" | "completion",
     version: "v3" | "v4",
     span: Span,
-    promise: APIPromiseType<T>,
-  ): APIPromiseType<T> {
-    return promise._thenUnwrap((result) => {
+    promise: APIPromiseType<T>,
+  ): APIPromiseType<T> {
+    const unwrap =
+      typeof (promise as any)._thenUnwrap === "function"
+        ? (promise as any)._thenUnwrap.bind(promise)
+        : (onFulfilled: (value: T) => any) =>
+            (Promise.resolve(promise) as Promise<T>).then(onFulfilled);
+    return (unwrap((result: T) => {
       if (version === "v3") {
         if (type === "chat") {
           this._addLogProbsEvent(
             span,
             ((result as any).data as ChatCompletion).choices[0].logprobs,
           );
           this._endSpan({
             type,
             span,
             result: (result as any).data as ChatCompletion,
           });
         } else {
           this._addLogProbsEvent(
             span,
             ((result as any).data as Completion).choices[0].logprobs,
           );
           this._endSpan({
             type,
             span,
             result: (result as any).data as Completion,
           });
         }
       } else {
         if (type === "chat") {
           this._addLogProbsEvent(
             span,
             (result as ChatCompletion).choices[0].logprobs,
           );
           this._endSpan({ type, span, result: result as ChatCompletion });
         } else {
           this._addLogProbsEvent(
             span,
             (result as Completion).choices[0].logprobs,
           );
           this._endSpan({ type, span, result: result as Completion });
         }
       }
 
-      return result;
-    });
+      return result as T;
+    }) as unknown) as APIPromiseType<T>;
   }
♻️ Duplicate comments (1)
packages/instrumentation-openai/src/instrumentation.ts (1)

491-509: Streaming tool_calls assembly: avoid concatenating id and name; only append arguments

It’s incorrect to use += for id and function.name; those should be assigned, not concatenated. Arguments are streamed in chunks and should be concatenated. This also aligns with the prior feedback on "+=" being odd.

-            if (toolCall.id) {
-              result.choices[0].message.tool_calls[toolCall.index].id +=
-                toolCall.id;
-            }
+            if (toolCall.id) {
+              result.choices[0].message.tool_calls[toolCall.index].id =
+                toolCall.id;
+            }
             if (toolCall.type) {
               result.choices[0].message.tool_calls[toolCall.index].type =
                 toolCall.type;
             }
             if (toolCall.function?.name) {
-              (
-                result.choices[0].message.tool_calls[toolCall.index] as any
-              ).function.name += toolCall.function.name;
+              (
+                result.choices[0].message.tool_calls[toolCall.index] as any
+              ).function.name = toolCall.function.name;
             }
             if (toolCall.function?.arguments) {
               (
                 result.choices[0].message.tool_calls[toolCall.index] as any
               ).function.arguments += toolCall.function.arguments;
             }
🧹 Nitpick comments (2)
packages/instrumentation-openai/src/instrumentation.ts (2)

372-389: Guarding tools to only function-type is correct; consider attribute collision with legacy functions

The stricter guard avoids processing non-function tools — good. One consideration: you emit both params.functions and params.tools under the same attribute prefix (llm.request.functions). If both are provided (rare but possible), indices may collide and overwrite attributes.

  • Option A: Use a distinct namespace for tools (e.g., llm.request.tools).
  • Option B: Offset tool indices by params.functions?.length to avoid collisions.
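
Option B can be sketched as follows (a self-contained illustration; the `params` shapes, helper name, and the shared attribute prefix are assumptions for the example, not taken from the instrumentation source):

```typescript
// Hypothetical helper: emit names under one shared prefix, offsetting
// tool indices by the legacy `functions` count so indices never collide.
type FunctionDef = { name: string };
type ToolDef = { type: string; function?: FunctionDef };

function functionAttributeKeys(
  functions: FunctionDef[] | undefined,
  tools: ToolDef[] | undefined,
): Record<string, string> {
  const attrs: Record<string, string> = {};
  const prefix = "llm.request.functions";
  (functions ?? []).forEach((fn, i) => {
    attrs[`${prefix}.${i}.name`] = fn.name;
  });
  const offset = functions?.length ?? 0; // Option B: shift tool indices
  (tools ?? [])
    .filter((t) => t.type === "function" && t.function)
    .forEach((tool, i) => {
      attrs[`${prefix}.${offset + i}.name`] = tool.function!.name;
    });
  return attrs;
}

const attrs = functionAttributeKeys(
  [{ name: "legacy_fn" }],
  [{ type: "function", function: { name: "tool_fn" } }],
);
console.log(attrs["llm.request.functions.0.name"]); // legacy_fn
console.log(attrs["llm.request.functions.1.name"]); // tool_fn
```

With the offset applied, a request carrying both `functions` and `tools` yields distinct indices instead of overwriting attributes.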

445-445: Avoid broad as any on message if feasible

If you want to avoid the broad cast, introduce a narrow local type for the assistant message that includes optional function_call and tool_calls. This keeps type safety without relying on any.

Example (outside this range):

type AssistantMessage = NonNullable<ChatCompletion["choices"][number]>["message"] & {
  function_call?: { name: string; arguments: string };
  tool_calls?: Array<{
    id: string;
    type: "function";
    function: { name: string; arguments: string };
  }>;
};
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a05c4dd and 755cd47.

📒 Files selected for processing (1)
  • packages/instrumentation-openai/src/instrumentation.ts (8 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
packages/instrumentation-openai/src/instrumentation.ts (2)
packages/traceloop-sdk/src/lib/tracing/decorators.ts (1)
  • tool (276-282)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
🔇 Additional comments (3)
packages/instrumentation-openai/src/instrumentation.ts (3)

422-429: Streaming promise typed to APIPromiseType is appropriate

Using APIPromiseType with the async iterator handles both v4/v5 shapes cleanly.


714-723: Verify the necessity of guarding against undefined toolCall.function
I wasn’t able to locate a ToolCall type definition in this package, so it’s unclear whether toolCall.function is ever guaranteed to be non-undefined when toolCall.type === "function". You should:

  • Confirm in your type definitions (or via the OpenAI response schema) that a function property is always present and non-undefined whenever type === "function".

  • If there’s any possibility of it being undefined, update the guard as suggested:

    - if (toolCall.type === "function" && "function" in toolCall) {
    + if (toolCall.type === "function" && "function" in toolCall && toolCall.function) {
        span.setAttribute(
          `${SpanAttributes.LLM_COMPLETIONS}.${index}.tool_calls.${toolIndex}.name`,
          toolCall.function.name,
        );
        span.setAttribute(
          `${SpanAttributes.LLM_COMPLETIONS}.${index}.tool_calls.${toolIndex}.arguments`,
          toolCall.function.arguments,
        );
      }

126-127: Verify OpenAI v5 patch paths
We relaxed the supported range to <6 to include v5, but since this package doesn’t ship the OpenAI client itself, please smoke-test against an openai@5 install to confirm that your patch targets still resolve:

  • openai.chat.completions.create (wrapped via OpenAI.Chat.Completions.prototype)
  • openai.completions.create (wrapped via OpenAI.Completions.prototype)

Run a quick script or REPL session importing openai@5, instantiate a client, and ensure spans are applied to both chat and completion calls.
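
A minimal shape check along these lines can confirm the patch targets resolve before wrapping (this uses a mocked module object standing in for `openai@5`; the class layout mirrors the SDK's but is an assumption of the sketch):

```typescript
// Mocked v5-style module layout (stand-in for the real openai package);
// the SDK is assumed to expose the same prototype paths.
class Completions {
  create() { return "completion"; }
}
class ChatCompletions {
  create() { return "chat"; }
}
const mockOpenAI = { Chat: { Completions: ChatCompletions }, Completions };

// Check that both patch targets the instrumentation wraps still resolve.
function patchTargetsResolve(mod: any): boolean {
  return (
    typeof mod?.Chat?.Completions?.prototype?.create === "function" &&
    typeof mod?.Completions?.prototype?.create === "function"
  );
}

console.log(patchTargetsResolve(mockOpenAI)); // true
console.log(patchTargetsResolve({})); // false
```

Running the same check against a real `openai@5` import (instead of the mock) is the actual smoke test requested above.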

Comment on lines +42 to +46
// Type definition for APIPromise - compatible with both OpenAI v4 and v5+
// The actual import is handled at runtime via require() calls in the _wrapPromise method
type APIPromiseType<T> = Promise<T> & {
  _thenUnwrap: <U>(onFulfilled: (value: T) => U) => APIPromiseType<U>;
};

🛠️ Refactor suggestion

Make _thenUnwrap optional and fix misleading comment about runtime import

  • The comment says the import is handled via require() calls in _wrapPromise, but there are no such calls. This is misleading.
  • Also, making _thenUnwrap optional provides a safe fallback when the underlying promise is a native Promise in some environments/versions.

Apply this diff to clarify the comment and make _thenUnwrap optional:

-// Type definition for APIPromise - compatible with both OpenAI v4 and v5+
-// The actual import is handled at runtime via require() calls in the _wrapPromise method
-type APIPromiseType<T> = Promise<T> & {
-  _thenUnwrap: <U>(onFulfilled: (value: T) => U) => APIPromiseType<U>;
-};
+// Lightweight shape of OpenAI's APIPromise to support v4/v5 without importing.
+// We only rely on _thenUnwrap when it exists at runtime.
+type APIPromiseType<T> = Promise<T> & {
+  _thenUnwrap?: <U>(onFulfilled: (value: T) => U) => APIPromiseType<U>;
+};
🤖 Prompt for AI Agents
In packages/instrumentation-openai/src/instrumentation.ts around lines 42 to 46,
the comment incorrectly states that an import is handled via require() calls in
_wrapPromise (there are no such calls) and the APIPromiseType marks _thenUnwrap
as required; change the comment to accurately state that the type aims to be
compatible with both OpenAI v4 and v5+ without claiming runtime require() usage,
and make _thenUnwrap optional on the type (i.e., _thenUnwrap?: <U>(onFulfilled:
(value: T) => U) => APIPromiseType<U>) so native Promises that lack this method
are safely supported. Ensure the updated comment is concise and the type change
compiles with existing usages.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 6340b26 in 1 minute and 49 seconds. Click for details.
  • Reviewed 111 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. packages/instrumentation-openai/src/instrumentation.ts:64
  • Draft comment:
    Changing the parameter type from 'typeof openai.OpenAI' to 'unknown' in manuallyInstrument reduces compile-time type safety. Consider using a more specific union or adding runtime checks to ensure the module has the expected structure.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. packages/instrumentation-openai/src/instrumentation.ts:69
  • Draft comment:
    The diff directly wraps openaiModule.Chat.Completions.prototype and openaiModule.Completions.prototype. This assumes these properties always exist. It would be safer to validate the module’s structure before applying instrumentation to prevent potential runtime errors in unexpected versions.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. packages/instrumentation-openai/src/instrumentation.ts:80
  • Draft comment:
    The removed branch for the old OpenAIApi (v3) and the revised wrapping for image-related methods now only apply when openaiModule.Images exists. Confirm that dropping backward compatibility for older versions is intentional and that all expected image endpoints (generate, edit, createVariation) are properly covered.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
4. packages/instrumentation-openai/src/instrumentation.ts:113
  • Draft comment:
    The instrumentation module version range has been updated from [">=3.1.0 <6"] to [">=4 <6"]. This change drops support for v3.1.0 and focuses on newer versions (supporting 5.x). Ensure that this aligns with the desired compatibility policies.
  • Reason this comment was not posted:
    Comment looked like it was already resolved.

Workflow ID: wflow_pkTVEZJZi1KDZ6cH

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🔭 Outside diff range comments (2)
packages/instrumentation-openai/src/instrumentation.ts (2)

479-487: Do not append tool_call.id; assign it.

Using "+=" for tool_call.id risks duplicated IDs when the id is repeated across chunks. IDs are identifiers, not streamed tokens.

-            if (toolCall.id) {
-              result.choices[0].message.tool_calls[toolCall.index].id +=
-                toolCall.id;
-            }
+            if (toolCall.id) {
+              result.choices[0].message.tool_calls[toolCall.index].id =
+                toolCall.id;
+            }

599-646: Fallback to native Promise.then when _thenUnwrap is unavailable.

Calling _thenUnwrap unconditionally will throw when the underlying Promise isn't OpenAI's APIPromise (possible across versions/environments). Provide a safe fallback.

   private _wrapPromise<T>(
     type: "chat" | "completion",
     version: "v3" | "v4",
     span: Span,
-    promise: APIPromiseType<T>,
-  ): APIPromiseType<T> {
-    return promise._thenUnwrap((result) => {
+    promise: APIPromiseType<T>,
+  ): APIPromiseType<T> {
+    const chain =
+      typeof (promise as any)._thenUnwrap === "function"
+        ? (promise as any)._thenUnwrap.bind(promise)
+        : (onFulfilled: (value: T) => any) =>
+            (promise as Promise<T>).then(onFulfilled) as unknown as APIPromiseType<T>;
+
+    return chain((result: T) => {
       if (version === "v3") {
         if (type === "chat") {
           this._addLogProbsEvent(
             span,
             ((result as any).data as ChatCompletion).choices[0].logprobs,
           );
           this._endSpan({
             type,
             span,
             result: (result as any).data as ChatCompletion,
           });
         } else {
           this._addLogProbsEvent(
             span,
             ((result as any).data as Completion).choices[0].logprobs,
           );
           this._endSpan({
             type,
             span,
             result: (result as any).data as Completion,
           });
         }
       } else {
         if (type === "chat") {
           this._addLogProbsEvent(
             span,
             (result as ChatCompletion).choices[0].logprobs,
           );
           this._endSpan({ type, span, result: result as ChatCompletion });
         } else {
           this._addLogProbsEvent(
             span,
             (result as Completion).choices[0].logprobs,
           );
           this._endSpan({ type, span, result: result as Completion });
         }
       }
 
       return result;
-    });
+    }) as unknown as APIPromiseType<T>;
   }
♻️ Duplicate comments (1)
packages/instrumentation-openai/src/instrumentation.ts (1)

42-46: Make _thenUnwrap optional and fix misleading comment (still unresolved).

The comment claims a runtime import via require() in _wrapPromise, which does not exist here. Also, requiring _thenUnwrap at the type level breaks when a native Promise is returned (possible in some environments/versions).

Apply this diff to correct the comment and make _thenUnwrap optional:

-// Type definition for APIPromise - compatible with both OpenAI v4 and v5+
-// The actual import is handled at runtime via require() calls in the _wrapPromise method
-type APIPromiseType<T> = Promise<T> & {
-  _thenUnwrap: <U>(onFulfilled: (value: T) => U) => APIPromiseType<U>;
-};
+// Lightweight shape of OpenAI's APIPromise to support v4/v5 without importing.
+// We only rely on _thenUnwrap when it exists at runtime.
+type APIPromiseType<T> = Promise<T> & {
+  _thenUnwrap?: <U>(onFulfilled: (value: T) => U) => APIPromiseType<U>;
+};
🧹 Nitpick comments (2)
packages/instrumentation-openai/src/instrumentation.ts (2)

433-433: Avoid broad any; narrow the cast to the message type.

Casting the whole message as any reduces type safety. Prefer a narrower cast.

-            } as any,
+            } as Partial<ChatCompletion["choices"][number]["message"]>,

489-497: Reduce any-casts when accumulating function deltas.

You can avoid multiple any-casts by referencing the current tool call and using non-null assertions guarded by your earlier checks.

-            if (toolCall.function?.name) {
-              (
-                result.choices[0].message.tool_calls[toolCall.index] as any
-              ).function.name += toolCall.function.name;
-            }
-            if (toolCall.function?.arguments) {
-              (
-                result.choices[0].message.tool_calls[toolCall.index] as any
-              ).function.arguments += toolCall.function.arguments;
-            }
+            const current = result.choices[0].message.tool_calls![toolCall.index]!;
+            if (toolCall.function?.name) {
+              current.function!.name += toolCall.function.name;
+            }
+            if (toolCall.function?.arguments) {
+              current.function!.arguments += toolCall.function.arguments;
+            }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 755cd47 and 6340b26.

📒 Files selected for processing (1)
  • packages/instrumentation-openai/src/instrumentation.ts (8 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
packages/instrumentation-openai/src/instrumentation.ts (3)
packages/instrumentation-openai/src/image-wrappers.ts (3)
  • wrapImageGeneration (399-456)
  • wrapImageEdit (458-523)
  • wrapImageVariation (525-585)
packages/traceloop-sdk/src/lib/tracing/decorators.ts (1)
  • tool (276-282)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and test
🔇 Additional comments (4)
packages/instrumentation-openai/src/instrumentation.ts (4)

360-377: Good guard on tools to avoid runtime/type errors.

Filtering out non-function tools and validating tool.function prevents undefined access in v5+. This also aligns emitted attributes to function tools only.


410-417: APIPromiseType for streaming is reasonable (await resolves to Stream).

Typing the streaming promise as APIPromiseType<Stream<...>> fits v4/v5 behavior and works with for-await; combined with the _thenUnwrap fallback (see below), this should be robust.
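
The dual await/`_thenUnwrap` behavior can be illustrated with a small stand-in (a mock, not the SDK's actual `APIPromise`; the chunk type and helper name are invented for the example):

```typescript
// Stand-in for APIPromiseType<Stream<T>>: awaitable like a plain Promise,
// carries _thenUnwrap, and resolves to an async-iterable "stream".
function makeApiPromise<T>(chunks: T[]): Promise<AsyncIterable<T>> & {
  _thenUnwrap<U>(fn: (v: AsyncIterable<T>) => U): Promise<U>;
} {
  const stream: AsyncIterable<T> = {
    async *[Symbol.asyncIterator]() {
      for (const c of chunks) yield c;
    },
  };
  const base = Promise.resolve(stream);
  return Object.assign(base, {
    _thenUnwrap: <U>(fn: (v: AsyncIterable<T>) => U) => base.then(fn),
  });
}

async function main() {
  const received: string[] = [];
  const stream = await makeApiPromise(["he", "llo"]); // plain await works
  for await (const chunk of stream) received.push(chunk); // for-await works
  console.log(received.join("")); // hello
}

main();
```

Because the object is a genuine Promise with `_thenUnwrap` layered on top, both the instrumentation's unwrap path and ordinary consumer code behave the same way.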


703-711: Good: emit tool_call attributes only for function tools.

This avoids emitting attributes for non-function tools and protects against shape differences in v5.


64-79: Ignore guard suggestion for manuallyInstrument
After reviewing the implementation and usage across our instrumentations, manuallyInstrument is always invoked with the expected module shape (e.g. in manuallyInitInstrumentations we only call it when instrumentModules.openAI is provided). None of our other instrumentations (Anthropic, Cohere, etc.) add optional guarding, so adding those checks here would introduce inconsistency without clear benefit. Keep the current unguarded implementation.

Likely an incorrect or invalid review comment.

Comment on lines +80 to 108
+    if (openaiModule.Images) {
       this._wrap(
-        (module as any).OpenAIApi.prototype,
-        "createCompletion",
-        this.patchOpenAI("completion", "v3"),
+        openaiModule.Images.prototype,
+        "generate",
+        wrapImageGeneration(
+          this.tracer,
+          this._config.uploadBase64Image,
+          this._config,
+        ),
       );
-    } else {
       this._wrap(
-        module.Chat.Completions.prototype,
-        "create",
-        this.patchOpenAI("chat"),
+        openaiModule.Images.prototype,
+        "edit",
+        wrapImageEdit(
+          this.tracer,
+          this._config.uploadBase64Image,
+          this._config,
+        ),
       );
       this._wrap(
-        module.Completions.prototype,
-        "create",
-        this.patchOpenAI("completion"),
+        openaiModule.Images.prototype,
+        "createVariation",
+        wrapImageVariation(
+          this.tracer,
+          this._config.uploadBase64Image,
+          this._config,
+        ),
       );
-
-      if (module.Images) {
-        this._wrap(
-          module.Images.prototype,
-          "generate",
-          wrapImageGeneration(
-            this.tracer,
-            this._config.uploadBase64Image,
-            this._config,
-          ),
-        );
-        this._wrap(
-          module.Images.prototype,
-          "edit",
-          wrapImageEdit(
-            this.tracer,
-            this._config.uploadBase64Image,
-            this._config,
-          ),
-        );
-        this._wrap(
-          module.Images.prototype,
-          "createVariation",
-          wrapImageVariation(
-            this.tracer,
-            this._config.uploadBase64Image,
-            this._config,
-          ),
-        );
-      }
     }

🛠️ Refactor suggestion

Guard Images instrumentation and support namespaced module shape.

openaiModule.Images is assumed to exist; if a namespaced shape is passed (e.g., { OpenAI: { Images } }), or Images is absent, _wrap may throw.

Apply this diff to reuse the computed root and guard method presence:

-    if (openaiModule.Images) {
+    const root = (module as any)?.OpenAI ?? (module as any) ?? openaiModule;
+    if (root?.Images?.prototype) {
-      this._wrap(
-        openaiModule.Images.prototype,
+      this._wrap(
+        root.Images.prototype,
         "generate",
         wrapImageGeneration(
           this.tracer,
           this._config.uploadBase64Image,
           this._config,
         ),
       );
-      this._wrap(
-        openaiModule.Images.prototype,
+      this._wrap(
+        root.Images.prototype,
         "edit",
         wrapImageEdit(
           this.tracer,
           this._config.uploadBase64Image,
           this._config,
         ),
       );
-      this._wrap(
-        openaiModule.Images.prototype,
+      this._wrap(
+        root.Images.prototype,
         "createVariation",
         wrapImageVariation(
           this.tracer,
           this._config.uploadBase64Image,
           this._config,
         ),
       );

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In packages/instrumentation-openai/src/instrumentation.ts around lines 80 to
108, the code assumes openaiModule.Images exists and uses it directly which
breaks for namespaced module shapes (e.g., { OpenAI: { Images } }) or when
Images is absent; compute a moduleRoot = openaiModule.OpenAI ?? openaiModule,
then check that moduleRoot.Images is defined before attempting to wrap;
additionally verify the prototype and specific methods (generate, edit,
createVariation) exist on moduleRoot.Images.prototype before calling this._wrap
so you reuse the computed root and guard method presence to avoid throwing.
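The root-resolution guard described above can be sketched in isolation. This is a minimal, self-contained sketch with illustrative module shapes, not the SDK's actual typings:

```typescript
// Illustrative module shapes; the real openai module exports differ.
type ImagesClass = { prototype: Record<string, unknown> };
type OpenAIModule = { Images?: ImagesClass; OpenAI?: OpenAIModule };

// Resolve the root whether the module is flat or namespaced,
// and only report Images as wrappable when its prototype exists.
function resolveImages(mod: OpenAIModule): ImagesClass | undefined {
  const root = mod.OpenAI ?? mod;
  return root.Images?.prototype ? root.Images : undefined;
}

const flat: OpenAIModule = { Images: { prototype: { generate: () => null } } };
const namespaced: OpenAIModule = { OpenAI: flat };
console.log(resolveImages(flat) !== undefined); // true
console.log(resolveImages(namespaced) !== undefined); // true
console.log(resolveImages({}) !== undefined); // false
```

The same pattern extends to checking each method (`generate`, `edit`, `createVariation`) on the resolved prototype before calling `_wrap`.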

const module = new InstrumentationNodeModuleDefinition(
"openai",
[">=3.1.0 <5"],
[">=4 <6"],

💡 Verification agent

🧩 Analysis chain

Confirm intentional drop of auto-instrumentation for OpenAI v3.x and update docs accordingly.

The version range changed to ">=4 <6". patch() still contains v3 fallbacks, but init() will no longer auto-patch v3.x. If this is intentional, please update README and migration notes; otherwise, consider keeping ">=3.1.0 <6".

Run this script to spot lingering docs or code paths that still imply v3 auto-support:


🏁 Script executed:

#!/bin/bash
set -euo pipefail
rg -n -S 'OpenAIApi|createChatCompletion|createCompletion|v3\.1\.0|>=3\.1\.0 <5|v3' -g '!*node_modules' -g '!*dist'

Length of output: 87657


Confirm Intentional v3 Auto-Instrumentation Removal and Update Docs

It looks like init() now only targets “>=4 <6” (so v3.x will no longer be auto-patched), while the internal patchOpenAI(…, "v3") fallbacks remain available for manual use. If this change is deliberate, please:

• Update your README under packages/instrumentation-openai to call out that v3.x is no longer auto-instrumented and must be patched manually.
• Revise migration notes or CHANGELOG to document the breaking change (dropping v3.x auto-support, new version range “>=4 <6”).
• Optionally, add a helper snippet or example showing how to invoke patchOpenAI("chat", "v3") for users still on v3.x.

If v3.x should remain auto-patched, consider restoring the version range to “>=3.1.0 <6” in the init() matcher.

🤖 Prompt for AI Agents
In packages/instrumentation-openai/src/instrumentation.ts around line 114 the
init() matcher was changed to "[">=4 <6"]" which removes automatic v3.x
patching; if this is intentional, update packages/instrumentation-openai/README
to state v3.x is no longer auto-instrumented and must be patched manually, add a
CHANGELOG/migration note documenting the breaking change and new matcher range,
and add a short example snippet showing how to call patchOpenAI("chat", "v3")
for v3 users; if v3 auto-patching should remain, revert the matcher to include
v3 (e.g., ">=3.1.0 <6") in init().
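The practical effect of the matcher change can be illustrated with a tiny check. The real instrumentation evaluates full semver ranges; this sketch only compares the major version to show which installed SDK majors the new `[">=4 <6"]` matcher would auto-patch:

```typescript
// Illustrative only: which installed majors the new matcher [">=4 <6"] auto-patches.
function autoPatched(version: string): boolean {
  const major = Number(version.split(".")[0]);
  return major >= 4 && major < 6;
}

console.log(autoPatched("3.3.0")); // false: v3.x now requires manual patching
console.log(autoPatched("4.52.0")); // true
console.log(autoPatched("5.12.2")); // true
```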

@ellipsis-dev ellipsis-dev bot left a comment

Important

Looks good to me! 👍

Reviewed 86d5d21 in 2 minutes and 45 seconds. Click for details.
  • Reviewed 2801 lines of code in 13 files
  • Skipped 1 files when reviewing.
  • Skipped posting 10 draft comments. View those below.
1. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-calculate-correct-tokens-for-different-quality-levels_3292443992/recording.har:1
  • Draft comment:
    This HAR file appears to be an auto-generated network recording capturing the expected request and response for token calculation. Details match expectations. No issues were found.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, stating that the HAR file matches expectations and no issues were found. It doesn't provide any actionable feedback or suggestions for improvement.
2. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har:1
  • Draft comment:
    This HAR file correctly captures the network exchange for a chat completion with logprobs enabled. The headers and base64 response are accurate and consistent with expectations.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states that the HAR file captures the network exchange correctly, which does not align with the rules for useful comments.
3. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-stream-chat-completion_2713905208/recording.har:1
  • Draft comment:
    The recording for stream chat completion with logprobs shows proper segmentation of event stream data and timing. No anomalies were observed.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states that no anomalies were observed, which does not align with the rules for useful comments.
4. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-chat_1427850401/recording.har:1
  • Draft comment:
    The HAR file for the chat completion test captures the expected request and response details, including headers and the base64 encoded response.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, describing what the HAR file captures. It doesn't provide any actionable feedback or suggestions for improvement.
5. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har:1
  • Draft comment:
    This HAR recording for non-chat completion verifies that the prompt parameters are sent correctly and that the base64 encoded response is captured appropriately.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply describes what the HAR recording does, which is not necessary for the PR review process.
6. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-tool-calling_1747151373/recording.har:1
  • Draft comment:
    The HAR file for tool calling in a chat shows the correct structure for function calls. The request and response capture the function definition and its arguments as anticipated.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply states that the HAR file shows the correct structure, which does not align with the rules for useful comments.
7. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-generation_2726543658/recording.har:1
  • Draft comment:
    The image generation HAR file captures an API call for generating an image with accurate request body and response containing the generated image URL.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, describing what the HAR file captures without providing any actionable feedback or suggestions. It doesn't ask for any specific changes or confirm any intentions, nor does it suggest improvements or optimizations.
8. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat_72904453/recording.har:1
  • Draft comment:
    This HAR document for streaming chat records the expected event stream responses, with correctly segmented data chunks and proper timing information.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative and does not provide any actionable feedback or suggestions for improvement. It simply describes what the HAR document does, which is not necessary for a code review comment.
9. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-completion_1484001317/recording.har:1
  • Draft comment:
    The HAR file for streaming completion (non-chat) captures the event stream responses with base64 encoding, and the token responses along with timing data appear consistent.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment appears to be purely informative, describing the contents of a HAR file without providing any actionable feedback or suggestions for improvement. It does not align with the rules for useful comments, as it does not ask for clarification, suggest changes, or highlight potential issues.
10. packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-function_call-attributes-in-span-for-stream-completion-when-multiple-tools-cal_2278551756/recording.har:1
  • Draft comment:
    This HAR file for testing function call behavior when multiple tools are involved in a streamed completion shows the correct sequence of function call events and the expected JSON fragments for function call arguments.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, describing the contents of a HAR file without providing any actionable feedback or suggestions. It doesn't ask for any specific changes or confirm any intentions, which violates the rules for comments.

Workflow ID: wflow_XORbZTDTjzaiSumv


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🔭 Outside diff range comments (5)
packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-function-calling_1703714575/recording.har (1)

101-121: Sanitize volatile headers and cookies in HAR fixtures

Please update the Polly beforePersist hook in instrumentation.test.ts to strip out all sensitive or ephemeral headers and cookies, not just authorization. Specifically:

  • Filter both request.headers and response.headers to remove:
    • Authorization, Cookie, Set-Cookie, Content-Length, Date, X-Request-Id, CF-Ray
    • Any SDK/env-specific headers starting with x-stainless-
  • Clear out request.cookies and response.cookies arrays to drop Cloudflare cookies (__cf_bm, _cfuvid)
  • After merging, re-record tests or run a one-off sanitizer against existing .har fixtures

Suggested diff in packages/instrumentation-openai/test/instrumentation.test.ts:

--- a/packages/instrumentation-openai/test/instrumentation.test.ts
+++ b/packages/instrumentation-openai/test/instrumentation.test.ts
@@ beforeEach(function () {
-    server.any().on("beforePersist", (_req, recording) => {
-      recording.request.headers = recording.request.headers.filter(
-        ({ name }) => name !== "authorization",
-      );
-    });
+    server.any().on("beforePersist", (_req, recording) => {
+      const volatileNames = [
+        "authorization",
+        "cookie",
+        "set-cookie",
+        "content-length",
+        "date",
+        "x-request-id",
+        "cf-ray",
+      ];
+      const isStainless = (n: string) => n.toLowerCase().startsWith("x-stainless-");
+
+      const sanitize = (hdrs: Array<{ name: string; value: string }>) =>
+        hdrs.filter((h) => {
+          const name = h.name.toLowerCase();
+          return !volatileNames.includes(name) && !isStainless(h.name);
+        });
+
+      recording.request.headers  = sanitize(recording.request.headers);
+      recording.response.headers = sanitize(recording.response.headers);
+      recording.request.cookies  = [];
+      recording.response.cookies = [];
+    });

Apply this change, then re-record or sanitize existing fixtures to remove all volatile values before committing.

packages/instrumentation-openai/test/instrumentation.test.ts (2)

727-733: Fix undefined identifier: imageOpenai is not declared (even though test is skipped).

TypeScript will still flag this unresolved symbol. Use the existing openai client.

-    await imageOpenai.images.edit({
+    await openai.images.edit({

787-791: Same undefined identifier here — use openai.

-    await imageOpenai.images.createVariation({
+    await openai.images.createVariation({
packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har (1)

103-121: Centralize Fixture Sanitization for Volatile Fields

Rather than manually patching each HAR, add a global sanitizer in your Polly setup to strip/redact these across all recordings. We’ve identified hundreds of fixtures that include volatile fields (cookies, set-cookie headers, Cloudflare tokens, request IDs, dates, project IDs, etc.).

In your test bootstrap (e.g. test/helpers/pollySetup.ts), configure a hook to drop them before persisting:

import { Polly } from '@pollyjs/core';

export function setupPolly() {
  return new Polly('RecordingTests', {
    adapters: ['node-http', 'xhr'],
    persister: 'fs',
    recordIfMissing: true,
    hooks: {
      afterPersist: recording => {
        for (const entry of recording.har.log.entries) {
          const { response } = entry;
          // remove all cookies
          response.cookies = [];
          // strip volatile headers
          const blacklist = [
            'set-cookie',
            'cookie',
            '__cf_bm',
            '_cfuvid',
            'x-request-id',
            'cf-ray',
            'date',
            'openai-project'
          ];
          response.headers = response.headers.filter(
            h => !blacklist.includes(h.name.toLowerCase())
          );
        }
      }
    }
  });
}

Next steps:

  • Import and call setupPolly() in every package’s test setup so all HARs are sanitized automatically.
  • One-time: regenerate affected recordings (or run a script to scrub existing fixtures).
  • Remove manual diffs from individual fixtures.

This will eliminate noise, prevent sensitive data leakage, and keep your snapshots stable.

packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har (1)

153-159: Sanitize OpenAI and Traceloop identifiers in HAR fixtures

We audited all HAR recordings under packages/**/recordings and discovered real values leaking in tests:

  • openai-organization: "traceloop"
  • openai-project: "proj_tzz1TbPPOXaf6j9tEkVUBIAa"
  • authorization header: "Bearer tl_9981e7218948437584e08e7b724304d8" (in packages/traceloop-sdk/recordings/...)

These identifiers and tokens shouldn’t be committed verbatim. Please replace them with placeholders in your fixtures, for example:

  • openai-organization: <ORGANIZATION_ID>
  • openai-project: <PROJECT_ID>
  • authorization: <AUTH_TOKEN>

Affected files (patterns):

packages/instrumentation-*/**/recordings/**/*.har
packages/traceloop-sdk/recordings/**/*.har

♻️ Duplicate comments (5)
packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-chat_1427850401/recording.har (1)

101-121: Sanitize HAR: remove cookies and volatile headers to avoid fixture churn and leakage.

Same issues as other fixtures: cookies, date, request-id, cf-ray, content-length, and x-stainless-* should not be persisted. Please rely on the Polly sanitizer to drop them before persisting.

Use the repo-wide scan provided in my other comment to confirm removals.

Also applies to: 197-205, 125-127, 189-191, 219-221, 35-42, 50-57, 69-72, 83-83

packages/instrumentation-openai/test/instrumentation.test.ts (2)

112-129: Harden Polly sanitizer: drop cookies and volatile headers (security + stability).

You only remove authorization. Persisted HARs still contain cookies, request IDs, dates, cf-ray and x-stainless-* SDK/env headers, which are sensitive and/or highly volatile. Strip them to reduce churn and avoid leaking identifiers.

Apply this diff:

       const { server } = this.polly as Polly;
       server.any().on("beforePersist", (_req, recording) => {
-        recording.request.headers = recording.request.headers.filter(
-          ({ name }: { name: string }) => name !== "authorization",
-        );
+        // Drop auth, cookies and volatile headers to reduce fixture churn and avoid persisting identifiers
+        const dropHeader = (name: string) =>
+          [
+            "authorization",
+            "cookie",
+            "set-cookie",
+            "date",
+            "cf-ray",
+            "x-request-id",
+            "via",
+            "x-envoy-upstream-service-time",
+            "openai-processing-ms",
+            "content-length",
+            // OpenAI SDK/client/env specifics that change frequently
+            "user-agent",
+            "x-stainless-arch",
+            "x-stainless-lang",
+            "x-stainless-os",
+            "x-stainless-runtime",
+            "x-stainless-runtime-version",
+            "x-stainless-package-version",
+            "x-stainless-retry-count",
+          ].includes(name.toLowerCase());
+
+        recording.request.headers = (recording.request.headers || []).filter(
+          ({ name }: { name: string }) => !dropHeader(name),
+        );
+        recording.response.headers = (recording.response.headers || []).filter(
+          ({ name }: { name: string }) => !dropHeader(name),
+        );
+        // Drop cookies entirely
+        recording.request.cookies = [];
+        recording.response.cookies = [];
       });

Optionally also type the Mocha context:
TypeScript (apply outside this range):
function beforeEach(this: Mocha.Context & { polly?: Polly }) { … }


542-550: Make tool_call arguments parsing robust (string vs object).

The attribute may be a JSON string or an already-parsed object depending on SDK/version. Parse conditionally to avoid runtime errors.

-    const parsedArgs = JSON.parse(
-      completionSpan.attributes[
-        `${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
-      ]! as string,
-    );
+    const rawArgs0 =
+      completionSpan.attributes[
+        `${SpanAttributes.LLM_COMPLETIONS}.0.tool_calls.0.arguments`
+      ]!;
+    const parsedArgs =
+      typeof rawArgs0 === "string" ? JSON.parse(rawArgs0) : rawArgs0;
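The guard in the diff above can be factored into a small helper. This is a sketch assuming the attribute value is either a JSON string (older SDKs) or an already-parsed object (newer SDKs), as the comment describes:

```typescript
// Parse tool_call arguments whether they arrive as a JSON string or an object.
function parseToolCallArgs(raw: unknown): Record<string, unknown> {
  return typeof raw === "string"
    ? JSON.parse(raw)
    : (raw as Record<string, unknown>);
}

console.log(parseToolCallArgs('{"location":"Boston"}').location); // "Boston"
console.log(parseToolCallArgs({ location: "Boston" }).location); // "Boston"
```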
packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-function_call-attributes-in-span-for-stream-completion-when-multiple-tools-cal_2278551756/recording.har (1)

101-121: Sanitize streaming fixture: strip cookies and volatile headers.

This recording also persists __cf_bm/_cfuvid cookies, date, x-request-id, cf-ray, content-length, and x-stainless-* headers. Please rely on the improved Polly sanitizer to drop them before persisting.

Use the scan script from my first HAR comment to confirm these headers/cookies are removed after re-recording.

Also applies to: 197-205, 124-126, 188-190, 218-220, 35-42, 50-57, 69-72, 83-83

packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-completion_1484001317/recording.har (1)

101-121: Remove cookies and volatile headers from this completion streaming HAR as well.

Same sanitization needed here to avoid committing sensitive cookies and unstable headers (date, x-request-id, cf-ray, content-length, x-stainless-*).

Reuse the repo-wide rg scan from earlier to validate that fixtures are clean.

Also applies to: 217-224, 124-126, 208-210, 234-236, 35-42, 50-57, 69-72, 83-84

🧹 Nitpick comments (5)
packages/instrumentation-openai/test/instrumentation.test.ts (2)

100-106: Silence noisy console.log in CI (guard with DEBUG).

Keep the helpful local hint but avoid CI noise by gating on an env var.

-    console.log("Using node-fetch for Polly.js compatibility");
+    if (process.env.DEBUG?.includes("polly")) {
+      console.log("Using node-fetch for Polly.js compatibility");
+    }

59-68: Heads-up: requestTimeout/socketTimeout = 0 can hang indefinitely.

This is fine when paired with per-test timeouts (you added for images). Consider adding a global safety timeout or narrowing this to recording mode to avoid deadlocks in CI for other tests.

packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har (1)

69-72: Drop content-length from request headers to reduce fixture churn.

content-length will change anytime the JSON payload formatting changes. Prefer omitting it from recordings.

Apply this diff to remove it from this fixture:

-            {
-              "_fromType": "array",
-              "name": "content-length",
-              "value": "117"
-            },

Combined with not matching on headers, this makes re-records less brittle.

packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har (2)

66-67: Avoid pinning runtime version in fixture to reduce churn.

Hardcoding x-stainless-runtime-version to a specific Node version makes the fixture needlessly environment-dependent.

Apply this diff to use a neutral placeholder (or consider dropping the header entirely if Polly matching doesn’t require it):

-              "value": "v20.10.0"
+              "value": "SANITIZED"

247-247: Normalize headersSize and timing fields.

headersSize, startedDateTime, time, and wait add noise and are irrelevant to test semantics; consider zeroing or fixing them.

Apply:

-          "headersSize": 1449,
+          "headersSize": 0,
@@
-        "startedDateTime": "2025-08-14T14:45:17.130Z",
-        "time": 2934,
+        "startedDateTime": "1970-01-01T00:00:00.000Z",
+        "time": 0,
@@
-          "wait": 2934
+          "wait": 0

Also applies to: 253-254, 262-263

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these settings in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6340b26 and 86d5d21.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (13)
  • packages/instrumentation-openai/package.json (1 hunks)
  • packages/instrumentation-openai/test/instrumentation.test.ts (8 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-calculate-correct-tokens-for-different-quality-levels_3292443992/recording.har (1 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har (12 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-stream-chat-completion_2713905208/recording.har (1 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-chat_1427850401/recording.har (12 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har (12 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-function-calling_1703714575/recording.har (12 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-generation_2726543658/recording.har (1 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat_72904453/recording.har (1 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-completion_1484001317/recording.har (12 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-tool-calling_1747151373/recording.har (12 hunks)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-function_call-attributes-in-span-for-stream-completion-when-multiple-tools-cal_2278551756/recording.har (13 hunks)
✅ Files skipped from review due to trivial changes (5)
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-calculate-correct-tokens-for-different-quality-levels_3292443992/recording.har
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-streaming-chat_72904453/recording.har
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-image-generation_2726543658/recording.har
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-stream-chat-completion_2713905208/recording.har
  • packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-tool-calling_1747151373/recording.har
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/instrumentation-openai/package.json
🧰 Additional context used
🧬 Code Graph Analysis (1)
packages/instrumentation-openai/test/instrumentation.test.ts (1)
packages/ai-semantic-conventions/src/SemanticAttributes.ts (1)
  • SpanAttributes (17-59)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and test
🔇 Additional comments (6)
packages/instrumentation-openai/test/instrumentation.test.ts (1)

662-664: LGTM: Extended timeout for image generation.

Reasonable guard for a slow endpoint during recording/replay.

packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har (1)

31-31: UA/version headers reflect SDK 5.12.2 — consistent with PR goal.

The updated User-Agent and x-stainless-package-version align with the SDK bump to 5.x compatibility.

Also applies to: 50-52

packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har (4)

29-37: Headers updated for OpenAI JS v5.12.2 look correct.

The user-agent and x-stainless-* headers reflect the 5.x SDK and are consistent with the PR’s objectives.

Also applies to: 39-42, 50-57


89-90: Minified request body LGTM.

Compact JSON for postData.text is fine and stable for replay.


11-11: Non-actionable: _id changed.

HAR entry _id changes are expected when re-recording. No concerns.


95-101: HAR fixture payload validated successfully

The base64-encoded, gzip-compressed response body decompresses and parses as valid JSON, confirming the recording fixture is intact.

Comment on lines +31 to 37
"value": "OpenAI/JS 5.12.2"
},
{
"_fromType": "array",
"name": "x-stainless-lang",
"value": "js"
"name": "x-stainless-arch",
"value": "arm64"
},

🛠️ Refactor suggestion

Don’t match or persist dynamic request headers (user-agent, x-stainless-*, content-length).

These headers vary by SDK patch, machine arch, runtime, and body size. Recording them causes fixture churn; matching on them can break playback when SDK/runtime changes.

Recommended:

  • Ignore request headers in request matching (or at least ignore these specific ones).
  • Optionally strip them before persisting.

Example Polly config in your test setup:

// in test setup where Polly is configured
polly.configure({
  matchRequestsBy: {
    // Avoid matching on headers entirely (simple & robust)
    headers: false,
  },
});

If you must match on headers, provide an allowlist and exclude:

  • user-agent
  • content-length
  • x-stainless-arch
  • x-stainless-lang
  • x-stainless-package-version
  • x-stainless-retry-count
  • x-stainless-runtime-version

Also applies to: 40-42, 50-57, 66-67, 69-72, 83-83

🤖 Prompt for AI Agents
packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har
lines ~31-37 (and similarly at 40-42, 50-57, 66-67, 69-72, 83): the recording
currently includes dynamic request headers (user-agent, x-stainless-*,
content-length) which cause fixture churn; update the test/polly setup to stop
matching or persisting these headers by disabling header matching
(matchRequestsBy.headers = false) or by stripping/blacklisting user-agent,
content-length and all x-stainless-* headers before saving the HAR so recordings
are stable across SDK/runtime variations.

Comment on lines +89 to 90
"text": "{\"messages\":[{\"role\":\"user\",\"content\":\"Tell me a joke about OpenTelemetry\"}],\"model\":\"gpt-3.5-turbo\",\"logprobs\":true}"
},

💡 Verification agent

🧩 Analysis chain

Future-proof the test model (gpt-3.5-turbo is deprecated).

For any future re-recording against the API, switch to a current model that supports logprobs to avoid breakage.

Example edit in-place (verify the chosen model supports logprobs before changing):

-            "text": "{\"messages\":[{\"role\":\"user\",\"content\":\"Tell me a joke about OpenTelemetry\"}],\"model\":\"gpt-3.5-turbo\",\"logprobs\":true}"
+            "text": "{\"messages\":[{\"role\":\"user\",\"content\":\"Tell me a joke about OpenTelemetry\"}],\"model\":\"gpt-4o-mini\",\"logprobs\":true}"

Would you like me to update the broader test suite to a current, logprobs-capable model and re-record the fixtures?


🌐 Web query:

Which current OpenAI chat completion models support the `logprobs` parameter as of now?

💡 Result:

Short answer — and a quick verification date:

  • As of August 14, 2025, OpenAI’s Chat Completions API exposes the logprobs parameter for chat completions (i.e., you can pass logprobs and top_logprobs). The API docs also call out a specific exception: the gpt-4-vision-preview model does not currently support the chat logprobs option. (community.openai.com)

Notes and next steps

  • In practice logprobs support is model-dependent (text-only chat models generally support it; some preview / vision / beta models may not). If you need a definitive per-model yes/no list I can check specific model names you care about (or run a small compatibility test request for any model and report whether the API accepts logprobs for that model).


Update deprecated test model to a current logprobs-compatible model

gpt-3.5-turbo has been deprecated and may break future recordings. As of August 14, 2025, all text-only chat models (except gpt-4-vision-preview) support logprobs. Please replace the model in the HAR fixture with a supported one (e.g. gpt-4o-mini) and re-record the test.

• File: packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har
– Lines 89–90: update the model name

Example diff (verify gpt-4o-mini or your chosen model supports logprobs before re-recording):

-            "text": "{\"messages\":[{\"role\":\"user\",\"content\":\"Tell me a joke about OpenTelemetry\"}],\"model\":\"gpt-3.5-turbo\",\"logprobs\":true}"
+            "text": "{\"messages\":[{\"role\":\"user\",\"content\":\"Tell me a joke about OpenTelemetry\"}],\"model\":\"gpt-4o-mini\",\"logprobs\":true}"

Once updated, please re-record the fixture (and consider updating other logprobs tests to use a non-vision, text-only model).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"text": "{\"messages\":[{\"role\":\"user\",\"content\":\"Tell me a joke about OpenTelemetry\"}],\"model\":\"gpt-3.5-turbo\",\"logprobs\":true}"
},
"text": "{\"messages\":[{\"role\":\"user\",\"content\":\"Tell me a joke about OpenTelemetry\"}],\"model\":\"gpt-4o-mini\",\"logprobs\":true}"
},
🤖 Prompt for AI Agents
In
packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-emit-logprobs-span-event-for-chat-completion_1028826680/recording.har
around lines 89-90, the HAR fixture uses the deprecated model "gpt-3.5-turbo";
update that model field to a current text-only logprobs-compatible model such as
"gpt-4o-mini" (or another verified logprobs-supported model), save the HAR, then
re-run/re-record the test fixture so the recorded responses match the new model
and update any other logprobs fixtures similarly.

Comment on lines +69 to 72
"_fromType": "array",
"name": "content-length",
"value": "80"
},

💡 Verification agent

🧩 Analysis chain

Request header “content-length” is often environment-specific; consider removing or ignoring in matching.

This header tends to vary and can cause fixture churn. If your Polly config matches on headers, consider excluding this one; otherwise, it’s safe to keep.

If you prefer to sanitize it in the recording, apply:

-              "name": "content-length",
-              "value": "80"
+              "name": "content-length",
+              "value": "SANITIZED"

To verify the current matchRequestsBy settings, run:

Also applies to: 83-83


🏁 Script executed:

#!/bin/bash
# Search for Polly setup and matching config
rg -n -A3 -i 'new Polly|setupPolly|matchRequestsBy|matchBy|polly|pollyjs' -g '!node_modules'

Length of output: 102868


Exclude content-length from persisted headers to prevent fixture churn

Your Polly setup already uses matchRequestsBy: { headers: false }, so header differences won’t affect matching. However, recordings still include the content-length header, which can vary by environment and trigger unnecessary diffs. Update your beforePersist hook in packages/instrumentation-openai/test/instrumentation.test.ts (and any other packages with similar hooks) to filter out content-length as well as authorization:

packages/instrumentation-openai/test/instrumentation.test.ts
 server.any().on("beforePersist", (_req, recording) => {
-  recording.request.headers = recording.request.headers.filter(
-    ({ name }: { name: string }) => name !== "authorization"
-  );
+  recording.request.headers = recording.request.headers.filter(
+    ({ name }: { name: string }) =>
+      name.toLowerCase() !== "authorization" &&
+      name.toLowerCase() !== "content-length"
+  );
 });

If you’d rather sanitize existing recordings directly, apply a diff to each HAR:

-              "name": "content-length",
-              "value": "80"
+              "name": "content-length",
+              "value": "SANITIZED"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"_fromType": "array",
"name": "content-length",
"value": "80"
},
server.any().on("beforePersist", (_req, recording) => {
  recording.request.headers = recording.request.headers.filter(
    ({ name }: { name: string }) =>
      name.toLowerCase() !== "authorization" &&
      name.toLowerCase() !== "content-length"
  );
});
🤖 Prompt for AI Agents
In
packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har
around lines 69 to 72, the persisted HAR contains a fluctuating "content-length"
header; update the beforePersist hook in
packages/instrumentation-openai/test/instrumentation.test.ts (and any similar
hooks) to remove "content-length" from recorded request/response headers in
addition to "authorization" so recordings no longer include that header, or
alternatively remove the "content-length" entries directly from existing HAR
recordings to prevent fixture churn.

Comment on lines +105 to 112
"expires": "2025-08-14T15:15:20.000Z",
"httpOnly": true,
"name": "__cf_bm",
"path": "/",
"sameSite": "None",
"secure": true,
"value": "dpDj_rzXilyhr6.Gr7BsgjxBWPXkH0QzxFYSJODSL1k-1754899572-1.0.1.1-Dm62jBQ.gvLCAHxZ8fwS7jknxKPsax7.PxNAd66oLv9PlHLplZwv0WKD4PRjjs6VJVQO12VLR6yDX79MAkUA9cRa2K0QQ05.tldNjsAB14U"
"value": "8Zo0W1WdNUEe5N8pLRcrqa6tcfMK1dmXFUFuKsvOciQ-1755182720-1.0.1.1-KC0VBaOYmgjrg8anx2pC19gIaGdhy_5UB3Tw9XHfgK_ehMDzMLdcCHYTKgixPGhAydPOuuNWHT37sV_7FR32jGcfox_SCkkVMgNwG9ZXCsQ"
},

⚠️ Potential issue

Redact cookies and Set-Cookie values; these are sensitive and highly ephemeral.

Committing actual cookie values (__cf_bm, _cfuvid) is unnecessary and risky. Redact or drop them to reduce exposure and churn.

Apply:

@@
-              "expires": "2025-08-14T15:15:20.000Z",
+              "expires": "REDACTED",
@@
-              "value": "8Zo0W1WdNUEe5N8pLRcrqa6tcfMK1dmXFUFuKsvOciQ-1755182720-1.0.1.1-KC0VBaOYmgjrg8anx2pC19gIaGdhy_5UB3Tw9XHfgK_ehMDzMLdcCHYTKgixPGhAydPOuuNWHT37sV_7FR32jGcfox_SCkkVMgNwG9ZXCsQ"
+              "value": "<redacted>"
@@
-              "value": "KcexFXGp8n4hoEyANEo3kEN_URABlGh9VeqUBny1ZjY-1755182720191-0.0.1.1-604800000"
+              "value": "<redacted>"
@@
-              "value": "__cf_bm=8Zo0W1WdNUEe5N8pLRcrqa6tcfMK1dmXFUFuKsvOciQ-1755182720-1.0.1.1-KC0VBaOYmgjrg8anx2pC19gIaGdhy_5UB3Tw9XHfgK_ehMDzMLdcCHYTKgixPGhAydPOuuNWHT37sV_7FR32jGcfox_SCkkVMgNwG9ZXCsQ; path=/; expires=Thu, 14-Aug-25 15:15:20 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None"
+              "value": "SANITIZED; path=/; expires=Thu, 01-Jan-70 00:00:00 GMT; domain=.api.openai.com; HttpOnly; Secure; SameSite=None"
@@
-              "value": "_cfuvid=KcexFXGp8n4hoEyANEo3kEN_URABlGh9VeqUBny1ZjY-1755182720191-0.0.1.1-604800000; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None"
+              "value": "SANITIZED; path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None"

Also applies to: 114-121, 218-225

🤖 Prompt for AI Agents
In
packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-attributes-in-span-for-completion_322013537/recording.har
around lines 105-112 (and also apply same change to 114-121 and 218-225), the
HAR contains real Set-Cookie cookie values (__cf_bm, _cfuvid) which are
sensitive and ephemeral; remove or redact the cookie "value" fields (or remove
the entire cookie entries) and replace them with a stable placeholder like
"[REDACTED]" or omit the value property so tests keep structure but no real
secrets are committed. Ensure the modified entries remain valid HAR JSON (keep
name, path, sameSite, secure flags if needed) and run the test recordings check
to confirm no structural changes break replay.
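The redaction described above can be expressed as two small pure functions over HAR fragments. This is a hedged sketch, not part of the instrumentation package: the field names follow the standard HAR 1.2 cookie/header shape, and the `Set-Cookie` rewrite assumes the `name=value; attr; attr` layout seen in this recording.

```typescript
// Sketch: blank out cookie values and rewrite Set-Cookie headers so only
// non-sensitive attributes (path, domain, flags) survive in the fixture.
type Cookie = { name: string; value: string; [k: string]: unknown };
type Header = { name: string; value: string };

function redactCookies(cookies: Cookie[]): Cookie[] {
  return cookies.map((c) => ({ ...c, value: "REDACTED" }));
}

function redactSetCookieHeaders(headers: Header[]): Header[] {
  return headers.map((h) =>
    h.name.toLowerCase() === "set-cookie"
      ? {
          ...h,
          // Drop the leading name=value pair, keep the attribute tail.
          value: "SANITIZED; " + h.value.split("; ").slice(1).join("; "),
        }
      : h
  );
}
```

Running both over the `cookies` array and the response `headers` array of each HAR entry keeps the structure valid for replay while removing the secret material.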

{
"name": "date",
"value": "Mon, 11 Aug 2025 08:06:12 GMT"
"value": "Thu, 14 Aug 2025 14:45:20 GMT"

🛠️ Refactor suggestion

Sanitize volatile response headers to reduce fixture churn.

These server-generated values change per request and don’t affect instrumentation assertions.

Apply:

-              "value": "Thu, 14 Aug 2025 14:45:20 GMT"
+              "value": "Thu, 01 Jan 1970 00:00:00 GMT"
@@
-              "value": "760"
+              "value": "0"
@@
-              "value": "envoy-router-68955c875d-87tvh"
+              "value": "SANITIZED"
@@
-              "value": "816"
+              "value": "0"
@@
-              "value": "req_2ca041b07b24913daf8c1e435a505663"
+              "value": "SANITIZED"
@@
-              "value": "96f13c2eebe7c22f-TLV"
+              "value": "SANITIZED"

Also applies to: 161-163, 177-179, 181-183, 209-211, 235-237
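The placeholder substitutions above can be generalized into one small sanitizer. This is a sketch under assumptions: the map below covers only the volatile header names visible in these recordings (date, request IDs, Cloudflare ray IDs, processing times); extend it for your own fixtures.

```typescript
// Sketch: map volatile response-header names to fixed placeholder values
// so re-recordings only diff on meaningful changes; stable headers pass
// through untouched.
const PLACEHOLDERS: Record<string, string> = {
  date: "Thu, 01 Jan 1970 00:00:00 GMT",
  "x-request-id": "SANITIZED",
  "cf-ray": "SANITIZED",
  "openai-processing-ms": "0",
};

function sanitizeResponseHeaders(
  headers: { name: string; value: string }[]
): { name: string; value: string }[] {
  return headers.map((h) => {
    const stable = PLACEHOLDERS[h.name.toLowerCase()];
    return stable === undefined ? h : { ...h, value: stable };
  });
}
```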

@galkleinman galkleinman merged commit 588bb01 into main Aug 14, 2025
5 checks passed


Development

Successfully merging this pull request may close these issues.

Support OpenAI SDK v5

3 participants