⚡️ Speed up method BlockManifest.describe_outputs by 787% in PR #1647 (seg-preview-workflow-block)
#1648
⚡️ This pull request contains optimizations for PR #1647

If you approve this dependent PR, these changes will be merged into the original PR branch `seg-preview-workflow-block`.

📄 787% (7.87x) speedup for `BlockManifest.describe_outputs` in `inference/core/workflows/core_steps/models/foundation/seg_preview/v1.py`

⏱️ Runtime: 1.60 milliseconds → 180 microseconds (best of 88 runs)

📝 Explanation and details
The optimization moves the `OutputDefinition` object creation from inside the `describe_outputs()` method to module-level initialization as a cached constant `_OUTPUTS`. Instead of creating a new list and `OutputDefinition` object on every method call, the optimized version simply returns the pre-computed list.

Key changes:
- Before: created an `OutputDefinition` instance and a list on each call (1018 calls creating 2036 object instantiations in profiler results)
- After: `_OUTPUTS` is computed once at import time and reused across all calls
- Eliminates repeated `OutputDefinition()` constructor calls

Why it's faster:
Python object instantiation has overhead: each `OutputDefinition()` call involves attribute assignment, method resolution, and memory allocation. By pre-computing this static data structure once, we avoid repeated object creation and list construction. The 787% speedup comes from eliminating these allocations entirely.

Test case performance:
The optimization shows consistent 900–1100% improvements across all test cases, with the biggest gains in scenarios that call `describe_outputs()` multiple times (such as the side-effects and list-modification tests), proving the caching approach scales well with the repeated usage patterns typical in workflow execution engines.

✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-pr1647-2025-10-28T21.53.02` and push.