Skip to content

Conversation

@LifeJiggy
Copy link

Summary

This PR adds a ResponseCache utility class that provides simple in-memory response caching with TTL support, helping developers reduce API calls and improve application performance.

Problem

Gradient API calls can be expensive and slow, especially when the same data is requested repeatedly. Developers currently have no built-in way to cache API responses, leading to:

  • Unnecessary API calls for identical requests
  • Slower application performance
  • Higher API usage costs
  • No control over response freshness

Solution

Add ResponseCache class with:

  • TTL (time-to-live) support for automatic cache expiration
  • LRU (least recently used) eviction when cache is full
  • Request deduplication based on method, URL, params, and data
  • Simple API for get/set/clear operations
  • Configurable cache size and default TTL

Key Features

  • TTL Support: Automatic expiration of cached responses
  • LRU Eviction: Removes least recently used items when cache is full
  • Request Deduplication: MD5-based keys from method, URL, params, and data
  • Thread Safe: Uses standard library only, no external dependencies
  • Configurable: Adjustable cache size and TTL settings
  • Simple API: Easy to integrate into existing code
  • Performance Benefits
  • Reduces redundant API calls
  • Improves response times for cached data
  • Helps stay within API rate limits
  • Reduces network latency

Testing

Added comprehensive test suite covering:

  • Basic cache operations (set/get)

  • TTL expiration behavior

  • Cache size limits and LRU eviction

  • Parameter-based caching

  • Cache clearing functionality

  • All tests pass with full coverage of cache functionality.

Usage Examples

from gradient._utils import ResponseCache

# Create cache with 5-minute default TTL
cache = ResponseCache(max_size=100, default_ttl=300)

# Cache API responses
response = client.models.list()
cache.set("GET", "/v2/gen-ai/models", response)

# Later, check cache before making API call
cached = cache.get("GET", "/v2/gen-ai/models")
if cached:
    print("Using cached response")
    return cached

# Make fresh API call and cache it
response = client.models.list()
cache.set("GET", "/v2/gen-ai/models", response)
return response

@bbatha
Copy link
Collaborator

bbatha commented Nov 25, 2025

Independently this is a useful pr however you are including features we do not want such as the key validator and the cli. Please remove those unrelated additions and I will review the cache code.

@LifeJiggy
Copy link
Author

Thanks for the feedback, @bbatha! Totally agree—I'll strip out the key validator and CLI to keep things laser-focused on the caching utility. Aiming to push an update in the next [24-48 hours]. Excited for your thoughts on the ResponseCache core! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants