Add comprehensive stream-based API to DocumentOperation functionality #1275

Draft
Copilot wants to merge 2 commits into main from copilot/fix-1172

Copilot AI commented Jun 27, 2025

This PR implements a complete stream-based API for DocumentOperation alongside the existing file-based API, providing developers with flexible options for PDF manipulation scenarios.

New Stream-Based Methods

Loading and Saving

  • LoadStream(Stream, password) - Load a PDF from a stream instead of a file path
  • SaveToStream(Stream) - Save processed PDF directly to a stream

Merging and Processing

  • MergeStream(Stream, pageSelector) - Merge PDF content from streams
  • OverlayStream(LayerStreamConfiguration) - Apply overlay content from streams
  • UnderlayStream(LayerStreamConfiguration) - Apply underlay content from streams
  • AddAttachmentStream(DocumentAttachmentStream) - Add file attachments from streams

Key Features

  • Full backward compatibility - All existing file-based APIs remain unchanged
  • Mixed operations - File and stream operations can be combined in the same workflow
  • Automatic resource management - Temporary files are created and cleaned up automatically
  • Comprehensive testing - 20 total tests (13 existing + 7 new stream tests)
  • Rich documentation - Includes practical examples and usage patterns

Usage Examples

Basic Stream Processing

using var inputStream = new MemoryStream(pdfBytes);
using var outputStream = new MemoryStream();

DocumentOperation
    .LoadStream(inputStream)
    .Linearize() // Optimize for web
    .SaveToStream(outputStream);
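
Reading the result back is a common stumbling point with stream-based output. The PR description does not say whether SaveToStream rewinds the output stream, so the sketch below assumes it does not. WriteDocumentTo is a hypothetical stand-in for SaveToStream that only writes a few bytes; everything else is standard System.IO behavior.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for SaveToStream (an assumption, not the PR's code):
// writes some bytes and leaves the position at the end of the written data.
static void WriteDocumentTo(Stream output)
{
    var bytes = new byte[] { 0x25, 0x50, 0x44, 0x46 }; // "%PDF"
    output.Write(bytes, 0, bytes.Length);
}

using var outputStream = new MemoryStream();
WriteDocumentTo(outputStream);

// The position now points past the written data, so a direct read returns nothing.
var buffer = new byte[4];
int readWithoutRewind = outputStream.Read(buffer, 0, buffer.Length);

// Rewind before handing the stream to a consumer that reads from the current position.
outputStream.Position = 0;
int readAfterRewind = outputStream.Read(buffer, 0, buffer.Length);

// ToArray ignores the position and returns the full content.
byte[] allBytes = outputStream.ToArray();

Console.WriteLine($"{readWithoutRewind} {readAfterRewind} {allBytes.Length}");
```

If the final API turns out to rewind the stream itself, the explicit Position reset is redundant but harmless for a MemoryStream.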

Merging Multiple Streams

DocumentOperation
    .LoadStream(stream1)
    .MergeStream(stream2)
    .MergeStream(stream3)
    .SaveToStream(outputStream);

Mixed File and Stream Operations

DocumentOperation
    .LoadFile("document.pdf")        // Load from file
    .MergeStream(streamContent)      // Merge from stream
    .SaveToStream(outputStream);     // Save to stream

Stream-Based Attachments

DocumentOperation
    .LoadStream(mainDocStream)
    .AddAttachmentStream(new DocumentOperation.DocumentAttachmentStream
    {
        Stream = attachmentStream,
        AttachmentName = "readme.txt",
        MimeType = "text/plain"
    })
    .SaveToStream(outputStream);

Technical Implementation

The implementation uses temporary files to bridge the stream-based API with the underlying QPDF native library, which operates on file paths. Key technical features:

  • Automatic temporary file management with proper cleanup in finally blocks
  • Stream position handling and disposal management
  • Resource leak prevention through comprehensive error handling
  • Performance optimization by reusing existing JobConfiguration infrastructure
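
The temporary-file bridge described above can be sketched roughly as follows. This is not the code from the PR: WithTempFile is a hypothetical helper, the lambda stands in for a native call that only accepts a file path, and rewinding seekable input streams is an assumption about what "stream position handling" means here.

```csharp
using System;
using System.IO;

// Hypothetical helper (not the PR's actual code): copies a stream to a temporary
// file, hands the path to a path-based operation, and always deletes the file.
static T WithTempFile<T>(Stream source, Func<string, T> pathBasedOperation)
{
    var tempPath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".pdf");

    try
    {
        // Assumption: seekable streams are rewound so the whole content is copied.
        if (source.CanSeek)
            source.Position = 0;

        // The file handle is closed before the path-based operation runs.
        using (var file = File.Create(tempPath))
            source.CopyTo(file);

        return pathBasedOperation(tempPath);
    }
    finally
    {
        // Runs on success and on failure, so the temporary file does not leak.
        if (File.Exists(tempPath))
            File.Delete(tempPath);
    }
}

// Usage: the lambda plays the role of a library call that needs a file path.
using var input = new MemoryStream(new byte[] { 0x25, 0x50, 0x44, 0x46 }); // "%PDF"
var observedPath = "";
long size = WithTempFile(input, path =>
{
    observedPath = path;
    return new FileInfo(path).Length;
});

Console.WriteLine($"Bytes seen by the path-based operation: {size}");
Console.WriteLine($"Temporary file still exists: {File.Exists(observedPath)}");
```

Keeping the cleanup in a finally block means the file is removed even when the path-based operation throws. The trade-off of this approach is that every stream is written to disk once, so it offers API convenience rather than avoiding file I/O.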

Fixes #1172.


Co-authored-by: MarcinZiabek <9263853+MarcinZiabek@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Improve the DocumentOperation functionality by providing stream-based instead of file-based API" to "Add comprehensive stream-based API to DocumentOperation functionality" Jun 27, 2025
Copilot AI requested a review from MarcinZiabek June 27, 2025 12:16
@riise-endre

@MarcinZiabek When can we expect this PR to be merged? We are currently blocked and are not sure whether we should wait for it to be merged or look for alternatives. Thanks.

@phess101

We are highly interested in this feature as well.

@bbmario2023

We are also very interested in this feature. Is there a possibility of integrating it soon?

Development

Successfully merging this pull request may close these issues.

Improve the DocumentOperation functionality by providing stream-based instead of file-based API

5 participants