
Originally Posted On: https://medium.com/@thisara_dh/split-pdf-files-with-itext-library-in-asp-net-core-83252c71a962

Split PDF Documents in Azure Storage Using the iText Library in ASP.NET Core

Handling PDF documents efficiently is a common requirement in modern web applications, especially when the files are large. Sometimes you need to split a PDF into smaller parts, either to extract specific pages or to make a large file easier to manage. This article demonstrates how to split a PDF document into smaller files using the iText library in an ASP.NET Core application.

We will walk through the steps of creating a service that accepts the document URI along with the desired page range (start and end page) as input and outputs a new PDF containing only the specified pages. By the end of this tutorial, you’ll have a clear understanding of how to implement this functionality using iText’s PDF splitter feature in your ASP.NET Core project.

Alternative Approach: Using IronPDF

Before we dive into the iText implementation, it’s worth noting that IronPDF offers a simpler approach to PDF splitting with clearer commercial licensing. IronPDF provides straightforward methods for splitting PDFs without the complexity of iText’s AGPL licensing considerations. Here’s how the same task looks with IronPDF:

using IronPdf;

public async Task<byte[]> SplitPdfWithIronPdf(string blobUri, int startPage, int endPage)
{
    // Download the PDF from Azure Blob Storage
    using var httpClient = new HttpClient();
    var pdfBytes = await httpClient.GetByteArrayAsync(blobUri);

    // Load the PDF from the downloaded bytes (FromFile expects a file path, not a byte array)
    var pdf = new PdfDocument(pdfBytes);

    // Copy the requested pages; IronPDF uses 0-based page indexes
    var splitPdf = pdf.CopyPages(startPage - 1, endPage - 1);
    return splitPdf.BinaryData;
}

That’s it — no multiple using statements, no manual stream handling. IronPDF also handles merging, editing, HTML-to-PDF conversion, and works seamlessly with Azure Blob Storage. If you’re looking for additional PDF operations beyond splitting, IronPDF provides a comprehensive toolkit with consistent API patterns.

Now, let’s continue with the iText approach for those who need to use it:

Before we begin, ensure you have installed the iText NuGet package. You can do this via the .NET CLI, the NuGet CLI, or Visual Studio's Package Manager. The service below also uses BlobClient, which comes from the Azure.Storage.Blobs package. Let's start by defining a PdfSplitterDto.cs file, which structures the input parameters for our service.

public class PdfSplitterDto
{
    public string FilePath { get; set; }
    public int StartPage { get; set; }
    public int? EndPage { get; set; }
}

Next, define IPdfSplitterService.cs, the interface that declares the method signature, and then PdfSplitterService.cs, the class that implements it and contains the logic for splitting the PDF file.

using System;
using System.IO;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using iText.Kernel.Pdf;

public interface IPdfSplitterService
{
    Task<byte[]> SplitPdfAsync(string filePath, int start, int? end);
}

public class PdfSplitterService : IPdfSplitterService
{
    public async Task<byte[]> SplitPdfAsync(string filePath, int start, int? end)
    {
        try
        {
            // If no end page is given, extract only the start page
            int endPage = end ?? start;

            // filePath is the URI of the blob
            BlobClient blobClient = new BlobClient(new Uri(filePath));

            using (var blobStream = await blobClient.OpenReadAsync())
            using (var ms = new MemoryStream())
            {
                using (var pdfReader = new PdfReader(blobStream))
                using (var pdfDoc = new PdfDocument(pdfReader))
                using (var pdfWriter = new PdfWriter(ms))
                // Create a new PDF document for the split pages
                using (var splitPdfDoc = new PdfDocument(pdfWriter))
                {
                    // Copy the specified pages to the new document
                    pdfDoc.CopyPagesTo(start, endPage, splitPdfDoc);
                    splitPdfDoc.Close();
                }

                return ms.ToArray();
            }
        }
        catch (Exception ex)
        {
            // Wrap the failure with context; the original exception is kept as InnerException
            throw new Exception("An error occurred while processing the PDF: " + ex.Message, ex);
        }
    }
}

My PDF is stored in Azure Blob Storage, so the method creates a BlobClient from the file URI. It then awaits blobClient.OpenReadAsync() and assigns the resulting stream to the blobStream variable.

The blobStream is passed to a PdfReader, and that reader is used to open the source document as pdfDoc.

using (var splitPdfDoc = new PdfDocument(pdfWriter))
{
    // Copy the specified pages to a new document
    pdfDoc.CopyPagesTo(start, endPage, splitPdfDoc);
    splitPdfDoc.Close();
}

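The snippet above copies the pages from start through endPage from pdfDoc into splitPdfDoc. Page numbers are counted from 1, and both ends of the range are included.

Finally, ms.ToArray() returns the contents of the memory stream as a byte array, which matches the method's byte[] return type.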
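The service passes start and endPage straight to CopyPagesTo, so an out-of-range request only fails once iText rejects it. A minimal sketch of validating the range first is shown below. ResolvePageRange is a hypothetical helper, not part of iText, and it is written as a top-level local function so the snippet can run on its own.

```csharp
using System;

// Hypothetical helper (not an iText API): turns the request's start/end values
// into a validated, 1-based inclusive page range.
static (int From, int To) ResolvePageRange(int start, int? end, int pageCount)
{
    // A missing end page means "only the start page", as in the service above
    int to = end ?? start;

    if (start < 1 || start > pageCount)
        throw new ArgumentOutOfRangeException(nameof(start),
            $"Start page must be between 1 and {pageCount}.");

    if (to < start || to > pageCount)
        throw new ArgumentOutOfRangeException(nameof(end),
            $"End page must be between {start} and {pageCount}.");

    return (start, to);
}

Console.WriteLine(ResolvePageRange(5, 15, 20));   // prints (5, 15)
Console.WriteLine(ResolvePageRange(3, null, 20)); // prints (3, 3)
```

Inside SplitPdfAsync, the page count could come from pdfDoc.GetNumberOfPages(), and the resulting From and To values would then be passed to CopyPagesTo.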

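Next, define the controller that receives the API request. Our PdfSplitController class inherits from ApiControllerBase, a base class that is not part of ASP.NET Core itself; if your project has no such class, the built-in ControllerBase provides the File helper used here. The structure of the class is as follows.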

using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

[Route("api/pdf-split")]
[ApiController]
public class PdfSplitController : ApiControllerBase
{
    private readonly IPdfSplitterService _pdfSplitterService;

    public PdfSplitController(IPdfSplitterService pdfSplitterService)
    {
        _pdfSplitterService = pdfSplitterService;
    }

    [HttpPost]
    public async Task<IActionResult> SplitPdf([FromBody] PdfSplitterDto request)
    {
        string filePath = request.FilePath;
        int start = request.StartPage;
        int? end = request.EndPage;

        var result = await _pdfSplitterService.SplitPdfAsync(filePath, start, end);

        // Return the split document as a downloadable PDF
        return File(result, "application/pdf", "split.pdf");
    }
}

The controller receives the service through constructor injection, calls the SplitPdfAsync method, stores the outcome in the result variable, and returns it to the user as a file.

The constructor parameter receives an injected instance of IPdfSplitterService, which is responsible for performing the PDF splitting. The constructor then stores that instance in the _pdfSplitterService field.
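For the injection to work, the service has to be registered with the dependency injection container. The article does not show that step, so the following is only a sketch of what Program.cs might look like. It assumes the minimal hosting model (.NET 6 or later) in a web project that already contains the interface, service, and controller from this article, and the scoped lifetime is an assumption.

```csharp
// Program.cs (sketch; assumes an ASP.NET Core web project with implicit usings enabled)
var builder = WebApplication.CreateBuilder(args);

// Enable attribute-routed controllers such as PdfSplitController
builder.Services.AddControllers();

// Map the interface to its implementation. Scoped is an assumption;
// the service keeps no state, so another lifetime would also work.
builder.Services.AddScoped<IPdfSplitterService, PdfSplitterService>();

var app = builder.Build();

app.MapControllers();
app.Run();
```

Without a registration like this, ASP.NET Core cannot resolve IPdfSplitterService when it constructs the controller, and the request fails at runtime.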

Send a POST request and pass the input in the request body as a PdfSplitterDto.

{
  "FilePath": "<URI of your PDF>",
  "StartPage": 5,
  "EndPage": 15
}
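A request with this body could also be sent from C# code. The sketch below is not from the original article: BuildSplitRequestJson and RequestSplitAsync are hypothetical helper names, the endpoint address is whatever your application is hosted at, and the URI in the last line is a placeholder.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Builds the JSON body; the property names match PdfSplitterDto.
static string BuildSplitRequestJson(string fileUri, int startPage, int? endPage) =>
    JsonSerializer.Serialize(new { FilePath = fileUri, StartPage = startPage, EndPage = endPage });

// Posts the body to the splitter endpoint and returns the split PDF as bytes.
// The caller supplies the endpoint, for example the api/pdf-split route of a local instance.
static async Task<byte[]> RequestSplitAsync(
    HttpClient http, string endpoint, string fileUri, int startPage, int? endPage)
{
    var json = BuildSplitRequestJson(fileUri, startPage, endPage);
    using var content = new StringContent(json, Encoding.UTF8, "application/json");
    using var response = await http.PostAsync(endpoint, content);
    response.EnsureSuccessStatusCode(); // throws on a non-2xx status code
    return await response.Content.ReadAsByteArrayAsync();
}

// Show the body that would be sent (placeholder URI)
Console.WriteLine(BuildSplitRequestJson("https://example.invalid/doc.pdf", 5, 15));
```

Passing null for endPage serializes "EndPage" as null, which the service treats as a request for the start page only.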

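The FilePath value is the URI of the PDF to split. Because the service shown above opens the file with BlobClient, it expects a URI from Azure Blob Storage; for other cloud storage you would need to change how the stream is opened. If you are using Azure Blob Storage, follow the steps below to generate a URI for your document.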

  1. Create a Container: Create a new container, for example, mypdfsplitter, in your Azure storage account. Container names must be lowercase.
  2. Upload the Document: Upload the PDF document you wish to split into this container.
  3. Access File Details: Once uploaded, click on the PDF file to view its details. This will open a drawer on the right-hand side of the screen with various properties and settings related to the file.
  4. Generate Shared Access Signature (SAS): Navigate to the “Generate SAS” tab. Using a Shared Access Signature (SAS), you can securely access your PDF file through the API. A SAS token provides controlled, temporary access to your file.
  5. Define Access Scope: Set the expiration date and time for the SAS token to control how long the token is valid. Ensure that the token grants the necessary read access to the document.
  6. Generate SAS Token and URI: Once all settings are defined, click the “Generate SAS Token and URL” button. The system will generate a URI with the SAS token attached, which you can then use as the file path in your API.

By following these steps, you’ll have a secure URI to access your document stored in Azure Blob Storage. Use this URI to pass the PDF to the splitter service.
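The same kind of URI can also be produced in code instead of through the portal. The sketch below is an assumption-laden illustration rather than part of the original walkthrough: it assumes the Azure.Storage.Blobs package and a connection string that contains the account key, and CreateReadOnlySasUri is a hypothetical helper name.

```csharp
using System;
using Azure.Storage.Blobs;
using Azure.Storage.Sas;

// Hypothetical helper: returns a blob URI with a read-only SAS token
// that expires after the given lifetime.
static Uri CreateReadOnlySasUri(
    string connectionString, string containerName, string blobName, TimeSpan lifetime)
{
    var blobClient = new BlobClient(connectionString, containerName, blobName);

    // The client can only sign a SAS when it was created with the account key
    if (!blobClient.CanGenerateSasUri)
        throw new InvalidOperationException(
            "The connection string must include the account key to generate a SAS.");

    // Read permission is enough for the splitter, which only downloads the PDF
    return blobClient.GenerateSasUri(
        BlobSasPermissions.Read, DateTimeOffset.UtcNow.Add(lifetime));
}
```

The returned URI can be converted with ToString() and used as the FilePath value in the request body. A short lifetime limits how long a leaked link stays usable.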

A practical use case for splitting a PDF is when users upload documents containing multiple sections or modules. For instance, if a PDF contains several modules that need to be distributed to different teams or departments, splitting the document allows you to isolate and send only the relevant sections to the intended recipients.

This article has demonstrated how to implement a PDF splitter using the iText library in an ASP.NET Core application. By integrating Azure Blob Storage, you can seamlessly split documents stored in the cloud, ensuring a smooth and efficient user experience.
