Split Word Documents using Java REST API


MS Word (DOC, DOCX, etc.) is one of the most widely used document formats for sharing information and data. Its rich formatting support makes it easy to present and store information. The format is also used for archiving data retrieved from multiple sources, so users often consolidate such content and produce a single merged file.

This blog covers the following topics.

Why Split Documents?

Similar to document merge operations, many users need to split an existing MS Word or OpenOffice document. One option is to install MS Word or another application that provides this capability. In that scenario, however, you have to manually supply the document, choose the splitting mechanism (approach), and specify the location where the resultant files are saved.

The manual procedure is viable for a small set of documents. But when dealing with a large number of files, or with criteria that change dynamically, it becomes difficult to keep up with the requirements. To provide a smart and effective document split solution, Aspose.Words Cloud API can be used. You do not need to install any software, and no MS Office automation is required. All you need to do is create an account on Aspose.Cloud Dashboard and get your Client ID and Client Secret. Then use any programming language of your choice to accomplish your requirements.

Split files using Java Cloud API

Splitting a DOCX document by manually copying and pasting is a time-consuming, labor-intensive, and sub-optimal approach. Instead, we can significantly improve efficiency by using Aspose.Words Cloud Java SDK, which splits a Word document into separate files. In this article, the focus is on fulfilling this requirement with the SplitDocument API.

Supported Formats

The API is capable of splitting DOCX, DOC, DOTX, DOT, RTF, ODT, OTT, and TXT documents. The output can be saved in DOCX, DOC, PDF, ODT, RTF, HTML, JPEG, PNG, and many other file formats. The following steps show how to perform a document split operation.

  • First, create an object of WordsApi.
  • Second, create an instance of the ApiClient object.
  • Third, pass your Client ID and Client Secret details to the ApiClient object.
  • Next, create an instance of the SplitDocumentRequest class, which takes input parameters such as the file name, the resultant format, and the from and to pages. (If you do not provide the from and to page details, every page of the document is split into an individual single-page document.)
  • Lastly, call the splitDocument method, which returns a SplitDocumentResponse instance, to complete the operation.

If you need to generate the output in PDF or another supported format, simply pass the desired file format in the format argument and the API will produce the output in that format.

cURL command to Split documents

cURL commands are one of the easiest ways to call REST APIs, because they work the same way regardless of the operating system. Please note that our APIs can only be accessed by authorized users, so when calling them through cURL commands, you need to provide a JWT authorization token. Please visit the following article for details on how to obtain a JSON Web Token.
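For readers who prefer to stay in Java, the token request can be sketched with the JDK's built-in HTTP client (Java 11+). This is a minimal sketch under stated assumptions, not the official procedure: the token endpoint https://api.aspose.cloud/connect/token, the client_credentials form fields, and the environment variable names ASPOSE_CLIENT_ID and ASPOSE_CLIENT_SECRET are assumptions that do not appear in this article and should be checked against the authentication article.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class TokenRequestSketch {
    // Assumed token endpoint; verify it against the authentication article.
    static final URI TOKEN_URI = URI.create("https://api.aspose.cloud/connect/token");

    // Builds the form-encoded body for an OAuth 2.0 client-credentials request.
    static String formBody(String clientId, String clientSecret) {
        return "grant_type=client_credentials"
                + "&client_id=" + URLEncoder.encode(clientId, StandardCharsets.UTF_8)
                + "&client_secret=" + URLEncoder.encode(clientSecret, StandardCharsets.UTF_8);
    }

    // Builds (but does not send) the POST request for a token.
    static HttpRequest tokenRequest(String clientId, String clientSecret) {
        return HttpRequest.newBuilder(TOKEN_URI)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(clientId, clientSecret)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical environment variable names, used only for this sketch.
        String id = System.getenv("ASPOSE_CLIENT_ID");
        String secret = System.getenv("ASPOSE_CLIENT_SECRET");
        if (id == null || secret == null) {
            System.out.println("Set ASPOSE_CLIENT_ID and ASPOSE_CLIENT_SECRET to send the request.");
            return;
        }
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(tokenRequest(id, secret), HttpResponse.BodyHandlers.ofString());
        // The JSON response body is expected to carry the token to use as the Bearer value.
        System.out.println(response.body());
    }
}
```

Credentials are read from the environment rather than hard-coded so that the Client Secret does not end up in source control.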

Once the JWT token has been obtained, use the following cURL command to perform the document split operation.

curl -X PUT "https://api.aspose.cloud/v4.0/words/Sample.docx/split?format=docx&zipOutput=false" -H "accept: application/json" -H "Authorization: Bearer <JWT Token>"
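The same PUT request can also be issued from Java without the SDK. The sketch below uses only the JDK's java.net.http client (Java 11+) and mirrors the cURL command above. The class and method names are hypothetical, as is the ASPOSE_JWT environment variable. The from and to query parameter names are an assumption based on the "from and to pages" inputs described earlier, so confirm them in the API reference before relying on them.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class SplitRequestSketch {
    static final String BASE = "https://api.aspose.cloud/v4.0/words/";

    // Builds the split URL. Pass null for from/to to split every page.
    static URI splitUri(String fileName, String format, Integer from, Integer to, boolean zipOutput) {
        // URLEncoder targets form encoding, so turn '+' into '%20' for use in a path segment.
        String name = URLEncoder.encode(fileName, StandardCharsets.UTF_8).replace("+", "%20");
        StringBuilder sb = new StringBuilder(BASE)
                .append(name)
                .append("/split?format=")
                .append(URLEncoder.encode(format, StandardCharsets.UTF_8));
        if (from != null) sb.append("&from=").append(from); // assumed parameter name
        if (to != null) sb.append("&to=").append(to);       // assumed parameter name
        sb.append("&zipOutput=").append(zipOutput);
        return URI.create(sb.toString());
    }

    // Builds (but does not send) the authorized PUT request with an empty body.
    static HttpRequest splitRequest(URI uri, String jwtToken) {
        return HttpRequest.newBuilder(uri)
                .header("accept", "application/json")
                .header("Authorization", "Bearer " + jwtToken)
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        URI uri = splitUri("Sample.docx", "docx", null, null, false);
        String token = System.getenv("ASPOSE_JWT"); // hypothetical variable name
        HttpRequest request = splitRequest(uri, token == null ? "<JWT Token>" : token);
        System.out.println(request.method() + " " + request.uri());
        if (token != null) {
            // Only contact the service when a real token is available.
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode());
        }
    }
}
```

Under the same assumptions, calling splitUri("Sample.docx", "pdf", 2, 4, true) would request pages 2 through 4 as PDF files packed into a ZIP archive.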

Conclusion

This article has explained the steps to split Word documents into individual page documents using a Java code snippet as well as cURL commands. If you have any related queries, please feel free to contact us via the Free support forum. Please visit the following link for the Free GitHub code repository.

You may also consider visiting the following link for related details on How to split Word DOC/DOCX Pages to Multiple Documents using C#.