Split Word Document into Separate Files Online using Python

A quick and easy approach to extracting pages from Word documents using the Python SDK. Split pages in a Word document online.

split word document | Extract Pages from Word Document as a separate file

In a collaborative environment, many people work on a single document and contribute to it from time to time. However, once the document has grown to a significant size, it becomes difficult to print it in full or send it as an email attachment. We may therefore need to split it into smaller documents. In this article, we discuss how to split a Word document into individual documents using the Python SDK.

Word Processing API

Aspose.Words Cloud is our dedicated solution for processing MS Word (DOCX, DOC, DOT, RTF, DOCM) and OpenDocument (ODT, OTT) files. No third-party software or MS Office automation is necessary to process Word documents; simply call the REST APIs to accomplish your requirements. Because the APIs are REST-based, you can access them from any platform, including desktop, web, and mobile apps. Within the scope of this article, we discuss how to split the pages of a Word file into individual Word documents. The API also lets you customize the split operation: split every page, odd and even pages, by number of pages, or by page range.

To further facilitate our customers, we have created Aspose.Words Cloud SDK for Python, a wrapper around the Cloud API, so you can take advantage of Word document processing within your favorite programming language. Before proceeding further, the first step is to install the SDK on the local system. It is available via pip and GitHub. Execute the following command in a command line terminal to install the SDK:

pip install aspose-words-cloud
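
After the installation, a quick check like the one below can confirm that the package is visible to the current Python environment. This is only a minimal sketch using the standard library; sdk_version is a hypothetical helper name introduced for illustration, and the distribution name is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def sdk_version(dist_name: str = "aspose-words-cloud") -> Optional[str]:
    """Return the installed version of the distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = sdk_version()
if version:
    print(f"aspose-words-cloud {version} is installed")
else:
    print("aspose-words-cloud is not installed in this environment")
```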

If you are using Visual Studio as your IDE, you can add the SDK reference directly to the project.

Click the View -> Other Windows -> Python Environments option, as shown below.

Image 1:- Python Environment menu option.

Enter aspose-words-cloud in the Packages field of the Python Environments window, then click the Install aspose-words-cloud (21.11.0) link. The version number may differ depending on the current release. See the image below.

Image 2:- aspose-words-cloud python package.

Split Pages in Word Document using Python

Please follow the steps below to upload a Word document to cloud storage and split all of its pages.

  • First, initialize a WordsApi object, passing the Client ID and Client Secret as arguments
  • Secondly, specify the name of the input Word file, the output format, the name of the resultant file, and the parameter that controls whether the output is zipped
  • Upload the input Word document using an UploadFileRequest object
  • Now create an instance of SplitDocumentRequest, passing the details defined in the second step
  • Finally, call the split_document(…) method of WordsApi to perform the split operation. The resultant files are saved in the mapped cloud storage
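
The steps above can be sketched roughly as follows. This is a hedged sketch rather than the definitive implementation: the function names split_word_document and remote_name_for are hypothetical, the keyword argument names (file_content, path, dest_file_name, zip_output) are assumptions about the SDK's request classes that should be checked against the installed version, and the credentials and file names are placeholders.

```python
import os


def remote_name_for(local_path: str) -> str:
    """Name used in cloud storage: here simply the base name of the local file."""
    return os.path.basename(local_path)


def split_word_document(client_id, client_secret, local_path,
                        dest_file_name="SplitDocument.docx",
                        output_format="docx", zip_output=False):
    """Upload local_path to cloud storage and split it into separate files."""
    # Imported lazily so that the helper above works without the SDK installed.
    import asposewordscloud
    import asposewordscloud.models.requests as sdk_requests

    # Step 1: initialize WordsApi with the client credentials.
    words_api = asposewordscloud.WordsApi(client_id=client_id,
                                          client_secret=client_secret)

    # Steps 2 and 3: choose the remote name and upload the input document.
    remote_name = remote_name_for(local_path)
    with open(local_path, "rb") as stream:
        upload = sdk_requests.UploadFileRequest(file_content=stream,
                                                path=remote_name)
        words_api.upload_file(upload)

    # Step 4: describe the split operation (argument names are assumptions).
    request = sdk_requests.SplitDocumentRequest(name=remote_name,
                                                format=output_format,
                                                dest_file_name=dest_file_name,
                                                zip_output=zip_output)

    # Step 5: run the split; the results are saved in cloud storage.
    return words_api.split_document(request)
```

A call such as split_word_document("<Client ID>", "<Client Secret>", "test_multi_pages.docx") would then perform the upload and the split in one go, provided the credentials are valid and the file exists locally.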
Image 3:- Preview of Document Split operation.

Split Document based on Selected Pages

In this section, we discuss how to split a document based on selected pages and save the output as a ZIP archive. The code is almost the same as above, except that we also specify the Page From and Page To values and set the ZIP output option to True.
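
A minimal sketch of the page-range variant is given below. It assumes an already initialized WordsApi object and a document that is already in cloud storage. The helper names page_range_options and split_selected_pages are hypothetical, and the keyword names _from, to and zip_output are assumptions about SplitDocumentRequest (the leading underscore is assumed because from is a reserved word in Python); please verify them against the SDK version you have installed.

```python
def page_range_options(page_from: int, page_to: int, zip_output: bool = True) -> dict:
    """Build the extra keyword arguments for a page-range split."""
    if page_to < page_from:
        raise ValueError("page_to must not be smaller than page_from")
    # "_from" is the assumed SDK spelling of the REST parameter "from".
    return {"_from": page_from, "to": page_to, "zip_output": zip_output}


def split_selected_pages(words_api, name, page_from, page_to,
                         dest_file_name="SplitDocument.docx",
                         output_format="docx"):
    """Split only the selected pages of a stored document into a ZIP archive."""
    # Imported lazily so that page_range_options works without the SDK installed.
    import asposewordscloud.models.requests as sdk_requests

    request = sdk_requests.SplitDocumentRequest(
        name=name,
        format=output_format,
        dest_file_name=dest_file_name,
        **page_range_options(page_from, page_to),
    )
    return words_api.split_document(request)
```

Keeping the range handling in a small separate helper makes it easy to validate the page numbers before any request is sent to the API.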

Image 4:- Preview of Document Split operation for selected pages.

Extract Pages from Word Document using cURL Commands

Like other REST APIs, Aspose.Words Cloud can also be accessed via cURL commands from a command line terminal. Before proceeding, however, we first need to generate a JWT access token based on the client credentials.

curl -v "https://api.aspose.cloud/connect/token" \
-X POST \
-d "grant_type=client_credentials&client_id=88d1cda8-b12c-4a80-b1ad-c85ac483c5c5&client_secret=406b404b2df649611e508bbcfcd2a77f" \
-H "Content-Type: application/x-www-form-urlencoded" \
-H "Accept: application/json"
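
For readers who prefer to stay in Python, the same token request can be sketched with the standard library alone. The function names build_token_request and fetch_access_token are hypothetical, the credentials are placeholders, and reading the token from an access_token field is an assumption based on the usual OAuth2 client-credentials response.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://api.aspose.cloud/connect/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Prepare the POST request that the cURL command above sends."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def fetch_access_token(client_id: str, client_secret: str) -> str:
    """Send the request and return the JWT token (performs a network call)."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # Assumption: the token is returned in the standard "access_token" field.
    return payload["access_token"]
```

Only fetch_access_token touches the network; build_token_request can be inspected or logged safely before anything is sent.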

Once the token has been generated, execute the following command to extract pages from the Word document and save the output in cloud storage.

curl -v -X PUT "https://api.aspose.cloud/v4.0/words/source.doc/split?format=DOCX&destFileName=Split-File&from=2&to=4&zipOutput=false" \
-H  "accept: application/json" \
-H  "Authorization: Bearer <JWT Token>"
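
The split request itself can be prepared in Python in the same way. The sketch below only mirrors the URL and headers of the cURL command above; build_split_request is a hypothetical helper, and the token value is a placeholder for the JWT obtained in the previous step.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.aspose.cloud/v4.0/words"


def build_split_request(token, name, output_format="DOCX",
                        dest_file_name="Split-File",
                        page_from=None, page_to=None, zip_output=False):
    """Prepare the PUT request for the split endpoint used in the cURL command."""
    params = {"format": output_format, "destFileName": dest_file_name}
    if page_from is not None:
        params["from"] = page_from
    if page_to is not None:
        params["to"] = page_to
    params["zipOutput"] = "true" if zip_output else "false"

    url = (f"{API_BASE}/{urllib.parse.quote(name)}/split"
           f"?{urllib.parse.urlencode(params)}")
    return urllib.request.Request(
        url,
        method="PUT",
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Building the request does not contact the server; it only assembles the URL.
request = build_split_request("<JWT Token>", "source.doc", page_from=2, page_to=4)
print(request.full_url)
```

Passing the prepared request to urllib.request.urlopen would send it; that step is left out here so the sketch can be run without credentials.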

Conclusion

In this article, we explored how to build a document splitter that splits a Word document into individual page files using the Python SDK. Depending on your requirements, you can use the Python SDK or extract pages from a Word document using cURL commands. We believe in collective growth and collaboration, so our SDKs are released under the MIT license and their complete source code is available for download on GitHub. You may download and modify the code to suit your requirements. If you encounter any issues or have further queries, please feel free to contact us via the free product support forum.
