Extract Formatted Text

This REST API extracts formatted text from a document. In addition to the basic options, specify the Mode parameter of FormattedTextOptions to choose the output format (Html in the example in this article).
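As a rough illustration of how the Mode parameter sits next to the basic options, the request body can be assembled as follows. This is a minimal sketch: the function name `build_formatted_text_request` is hypothetical, and only the `Html` mode value and the field names are taken from the cURL example in this article.

```python
import json


def build_formatted_text_request(file_path, mode="Html", storage_name=""):
    """Return the JSON body for a formatted-text extraction request.

    Hypothetical helper; "Html" is the only Mode value confirmed by this article.
    """
    body = {
        # Formatted-text option: selects the markup used for the extracted text.
        "FormattedTextOptions": {"Mode": mode},
        # Basic options: which file to process and in which storage it lives.
        "FileInfo": {"FilePath": file_path, "StorageName": storage_name},
    }
    return json.dumps(body)


payload = build_formatted_text_request("words\\docx\\formatted-document.docx")
print(payload)
```

An empty StorageName is passed through unchanged, as in the cURL example.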

Resource

The Extract formatted text example uses the following GroupDocs.Parser Cloud REST API resource:

POST https://api.groupdocs.cloud/v1.0/parser/text
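For readers who prefer to see the two calls as code, here is a hedged sketch that builds (but does not send) the token request and the extraction request with Python's standard library. The URLs, headers, and form fields mirror the cURL example in this article; the helper names `build_token_request` and `build_text_request` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.groupdocs.cloud"


def build_token_request(client_id, client_secret):
    """Build the POST request that asks for a JSON Web Token."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/connect/token",
        data=form,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


def build_text_request(jwt_token, body):
    """Build the POST request to the parser/text resource."""
    return urllib.request.Request(
        BASE_URL + "/v1.0/parser/text",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": "Bearer " + jwt_token,
        },
    )


# Placeholder credentials and token; nothing is sent over the network here.
token_request = build_token_request("my-client-id", "my-client-secret")
text_request = build_text_request("my-jwt-token", {
    "FormattedTextOptions": {"Mode": "Html"},
    "FileInfo": {"FilePath": "words\\docx\\formatted-document.docx", "StorageName": ""},
})
print(token_request.get_method(), token_request.full_url)
print(text_request.get_method(), text_request.full_url)
```

Passing either request object to `urllib.request.urlopen` would perform the call. How the token is read from the first response is left out on purpose, since this article does not show that response.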

cURL example

The following example demonstrates how to extract formatted text.

Linux/macOS (Bash)

# Get JSON Web Token
# Provide your Client Id and Client Secret via environment variables $CLIENT_ID and $CLIENT_SECRET.
curl -v "https://api.groupdocs.cloud/connect/token" \
  -X POST \
  -d "grant_type=client_credentials&client_id=$CLIENT_ID&client_secret=$CLIENT_SECRET" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -H "Accept: application/json"

# Example: extract formatted text as HTML
# Assumes $JWT_TOKEN holds the token returned by the request above.
curl -v "https://api.groupdocs.cloud/v1.0/parser/text" \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -d '{
    "FormattedTextOptions": {
        "Mode": "Html"
    },
    "FileInfo": {
        "FilePath": "words\\docx\\formatted-document.docx",
        "StorageName": ""
    }
}'

Windows PowerShell

# Get JSON Web Token
# Provide your credentials via environment variables $env:CLIENT_ID and $env:CLIENT_SECRET.
curl.exe -v "https://api.groupdocs.cloud/connect/token" `
  -X POST `
  -d "grant_type=client_credentials&client_id=$env:CLIENT_ID&client_secret=$env:CLIENT_SECRET" `
  -H "Content-Type: application/x-www-form-urlencoded" `
  -H "Accept: application/json"

# Example: extract formatted text as HTML
# Assumes $env:JWT_TOKEN holds the token returned by the request above.
curl.exe -v "https://api.groupdocs.cloud/v1.0/parser/text" `
  -X POST `
  -H "Content-Type: application/json" `
  -H "Accept: application/json" `
  -H "Authorization: Bearer $env:JWT_TOKEN" `
  -d "{
    'FormattedTextOptions': {
        'Mode': 'Html'
    },
    'FileInfo': {
        'FilePath': 'words\\docx\\formatted-document.docx',
        'StorageName': ''
    }
}"

Windows CMD

rem Get JSON Web Token
rem Provide your credentials via environment variables %CLIENT_ID% and %CLIENT_SECRET%.
curl -v "https://api.groupdocs.cloud/connect/token" ^
  -X POST ^
  -d "grant_type=client_credentials&client_id=%CLIENT_ID%&client_secret=%CLIENT_SECRET%" ^
  -H "Content-Type: application/x-www-form-urlencoded" ^
  -H "Accept: application/json"

rem Example: extract formatted text as HTML
rem Assumes %JWT_TOKEN% holds the token returned by the request above.
curl -v "https://api.groupdocs.cloud/v1.0/parser/text" ^
  -X POST ^
  -H "Content-Type: application/json" ^
  -H "Accept: application/json" ^
  -H "Authorization: Bearer %JWT_TOKEN%" ^
  -d "{\"FormattedTextOptions\":{\"Mode\":\"Html\"},\"FileInfo\":{\"FilePath\":\"words\\docx\\formatted-document.docx\",\"StorageName\":\"\"}}"

Response

{
    "text": "
<p>
<b>Bold text
</b>
</p>
<p>
<i>Italic text
</i>
</p>
<ol>
<li>
<i>First element
</i>
</li>
<li>
<i>Second element
</i>
</li>
<li>
<i>Third element
</i>
</li>
</ol>
<h1>Heading 1
</h1>
<p>
<a href=\"http://targetwebsite.domain\">Hyperlink 
</a>targetwebsite.domain
</p>
<table border=\"1\">
<tr>
<td>
<p>table
</p>
</td>
<td>
<p>Cell 1
</p>
</td>
<td>
<p>Cell 2
</p>
</td>
</tr>
<tr>
<td>
<p>Cell 3
</p>
</td>
<td>
<p>Cell 4
</p>
</td>
<td>
<p>Cell 5
</p>
</td>
</tr>
</table>
<p>\f
</p>
<p>
<b>Second page bold text
</b>
</p>
<h1>Second page heading
</h1>"
}
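
The response above carries the extracted HTML in a single text field. As a hedged sketch of how a client might use it, the snippet below parses a shortened sample response and collects the h1 headings with the standard library's HTML parser. The names `HeadingCollector` and `extract_headings` are hypothetical, and the sample assumes that line breaks inside the string arrive as JSON escapes.

```python
import json
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text content of <h1> elements from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current heading.
        if self.in_h1:
            self.headings[-1] += data


def extract_headings(response_body):
    """Parse the JSON response and return the h1 headings found in its text field."""
    html = json.loads(response_body)["text"]
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return [heading.strip() for heading in collector.headings]


# Shortened sample modeled on the response shown in this article.
sample = '{"text": "<p><b>Bold text\\n</b></p><h1>Heading 1\\n</h1><h1>Second page heading\\n</h1>"}'
print(extract_headings(sample))  # prints ['Heading 1', 'Second page heading']
```

The same collector pattern extends to other tags, such as table cells or list items, by checking for a different tag name.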

SDK examples

Using an SDK (API client) is the quickest way to get started. An SDK handles the low-level details of making requests and processing responses, so you can focus on the code specific to your project. See our GitHub repository for the complete list of GroupDocs.Parser Cloud SDKs along with working examples, and see the Available SDKs article to learn how to add an SDK to your project.