Transform Your Document Processing with tenbase2.com
Document Segmentor REST API

Turn complex documents into structured, analyzable data with our powerful document segmentation REST API. Whether you’re building an advanced document analysis system, creating a content management solution, or developing educational tools, our API delivers precise document breakdown at multiple granularity levels.

Features That Set Us Apart

Multi-Format Support

Process documents from various sources:
PDF files
HTML files (optimized for ABBYY FineReader OCR output)
Plain text documents

Clean, Focused Segmentation

Our API automatically excludes non-essential elements to deliver clean, relevant content:
Table of contents removed
Page numbers stripped
Footnotes excluded

Flexible Segmentation Options

Break down documents exactly how you need them:
Page-level segmentation for preserving original document structure
Paragraph-level parsing for logical content blocks
Sentence-level analysis for fine-grained text processing
Intelligent footnote detection and parsing (when using ABBYY FineReader OCR-generated HTML)

Built for Developers

RESTful API architecture for seamless integration
Clear, consistent JSON responses
Comprehensive API documentation
No API key required
Unlimited free usage
Low latency processing

How It Works

Submit Your Document: Send your document to our API endpoint using a simple POST request.
Choose Your Segmentation Level: Specify whether you want page, paragraph, or sentence-level segmentation.
Receive Structured Results: Get back clean, organized JSON containing your segmented document data.
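The level chosen in the second step corresponds to the iParseType parameter used in the Python example on this page (0 = sentence, 1 = paragraph, 2 = page). Here is a minimal sketch of one way to keep that mapping readable in client code. The `SegmentLevel` enum and the `build_params` helper are hypothetical names for illustration and are not part of the API.

```python
from enum import IntEnum


class SegmentLevel(IntEnum):
    """Hypothetical client-side names for the iParseType values."""
    SENTENCE = 0
    PARAGRAPH = 1
    PAGE = 2


def build_params(level: SegmentLevel) -> dict:
    """Return the query parameters for the chosen segmentation level."""
    return {"iParseType": int(level)}


print(build_params(SegmentLevel.PARAGRAPH))
```

The resulting dictionary can be passed as the `params` argument of `requests.post`.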

Example Code Snippet

Here’s how you can use the API with a simple POST request in Python:

import requests

# Define the API endpoint
url = "https://tenbase2.ai/api/segmentor/seg"

# Add the iParseType parameter (set to 0, 1, or 2)
params = {
    "iParseType": 0  # 0=sentence, 1=paragraph, 2=page
}

# Open the file in binary mode
# Files are pdf, html, txt, or zip
with open("path/to/your/file.txt", "rb") as file:
    files = {"file": file}

    # Make the POST request while the file is still open
    response = requests.post(url, files=files, params=params)

# Check the response status
if response.status_code == 200:
    try:
        data = response.json()
        if isinstance(data, dict) and "items" in data:
            for item in data.get("items", []):
                print(item)
        else:
            print("Unexpected response format:", data)
    except ValueError:
        print("Failed to parse JSON:", response.text)
else:
    print("Failed to upload file:", response.status_code, response.text)
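The response handling above can also be pulled into a small function that raises instead of printing, which is easier to reuse and test. This is only a sketch: it assumes the response body is a JSON object with an "items" list, as the check in the example suggests, and `parse_segments` is a hypothetical helper name rather than anything the API provides.

```python
import json


def parse_segments(body: str) -> list:
    """Parse a response body and return its "items" list.

    Assumes a JSON object with an "items" list (an assumption based on
    the example's check). Raises ValueError if the body is not valid JSON
    or does not have that shape.
    """
    data = json.loads(body)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not isinstance(data.get("items"), list):
        raise ValueError(f"Unexpected response format: {data!r}")
    return data["items"]


# Offline demonstration with a made-up body; real segment contents may differ.
print(parse_segments('{"items": ["First sentence.", "Second sentence."]}'))
```

With the example above, it could be called as `parse_segments(response.text)` once the status code has been checked.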
