# Overview

> The unified entry point for SoMark CLI & SDK.

SoMark CLI & SDK are developer tools from the SoMark team.

We keep the SDK and CLI in the same project for one practical reason: install once, get two paths. The same package gives you in-app integration and command-line execution. Use [SoMark CLI](/en/documentation/cli) when you want to run immediately in a terminal, script, or agent platform. Use [SoMark SDK](/en/documentation/sdk) when you want to connect document parsing to a product, backend service, or automation pipeline.

Python and JavaScript are not two wrappers around one codebase. `somark-py` and `somark-js` are independent implementations: each follows its own language ecosystem, async model, and engineering conventions, without glue code or unnecessary dependencies.

<Columns cols={2}>
  <Card title="somark-py" icon="https://mintcdn.com/soulcode-aa7e5a93/mqnfzKMhfEvy3I3O/images/python.svg?fit=max&auto=format&n=mqnfzKMhfEvy3I3O&q=85&s=528053ec0b2ff92265cc46a7e95f1122" href="https://github.com/SoMarkAI/somark-py" cta="Open Python repository" width="48" height="48" data-path="images/python.svg">
    Python implementation. One package provides both the SDK and the `somark` CLI.
  </Card>

  <Card title="somark-js" icon="https://mintcdn.com/soulcode-aa7e5a93/mqnfzKMhfEvy3I3O/images/javascript.svg?fit=max&auto=format&n=mqnfzKMhfEvy3I3O&q=85&s=bc4be68c0abea8e7bc2c5a4cf898eca8" href="https://github.com/SoMarkAI/somark-js" cta="Open JavaScript repository" width="48" height="48" data-path="images/javascript.svg">
    JavaScript / TypeScript implementation. One package provides both the SDK and the `somark` CLI.
  </Card>
</Columns>

## One capability set, two entry points

<Columns cols={2}>
  <Card title="CLI" icon="terminal">
    Best for terminals, scripts, CI, agent platforms, and every "process this file now" scenario.
  </Card>

  <Card title="SDK" icon="code">
    Best for product features, backend services, async jobs, data pipelines, and engineering systems that need to run reliably over time.
  </Card>
</Columns>

## Sync principle

The SoMark API is the highest-priority source of truth. Whatever the API supports, the CLI and SDK follow. The CLI does not create a separate workflow, and the SDK does not invent a parallel universe.

The Python and JavaScript implementations keep parsing capabilities fully aligned, but a few derived capabilities are available in only one ecosystem or go further in one:

* <Tooltip headline="SoMarkDown" tip="A compiler for SoMark's Markdown superset, with professional rendering for math, chemical structures, code highlighting, and more." cta="Read docs" href="/en/open-source-tools/somarkdown">SoMarkDown</Tooltip> can only be imported from the JS SDK because SoMarkDown itself is implemented in JavaScript. SoMarkDown Preview is available in both Python and JS workflows.
* PDF processing reaches its broadest capability range on the Python side because it depends on <Tooltip headline="SoPDF" tip="A high-performance, open-source-friendly PDF processing library." cta="Read docs" href="/en/open-source-tools/sopdf">SoPDF</Tooltip>, whose implementation and lower-level dependencies are more complete in Python.

## When to use CLI

<Columns cols={2}>
  <Card title="Agent platforms" icon="robot">
    Use `somark` as a clean external tool in Claude Code, Codex, OpenClaw, and similar environments.
  </Card>

  <Card title="Terminal batches" icon="square-terminal">
    Scan folders, run scripts, and process batches of papers, reports, receipts, or contracts without building an app first.
  </Card>

  <Card title="One-off conversion" icon="folder-tree">
    Temporarily convert files to Markdown, JSON, SoMarkDown, or ZIP, then move on with the result.
  </Card>

  <Card title="Automation and diagnostics" icon="gear">
    Run it in CI and scheduled jobs, or use it for local preview, usage checks, and installation diagnostics. This is what command lines are good at.
  </Card>
</Columns>
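
As a rough illustration of the "Terminal batches" case, the sketch below scans a folder and runs the `somark` executable once per matching file. It is a minimal sketch under stated assumptions: the helper names `build_commands` and `run_batch` are hypothetical, and the subcommand and flags are deliberately left to the caller as an argument template, because this page does not define them. Take the real arguments from the [SoMark CLI](/en/documentation/cli) docs.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(folder, arg_template, suffixes=(".pdf",)):
    """Return one argument list per matching file under `folder`.

    `arg_template` is a list of strings in which the placeholder "{file}"
    is replaced with each file path. The actual subcommand and flags are
    NOT assumed here; they must come from the SoMark CLI documentation.
    """
    wanted = {s.lower() for s in suffixes}
    files = sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in wanted
    )
    return [
        [part.replace("{file}", str(path)) for part in arg_template]
        for path in files
    ]


def run_batch(folder, arg_template, suffixes=(".pdf",)):
    """Run the `somark` executable once per file and return the failed argument lists."""
    if shutil.which("somark") is None:
        raise RuntimeError("`somark` was not found on PATH")
    failed = []
    for args in build_commands(folder, arg_template, suffixes):
        # One process per file keeps a single bad document from stopping the batch.
        result = subprocess.run(["somark", *args])
        if result.returncode != 0:
            failed.append(args)
    return failed
```

Keeping command construction separate from execution makes it possible to print and review the planned invocations before anything runs.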

## When to use SDK

<Columns cols={2}>
  <Card title="In-product parsing" icon="app-store">
    After users upload files, return readable, searchable, and reusable structured content inside your app.
  </Card>

  <Card title="Backend services" icon="server">
    Connect parsing to APIs, workers, webhooks, or internal services so documents become part of your system.
  </Card>

  <Card title="Async queue control" icon="bars-staggered">
    Use it when large files, batches, or long-running tasks enter a queue and need fine-grained async control.
  </Card>

  <Card title="Data pipelines" icon="database">
    Parse, clean, store, search, and vectorize content with a deep SoMark integration, so unstructured files become usable data.
  </Card>
</Columns>
