LLM-Powered Extraction API

The enhanced LLM Extraction API now supports user-defined JSON Schemas, which enforce a strict output structure while retaining the flexibility of natural language instructions. Combining a free-text prompt with schema validation balances precision and adaptability, which suits regulated industries (e.g., finance, healthcare) and complex nested data.


Scrape content from a URL

This endpoint scrapes the content of a given URL and uses an LLM to extract structured data from it, guided by your prompt and schema. You must provide a valid URL in the request.

Required parameters

  • url (string)
    The URL from which the content will be scraped. For example: https://www.maginative.com/

  • prompt (string)
    Natural language extraction prompt (at most 500 characters).

  • schema (string)
    A JSON Schema defining the expected output structure. The LLM prioritizes filling this schema over free-text interpretation of the prompt.

  • proxy_premium (boolean)
    Use premium proxies to make the request harder to detect.

  • proxy_country (string)
    Geolocation of the IP address used to make the request, given as an ISO 3166 country code. Only applies when premium proxies are enabled.
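The parameters above can be assembled into a request body along the lines of the following sketch. It uses the field names from the parameter list and assumes, because `schema` is typed as a string, that the JSON Schema is sent as a JSON-encoded string rather than a nested object. `build_payload`, the example schema, and the `"US"` country value are illustrative and not part of the API.

```python
import json

MAX_PROMPT_CHARS = 500  # limit stated in the parameter list


def build_payload(url, prompt, schema, proxy_premium=False, proxy_country=""):
    """Assemble a request body (hypothetical helper, not part of the API)."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an absolute http(s) URL")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    payload = {
        "url": url,
        "prompt": prompt,
        # Assumption: `schema` is typed as string, so the JSON Schema is
        # serialized instead of being embedded as a nested object.
        "schema": json.dumps(schema),
        "proxy_premium": proxy_premium,
    }
    # proxy_country only applies to premium proxies, so include it only then.
    if proxy_premium and proxy_country:
        payload["proxy_country"] = proxy_country
    return payload


# Illustrative schema: an object with a required product name and an optional brand.
product_schema = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "brand": {"type": "string"},
    },
    "required": ["product_name"],
}

payload = build_payload(
    "https://www.example.com/",
    "Extract the product name and brand.",
    product_schema,
    proxy_premium=True,
    proxy_country="US",
)
print(json.dumps(payload, indent=2))
```

Checking the prompt length on the client avoids a round trip for a request that the stated 500-character limit would reject anyway.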

Request

POST /v1/scraper/schema
curl -X POST https://api.autoscraper.pro/v1/scraper/schema \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
      "urls": [
        "https://www.example.com/"
      ],
      "proxy_premium": true,
      "proxy_country": ""
    }'
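For clients that do not shell out to curl, the same request might be built with Python's standard library as sketched below. The body is copied from the curl example; the `Content-Type: application/json` header, the JSON-decoded response, and the 30-second timeout are assumptions, and `build_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "https://api.autoscraper.pro/v1/scraper/schema"


def build_request(api_key, body):
    """Build (but do not send) a POST request carrying a JSON body."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the endpoint expects a JSON request body.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the response, assumed to be JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Body copied from the curl example.
body = {
    "urls": ["https://www.example.com/"],
    "proxy_premium": True,
    "proxy_country": "",
}
request = build_request("YOUR_API_KEY", body)
print(request.get_method(), request.full_url)
```

Calling `send(request)` with a real API key submits the request; building and sending are kept separate so the request can be inspected or logged first.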

Response

{
  "product_name": "Wireless Noise-Canceling Headphones",
  "brand": "AudioTech",
  "pricing": {
    "base_price": 299.99,
    "discount": 0.15
  },
  "specifications": [
    "Battery life: 30hrs",
    "Bluetooth 5.3"
  ]
}
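A client may want to confirm that a response has the structure it asked for before using it. The sketch below checks the sample response against a JSON Schema written to match it. The schema and the `check` helper are illustrative, and the helper covers only a small subset of JSON Schema (`type`, `properties`, `required`, `items`); a full validator library would be the safer choice in production.

```python
import json

# Illustrative schema written to match the sample response; not taken from the API.
RESPONSE_SCHEMA = {
    "type": "object",
    "required": ["product_name", "brand", "pricing", "specifications"],
    "properties": {
        "product_name": {"type": "string"},
        "brand": {"type": "string"},
        "pricing": {
            "type": "object",
            "required": ["base_price", "discount"],
            "properties": {
                "base_price": {"type": "number"},
                "discount": {"type": "number"},
            },
        },
        "specifications": {"type": "array", "items": {"type": "string"}},
    },
}

PYTHON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "boolean": bool,
}


def check(value, schema, path="$"):
    """Return a list of mismatches between value and schema (subset of JSON Schema)."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(value, PYTHON_TYPES[expected])
        # bool is a subclass of int in Python, so exclude it from "number".
        if expected == "number" and isinstance(value, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(check(value[key], subschema, f"{path}.{key}"))
    elif isinstance(value, list) and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(check(item, schema["items"], f"{path}[{index}]"))
    return errors


sample = json.loads("""
{
  "product_name": "Wireless Noise-Canceling Headphones",
  "brand": "AudioTech",
  "pricing": {"base_price": 299.99, "discount": 0.15},
  "specifications": ["Battery life: 30hrs", "Bluetooth 5.3"]
}
""")

print(check(sample, RESPONSE_SCHEMA))  # prints []
```

An empty list means every required field is present with the expected type; each entry otherwise names the path of the offending field.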
