Data extraction
Learn how to extract content from websites using the WetroCloud scrape endpoint.
The scrape endpoint allows you to retrieve specific data from a web page in a structured JSON format based on your predefined schema.
This endpoint supports one response format:
Structured Output Response
Structured output lets large language models (LLMs) generate responses in a specific format from your resources. It is best for extracting structured data to fit your application’s needs. It returns data formatted according to the json_schema and json_schema_rules you provide.
To use the structured output feature you need to provide the following parameters:
Parameter | Type | Description |
---|---|---|
website | String | The URL of the website to extract from. |
json_schema | String | The desired structure for the JSON response. |
json_schema
The json_schema parameter defines the structure of the response you want the system to return. This parameter helps ensure the output matches your application’s requirements.
Example JSON Schema
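A minimal sketch of a single-object schema, written in Python. The field names (`title`, `price`) are hypothetical, and using type names such as `"string"` as values is an assumption about the notation, not something this page confirms. Because the parameter table lists json_schema as a String, the sketch serializes the object before sending it.

```python
import json

# Hypothetical fields for a product page. The value strings are type hints;
# this notation is an assumption, so adjust it to what the service expects.
schema = {"title": "string", "price": "number"}

# json_schema is documented as a String, so serialize the object to JSON text.
json_schema = json.dumps(schema)
print(json_schema)
```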
Example JSON Schema (List)
If you want a list of items rather than a single item, wrap the entire object in a list in the JSON schema.
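As a sketch of the list form, the same kind of object can be wrapped in a list before serializing. The field names and the type-hint values are again illustrative assumptions.

```python
import json

# Hypothetical shape of one item; names and type hints are assumptions.
item = {"title": "string", "price": "number"}

# Wrapping the object in a list asks for many items instead of one.
json_schema = json.dumps([item])
print(json_schema)
```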
Example Requests
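The following sketch shows one way a request could be assembled with Python's standard library. It only builds the request and does not send it. The endpoint URL is a placeholder, and the JSON body encoding and the `Authorization: Token ...` header are assumptions; only the `website` and `json_schema` parameter names come from the table above. `build_scrape_request` is a hypothetical helper, not part of any SDK.

```python
import json
import urllib.request


def build_scrape_request(api_url, api_key, website, schema):
    """Build (but do not send) a POST request for the scrape endpoint."""
    payload = {
        "website": website,
        # json_schema is documented as a String, so serialize the schema.
        "json_schema": json.dumps(schema),
    }
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),  # assumed JSON body
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; check your account's API settings.
            "Authorization": f"Token {api_key}",
        },
        method="POST",
    )


request = build_scrape_request(
    api_url="https://api.example.com/scrape",  # placeholder, not the real URL
    api_key="YOUR_API_KEY",
    website="https://example.com/product",
    schema={"title": "string", "price": "number"},  # hypothetical fields
)
```

To send it, pass `request` to `urllib.request.urlopen` and decode the body with `json.load`.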
Example Response
Field | Description |
---|---|
response | JSON object structured as per the provided schema. |
tokens | Number of tokens used for processing. |
success | Indicates whether the query was successful. |
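A short sketch of handling a response with the three fields in the table. The sample body is made up for illustration and is not real output from the service; `parse_scrape_response` is a hypothetical helper.

```python
import json


def parse_scrape_response(body):
    """Return (response, tokens), or raise if the call did not succeed."""
    result = json.loads(body)
    if not result.get("success"):
        raise RuntimeError(f"scrape failed: {result}")
    return result["response"], result.get("tokens")


# Illustrative body only; the values are invented for this sketch.
sample_body = (
    '{"response": {"title": "Example", "price": 9.99}, '
    '"tokens": 120, "success": true}'
)
data, tokens = parse_scrape_response(sample_body)
print(data["title"], tokens)
```

Checking `success` before reading `response` keeps a failed call from being mistaken for empty data.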