Extracts structured data from a column of document content using a large language model (LLM).
For the corresponding Databricks SQL function, see ai_extract function.
Syntax
from pyspark.sql import functions as dbf
dbf.ai_extract(col=<col>, schema=<schema>, options=<options>)
Parameters
| Parameter | Type | Description |
|---|---|---|
| col | pyspark.sql.Column or str | A column containing the document content to extract from. |
| schema | dict or list | A Python dict (field name to {"type": ..., "description": ...}) or list of field-name strings. Serialized to a JSON literal automatically. |
| options | dict, optional | A dictionary of options to control extraction behavior. |
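
The two accepted shapes of the schema parameter can be sketched in plain Python as follows. This is a minimal illustration only: the helper `to_json_literal` is hypothetical and not part of the API, and using `json.dumps` is an assumption about what "serialized to a JSON literal" looks like, not a description of the function's internals. The field names and descriptions are made up for the example.

```python
import json

# Dict form: each field name maps to {"type": ..., "description": ...}.
dict_schema = {
    "name": {"type": "string", "description": "Name"},
    "city": {"type": "string", "description": "City of residence"},
}

# List form: field names only, with no type or description.
list_schema = ["name", "city"]


def to_json_literal(schema):
    """Hypothetical helper: render a schema as a JSON string.

    Assumption: json.dumps stands in for the automatic serialization
    mentioned in the parameter table; the real behavior may differ.
    """
    if not isinstance(schema, (dict, list)):
        raise TypeError("schema must be a dict or a list")
    return json.dumps(schema)


dict_literal = to_json_literal(dict_schema)
list_literal = to_json_literal(list_schema)
print(dict_literal)
print(list_literal)
```

The dict form is the better fit when the model needs a description to disambiguate a field; the list form is shorter when the field names are self-explanatory.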
Returns
pyspark.sql.Column: A new column of VariantType containing the extracted fields.
Examples
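# Schema as a dict mapping each field name to its type and description
df.select(dbf.ai_extract("text", {"name": {"type": "string", "description": "Name"}}))
# Schema as a list of field names
df.select(dbf.ai_extract("text", ["name", "age"]))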