satif_sdk.standardizers.remote
DEFAULT_TIMEOUT
Default timeout 10 minutes (httpx uses float)
RemoteStandardizer Objects
class RemoteStandardizer(AsyncStandardizer)
A standardizer that interacts with a remote Satif-compliant standardization API. It handles file uploads, monitors progress via Server-Sent Events (SSE), and downloads the resulting SDIF file.
Allows providing a custom
httpx.AsyncClientinstance for advanced configuration, otherwise creates a default client based on environment variables or parameters. Compresses multiple input files into a single zip archive before uploading.Requires configuration of the remote API base URL and potentially an API key. The remote API is expected to follow a specific pattern:
- POST to
 runs_path_prefixto create a run.- SSE stream at
 events_url(from create run response).- GET from
 runs_path_prefix/{run_id}/resultto download the output.
__init__
def __init__(base_url: Optional[str] = None,
             api_key: Optional[str] = None,
             runs_path_prefix: Optional[str] = None,
             timeout: Optional[float] = DEFAULT_TIMEOUT,
             client: Optional[httpx.AsyncClient] = None,
             **kwargs: Any)
Initializes the remote standardizer.
Arguments:
base_url- The base URL of the remote standardization API. Defaults to env {ENV_REMOTE_BASE_URL}. Used only if 'client' is not provided.api_key- The API key for authentication. Defaults to env {ENV_REMOTE_API_KEY}. Used as Bearer token if 'client' is not provided.runs_path_prefix- Base path for standardization runs on the remote API. Defaults to env {ENV_REMOTE_RUNS_PATH_PREFIX} or '{DEFAULT_RUNS_PATH_PREFIX}'.timeout- Default request timeout in seconds. Used only if 'client' is not provided. Defaults to {DEFAULT_TIMEOUT} seconds.client- An optional pre-configuredhttpx.AsyncClientinstance. If provided,base_url,api_key, andtimeoutargs are ignored for client creation, butruns_path_prefixis still used.api_key0 - Additional keyword arguments passed to the defaulthttpx.AsyncClientconstructor ifclientis not provided.
standardize
async def standardize(datasource: Datasource,
                      output_path: SDIFPath,
                      *,
                      options: Optional[Dict[str, Any]] = None,
                      log_sse_events: bool = False,
                      overwrite: bool = False) -> StandardizationResult
Performs standardization by interacting with the remote Satif API.
This involves:
- Validating inputs and preparing file paths.
 - Packaging datasource file(s) for upload (zipping if multiple).
 - Uploading the datasource and options to initiate a run.
 - Monitoring the run's progress via Server-Sent Events (SSE).
 - Fetching final run details (including file_configs and result_url).
 - Downloading the resulting SDIF file using the result_url.
 - Saving the downloaded SDIF file.
 Arguments:
datasource- Path or list of paths to the input file(s).
output_path- The path where the resulting SDIF file should be saved.
options- Optional dictionary of processing options for the standardization run. These are serialized to JSON and sent as a form field.
log_sse_events- If True, SSE messages from the server will be logged.
overwrite- If True, overwrite the output file if it exists.Returns:
A StandardizationResult object containing the path to the created SDIF database file and the file-specific configurations.
Raises:
FileNotFoundError- If an input file doesn't exist.FileExistsError- If output_path exists and overwrite is False.IOError- If file reading/writing fails (now primarily OSError).httpx.HTTPStatusError- For unsuccessful API responses (4xx, 5xx).RuntimeError- For other operational errors, including failed standardization runs.output_path0 - If zip creation fails.output_path1 - If datasource is invalid.