
github-actions[bot] edited this page Dec 12, 2023 · 13 revisions

# Batch Score Large Language Models

`batch_score_llm`

## Overview

**Version:** 0.0.1

**View in Studio:** https://ml.azure.com/registries/azureml/components/batch_score_llm/version/0.0.1

## Inputs

### Predefined arguments for parallel job

See the predefined arguments for parallel jobs: https://learn.microsoft.com/en-us/azure/machine-learning/reference-yaml-job-parallel?source=recommendations#predefined-arguments-for-parallel-job

| Name | Description | Type | Default | Optional | Enum |
| ---- | ----------- | ---- | ------- | -------- | ---- |
| resume_from | The pipeline run id to resume from. | string | | True | |
| append_row_safe_output | Enable PRS safe append row configuration, which is needed when dealing with large outputs that contain Unicode characters. | boolean | | True | |

### Custom arguments

| Name | Description | Type | Default | Optional | Enum |
| ---- | ----------- | ---- | ------- | -------- | ---- |
| data_input_table | The data to be split and scored in parallel. | mltable | | False | |
| api_type | Specifies the API type used for scoring. | string | completion | False | ['completion', 'chat_completion', 'embeddings', 'vesta', 'vesta_chat_completion'] |
| scoring_url | The URL used for scoring input data. | string | | False | |
| authentication_type | Specifies the authentication type to use for scoring. | string | managed_identity | True | ['azureml_workspace_connection', 'managed_identity'] |
| connection_name | Specifies the connection name containing the api-key for scoring. Required for authentication type "azureml_workspace_connection". | string | | True | |
| debug_mode | | boolean | False | | |
| additional_properties | A stringified JSON expressing additional properties to be added to each request body at the top level. | string | | True | |
| additional_headers | A stringified JSON expressing additional headers to be added to each request. | string | | True | |
| configuration_file | A JSON file containing configuration values for the batch score component. | uri_file | | True | |
| tally_failed_requests | Determines whether failed requests are included in the output. Enabling this counts failed requests toward error_threshold. | boolean | False | | |
| tally_exclusions | Configures which failed requests are excluded from tallying. Only applicable when tally_failed_requests is enabled. Delimit with " " when specifying multiple values. "none": no failed requests are excluded from tallying; "bad_request_to_model": requests that received a 400 status code from the model are excluded from tallying. | string | none | | |
| segment_large_requests | | string | | True | ['disabled', 'enabled'] |
| segment_max_token_size | | integer | 600 | | |
| app_insights_connection_string | An Application Insights connection string. If provided, the batch score component emits metrics and logs to this Application Insights instance. | string | | True | |
| ensure_ascii | If ensure_ascii is True, the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is False, these characters are output as-is. More detailed information can be found at https://docs.python.org/3/library/json.html. | boolean | False | | |
| output_behavior | | string | append_row | False | ['append_row', 'summary_only'] |
| max_retry_time_interval | The maximum time (in seconds) spent retrying a payload. If unspecified, payloads are retried an unlimited number of times. | integer | | True | |
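The additional_properties and additional_headers inputs each take a stringified JSON object. A minimal sketch of building them with Python's `json` module (the property names and header names below are hypothetical, not required by the component), which also illustrates the ensure_ascii behavior described above:

```python
import json

# additional_properties / additional_headers must each be a stringified
# JSON object. The keys and values here are hypothetical examples only.
additional_properties = json.dumps({"max_tokens": 256, "temperature": 0.7})
additional_headers = json.dumps({"x-client-request-id": "batch-run-001"})

# ensure_ascii=True escapes incoming non-ASCII characters in the output;
# ensure_ascii=False writes them as-is (same semantics as json.dumps).
record = {"text": "café"}
print(json.dumps(record, ensure_ascii=True))   # {"text": "caf\u00e9"}
print(json.dumps(record, ensure_ascii=False))  # {"text": "café"}
```

The resulting strings are passed directly as the component's input values.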

### Parallel configuration

| Name | Description | Type | Default | Optional | Enum |
| ---- | ----------- | ---- | ------- | -------- | ---- |
| initial_worker_count | | integer | 5 | | |
| max_worker_count | Overrides initial_worker_count if necessary. | integer | 200 | | |

### Partial results configuration

| Name | Description | Type | Default | Optional | Enum |
| ---- | ----------- | ---- | ------- | -------- | ---- |
| save_mini_batch_results | | string | disabled | False | ['disabled', 'enabled'] |
| async_mode | Whether to use the PRS mini-batch streaming feature, which allows each PRS processor to process multiple mini-batches at a time. | boolean | False | | |

## Outputs

| Name | Description | Type |
| ---- | ----------- | ---- |
| job_out_path | | uri_file |
| mini_batch_results_out_directory | | uri_folder |
| metrics_out_directory | | uri_folder |
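With output_behavior set to append_row, job_out_path points at a single results file. Assuming each scored row is written as one JSON object per line (the per-row schema depends on the api_type used and is not fixed here), the output can be loaded with a sketch like:

```python
import json
from pathlib import Path

def load_results(job_out_path: str) -> list:
    """Parse an append_row output file, assuming one JSON object per line.

    The JSON-lines layout and the per-row schema are assumptions here;
    inspect your own job_out_path to confirm its exact shape.
    """
    results = []
    for line in Path(job_out_path).read_text(encoding="utf-8").splitlines():
        if line.strip():  # skip blank lines
            results.append(json.loads(line))
    return results
```

Parsing line by line, rather than as one JSON document, keeps memory use bounded for large batch outputs.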