components batch_score_llm
github-actions[bot] edited this page Dec 12, 2023 · 13 revisions
Version: 0.0.1
View in Studio: https://ml.azure.com/registries/azureml/components/batch_score_llm/version/0.0.1
Predefined arguments for parallel job: https://learn.microsoft.com/en-us/azure/machine-learning/reference-yaml-job-parallel?source=recommendations#predefined-arguments-for-parallel-job
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
resume_from | The pipeline run id to resume from. | string | | True | |
append_row_safe_output | Enables the PRS safe append-row configuration, which is needed when dealing with large outputs containing Unicode characters. | boolean | True | | |
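As a sketch of how these predefined arguments might be set when the component is referenced from a pipeline (the step name here is hypothetical; the component URI follows the Studio link above):

```yaml
# Illustrative pipeline step using batch_score_llm; "batch_score_step" is a hypothetical name.
batch_score_step:
  type: parallel
  component: azureml://registries/azureml/components/batch_score_llm/versions/0.0.1
  inputs:
    append_row_safe_output: true   # recommended for large outputs with Unicode characters
    # resume_from: <previous pipeline run id>   # optional; resumes scoring from a prior run
```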
Custom arguments
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
data_input_table | The data to be split and scored in parallel. | mltable | | False | |
api_type | Specifies the API type used for scoring. | string | completion | False | ['completion', 'chat_completion', 'embeddings', 'vesta', 'vesta_chat_completion'] |
scoring_url | The URL used for scoring input data. | string | | False | |
authentication_type | Specifies the authentication type to use for scoring. | string | managed_identity | True | ['azureml_workspace_connection', 'managed_identity'] |
connection_name | Specifies the connection name containing the api-key for scoring. Required when authentication_type is "azureml_workspace_connection". | string | | True | |
debug_mode | | boolean | False | | |
additional_properties | A stringified JSON object whose properties are added to each request body at the top level. | string | | True | |
additional_headers | A stringified JSON object whose entries are added to each request as headers. | string | | True | |
configuration_file | A JSON file containing configuration values for the batch score component. | uri_file | | True | |
tally_failed_requests | Determines whether failed requests are counted. Enabling this counts failed requests toward error_threshold. | boolean | False | | |
tally_exclusions | Configures which failed requests are excluded from tallying. Only applicable when tally_failed_requests is enabled. Delimit multiple values with "\|". "none": no failed requests are excluded. "bad_request_to_model": requests that received a 400 status code from the model are excluded. | string | none | | |
segment_large_requests | | string | | True | ['disabled', 'enabled'] |
segment_max_token_size | | integer | 600 | | |
app_insights_connection_string | An Application Insights connection string. If provided, the batch component emits metrics and logs to this Application Insights instance. | string | | True | |
ensure_ascii | If True, all incoming non-ASCII characters are escaped in the output. If False, these characters are output as-is. More detailed information can be found at https://docs.python.org/3/library/json.html | boolean | False | | |
output_behavior | | string | append_row | False | ['append_row', 'summary_only'] |
max_retry_time_interval | The maximum time (in seconds) spent retrying a payload. If unspecified, payloads are retried an unlimited number of times. | integer | | True | |
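A minimal sketch of the custom arguments for a chat-completion scoring run. The endpoint URL, MLTable name, and connection name are hypothetical; note that additional_properties and additional_headers are passed as stringified JSON:

```yaml
# Illustrative inputs block; all resource names and the scoring URL are placeholders.
inputs:
  data_input_table:
    type: mltable
    path: azureml:my_prompts_table:1   # hypothetical registered MLTable
  api_type: chat_completion
  scoring_url: https://contoso.openai.azure.com/openai/deployments/my-deployment/chat/completions?api-version=2023-05-15  # hypothetical endpoint
  authentication_type: azureml_workspace_connection
  connection_name: my_aoai_connection                         # hypothetical connection holding the api-key
  additional_properties: '{"temperature": 0}'                 # merged into each request body at the top level
  additional_headers: '{"x-ms-client-request-id": "batch-1"}' # added to each request as headers
  max_retry_time_interval: 600                                # stop retrying a payload after 10 minutes
```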
Parallel configuration
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
initial_worker_count | | integer | 5 | | |
max_worker_count | Overrides initial_worker_count if necessary. | integer | 200 | | |
Partial results configuration
Name | Description | Type | Default | Optional | Enum |
---|---|---|---|---|---|
save_mini_batch_results | | string | disabled | False | ['disabled', 'enabled'] |
async_mode | Whether to use the PRS mini-batch streaming feature, which allows each PRS processor to process multiple mini-batches at a time. | boolean | False | | |
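The parallelism and partial-results settings above might be combined in an inputs block like the following sketch (values shown are the documented defaults except save_mini_batch_results, flipped here to illustrate persisting per-mini-batch results):

```yaml
# Illustrative throughput and output tuning for the component.
inputs:
  initial_worker_count: 5        # workers at startup
  max_worker_count: 200          # upper bound; overrides initial_worker_count if necessary
  save_mini_batch_results: enabled   # persist per-mini-batch results to the output directory
  async_mode: false              # set true to let each PRS processor stream multiple mini-batches
```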
Outputs
Name | Description | Type |
---|---|---|
job_out_path | | uri_file |
mini_batch_results_out_directory | | uri_folder |
metrics_out_directory | | uri_folder |
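A sketch of how these outputs might be bound in the consuming job (the datastore path is a hypothetical example; outputs left without a path are placed in the job's default datastore):

```yaml
# Illustrative outputs block for a job using this component.
outputs:
  job_out_path:
    type: uri_file
  mini_batch_results_out_directory:
    type: uri_folder
    path: azureml://datastores/workspaceblobstore/paths/batch_score/minibatches/  # hypothetical path
  metrics_out_directory:
    type: uri_folder
```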