LLM Fine-Tuning - with my own data #4006
asumansaree asked this question in Q&A (unanswered)
Greetings! I've watched your webinar on YouTube, Build Custom LLMs on Your Data. In the webinar, and also in Ludwig's tutorial, an LLM is fine-tuned on the Alpaca dataset to follow instructions. What I want instead is to fine-tune an LLM that already has chat capability on my own title/content data, so the resulting model can answer questions from that data conversationally. Like:
title: Corporate - About Us - Vision
content: Producing customer-centered financial digital projects using current and stable technologies and transforming these projects into global products.
A user should be able to ask "What is your vision?" and the chat model should answer it from my data, chat-style.
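For example, I imagine flattening my data into a question/answer table along these lines (the `question` and `answer` column names are just placeholders I made up):

```python
import pandas as pd

# Illustrative only: each title/content pair becomes one or more
# question/answer rows the chat model should learn from.
df = pd.DataFrame(
    {
        "question": ["What is your vision?"],
        "answer": [
            "Producing customer-centered financial digital projects using "
            "current and stable technologies and transforming these "
            "projects into global products."
        ],
    }
)
df.to_csv("train.csv", index=False)
```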
Here is the config file I've tried (the data is in Turkish):
```python
import yaml

# Ludwig config: 4-bit quantized LoRA fine-tuning of the Turkish base model.
config_str = """
model_type: llm
base_model: Trendyol/Trendyol-LLM-7b-base-v1.0
quantization:
  bits: 4
adapter:
  type: lora
prompt:
  template: |
    ### Input:
    Content: {sample}
    ### Response:
input_features:
  type: text
  preprocessing:
    max_sequence_length: 256
output_features:
  type: text
  preprocessing:
    max_sequence_length: 256
trainer:
  type: finetune
  learning_rate: 0.0001
  batch_size: 1
  gradient_accumulation_steps: 16
  epochs: 3
  learning_rate_scheduler:
    warmup_fraction: 0.01
preprocessing:
  sample_ratio: 0.1
"""
config = yaml.safe_load(config_str)
```
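From the Ludwig docs, it looks like `input_features` and `output_features` should be lists of entries whose `name` matches a dataset column, and the placeholder in the prompt template should reference one of those columns. Below is a minimal sketch of what I think the structure should look like; the `question`/`answer` column names are placeholders carried over from the table sketch above, not anything from the webinar:

```python
import yaml
from ludwig.api import LudwigModel

# Sketch only: assumes a train.csv with "question" and "answer" columns
# (made-up names) and the same base model as above.
sketch_config = yaml.safe_load(
    """
model_type: llm
base_model: Trendyol/Trendyol-LLM-7b-base-v1.0

quantization:
  bits: 4

adapter:
  type: lora

prompt:
  template: |
    ### Input:
    {question}

    ### Response:

input_features:
  - name: question
    type: text
    preprocessing:
      max_sequence_length: 256

output_features:
  - name: answer
    type: text
    preprocessing:
      max_sequence_length: 256

trainer:
  type: finetune   # remaining trainer settings as in my config above
  batch_size: 1
  epochs: 3
"""
)

model = LudwigModel(config=sketch_config)
train_results = model.train(dataset="train.csv")
```

If that is roughly right, `model.train` would fine-tune the LoRA adapter on the CSV from the earlier sketch.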
Still, with my config I couldn't even get past preprocessing. Is there a resource on how I can achieve this kind of task (in Turkish)? Thanks in advance.