LLM Fine-Tuning - with my own data #4006
asumansaree asked this question in Q&A (unanswered)
Greetings! I've watched your webinar on YouTube, Build Custom LLMs on Your Data. In the webinar, and also in Ludwig's tutorial, an LLM is fine-tuned on the Alpaca dataset to follow instructions. What I want instead is to fine-tune an LLM that already has chat capability on my own title/content data, so the resulting model can answer questions from that data conversationally. Like:
title: Corporate - About Us - Vision
content: Producing customer-centered financial digital projects using current and stable technologies and transforming these projects into global products.
A user should be able to ask "What is your vision?" and the chat model should answer it from my data, chat-style.
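For example, I imagine flattening my data into a question/answer table along these lines (the `question` and `answer` column names are just placeholders I made up):

```python
import pandas as pd

# Illustrative only: each title/content pair becomes one or more
# question/answer rows the chat model should learn from.
df = pd.DataFrame(
    {
        "question": ["What is your vision?"],
        "answer": [
            "Producing customer-centered financial digital projects using "
            "current and stable technologies and transforming these "
            "projects into global products."
        ],
    }
)
df.to_csv("train.csv", index=False)
```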
Here is the config file I've tried (the data is in Turkish):
```python
import yaml

# Ludwig config: 4-bit quantized LoRA fine-tuning of the Turkish base model.
config_str = """
model_type: llm
base_model: Trendyol/Trendyol-LLM-7b-base-v1.0
quantization:
  bits: 4
adapter:
  type: lora
prompt:
  template: |
    ### Input:
    Content: {sample}
    ### Response:
input_features:
  type: text
  preprocessing:
    max_sequence_length: 256
output_features:
  type: text
  preprocessing:
    max_sequence_length: 256
trainer:
  type: finetune
  learning_rate: 0.0001
  batch_size: 1
  gradient_accumulation_steps: 16
  epochs: 3
  learning_rate_scheduler:
    warmup_fraction: 0.01
preprocessing:
  sample_ratio: 0.1
"""
config = yaml.safe_load(config_str)
```
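From the Ludwig docs, it looks like `input_features` and `output_features` should be lists of entries whose `name` matches a dataset column, and the placeholder in the prompt template should reference one of those columns. Below is a minimal sketch of what I think the structure should look like; the `question`/`answer` column names are placeholders carried over from the table sketch above, not anything from the webinar:

```python
import yaml
from ludwig.api import LudwigModel

# Sketch only: assumes a train.csv with "question" and "answer" columns
# (made-up names) and the same base model as above.
sketch_config = yaml.safe_load(
    """
model_type: llm
base_model: Trendyol/Trendyol-LLM-7b-base-v1.0

quantization:
  bits: 4

adapter:
  type: lora

prompt:
  template: |
    ### Input:
    {question}

    ### Response:

input_features:
  - name: question
    type: text
    preprocessing:
      max_sequence_length: 256

output_features:
  - name: answer
    type: text
    preprocessing:
      max_sequence_length: 256

trainer:
  type: finetune   # remaining trainer settings as in my config above
  batch_size: 1
  epochs: 3
"""
)

model = LudwigModel(config=sketch_config)
train_results = model.train(dataset="train.csv")
```

If that is roughly right, `model.train` would fine-tune the LoRA adapter on the CSV from the earlier sketch.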
Still, with my config I couldn't even get past preprocessing. Is there a resource on how I can achieve this kind of task (in Turkish)? Thanks in advance.