
iPhone 12 Pro, iPad, iPhone 15, and some other devices: the llama.rn package fails to load the model, even though the device has the latest OS and enough GPU to run it. #102

Open
PratikBhatti83 opened this issue Dec 29, 2024 · 10 comments


@PratikBhatti83

When using the llama.rn package, the initLlama function fails to initialize the context, throwing a "Failed to load model" error.
[Screenshot: "Failed to load model" error, 2024-12-29 1:41 PM]
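
For context, a minimal sketch of the kind of initialization that fails here, based on the usage shown in the llama.rn README; the model path and parameter values are illustrative, not the reporter's exact code:

```ts
import { initLlama } from 'llama.rn'

// Hypothetical local path to the downloaded GGUF file
const modelPath = '/path/to/Llama-3.2-1B-Instruct-Q4_K_M.gguf'

async function loadModel() {
  try {
    const context = await initLlama({
      model: modelPath,
      n_ctx: 1024,     // example context size
      n_gpu_layers: 1, // > 0 enables Metal on iOS (per the llama.rn README)
    })
    console.log('Model loaded')
    return context
  } catch (e) {
    // On the affected devices this promise rejects with "Failed to load model"
    console.error('Context initialization failed:', e)
  }
}
```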

@PratikBhatti83
Author

@jhen0409 @a-ghorbani Is there any solution for this? Could you please help me solve this problem?

@jhen0409
Member

jhen0409 commented Jan 2, 2025

I can't figure out the cause based on the error message alone. Could you provide more information, like the Xcode logs or which model you are using?

@PratikBhatti83
Author

PratikBhatti83 commented Jan 2, 2025

@jhen0409 I used the Llama-3.2-1B-Instruct-Q4_K_M.gguf model.

The Xcode log shows only the error below:

Context initialization failed: Error: Failed to load model.

Meanwhile, on other devices like the iPhone 13 and iPhone 14 Pro, I am able to run this model, but on some devices it fails to load during initialization. I think you should update the llama.cpp version or sync the latest code, because I checked and llama.cpp itself doesn't have this kind of initialization issue.
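
As a diagnostic sketch (an assumption on this thread's part, not a confirmed fix): since the failure is device-specific, forcing CPU-only inference by setting `n_gpu_layers` to 0 would rule the Metal backend in or out as the culprit:

```ts
import { initLlama } from 'llama.rn'

// Retry with the Metal backend disabled to check whether the
// device-specific failure is in the GPU path (assumption, not a confirmed fix).
async function loadModelCpuOnly(modelPath: string) {
  return initLlama({
    model: modelPath,  // same hypothetical path as in the earlier sketch
    n_ctx: 1024,
    n_gpu_layers: 0,   // 0 = CPU only, bypassing Metal entirely
  })
}
```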

@PratikBhatti83
Author

@jhen0409 Is there any update on this? Please help me solve this issue!

@jhen0409
Member

jhen0409 commented Jan 6, 2025

I can update llama.cpp tomorrow, but I'm not sure it will help you.

You should be able to get Xcode logs like this:

Logs: Load Llama-3.2-1B-Instruct-Q4_K_M.gguf (success)
llama_model_loader: loaded meta data with 30 key-value pairs and 147 tensors from <PATH>
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.type str              = model
llama_model_loader: - kv   2:                               general.name str              = Llama 3.2 1B Instruct
llama_model_loader: - kv   3:                           general.finetune str              = Instruct
llama_model_loader: - kv   4:                           general.basename str              = Llama-3.2
llama_model_loader: - kv   5:                         general.size_label str              = 1B
llama_model_loader: - kv   6:                               general.tags arr[str,6]       = ["facebook", "meta", "pytorch", "llam...
llama_model_loader: - kv   7:                          general.languages arr[str,8]       = ["en", "de", "fr", "it", "pt", "hi", ...
llama_model_loader: - kv   8:                          llama.block_count u32              = 16
llama_model_loader: - kv   9:                       llama.context_length u32              = 131072
llama_model_loader: - kv  10:                     llama.embedding_length u32              = 2048
llama_model_loader: - kv  11:                  llama.feed_forward_length u32              = 8192
llama_model_loader: - kv  12:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv  13:              llama.attention.head_count_kv u32              = 8
llama_model_loader: - kv  14:                       llama.rope.freq_base f32              = 500000.000000
llama_model_loader: - kv  15:     llama.attention.layer_norm_rms_epsilon f32              = 0.000010
llama_model_loader: - kv  16:                 llama.attention.key_length u32              = 64
llama_model_loader: - kv  17:               llama.attention.value_length u32              = 64
llama_model_loader: - kv  18:                          general.file_type u32              = 15
llama_model_loader: - kv  19:                           llama.vocab_size u32              = 128256
llama_model_loader: - kv  20:                 llama.rope.dimension_count u32              = 64
llama_model_loader: - kv  21:                       tokenizer.ggml.model str              = gpt2
llama_model_loader: - kv  22:                         tokenizer.ggml.pre str              = llama-bpe
llama_model_loader: - kv  23:                      tokenizer.ggml.tokens arr[str,128256]  = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  24:                  tokenizer.ggml.token_type arr[i32,128256]  = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  25:                      tokenizer.ggml.merges arr[str,280147]  = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv  26:                tokenizer.ggml.bos_token_id u32              = 128000
llama_model_loader: - kv  27:                tokenizer.ggml.eos_token_id u32              = 128009
llama_model_loader: - kv  28:                    tokenizer.chat_template str              = {% set loop_messages = messages %}{% ...
llama_model_loader: - kv  29:               general.quantization_version u32              = 2
llama_model_loader: - type  f32:   34 tensors
llama_model_loader: - type q4_K:   96 tensors
llama_model_loader: - type q6_K:   17 tensors
llm_load_vocab: control token: 128254 '<|reserved_special_token_246|>' is not marked as EOG
llm_load_vocab: control token: 128252 '<|reserved_special_token_244|>' is not marked as EOG
llm_load_vocab: control token: 128251 '<|reserved_special_token_243|>' is not marked as EOG
llm_load_vocab: control token: 128250 '<|reserved_special_token_242|>' is not marked as EOG
llm_load_vocab: control token: 128248 '<|reserved_special_token_240|>' is not marked as EOG
llm_load_vocab: control token: 128247 '<|reserved_special_token_239|>' is not marked as EOG
llm_load_vocab: control token: 128245 '<|reserved_special_token_237|>' is not marked as EOG
llm_load_vocab: control token: 128244 '<|reserved_special_token_236|>' is not marked as EOG
llm_load_vocab: control token: 128243 '<|reserved_special_token_235|>' is not marked as EOG
llm_load_vocab: control token: 128240 '<|reserved_special_token_232|>' is not marked as EOG
llm_load_vocab: control token: 128238 '<|reserved_special_token_230|>' is not marked as EOG
llm_load_vocab: control token: 128235 '<|reserved_special_token_227|>' is not marked as EOG
llm_load_vocab: control token: 128234 '<|reserved_special_token_226|>' is not marked as EOG
llm_load_vocab: control token: 128229 '<|reserved_special_token_221|>' is not marked as EOG
llm_load_vocab: control token: 128227 '<|reserved_special_token_219|>' is not marked as EOG
llm_load_vocab: control token: 128226 '<|reserved_special_token_218|>' is not marked as EOG
llm_load_vocab: control token: 128224 '<|reserved_special_token_216|>' is not marked as EOG
llm_load_vocab: control token: 128223 '<|reserved_special_token_215|>' is not marked as EOG
llm_load_vocab: control token: 128221 '<|reserved_special_token_213|>' is not marked as EOG
llm_load_vocab: control token: 128219 '<|reserved_special_token_211|>' is not marked as EOG
llm_load_vocab: control token: 128218 '<|reserved_special_token_210|>' is not marked as EOG
llm_load_vocab: control token: 128217 '<|reserved_special_token_209|>' is not marked as EOG
llm_load_vocab: control token: 128216 '<|reserved_special_token_208|>' is not marked as EOG
llm_load_vocab: control token: 128215 '<|reserved_special_token_207|>' is not marked as EOG
llm_load_vocab: control token: 128213 '<|reserved_special_token_205|>' is not marked as EOG
llm_load_vocab: control token: 128211 '<|reserved_special_token_203|>' is not marked as EOG
llm_load_vocab: control token: 128210 '<|reserved_special_token_202|>' is not marked as EOG
llm_load_vocab: control token: 128209 '<|reserved_special_token_201|>' is not marked as EOG
llm_load_vocab: control token: 128208 '<|reserved_special_token_200|>' is not marked as EOG
llm_load_vocab: control token: 128207 '<|reserved_special_token_199|>' is not marked as EOG
llm_load_vocab: control token: 128204 '<|reserved_special_token_196|>' is not marked as EOG
llm_load_vocab: control token: 128202 '<|reserved_special_token_194|>' is not marked as EOG
llm_load_vocab: control token: 128197 '<|reserved_special_token_189|>' is not marked as EOG
llm_load_vocab: control token: 128195 '<|reserved_special_token_187|>' is not marked as EOG
llm_load_vocab: control token: 128194 '<|reserved_special_token_186|>' is not marked as EOG
llm_load_vocab: control token: 128191 '<|reserved_special_token_183|>' is not marked as EOG
llm_load_vocab: control token: 128190 '<|reserved_special_token_182|>' is not marked as EOG
llm_load_vocab: control token: 128188 '<|reserved_special_token_180|>' is not marked as EOG
llm_load_vocab: control token: 128187 '<|reserved_special_token_179|>' is not marked as EOG
llm_load_vocab: control token: 128185 '<|reserved_special_token_177|>' is not marked as EOG
llm_load_vocab: control token: 128184 '<|reserved_special_token_176|>' is not marked as EOG
llm_load_vocab: control token: 128183 '<|reserved_special_token_175|>' is not marked as EOG
llm_load_vocab: control token: 128178 '<|reserved_special_token_170|>' is not marked as EOG
llm_load_vocab: control token: 128177 '<|reserved_special_token_169|>' is not marked as EOG
llm_load_vocab: control token: 128176 '<|reserved_special_token_168|>' is not marked as EOG
llm_load_vocab: control token: 128175 '<|reserved_special_token_167|>' is not marked as EOG
llm_load_vocab: control token: 128174 '<|reserved_special_token_166|>' is not marked as EOG
llm_load_vocab: control token: 128173 '<|reserved_special_token_165|>' is not marked as EOG
llm_load_vocab: control token: 128172 '<|reserved_special_token_164|>' is not marked as EOG
llm_load_vocab: control token: 128169 '<|reserved_special_token_161|>' is not marked as EOG
llm_load_vocab: control token: 128167 '<|reserved_special_token_159|>' is not marked as EOG
llm_load_vocab: control token: 128166 '<|reserved_special_token_158|>' is not marked as EOG
llm_load_vocab: control token: 128160 '<|reserved_special_token_152|>' is not marked as EOG
llm_load_vocab: control token: 128159 '<|reserved_special_token_151|>' is not marked as EOG
llm_load_vocab: control token: 128157 '<|reserved_special_token_149|>' is not marked as EOG
llm_load_vocab: control token: 128156 '<|reserved_special_token_148|>' is not marked as EOG
llm_load_vocab: control token: 128154 '<|reserved_special_token_146|>' is not marked as EOG
llm_load_vocab: control token: 128152 '<|reserved_special_token_144|>' is not marked as EOG
llm_load_vocab: control token: 128151 '<|reserved_special_token_143|>' is not marked as EOG
llm_load_vocab: control token: 128150 '<|reserved_special_token_142|>' is not marked as EOG
llm_load_vocab: control token: 128147 '<|reserved_special_token_139|>' is not marked as EOG
llm_load_vocab: control token: 128144 '<|reserved_special_token_136|>' is not marked as EOG
llm_load_vocab: control token: 128142 '<|reserved_special_token_134|>' is not marked as EOG
llm_load_vocab: control token: 128141 '<|reserved_special_token_133|>' is not marked as EOG
llm_load_vocab: control token: 128140 '<|reserved_special_token_132|>' is not marked as EOG
llm_load_vocab: control token: 128133 '<|reserved_special_token_125|>' is not marked as EOG
llm_load_vocab: control token: 128130 '<|reserved_special_token_122|>' is not marked as EOG
llm_load_vocab: control token: 128128 '<|reserved_special_token_120|>' is not marked as EOG
llm_load_vocab: control token: 128127 '<|reserved_special_token_119|>' is not marked as EOG
llm_load_vocab: control token: 128126 '<|reserved_special_token_118|>' is not marked as EOG
llm_load_vocab: control token: 128125 '<|reserved_special_token_117|>' is not marked as EOG
llm_load_vocab: control token: 128124 '<|reserved_special_token_116|>' is not marked as EOG
llm_load_vocab: control token: 128123 '<|reserved_special_token_115|>' is not marked as EOG
llm_load_vocab: control token: 128122 '<|reserved_special_token_114|>' is not marked as EOG
llm_load_vocab: control token: 128121 '<|reserved_special_token_113|>' is not marked as EOG
llm_load_vocab: control token: 128120 '<|reserved_special_token_112|>' is not marked as EOG
llm_load_vocab: control token: 128119 '<|reserved_special_token_111|>' is not marked as EOG
llm_load_vocab: control token: 128116 '<|reserved_special_token_108|>' is not marked as EOG
llm_load_vocab: control token: 128115 '<|reserved_special_token_107|>' is not marked as EOG
llm_load_vocab: control token: 128114 '<|reserved_special_token_106|>' is not marked as EOG
llm_load_vocab: control token: 128113 '<|reserved_special_token_105|>' is not marked as EOG
llm_load_vocab: control token: 128111 '<|reserved_special_token_103|>' is not marked as EOG
llm_load_vocab: control token: 128110 '<|reserved_special_token_102|>' is not marked as EOG
llm_load_vocab: control token: 128107 '<|reserved_special_token_99|>' is not marked as EOG
llm_load_vocab: control token: 128106 '<|reserved_special_token_98|>' is not marked as EOG
llm_load_vocab: control token: 128105 '<|reserved_special_token_97|>' is not marked as EOG
llm_load_vocab: control token: 128104 '<|reserved_special_token_96|>' is not marked as EOG
llm_load_vocab: control token: 128103 '<|reserved_special_token_95|>' is not marked as EOG
llm_load_vocab: control token: 128100 '<|reserved_special_token_92|>' is not marked as EOG
llm_load_vocab: control token: 128097 '<|reserved_special_token_89|>' is not marked as EOG
llm_load_vocab: control token: 128096 '<|reserved_special_token_88|>' is not marked as EOG
llm_load_vocab: control token: 128094 '<|reserved_special_token_86|>' is not marked as EOG
llm_load_vocab: control token: 128093 '<|reserved_special_token_85|>' is not marked as EOG
llm_load_vocab: control token: 128090 '<|reserved_special_token_82|>' is not marked as EOG
llm_load_vocab: control token: 128089 '<|reserved_special_token_81|>' is not marked as EOG
llm_load_vocab: control token: 128087 '<|reserved_special_token_79|>' is not marked as EOG
llm_load_vocab: control token: 128085 '<|reserved_special_token_77|>' is not marked as EOG
llm_load_vocab: control token: 128080 '<|reserved_special_token_72|>' is not marked as EOG
llm_load_vocab: control token: 128077 '<|reserved_special_token_69|>' is not marked as EOG
llm_load_vocab: control token: 128076 '<|reserved_special_token_68|>' is not marked as EOG
llm_load_vocab: control token: 128073 '<|reserved_special_token_65|>' is not marked as EOG
llm_load_vocab: control token: 128070 '<|reserved_special_token_62|>' is not marked as EOG
llm_load_vocab: control token: 128069 '<|reserved_special_token_61|>' is not marked as EOG
llm_load_vocab: control token: 128067 '<|reserved_special_token_59|>' is not marked as EOG
llm_load_vocab: control token: 128064 '<|reserved_special_token_56|>' is not marked as EOG
llm_load_vocab: control token: 128062 '<|reserved_special_token_54|>' is not marked as EOG
llm_load_vocab: control token: 128061 '<|reserved_special_token_53|>' is not marked as EOG
llm_load_vocab: control token: 128060 '<|reserved_special_token_52|>' is not marked as EOG
llm_load_vocab: control token: 128054 '<|reserved_special_token_46|>' is not marked as EOG
llm_load_vocab: control token: 128045 '<|reserved_special_token_37|>' is not marked as EOG
llm_load_vocab: control token: 128044 '<|reserved_special_token_36|>' is not marked as EOG
llm_load_vocab: control token: 128043 '<|reserved_special_token_35|>' is not marked as EOG
llm_load_vocab: control token: 128042 '<|reserved_special_token_34|>' is not marked as EOG
llm_load_vocab: control token: 128038 '<|reserved_special_token_30|>' is not marked as EOG
llm_load_vocab: control token: 128037 '<|reserved_special_token_29|>' is not marked as EOG
llm_load_vocab: control token: 128035 '<|reserved_special_token_27|>' is not marked as EOG
llm_load_vocab: control token: 128034 '<|reserved_special_token_26|>' is not marked as EOG
llm_load_vocab: control token: 128033 '<|reserved_special_token_25|>' is not marked as EOG
llm_load_vocab: control token: 128032 '<|reserved_special_token_24|>' is not marked as EOG
llm_load_vocab: control token: 128030 '<|reserved_special_token_22|>' is not marked as EOG
llm_load_vocab: control token: 128029 '<|reserved_special_token_21|>' is not marked as EOG
llm_load_vocab: control token: 128028 '<|reserved_special_token_20|>' is not marked as EOG
llm_load_vocab: control token: 128026 '<|reserved_special_token_18|>' is not marked as EOG
llm_load_vocab: control token: 128025 '<|reserved_special_token_17|>' is not marked as EOG
llm_load_vocab: control token: 128024 '<|reserved_special_token_16|>' is not marked as EOG
llm_load_vocab: control token: 128022 '<|reserved_special_token_14|>' is not marked as EOG
llm_load_vocab: control token: 128020 '<|reserved_special_token_12|>' is not marked as EOG
llm_load_vocab: control token: 128017 '<|reserved_special_token_9|>' is not marked as EOG
llm_load_vocab: control token: 128016 '<|reserved_special_token_8|>' is not marked as EOG
llm_load_vocab: control token: 128015 '<|reserved_special_token_7|>' is not marked as EOG
llm_load_vocab: control token: 128014 '<|reserved_special_token_6|>' is not marked as EOG
llm_load_vocab: control token: 128013 '<|reserved_special_token_5|>' is not marked as EOG
llm_load_vocab: control token: 128011 '<|reserved_special_token_3|>' is not marked as EOG
llm_load_vocab: control token: 128010 '<|python_tag|>' is not marked as EOG
llm_load_vocab: control token: 128006 '<|start_header_id|>' is not marked as EOG
llm_load_vocab: control token: 128003 '<|reserved_special_token_1|>' is not marked as EOG
llm_load_vocab: control token: 128002 '<|reserved_special_token_0|>' is not marked as EOG
llm_load_vocab: control token: 128000 '<|begin_of_text|>' is not marked as EOG
llm_load_vocab: control token: 128041 '<|reserved_special_token_33|>' is not marked as EOG
llm_load_vocab: control token: 128063 '<|reserved_special_token_55|>' is not marked as EOG
llm_load_vocab: control token: 128046 '<|reserved_special_token_38|>' is not marked as EOG
llm_load_vocab: control token: 128007 '<|end_header_id|>' is not marked as EOG
llm_load_vocab: control token: 128065 '<|reserved_special_token_57|>' is not marked as EOG
llm_load_vocab: control token: 128171 '<|reserved_special_token_163|>' is not marked as EOG
llm_load_vocab: control token: 128162 '<|reserved_special_token_154|>' is not marked as EOG
llm_load_vocab: control token: 128165 '<|reserved_special_token_157|>' is not marked as EOG
llm_load_vocab: control token: 128057 '<|reserved_special_token_49|>' is not marked as EOG
llm_load_vocab: control token: 128050 '<|reserved_special_token_42|>' is not marked as EOG
llm_load_vocab: control token: 128056 '<|reserved_special_token_48|>' is not marked as EOG
llm_load_vocab: control token: 128230 '<|reserved_special_token_222|>' is not marked as EOG
llm_load_vocab: control token: 128098 '<|reserved_special_token_90|>' is not marked as EOG
llm_load_vocab: control token: 128153 '<|reserved_special_token_145|>' is not marked as EOG
llm_load_vocab: control token: 128084 '<|reserved_special_token_76|>' is not marked as EOG
llm_load_vocab: control token: 128082 '<|reserved_special_token_74|>' is not marked as EOG
llm_load_vocab: control token: 128102 '<|reserved_special_token_94|>' is not marked as EOG
llm_load_vocab: control token: 128253 '<|reserved_special_token_245|>' is not marked as EOG
llm_load_vocab: control token: 128179 '<|reserved_special_token_171|>' is not marked as EOG
llm_load_vocab: control token: 128071 '<|reserved_special_token_63|>' is not marked as EOG
llm_load_vocab: control token: 128135 '<|reserved_special_token_127|>' is not marked as EOG
llm_load_vocab: control token: 128161 '<|reserved_special_token_153|>' is not marked as EOG
llm_load_vocab: control token: 128164 '<|reserved_special_token_156|>' is not marked as EOG
llm_load_vocab: control token: 128134 '<|reserved_special_token_126|>' is not marked as EOG
llm_load_vocab: control token: 128249 '<|reserved_special_token_241|>' is not marked as EOG
llm_load_vocab: control token: 128004 '<|finetune_right_pad_id|>' is not marked as EOG
llm_load_vocab: control token: 128036 '<|reserved_special_token_28|>' is not marked as EOG
llm_load_vocab: control token: 128148 '<|reserved_special_token_140|>' is not marked as EOG
llm_load_vocab: control token: 128181 '<|reserved_special_token_173|>' is not marked as EOG
llm_load_vocab: control token: 128222 '<|reserved_special_token_214|>' is not marked as EOG
llm_load_vocab: control token: 128075 '<|reserved_special_token_67|>' is not marked as EOG
llm_load_vocab: control token: 128241 '<|reserved_special_token_233|>' is not marked as EOG
llm_load_vocab: control token: 128051 '<|reserved_special_token_43|>' is not marked as EOG
llm_load_vocab: control token: 128068 '<|reserved_special_token_60|>' is not marked as EOG
llm_load_vocab: control token: 128149 '<|reserved_special_token_141|>' is not marked as EOG
llm_load_vocab: control token: 128201 '<|reserved_special_token_193|>' is not marked as EOG
llm_load_vocab: control token: 128058 '<|reserved_special_token_50|>' is not marked as EOG
llm_load_vocab: control token: 128146 '<|reserved_special_token_138|>' is not marked as EOG
llm_load_vocab: control token: 128143 '<|reserved_special_token_135|>' is not marked as EOG
llm_load_vocab: control token: 128023 '<|reserved_special_token_15|>' is not marked as EOG
llm_load_vocab: control token: 128039 '<|reserved_special_token_31|>' is not marked as EOG
llm_load_vocab: control token: 128132 '<|reserved_special_token_124|>' is not marked as EOG
llm_load_vocab: control token: 128101 '<|reserved_special_token_93|>' is not marked as EOG
llm_load_vocab: control token: 128212 '<|reserved_special_token_204|>' is not marked as EOG
llm_load_vocab: control token: 128189 '<|reserved_special_token_181|>' is not marked as EOG
llm_load_vocab: control token: 128225 '<|reserved_special_token_217|>' is not marked as EOG
llm_load_vocab: control token: 128129 '<|reserved_special_token_121|>' is not marked as EOG
llm_load_vocab: control token: 128005 '<|reserved_special_token_2|>' is not marked as EOG
llm_load_vocab: control token: 128078 '<|reserved_special_token_70|>' is not marked as EOG
llm_load_vocab: control token: 128163 '<|reserved_special_token_155|>' is not marked as EOG
llm_load_vocab: control token: 128072 '<|reserved_special_token_64|>' is not marked as EOG
llm_load_vocab: control token: 128112 '<|reserved_special_token_104|>' is not marked as EOG
llm_load_vocab: control token: 128186 '<|reserved_special_token_178|>' is not marked as EOG
llm_load_vocab: control token: 128095 '<|reserved_special_token_87|>' is not marked as EOG
llm_load_vocab: control token: 128109 '<|reserved_special_token_101|>' is not marked as EOG
llm_load_vocab: control token: 128099 '<|reserved_special_token_91|>' is not marked as EOG
llm_load_vocab: control token: 128138 '<|reserved_special_token_130|>' is not marked as EOG
llm_load_vocab: control token: 128193 '<|reserved_special_token_185|>' is not marked as EOG
llm_load_vocab: control token: 128199 '<|reserved_special_token_191|>' is not marked as EOG
llm_load_vocab: control token: 128048 '<|reserved_special_token_40|>' is not marked as EOG
llm_load_vocab: control token: 128088 '<|reserved_special_token_80|>' is not marked as EOG
llm_load_vocab: control token: 128192 '<|reserved_special_token_184|>' is not marked as EOG
llm_load_vocab: control token: 128136 '<|reserved_special_token_128|>' is not marked as EOG
llm_load_vocab: control token: 128092 '<|reserved_special_token_84|>' is not marked as EOG
llm_load_vocab: control token: 128158 '<|reserved_special_token_150|>' is not marked as EOG
llm_load_vocab: control token: 128001 '<|end_of_text|>' is not marked as EOG
llm_load_vocab: control token: 128049 '<|reserved_special_token_41|>' is not marked as EOG
llm_load_vocab: control token: 128031 '<|reserved_special_token_23|>' is not marked as EOG
llm_load_vocab: control token: 128255 '<|reserved_special_token_247|>' is not marked as EOG
llm_load_vocab: control token: 128182 '<|reserved_special_token_174|>' is not marked as EOG
llm_load_vocab: control token: 128066 '<|reserved_special_token_58|>' is not marked as EOG
llm_load_vocab: control token: 128180 '<|reserved_special_token_172|>' is not marked as EOG
llm_load_vocab: control token: 128233 '<|reserved_special_token_225|>' is not marked as EOG
llm_load_vocab: control token: 128079 '<|reserved_special_token_71|>' is not marked as EOG
llm_load_vocab: control token: 128081 '<|reserved_special_token_73|>' is not marked as EOG
llm_load_vocab: control token: 128231 '<|reserved_special_token_223|>' is not marked as EOG
llm_load_vocab: control token: 128196 '<|reserved_special_token_188|>' is not marked as EOG
llm_load_vocab: control token: 128047 '<|reserved_special_token_39|>' is not marked as EOG
llm_load_vocab: control token: 128083 '<|reserved_special_token_75|>' is not marked as EOG
llm_load_vocab: control token: 128139 '<|reserved_special_token_131|>' is not marked as EOG
llm_load_vocab: control token: 128131 '<|reserved_special_token_123|>' is not marked as EOG
llm_load_vocab: control token: 128118 '<|reserved_special_token_110|>' is not marked as EOG
llm_load_vocab: control token: 128053 '<|reserved_special_token_45|>' is not marked as EOG
llm_load_vocab: control token: 128220 '<|reserved_special_token_212|>' is not marked as EOG
llm_load_vocab: control token: 128108 '<|reserved_special_token_100|>' is not marked as EOG
llm_load_vocab: control token: 128091 '<|reserved_special_token_83|>' is not marked as EOG
llm_load_vocab: control token: 128203 '<|reserved_special_token_195|>' is not marked as EOG
llm_load_vocab: control token: 128059 '<|reserved_special_token_51|>' is not marked as EOG
llm_load_vocab: control token: 128019 '<|reserved_special_token_11|>' is not marked as EOG
llm_load_vocab: control token: 128170 '<|reserved_special_token_162|>' is not marked as EOG
llm_load_vocab: control token: 128205 '<|reserved_special_token_197|>' is not marked as EOG
llm_load_vocab: control token: 128040 '<|reserved_special_token_32|>' is not marked as EOG
llm_load_vocab: control token: 128200 '<|reserved_special_token_192|>' is not marked as EOG
llm_load_vocab: control token: 128236 '<|reserved_special_token_228|>' is not marked as EOG
llm_load_vocab: control token: 128145 '<|reserved_special_token_137|>' is not marked as EOG
llm_load_vocab: control token: 128168 '<|reserved_special_token_160|>' is not marked as EOG
llm_load_vocab: control token: 128214 '<|reserved_special_token_206|>' is not marked as EOG
llm_load_vocab: control token: 128137 '<|reserved_special_token_129|>' is not marked as EOG
llm_load_vocab: control token: 128232 '<|reserved_special_token_224|>' is not marked as EOG
llm_load_vocab: control token: 128239 '<|reserved_special_token_231|>' is not marked as EOG
llm_load_vocab: control token: 128055 '<|reserved_special_token_47|>' is not marked as EOG
llm_load_vocab: control token: 128228 '<|reserved_special_token_220|>' is not marked as EOG
llm_load_vocab: control token: 128206 '<|reserved_special_token_198|>' is not marked as EOG
llm_load_vocab: control token: 128018 '<|reserved_special_token_10|>' is not marked as EOG
llm_load_vocab: control token: 128012 '<|reserved_special_token_4|>' is not marked as EOG
llm_load_vocab: control token: 128198 '<|reserved_special_token_190|>' is not marked as EOG
llm_load_vocab: control token: 128021 '<|reserved_special_token_13|>' is not marked as EOG
llm_load_vocab: control token: 128086 '<|reserved_special_token_78|>' is not marked as EOG
llm_load_vocab: control token: 128074 '<|reserved_special_token_66|>' is not marked as EOG
llm_load_vocab: control token: 128027 '<|reserved_special_token_19|>' is not marked as EOG
llm_load_vocab: control token: 128242 '<|reserved_special_token_234|>' is not marked as EOG
llm_load_vocab: control token: 128155 '<|reserved_special_token_147|>' is not marked as EOG
llm_load_vocab: control token: 128052 '<|reserved_special_token_44|>' is not marked as EOG
llm_load_vocab: control token: 128246 '<|reserved_special_token_238|>' is not marked as EOG
llm_load_vocab: control token: 128117 '<|reserved_special_token_109|>' is not marked as EOG
llm_load_vocab: control token: 128237 '<|reserved_special_token_229|>' is not marked as EOG
llm_load_vocab: special tokens cache size = 256
llm_load_vocab: token to piece cache size = 0.7999 MB
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = BPE
llm_load_print_meta: n_vocab          = 128256
llm_load_print_meta: n_merges         = 280147
llm_load_print_meta: vocab_only       = 0
llm_load_print_meta: n_ctx_train      = 131072
llm_load_print_meta: n_embd           = 2048
llm_load_print_meta: n_layer          = 16
llm_load_print_meta: n_head           = 32
llm_load_print_meta: n_head_kv        = 8
llm_load_print_meta: n_rot            = 64
llm_load_print_meta: n_swa            = 0
llm_load_print_meta: n_embd_head_k    = 64
llm_load_print_meta: n_embd_head_v    = 64
llm_load_print_meta: n_gqa            = 4
llm_load_print_meta: n_embd_k_gqa     = 512
llm_load_print_meta: n_embd_v_gqa     = 512
llm_load_print_meta: f_norm_eps       = 0.0e+00
llm_load_print_meta: f_norm_rms_eps   = 1.0e-05
llm_load_print_meta: f_clamp_kqv      = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale    = 0.0e+00
llm_load_print_meta: n_ff             = 8192
llm_load_print_meta: n_expert         = 0
llm_load_print_meta: n_expert_used    = 0
llm_load_print_meta: causal attn      = 1
llm_load_print_meta: pooling type     = 0
llm_load_print_meta: rope type        = 0
llm_load_print_meta: rope scaling     = linear
llm_load_print_meta: freq_base_train  = 500000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn  = 131072
llm_load_print_meta: rope_finetuned   = unknown
llm_load_print_meta: ssm_d_conv       = 0
llm_load_print_meta: ssm_d_inner      = 0
llm_load_print_meta: ssm_d_state      = 0
llm_load_print_meta: ssm_dt_rank      = 0
llm_load_print_meta: ssm_dt_b_c_rms   = 0
llm_load_print_meta: model type       = 1B
llm_load_print_meta: model ftype      = Q4_K - Medium
llm_load_print_meta: model params     = 1.24 B
llm_load_print_meta: model size       = 762.81 MiB (5.18 BPW) 
llm_load_print_meta: general.name     = Llama 3.2 1B Instruct
llm_load_print_meta: BOS token        = 128000 '<|begin_of_text|>'
llm_load_print_meta: EOS token        = 128009 '<|eot_id|>'
llm_load_print_meta: EOT token        = 128009 '<|eot_id|>'
llm_load_print_meta: EOM token        = 128008 '<|eom_id|>'
llm_load_print_meta: LF token         = 128 'Ä'
llm_load_print_meta: EOG token        = 128008 '<|eom_id|>'
llm_load_print_meta: EOG token        = 128009 '<|eot_id|>'
llm_load_print_meta: max token length = 256
llm_load_tensors:   CPU_Mapped model buffer size =   762.81 MiB
llama_new_context_with_model: n_seq_max     = 1
llama_new_context_with_model: n_ctx         = 1024
llama_new_context_with_model: n_ctx_per_seq = 1024
llama_new_context_with_model: n_batch       = 256
llama_new_context_with_model: n_ubatch      = 256
llama_new_context_with_model: flash_attn    = 1
llama_new_context_with_model: freq_base     = 500000.0
llama_new_context_with_model: freq_scale    = 1
llama_new_context_with_model: n_ctx_per_seq (1024) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
llama_kv_cache_init: kv_size = 1024, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 16
llama_kv_cache_init: layer 0: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 1: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 2: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 3: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 4: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 5: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 6: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 7: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 8: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 9: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 10: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 11: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 12: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 13: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 14: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init: layer 15: n_embd_k_gqa = 512, n_embd_v_gqa = 512
llama_kv_cache_init:        CPU KV buffer size =    32.00 MiB
llama_new_context_with_model: KV self size  =   32.00 MiB, K (f16):   16.00 MiB, V (f16):   16.00 MiB
llama_new_context_with_model:        CPU  output buffer size =     0.49 MiB
llama_new_context_with_model:        CPU compute buffer size =   127.25 MiB
llama_new_context_with_model: graph nodes  = 455
llama_new_context_with_model: graph splits = 1
common_init_from_params: setting dry_penalty_last_n to ctx_size = 1024
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)

If it fails, there should be some logs that can help us find the reason.

@PratikBhatti83
Author

PratikBhatti83 commented Jan 6, 2025

@jhen0409 Let me share the logs:

LOG Model info (took 64ms): {"alignment": 32, "data_offset": 7831552, "general.architecture": "llama", "general.basename": "Llama-3.2", "general.file_type": "27", "general.finetune": "Instruct", "general.languages": "["en", "de", "fr", "it", "pt", "hi", "es", "th"]", "general.license": "llama3.2", "general.name": "Llama 3.2 1B Instruct", "general.quantization_version": "2", "general.size_label": "1B", "general.tags": "["facebook", "meta", "pytorch", "llama", "llama-3", "text-generation"]", "general.type": "model", "llama.attention.head_count": "32", "llama.attention.head_count_kv": "8", "llama.attention.key_length": "64", "llama.attention.layer_norm_rms_epsilon": "0.000010", "llama.attention.value_length": "64", "llama.block_count": "16", "llama.context_length": "131072", "llama.embedding_length": "2048", "llama.feed_forward_length": "8192", "llama.rope.dimension_count": "64", "llama.rope.freq_base": "500000.000000", "llama.vocab_size": "128256", "quantize.imatrix.chunks_count": "125", "quantize.imatrix.dataset": "/training_dir/calibration_datav3.txt", "quantize.imatrix.entries_count": "112", "quantize.imatrix.file": "/models_out/Llama-3.2-1B-Instruct-GGUF/Llama-3.2-1B-Instruct.imatrix", "tokenizer.chat_template": "{{- bos_token }}
{%- if custom_tools is defined %}
{%- set tools = custom_tools %}
{%- endif %}
{%- if not tools_in_user_message is defined %}
{%- set tools_in_user_message = true %}
{%- endif %}
{%- if not date_string is defined %}
{%- if strftime_now is defined %}
{%- set date_string = strftime_now("%d %b %Y") %}
{%- else %}
{%- set date_string = "26 Jul 2024" %}
{%- endif %}
{%- endif %}
{%- if not tools is defined %}
{%- set tools = none %}
{%- endif %}

{#- This block extracts the system message, so we can slot it into the right place. #}
{%- if messages[0]['role'] == 'system' %}
{%- set system_message = messages[0]['content']|trim %}
{%- set messages = messages[1:] %}
{%- else %}
{%- set system_message = "" %}
{%- endif %}

{#- System message #}
{{- "<|start_header_id|>system<|end_header_id|>\n\n" }}
{%- if tools is not none %}
{{- "Environment: ipython\n" }}
{%- endif %}
{{- "Cutting Knowledge Date: December 2023\n" }}
{{- "Today Date: " + date_string + "\n\n" }}
{%- if tools is not none and not tools_in_user_message %}
{{- "You have access to the following functions. To call a function, please respond with JSON for a function call." }}
{{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
{{- "Do not use variables.\n\n" }}
{%- for t in tools %}
{{- t | tojson(indent=4) }}
{{- "\n\n" }}
{%- endfor %}
{%- endif %}
{{- system_message }}
{{- "<|eot_id|>" }}

{#- Custom tools are passed in a user message with some extra guidance #}
{%- if tools_in_user_message and not tools is none %}
{#- Extract the first user message so we can plug it in here #}
{%- if messages | length != 0 %}
{%- set first_user_message = messages[0]['content']|trim %}
{%- set messages = messages[1:] %}
{%- else %}
{{- raise_exception("Cannot put tools in the first user message when there's no first user message!") }}
{%- endif %}
{{- '<|start_header_id|>user<|end_header_id|>\n\n' -}}
{{- "Given the following functions, please respond with a JSON for a function call " }}
{{- "with its proper arguments that best answers the given prompt.\n\n" }}
{{- 'Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}.' }}
{{- "Do not use variables.\n\n" }}
{%- for t in tools %}
{{- t | tojson(indent=4) }}
{{- "\n\n" }}
{%- endfor %}
{{- first_user_message + "<|eot_id|>"}}
{%- endif %}

{%- for message in messages %}
{%- if not (message.role == 'ipython' or message.role == 'tool' or 'tool_calls' in message) %}
{{- '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }}
{%- elif 'tool_calls' in message %}
{%- if not message.tool_calls|length == 1 %}
{{- raise_exception("This model only supports single tool-calls at once!") }}
{%- endif %}
{%- set tool_call = message.tool_calls[0].function %}
{{- '<|start_header_id|>assistant<|end_header_id|>\n\n' -}}
{{- '{"name": "' + tool_call.name + '", ' }}
{{- '"parameters": ' }}
{{- tool_call.arguments | tojson }}
{{- "}" }}
{{- "<|eot_id|>" }}
{%- elif message.role == "tool" or message.role == "ipython" %}
{{- "<|start_header_id|>ipython<|end_header_id|>\n\n" }}
{%- if message.content is mapping or message.content is iterable %}
{{- message.content | tojson }}
{%- else %}
{{- message.content }}
{%- endif %}
{{- "<|eot_id|>" }}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
{%- endif %}
", "tokenizer.ggml.bos_token_id": "128000", "tokenizer.ggml.eos_token_id": "128009", "tokenizer.ggml.model": "gpt2", "tokenizer.ggml.pre": "llama-bpe", "version": 3}``

@PratikBhatti83
Author

@jhen0409 Is there any update? Please check my logs and let me know the solution.

@PratikBhatti83
Author

@jhen0409 Is there any update? Please help me solve this issue!

@jhen0409
Member

Can you try v0.4.8?

> let me share the logs: [...logs]

This looks like logs from JS, not Xcode logs.

@PratikBhatti83
Author

Bro, still having the same issue.
