
train on bert base #13

Open
L-hongbin opened this issue Jan 9, 2021 · 9 comments

@L-hongbin

Hello, I'd like to know about the result of training this model on bert_base. I have trained on bert_base with c2f (python run.py train_bert_base_ml0_d2), but only get a result of about 67 F1.

@lxucs
Owner

lxucs commented Jan 9, 2021

There might be something off in your configuration. I have trained with spanbert_base weights and got an F1 score of 77+.

@sushantakpani

Hi,
What is the difference between train_bert_base_ml0_d1 and train_bert_base_ml0_d2?
Which configuration is for c2f 2019?

@lxucs
Owner

lxucs commented Jan 12, 2021

c2f 2019 is train_bert_xxx_ml0_d2. d2 uses Attended Antecedent as higher-order inference, while d1 only uses local decisions.
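In other words, the two configs presumably differ only in the higher-order settings, roughly like this (a sketch based on the options in the config pasted below; the exact HOCON keys and block names are assumptions):

# d1: local antecedent decisions only, no higher-order refinement
train_bert_base_ml0_d1 {
  coref_depth = 1
}

# d2: one extra round of Attended Antecedent refinement (c2f 2019)
train_bert_base_ml0_d2 {
  coref_depth = 2
  higher_order = "attended_antecedent"
}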

@sushantakpani

> c2f 2019 is train_bert_xxx_ml0_d2. d2 uses Attended Antecedent as higher-order inference, while d1 only uses local decisions.

@lxucs Thank you.

@L-hongbin
Author

> There might be something off in your configuration. I have trained with spanbert_base weights and got an F1 score of 77+.

I have trained on bert_base with the following configuration, but only get an F1 score of 73.2+; the c2f-coref bert_base model in the paper (https://arxiv.org/pdf/1908.09091.pdf) is about 73.9.

max_top_antecedents = 50
max_training_sentences = 11
top_span_ratio = 0.4
max_num_extracted_spans = 3900
max_num_speakers = 20
max_segment_len = 128
bert_learning_rate = 1e-05
task_learning_rate = 0.0002
loss_type = "marginalized"
mention_loss_coef = 0
false_new_delta = 1.5
adam_eps = 1e-06
adam_weight_decay = 0.01
warmup_ratio = 0.1
max_grad_norm = 1
gradient_accumulation_steps = 1
coref_depth = 2
higher_order = "attended_antecedent"
fine_grained = true
dropout_rate = 0.3
ffnn_size = 3000
ffnn_depth = 1
cluster_ffnn_size = 3900
cluster_reduce = "mean"
easy_cluster_first = false
cluster_dloss = false
num_epochs = 20
feature_emb_size = 20
max_span_width = 30
use_metadata = true
use_features = true
use_segment_distance = true
model_heads = true
use_width_prior = true
use_distance_prior = true
conll_eval_path = "./dev.english.v4_gold_conll"
conll_test_path = "./test.english.v4_gold_conll"
genres = ["bc", "bn", "mz", "nw", "pt", "tc", "wb"]
eval_frequency = 1000
report_frequency = 100
log_root = "./"
num_docs = 2802
bert_tokenizer_name = "bert-base-cased"
bert_pretrained_name_or_path = "bert-base-cased"

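One quick way to double-check which settings the trainer actually picks up is to parse the HOCON file directly with pyhocon (a minimal sketch; the file name experiments.conf is an assumption, and this is not necessarily how run.py loads its config):

from pyhocon import ConfigFactory

# Parse the experiment file and select one named configuration block.
conf = ConfigFactory.parse_file("experiments.conf")
cfg = conf["train_bert_base_ml0_d2"]

# Inspect the options that matter most for higher-order inference.
for key in ["coref_depth", "higher_order", "max_segment_len", "bert_learning_rate"]:
    print(key, "=", cfg.get(key))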
@lxucs
Owner

lxucs commented Mar 1, 2021

@L-hongbin I was talking about spanbert_base, not bert_base.

@L-hongbin
Author

> @L-hongbin I was talking about spanbert_base, not bert_base.

Thanks for your reply. So you don't have the results on bert_base?

@lxucs
Owner

lxucs commented Mar 1, 2021

@L-hongbin The results are similar to the reported numbers, but I don't have the exact numbers at hand right now.

@L-hongbin
Author

> @L-hongbin The results are similar to the reported numbers, but I don't have the exact numbers at hand right now.

Thanks~
