FEA: Add split token and generate related resources #59

Open

txy77 wants to merge 39 commits into RUCAIBox:main from txy77:main

Collaborator

txy77 commented Oct 6, 2022

Update the split-token step; generate word2vec, copy_mask, and token2id; load the pretrained model
Fix some bugs in the redial, inspired, and tgredial models
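
For readers unfamiliar with the resources named above, here is a minimal sketch of how a token2id mapping could be built from split text. It is not the code in this PR: the special-token names, the whitespace tokenizer, and the `min_freq` parameter are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical special tokens; the PR does not list the actual ones.
SPECIAL_TOKENS = ["__pad__", "__start__", "__end__", "__unk__"]


def split_text(text):
    """Whitespace tokenization, a stand-in for the PR's tokenizer classes."""
    return text.lower().split()


def build_token2id(corpus, min_freq=1):
    """Map each token to an integer id, reserving the lowest ids for special tokens."""
    counts = Counter(tok for line in corpus for tok in split_text(line))
    token2id = {tok: idx for idx, tok in enumerate(SPECIAL_TOKENS)}
    # Sort by descending frequency, then alphabetically, so the vocabulary is deterministic.
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq and tok not in token2id:
            token2id[tok] = len(token2id)
    return token2id


corpus = ["I like action movies", "I like comedy"]
token2id = build_token2id(corpus)
```

Reserving the first ids for special tokens keeps padding and unknown-word indices stable when the corpus changes, which is one reason a tokenizer might expose them separately.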

txy77 added 12 commits

September 28, 2022 22:56

txy

502463d


          txy77

ee1d80a


          Merge branch 'main' of https://github.com/txy77/CRSLab

e25f2d8


          txy77

10c36fe


          Merge branch 'main' of https://github.com/txy77/CRSLab

b8d82ed


          txy77

c8f1385


          Merge branch 'main' of https://github.com/txy77/CRSLab

ad22cde


          txy77

661824a


          Merge branch 'main' of https://github.com/txy77/CRSLab

b588195


          txy77

3226e5e


          txy77

99796d0


          txy77

1eb81e4

txy77 commented

Collaborator Author

txy77 left a comment

Fix the bugs
Retypeset the code

txy77 and others added 2 commits

October 7, 2022 11:07


          Delete settings.json

0b0436e


          txy77

735c3aa

txy77 commented

Collaborator Author

txy77 left a comment

change the name of the variable:

processing -> processed_
split_token -> split_text


          txy77

d775117

txy77 commented

Collaborator Author

txy77 left a comment

Add the version number of the Python package gensim


          txy77

7f43abe

txy77 commented

Collaborator Author

txy77 left a comment

change the name of variable:
crslabtokenizer -> Tokenizer


          txy77

36e120d

txy77 commented

Collaborator Author

txy77 left a comment

Change the way the config is loaded


          txy77

d245105

txy77 commented

Collaborator Author

txy77 left a comment

Fix the problem with building copy_mask.npy


          txy77

0fcd3d3

txy77 commented

Collaborator Author

txy77 left a comment

FIX: Removed unnecessary word2vec


          txy77

53427ea

txy77 commented

Collaborator Author

txy77 left a comment

FIX: Complete the integration of tokenizer classes


          txy77

8ac3b83

txy77 commented

Collaborator Author

txy77 left a comment

FIX: data type problem


          txy77

f92d485

txy77 commented

Collaborator Author

txy77 left a comment

Fix: add special_token_idx to tokenizer


          txy77

4763e57

txy77 commented

Collaborator Author

txy77 left a comment

FIX: conv special_token_idx


          txy77

b6d4c34

txy77 commented

Collaborator Author

txy77 left a comment

FIX: variable name: CRS_Tokenizer -> crs_tokenizer


          txy77

b6445df

txy77 commented

Collaborator Author

txy77 left a comment

FIX: variable name: wordembedding -> word_embedding


          txy77

d7b9d5b

txy77 commented

Collaborator Author

txy77 left a comment

FIX: delete redundant variable: crstokennizer


          txy77

2cab622

txy77 commented

Collaborator Author

txy77 left a comment

change variable name: BaseCrsTokenize -> BaseTokenizer


          txy77

97ea317

txy77 commented

Collaborator Author

txy77 left a comment

FIX: change as_tensor function


          txy77

af93741

txy77 commented

Collaborator Author

txy77 left a comment

FIX: separate word2vec and copy_mask from the dictionary

txy77 commented

Collaborator Author

txy77 left a comment

FIX: delete npy_dict


          txy77

699a537

txy77 commented

Collaborator Author

txy77 left a comment

FIX: bert_tokenize -> BertTokenizer


          txy77

b58aefa

txy77 commented

Collaborator Author

txy77 left a comment

FIX: variable name
self.Tokenizer -> self.tokenizer


          txy77

de8e3f3

txy77 commented

Collaborator Author

txy77 left a comment

FIX: add copy_mask = None

txy77 added 3 commits

November 14, 2022 21:32


          txy77

86a23d4


          txy77

f4c073e


          txy77

c7acee8

txy77 commented

Collaborator Author

txy77 left a comment

list_word -> word_list
add return copy_mask


          txy77

4edd74a
