December 09, 2021

spaCy: resume training from an existing model

spaCy is the best NLP tool I have used. It makes named entity recognition (NER) and document classification problems very simple to solve.

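In practice you often need to train fairly large models. If the corpus has only changed slightly, retraining the model from scratch is usually not worth it; continuing to fine-tune from the parameters of the previous training run usually lets the loss function converge faster, which saves training time.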

To fine-tune from existing parameters you need to modify the config file. Here is an example for an NER model:

[components.ner]
source = "./saved_model/model-last"
#factory = "ner"
#moves = null
#update_with_oracle_cut_size = 100

[components.tok2vec]
source = "./saved_model/model-last"
#factory = "tok2vec"
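
The same edit can be scripted when a config has to be switched back and forth between training from scratch and resuming. The following is a minimal sketch using only Python's standard `configparser`; the helper name `source_components` is hypothetical and not part of spaCy. It assumes that a sourced component takes its settings from the saved pipeline, so nested blocks such as `[components.ner.model]` are dropped.

```python
import configparser
import io
import json


def source_components(cfg_text, model_path, names):
    """Return cfg_text with each named component reduced to a `source` entry.

    Hypothetical helper for illustration, not part of spaCy.
    """
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep keys such as "@architectures" unchanged
    parser.read_string(cfg_text)

    for name in names:
        section = f"components.{name}"
        # Assumption: nested blocks like [components.ner.model] are not needed
        # once the component is copied from the saved pipeline, so drop them.
        for existing in list(parser.sections()):
            if existing.startswith(section + "."):
                parser.remove_section(existing)
        if parser.has_section(section):
            parser.remove_section(section)
        parser.add_section(section)
        # Config values are JSON-formatted, so the path is written with quotes.
        parser.set(section, "source", json.dumps(model_path))

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Calling `source_components(text, "./saved_model/model-last", ["ner", "tok2vec"])` returns a config whose `[components.ner]` and `[components.tok2vec]` blocks contain only the `source` line, as in the example above. Note that `configparser` discards comment lines, so the commented-out `factory` settings are not preserved in the output.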

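If you run into the error below, it may be because the original model was trained with a different spaCy version than the one currently installed:

operands could not be broadcast together with shapes (11,94) (86,) (11,94)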
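
One way to check the version suspicion is to compare the `spacy_version` entry that spaCy writes to the saved pipeline's `meta.json` with the version currently installed. This is a rough sketch with hypothetical helper names; it only reports the two values and does not prove that a mismatch is the cause of the error.

```python
import json
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def saved_spacy_version(model_dir):
    """Return the "spacy_version" entry from the pipeline's meta.json, or None."""
    meta_path = Path(model_dir) / "meta.json"
    meta = json.loads(meta_path.read_text(encoding="utf8"))
    return meta.get("spacy_version")


def installed_spacy_version():
    """Return the installed spaCy version string, or None if spaCy is absent."""
    try:
        return version("spacy")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    model_dir = "./saved_model/model-last"  # path used in the config example
    if Path(model_dir, "meta.json").exists():
        print("pipeline expects spaCy:", saved_spacy_version(model_dir))
    print("installed spaCy:", installed_spacy_version())
```

If the installed version falls outside the range recorded in `meta.json`, retraining from scratch or installing a matching spaCy version is the safer route.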

Permalink: http://57km.cc/post/spacy: resume training from existing model.html
