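BiLSTM+CRF Named Entity Recognition, PyTorch Version: A Collected Summary
I have recently been studying NER (named entity recognition, also called entity extraction) systematically, and found that the short posts on the topic are fairly scattered. So I am publishing my study notes for reference. They draw on the work of many other bloggers, whose original links are included in this post. I hope this article helps other learners like me sort out and connect the individual pieces of NER, and finally get hands-on with the mainstream NER model (BiLSTM+CRF).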
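Structure of this post
1. NER resources
2. The mainstream BiLSTM-CRF model: implementation explained in detail (PyTorch)
3. Extending the implementation (building on part 2)
1. NER resources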
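Reference: covers a detailed walkthrough of CRF++, a detailed walkthrough of the CRF layer in Bi-LSTM+CRF, why a CRF is added after the Bi-LSTM, Bert+Bi-LSTM+CRF, and the difference between the optimization objectives of CRF and Bi-LSTM+CRF.
CRF++ carries out both learning and decoding: training is the learning process, and prediction is the decoding process.
Reference: (this material helps with understanding the code later on)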
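A sequence labeling problem assigns a label to every element of a sequence, that is, multi-class classification over a fixed label set. Here the element is a character, whereas in the previous blog post it was a word; the underlying principle is the same and only the wording differs slightly.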
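To make character-level labeling concrete, here is a minimal sketch in plain Python. The example sentence, the entity spans, and the helper name `bio_tags` are assumptions made for illustration and do not come from the referenced posts; only the BIO label style (B-Person, I-Person, B-Organization, ...) follows the text.

```python
def bio_tags(chars, spans):
    """Assign one BIO label per character.

    chars: list of single characters.
    spans: list of (start, end_exclusive, entity_type) tuples (hypothetical format).
    """
    tags = ["O"] * len(chars)            # default: not part of any entity
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"       # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"       # remaining characters of the entity
    return tags


# Hypothetical sentence: "小明" is a person, "华为" is an organization.
chars = list("小明在华为")
spans = [(0, 2, "Person"), (3, 5, "Organization")]
tags = bio_tags(chars, spans)

for ch, tag in zip(chars, tags):
    print(ch, tag)
```

Each character receives exactly one label, which is what makes this a per-element multi-class classification problem.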
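Reference:
The previous reference mentions that "START" is added at the beginning of every sentence and "END" at the end, which may be puzzling.
This reference gives the answer:
The two extra labels, START and END, are added to make the transition score matrix more robust. START marks the beginning of a sentence; note that it is not the first word of the sentence, since the first word comes after START. Likewise, END marks the end of the sentence.
The table below is an example of a transition score matrix that includes the START and END labels.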
[Figure: example transition score matrix including the START and END labels]
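The value in each cell is the probability of moving from the cell's row label to its column label. For example, the red box (B-Person, I-Person) in the figure above holds 0.9, meaning that B-Person transitions to I-Person with probability 0.9 (the previous element, a character or word, is labeled B-Person and the next element is labeled I-Person). This is consistent with the BIO scheme, where B marks the beginning of an entity, I marks the inside or end of an entity, and O marks a non-entity. By the same reasoning, the blue box means that B-Organization transitions to I-Organization with probability 0.8.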
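Reading the matrix as "row label to column label" can be sketched in a few lines of plain Python. Only the two values 0.9 and 0.8 come from the text above; the label order, the placeholder value 0.1 used for every other cell, and the helper name `transition_scores` are assumptions for illustration, not the values in the original figure.

```python
labels = ["START", "B-Person", "I-Person",
          "B-Organization", "I-Organization", "O", "END"]
idx = {label: i for i, label in enumerate(labels)}

# Placeholder for every cell not quoted in the text (an assumption).
DEFAULT = 0.1
trans = [[DEFAULT] * len(labels) for _ in labels]

# The two cells quoted in the text: trans[row][col] = value for row -> col.
trans[idx["B-Person"]][idx["I-Person"]] = 0.9
trans[idx["B-Organization"]][idx["I-Organization"]] = 0.8


def transition_scores(tag_seq):
    """Look up the matrix entry for each adjacent label pair,
    after wrapping the sequence in START and END."""
    path = ["START"] + list(tag_seq) + ["END"]
    return [trans[idx[a]][idx[b]] for a, b in zip(path, path[1:])]


scores = transition_scores(["B-Person", "I-Person", "O"])
print(scores)
```

A sequence of three labels yields four lookups, because START and END each contribute one extra transition; START is not the first label of the sentence and END is not the last one.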
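Reference: (after the earlier references, this one is refreshingly thorough and much easier to follow)
Many of the concepts were already covered above, so I will not repeat them here; this is just to reinforce them, and I also recommend this blogger's series of posts on CRF.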
[Figure: formula involving $P_{i,y_i}$ and $A_{y_i,y_{i+1}}$]
where $P_{i,y_i}$ is the probability that the softmax output at position $i$ (the $i$-th element of the sequence, a character or word) is the label $y_i$, and $A_{y_i,y_{i+1}}$ describes the transition from (the previous element's label) $y_i$ to (the current element's label) <span class="katex--inline"><span class="katex"><span class="katex-mathml">y_{i+1}
</span><span class="katex-html"><span class="ba"><span class="strut" ></span><span class= "mord"><span class="mord mathdefault" >y</span><span class="msupsub"><span class="vlist-t vlist-t2"><span class="vlis