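Resource overview
The data uses the BIO tag set: B-PER and I-PER mark the first and non-first characters of a person name, B-LOC and I-LOC mark the first and non-first characters of a location name, B-ORG and I-ORG mark the first and non-first characters of an organization name, and O marks a character that is not part of any named entity.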
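To make the tag scheme concrete, here is a minimal sketch of how character-level BIO tags can be grouped back into entities. The function name `bio_to_spans` and the sample sentence are hypothetical and are not taken from the archive; treating a stray I- tag as the start of a new entity is also an assumption.

```python
def bio_to_spans(chars, tags):
    """Collect (entity_type, text) pairs from character-level BIO tags."""
    spans = []
    current_type, current_chars = None, []
    for ch, tag in zip(chars, tags):
        # "B-PER" -> ("B", "-", "PER"); "O" -> ("O", "", "")
        prefix, _, etype = tag.partition("-")
        if prefix == "B" or (prefix == "I" and etype != current_type):
            # Start a new entity. Assumption: an I- tag that does not
            # continue the current entity is treated like a B- tag.
            if current_type is not None:
                spans.append((current_type, "".join(current_chars)))
            current_type, current_chars = etype, [ch]
        elif prefix == "I":
            # Continue the entity that is currently open.
            current_chars.append(ch)
        else:
            # "O": close any open entity.
            if current_type is not None:
                spans.append((current_type, "".join(current_chars)))
            current_type, current_chars = None, []
    if current_type is not None:
        spans.append((current_type, "".join(current_chars)))
    return spans


# Hypothetical sample sentence, one tag per character.
chars = list("王小明在北京大学读书")
tags = ["B-PER", "I-PER", "I-PER", "O",
        "B-ORG", "I-ORG", "I-ORG", "I-ORG", "O", "O"]
print(bio_to_spans(chars, tags))
```

Each character carries exactly one tag, so the two lists must have the same length.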

Code snippet and file information
# Python version of the evaluation script from CoNLL'00
# Originates from: https://github.com/spyysalo/conlleval.py
# Intentional differences:
# - accept any space as delimiter by default
# - optional file argument (default STDIN)
# - option to set boundary (-b argument)
# - LaTeX output (-l argument) not supported
# - raw tags (-r argument) not supported
import sys
import re
import codecs
from collections import defaultdict, namedtuple

# sentinel meaning "split on any whitespace"
ANY_SPACE = '<SPACE>'

class FormatError(Exception):
    pass

Metrics = namedtuple('Metrics', 'tp fp fn prec rec fscore')
class EvalCounts(object):
    def __init__(self):
        self.correct_chunk = 0    # number of correctly identified chunks
        self.correct_tags = 0     # number of correct chunk tags
        self.found_correct = 0    # number of chunks in corpus
        self.found_guessed = 0    # number of identified chunks
        self.token_counter = 0    # token counter (ignores sentence breaks)
        # counts by type
        self.t_correct_chunk = defaultdict(int)
        self.t_found_correct = defaultdict(int)
        self.t_found_guessed = defaultdict(int)
def parse_args(argv):
    import argparse
    parser = argparse.ArgumentParser(
        description='evaluate tagging results using CoNLL criteria',
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )
    arg = parser.add_argument
    arg('-b', '--boundary', metavar='STR', default='-X-',
        help='sentence boundary')
    arg('-d', '--delimiter', metavar='CHAR', default=ANY_SPACE,
        help='character delimiting items in input')
    arg('-o', '--otag', metavar='CHAR', default='O',
        help='alternative outside tag')
    arg('file', nargs='?', default=None)
    return parser.parse_args(argv)
def parse_tag(t):
    # split e.g. 'B-PER' into ('B', 'PER'); a tag without '-' yields (t, '')
    m = re.match(r'^([^-]*)-(.*)$', t)
    return m.groups() if m else (t, '')
def evaluate(iterable, options=None):
    if options is None:
        options = parse_args([])    # use defaults
    counts = EvalCounts()
    num_features = None       # number of features per line
    in_correct = False        # currently processed chunk is correct until now
    last_correct = 'O'        # previous chunk tag in corpus
    last_correct_type = ''    # type of previous chunk tag in corpus
    last_guessed = 'O'        # previously identified chunk tag
    last_guessed_type = ''    # type of previously identified chunk tag
    for line in iterable:
        line = line.rstrip('\r\n')
        if options.delimiter == ANY_SPACE:
            features = line.split()
        else:
            features = line.split(options.delimiter)
        if num_features is None:
            num_features = len(features)
        elif num_features != len(features) and len(features) != 0:
            raise FormatError('unexpected number of features: %d (%d)' %
                              (len(features), num_features))
        if len(features) == 0 or features[0] == options.boundary:
            features = [options.boundary
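The snippet is cut off before the part of the script that turns the counters into scores. As a hedged illustration only, the sketch below shows the standard precision, recall, and F1 definitions applied to three counts that correspond to `correct_chunk`, `found_guessed`, and `found_correct` in `EvalCounts`. The function name `compute_metrics` and the sample numbers are assumptions and are not taken from the truncated source.

```python
from collections import namedtuple

Metrics = namedtuple('Metrics', 'tp fp fn prec rec fscore')


def compute_metrics(correct, guessed, total):
    """Standard chunk-level scores (a sketch, not the archive's code).

    correct: chunks predicted with the right span and type
    guessed: all chunks the tagger predicted
    total:   all chunks in the gold-standard corpus
    """
    tp = correct
    fp = guessed - correct   # predicted chunks that are wrong
    fn = total - correct     # gold chunks that were missed
    prec = tp / guessed if guessed else 0.0
    rec = tp / total if total else 0.0
    fscore = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return Metrics(tp, fp, fn, prec, rec, fscore)


# Hypothetical counts: 8 correct out of 10 predicted, 16 gold chunks.
m = compute_metrics(correct=8, guessed=10, total=16)
print(m)
```

The zero checks keep the function defined when the tagger predicts no chunks or the corpus contains none.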
Attribute        Size  Date        Time   Name
---------  ----------  ----------  -----  ----
File            12728  2017-07-05  00:18  ChineseNER-master(来源联合数据)\conlleval
File            10110  2017-07-05  00:18  ChineseNER-master(来源联合数据)\conlleval.py
File            10110  2017-07-05  00:18  ChineseNER-master(来源联合数据)\data\conlleval.py
File          1383712  2017-07-05  00:18  ChineseNER-master(来源联合数据)\data\example.dev
File          1405788  2017-07-05  00:18  ChineseNER-master(来源联合数据)\data\example.test
File          5596172  2017-07-05  00:18  ChineseNER-master(来源联合数据)\data\example.train
File             8104  2017-07-05  00:18  ChineseNER-master(来源联合数据)\data_utils.py
File             5782  2017-07-05  00:18  ChineseNER-master(来源联合数据)\loader.py
File             8918  2017-07-05  00:18  ChineseNER-master(来源联合数据)\main.py
File            11605  2017-07-05  00:18  ChineseNER-master(来源联合数据)\model.py
File             1273  2017-07-05  00:18  ChineseNER-master(来源联合数据)\README.md
File             9470  2017-07-05  00:18  ChineseNER-master(来源联合数据)\rnncell.py
File             6038  2017-07-05  00:18  ChineseNER-master(来源联合数据)\utils.py
File         15335492  2017-07-05  00:18  ChineseNER-master(来源联合数据)\wiki_100.utf8
Directory           0  2018-08-06  17:18  ChineseNER-master(来源联合数据)\data
Directory           0  2018-08-06  17:19  ChineseNER-master(来源联合数据)
---------  ----------  ----------  -----  ----
Total        23805302                     16 entries