• Size: 9.85MB
    File type: .zip
    Release date: 2023-10-08
  • Language: Other
  • Tags: NLP  CRF

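Resource Description

A sequence labeling tool for Chinese NLP. It uses a CRF for named entity recognition (NER) and automatically annotates datasets to produce a corpus. Either the BIO or the BMES tagging scheme can be selected.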
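
To make the difference between the two tagging schemes concrete, here is a minimal sketch (not taken from the packaged tool) of how a single entity span could be tagged character by character. The function name `tag_entity` and the `LOC` label are hypothetical and chosen only for illustration.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only: tag_entity is a hypothetical helper, not part of the archive.

def tag_entity(chars, label, scheme="BMES"):
    """Return one tag per character for a single entity span."""
    n = len(chars)
    if n == 0:
        return []
    if scheme == "BIO":
        # B marks the first character, I marks every following character.
        return ["B-" + label] + ["I-" + label] * (n - 1)
    if scheme == "BMES":
        # S marks a single-character entity; otherwise Begin, Middle..., End.
        if n == 1:
            return ["S-" + label]
        return ["B-" + label] + ["M-" + label] * (n - 2) + ["E-" + label]
    raise ValueError("unknown scheme: " + scheme)


if __name__ == "__main__":
    span = [u"新", u"加", u"坡"]
    print(tag_entity(span, "LOC", "BIO"))
    print(tag_entity(span, "LOC", "BMES"))
```

BMES carries explicit end and single-character markers, which BIO leaves implicit. Characters outside any entity would usually receive an `O` tag in either scheme.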

Code Snippet and File Information

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Author: Jie Yang from SUTD
# @Date:   2016-Jan-06 17:11:59
# @Last Modified by:   Jie     @Contact: jieynlp@gmail.com
# @Last Modified time: 2017-07-05 22:59:46

# Python 2 GUI modules (renamed tkinter, tkinter.ttk, tkinter.filedialog, tkinter.font in Python 3)
from Tkinter import *
from ttk import *  # Frame, Button, Label, Style, Scrollbar
import tkFileDialog
import tkFont
import re
from collections import deque
import pickle
import os.path
import platform


class Example(Frame):
    def __init__(self, parent):
        Frame.__init__(self, parent)
        self.OS = platform.system().lower()
        self.parent = parent
        self.fileName = ""
        self.debug = False
        self.colorAllChunk = True
        self.history = deque(maxlen=20)
        self.currentContent = deque(maxlen=1)
        # shortcut key -> entity label inserted when the key is pressed
        self.pressCommand = {'a': u"参与者",
                             'b': u"动作",
                             'c': u"对象",
                             'd': u"状态",
                             'e': u"时间",
                             'f': u"地点",
                             'g': u"金额",
                             'h': u"内容",
                             'i': u"Transaction-方式",
                             'j': u"Peron-原单位",
                             'k': u"Per-新单位",
                             'l': u"Per-原职务",
                             'm': u"Per-新职务",
                             'n': u"Quantity-指标",
                             'o': u"Q-对比值",
                             'p': u"Q-当前值",
                             'r': u"Q-变化趋势幅度",
                             's': u"Q-对比时间",
                             't': u"Policy-影响行业",
                             'u': u"Pol-鼓励限制",
                             'v': u"Project-主导方",
                             'w': u"Pro-投资方",
                             'x': u"Pro-承建方",
                             'y': u"Pro-开工时间",
                             'z': u"Pro-完成时间",
                             }
        self.allKey = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
        self.numberKey = "0123456789"
        # 'q' is reserved for removing a tag, so it is absent from pressCommand
        self.controlCommand = {'q': "unTag", 'ctrl+z': 'undo'}
        self.labelEntryList = []
        self.shortcutLabelList = []
        # default GUI display parameter
        if len(self.pressCommand) > 20:
            self.textRow = len(self.pressCommand)
        else:
            self.textRow = 20
        self.textColumn = 5
        self.tagScheme = "BMES"
        self.onlyNP = False  # for exporting sequence
        self.seged = True
        self.configFile = "config"
        # patterns for annotated entities and for entities nested inside another entity
        self.entityRe = r'\[\@.*?\#.*?\*\](?!\#)'
        self.insideNestEntityRe = r'\[\@\[\@(?!\[\@).*?\#.*?\*\]\#'
        # configure color
        self.entityColor = "SkyBlue1"
        self.insideNestEntityColor = "light slate blue"
        self.selectColor = 'light salmon'
        self.maxEventId = 0
        self.currentEventId = ""
        self.textFontstyle = "Times"
        self.EventIdString = StringVar()
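
The `entityRe` pattern in the snippet above suggests that annotated entities are stored inline as `[@text#label*]`. This format is inferred from the regular expression alone, since the rest of the class is not shown. Under that assumption, the sketch below shows how such a pattern could be used to pull (text, label) pairs out of an annotated line. The helper name `extract_entities` and the sample sentence are hypothetical.

```python
# -*- coding: utf-8 -*-
# Illustrative sketch only: extract_entities is a hypothetical helper.
import re

# Same pattern as self.entityRe in the snippet above.
ENTITY_RE = r'\[\@.*?\#.*?\*\](?!\#)'


def extract_entities(text):
    """Return (surface text, label) pairs for every flat entity in text."""
    pairs = []
    for match in re.finditer(ENTITY_RE, text):
        body = match.group()[2:-2]            # strip the "[@" prefix and "*]" suffix
        surface, label = body.rsplit("#", 1)  # the label follows the last "#"
        pairs.append((surface, label))
    return pairs


if __name__ == "__main__":
    line = u"[@张三#参与者*]在[@北京#地点*]签约"
    for surface, label in extract_entities(line):
        print(surface + u" -> " + label)
```

The sketch handles only flat, non-nested annotations. The separate `insideNestEntityRe` pattern indicates that the tool treats nested entities differently, which is not covered here.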

 Attr            Size     Date    Time   Name
----------- ---------  ---------- -----  ----
     Dir            0  2018-06-13 10:59  YEDDA\
     File        7756  2018-06-05 15:24  YEDDA\.config.un~
     Dir            0  2018-06-13 10:59  YEDDA\.git\
     File         295  2018-06-05 14:19  YEDDA\.git\config
     File          73  2018-06-05 14:18  YEDDA\.git\description
     File          23  2018-06-05 14:19  YEDDA\.git\HEAD
     Dir            0  2018-06-05 14:18  YEDDA\.git\hooks\
     File         478  2018-06-05 14:18  YEDDA\.git\hooks\applypatch-msg.sample
     File         896  2018-06-05 14:18  YEDDA\.git\hooks\commit-msg.sample
     File        3327  2018-06-05 14:18  YEDDA\.git\hooks\fsmonitor-watchman.sample
     File         189  2018-06-05 14:18  YEDDA\.git\hooks\post-update.sample
     File         424  2018-06-05 14:18  YEDDA\.git\hooks\pre-applypatch.sample
     File        1642  2018-06-05 14:18  YEDDA\.git\hooks\pre-commit.sample
     File        1348  2018-06-05 14:18  YEDDA\.git\hooks\pre-push.sample
     File        4898  2018-06-05 14:18  YEDDA\.git\hooks\pre-rebase.sample
     File         544  2018-06-05 14:18  YEDDA\.git\hooks\pre-receive.sample
     File        1492  2018-06-05 14:18  YEDDA\.git\hooks\prepare-commit-msg.sample
     File        3610  2018-06-05 14:18  YEDDA\.git\hooks\update.sample
     File        4251  2018-06-05 14:35  YEDDA\.git\index
     Dir            0  2018-06-05 14:18  YEDDA\.git\info\
     File         240  2018-06-05 14:18  YEDDA\.git\info\exclude
     Dir            0  2018-06-05 14:19  YEDDA\.git\logs\
     File         180  2018-06-05 14:19  YEDDA\.git\logs\HEAD
     Dir            0  2018-06-05 14:19  YEDDA\.git\logs\refs\
     Dir            0  2018-06-05 14:19  YEDDA\.git\logs\refs\heads\
     File         180  2018-06-05 14:19  YEDDA\.git\logs\refs\heads\master
     Dir            0  2018-06-05 14:19  YEDDA\.git\logs\refs\remotes\
     Dir            0  2018-06-05 14:19  YEDDA\.git\logs\refs\remotes\origin\
     File         180  2018-06-05 14:19  YEDDA\.git\logs\refs\remotes\origin\HEAD
     Dir            0  2018-06-05 14:18  YEDDA\.git\objects\
     Dir            0  2018-06-05 14:18  YEDDA\.git\objects\info\
............ 67 more file entries omitted
