• Size: 29.08MB
    File type: .zip
    Coins: 1
    Downloads: 0
    Release date: 2023-07-28
  • Language: Python

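Resource description

A Chinese typo correction tool. It corrects phonetically similar and visually similar wrong characters (or variant characters), and can be used to fix errors produced by Chinese pinyin and stroke input methods. Developed in Python 3. pycorrector uses a language model to locate erroneous characters, then corrects them using pinyin phonetic-similarity features, stroke and Wubi edit-distance features, and language-model perplexity features.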
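
The idea described above (generate sound-alike candidates, then keep the one the language model finds least surprising) can be sketched in a few lines. This is a toy illustration, not pycorrector's code: the `PINYIN` table, the `CORPUS`, and the add-one-smoothed character bigram model are all hypothetical stand-ins chosen for the example, and the stroke and Wubi features are left out.

```python
import math
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical toy pinyin table; a real tool would load a full dictionary.
PINYIN = {"因": "yin", "音": "yin", "阴": "yin", "英": "ying",
          "该": "gai", "应": "ying"}

# Hypothetical toy corpus standing in for a trained language model.
CORPUS = ["我们应该去", "你应该来", "应该的", "因为下雨"]

bigrams, contexts, vocab = Counter(), Counter(), set()
for sent in CORPUS:
    tokens = ["^"] + list(sent) + ["$"]  # sentence start / end markers
    vocab.update(tokens)
    for prev_tok, tok in zip(tokens, tokens[1:]):
        bigrams[(prev_tok, tok)] += 1
        contexts[prev_tok] += 1


def perplexity(sentence):
    """Perplexity under the add-one-smoothed character bigram model."""
    tokens = ["^"] + list(sentence) + ["$"]
    log_prob = 0.0
    for prev_tok, tok in zip(tokens, tokens[1:]):
        p = (bigrams[(prev_tok, tok)] + 1) / (contexts[prev_tok] + len(vocab))
        log_prob += math.log(p)
    return math.exp(-log_prob / (len(tokens) - 1))


def similar_sounding(char, max_dist=1):
    """Characters whose pinyin is within max_dist edits of char's pinyin."""
    target = PINYIN[char]
    return [c for c, p in PINYIN.items()
            if c != char and edit_distance(target, p) <= max_dist]


def correct(sentence):
    """Greedily swap in sound-alike characters that lower perplexity."""
    best = sentence
    for i, ch in enumerate(sentence):
        if ch not in PINYIN:
            continue
        for cand in similar_sounding(ch):
            trial = best[:i] + cand + best[i + 1:]
            if perplexity(trial) < perplexity(best):
                best = trial
    return best


print(correct("因该去"))
```

With this toy data, `因` and `应` differ by one pinyin edit (`yin` vs `ying`), and the bigram `应该` is frequent in the corpus, so the replacement lowers perplexity. A production system would use a much larger model and confusion sets.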

Code snippet and file information

# -*- coding: utf-8 -*-
# Author: XuMing
# Brief: setup script for the pycorrector package
from __future__ import print_function

import sys

from setuptools import setup, find_packages

from pycorrector import __version__

# Abort early on Python 2; (3,) is a one-element tuple, so the comparison is valid.
if sys.version_info < (3,):
    sys.exit('Sorry, Python3 is required for pycorrector.')

with open('README.md', 'r', encoding='utf-8') as f:
    readme = f.read()

with open('LICENSE', 'r', encoding='utf-8') as f:
    license = f.read()

# One requirement specifier per line.
with open('requirements.txt', 'r', encoding='utf-8') as f:
    reqs = f.read()

setup(
    name='pycorrector',
    version=__version__,
    description='Chinese Text Error Corrector',
    long_description=readme,
    long_description_content_type='text/markdown',
    author='XuMing',
    author_email='xuming624@qq.com',
    url='https://github.com/shibing624/pycorrector',
    license="Apache 2.0",
    classifiers=[
        'Intended Audience :: Developers',
        'Operating System :: OS Independent',
        'Natural Language :: Chinese (Simplified)',
        'Natural Language :: Chinese (Traditional)',
        'Programming Language :: Python',
        'Programming Language :: Python :: 3',
        'Programming Language :: Python :: 3.5',
        'Programming Language :: Python :: 3.6',
        'Topic :: Text Processing',
        'Topic :: Text Processing :: Indexing',
        'Topic :: Text Processing :: Linguistic',
    ],
    keywords='NLP,correction,Chinese error corrector,corrector',
    install_requires=reqs.strip().split('\n'),
    packages=find_packages(exclude=['tests']),
    package_dir={'pycorrector': 'pycorrector'},
    # Ship data files and model directories alongside the Python modules.
    package_data={
        'pycorrector': ['*.*', 'LICENSE', '../LICENSE', 'README.*', '../*.txt', 'data/*',
                        'data/kenlm/people_chars_lm.klm', 'utils/*',
                        'bert/*', 'deep_context/*', 'rnn_attention/*', 'rnn_crf/*', 'rnn_lm/*', 'conv_seq2seq/*',
                        'seq2seq_attention/*', 'transformer/*'],
    },
    test_suite='tests',
)
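
The script above builds `install_requires` by splitting the raw text of requirements.txt on newlines, which would pass blank lines or comment lines straight through if the file contained any. A slightly more defensive variant might look like the sketch below. The helper name `parse_requirements` and the sample text are hypothetical, not taken from the project; the comment rule (a `#` at line start or after whitespace) follows pip's requirements-file convention.

```python
import re

# A '#' starts a comment only at line start or after whitespace,
# so URL fragments such as '#egg=name' are left alone.
COMMENT_RE = re.compile(r"(^|\s+)#.*$")


def parse_requirements(text):
    """Return requirement specifiers, skipping blank lines and comments."""
    reqs = []
    for line in text.splitlines():
        line = COMMENT_RE.sub("", line).strip()
        if line:
            reqs.append(line)
    return reqs


# Hypothetical sample contents, for illustration only.
sample = "numpy\n\n# language model bindings\nkenlm  # n-gram LM\r\njieba>=0.39\n"
print(parse_requirements(sample))
```

The result could then be passed as `install_requires=parse_requirements(reqs)`. Option lines such as `-r other.txt` are not handled in this sketch.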


Attribute        Size  Date       Time   Name
----------- ---------  ---------- -----  ----
        Dir         0  2019-08-10 07:12  pycorrector-master\
        Dir         0  2019-08-10 07:12  pycorrector-master\.github\
       File       766  2019-08-10 07:12  pycorrector-master\.github\stale.yml
       File      1157  2019-08-10 07:12  pycorrector-master\.gitignore
       File       263  2019-08-10 07:12  pycorrector-master\.travis.yml
       File     11357  2019-08-10 07:12  pycorrector-master\LICENSE
       File     13560  2019-08-10 07:12  pycorrector-master\README.md
       File        25  2019-08-10 07:12  pycorrector-master\_config.yml
        Dir         0  2019-08-10 07:12  pycorrector-master\docs\
       File      2271  2019-08-10 07:12  pycorrector-master\docs\logo.svg
       File   1854862  2019-08-10 07:12  pycorrector-master\docs\基于深度学习的中文文本自动校对研究与实现.pdf
        Dir         0  2019-08-10 07:12  pycorrector-master\examples\
       File       284  2019-08-10 07:12  pycorrector-master\examples\correct_demo.py
       File       553  2019-08-10 07:12  pycorrector-master\examples\detect_demo.py
       File      1531  2019-08-10 07:12  pycorrector-master\examples\enable_char_error_detect.py
       File       693  2019-08-10 07:12  pycorrector-master\examples\load_custom_language_model.py
       File       240  2019-08-10 07:12  pycorrector-master\examples\my_confusion.txt
       File       122  2019-08-10 07:12  pycorrector-master\examples\my_custom_word.txt
       File      3821  2019-08-10 07:12  pycorrector-master\examples\use_custom_confusion.py
        Dir         0  2019-08-10 07:12  pycorrector-master\pycorrector\
       File       909  2019-08-10 07:12  pycorrector-master\pycorrector\__init__.py
        Dir         0  2019-08-10 07:12  pycorrector-master\pycorrector\bert\
       File      3333  2019-08-10 07:12  pycorrector-master\pycorrector\bert\README.md
       File      5594  2019-08-10 07:12  pycorrector-master\pycorrector\bert\bert_corrector.py
       File      4562  2019-08-10 07:12  pycorrector-master\pycorrector\bert\bert_detector.py
       File       414  2019-08-10 07:12  pycorrector-master\pycorrector\bert\config.py
       File     17375  2019-08-10 07:12  pycorrector-master\pycorrector\bert\predict_mask.py
        Dir         0  2019-08-10 07:12  pycorrector-master\pycorrector\bert\tf\
       File        83  2019-08-10 07:12  pycorrector-master\pycorrector\bert\tf\__init__.py
       File     40725  2019-08-10 07:12  pycorrector-master\pycorrector\bert\tf\modeling.py
       File     20251  2019-08-10 07:12  pycorrector-master\pycorrector\bert\tf\tf_predict_perplexity.py
............ 163 more file entries omitted here
