• Size: 123KB
    File type: .zip
    Coins: 1
    Downloads: 0
    Release date: 2021-06-15
  • Language: Other
  • Tags: BiLSTM-CRF  Deep  Learnin

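Resource description

Uses the deep learning model BiLSTM, combined with the CRF model's ability to capture dependencies between labels, to solve named entity recognition as a sequence labeling problem.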
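
To illustrate why label dependencies matter, the sketch below runs Viterbi decoding over per-token tag scores, the kind of scores a BiLSTM would produce. It is not taken from the archive: the function name `viterbi_decode`, the three-tag set (O, B, I) and all score values are assumptions chosen for the example.

```python
import numpy as np


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag path and its score.

    emissions:   (seq_len, num_tags) per-token tag scores.
    transitions: (num_tags, num_tags); transitions[i, j] scores tag i -> tag j.
    """
    seq_len, num_tags = emissions.shape
    score = emissions[0].copy()
    backpointers = np.zeros((seq_len, num_tags), dtype=int)
    for t in range(1, seq_len):
        # candidate[i, j]: best score ending in tag i at t-1, then moving to tag j
        candidate = score[:, None] + transitions
        backpointers[t] = candidate.argmax(axis=0)
        score = candidate.max(axis=0) + emissions[t]
    # Follow the backpointers from the best final tag
    path = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        path.append(int(backpointers[t, path[-1]]))
    path.reverse()
    return path, float(score.max())


# Hypothetical tags: 0 = O, 1 = B, 2 = I
emissions = np.array([[1.0, 0.8, 0.0],
                      [0.0, 0.0, 1.0]])
transitions = np.zeros((3, 3))
transitions[0, 2] = -10.0  # strongly penalize O -> I

greedy = emissions.argmax(axis=1).tolist()  # [0, 2], i.e. the invalid O -> I
path, best_score = viterbi_decode(emissions, transitions)  # path == [1, 2]
```

Picking the best tag per token independently yields O followed by I, while the decoder that accounts for transition scores prefers B followed by I.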

Code snippet and file information

"""Build an np.array from some glove file and some vocab file

You need to download `glove.840B.300d.txt` from
https://nlp.stanford.edu/projects/glove/ and you need to have built
your vocabulary first (Maybe using `build_vocab.py`)
"""

__author__ = "Guillaume Genthial"

from pathlib import Path

import numpy as np


if __name__ == '__main__':
    # Load vocab: map each word to its line index in the vocab file
    with Path('vocab.words.txt').open() as f:
        word_to_idx = {line.strip(): idx for idx, line in enumerate(f)}
    size_vocab = len(word_to_idx)

    # Array of zeros (words without a GloVe vector keep a zero row)
    embeddings = np.zeros((size_vocab, 300))

    # Get relevant glove vectors
    found = 0
    print('Reading GloVe file (may take a while)')
    with Path('glove.840B.300d.txt').open() as f:
        for line_idx, line in enumerate(f):
            if line_idx % 100000 == 0:
                print('- At line {}'.format(line_idx))
            line = line.strip().split()
            # Skip malformed lines (e.g. tokens that contain spaces)
            if len(line) != 300 + 1:
                continue
            word = line[0]
            embedding = line[1:]
            if word in word_to_idx:
                found += 1
                word_idx = word_to_idx[word]
                embeddings[word_idx] = embedding
    print('- done. Found {} vectors for {} words'.format(found, size_vocab))

    # Save np.array to file
    np.savez_compressed('glove.npz', embeddings=embeddings)
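
As a possible follow-up, the sketch below shows one way the saved array could be read back and queried by word. It assumes the `embeddings` key and the one-word-per-line vocab file used by the script above; the helper names `load_embeddings` and `lookup` are hypothetical and do not come from the archive.

```python
import numpy as np


def load_embeddings(npz_path, vocab_path):
    """Load the word -> row index mapping and the embedding matrix."""
    with open(vocab_path, encoding='utf-8') as f:
        word_to_idx = {line.strip(): idx for idx, line in enumerate(f)}
    # The script above stores the matrix under the key 'embeddings'
    embeddings = np.load(npz_path)['embeddings']
    return word_to_idx, embeddings


def lookup(word, word_to_idx, embeddings):
    """Return the vector for `word`, or None if it is not in the vocab."""
    idx = word_to_idx.get(word)
    return None if idx is None else embeddings[idx]
```

A call such as `load_embeddings('glove.npz', 'vocab.words.txt')` would return both objects. Words that are in the vocab but were not found in the GloVe file map to an all-zero row, since the array is initialized with zeros.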

Attribute        Size  Date       Time   Name
----------- ---------  ---------- -----  ----
     Dir            0  2018-10-10 03:03  tf_ner-master\
     File       10763  2018-10-10 03:03  tf_ner-master\LICENSE
     File        7383  2018-10-10 03:03  tf_ner-master\README.md
     Dir            0  2018-10-10 03:03  tf_ner-master\data\
     Dir            0  2018-10-10 03:03  tf_ner-master\data\example\
     File         206  2018-10-10 03:03  tf_ner-master\data\example\Makefile
     File        1354  2018-10-10 03:03  tf_ner-master\data\example\build_glove.py
     File        1960  2018-10-10 03:03  tf_ner-master\data\example\build_vocab.py
     File       35951  2018-10-10 03:03  tf_ner-master\data\example\glove.npz
     File         156  2018-10-10 03:03  tf_ner-master\data\example\testa.tags.txt
     File         212  2018-10-10 03:03  tf_ner-master\data\example\testa.words.txt
     File         156  2018-10-10 03:03  tf_ner-master\data\example\testb.tags.txt
     File         212  2018-10-10 03:03  tf_ner-master\data\example\testb.words.txt
     File         156  2018-10-10 03:03  tf_ner-master\data\example\train.tags.txt
     File         212  2018-10-10 03:03  tf_ner-master\data\example\train.words.txt
     File          64  2018-10-10 03:03  tf_ner-master\data\example\vocab.chars.txt
     File          20  2018-10-10 03:03  tf_ner-master\data\example\vocab.tags.txt
     File         140  2018-10-10 03:03  tf_ner-master\data\example\vocab.words.txt
     Dir            0  2018-10-10 03:03  tf_ner-master\images\
     File       39578  2018-10-10 03:03  tf_ner-master\images\data.png
     File       19599  2018-10-10 03:03  tf_ner-master\images\ner.png
     Dir            0  2018-10-10 03:03  tf_ner-master\models\
     Dir            0  2018-10-10 03:03  tf_ner-master\models\chars_conv_lstm_crf\
     File        8866  2018-10-10 03:03  tf_ner-master\models\chars_conv_lstm_crf\main.py
     File        1557  2018-10-10 03:03  tf_ner-master\models\chars_conv_lstm_crf\masked_conv.py
     File         315  2018-10-10 03:03  tf_ner-master\models\chars_conv_lstm_crf\metrics.py
     Dir            0  2018-10-10 03:03  tf_ner-master\models\chars_conv_lstm_crf_ema\
     File       10997  2018-10-10 03:03  tf_ner-master\models\chars_conv_lstm_crf_ema\main.py
     File        1557  2018-10-10 03:03  tf_ner-master\models\chars_conv_lstm_crf_ema\masked_conv.py
     File         577  2018-10-10 03:03  tf_ner-master\models\chars_conv_lstm_crf_ema\metrics.py
     Dir            0  2018-10-10 03:03  tf_ner-master\models\chars_lstm_lstm_crf\
............ 13 more file entries omitted
