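"Machine Learning in Action": code that runs cleanly on Python 3

Size:

File type: .zip

Coins: 1

Downloads: 0

Published: 2023-06-15
Language: Python
Tags: python3, artificial intelligence, machine learning, second edition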

Code snippet and file information

# -*- coding: utf-8 -*-
"""
Created on Thu Jul 26 09:22:46 2018

@author: wzy
"""
import numpy as np
import matplotlib.pyplot as plt

"""
Function description: create the dataset for the decision stump
(single-level decision tree)

Parameters:
    None

Returns:
    datMat - data matrix
    classLabels - class labels

Modify:
    2018-07-26
"""
def loadsimpData():
    datMat = np.matrix([[1. , 2.1],
                        [1.5, 1.6],
                        [1.3, 1. ],
                        [1. , 1. ],
                        [2. , 1. ]])
    classLabels = [1.0, 1.0, -1.0, -1.0, 1.0]
    return datMat, classLabels


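"""
Function description: classify samples with a decision stump

Parameters:
    dataMatrix - data matrix
    dimen - column index dimen, i.e. which feature to split on
    threshVal - threshold
    threshIneq - direction flag: 'lt' marks values <= threshold as -1,
                 anything else marks values > threshold as -1

Returns:
    retArray - classification result

Modify:
    2018-07-26
"""
def stumpClassify(dataMatrix, dimen, threshVal, threshIneq):
    # Initialize retArray as a column vector of ones
    retArray = np.ones((np.shape(dataMatrix)[0], 1))
    if threshIneq == 'lt':
        # Samples at or below the threshold are assigned -1
        retArray[dataMatrix[:, dimen] <= threshVal] = -1.0
    else:
        # Samples above the threshold are assigned -1
        retArray[dataMatrix[:, dimen] > threshVal] = -1.0
    return retArray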


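"""
Function description: find the best decision stump on the dataset

Parameters:
    dataArr - data matrix
    classLabels - class labels
    D - sample weights (each sample has equal weight, 1/n)

Returns:
    bestStump - information about the best decision stump
    minError - minimum weighted error
    bestClasEst - classification result of the best stump

Modify:
    2018-07-26
"""
def buildStump(dataArr, classLabels, D):
    # Convert the input data to a matrix, shape (5, 2).
    # np.asmatrix replaces the np.mat alias, which NumPy 2.0 removed.
    dataMatrix = np.asmatrix(dataArr)
    # Transpose the label matrix to shape (5, 1)
    labelMat = np.asmatrix(classLabels).T
    # m=5, n=2
    m, n = np.shape(dataMatrix)
    numSteps = 10.0
    bestStump = {}
    # (5, 1) column matrix of zeros
    bestClasEst = np.asmatrix(np.zeros((m, 1)))
    # Initialize the minimum error to positive infinity
    minError = float('inf')
    # Loop over all features
    for i in range(n):
        # Find the minimum and maximum of this feature (column)
        rangeMin = dataMatrix[:, i].min()
        rangeMax = dataMatrix[:, i].max()
        # Compute the step size
        stepSize = (rangeMax - rangeMin) / numSteps
        for j in range(-1, int(numSteps) + 1):
            # Try both directions; lt: less than, gt: greater than
            for inequal in ['lt', 'gt']:
                # Compute the threshold
                threshVal = (rangeMin + float(j) * stepSize)
                # Compute the classification result
                predictedVals = stumpClassify(dataMatrix, i, threshVal, inequal)
                # Initialize the error matrix
                errArr = np.asmatrix(np.ones((m, 1)))
                # Correctly classified samples get 0
                errArr[predictedVals == labelMat] = 0
                # Compute the weighted error (a 1x1 matrix)
                weightedError = D.T * errArr
                print("split: dim %d, thresh %.2f, thresh inequal: %s, the weighted error is %.3f" % (i, threshVal, inequal, weightedError[0, 0]))
                # Keep the split with the lowest weighted error
                if weightedError < minError:
                    minError = weightedError
                    bestClasEst = predictedVals.copy()
                    bestStump['dim'] = i
                    bestStump['thresh'] = threshVal
                    bestStump['ineq'] = inequal
    # The preview is cut off at this point; the return below follows
    # the Returns section of the docstring above.
    return bestStump, minError, bestClasEst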
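The preview above stops before the stump search is ever called, so here is a minimal, self-contained sketch of the same idea using plain NumPy arrays instead of np.matrix. The names stump_predict and best_stump are hypothetical and do not come from the archive; the data points and the equal 1/n weights mirror the snippet, and the rest is an assumption about how the pieces fit together.

```python
import numpy as np


def stump_predict(X, dim, thresh, ineq):
    """Return +1/-1 predictions from a single-feature threshold rule."""
    pred = np.ones(X.shape[0])
    if ineq == "lt":
        pred[X[:, dim] <= thresh] = -1.0
    else:
        pred[X[:, dim] > thresh] = -1.0
    return pred


def best_stump(X, y, weights, num_steps=10):
    """Grid-search feature, threshold and direction for the lowest weighted error."""
    best, best_err, best_pred = {}, np.inf, None
    for dim in range(X.shape[1]):
        lo, hi = X[:, dim].min(), X[:, dim].max()
        step = (hi - lo) / num_steps
        for j in range(-1, num_steps + 1):
            thresh = lo + j * step
            for ineq in ("lt", "gt"):
                pred = stump_predict(X, dim, thresh, ineq)
                # Weighted error: total weight of the misclassified samples
                err = weights[pred != y].sum()
                if err < best_err:
                    best_err, best_pred = err, pred
                    best = {"dim": dim, "thresh": thresh, "ineq": ineq}
    return best, best_err, best_pred


# Same five points and labels as loadsimpData in the snippet
X = np.array([[1.0, 2.1], [1.5, 1.6], [1.3, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0, 1.0])
# Equal weight 1/n for every sample, as the buildStump docstring describes
weights = np.full(len(y), 1.0 / len(y))

stump, err, pred = best_stump(X, y, weights)
print(stump, err, pred)
```

With equal weights, this search settles on feature 0 with a threshold of about 1.3 in the 'lt' direction, misclassifying one of the five points (weighted error 0.2). No single threshold separates this dataset perfectly, which is why a stump like this is only a weak learner.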

Attribute        Size  Date       Time   Name
----------- ---------  ---------- -----  ----
Dir                 0  2018-12-17 08:25  MachineLearning\
Dir                 0  2018-07-26 09:23  MachineLearning\AdaBoost_Project1\
File             7450  2018-07-26 20:57  MachineLearning\AdaBoost_Project1\AdaBoost.py
Dir                 0  2018-12-17 08:23  MachineLearning\AdaBoost_Project2\
File             7109  2018-07-26 21:51  MachineLearning\AdaBoost_Project2\AdaBoost.py
File            13547  2018-06-24 16:10  MachineLearning\AdaBoost_Project2\horseColicTest2.txt
File            60479  2018-06-24 16:10  MachineLearning\AdaBoost_Project2\horseColicTraining2.txt
Dir                 0  2018-12-17 08:23  MachineLearning\AdaBoost_Project3\
File             1729  2018-07-27 15:19  MachineLearning\AdaBoost_Project3\AdaBoost.py
File            13547  2018-06-24 16:10  MachineLearning\AdaBoost_Project3\horseColicTest2.txt
File            60479  2018-06-24 16:10  MachineLearning\AdaBoost_Project3\horseColicTraining2.txt
Dir                 0  2018-12-17 08:23  MachineLearning\AdaBoost_Project4\
File             7990  2018-07-29 15:16  MachineLearning\AdaBoost_Project4\AdaBoost.py
File            60479  2018-06-24 16:10  MachineLearning\AdaBoost_Project4\horseColicTraining2.txt
Dir                 0  2018-12-17 08:23  MachineLearning\Apriori_Project1\
File             7857  2018-08-05 17:14  MachineLearning\Apriori_Project1\Apriori.py
File           570408  2011-07-13 09:49  MachineLearning\Apriori_Project1\mushroom.dat
Dir                 0  2018-07-29 16:45  MachineLearning\BayesianAnalysisWithPython\
File             1481  2018-07-29 17:16  MachineLearning\BayesianAnalysisWithPython\Gauss.py
Dir                 0  2018-12-17 08:23  MachineLearning\Bayes_Project1\
File             7548  2018-07-21 15:29  MachineLearning\Bayes_Project1\Bayes.py
Dir                 0  2018-12-17 08:23  MachineLearning\Bayes_Project2\
File             8969  2018-07-21 16:40  MachineLearning\Bayes_Project2\Bayes.py
Dir                 0  2018-12-17 08:23  MachineLearning\Bayes_Project2\email\
Dir                 0  2018-12-17 08:23  MachineLearning\Bayes_Project2\email\ham\
File              141  2018-06-24 16:10  MachineLearning\Bayes_Project2\email\ham\1.txt
File               82  2018-06-24 16:10  MachineLearning\Bayes_Project2\email\ham\10.txt
File              122  2018-06-24 16:10  MachineLearning\Bayes_Project2\email\ham\11.txt
File              172  2018-06-24 16:10  MachineLearning\Bayes_Project2\email\ham\12.txt
File              164  2018-06-24 16:10  MachineLearning\Bayes_Project2\email\ham\13.txt
File              162  2018-06-24 16:10  MachineLearning\Bayes_Project2\email\ham\14.txt
............ 8959 more file entries omitted here
