Machine Learning Practice with NumPy [1]: Introduction

After reviewing basic Python syntax, I started using NumPy for data manipulation, practicing and taking notes in a Jupyter notebook.

NumPy

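1. Definition: an open-source Python library for scientific computing, built for fast processing of arrays of any dimension.
2. Storage object: ndarray
3. Creation: np.array([])
4. Advantages
- Elements are stored together in one contiguous block of memory
- Vectorized operations are implemented in C, and many of them release the GIL (global interpreter lock), which allows parallel execution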
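
To make the points above concrete, here is a minimal sketch. The array name `scores` and its values are made up for illustration; the snippet only shows that an ndarray reports a contiguous buffer and that arithmetic applies to every element without a Python-level loop.

```python
import numpy as np

# A small ndarray built from a nested Python list (illustrative values)
scores = np.array([[80, 89, 86], [78, 97, 89]])

# All elements share one dtype and sit in one contiguous buffer
is_contiguous = bool(scores.flags["C_CONTIGUOUS"])

# Vectorized arithmetic: the loop over elements runs in compiled code
curved = scores + 5

print(type(scores).__name__, is_contiguous)
print(curved)
```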

NumPy: the N-dimensional array, ndarray

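  1. Attributes, shape, and type
    • shape: length of each dimension, as a tuple
    • ndim: number of dimensions
    • size: total number of elements
    • itemsize: size of one element in bytes
    • dtype: type of the elements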
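
A short sketch of reading these attributes. The array `a` and its shape are chosen only for illustration.

```python
import numpy as np

# A 3x4 array of 64-bit floats (shape and dtype chosen for the example)
a = np.zeros((3, 4), dtype=np.float64)

print(a.shape)     # (3, 4)
print(a.ndim)      # 2
print(a.size)      # 12
print(a.itemsize)  # 8
print(a.dtype)     # float64
```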

Creating and initializing arrays

Basic

1. Arrays of zeros and ones

np.ones([4,5])

np.zeros([3,3])

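2. Creating from an existing array

np.array(one_array) # deep copy: a brand-new, independent instance

np.asarray(one_array) # no copy when the input is already an ndarray: refers to the original

3. Arrays over a fixed range

np.linspace(0,10,5) # [0,10]: 5 evenly spaced items, endpoint included

np.arange(0,10,2) # [0,10): values from 0 with a step of 2, endpoint excluded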

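Distributions

1. Uniform distribution

np.random.uniform(0,10,5) # uniform distribution (low, high, size), samples from [low, high)

2. Normal distribution

np.random.normal(0,10,20) # normal distribution (loc = mean, scale = standard deviation, size)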

import numpy as np
one_array = np.ones([4,5]) # array of ones with the given shape
np.ones_like(one_array) # array of ones with the same shape and dtype as the given array
np.zeros([3,3])
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
np.array(one_array) # deep copy: a brand-new, independent instance
np.asarray(one_array) # no copy when the input is already an ndarray: refers to the original
array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
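
The two calls print the same values, so the difference only shows up once the source array changes. A minimal sketch, where the names `copied` and `aliased` are introduced here for illustration:

```python
import numpy as np

one_array = np.ones([4, 5])
copied = np.array(one_array)     # independent copy of the data
aliased = np.asarray(one_array)  # input is already an ndarray, so no copy is made

# Change the source array after both calls
one_array[0, 0] = 100

print(copied[0, 0])          # 1.0
print(aliased[0, 0])         # 100.0
print(aliased is one_array)  # True
```

Note that np.asarray does copy when it has to convert its input, for example when given a Python list.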
np.linspace(0,10,5) # [0,10]: 5 evenly spaced items, endpoint included
np.arange(0,10,2) # [0,10): values from 0 with a step of 2, endpoint excluded
array([0, 2, 4, 6, 8])
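
Only the last expression's output is displayed in the notebook, so here is a small sketch that puts both results side by side. The variable names are illustrative.

```python
import numpy as np

# linspace takes a count and includes the endpoint by default
evenly_spaced = np.linspace(0, 10, 5)

# arange takes a step and excludes the stop value
stepped = np.arange(0, 10, 2)

# linspace can also exclude the endpoint when asked to
half_open = np.linspace(0, 10, 5, endpoint=False)

print(evenly_spaced)  # [ 0.   2.5  5.   7.5 10. ]
print(stepped)        # [0 2 4 6 8]
print(half_open)      # [0. 2. 4. 6. 8.]
```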
np.random.uniform(0,10,5) # uniform distribution
array([5.08178447, 4.02888709, 4.59175026, 6.21041799, 0.90498804])
np.random.normal(0,10,20) # normal distribution
array([  5.67674523,   5.77575242,  -1.85493383,   7.33692221,
        -0.05362885,   8.37029132,   7.95216801,  21.64118456,
        14.00327488,  -0.52143642,  -4.40416346,  13.31843223,
        -6.6974112 ,  19.52709879,  -6.95182398,  -0.90412052,
       -11.02669012,   4.27056343,  18.97736721,  24.2468938 ])
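
The outputs above change on every run. As a hedged sketch, the snippet below swaps in NumPy's Generator API (np.random.default_rng) with an arbitrary seed so the draws are reproducible, then checks that a large sample roughly matches the requested parameters. The seed and the sample size are assumptions made for the example, not values from these notes.

```python
import numpy as np

# Seeded generator so the draws repeat across runs (seed 0 is arbitrary)
rng = np.random.default_rng(0)

# Same parameters as above, but many more samples
uniform_samples = rng.uniform(0, 10, 100_000)  # low, high, size
normal_samples = rng.normal(0, 10, 100_000)    # loc, scale, size

# Uniform draws stay inside [0, 10)
print(uniform_samples.min(), uniform_samples.max())

# Sample mean and standard deviation should be close to loc=0 and scale=10
print(normal_samples.mean(), normal_samples.std())
```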

Array operations

Array slicing

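[rows, columns]
1. Rows first, then columns; ranges are half-open (start included, stop excluded)
2. Index from the outermost dimension inward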

Changing the array type

array.astype(dtype)
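
A minimal sketch of a type conversion. The array `prices` and its values are made up for illustration.

```python
import numpy as np

prices = np.array([1.9, -1.9, 3.0])

# astype returns a new array; float to int conversion truncates toward zero
as_int = prices.astype(np.int32)

print(as_int)        # [ 1 -1  3]
print(prices.dtype)  # float64, the original is unchanged
```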

Removing duplicates

np.unique(temp)
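
The notes do not show what `temp` holds, so the values below are hypothetical. The sketch shows that np.unique flattens its input and returns the distinct values in sorted order.

```python
import numpy as np

# Hypothetical contents for temp
temp = np.array([[1, 2, 3, 4], [3, 4, 5, 6]])

distinct = np.unique(temp)

print(distinct)  # [1 2 3 4 5 6]
```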
import numpy as np
stock_change = np.random.normal(0, 1, (8,10))
array([[-0.21431481, -0.94781074,  0.72798452,  0.16755977, -0.21868384,
        -0.76869999, -0.42344986,  1.88452471, -1.09014707,  0.04393599],
       [ 0.1480258 , -0.83848401, -1.36803501, -1.41729986, -0.95472286,
         1.59203922, -0.65986402,  0.03174573, -0.18274345, -1.44023589],
       [ 1.22951175, -0.10736634,  0.0224487 , -0.76569652,  0.39459141,
         2.11813401,  0.61387705, -1.19309158,  0.81355314,  0.56004444],
       [-1.19703107, -1.02937508,  0.60327008,  0.18401519, -1.61605819,
         0.65697408,  0.98575318,  1.78356349,  1.5498125 , -1.06082879],
       [-1.93006799, -1.19670857,  1.35584068, -0.96465165, -0.42776941,
        -2.45202067,  0.54192585,  1.05160372, -0.20648608, -1.46869715],
       [-1.10488814,  0.75455409,  0.13580849,  0.10064928,  0.04829683,
        -0.52473154, -0.30782629,  1.475804  , -0.93086951,  0.49169795],
       [-1.52363283, -1.53559218, -0.32670834, -0.75836768, -0.47355597,
         0.6849614 ,  0.32947873,  0.42595307,  0.86099386,  0.24105507],
       [-1.34253205,  0.13808892, -0.3581911 ,  0.16412846, -0.01493121,
        -0.78940982,  0.20047251, -0.69006736,  0.78435666,  0.05314826]])
stock_change[0:3,0:2]
array([[-0.21431481, -0.94781074],
       [ 0.1480258 , -0.83848401],
       [ 1.22951175, -0.10736634]])
stock_change[0,1]
-0.9478107415708911
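
Because the random values make the slices hard to check by eye, here is the same idea on a small deterministic array. The names `grid` and `cube` are introduced only for this sketch.

```python
import numpy as np

# 3 rows x 4 columns holding 0..11
grid = np.arange(12).reshape(3, 4)

# Rows 0-1 and columns 0-2: the stop index is excluded
top_left = grid[0:2, 0:3]

# One element: row 1, column 2
single = grid[1, 2]

# With three dimensions, indices go from the outermost axis inward
cube = np.arange(24).reshape(2, 3, 4)
inner = cube[1, 0, 2]

print(top_left)  # [[0 1 2]
                 #  [4 5 6]]
print(single)    # 6
print(inner)     # 14
```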

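Changing array (matrix) shape

1. array.T # transpose: swap rows and columns
2. array.reshape([rows, cols]) # flatten, then split into rows and columns again. Returns a new array object and leaves the original variable unchanged; passing -1 lets that dimension be inferred from the other.
3. array.resize([rows, cols]) # flatten, then split into rows and columns again. Modifies the original variable in place.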
stock_sharp_change = np.random.normal(0,1,(4,5))
stock_sharp_change
array([[-0.54011821,  1.68452432,  0.01165422,  0.92022483, -1.76956384],
       [ 0.12076425,  0.48550684, -2.07158306,  0.11184936,  0.13483726],
       [-1.08849942, -0.33872445,  0.19081035, -0.51772807, -0.05330802],
       [ 1.72971777,  1.15105593, -0.70068092,  0.50980343,  2.6761524 ]])
stock_sharp_change.reshape([5,4]) # flatten, then split into 5 rows of 4 columns
array([[-0.54011821,  1.68452432,  0.01165422,  0.92022483],
       [-1.76956384,  0.12076425,  0.48550684, -2.07158306],
       [ 0.11184936,  0.13483726, -1.08849942, -0.33872445],
       [ 0.19081035, -0.51772807, -0.05330802,  1.72971777],
       [ 1.15105593, -0.70068092,  0.50980343,  2.6761524 ]])
stock_sharp_change.reshape([-1,2]) # flatten, then split into 2 columns; the row count is inferred
array([[-0.54011821,  1.68452432],
       [ 0.01165422,  0.92022483],
       [-1.76956384,  0.12076425],
       [ 0.48550684, -2.07158306],
       [ 0.11184936,  0.13483726],
       [-1.08849942, -0.33872445],
       [ 0.19081035, -0.51772807],
       [-0.05330802,  1.72971777],
       [ 1.15105593, -0.70068092],
       [ 0.50980343,  2.6761524 ]])
stock_sharp_change.resize([5,4]) # modifies the original variable in place
stock_sharp_change
array([[-0.54011821,  1.68452432,  0.01165422,  0.92022483],
       [-1.76956384,  0.12076425,  0.48550684, -2.07158306],
       [ 0.11184936,  0.13483726, -1.08849942, -0.33872445],
       [ 0.19081035, -0.51772807, -0.05330802,  1.72971777],
       [ 1.15105593, -0.70068092,  0.50980343,  2.6761524 ]])
stock_sharp_change.T # matrix transpose: swap rows and columns
array([[-0.54011821, -1.76956384,  0.11184936,  0.19081035,  1.15105593],
       [ 1.68452432,  0.12076425,  0.13483726, -0.51772807, -0.70068092],
       [ 0.01165422,  0.48550684, -1.08849942, -0.05330802,  0.50980343],
       [ 0.92022483, -2.07158306, -0.33872445,  1.72971777,  2.6761524 ]])
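
The difference between the three operations is easier to see on a small deterministic array. A minimal sketch, with `m` and `own` as illustrative names:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]

# reshape returns a new array object; -1 asks NumPy to infer that dimension
r = m.reshape(3, -1)

# T swaps rows and columns, which is not the same layout as reshape
t = m.T

# resize changes the array itself and returns None
own = np.arange(6)
own.resize((3, 2))

print(m.shape)     # (2, 3), unchanged by reshape and T
print(r.tolist())  # [[0, 1], [2, 3], [4, 5]]
print(t.tolist())  # [[0, 3], [1, 4], [2, 5]]
print(own.shape)   # (3, 2)
```

Both `r` and `t` have shape (3, 2), but only the transpose moves elements across rows and columns.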