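u_c_new = clf1.predict(u_d)  # classify the unlabeled data with the SVM trained on the labeled data, and use the predictions as its labels
cu, cl = 0.0001, 1  # initial trade-off parameters weighting the unlabeled (cu) and labeled (cl) data
sample_weight = np.ones(n)  # sample weights: labeled samples get weight cl, unlabeled samples get weight cu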
print(u_c_new.shape)
print(type(u_c_new))
print(l_c.shape)
print(type(l_c))
# sample_weight[len(l_c):] = cu
# id_set = np.arange(len(u_d))
lu_c = np.concatenate((l_c, u_c_new))
Output:
(113, 1)
<class 'numpy.ndarray'>
(60, 1)
<class 'numpy.ndarray'>
Traceback (most recent call last):
  File "E:/PYTHON/PYCHARM/Demo/TSVM2.py", line 48, in <module>
    lu_c = np.concatenate((l_c, u_c_new))
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 2 dimension(s) and the array at index 1 has 1 dimension(s)
In the debugger, u_c_new shows up as {ndarray: (113,)} while l_c shows up as {ndarray: (60, 1)}. Why is that, and how can I fix it?
Take a look at the solution described here:
https://blog.csdn.net/liangjiu2009/article/details/104371329
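There are two ways to solve the problem above:
u_c_new = pd.DataFrame(u_c_new)  # needs "import pandas as pd"; a 1-D array becomes a single-column frame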
lu_c = np.concatenate((l_c, u_c_new))
or
u_c_new = u_c_new.reshape((n, m))  # n, m are the target shape and depend on your data, here (113, 1); reshape(-1, 1) lets NumPy infer n
lu_c = np.concatenate((l_c, u_c_new))
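As for the "why": assuming clf1 is a scikit-learn classifier (for example sklearn.svm.SVC), predict returns a 1-D array of shape (n_samples,), whereas l_c here is a 2-D column of shape (60, 1). np.concatenate requires every input to have the same number of dimensions. Below is a minimal sketch of the mismatch and of two possible fixes; the arrays are made-up stand-ins for the data in the question, not the real values.

```python
import numpy as np

# Stand-ins for the arrays in the question (values are made up for illustration).
l_c = np.ones((60, 1))     # labeled classes stored as a column vector: shape (60, 1)
u_c_new = np.zeros(113)    # shaped like the output of predict(): shape (113,)

# Fix 1: turn the predictions into a column.
# -1 lets NumPy infer the row count, so this works for any number of samples.
u_c_col = u_c_new.reshape(-1, 1)
lu_c = np.concatenate((l_c, u_c_col))                # 2-D result, one column

# Fix 2: flatten the labels instead, so both inputs are 1-D.
lu_c_flat = np.concatenate((l_c.ravel(), u_c_new))   # 1-D result
```

Which one to use depends on what the rest of the code expects: scikit-learn's fit generally expects 1-D labels, so the flattened form may be the more convenient one if lu_c is passed back to the classifier.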
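If the concatenation happens inside a loop, the reshape (or DataFrame conversion) has to be applied before every concatenation, because each new prediction array is 1-D again.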
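One way to avoid forgetting the conversion inside a loop is to route every array through a small helper before concatenating. This is only a sketch: as_column is a hypothetical helper name, and the arrays are placeholders standing in for the labels and for the result of each predict call.

```python
import numpy as np

def as_column(a):
    """Return `a` as a 2-D column of shape (len, 1). Hypothetical helper."""
    return np.asarray(a).reshape(-1, 1)

l_c = np.ones((60, 1))              # made-up labeled classes, shape (60, 1)

for step in range(3):
    # Stand-in for a fresh 1-D prediction array produced in each iteration.
    u_c_new = np.full(113, -1.0)
    # Normalising both inputs makes the concatenation safe on every pass,
    # whether an input arrives as 1-D or already as a column.
    lu_c = np.concatenate((as_column(l_c), as_column(u_c_new)))
```

Arrays that are already columns pass through unchanged, so the helper can be applied to both inputs without checking their shapes first.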