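This is for my graduation project 😭
The input is an audio recording of a piano being played, and I need to determine which note (which key) it is. How should I implement this?
Hoping someone can give me an idea or a direction 🧎♀️
Reference: https://blog.csdn.net/calvinpaean/article/details/103204271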
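Not sure whether your problem has already been solved. If it hasn't:
# get_submatrix and filter_submatrix are helper functions that are not shown in this snippet
def get_myclass(columns, similarity_df, threshold):
    myclass = {0: []}
    row_col_names_init = columns.copy()
    i = 0
    j = 1
    while True:
        # Drop the names that were assigned to the previous class
        row_col_names_init = row_col_names_init - set(myclass[i])
        print(len(row_col_names_init))
        # Two or fewer names left: put them in a final class and stop
        if len(row_col_names_init) <= 2:
            myclass[j] = row_col_names_init
            break
        similarity_test = similarity_df.loc[row_col_names_init, row_col_names_init]
        try:
            myclass[j] = get_submatrix(similarity_test, threshold)
        except Exception:
            # No submatrix at this threshold: relax it slightly and retry
            threshold -= 0.01
            continue
        if len(myclass[j]) > 1:
            myclass[j] = filter_submatrix(myclass[j], similarity_df)
        else:
            myclass[j] = myclass[j][0]
        print(j, myclass[j])
        print(threshold)
        i += 1
        j += 1
    return myclass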
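For this problem, the following steps can be used:
1. Data preparation: collect a number of audio recordings of different piano keys and label them, i.e. mark each audio file with the key it belongs to.
2. Feature extraction: use audio signal processing to extract a feature vector from each audio file as the model input. Commonly used audio features include Mel-frequency cepstral coefficients (MFCC), chromagrams, and pitch spectra.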
Code example:
import librosa
import numpy as np

def extract_feature(file_name):
    '''
    Extract features from an audio file.
    Parameters:
        file_name: path to the audio file
    Returns:
        feature: the extracted feature vector
    '''
    # Load the audio file
    y, sr = librosa.load(file_name)
    # MFCC features, averaged over time
    mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)
    mfccs = np.mean(mfccs.T, axis=0)
    # Chromagram features, averaged over time
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)
    chroma = np.mean(chroma.T, axis=0)
    # Loudness feature (RMS energy), averaged over time
    rms = librosa.feature.rms(y=y)
    rms = np.mean(rms.T, axis=0)
    # Concatenate all features into a single vector
    feature = np.hstack([mfccs, chroma, rms])
    return feature
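Since step 2 lists pitch among the candidate features, here is a rough sketch of how a frequency maps to a key number. It is not part of the answer above. It assumes a standard 88-key piano tuned to A4 = 440 Hz (key 49), a single note per recording, and that the fundamental is the strongest component, which real piano recordings (especially low notes) do not always satisfy. The function names are made up for illustration, and reading the audio file into a list of samples is left out.

```python
import math

A4_HZ = 440.0  # assumed concert-pitch tuning
A4_KEY = 49    # A4 is key 49 on a standard 88-key piano


def key_frequency(key):
    """Equal-temperament frequency in Hz of piano key number `key` (1..88)."""
    return A4_HZ * 2.0 ** ((key - A4_KEY) / 12.0)


def goertzel_power(samples, sample_rate, freq):
    """Signal power at `freq`, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def strongest_key(samples, sample_rate):
    """Return the key (1..88) whose frequency carries the most energy."""
    nyquist = sample_rate / 2.0
    # Keys above the Nyquist frequency cannot be represented at this sample rate
    candidates = [k for k in range(1, 89) if key_frequency(k) < nyquist]
    return max(
        candidates,
        key=lambda k: goertzel_power(samples, sample_rate, key_frequency(k)),
    )


# Usage: a synthetic 0.5 s pure tone at 440 Hz standing in for a recording
sample_rate = 8000
tone = [
    math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
    for n in range(sample_rate // 2)
]
detected_key = strongest_key(tone, sample_rate)
print('Strongest key:', detected_key)
```

Neighbouring low keys are only a couple of Hz apart, so a short clip may not separate them. A longer window or a dedicated pitch tracker would probably be needed there.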
3. Model training: build a classification model with a machine learning algorithm (such as a support vector machine, random forest, or neural network) and train it on the labeled data.
Code example:
import glob

from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Build the dataset: one folder per key (1 to 6), each holding that key's .wav files
data = []
labels = []
for i in range(1, 7):
    file_names = glob.glob('path/to/{}/*.wav'.format(i))
    for file_name in file_names:
        feature = extract_feature(file_name)
        data.append(feature)
        labels.append(i)

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)

# Choose one classifier; the alternatives are left commented out
clf = SVC(kernel='rbf', gamma=0.001, C=100)
# clf = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=42)
# clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=1000, alpha=1e-4, solver='adam', tol=1e-4, random_state=42)

# Train, then evaluate on the held-out test set
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print('Test accuracy: {:.3f}'.format(accuracy))
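A single accuracy number does not show which keys get mixed up with each other, so it may help to break the result down per key. The sketch below uses only the standard library. The function names and the demo labels are made up for illustration; in practice y_test and y_pred from the code above would be passed in.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true_label, predicted_label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_class_recall(y_true, y_pred):
    """Fraction of the samples of each true label that were predicted correctly."""
    totals = Counter(y_true)
    pairs = confusion_counts(y_true, y_pred)
    return {label: pairs[(label, label)] / totals[label] for label in sorted(totals)}


# Hypothetical labels for keys 1..3, standing in for y_test and y_pred
y_true_demo = [1, 1, 2, 2, 3, 3]
y_pred_demo = [1, 2, 2, 2, 3, 1]

counts = confusion_counts(y_true_demo, y_pred_demo)
recall = per_class_recall(y_true_demo, y_pred_demo)

for (true_label, pred_label), count in sorted(counts.items()):
    print('true={} predicted={} count={}'.format(true_label, pred_label, count))
print(recall)
```

If neighbouring keys turn out to be confused most often, that would suggest trying features that carry more pitch information.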
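4. Model optimization: tune and refine the model based on its performance, e.g. adjust the SVM hyperparameters C and gamma, add more training data, or try different feature extraction methods.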
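One possible way to tune C and gamma is a cross-validated grid search with scikit-learn's GridSearchCV. The sketch below is only an illustration: the data is synthetic (random clusters standing in for real feature vectors), the grid values are arbitrary starting points, and the StandardScaler step is my own assumption, added because MFCC, chroma and RMS values sit on quite different scales.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real feature matrix: 3 classes, 20 samples each
rng = np.random.default_rng(0)
X_demo = np.vstack(
    [rng.normal(loc=center, scale=0.5, size=(20, 4)) for center in (0.0, 3.0, 6.0)]
)
y_demo = np.repeat([1, 2, 3], 20)

# Scale the features, then classify with an RBF-kernel SVM
pipeline = make_pipeline(StandardScaler(), SVC(kernel='rbf'))

# make_pipeline names each step after its lowercased class name, hence 'svc__'
param_grid = {
    'svc__C': [1, 10, 100],
    'svc__gamma': [0.001, 0.01, 0.1],
}

# 3-fold cross-validation over every C/gamma combination
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_demo, y_demo)

best_params = search.best_params_
best_score = search.best_score_
print('Best parameters:', best_params)
print('Best cross-validated accuracy: {:.3f}'.format(best_score))
```

With real data, X_train and y_train from step 3 would replace X_demo and y_demo, and search.best_estimator_ could then be used in place of clf.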
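5. Model application: use the trained model to classify new audio files.
Code example:
new_file = 'path/to/new_audio.wav'
new_feature = extract_feature(new_file)
# predict expects a 2-D array, so reshape the single feature vector to (1, n_features)
new_label = clf.predict(new_feature.reshape(1, -1))[0]
print('The new audio file belongs to key {}.'.format(new_label))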