Grid search cross-validation: grid.fit(X_train, y_train) raises an encoding error

Code that fails:
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier

# Now that we know standard scaling is best for our features, we'll use those for our training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    features_scaled, 
    emotions, 
    test_size=0.2, 
    random_state=69
)

# Initialize the MLP Classifier and choose parameters we want to keep constant
model = MLPClassifier(
    # tune batch size later 
    batch_size=10,  
    # keep random state constant to accurately compare subsequent models
    random_state=69
)

# Choose the grid of hyperparameters we want to use for Grid Search to build our candidate models
parameter_space = {
    # A single hidden layer of size between 8 (output classes) and 180 (input features) neurons is most probable
    # Guessing the number of hidden layers is unreliable
    # ...but we'll give 2 and 3 hidden layers a shot to confirm our suspicion that 1 is best
    'hidden_layer_sizes': [(8,), (180,), (300,), (100, 50), (10, 10, 10)],
    'activation': ['tanh','relu', 'logistic'],
    'solver': ['sgd', 'adam'],
    'alpha': [0.0001, 0.001, 0.01],
    'epsilon': [1e-08, 0.1 ],
    'learning_rate': ['adaptive', 'constant']
}
   
# Create a grid search object which will store the scores and hyperparameters of all candidate models 
grid = GridSearchCV(
    model, 
    parameter_space,
    cv=10,
    n_jobs=4)
# Fit the models specified by the parameter grid 

grid.fit(X_train, y_train)

# get the best hyperparameters from grid search object with its best_params_ attribute
print('Best parameters found:\n', grid.best_params_)

The error is as follows:

UnicodeEncodeError                        Traceback (most recent call last)
<ipython-input-32-90e0439e78b9> in <module>
     41 # Fit the models specified by the parameter grid
     42 
---> 43 grid.fit(X_train, y_train)
     44 
     45 # get the best hyperparameters from grid search object with its best_params_ attribute

d:\miniconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74 

d:\miniconda3\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
    693                                     verbose=self.verbose)
    694         results = {}
--> 695         with parallel:
    696             all_candidate_params = []
    697             all_out = []

d:\miniconda3\lib\site-packages\joblib\parallel.py in __enter__(self)
    728     def __enter__(self):
    729         self._managed_backend = True
--> 730         self._initialize_backend()
    731         return self
    732 

d:\miniconda3\lib\site-packages\joblib\parallel.py in _initialize_backend(self)
    739         try:
    740             n_jobs = self._backend.configure(n_jobs=self.n_jobs, parallel=self,
--> 741                                              **self._backend_args)
    742             if self.timeout is not None and not self._backend.supports_timeout:
    743                 warnings.warn(

d:\miniconda3\lib\site-packages\joblib\_parallel_backends.py in configure(self, n_jobs, parallel, prefer, require, idle_worker_timeout, **memmappingexecutor_args)
    495             n_jobs, timeout=idle_worker_timeout,
    496             env=self._prepare_worker_env(n_jobs=n_jobs),
--> 497             context_id=parallel._id, **memmappingexecutor_args)
    498         self.parallel = parallel
    499         return n_jobs

d:\miniconda3\lib\site-packages\joblib\executor.py in get_memmapping_executor(n_jobs, **kwargs)
     18 
     19 def get_memmapping_executor(n_jobs, **kwargs):
---> 20     return MemmappingExecutor.get_memmapping_executor(n_jobs, **kwargs)
     21 
     22 

d:\miniconda3\lib\site-packages\joblib\executor.py in get_memmapping_executor(cls, n_jobs, timeout, initializer, initargs, env, temp_folder, context_id, **backend_args)
     40         _executor_args = executor_args
     41 
---> 42         manager = TemporaryResourcesManager(temp_folder)
     43 
     44         # reducers access the temporary folder in which to store temporary

d:\miniconda3\lib\site-packages\joblib\_memmapping_reducer.py in __init__(self, temp_folder_root, context_id)
    529             # exposes exposes too many low-level details.
    530             context_id = uuid4().hex
--> 531         self.set_current_context(context_id)
    532 
    533     def set_current_context(self, context_id):

d:\miniconda3\lib\site-packages\joblib\_memmapping_reducer.py in set_current_context(self, context_id)
    533     def set_current_context(self, context_id):
    534         self._current_context_id = context_id
--> 535         self.register_new_context(context_id)
    536 
    537     def register_new_context(self, context_id):

d:\miniconda3\lib\site-packages\joblib\_memmapping_reducer.py in register_new_context(self, context_id)
    558                 new_folder_name, self._temp_folder_root
    559             )
--> 560             self.register_folder_finalizer(new_folder_path, context_id)
    561             self._cached_temp_folders[context_id] = new_folder_path
    562 

d:\miniconda3\lib\site-packages\joblib\_memmapping_reducer.py in register_folder_finalizer(self, pool_subfolder, context_id)
    588         # semaphores and pipes
    589         pool_module_name = whichmodule(delete_folder, 'delete_folder')
--> 590         resource_tracker.register(pool_subfolder, "folder")
    591 
    592         def _cleanup():

d:\miniconda3\lib\site-packages\joblib\externals\loky\backend\resource_tracker.py in register(self, name, rtype)
    189         '''Register a named resource, and increment its refcount.'''
    190         self.ensure_running()
--> 191         self._send('REGISTER', name, rtype)
    192 
    193     def unregister(self, name, rtype):

d:\miniconda3\lib\site-packages\joblib\externals\loky\backend\resource_tracker.py in _send(self, cmd, name, rtype)
    202 
    203     def _send(self, cmd, name, rtype):
--> 204         msg = '{0}:{1}:{2}\n'.format(cmd, name, rtype).encode('utf-8')
    205         if len(name) > 512:
    206             # posix guarantees that writes to a pipe of less than PIPE_BUF

UnicodeEncodeError: 'ascii' codec can't encode characters in position 18-20: ordinal not in range(128)

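I have tried all the fixes I could find online, and none of them work. For example, the reload(sys) approach fails: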

NameError                                 Traceback (most recent call last)
<ipython-input-30-a31d90ae8313> in <module>
      5 import sys
      6 
----> 7 reload(sys)
      8 sys.setdefaultencoding('utf8')
      9 from sklearn.model_selection import GridSearchCV

NameError: name 'reload' is not defined

 

What is the error message?

I'll paste it (it is the same traceback as in the question):

 


 
UnicodeEncodeError: 'ascii' codec can't encode characters in position 18-20: ordinal not in range(128)

 

One more thing: if I set n_jobs=1, it just keeps running. There is no error, but it never produces a result.
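One possible reason the n_jobs=1 run looks stuck (an assumption, nothing in the thread confirms it) is simply the size of the search. The sketch below counts the candidates in the grid from the question using only the standard library; cv_folds mirrors cv=10 in the question.

```python
from itertools import product

# Parameter grid copied from the question.
parameter_space = {
    'hidden_layer_sizes': [(8,), (180,), (300,), (100, 50), (10, 10, 10)],
    'activation': ['tanh', 'relu', 'logistic'],
    'solver': ['sgd', 'adam'],
    'alpha': [0.0001, 0.001, 0.01],
    'epsilon': [1e-08, 0.1],
    'learning_rate': ['adaptive', 'constant'],
}
cv_folds = 10  # cv=10 in the question

# Every combination of values is one candidate model.
n_candidates = sum(1 for _ in product(*parameter_space.values()))

# Each candidate is trained once per cross-validation fold.
n_fits = n_candidates * cv_folds

print(n_candidates, n_fits)  # 360 3600
```

With 3600 MLP fits on a single process, a long silent run is plausible. Passing verbose=2 to GridSearchCV prints progress for each fit, which would show whether the search is advancing.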

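I don't quite follow. Could you be more specific? In some of the reports I found online, the position range in the error is not 18-20.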
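The position range in the message is an index into the string being encoded, so it differs from one machine to the next. The sketch below reproduces the same message with an invented string. The 'REGISTER:<name>:folder' layout follows the _send line shown in the traceback, but the Windows path and the three-character user name are assumptions made up for illustration, not values taken from the asker's machine.

```python
# Hypothetical message in the 'REGISTER:<name>:folder' layout seen in the
# traceback. The path and user name below are invented for illustration.
msg = 'REGISTER:C:\\Users\\张小明\\AppData\\Local\\Temp\\joblib_folder:folder'

err = None
try:
    msg.encode('ascii')
except UnicodeEncodeError as exc:
    err = exc  # keep a reference; 'exc' is cleared when the block ends

# start is the index of the first offending character,
# end is one past the last offending character.
bad_chars = msg[err.start:err.end]
print(err.start, err.end - 1, bad_chars)
print(err)
```

Here the offending characters are the user name inside the path, which is why the reported range depends on where the non-ASCII characters sit in the string.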

 

The training data may be the problem. Check your training data.

What kind of problem? The naming?

 

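In Python 2, the default encoding after installation is ascii. When a program encounters non-ASCII data, Python often raises an error such as UnicodeDecodeError: 'ascii' codec can't decode byte 0x?? in position 1: ordinal not in range(128). Python cannot handle the non-ASCII data with that codec, so you need to set Python's default encoding yourself, usually to utf8.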

To check the system default encoding, enter the following in the interpreter:

>>> import sys
>>> sys.getdefaultencoding()

To set the default encoding, use:

>>> sys.setdefaultencoding('utf8')

This may raise AttributeError: 'module' object has no attribute 'setdefaultencoding'. In Python 2, run reload(sys) first, then run the command again, and it will succeed.

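Running sys.getdefaultencoding() again now shows that the encoding has been set to utf8. However, a change made in the interpreter only lasts for the current session: after restarting the interpreter, the encoding is reset to the default ascii. So is there a way to change the default encoding of a program or the system once and for all?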

 

There are two ways to set Python's default encoding.

The first is to add the following code to the program:

import sys
reload(sys)
sys.setdefaultencoding('utf8')

The second is to create a file named sitecustomize.py in Python's Lib\site-packages folder with the following content:

# encoding=utf8
import sys

reload(sys)
sys.setdefaultencoding('utf8')

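Now restart the Python interpreter and run sys.getdefaultencoding(): the encoding is already set to utf8, and it stays that way across restarts. This is because Python automatically runs this file at startup and sets the default encoding, so you do not have to add the fix by hand each time. It is a once-and-for-all solution.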
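The reload(sys) recipe only exists in Python 2. The NameError in the question suggests the asker is on Python 3, where reload is no longer a builtin and sys.setdefaultencoding is not available. A quick check, as a minimal sketch that assumes it is run under Python 3:

```python
import builtins
import sys

# On Python 3 the default string encoding is always 'utf-8'.
default_encoding = sys.getdefaultencoding()

# reload() is not a builtin on Python 3 (it lives in importlib).
has_builtin_reload = hasattr(builtins, 'reload')

# sys.setdefaultencoding does not exist on Python 3.
has_setdefaultencoding = hasattr(sys, 'setdefaultencoding')

print(sys.version_info.major, default_encoding)
print(has_builtin_reload, has_setdefaultencoding)
```

If this prints utf-8 and two False values, changing the default encoding is not the fix to look for, and the ascii codec error probably comes from somewhere else.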

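Another option is to force utf8 encoding everywhere the program deals with encodings, by adding encode("utf8"). This approach is not recommended: miss one spot and you get a flood of error reports. I once ran into exactly this. The error log was still over 70 KB after compression, all from this one problem, which was maddening.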

Pay attention to the arguments.

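The characters at positions 18-20 of the string are outside the ASCII range (code points 0-127). The rest of the string is ASCII.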
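The traceback fails while joblib registers a temporary folder, so one thing worth checking is whether the temp directory path contains non-ASCII characters, for example a non-ASCII Windows user name. This is an untested guess, not something confirmed in the thread. The sketch below checks the path and, if needed, points joblib at another folder through the JOBLIB_TEMP_FOLDER environment variable, which joblib reads when it picks its temp folder. The helper names and the fallback path D:\joblib_tmp are hypothetical.

```python
import os
import tempfile


def is_ascii_path(path):
    """Return True if every character in path is ASCII."""
    try:
        path.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True


def choose_joblib_temp_folder(default, fallback):
    """Keep the default temp dir if it is ASCII-only, else use fallback."""
    return default if is_ascii_path(default) else fallback


# D:\joblib_tmp is a hypothetical ASCII-only folder; pick any path you own.
folder = choose_joblib_temp_folder(tempfile.gettempdir(), r'D:\joblib_tmp')
os.makedirs(folder, exist_ok=True)

# Set this before grid.fit(...) so joblib sees it when starting workers.
os.environ['JOBLIB_TEMP_FOLDER'] = folder
print(folder)
```

Run this in the notebook before creating the GridSearchCV object. If the default temp path is already ASCII-only, nothing changes and the cause lies elsewhere.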