How do I use a custom metric in a TensorFlow Estimator?


```python
# This code lives in metric_funcs.py
import tensorflow as tf  # import not shown in the original post; assumed


class MyMAE(tf.keras.metrics.Metric):
    """Mean absolute error with optional per-sample weights."""

    def __init__(self, name="mymae", **kwargs):
        super(MyMAE, self).__init__(name=name, **kwargs)
        # Running weighted sum of errors and running sum of weights.
        self.total = self.add_weight(name='total', initializer='zeros')
        self.count = self.add_weight(name='count', initializer='zeros')

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_true = tf.cast(y_true, dtype=tf.float32)
        y_pred = tf.cast(y_pred, dtype=tf.float32)
        # Absolute error averaged over the last axis.
        t = tf.reduce_mean(tf.abs(y_true - y_pred), axis=-1)
        if sample_weight is not None:
            sample_weight = tf.cast(sample_weight, dtype=tf.float32)
            # shape.rank is a static value, so it is also available in graph mode.
            ndim = t.shape.rank
            weight_ndim = sample_weight.shape.rank
            # Average away any trailing axes the weights do not cover.
            t = tf.reduce_mean(t, axis=list(range(weight_ndim, ndim)))
            t = tf.multiply(t, sample_weight)

        t_sum = tf.reduce_sum(t)
        self.total.assign_add(t_sum)
        if sample_weight is not None:
            num = tf.reduce_sum(sample_weight)
        else:
            num = tf.cast(tf.size(t), dtype=tf.float32)
        self.count.assign_add(num)

    def result(self):
        return self.total / self.count
```
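Things to watch out for:

 1. To change a weight such as `total` or `count`, call `self.total.assign_add()`; do not add to or subtract from `self.total` directly.
 2. Some methods and functions cannot be used inside a custom metric class, so take care when writing one. TensorFlow reports an error when this happens.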
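
To illustrate the first point, here is a minimal sketch of a toy metric that keeps its state only through `assign_add()`. `RunningMean` is a hypothetical class made up for this illustration; it is not part of the original code or of TensorFlow.

```python
import tensorflow as tf


class RunningMean(tf.keras.metrics.Metric):
    """Toy metric (hypothetical): the mean of every value seen so far."""

    def __init__(self, name="running_mean", **kwargs):
        super().__init__(name=name, **kwargs)
        self.total = self.add_weight(name="total", initializer="zeros")
        self.count = self.add_weight(name="count", initializer="zeros")

    def update_state(self, values, sample_weight=None):
        values = tf.cast(values, tf.float32)
        # assign_add() changes the variables in place, so the state survives
        # across calls. Writing `self.total = self.total + ...` would instead
        # rebind the attribute to a plain tensor.
        self.total.assign_add(tf.reduce_sum(values))
        self.count.assign_add(tf.cast(tf.size(values), tf.float32))

    def result(self):
        return self.total / self.count


metric = RunningMean()
metric.update_state([1.0, 2.0])
metric.update_state([3.0, 6.0])  # state accumulates over both calls
mean_value = float(metric.result())
print(mean_value)
```

The second `update_state()` call adds to the sums left by the first one, which is the behaviour a streaming evaluation metric needs.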
 
Comparison with `MeanAbsoluteError`:
 

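```python
# Assumes `tf` and MyMAE from metric_funcs.py above are in scope.
a = tf.random.uniform([2, 3, 4])
b = tf.random.uniform([2, 3, 4])
w = tf.random.uniform([2, 3])

# Built-in metric.
m = tf.keras.metrics.MeanAbsoluteError()
m.update_state(a, b, w)
print(m.get_weights())
print(m.result().numpy())

# Custom metric; it should print the same state and result.
mae = MyMAE()
mae.update_state(a, b, w)
print(mae.get_weights())
print(mae.result().numpy())
```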
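
For reference, the arithmetic that `MyMAE.update_state()` performs with sample weights can be reproduced without TensorFlow. The sketch below uses nested Python lists; `weighted_mae` is a hypothetical helper written for this illustration, and the input numbers are made up. As far as I understand, the built-in `MeanAbsoluteError` accumulates the same two quantities, which is why the two printouts above should agree.

```python
def weighted_mae(y_true, y_pred, weights):
    """Return (total, count, result) for inputs shaped [batch, steps, features]
    and weights shaped [batch, steps], all given as nested lists."""
    total = 0.0
    count = 0.0
    for true_rows, pred_rows, weight_row in zip(y_true, y_pred, weights):
        for true_vec, pred_vec, w in zip(true_rows, pred_rows, weight_row):
            # Mean absolute error over the last axis.
            err = sum(abs(t - p) for t, p in zip(true_vec, pred_vec)) / len(true_vec)
            total += err * w  # what gets added to `total`
            count += w        # what gets added to `count`
    return total, count, total / count


# Made-up data: batch of 1, 2 steps, 2 features.
y_true = [[[1.0, 2.0], [0.0, 0.0]]]
y_pred = [[[0.0, 0.0], [1.0, 3.0]]]
weights = [[2.0, 1.0]]

total, count, result = weighted_mae(y_true, y_pred, weights)
print(total, count, result)
```

Note that the result is the weighted sum of errors divided by the sum of the weights, not by the number of elements.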

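The code above shows how to call the metric directly. If I want to use this metric in an estimator, how should the corresponding code be written?

My implementation is below, but it fails at runtime: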

```python
# Also in metric_funcs.py (this is the version that fails)
def my_metric(per_example_loss, label_ids, logits):
  """Compute eval metrics."""
  mae = MyMAE()

  return {
      "my_mae":
          mae.update_state(label_ids, logits),
      "eval_loss":
          tf.metrics.mean(per_example_loss)
  }
```


The custom metric is called from the estimator like this:

```python
# main.py (excerpt; the surrounding model function is omitted)

metric_fn = metric_funcs.my_metric

eval_metrics = (metric_fn, [loss, label_ids, logits])

# eval_metrics = (metric_fn, [label_ids, logits])

return tf.compat.v1.estimator.tpu.TPUEstimatorSpec(
    mode=mode, loss=loss, eval_metrics=eval_metrics)
```

The error is:

```
 File "/lustre/home/xlyun/Software/anaconda3/envs/tensorflow23/lib/python3.6/site-packages/tensorflow_estimator/python/estimator/model_fn.py", line 488, in _validate_eval_metric_ops
    'tuples, given: {} for key: {}'.format(value, key))
TypeError: Values of eval_metric_ops must be (metric_value, update_op) tuples, given: name: "StatefulPartitionedCall"
op: "StatefulPartitionedCall"
```

Any pointers would be much appreciated. Thank you!

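As the saying goes, whoever tied the bell has to untie it, so I am answering my own question.

The correct way to write `my_metric()` is: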

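```python
# Custom metrics
# Reference: https://tensorflow.google.cn/versions/r2.6/api_docs/python/tf/estimator/add_metrics

# Also in metric_funcs.py
def my_metric(per_example_loss, label_ids, logits):
  """Compute eval metrics."""
  mae = MyMAE()
  # Call update_state() for its side effect only, then return the Metric
  # object itself rather than the return value of update_state().
  mae.update_state(label_ids, logits)

  return {
      "my_mae":
          mae,
      # TF1-style metric; with a plain TF2 import this function is
      # tf.compat.v1.metrics.mean.
      "eval_loss":
          tf.metrics.mean(per_example_loss)
  }
```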
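
The same pattern, following the `tf.estimator.add_metrics` page referenced above, can be sketched as a standalone metric function. This is only an illustration under assumptions: `mae_metric_fn` is a hypothetical name, the built-in `MeanAbsoluteError` stands in for `MyMAE`, `predictions` is assumed to be a single tensor rather than a dict, and the `add_metrics` line is left as a comment because it needs an existing estimator and a TensorFlow release that still ships `tf.estimator`.

```python
import tensorflow as tf


def mae_metric_fn(labels, predictions):
    """Return a dict that maps a metric name to a Keras Metric object."""
    metric = tf.keras.metrics.MeanAbsoluteError(name="my_mae")
    # Update the state, then hand back the Metric object itself.
    metric.update_state(y_true=labels, y_pred=predictions)
    return {"my_mae": metric}


# With an existing estimator (not constructed here), the function would be
# attached roughly like this:
# estimator = tf.estimator.add_metrics(estimator, mae_metric_fn)

# Eager sanity check of the function on made-up tensors.
metrics = mae_metric_fn(tf.constant([[1.0, 2.0]]), tf.constant([[0.0, 4.0]]))
mae_value = float(metrics["my_mae"].result())
print(mae_value)
```

The key point is the same as in the fix above: the dict value is the `Metric` object, not whatever `update_state()` returns.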