Google Cloud Dataflow job failing with error "Failed to retrieve staged files: failed to retrieve worker in 3 attempts: bad MD5 …"

SDK: Apache Beam SDK for Go 0.5.0

We are running Apache Beam Go SDK jobs in Google Cloud Dataflow. They had been working fine until recently, when they intermittently stopped working (no changes were made to code or config). The error that occurs is:

Failed to retrieve staged files: failed to retrieve worker in 3 attempts: bad MD5 for /var/opt/google/staged/worker: ..., want ; bad MD5 for /var/opt/google/staged/worker: ..., want ;

(Note: the error message seems to be missing the second hash value after "want".)

As best I can guess, something is wrong with the worker: it seems to be comparing MD5 hashes of the worker binary, and one of the values is missing. I don't know exactly what it's comparing against, though.

Does anybody know what could be causing this issue?

The fix seems to have been to rebuild the worker_harness_container_image with the latest changes. I had tried this earlier, but I didn't have the latest release when I built it locally. After I pulled the latest from the Beam repo, rebuilt the image (as per the notes at https://github.com/apache/beam/blob/master/sdks/CONTAINERS.md), and reran the job, it worked again.
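The rebuild flow roughly looks like the following. This is a sketch: the exact Gradle target and the locally built image name are in the CONTAINERS.md doc linked above and may differ by Beam version, and the `gcr.io/my-project/...` registry tag is illustrative.

```shell
# Get the latest Beam source (stale local checkouts were the problem here).
git clone https://github.com/apache/beam.git && cd beam

# Build the Go SDK worker container locally (target name per CONTAINERS.md;
# it has changed across Beam versions).
./gradlew :sdks:go:container:docker

# Retag and push it somewhere Dataflow can pull from, e.g. GCR
# (image and registry names below are illustrative).
docker tag apache/beam_go_sdk:latest gcr.io/my-project/beam_go_worker:latest
docker push gcr.io/my-project/beam_go_worker:latest

# Point the job at the freshly built image.
go run ./main.go \
  --runner=dataflow \
  --worker_harness_container_image=gcr.io/my-project/beam_go_worker:latest
```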

I'm seeing the same thing. Looking at the Stackdriver logs, I see this:

Handler for GET /v1.27/images/apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515/json returned error: No such image: apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515

However, I can pull the image just fine locally. Any ideas why Dataflow cannot pull it?
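For reference, this is the local check that succeeds, using the same tag Dataflow failed to resolve; that it works locally suggests the problem is on the Dataflow worker side rather than the registry itself:

```shell
# Pull the exact image the Dataflow worker reported as missing.
docker pull apache-docker-beam-snapshots-docker.bintray.io/beam/go:20180515

# Confirm it is present locally.
docker images | grep 'beam/go'
```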