So I've been looking into building an application that relies on something similar to a message bus. The idea is to be extremely fault tolerant. I have a queue of tasks that need to be performed, and here are the steps that I believe would be the end goal in a queue-based system.
One of two things happens now:
- OR -
I was looking into different solutions, and I see a lot of people using Redis as a backend for this, but its queue is fairly simplistic. For example, RPOPLPUSH will remove the item from the queue as soon as a worker takes it. What happens if the server crashes? The queue now thinks it processed that item, and we have a lost task.
What steps are recommended for ensuring tasks complete, and for recording task failures so they can be reprocessed by another server? I intend to write the tasks in Go and I'm open to using cloud services such as AWS.
Redis is a basic component on which you can build a queuing system. That said, implementing a true guaranteed delivery system on top of Redis is not trivial, especially if you need transactional behavior.
Here are some queuing systems implemented with Redis in various languages:
Similar things could be developed in Go, but when it comes to true guaranteed delivery semantics, the devil is in the details.
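To give an idea of where those details hide, here is a minimal in-memory sketch of the "reliable queue" pattern that RPOPLPUSH is usually used for: the item is moved to a second "processing" list instead of being dropped, and it is only removed from there once the worker acknowledges it. This is not Redis client code. It uses plain Go slices as a stand-in for the two Redis lists, and the names `ReliableQueue`, `Claim`, `Ack` and `Recover` are made up for the illustration. `Claim` plays the role of RPOPLPUSH and `Ack` the role of LREM.

```go
package main

import (
	"fmt"
	"sync"
)

// ReliableQueue is an in-memory stand-in for two Redis lists:
// pending (waiting tasks) and processing (claimed but not yet acknowledged).
type ReliableQueue struct {
	mu         sync.Mutex
	pending    []string
	processing []string
}

// Push appends a task to the pending list.
func (q *ReliableQueue) Push(task string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.pending = append(q.pending, task)
}

// Claim moves the oldest pending task to the processing list in one step,
// so a task is never in neither list. It mirrors RPOPLPUSH pending processing.
func (q *ReliableQueue) Claim() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.pending) == 0 {
		return "", false
	}
	task := q.pending[0]
	q.pending = q.pending[1:]
	q.processing = append(q.processing, task)
	return task, true
}

// Ack removes a finished task from the processing list (the LREM step).
// It reports whether the task was found.
func (q *ReliableQueue) Ack(task string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	for i, t := range q.processing {
		if t == task {
			q.processing = append(q.processing[:i], q.processing[i+1:]...)
			return true
		}
	}
	return false
}

// Recover moves every unacknowledged task back to the end of pending and
// returns how many were moved. Assumption: it is only called when the workers
// holding those tasks are known to be dead; a real system needs timeouts or
// heartbeats to decide that.
func (q *ReliableQueue) Recover() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	n := len(q.processing)
	q.pending = append(q.pending, q.processing...)
	q.processing = nil
	return n
}

func main() {
	q := &ReliableQueue{}
	q.Push("resize-image-1")
	q.Push("send-email-2")

	task, _ := q.Claim() // a worker claims a task, then "crashes" without acking
	fmt.Println("claimed:", task)

	fmt.Println("recovered:", q.Recover()) // the unacked task is queued again

	task, _ = q.Claim()
	fmt.Println("claimed:", task)
	q.Ack(task) // this worker finishes, so the task is gone for good
}
```

The hard part is everything this sketch leaves out: deciding when a worker is really dead, avoiding double processing when it was only slow, and keeping the whole thing consistent across a Redis failover.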
You will probably be better served by a dedicated queuing system, such as RabbitMQ or ActiveMQ. While they are more complex, they offer more features, and probably better guarantees.
Here is a Go client for RabbitMQ: https://github.com/streadway/amqp
You might also be interested in looking at disque (a dedicated queuing solution from the author of Redis), and the corresponding Go client at https://github.com/EverythingMe/go-disque
Finally, beanstalkd is another lightweight solution; you can find the Go client at: https://github.com/kr/beanstalk
I'm probably going to state the obvious here, but SQS (http://aws.amazon.com/sqs/) gives you what you need out of the box. You don't need to worry about managing the queueing system, it will scale automatically for you, and you can focus on writing the application.
You push messages to the queue. Workers pull them from the queue, process them, and ack each message when done (in SQS, acking means deleting the message). If a worker doesn't ack a message before the visibility timeout you specify, the message becomes visible again and will be delivered to another worker.
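To make the ack and timeout behavior concrete, here is a small in-memory model of it in Go. It is only a sketch of the semantics, not SQS client code: `VisibilityQueue`, `Send`, `Receive` and `Delete` are hypothetical names that loosely correspond to the SQS SendMessage, ReceiveMessage and DeleteMessage calls, and time is passed in explicitly as "elapsed time since start" so the example is deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// message is one queued task plus the elapsed time at which it may be
// delivered (again).
type message struct {
	id        int
	body      string
	visibleAt time.Duration
}

// VisibilityQueue models a queue with a visibility timeout: a received
// message is hidden for a while and reappears unless it is deleted.
type VisibilityQueue struct {
	mu      sync.Mutex
	timeout time.Duration
	nextID  int
	msgs    []*message
}

func NewVisibilityQueue(timeout time.Duration) *VisibilityQueue {
	return &VisibilityQueue{timeout: timeout}
}

// Send enqueues a message and returns its id.
func (q *VisibilityQueue) Send(body string) int {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.nextID++
	q.msgs = append(q.msgs, &message{id: q.nextID, body: body})
	return q.nextID
}

// Receive returns the first message visible at time now and hides it
// until now + timeout. The message stays in the queue.
func (q *VisibilityQueue) Receive(now time.Duration) (id int, body string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	for _, m := range q.msgs {
		if m.visibleAt <= now {
			m.visibleAt = now + q.timeout
			return m.id, m.body, true
		}
	}
	return 0, "", false
}

// Delete is the "ack": it removes the message permanently.
func (q *VisibilityQueue) Delete(id int) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	for i, m := range q.msgs {
		if m.id == id {
			q.msgs = append(q.msgs[:i], q.msgs[i+1:]...)
			return true
		}
	}
	return false
}

func main() {
	q := NewVisibilityQueue(30 * time.Second)
	q.Send("task-1")

	_, body, _ := q.Receive(0)
	fmt.Println("worker A got", body) // worker A crashes and never deletes it

	if _, _, ok := q.Receive(10 * time.Second); !ok {
		fmt.Println("nothing visible at 10s")
	}

	id, body, _ := q.Receive(31 * time.Second)
	fmt.Println("worker B got", body)
	q.Delete(id) // worker B finishes and acks
}
```

Note that a message can be delivered more than once under this model (for example when a worker is slow rather than dead), so the task handlers should be idempotent.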