I need to push the output of a REST API call into Kafka. The REST API returns JSON output that has supporting information along with the actual data, which I decode into a json.RawMessage:
type Response struct {
	RequestID     string `json:"requestId"`
	Success       bool   `json:"success"`
	NextPageToken string `json:"nextPageToken,omitempty"`
	MoreResult    bool   `json:"moreResult,omitempty"`
	Errors        []struct {
		Code    string `json:"code"`
		Message string `json:"message"`
	} `json:"errors,omitempty"`
	Result   json.RawMessage `json:"result,omitempty"`
	Warnings []struct {
		Code    string `json:"code"`
		Message string `json:"message"`
	} `json:"warning,omitempty"`
}
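For reference, the struct above gets filled in roughly like this (a minimal sketch only; the endpoint URL and the fetchResponse helper are placeholders, not my actual code):

import (
	"encoding/json"
	"net/http"
)

func fetchResponse(url string) (*Response, error) {
	// Call the REST API; url is a placeholder for the real endpoint.
	httpResp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer httpResp.Body.Close()

	// Decode the envelope; Result stays as raw JSON bytes.
	var r Response
	if err := json.NewDecoder(httpResp.Body).Decode(&r); err != nil {
		return nil, err
	}
	return &r, nil
}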
The json.RawMessage holds the data for 200 records.
Questions:
1. As a producer, should I put the whole raw message onto the Kafka topic as one message, or unmarshal (parse) the raw JSON and produce each record as its own message (in that case there would be 200 messages)?
2. If I unmarshal (parse) it, the data will not be in JSON format anymore.
I'm not providing any code here... my code can be in Go or Python.
The end consumer of the topic is Spark or a custom program that reads the data from the topic and pushes it to another system.
Please let me know what the best design/approach is.
Thanks
There's no other answer than a great big "It Depends" :)
It Depends on what you're doing with the data ("push to another system" is just a step on the way to doing something with it), and on the semantic and business meaning of the data.
If each of your 200 records means something on its own, independent of the other records, then unbundling them and putting them onto Kafka as individual messages makes sense.
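If you go the unbundling route, note that each record can stay valid JSON: unmarshal Result into a []json.RawMessage and produce each element as its own message, so nothing has to be re-serialized. A minimal Go sketch, assuming Result is a JSON array and using the segmentio/kafka-go client (the broker address and topic name are placeholders):

import (
	"context"
	"encoding/json"

	"github.com/segmentio/kafka-go"
)

func produceRecords(ctx context.Context, resp *Response) error {
	// Each element of records is still raw JSON bytes, so no re-marshalling is needed.
	var records []json.RawMessage
	if err := json.Unmarshal(resp.Result, &records); err != nil {
		return err
	}

	w := &kafka.Writer{
		Addr:  kafka.TCP("localhost:9092"), // placeholder broker address
		Topic: "api-records",               // placeholder topic name
	}
	defer w.Close()

	msgs := make([]kafka.Message, 0, len(records))
	for _, rec := range records {
		msgs = append(msgs, kafka.Message{Value: rec})
	}
	return w.WriteMessages(ctx, msgs...)
}

Producing per record also lets the Spark (or custom) consumer treat each Kafka message as one record, and if per-entity ordering matters you could set a message key (for example a record ID) so related records land in the same partition.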