H.264: parse the NAL unit and then decode, or interleave parsing and decoding?

I'm trying to work out how H.264 is intended to be decoded. Specifically, is it a matter of first parsing a NAL unit and its contents and then decoding, or are parsing and decoding meant to be intertwined? That is, parse one syntax element, perform a decoding action, parse the next element, perform another decoding action, and so on.

I've had a look at the specification, and it doesn't say much on this. I've also had a look at the reference implementation, and at least on the surface it seems that decoding is done while parsing occurs. Intuitively, performing decoding actions as needed while parsing seems like it would be faster, but I like the modularity that could be achieved by parsing the entire NAL unit and then running the decoding process on the resulting struct.
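To make the two options concrete, here is a minimal Python sketch of what I mean. It is a toy, not H.264: the payload layout (a count followed by that many values) and the "decoding" step (a running sum) are assumptions I made up for illustration, and the names `BitReader`, `parse_all`, `decode`, `parse_then_decode` and `interleaved` are hypothetical. The only part taken from the standard is the unsigned Exp-Golomb code, `ue(v)`.

```python
class BitReader:
    """Reads bits MSB-first from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, in bits

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_ue(self) -> int:
        """Unsigned Exp-Golomb code, ue(v) in the H.264 spec."""
        zeros = 0
        while self.read_bit() == 0:
            zeros += 1
        suffix = 0
        for _ in range(zeros):
            suffix = (suffix << 1) | self.read_bit()
        return (1 << zeros) - 1 + suffix


# Option 1: parse everything into a plain structure, then decode it.
def parse_all(reader: BitReader) -> list[int]:
    count = reader.read_ue()  # toy layout: element count comes first
    return [reader.read_ue() for _ in range(count)]


def decode(elements: list[int]) -> list[int]:
    # Stand-in for a real decoding process: a running sum.
    out, acc = [], 0
    for value in elements:
        acc += value
        out.append(acc)
    return out


def parse_then_decode(data: bytes) -> list[int]:
    return decode(parse_all(BitReader(data)))


# Option 2: decode each syntax element as soon as it is parsed.
def interleaved(data: bytes) -> list[int]:
    reader = BitReader(data)
    count = reader.read_ue()
    out, acc = [], 0
    for _ in range(count):
        acc += reader.read_ue()  # parse one element, decode immediately
        out.append(acc)
    return out
```

Both functions give the same result on the same bytes. The difference is that option 1 keeps the parsed elements around as an intermediate structure that can be inspected or tested on its own, while option 2 never materialises it.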

Would love to hear some opinions on this! Cheers