Algorithm for recursively linking objects

I'm maintaining a small program that goes through documents in a Neo4j database and dumps a JSON-encoded object to a document database. In Neo4j (for performance reasons, I imagine) there's no real data, just IDs.

Imagine something like this:

posts:
    post:
        id: 1
        tags: 1, 2
        author: 2
        similar: 1, 2, 3

I have no idea why it was done like this, but this is what I have to deal with. The program then uses the IDs to fetch the information behind each field, resulting in a proper structure. Instead of author being just an int, it's an Author object, with name, email, and so on.

This worked well until the similar feature was added. Similar consists of IDs referencing other posts. Since I'm building the actual post objects in my loop, how can I reference them efficiently? The only thing I could think of was keeping a cache of the posts I have already "converted" and, if a referenced ID is not in the cache, moving the current post to the end of the list. Eventually, they will all be processed.

The approach you're proposing won't work if there are cycles of similar relationships, which there probably are.

For example, you've shown a post 1 that is similar to a post 2. Suppose post 2 also lists post 1 as similar, and you come across post 1 first. It refers to post 2, which isn't in the cache yet, so you push post 1 back onto the end of the queue. Now you get to post 2. It refers to post 1, which isn't in the cache yet, so you push post 2 back onto the end of the queue. This goes on forever.

You can solve this problem by building the post objects in two passes. During the first pass, you make Post objects and fill them with all the information except for the similar references, and you build up a map[int]*Post that maps ID numbers to posts. On the second pass, for each post, you iterate over the similar ID numbers, look up each one in the map, and use the resulting *Post values to fill a []*Post slice of similar posts.
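The two passes could be sketched roughly as follows. The RawPost and Post types and the LinkPosts function are hypothetical names invented for this illustration, not taken from your program, and the author and tag lookups are left out to keep the focus on the similar references.

```go
package main

import "fmt"

// RawPost is a hypothetical shape for a record as read from Neo4j: IDs only.
type RawPost struct {
	ID      int
	Author  int
	Tags    []int
	Similar []int
}

// Post is the fully built object. Author and tag fields are omitted for brevity.
type Post struct {
	ID      int
	Similar []*Post
}

// LinkPosts builds the posts in two passes, so cycles in Similar are harmless.
func LinkPosts(raws []RawPost) map[int]*Post {
	// Pass 1: create every Post and index it by ID, leaving Similar empty.
	byID := make(map[int]*Post, len(raws))
	for _, r := range raws {
		byID[r.ID] = &Post{ID: r.ID}
	}

	// Pass 2: every Post now exists, so each similar ID can be resolved.
	for _, r := range raws {
		p := byID[r.ID]
		for _, id := range r.Similar {
			// Assumption: IDs with no matching post are silently skipped.
			if other, ok := byID[id]; ok {
				p.Similar = append(p.Similar, other)
			}
		}
	}
	return byID
}

func main() {
	raws := []RawPost{
		{ID: 1, Similar: []int{1, 2, 3}}, // refers to itself and to post 2
		{ID: 2, Similar: []int{1}},       // refers back to post 1: a cycle
		{ID: 3},
	}
	posts := LinkPosts(raws)
	for _, id := range []int{1, 2, 3} {
		fmt.Printf("post %d has %d similar posts\n", id, len(posts[id].Similar))
	}
}
```

Because the map is complete before any reference is resolved, the order in which posts are visited does not matter, and a post that lists itself is handled like any other. One thing to watch: the linked posts now form a pointer cycle, so when writing them out as JSON you will probably want to emit the similar posts as IDs or shallow summaries rather than nesting the full objects.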