Sorting a list of dictionaries by the most similar values

Given the following Python list of dictionaries:

results = [[{'id': '001', 'result': [0,0,0,0,1]},
           {'id': '002', 'result': [1,1,1,1,1]},
           {'id': '003', 'result': [0,1,1,None,None]},
           {'id': '004', 'result': [0,None,None,1,0]},
           {'id': '005', 'result': [1,0,None,1,1]},
           {'id': '006', 'result': [0,0,0,1,1]}],
          [{'id': '001', 'result': [1,0,1,0,1]},
           {'id': '002', 'result': [1,1,1,1,1]},
           {'id': '003', 'result': [0,1,1,None,None]},
           {'id': '004', 'result': [0,None,None,1,0]},
           {'id': '005', 'result': [1,0,None,1,1]},
           {'id': '006', 'result': [1,0,1,0,1]}]
          ]

I would like to generate a new sorted list (in both Python and Go) based on the values of 'result': compare the results of the players ('id') within each group, then sort the players by the number of matching entries (None results are discarded and not counted).

Across the first and second rounds, 001 and 006 had nine matching answers in total.
During the first round, 001 and 006 had four matching answers:
001 = [0,0,0,0,1] 006 = [0,0,0,1,1]
During the second round, 001 and 006 had five matching answers:
001 = [1,0,1,0,1] 006 = [1,0,1,0,1]

sorted_results = ['001','006','002','005','003','004']

'001' and '006' are the first two items in the list because they have the highest number of matching results - nine.
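
The pairwise count described above can be sketched in Python as follows. This is only a minimal sketch: the helper name count_matches is hypothetical, and it assumes that a position counts as a match only when both answers are equal and not None.

```python
def count_matches(a, b):
    """Count positions where two result lists agree, skipping None entries."""
    return sum(1 for x, y in zip(a, b) if x is not None and x == y)


# Results of players 001 and 006, copied from the two rounds above.
round_1 = count_matches([0, 0, 0, 0, 1], [0, 0, 0, 1, 1])
round_2 = count_matches([1, 0, 1, 0, 1], [1, 0, 1, 0, 1])
total = round_1 + round_2
print(round_1, round_2, total)  # 4 5 9
```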

If you sort those items by the "highest number of identical results" within each player's own list (in ascending order, using the first group only), this is what you get:

['003', '004', '005', '006', '001', '002']

If you meant something else (i.e. not "highest number of identical results"), please clarify your question. You can also modify the max_identical function so that it follows your own definition of "similar".

The above result was computed with:

from collections import defaultdict


results = [{'id': '001', 'result': [0, 0, 0, 0, 1]},
           {'id': '002', 'result': [1, 1, 1, 1, 1]},
           {'id': '003', 'result': [0, 1, 1, None, None]},
           {'id': '004', 'result': [0, None, None, 1, 0]},
           {'id': '005', 'result': [1, 0, None, 1, 1]},
           {'id': '006', 'result': [0, 0, 0, 1, 1]}]


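def max_identical(lst):
    # Count how often each non-None value occurs in the list.
    counts = defaultdict(int)
    for x in lst:
        if x is not None:
            counts[x] += 1
    # Size of the largest group of identical values (0 if every entry is None).
    return max(counts.values(), default=0)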


results = sorted(results, key=lambda x: max_identical(x['result']))

print([x['id'] for x in results])
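
As one possible way of adapting the key function to a pairwise notion of similarity, the sketch below ranks each player by the highest number of non-None answers shared with any other player. The names count_matches and best_pairwise_match are hypothetical, and, like the code above, the sketch only looks at the first group of results.

```python
def count_matches(a, b):
    """Count positions where two result lists agree, skipping None entries."""
    return sum(1 for x, y in zip(a, b) if x is not None and x == y)


def best_pairwise_match(player, players):
    """Highest match count between `player` and any other player."""
    return max(count_matches(player['result'], other['result'])
               for other in players if other['id'] != player['id'])


group = [{'id': '001', 'result': [0, 0, 0, 0, 1]},
         {'id': '002', 'result': [1, 1, 1, 1, 1]},
         {'id': '003', 'result': [0, 1, 1, None, None]},
         {'id': '004', 'result': [0, None, None, 1, 0]},
         {'id': '005', 'result': [1, 0, None, 1, 1]},
         {'id': '006', 'result': [0, 0, 0, 1, 1]}]

# Highest score first; players with equal scores keep their original order.
ranked = sorted(group, key=lambda p: best_pairwise_match(p, group), reverse=True)
ids = [p['id'] for p in ranked]
print(ids)  # ['001', '006', '002', '005', '003', '004']
```

For this single group the order happens to agree with the sorted_results list from the question, but this is a different criterion from summing matches over several rounds.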

While looking for a solution to a problem very similar to yours, I found this page: http://w3facility.org/question/sorting-a-python-dictionary-after-running-an-itertools-function/

Using your example:

import itertools
results = [[{'id': '001', 'result': [0,0,0,0,1]},
           {'id': '002', 'result': [1,1,1,1,1]},
           {'id': '003', 'result': [0,1,1,None,None]},
           {'id': '004', 'result': [0,None,None,1,0]},
           {'id': '005', 'result': [1,0,None,1,1]},
           {'id': '006', 'result': [0,0,0,1,1]}],
          [{'id': '001', 'result': [1,0,1,0,1]},
           {'id': '002', 'result': [1,1,1,1,1]},
           {'id': '003', 'result': [0,1,1,None,None]},
           {'id': '004', 'result': [0,None,None,1,0]},
           {'id': '005', 'result': [1,0,None,1,1]},
           {'id': '006', 'result': [1,0,1,0,1]}]
          ]

This creates an all-vs-all comparison of the ids, one for each round, and adds up the number of matches per pair.

similarity = {}
for round_results in results:
    for p1, p2 in itertools.combinations(round_results, 2):
        pair = (p1["id"], p2["id"])
        # Count positions where both players gave the same answer; None is skipped.
        matches = sum(1 for a, b in zip(p1["result"], p2["result"])
                      if a is not None and a == b)
        # Accumulate the matches of this pair over all rounds.
        similarity[pair] = similarity.get(pair, 0) + matches

Next, sort the id pairs by their number of matches in descending order; this returns a list of id tuples.

similarity = sorted(similarity, key=lambda x:similarity[x], reverse=True)
print(similarity)
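
If the match counts are still needed after sorting, one option is to sort the dictionary's items rather than its keys. The sketch below uses a small subset of pair totals, computed by hand from the two rounds above with None answers skipped; the names pair_totals and ranked_pairs are hypothetical.

```python
# Hand-computed totals for three pairs over both rounds (illustrative subset).
pair_totals = {("001", "002"): 4, ("001", "006"): 9, ("002", "005"): 6}

# Sort (pair, count) items by count, highest first, so the counts are kept.
ranked_pairs = sorted(pair_totals.items(), key=lambda item: item[1], reverse=True)
for pair, count in ranked_pairs:
    print(pair, count)
```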

Now, to eliminate the duplicates, it is enough to keep the first occurrence of each id, in that order, and discard the rest.

sorted_ids = []
for tuple_id in similarity:
    if tuple_id[0] not in sorted_ids:
        sorted_ids.append(tuple_id[0])
    if tuple_id[1] not in sorted_ids:
        sorted_ids.append(tuple_id[1])

print(sorted_ids)
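
Putting the steps above together, the whole procedure could be wrapped in a single function. This is a sketch under stated assumptions: the names count_matches and rank_players are hypothetical, None answers are skipped when counting, dictionaries keep insertion order (Python 3.7+), and tied pairs stay in the order in which itertools.combinations produced them.

```python
import itertools


def count_matches(a, b):
    """Count positions where two result lists agree, skipping None entries."""
    return sum(1 for x, y in zip(a, b) if x is not None and x == y)


def rank_players(rounds):
    """Order player ids by their best pairwise match total over all rounds."""
    totals = {}
    for group in rounds:
        for p1, p2 in itertools.combinations(group, 2):
            pair = (p1["id"], p2["id"])
            totals[pair] = totals.get(pair, 0) + count_matches(p1["result"], p2["result"])
    ranked = []
    # Walk the pairs from most to fewest matches, keeping each id's first occurrence.
    for pair in sorted(totals, key=totals.get, reverse=True):
        for player_id in pair:
            if player_id not in ranked:
                ranked.append(player_id)
    return ranked


rounds = [[{'id': '001', 'result': [0, 0, 0, 0, 1]},
           {'id': '002', 'result': [1, 1, 1, 1, 1]},
           {'id': '003', 'result': [0, 1, 1, None, None]},
           {'id': '004', 'result': [0, None, None, 1, 0]},
           {'id': '005', 'result': [1, 0, None, 1, 1]},
           {'id': '006', 'result': [0, 0, 0, 1, 1]}],
          [{'id': '001', 'result': [1, 0, 1, 0, 1]},
           {'id': '002', 'result': [1, 1, 1, 1, 1]},
           {'id': '003', 'result': [0, 1, 1, None, None]},
           {'id': '004', 'result': [0, None, None, 1, 0]},
           {'id': '005', 'result': [1, 0, None, 1, 1]},
           {'id': '006', 'result': [1, 0, 1, 0, 1]}]]

print(rank_players(rounds))  # ['001', '006', '002', '005', '003', '004']
```

Because the function loops over every group it is given, it is not tied to exactly two rounds.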