too long

I will break this into two parts:

  1. I have a user with a list of product 'features' that he desires. The user is asked to rank these features from 1-10. I also have a table full of products. Each of these products has various features. Some products will have all features 1-10, but some will only have a few of them. I'd like to determine which product matches the user's ranking the best, and provide an ordered result set to the user, with the best match first and the worst match last.

  2. The aforementioned query is actually just one part of what the user is doing. The user is also providing other search criteria (e.g. product category, price, etc) along with the feature ranking. So, I'd like the feature matching to only apply to products that pass through this set of filters. How would I combine these?

I have found some similar questions like this and this, but they aren't close enough for me to figure out how to apply those solutions to my situation.

The tables involved look something like this:

products (id,product)

features (id, feature)

product_features (id, product_id, feature_id)

and the user will be submitting his search through an array like this (where the keys are actually feature ids, and the rank is the value):

[users_features] => Array
    (
        [1] => Array
            (
                [rank] => 9
            )

        [2] => Array
            (
                [rank] => 1
            )

        [3] => Array
            (
                [rank] => 3
            )

(this will be a 10 item array right now, but could change in the future)

    )

I can't quite get this to work using SQL only but a simple PHP solution would be similar to the following (untested):

//Array of $productId => list of feature ids
$products;
$userFeatures;
$scores = array();
//For every product work out a score based on features and user ranking.
foreach($products as $productId => $prodFeatures){
    $score = 0;
    foreach($prodFeatures as $feature){
        //Could also perhaps penalise products lacking features.
        $score += $userFeatures[$feature]["rank"];
    }
    $scores[$productId] = $score;
}
arsort($scores);
echo "Best match is ".$scores[0];

Obviously this is a bit rough and ready but hopefully it helps.

Edit: This assumes that a ranking of 10 is the best.

A solution is to divide 1 by the user rank. For example 1/R. Then sum all the possibilities and sort the list in descendant order. 1/R is also the chance to not pick this product from the list. When you need to normalize the value or make the sum of the values equal to 1 you need to multiply the value with the sum of all the reziprokes of the values. For example r1=30 and r2=15. 1/(1/30 + 1/15) = 10, so P1 = 10 * 1/30 = 1/3 and P2 = 10 * 1/15 = 2/3. When you normalize the value it also means that a product with less attributes is also less likely.