Which is better: using SQL queries for data manipulation, or manipulating the data in arrays in PHP? [duplicate]

Possible Duplicate:
When to use a query or code

I have a large database of users and their details, and I want to find the people who are most similar to each other. Which approach is preferable for the comparison: writing SQL queries that do the manipulation inside the database and return only the filtered (matched) records, or retrieving the raw data into a PHP array first and then applying the comparison operations to that array? Let's say I have 1000 users, and I want to find the people with the most mutual friends.

UserA has {a,b,r,c,g,h,r,q,l}
UserB has {x,y,z}
UserC has {a,c,r,g,q}
.
.
.
UserN has{x,y....n}

So I want to find the users who have the most mutual friends with UserA. Would that mean comparing each element of UserA with every element of every other user's array across the whole database? In the above case UserC should be the most similar user to UserA. I don't know how to accomplish this.

Suppose you represent friendships in a table like this:

Table: friendships
from_id  |  to_id
----------------------------
      a  |      a
      a  |      b
      a  |      r
      a  |      c
 ...etc...
      b  |      x
      b  |      y
      b  |      z
 ...etc...

Now you can write a query to answer your question (MySQL syntax):

SELECT f2.from_id AS user_id_with_most_common_friends,
       COUNT(*) AS number_of_common_friends
FROM friendships f1
JOIN friendships f2 ON (f1.to_id = f2.to_id AND f2.from_id <> 'a')
WHERE f1.from_id = 'a'
GROUP BY f2.from_id
-- Sort by the count and keep the top row, so that the returned user_id
-- is the one the count actually belongs to.
ORDER BY number_of_common_friends DESC
LIMIT 1;
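To try the self-join on the sample data from the question, here is a minimal sketch. It uses Python with an in-memory SQLite database purely as a self-contained stand-in for PHP and MySQL, and it picks the top row with ORDER BY ... LIMIT 1. The names SAMPLE, build_db and most_similar are made up for illustration, the duplicate r in UserA's list is dropped, and the unique (from_id, to_id) key is an assumption about the schema.

```python
import sqlite3

# Friend lists from the question, keyed like the friendships table:
# 'a' is UserA, 'b' is UserB, 'c' is UserC. Each character is one friend id.
SAMPLE = {
    "a": "abrcghql",
    "b": "xyz",
    "c": "acrgq",
}

# Self-join on to_id: every matching pair of rows is one friend that
# :user and the other user have in common.
QUERY = """
SELECT f2.from_id AS user_id, COUNT(*) AS common
FROM friendships f1
JOIN friendships f2 ON f1.to_id = f2.to_id AND f2.from_id <> :user
WHERE f1.from_id = :user
GROUP BY f2.from_id
ORDER BY common DESC
LIMIT 1
"""


def build_db():
    """Create an in-memory database holding the sample friendships."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE friendships (from_id TEXT, to_id TEXT, "
        "PRIMARY KEY (from_id, to_id))"  # assumed: no duplicate pairs
    )
    rows = [(user, friend) for user, friends in SAMPLE.items() for friend in friends]
    conn.executemany("INSERT INTO friendships VALUES (?, ?)", rows)
    return conn


def most_similar(conn, user):
    """Return (user_id, common_friend_count), or None if nobody overlaps."""
    return conn.execute(QUERY, {"user": user}).fetchone()


if __name__ == "__main__":
    print(most_similar(build_db(), "a"))
```

On this data the query picks user c for user a, which matches the expectation in the question that UserC is the most similar to UserA.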

As to the performance question, empirical testing in your specific situation will give the most reliable answer, but for a large database I would expect using the above SQL query to be much faster than querying every row and calculating the results in PHP, for these reasons:

  1. If from_id and to_id are indexed, the RDBMS may not need to touch every row.
  2. Returning all records guarantees that every row is touched and makes the PHP code consume a lot of memory (which adds overhead for memory allocations, and will slow things to a crawl if you run out of RAM).
  3. Once your PHP code has the data, it is unlikely to perform the individual steps of the computation any faster than the database could.

Unless your PHP code has something up its sleeve to compensate for these disadvantages (like an algorithm with lower complexity that cannot be expressed in SQL, or a call out to some specialized C code, e.g. for image processing), it's typically going to be faster to keep as much of the work as possible inside the RDBMS.
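For contrast, this is roughly what the application-side alternative looks like. It is a sketch in Python rather than PHP, and most_similar_in_app is a hypothetical name. It assumes the whole friendships table has already been fetched as (from_id, to_id) pairs, which is exactly the cost described above: every row is transferred and held in memory before any comparison starts.

```python
from collections import defaultdict


def most_similar_in_app(rows, user):
    """Find the user sharing the most friends with `user`.

    rows: iterable of (from_id, to_id) pairs, i.e. the entire table.
    Returns (user_id, common_friend_count), or None if nobody overlaps.
    """
    friends = defaultdict(set)
    for from_id, to_id in rows:  # reads every row, whoever it belongs to
        friends[from_id].add(to_id)

    mine = friends.get(user, set())
    best = None
    for other, theirs in friends.items():
        if other == user:
            continue
        common = len(mine & theirs)  # set intersection = mutual friends
        if common and (best is None or common > best[1]):
            best = (other, common)
    return best


if __name__ == "__main__":
    rows = (
        [("a", f) for f in "abrcghql"]
        + [("b", f) for f in "xyz"]
        + [("c", f) for f in "acrgq"]
    )
    print(most_similar_in_app(rows, "a"))
```

The logic is short, but the amount of data it needs grows with the whole table, while the SQL version only has to return a single row.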