Let us Imagine the Facebook homepage. There is a list of posts, I report a post, and that post is blocked.
So, in the PHP & Mysql backend, I would do something like.
reported_posts
= MySQL GROUP_CONCAT(reported_post_id)
and fetch all my reported posts, store it in some cache like memcached or redis. This will give me a response with comma separated post_ids like 123, 234, 45
homepage_posts
which are NOT IN (reported_posts)
. This will give us all the post_ids
that needs to be in the homepage other than the posts, 123, 234 and 45, as I have used NOT IN
.The issue here is that, as time goes by, the reported_posts
will keep on increasing(lets assume it increases 1000 ids). At that time, the NOT IN (reported_posts)
clause will take a huge input. Will this effect the performance of the query? What is an alternative solution to this?
I hope I could convey my doubt clearly, please let me know if it needs more clarification, I would edit as such. Thank you.
EDIT
The Reported post is not to be considered Globally, i.e. If I report the post, it should be Hidden only for me, and not for anyone else. So, it's also dependent on the account_id as well.
Assuming that reported_posts
contains a list of user-specific blacklisted posts, it would be much better to do an exclusive left join and let the database handle everything:
SELECT *
FROM homepage_posts hp
LEFT JOIN
reported_posts rp
ON hp.id = rp.post_id
AND rp.account_id = 123
WHERE
rp.id IS NULL
In mysql "IN" operator works fine if the column is indexed. If that column is not indexed then it impacts performance.