I am building an admin panel that displays real-time tweets; only tweets approved by an administrator will be shown on the website. The fetching part is done: a cron job runs my script every hour, fetches the latest tweets from Twitter, and saves them in the DB. The problem is that when the next batch runs an hour later, the results may be all new, partly new, or not new at all, so tweets that are already stored get saved again as duplicates, which I don't want. How can I stop duplicate tweets from being saved to the DB on each hourly run? I am using PHP and MySQL.
You can keep a `hash` field in the DB that stores the result of `md5($author.$tweet.$datetime)`. When the cron task runs, check every tweet in the fetched list against the DB for a record with a matching `hash` value, and save only the tweets that have no match.
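A minimal sketch of that check, assuming a PDO connection and a hypothetical `tweets` table with `hash`, `author`, `tweet` and `created_at` columns. The function names and the array keys (`author`, `tweet`, `datetime`) are illustrative and not taken from the question:

```php
<?php
// Hypothetical helper: builds the hash from the three fields named above.
function tweetHash(string $author, string $tweet, string $datetime): string
{
    return md5($author . $tweet . $datetime);
}

// Hypothetical helper: saves only tweets whose hash is not in the DB yet.
// Assumes each element of $tweets is an array with the keys
// 'author', 'tweet' and 'datetime'. Returns the number of rows inserted.
function saveNewTweets(PDO $pdo, array $tweets): int
{
    $exists = $pdo->prepare('SELECT 1 FROM tweets WHERE hash = ? LIMIT 1');
    $insert = $pdo->prepare(
        'INSERT INTO tweets (hash, author, tweet, created_at) VALUES (?, ?, ?, ?)'
    );

    $saved = 0;
    foreach ($tweets as $t) {
        $hash = tweetHash($t['author'], $t['tweet'], $t['datetime']);

        // Look for an existing record with the same hash.
        $exists->execute([$hash]);
        $found = $exists->fetchColumn();
        $exists->closeCursor();

        if ($found !== false) {
            continue; // already stored by an earlier cron run
        }

        $insert->execute([$hash, $t['author'], $t['tweet'], $t['datetime']]);
        $saved++;
    }

    return $saved;
}
```

Adding a UNIQUE index on the `hash` column would also make MySQL itself reject a duplicate row, which helps if two cron runs ever overlap.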