I have a table called tags with columns id
, vid_id
, name
where id is a random id generated with md5(uniqid());
vid_id is the id of the video that tag is associated with, and name is the tag name. If a video has 5 tags, they all get stored in the tags table. I realized lately this is a bad table design because I have many duplicate tags. I created another table tag_map. It has three columns id
, vid_id
, tag_id
. I want to basically implement the 'toxi' solution shown here http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html.
What I want to do is somehow transfer the data over from tags to tag_map, delete the vid_id column in tags and delete any extra entries in tags so each tag_map entry is mapped to only one tag entry. Anyone know an efficient way to do this? I keep thinking:
$sql = 'SELECT * FROM tags';
$stmt3 = $conn->prepare($sql);
$result=$stmt3->execute();
while ($row = $stmt3->fetch(PDO::FETCH_ASSOC)) {
$id=md5(uniqid());
$sql = 'INSERT INTO tag_map VALUES (?,?,?)';
$insert = $conn->prepare($sql);
$result=$insert->execute(array($id,$row['vid_id'],$row['id']));
}
but then I get confused when I try to think of how I am going to remove the extra tags in table tags and map each tag_map entry to only one tags entry. Any advice would be greatly appreciated.
If I understand your post, then what you're after is a "Junction table" or "join table." (Upon re-reading, yes, this is the "Toxi" solution you're referring to. Not sure where it got the name "Toxi" but whatever.)
http://en.wikipedia.org/wiki/Junction_table
You want to have a table that has exactly one row for a given video. And another table that has exactly one row for each tag. Then you need a third table in between that has one row for every combination of video and tag. That way, you can query the third join table for all of the videos that match a given tag, or all the tags that match a given video.
I would create a new table "tag" and a table "video_tag".
foreach ( video_record in the video table ) {
existingTags = read in list of tags for video_record from existing_tags_table
foreach ( tag in existingTags ) {
newTag = read tag from new tag table
if ( newTag == null ) {
save tag in new table
newTag.name = existing tag name
newTag.id = id from save procedure above
}
save entry in video_tag( video.id, newTag.id )
}
}
Then delete the old tag table. If you want, do some renaming of the new tag tables.
One thing you could do is set the vid_id
and tag_id
pair as the primary key in your new tag_map
table. That way, when you go through the algorithm you have listed above, duplicate map entries will not be created, but you'll still have one for each tag. Then, you can run a query to delete duplicates in the tags
table like this one:
DELETE FROM TAGS WHERE ID NOT IN (SELECT tag_id FROM TAG_MAP)