I have a problem that I haven't been able to solve in an efficient way. I need to dynamically generate a list of reps for each of the 50 states in the USA. Each state has a variable number of reps that is equal to one or more and which will fluctuate. Each rep can potentially show up in more than one state. To make things more complex, several states are divided into non-standardized regions (e.g. "North", "Upstate", "Central", etc. but not consistent nomenclature across all states). The final product will produce a series of tables similar to what can be seen at https://www.compatico.com/rep-tables.php.
Right now, I'm envisioning two database tables:
**StateTable_example1**
state_id | state_name | state_reps
MI Michigan 1,2,3,4
OH Ohio 3,4,5,6,7
IN Indiana 2,4,5
**RepTable_example1**
rep_id | rep_fname | replname | (more columns not relevant here)
1 John Smith
2 Jane Doe
etc.
One idea that I have to handle regional data is to tack on extra data in the StateTable.state_reps column, sorta like this:
**StateTable_example2**
state_id | state_name | state_reps
MI Michigan 1-Upper Peninsula,2-Lower Peninsula,3-Western
OH Ohio 3-Northern,4-Southern
IN Indiana 2-Northern,4-Eastern,5-Western
That would allow one row in the RepTable per rep while still allowing for regional variations... which would be great for updating RepTable data for stuff like phone numbers, emails, fax numbers, etc... but I can't begin to wrap my head around how to write a single query in php/mysql to handle that efficiently.
My other idea is to add a column to the RepTable to handle region and then have multiple rows for each rep if they have a position in more than one state... like this:
**StateTable_example3**
state_id | state_name | state_reps
MI Michigan 1,2,3
OH Ohio 4,5
IN Indiana 6,7,8
**RepTable_example3**
rep_id | rep_region | rep_fname | replname | (more columns not relevant here)
1 Upper Peninsula John Smith
2 Lower Peninsula Jane Doe
3 Western Tom Jones
4 Northern Tom Jones
5 Southern Amy Wong
6 Northern Jane Doe
7 Eastern Amy Wong
8 Western Jim Johnson
I think that the php/mysql statement for this method would be cleaner but then there's the question of updating rep information (which changes frequently) when each rep is represented by multiple rows.
I've thought about a third table, to map the reps to states, but I can't figure out how to do that in a way that allows for disparate regions for each rep (e.g., in the above data Amy Wong's region is "Southern" for Ohio but "Eastern" for Indiana).
Anyone know of a way to do this efficiently, without multiple calls to the database? Any help would be much appreciated!
*****EDIT**** The final output, while resembling the tables in the link above, will actually be used as input into another script with graphically maps the reps. The table data becomes tooltips. The actual final output will look like this: https://www.compatico.com/find-a-rep.php (but with more region data, as that's currently lacking).
*****EDIT_2***** I renamed my example tables to clarify.
Here's what you should have for tables:
Rep
---
rep_id
fname
lname
State
----------
state_code
name
Region
----------
region_code
name
state_code (fk)
RepRegion
---------
region_code
rep_id
Just based on your report spec, a query like this should be pretty close to what you would need -- very standard joins and an order by to get things listed in the order you want.
The only trick to making this work, is that for each state where you need it, you should have either an 'ALL' or 'NONE' region, or you could include a tinyint flag named something like 'noRegion' which is 0 by default. In other words, you need at least one region for every state, even if the state has no regions. You only need an 'ALL/None' region if there are either no regions for the state, or there are regions and reps, but you also have reps who are not tied to a specific region.
For reps that span regions, states or any combination, you need to have a RepRegion row for that Rep.
Reps who are not part of a Region would be designated as such, by the fact that they are attached to one of these special "no region" regions.
Of course this is a result set, so your report assumes some procedural product to handle the format and understand when to add a new section header by state, or subheading by region.
At that point you can join from RepRegion to Rep & State, and you will easily create the report in your sample.
SELECT rr.*, rp.*, re.*, s.*
FROM RepRegion rr
JOIN Rep rp ON (rp.rep_id = rr.rep_id)
JOIN Region re ON (re.region_code = rr.region_code)
JOIN State s ON (s.state_code = re.state_code)
ORDER BY s.name, r.name, rp.lname, rp.fname
Normalize your data as follows. The StateTable
has information just for each state:
state_id | state_name
MI Michigan
OH Ohio
IN Indiana
The RepTable
table can remain as is. And finally create a new junction table StateToRepTable
which relates states to its individual reps:
state_id | state_reps
MI 1
MI 2
MI 3
MI 4
OH 3
... ...
The motivation for avoiding the CSV data in your current StateRep
table is that CSV data makes it difficult to do joins. MySQL actually does provide one function, FIND_IN_SET()
which can partially alleviate this problem, but it is limited. Also, maintaining and updating a CSV list is error prone and not easy. By getting each state rep on a separate record, we let the database do for what it was intended: operate over groups of records.