I have a bunch of rows in a Google Spreadsheet that look as such:
a,b,c,d
a,d
c,d
b,d,f
a,b,f,g,h,i
q,b,b
...And on.
I need a way to create a unique list of related pairs in this format:
a,b
a,c
a,d
b,c
b,d
c,d
b,f
d,f
a,f
a,g
a,h
a,i
...And on.
Any idea how I'd do that? I'm open to answers using Google Spreadsheet's scripting language, Excel 2004's scripting language or something else like PHP.
Thanks!
Not sure if this satisfies your platform requirements, but here is a spreadsheet formula that can be used in Google Spreadsheets itself (not in any Excel version though):
=ArrayFormula(SORT(TRANSPOSE(SPLIT(CONCATENATE(REPT(UNIQUE(TRANSPOSE(SPLIT(JOIN(",";A:A);",")))&","&TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",";A:A);","))));(UNIQUE(TRANSPOSE(SPLIT(JOIN(",";A:A);",")))<=TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",";A:A);",")))))*REGEXMATCH(CONCATENATE(","&SUBSTITUTE(A:A;",";",,")&","&CHAR(9));"(,"&UNIQUE(TRANSPOSE(SPLIT(JOIN(",";A:A);",")))&",[^\t]*,"&TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",";A:A);","))))&",)|(,"&TRANSPOSE(UNIQUE(TRANSPOSE(SPLIT(JOIN(",";A:A);","))))&",[^\t]*,"&UNIQUE(TRANSPOSE(SPLIT(JOIN(",";A:A);",")))&",)"))&CHAR(9));CHAR(9)))))
It also assumes you don't want to list "b,a" as well as "a,b".
EDIT: These sort of formulae can be horribly inefficient for very large data sets, so consider only using if processing a few hundred rows or less.
Here is the function to make pairs:
<?php
function make_pairs($str) {
$chars = explode(',', $str);
for ($i = 0; $i <= count($chars); $i++) {
$f = array_shift($chars);
foreach ($chars as $char)
echo "$f,$char
";
}
}
make_pairs('a,b,c,d');
Result:
a,b
a,c
a,d
b,c
b,d
c,d
Since you have tagged the above question with VBA
, here is a vba solution.
This will give you all the 45 unique combinations your above example is supposed to have.
My assumptions
1) Data is in Col A of Sheet1
2) Col A doesn't have any headers
3) Output to be generated in Col B
4) You are using Excel 2007 +
5) You are considering b,b
as a valid combination because of q,b,b
. If not then a small tweak needs to be added.
Option Explicit
Sub Sample()
Dim ws As Worksheet
Dim lRow As Long, nRow As Long, n As Long
Dim i As Long, j As Long, k As Long
Dim Myar() As String, TempAr() As String
Set ws = Sheet1
lRow = ws.Range("A" & Rows.count).End(xlUp).Row
n = 0: nRow = 1
With ws
For i = 1 To lRow
Myar = Split(.Range("A" & i).Value, ",")
If UBound(Myar) > 1 Then
For j = LBound(Myar) To UBound(Myar)
For k = LBound(Myar) To UBound(Myar)
If j <> k Then
ReDim Preserve TempAr(n)
TempAr(n) = Myar(j) & "," & Myar(k)
n = n + 1
End If
Next k
Next j
Else
ReDim Preserve TempAr(n)
TempAr(n) = .Range("A" & i).Value
n = n + 1
End If
Next i
For i = LBound(TempAr) To UBound(TempAr)
.Range("B" & nRow).Value = TempAr(i)
nRow = nRow + 1
Next i
'~~> Remove duplicates
.Range("$B$1:$B$" & UBound(TempAr) + 1).RemoveDuplicates _
Columns:=1, Header:=xlNo
'~~> Sort data
.Range("$B$1:$B$" & UBound(TempAr) + 1).Sort _
.Range("B1"), xlAscending
Debug.Print "Total Combinations : " & _
Application.WorksheetFunction.CountA(Columns(2))
End With
End Sub
FOLLOWUP
Not sure if this will work with Excel 2004 but replace the line
.Range("$B$1:$B$" & UBound(TempAr) + 1).RemoveDuplicates _
Columns:=1, Header:=xlNo
with
For i = 1 To UBound(TempAr)
If Application.WorksheetFunction.CountIf(.Range("B" & i).Value) > 1 Then
.Range("B" & i).ClearContents
End If
End With
Rest remains the same I guess. Test it and let me know if you get any errors?