PHP - 从多个MySQL查询创建XML并按日期排序

I have 10-20 log-tables in a MySQL database. Each table contains 50-100.000 rows. I need to export these to XML and sort them by creation date.

Union is a good option as the tables doesn't contain the same columns (one table might contain 3 column, and another 30 columns).

This is how I create the XML:

// Events
$stmt = $db->query("
  SELECT id, columnX, created
  FROM table1
");
$row_count = $stmt->rowCount();
if ($row_count != '0') {
  while($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $event = $xml->createElement("event");
    $events->appendChild($event);
    $event->appendChild($xml->createElement("ID", "XXXX"));
    $event->appendChild($xml->createElement("columnX", $row['columnX']));
    $event->appendChild($xml->createElement("created", $row['created']));
  }
}

// Other events
$stmt = $db->query("
  SELECT id, columnY1, columnY2, columnY3, created
  FROM table2
");
$row_count = $stmt->rowCount();
if ($row_count != '0') {
  while($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    $event = $xml->createElement("event");
    $events->appendChild($event);
    $event->appendChild($xml->createElement("ID", "XXXX"));
    $event->appendChild($xml->createElement("columnY1", $row['columnY1']));
    $event->appendChild($xml->createElement("columnY2", $row['columnY2']));
    $event->appendChild($xml->createElement("columnY3", $row['columnY3']));
    $event->appendChild($xml->createElement("created", $row['created']));
  }
}

Anyone got an idea of how to solve this?

If there is possibility to sort all queries, you are able to sort final XML by getting all queries from database and then printing out them like in code bellow.

Be aware, that this code WILL probably consume as much memory as data returned by all queries in one time, because you cannot use unbuffered query in this case. I don't know, how big are datasets, you are talking about.

If memory would be your concern, you can use same algorithm to combine any data source. So you can prepare three XML files (per query) and combine these instead of combining SQL. It would be (in combination with mysql unbuffered queries) probably better variant for memory usage, but slower as you will need generate and parse XML.

// convert queries to generator
function processQuery(mysqli $db, $sql) {
    $q = $db -> query($sql);
    while ($row = $q -> fetch_assoc()) {
        // just yield
        yield $row;
    }
}

// prepare all queries
$queries = [
    processQuery($db, "SELECT id, columnX, created FROM table1 ORDER BY created"),
    processQuery($db, "SELECT id, columnY1, columnY2, columnY3, created FROM table2 ORDER BY created"),
    processQuery($db, "SELECT id, created FROM table3 ORDER BY created"),
];

// run all queries and fetch first row
foreach ($queries as $query) {
    $query -> next(); // see \Generator
}

// now, we will run while any query has rows (generator returns key)
while (array_filter(array_map(function(Generator $query) { return $query -> key(); }, $queries))) {
    // now we have to find query, which next row has minimal date
    $minTimestamp = NULL;
    $queryWithMin = NULL;
    foreach ($queries as $queryId => $query) {
        $current = $query -> current();
        if ($current !== FALSE) {
            if ($minTimestamp === NULL || $minTimestamp > $current['created']) {
                // this query has row with lower date than previous queries
                $minTimestamp = $current['created'];
                $queryWithMin = $queryId;
            }
        }
    }
    // we now know, which query returns row with minimal date
    PRINT_TO_XML($queries[$queryWithMin] -> current());
    // move cursor of this query to next row
    $queries[$queryWithMin] -> next();
}

Another aproach could be MySQL UNION only for getting ids (already sorted) and then process them in batches.

 $q = $db -> query("SELECT 'table1' AS tableName, id, created FROM table1
 UNION ALL SELECT 'table2' AS tableName, id, created FROM table2
UNION ALL SELECT 'table3' AS tableName, id, created FROM table3
ORDER BY created");

$sorter = [];
while ($row = $q -> fetch_assoc()) {
    $sorter []= [$row['tableName'], $row['id']];
}

foreach (array_chunk($sorter, 5000) as $dataChunk) {
    // get ids from each table
    $table1Ids = array_map(function($rowInfo) { return $rowInfo[1]; }, array_filter($dataChunk, function($rowInfo) { return $rowInfo[0] === 'table1'; }));
    $table2Ids = array_map(function($rowInfo) { return $rowInfo[1]; }, array_filter($dataChunk, function($rowInfo) { return $rowInfo[0] === 'table2'; }));
    $table3Ids = array_map(function($rowInfo) { return $rowInfo[1]; }, array_filter($dataChunk, function($rowInfo) { return $rowInfo[0] === 'table3'; }));
    // load full data from each table
    $dataTable1 = [];
    $q = $db -> query("SELECT * FROM table1 WHERE id IN (".implode(",", $table1Ids).")");
    while ($row = $q -> fetch_assoc()) {
        $dataTable1[$row['id']] = CREATE_XML($row);
    }
    // ... same with table2
    // ... same with table3
    // store
    foreach ($dataChunk as $row) {
        if ($row[0] === 'table1') {
            echo $dataTable1[$row[1]];
        }
        if ($row[1] === 'table1') {
            echo $dataTable2[$row[1]];
        }
        if ($row[2] === 'table1') {
            echo $dataTable3[$row[1]];
        }
    }
}

This approach is less memory consuming, but in this exact code, you will need to load all IDs to memory first. It's possible to simple rewrite to generate XML in first loop (if count($sorter) > 5000 { printXmlForIds($sorter); $sorter = []; }) and algorithm would not exceed memory limt.

I suggest using an INSERT INTO ... SELECT ... UNION ... SELECT construct to fetch all the data into a (temporary) table. INSERT INTO ... SELECT allows you to directly insert the result of an select into a table. UNION allows you to concat SELECT results. Because it is a database statement it all happens in the DBMS.

After that use a select to fetch the data ordered by date field and use XMLWriter to create the XML.