是否可以使用PHPExcel库导入和导出大小为70MB的excel文件?

I have one excel file with 3 columns in which 2nd column contains email hyper-link. So I have to import this file and export it with only 2 columns first one should contains name and second one email means I have to split that hyper-link into name and email.

For 31MB file I changed memory limit to 2048MB and execution time 1200 in php.ini file. I can successfully imported and exported excel file of 31MB but while exporting 70MB file execution takes so much time and gives the following error message.

Fatal error: Allowed memory size of 2147483648 bytes exhausted (tried to allocate 15667514 bytes) in /var/www/html/PHPExcel/Reader/Excel2007.php on line 327

Is it possible to import and export excel file with size 70MB using PHPExcel library? And what I have to change like memory limit and max execution time etc in php.ini file.

require "PHPExcel.php";
require "PHPExcel/IOFactory.php";

$inputFileName = 'xxx.xlsx';

    $inputFileType = PHPExcel_IOFactory::identify($inputFileName);
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objReader->setReadDataOnly(true);
    $objPHPExcel = $objReader->load($inputFileName);

    $outputObj = new PHPExcel();

//  Get worksheet dimensions
$sheet = $objPHPExcel->getSheet(0);
$highestRow = $sheet->getHighestRow();

$outputObj->setActiveSheetIndex(0);
$outSheet = $outputObj->getActiveSheet();

//  Loop through each row of the worksheet in turn
for ($row = 2; $row <= $highestRow; $row++){ // As row 1 seems to be header
    //  Read cell B2, B3, etc.
    $line = $sheet->getCell('B' . $row)->getValue();

    preg_match("|([^\.]+)\ <([^>]+)>|", $line, $data);

    if(!empty($data))
    {
        // $data[1] will be name & $data[2] will be email
        $outSheet->setCellValue('A' . $row, $data[1]);
        $outSheet->setCellValue('B' . $row, $data[2]);  
    }

}

$objWriter = new PHPExcel_Writer_CSV($outputObj);
$objWriter->save("xxx.csv");

NOTE: Can I export excel file without making any changes in php.ini file

I got solution. Successfully I have done this task in python. Hopefully it will help someone. :)

# Time taken 2min 4sec for 69.9MB file.

import csv
import re
from openpyxl import Workbook, load_workbook

location = 'big.xlsx'

wb = load_workbook(filename=location, read_only=True)
users_data = []
# pattern = '^(.+?) <([^>].+)>$' # matches "your name <email@email.com>"
# pattern_new = '^(.+?)<([^>].+)>$' # matches "your name<email@email.com>"
# pattern_email = '([\w.-]+@[\w.-]+)' # extracts email from sentence

# Define patterns to check on string.
patterns = ['^(.+?) <([^>].+)>$', '^(.+?)<([^>].+)>$']

# Loop through all sheets in XLSX
for wsheet in wb.get_sheet_names():
    # Load data from Sheet.
    ws = wb.get_sheet_by_name(wsheet)
    # Loop through each row in current Sheet.
    for row in ws.rows:
        # We need column B data, so get that directly.
        # Check if its not empty.
        if row[1].value:
            val = ""
            # Get column B data, remove unnecessary data and encode using utf-8 format.
            data = row[1].value.replace("(at)", "@").replace("(dot)", ".").encode('utf-8')
            # Loop through all patterns to match in current data.
            for pattern in patterns:
                # Apply regex on data.
                name_data = re.search(pattern, data)
                # If match found.
                if name_data:
                    # Create list of matched data and break loop to avoid extra searches on current row.
                    val = [name_data.group(1), name_data.group(2)]
                    # val = name_data.group()
                    break
            # If no matches found, check for only email, if not then use data as it is.
            if not val:
                # val = data
                name_data = re.search('([\w.-]+@[\w.-]+)', data)
                # If match found, then use that, else use data.
                if name_data:
                    val = [name_data.group(1)]
                else:
                    val = data
            # Append new data to users_data array.
            users_data.append(val)

# Open CSV file for writting list.
myfile = open('big.csv', 'wb')


# Open file in write mode.
wr = csv.writer(myfile, dialect='excel', delimiter = ',', quotechar='"', quoting=csv.QUOTE_MINIMAL, lineterminator='
')
# Loop through each value in list.
for word in users_data:
    # Append data in CSV.
    wr.writerow([word])

# Close CSV file.
myfile.close()

I don't see the point in loading one spreadsheet file, copying everything from that to a second, then saving the second.... that will be memory and performance intensive

why not just load the first, delete your heading row 1, then save to your CSV output

// Read the original spreadsheet
$inputFileName = 'TraiDBDump.xlsx';

$inputFileType = PHPExcel_IOFactory::identify($inputFileName);
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$objReader->setReadDataOnly(true);
$objPHPExcel = $objReader->load($inputFileName);

// Remove header row
$objPHPExcel->getSheet(0)->removeRow(1, 1);

// Save as a csv file
$objWriter = new PHPExcel_Writer_CSV($objPHPExcel);
$objWriter->save("TraiDBDump.csv");

If your original has a lot of columns, and you only need A and B, then you could use a read filter to read only those two columns

@Priyanka, you can also try using Spout: https://github.com/box/spout. It works great for large files! You won't have to change your php.ini file, as it won't require more than 10MB of memory and should finish before the default time limit.

You can do something like this:

$filePath = 'xxx.xlsx';
$reader = ReaderFactory::create(Type::XLSX);
$reader->open($filePath);

$writer = WriterFactory::create(Type::CSV);
$writer->openToFile($'xxx.csv');

$rowCount = 0;
while ($reader->hasNextSheet()) {
    $reader->nextSheet();

    while ($reader->hasNextRow()) {
        $row = $reader->nextRow();
        $rowCount++;

        if ($rowCount === 1) {
            continue; // that's for the header row
        }

        // get the values you need in the current row
        // for example:
        $name = $row[1];
        $email = $row[2];

        // write the data to the CSV file
        $writer->addRow([$name, $email]);
    }
}

$reader->close();
$writer->close();

Give it a try! Hopefully it will solve your problem :)