使用Excel在PHP应用程序中进行核心分析?

I think this is really a question about approach.

We have a core Excel spreadsheet that comes in from a third party, and it provides some complex analysis of raw data. The raw data arrives directly from customers in a second spreadsheet.

We are looking to create an automated web-server based system where the raw data Excel file can be uploaded and the analysis performed, and a report generated.

The report generation is no problem, but is there any mileage in actually using the core Excel spreadsheet itself within the application to do the mathematics (using PHPExcel for example)? Or is the best approach always to replicate the analysis in code?

Our thinking is that the analysis (which comes in from a third party) is fairly complex, so will take quite a lot of unpicking to replicate in PHP, and secondly parts of the analysis are often being tweaked by the third party meaning new versions of the Excel are regularly provided to us.

I could imagine using PHPExcel to do something like this, creating the sheet in memory and merging the raw data in as a second worksheet, then reading out the calculated results, but is this realistic - would it be too slow - or just simply not work how I am imagining?

Any thoughts would be much appreciated.

Depending on the complexity of the Excel formulae, it can be perfectly viable to use PHPExcel's calculation engine, but you'd need to watch out for a few potential "gotchas".

  • Not all Excel functions are fully supported (There is a list provided in the PHPExcel documentation, but I think 316 functions out of 356 are now supported)
  • Excel functions from the analysis toolpack are not supported for BIFF (.xls) workbooks; though they will work with OfficeOpenXML (xlsx) workbooks
  • User-defined functions are not supported
  • Macros and other VB code are not supported
  • Array formulae are not yet supported (this is a significant part of the planned rewrite of the calculation engine due to start by the end of the year)
  • R1C1 format for cell references is not supported

It is relatively slow (part of the reason it needs a rewrite). You could certainly write PHP code that would execute the formulae you need more efficiently, because you could code to your specific requirements. Whether PHPExcel would be too slow? Only you can determine what "too slow" actually means.

Note that it's often better to submit requests for extensive memory- or time-consuming tasks to run outside of the web server

On the other hand, if your client ever changes their spreadsheet (and you indicate that this is fairly frequent), then using PHPExcel's built-in calculation engine means that it would still work without problems, whereas custom-written code would need modification. As you've already pointed out, complex formulae could require a lot of work converting to PHP code; and unpicking changes made to formulae in a spreadsheet would be particularly difficult.

Not really a solution to your problem, but the preferred approach would be you build the calculations PHP-side, and then get clients/third-parties to submit data in a specified format (and ideally as a CSV file), and then you parse the data and perform any necessary calculations.