How do I query multiple XML files? [closed]

I will be getting tens of thousands of XML documents that I'll need to query. The queries need to span all of the XML files, not just individual ones. For example, I might need:

  • Return the <name> value from the XML file whose <publish_date> is the most recent

What technologies or approach can I use for this scenario?

  • Loop through each XML file and execute an XPath? This would be too expensive and not scalable
  • Consume the XML and insert it into a database that has been modeled to respect the XML's schema? Then just do regular SQL queries to get the data I need?
  • Use an XML database?
  • Would XQuery be an option?
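For illustration, the second option (extract the fields you query on and put them in a relational table) might look roughly like the sketch below. It is written in Python with the standard-library `sqlite3` module standing in for MySQL; the sample documents, the `documents` table and the function names are all hypothetical, and it assumes `<publish_date>` holds ISO `YYYY-MM-DD` dates so that string ordering matches date ordering.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample documents; real ones would be read from disk.
SAMPLE_DOCS = [
    "<book><name>Alpha</name><publish_date>2013-11-02</publish_date></book>",
    "<book><name>Beta</name><publish_date>2014-06-14</publish_date></book>",
    "<book><name>Gamma</name><publish_date>2012-01-30</publish_date></book>",
]


def load_documents(conn, xml_strings):
    """Extract the queried fields from each document and store them as rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (name TEXT, publish_date TEXT)"
    )
    rows = []
    for xml_text in xml_strings:
        root = ET.fromstring(xml_text)
        # findtext returns None if the element is missing.
        rows.append((root.findtext(".//name"), root.findtext(".//publish_date")))
    conn.executemany("INSERT INTO documents VALUES (?, ?)", rows)
    conn.commit()


def most_recent_name(conn):
    """Return the name of the document with the latest publish_date."""
    row = conn.execute(
        "SELECT name FROM documents ORDER BY publish_date DESC LIMIT 1"
    ).fetchone()
    return row[0] if row else None


# In-memory database as a stand-in for a MySQL connection.
conn = sqlite3.connect(":memory:")
load_documents(conn, SAMPLE_DOCS)
print(most_recent_name(conn))
```

The parsing cost is paid once at import time; after that, each query is an ordinary indexed SQL lookup rather than a pass over every file.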

This needs to be part of a PHP/MySQL solution.

Take your XML files and insert them into eXist-db. You can insert them easily from PHP by doing either an HTTP POST or PUT against its REST API (depending on your needs). If you insert them all into the same collection, you can then do an HTTP GET or POST from PHP, sending an XQuery that queries every document in that collection, for example:

collection("/db/your-collection-of-documents")//name[parent::element()/publish_date gt "2014-06-14"]
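As a rough sketch of how such a query could be sent over the REST API, the snippet below builds a GET URL that passes the XQuery in eXist-db's `_query` parameter. It is in Python only to keep it short; the same request can be made from PHP. The host, port and collection path are assumptions about a default local install, the function names are hypothetical, and the XQuery assumes `<publish_date>` values are valid `xs:date` strings.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed values: adjust to your own eXist-db installation.
EXIST_REST_BASE = "http://localhost:8080/exist/rest"
COLLECTION = "/db/your-collection-of-documents"

# XQuery returning the <name> whose sibling <publish_date> is the latest
# across every document in the collection.
LATEST_NAME_XQUERY = f'''
let $docs := collection("{COLLECTION}")
let $latest := max($docs//publish_date/xs:date(.))
return $docs//name[parent::element()/publish_date = $latest]
'''


def build_query_url(xquery, base=EXIST_REST_BASE, collection=COLLECTION):
    """Build a REST GET URL carrying the XQuery in the _query parameter."""
    return f"{base}{collection}?{urlencode({'_query': xquery.strip()})}"


def run_query(xquery):
    """Send the query and return the raw XML response (needs a running server)."""
    with urlopen(build_query_url(xquery)) as response:
        return response.read().decode("utf-8")


url = build_query_url(LATEST_NAME_XQUERY)
print(url)
```

For long queries a POST with the XQuery in the request body avoids URL length limits; the GET form is shown here because it is the simplest to try from a browser or `curl`.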

If you can be more specific about your XML, I could update this answer with the REST URI that you would need to use and an appropriate XQuery.