I'm building an import module that bulk inserts 90000+ records with Symfony/Doctrine. In order to insert each object I must read a field from another table, so for each record I first fetch the relevant object from the other table, like this:
$object = $this->doctrine->getRepository('table1')->find($id);
set it on the new object I want to write, then save it, like this:
$em = $this->doctrine->getManager();
$em->merge($newObject);
$em->flush();
(I use merge because it's a general method to save both existing and new objects.) But that takes too much time and the response times out, even if I configure Apache for a long wait (which is not desirable). The Doctrine_Collection approach doesn't work either. Does anyone know a better way to do this so it returns in a reasonable amount of time?
Thanks
Doctrine holds all managed entity instances in an identity map (the UnitOfWork). This means that any entities scheduled to be persisted (on flush()) are held in memory. If you are performing a huge number of inserts, this can be a performance killer.
Conversely, persisting just one instance and then calling flush() each time will cause at least one INSERT/UPDATE per entity - this again will hurt performance due to unneeded database queries.
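For illustration, this is the flush-per-entity pattern just described (a sketch; $entities and $entityManager stand in for an iterable of new entities and the Doctrine EntityManager):
foreach ($entities as $entity) {
    $entityManager->persist($entity);
    $entityManager->flush(); // one database round trip per entity - very slow for 90000+ rows
}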
You should instead break the required inserts down into smaller chunks and allow the entity manager to release its in-memory instances between batches:
$batchSize = 1000;
foreach ($entities as $index => $entity) {
    $entity->setFoo('bar');
    $entityManager->merge($entity);
    if ((($index + 1) % $batchSize) === 0) {
        $entityManager->flush(); // Execute the queued changes every 1000 iterations
        $entityManager->clear(); // Clear all managed entities to free memory
    }
}
$entityManager->flush(); // Flush whatever remains from the last partial batch
$entityManager->clear();
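One more memory saver, assuming Doctrine 2 with an SQL logger attached (Symfony's dev configuration attaches one for the profiler): the logger keeps every executed query in memory, so disable it before a bulk import:
$entityManager->getConnection()->getConfiguration()->setSQLLogger(null);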
The Doctrine_Collection you mentioned is actually from Doctrine 1, and a lot has changed since then.
You should check out the Doctrine 2 documentation on batch processing for more information.
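Putting it together for your import (a sketch only - ImportedRecord, setRelated() and the $rows structure are hypothetical stand-ins for your own code): since your objects are all new, persist() is sufficient and cheaper than merge(). Note that clear() also detaches the related objects you looked up, which is why the lookup happens inside the loop:
$batchSize = 1000;
$em = $this->doctrine->getManager();
foreach ($rows as $index => $row) {
    // Hypothetical lookup of the related record each insert depends on
    $related = $em->getRepository('table1')->find($row['table1_id']);

    $newObject = new ImportedRecord(); // hypothetical entity class
    $newObject->setRelated($related);

    $em->persist($newObject);

    if ((($index + 1) % $batchSize) === 0) {
        $em->flush();
        $em->clear(); // detaches everything, including $related
    }
}
$em->flush(); // flush the final partial batch
$em->clear();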