[Cubicweb] work on dataimport stores

aurélien campéas aurelien.campeas at gmail.com
Tue Feb 24 17:43:09 CET 2015

2015-02-20 8:35 GMT+01:00 Sylvain Thénault <sylvain.thenault at logilab.fr>:

> Hi there,

Hi Sylvain,

> I've recently put a bunch of work in a benchmark comparing the currently
> available data import stores. It's available in
> http://hg.logilab.org/users/sthenault/dibench/

Very cool. Last year I started a small patch in the cubicweb repo to the
same effect, but this goes much further.

> That would be great if some people could take a look at it and may be take
> some time to champion one store or another, as I suppose that the current
> "generic" implementation could be optimized depending on each store
> particularities. You'll find setup instructions in the README file. The
> results.txt file states the overall goals I'm looking for, and provides
> some results and discussion. You should probably at least take a look at
> this if you're interested:
> http://hg.logilab.org/users/sthenault/dibench/file/tip/results.txt

I am interested in the memory consumption issue of fastimport.
However, I stopped it after about 8 minutes of run time (without seeing a
memory spike, but with Christophe's fix applied).

Upon investigation I suspect the FastExtEntitiesImporter is suboptimal,
though I have yet to understand what it does...

I will dig into this.

> You're much welcome to comment about any point on the list or to provide
> some patch to the benchmark repository.

Small note: the patch to handle only eids in fastimport becomes moot given a
small patch to dibench that sends entities instead of eids.
I don't think it will affect performance (though I should test it), since at
relation insertion time the entity is actually cached on the cnx anyway.
