[Cubicweb] CubicWeb I/O

Aurélien Campéas aurelien.campeas at logilab.fr
Tue May 27 15:54:02 CEST 2014


On 27/05/2014 15:41, Dimitri Papadopoulos Orfanos wrote:
> Dear all,

Hello Dimitri,

> 
> I have already written in the past about I/O in CubicWeb. This is
> especially important for us because we run lots of automated scripts
> requesting, reading and writing data from the CubicWeb server, not
> necessarily through the Web interface.
> 
> We have been suggested a few methods to access data:
> 
> 1. HTTP/HTTPS remote access
>   - has always worked and still works seamlessly
>   - login from scripts might be a little bit tedious
>   - limitation on length of HTTP request
>   - poor performance on large datasets
> 
> 2. ØMQ / Pyro + DB API
>   - obsolete, do not write new code based on it
>   - admin rights, security issue!
> 
> 3. Run a script in the CubicWeb shell on a CubicWeb server, directly use
> the session.
>   - has always worked and still works seamlessly
>   - admin rights
>   - poor performance on large datasets
> 
> 4. Run a script in the CubicWeb shell on a CubicWeb server, use
> specialized stores to boost I/O for large datasets.
>   - admin rights
>   - lots of different classes:
>     . RQLObjectStore         (cubicweb)
>     . NoHookRQLObjectStore   (cubicweb)
>     . SQLGenObjectStore      (cubicweb)
>     . MassiveObjectStore     (dataio cube)
>     . ?                      (fastimport cube)

      . FlushController        (fastimport cube)

The "flushcontroller" name is admittedly arbitrary
and a bit unsatisfying...

The goal however is to get:

.insert_entities
.insert_relations

on the standard connection object, plus a special
hooks runner supporting:

* hook cherry-picking by __regid__
* hook folding (by ... rewriting code)
* delaying hook execution to a follow-up transaction
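
To make the hooks runner idea a bit more concrete, here is a minimal,
self-contained sketch of cherry-picking hooks by __regid__ and delaying
some of them. Everything in it (HooksRunner, the demo hooks, the
"enabled" and "deferred" parameters) is hypothetical and invented for
illustration; it is not the fastimport or CubicWeb API. Hook folding is
not shown, and the "follow-up transaction" is only simulated by a
second call.

```python
# Hypothetical sketch: none of these classes are CubicWeb or fastimport APIs.


class Hook:
    """Stand-in for a hook class identified by a __regid__ string."""
    __regid__ = None

    def __call__(self, entities):
        raise NotImplementedError


class UpdateIndexHook(Hook):
    __regid__ = 'demo.update_index'

    def __call__(self, entities):
        return ['indexed %s' % e for e in entities]


class NotifyHook(Hook):
    __regid__ = 'demo.notify'

    def __call__(self, entities):
        return ['notified %s' % e for e in entities]


class HooksRunner:
    """Runs only the selected hooks; some now, some later."""

    def __init__(self, hooks, enabled=(), deferred=()):
        self.hooks = list(hooks)
        self.enabled = set(enabled)    # regids to run immediately
        self.deferred = set(deferred)  # regids to run in a later step
        self.pending = []

    def run(self, entities):
        """Call each enabled hook on the batch; queue the deferred ones."""
        results = []
        for hook in self.hooks:
            regid = hook.__regid__
            if regid in self.deferred:
                self.pending.append((hook, list(entities)))
            elif regid in self.enabled:
                results.extend(hook(entities))
            # hooks that are neither enabled nor deferred are skipped
        return results

    def run_pending(self):
        """Run the queued hooks (stands in for the follow-up transaction)."""
        results = []
        for hook, entities in self.pending:
            results.extend(hook(entities))
        self.pending = []
        return results


runner = HooksRunner([UpdateIndexHook(), NotifyHook()],
                     enabled={'demo.update_index'},
                     deferred={'demo.notify'})
immediate = runner.run(['e1', 'e2'])
later = runner.run_pending()
```

Selecting by regid keeps the choice of hooks explicit at the call site,
which is the point of cherry-picking during a bulk import.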

>   - chaotic releases without a roadmap

I believe the roadmap is purely customer-driven.
If you need specific features, bug fixes, etc. you
should just peruse the forge.


>   - inconsistent APIs, classes are not interchangeable

The mismatch between the "store" API and the connection object is a
known issue. It will be addressed at some point in the future.

Incompatibilities between the "store" objects are bugs.

I would like the "fastimport" experiment to lay out a future path
for this (do not read too much into "experiment", it will go
into production soon).

>   - bugs in some of the classes

Please report them.

> 
> 
> I have a few questions/remarks:
> 
> * Am I missing any I/O method?
> 
> * Concerning item 4, I believe more formal planning would benefit all
> CubicWeb users:
>   - Share a "massive data import" roadmap with CubicWeb users.
>   - Stick to a stable API that will remain unchanged for a "long time".
>   - Maybe start a CWEP on the subject?

I do indeed plan to write a CWEP for cnx.insert_entities/relations and
the newer hook APIs (for vectorized insertions). But not right now (tm).
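
As a rough illustration of what "vectorized" insertion means here, the
sketch below inserts a whole batch per call instead of one row at a
time. DemoConnection, its table layout and the method signatures are
assumptions made up for this example (backed by an in-memory sqlite3
database); the real cnx.insert_entities/insert_relations signatures are
not defined yet.

```python
import sqlite3


class DemoConnection:
    """Hypothetical stand-in, NOT the CubicWeb connection class."""

    def __init__(self):
        self.db = sqlite3.connect(':memory:')
        self.db.execute('CREATE TABLE entities '
                        '(eid INTEGER PRIMARY KEY, etype TEXT, name TEXT)')
        self.db.execute('CREATE TABLE relations '
                        '(rtype TEXT, eid_from INTEGER, eid_to INTEGER)')

    def insert_entities(self, etype, rows):
        """Insert many (eid, name) rows of one type in a single batch."""
        self.db.executemany(
            'INSERT INTO entities (eid, etype, name) VALUES (?, ?, ?)',
            [(eid, etype, name) for eid, name in rows])

    def insert_relations(self, rtype, pairs):
        """Insert many (eid_from, eid_to) pairs of one relation type."""
        self.db.executemany(
            'INSERT INTO relations (rtype, eid_from, eid_to) VALUES (?, ?, ?)',
            [(rtype, eid_from, eid_to) for eid_from, eid_to in pairs])


cnx = DemoConnection()
cnx.insert_entities('Person', [(1, 'alice'), (2, 'bob')])
cnx.insert_relations('knows', [(1, 2)])
entity_count = cnx.db.execute('SELECT COUNT(*) FROM entities').fetchone()[0]
relation_count = cnx.db.execute('SELECT COUNT(*) FROM relations').fetchone()[0]
```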


> 
> * Are there still plans for remote access other than HTTP/HTTPS?
> 
> 
> Best,
> 

Regards,
Aurélien.



