[Cubicweb] Dataimport

Antoine Grigis antoine.grigis at cea.fr
Thu Dec 10 16:26:34 CET 2015

I have the feeling that the CW 3.20.9 machinery supports Binary fields.
We just need to restore the previous behavior of the 
'_create_copyfrom_buffer' function.
Here is a quick test with my modifications in bold:

*from psycopg2.extensions import Binary*

def _create_copyfrom_buffer(data, columns=None, **convert_opts):
     """
     Create a StringIO buffer for 'COPY FROM' command.
     Deals with Unicode, Int, Float, Date... (see ``converters``)

     :data: a sequence/dict of tuples
     :columns: list of columns to consider (default to all columns)
     :converter_opts: keyword arguments given to converters
     """
     # Create a list rather than directly create a StringIO
     # to correctly write lines separated by '\n' in a single step
     rows = []
     if columns is None:
         if isinstance(data[0], (tuple, list)):
             columns = range(len(data[0]))
         elif isinstance(data[0], dict):
             columns = data[0].keys()
         else:
             raise ValueError('Could not get columns: you must provide 
     for row in data:
         # Iterate over the different columns and the different values
         # and try to convert them to a correct datatype.
         # If an error is raised, do not continue.
         formatted_row = []
         for col in columns:
             try:
                 value = row[col]
             except KeyError:
                 warnings.warn(u"Column %s is not accessible in row %s"
                               % (col, row), RuntimeWarning)
                 # XXX 'value' set to None so that the import does not
                 # end in error.
                 # Instead, the extra keys are set to NULL from the
                 # database point of view.
                 value = None
*            if isinstance(value, type(Binary(""))):*
*                return None*
             for types, converter in _COPYFROM_BUFFER_CONVERTERS:
                 if isinstance(value, types):
                     value = converter(value, **convert_opts)
                     break
             else:
                 raise ValueError("Unsupported value type %s" % type(value))
             # We push the value to the new formatted row
             # if the value is not None and could be converted to a string.
             formatted_row.append(value)
         rows.append('\t'.join(formatted_row))
     return StringIO('\n'.join(rows))

Using a 'SQLGenObjectStore' I am then able to insert Binary fields.
Do you think such a modification is valid and could be integrated in a 
future release?
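
To make the proposed behavior concrete, here is a minimal, self-contained sketch of the fallback idea. It is not the CubicWeb code: the 'Binary' class is a stand-in for psycopg2's wrapper, and 'CONVERTERS', 'create_copyfrom_buffer' and 'choose_strategy' are hypothetical names that only mimic the roles of '_COPYFROM_BUFFER_CONVERTERS', '_create_copyfrom_buffer' and '_execmany_thread_copy_from'. The 'NULL' marker is also an assumption.

```python
import io


class Binary(object):
    """Hypothetical stand-in for psycopg2.extensions.Binary."""

    def __init__(self, payload):
        self.payload = payload


# Hypothetical converter table, playing the role of
# _COPYFROM_BUFFER_CONVERTERS: (accepted types, converter) pairs.
CONVERTERS = [
    (type(None), lambda value: 'NULL'),
    ((int, float), str),
    (str, lambda value: value.replace('\t', ' ')),
]


def create_copyfrom_buffer(rows, columns):
    """Return a text buffer for COPY FROM, or None if a Binary is found."""
    lines = []
    for row in rows:
        formatted = []
        for col in columns:
            value = row.get(col)
            if isinstance(value, Binary):
                # Signal the caller that COPY FROM cannot be used.
                return None
            for types, converter in CONVERTERS:
                if isinstance(value, types):
                    formatted.append(converter(value))
                    break
            else:
                raise ValueError("Unsupported value type %s" % type(value))
        lines.append('\t'.join(formatted))
    return io.StringIO('\n'.join(lines))


def choose_strategy(rows, columns):
    """Mimic the caller: COPY FROM when a buffer exists, else executemany."""
    buf = create_copyfrom_buffer(rows, columns)
    return 'executemany' if buf is None else 'copy_from'
```

With this sketch, rows holding only plain values take the 'copy_from' path, while any row containing a Binary value makes the whole batch fall back to 'executemany', which is slower but lets the driver handle the binary parameter.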


On 04/12/2015 14:44, Rémi Cardona wrote:
> On 04/12/2015 13:44, Antoine Grigis wrote:
>> I noticed a change in the 'dataimport' module between CubicWeb (CW)
>> 3.19.6 and CW 3.20.9.
>> The '_create_copyfrom_buffer' function has been refactored and the
>> default return case removed.
>> In the former version, the function's default return value was 'None'.
>> This is caught in both CW versions by the '_execmany_thread_copy_from'
>> function, which then executes the thread without the 'copy from' tabular
>> data speedup.
>> Thus in CW 3.20.9 inserting a 'Binary' entity field through the
>> 'SQLGenObjectStore' raises an Exception.
>> Is this normal behavior? Is it still possible to insert a 'Binary' field
>> using a store and a schema with inlined relations?
> Hi Antoine,
> The dataimport module was made stricter in CubicWeb 3.20 and newer 
> releases. Its handling of binary objects is one such case.
> Do note that the current code (in 3.20 and all newer releases) relies 
> on psycopg2's copy_from() method [1] which only supports PostgreSQL's 
> "text" format [2].
> AIUI, this "text" format cannot be used to deal with binary data, 
> though we haven't given it much thought.
> If dealing with binary is possible, please let us know how. Patches 
> will be greatly appreciated (even in an unfinished state).
> Cheers,
> Rémi
> [1] http://initd.org/psycopg/docs/usage.html#using-copy-to-and-copy-from
> [2] http://www.postgresql.org/docs/9.4/static/sql-copy.html#AEN71994
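
On the question of whether the "text" format can carry binary data at all: one possible route, not tested against CubicWeb, is PostgreSQL's hex input format for bytea columns (available since PostgreSQL 9.0), where a value is written as '\x' followed by hex digits. Because COPY's text format treats the backslash as an escape character, the backslash has to be doubled in the buffer. The helper name below is hypothetical and the sketch only shows the encoding step.

```python
import binascii


def bytea_to_copy_text(payload):
    """Encode bytes as a bytea field for COPY's text format.

    The bytea hex representation is a backslash, an 'x', then hex digits.
    In COPY text data the backslash itself must be escaped, so the field
    starts with two backslash characters.
    """
    # '\\\\x' in Python source is the three characters: \ \ x
    return '\\\\x' + binascii.hexlify(payload).decode('ascii')
```

Whether a converter of this kind fits into the dataimport converters would still need to be checked against a real database.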
