[Cubicweb] datafeed

Aurélien Campéas aurelien.campeas at logilab.fr
Mon Dec 16 10:33:04 CET 2013


On 16/12/2013 09:10, David Douard wrote:
> On 04/12/2013 09:36, Dimitri Papadopoulos Orfanos wrote:
>> I've found a few words on datafeed in the Logilab blog about CubicWeb 3.11:
>>      http://www.cubicweb.org/blogentry/1512775
>>
>>      A new 'datafeed' source was introduced, inspired by the
>>      soon to be deprecated datafeed cube. It needs polishing
>>      but sets the foundation for advanced semantic web
>>      applications that import content from other sites using
>>      simple HTTP requests.
>>
>>      A 'datafeed' source is associated to a parser that
>>      analyses the imported data and then creates/updates
>>      entities accordingly. There is currently a single parser
>>      in the core that imports CubicWeb-generated xml and needs
>>      to be configured with a mapping information that defines
>>      how relations are to be followed. It provides a viable
>>      alternative to 'pyrorql' sources. Other parsers to import
>>      RDF, RSS, etc should come soon.
>>
>> From what I gather, datafeed can help import data from structured Web
>> sites. It doesn't help with importing data from files (we have to parse
>> at least CSV, XML, DICOM files), does it?
>
> Not really, datafeed really is about importing (part of) an external
> database into a CW instance. datafeed can update entities as long as it
> can identify a unique identifier for that entity in the remote database
> (typically a URI).
>
>> In our application we repeatedly scan a directory for new files and data
>> appended to existing files. Can datafeed help in that case? I'm thinking
>> here about the "creates/updates entities accordingly" part of datafeed.
>
>
> At update time, it needs to be able to ask the remote database for the
> objects that have been created/modified since the last synchronization.
> You won't (easily) get any of this from files. What you need is a set
> of (smart) import scripts.

You can also remain within the source/datafeed realm and implement
your own datafeed.DataFeedParser subclass to handle sync from files.
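
Whichever route you pick, the file-side bookkeeping is the same: remember
how far each file was imported at the last sync and only hand the new
bytes to the parser. Below is a rough sketch of that part only, in plain
Python and independent of CubicWeb. The function name scan_directory and
the shape of the state dict are made up for illustration, and it assumes
files are only ever appended to (a file that shrinks is re-read from the
start). Turning the returned bytes into entities is left to the parser or
import script.

```python
import os


def scan_directory(directory, state):
    """Find data added to `directory` since the previous scan.

    `state` maps a file name to the number of bytes already imported
    (hypothetical bookkeeping, to be persisted between runs).
    Returns (changes, new_state), where `changes` is a list of
    (file name, newly appended bytes) tuples.
    """
    changes = []
    new_state = dict(state)
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        offset = state.get(name, 0)
        size = os.path.getsize(path)
        if size < offset:
            # The file shrank: assume it was replaced and start over.
            offset = 0
            new_state[name] = 0
        if size == offset:
            # Nothing new in this file.
            continue
        with open(path, 'rb') as stream:
            stream.seek(offset)
            data = stream.read()
        changes.append((name, data))
        new_state[name] = offset + len(data)
    return changes, new_state
```

The first call is made with an empty dict; every later call gets the
new_state returned by the previous one, so a new file shows up in full
and a grown file only shows up with its appended tail.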

Regards,
Aurélien.




