[Cubicweb] Multisource in CW

Sylvain Thénault sylvain.thenault at logilab.fr
Thu May 24 18:19:19 CEST 2012


On 24 May 12:09, Nicolas Chauvat wrote:
> Hi All,
> 
> On Thu, May 24, 2012 at 09:26:54AM +0200, Vincent Michel wrote:
> > - we do not include the schema at all, and let the user deal with the remote 
> > schema within the RQL request. I think that this is an interesting option if 
> > we consider that this multisource is dedicated to quick and on-the-fly joins 
> > to remote instances (with schemas that may change...), and that we do not 
> > want to migrate the local instance.
> 
> The key point is that in order to use CW's "base ui framework" that
> generates a large part of the UI, we need a datamodel/schema. Quick
> joins and on-the-fly queries without previous knowledge of the schema
> mean that you have to code a lot by hand in the view that fetches
> this external data.

That's a fundamental question: should we consider having to define
a schema for the external source a problem or not? CW without schema
information doesn't sound like CW anymore, so I hope the answer is no :)
 
> > I don't have a Geoname source, I have an instance with a Geoname cube and 
> > Geoname data. Thus, it may be fetched from a URL (but for now, it is only 
> > in_memory connections). It may even be possible to think of a future improvement 
> > that allows querying SPARQL endpoints or JSONp endpoints, rather than CubicWeb 
> > instances.

So I guess you're doing this for experimental purposes, since once you have
a geoname CW instance, you can use a regular pyrorql source, right?
 
> Allowing the query planner to mix and match CubicWeb, SPARQL and
> JSONp... sounds interesting, but difficult.

Writing a SPARQL/JSONp source shouldn't be that hard.

I still suspect Vincent's problem lies in the way the multi-source query
planner is currently implemented, e.g.:

 "Any X,XA WHERE Y linked_to X, X attribute A" 

where X could come from an external source (but not Y) is currently executed with 
the following steps:

1. fetch "Any X,XA WHERE X attribute A" from the external source and store
   results in a temporary table, along with records for this query from the
   system source

2. execute "Any X,XA WHERE Y linked_to X, X attribute A" on the system source
   using the temporary table for X/XA

whereas we might want:

1. execute "Any X WHERE Y linked_to X" on the system source

2. retrieve XA from external sources for each X returned by the previous step

3. build rset for X,XA

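which would be practical in situations where the external source is e.g.
geonames or dbpedia, whereas the current approach isn't, since its first step
fetches every X that has the attribute from the external source.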
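
To make the difference between the two plans concrete, here is a toy sketch
in plain Python. Nothing in it is CubicWeb API: the in-memory data, the
function names and the "rows fetched" counter are all made up for
illustration, and the system-source records for X mentioned in step 1 of the
current plan are left out to keep it short.

```python
# Toy model only: every name below is hypothetical, not CubicWeb API.

# System source: (Y, X) pairs for the "linked_to" relation.
LINKED_TO = [("y1", "x2"), ("y2", "x5")]

# External source: X -> value of attribute A (imagine something geonames-sized).
EXTERNAL_ATTRIBUTE = {"x%d" % i: "a%d" % i for i in range(1, 1001)}


def current_plan():
    """Copy every external X/XA row locally, then join on the system source."""
    # Step 1: fetch "Any X,XA WHERE X attribute A" into a temporary table.
    temp_table = dict(EXTERNAL_ATTRIBUTE)
    rows_fetched = len(temp_table)
    # Step 2: run the full query locally against the temporary table.
    rset = sorted({(x, temp_table[x]) for _y, x in LINKED_TO if x in temp_table})
    return rset, rows_fetched


def wanted_plan():
    """Resolve the relation locally first, then fetch XA only for those X."""
    # Step 1: "Any X WHERE Y linked_to X" on the system source.
    xs = sorted({x for _y, x in LINKED_TO})
    # Step 2: ask the external source for XA of each X found above.
    fetched = {x: EXTERNAL_ATTRIBUTE[x] for x in xs if x in EXTERNAL_ATTRIBUTE}
    # Step 3: build the rset for X,XA.
    rset = [(x, fetched[x]) for x in xs if x in fetched]
    return rset, len(fetched)
```

With this toy data both functions return the same rset, but current_plan()
pulls 1000 rows from the external source while wanted_plan() pulls only 2.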

-- 
Sylvain Thénault, LOGILAB, Paris (01.45.32.03.12) - Toulouse (09.54.03.55.76)
Formations Python, Debian, Méth. Agiles: http://www.logilab.fr/formations
Développement logiciel sur mesure:       http://www.logilab.fr/services
CubicWeb, the semantic web framework:    http://www.cubicweb.org


