[Cubicweb] experimenting with a DataFeed entity
nicolas.chauvat at logilab.fr
Mon Feb 15 13:17:22 CET 2010
On Mon, Feb 15, 2010 at 09:44:17AM +0100, Florent Cayré wrote:
> this experimentation is a good start towards effectively using LinkedData,
> which is a challenge because :
> * we need to reference entities in someone else's database and to use it
> effectively (there are of course performance issues => caching is needed) ;
> * we want to keep data as up-to-date as possible, the best being our copy
> being identical to the original (=> caching becomes a problem).
Yes, caching is an important issue when it comes to using data stored
in remote databases. I am not sure how others are doing it at the
moment. I will try to find something useful along the lines of
For now, since we are mostly manipulating text, I was planning on
storing a local copy of everything the app downloads from third parties.
The time for which the data stays valid should probably be inferred
from the HTTP headers.
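
As a rough sketch of what "inferred from the HTTP headers" could mean,
the function below derives a validity period from the standard
Cache-Control, Expires and Date headers. The name freshness_lifetime
and the plain dict of lower-case header names are assumptions made for
illustration, not existing CubicWeb code, and only the most common
directives are handled.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return how many seconds a response may be reused, or None if unknown.

    `headers` is a plain dict mapping lower-case header names to values
    (a hypothetical representation chosen for this sketch).
    """
    cache_control = headers.get("cache-control", "")
    directives = [d.strip().lower() for d in cache_control.split(",")]

    # no-store / no-cache: never reuse the copy without asking the server.
    if "no-store" in directives or "no-cache" in directives:
        return 0

    # max-age takes precedence over Expires.
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            return int(match.group(1))

    # Fall back to Expires minus Date when both are present.
    if "expires" in headers and "date" in headers:
        try:
            expires = parsedate_to_datetime(headers["expires"])
            date = parsedate_to_datetime(headers["date"])
            return max(0, int((expires - date).total_seconds()))
        except (TypeError, ValueError):
            # An unparseable date is treated as "already expired".
            return 0

    # The server gave no explicit hint; the caller must pick a policy.
    return None
```

When the function returns None, the application has to fall back on
its own default, for instance a fixed polling interval.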
> As far as I understand, you are trying to address performance issues using
> caching, also trying to keep data up-to-date through regular polling.
http://www.w3.org/Submission/SPARQL-Update/ is the 'push' option that
was advertised lately on the semantic-web at w3c mailing list.
Polling is bad for performance from a global point of view, but it is
also very easy to get working as a first step.
> I would propose two other *complementary* approaches regarding data
> freshness issue :
> * web hooks (http://www.webhooks.org/) : in the long term, we need a way to
> be notified when the distant data changes ; the problem is we need a
> standard to do so, thus it is probably not a short term solution, although
> necessary to promote LinkedData usage I think ;
I will look at it, but the links to wiki and blog on the home page are
> * browser data freshness check : each time we use a distant data cache, we
> check its freshness by querying the original entity last modification date (or such)
> on the original website, and use this information to eventually refresh our
> copy. We just need a way to query an entity modification date effectively.
Is this the same as looking at the HTTP headers?
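
If the remote site sends validators (ETag or Last-Modified), such a
check could be done with a conditional GET: the server answers
304 Not Modified when our copy is still current. Below is a minimal
sketch using only the Python standard library. The cache record layout
(a dict with "body", "etag" and "last_modified" keys) and the function
names are assumptions for illustration, and the `opener` parameter
exists only so the network call can be replaced in tests.

```python
import urllib.error
import urllib.request


def conditional_headers(cached):
    """Build validator headers from what was stored with the local copy."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


def refresh(url, cached, opener=urllib.request.urlopen):
    """Return an up-to-date cache record for `url`.

    `cached` is the record stored earlier (hypothetical layout, see above).
    """
    request = urllib.request.Request(url, headers=conditional_headers(cached))
    try:
        with opener(request) as response:
            # The resource changed: replace the copy and its validators.
            return {
                "body": response.read(),
                "etag": response.headers.get("ETag"),
                "last_modified": response.headers.get("Last-Modified"),
            }
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: the local copy is still identical to the original.
            return cached
        raise
```

This still costs one round trip per check, but no body is transferred
when nothing has changed.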
logilab.fr - scientific computing and knowledge management services