Friday, November 2, 2012

Auto-dropboxify all web-downloaded documents

I often download pdf presentations from the web, but then when I reopen them in my laptop later, I wonder if they got updated/changed in the meanwhile. It would be nice to have a service where my local copy also gets automatically updated/synced (a la dropbox) when the authoritative source copy gets changed. (The difference from the dropbox model is here the source does not need to cooperate by deploying a sync software, e.g. dropbox, or even be aware that its document is being synced.)

So here is my project idea: auto-sync all web/url downloaded documents, so the downloaded local copy of a file is kept up-to-date with the authoritative copy on the web. I think many people will find this useful. Some other examples of documents downloaded from the web and would benefit from auto-updating include: city documents (e.g., street-parking rules, garbage collection rules, etc.), event calendars, tax documents, CVs. (Do you have any other examples?)

This shouldn't be too hard to build. A completely client-side solution is possible. A client-side software that keeps track of web-downloaded documents, and periodically checks with http to detect whether any of these documents got changed would do it. If a change is detected, the software should download a copy, and should prompt the user when she opens the local document next about whether the updated or the old copy to be used.

Of course a cloud-hosted push-based synchronization solution would be more scalable and efficient. This would be also more gentle for the original content provider as instead of thousands of client-side software periodically checking for updates, only the cloud-hosted synchronization service will check for the update to the document. Upon detecting an update the cloud service will push the updated document to all clients that posses this document. I am sure there are a lot of details to work out to make this service efficient. And there may even be a way to monetize this service eventually, who knows? This cloud hosted synchronization service idea may be seen as extending the CDNs to reach out and embrace the pc/laptops as the last hop for content-replication.

A further generalization of this "auto-sync documents downloaded from the web" idea is to allow any host to modify/update the document and yet still maintain the other copies in sync/up-to-date. This then turns into an application-level virtual-filesystem that is web-scale. And implementing that would be more tricky.


Ted Herman said...

What about Or, as Fukuyama suggested, is history ended (though that's not what he meant in this context)?

Murat said...

@ Ted Herman
My idea is selfish. I don't have the persistence or archiving of web documents (for the good of mankind) in mind. I am lazy, and I don't want to check via web if the documents I downloaded earlier got updated. If it is updated, I have access [at least choice to access] to the updated copy.

Mert Emin said...

Deploy Button ( works the other way around. You can keep your documents on the web up-do-date with the ones in your Dropbox or github etc.

Email archiving software said...

Good tips, im dropbox user for years and this is helpful a lot!

Two-phase commit and beyond

In this post, we model and explore the two-phase commit protocol using TLA+. The two-phase commit protocol is practical and is used in man...