Auto-dropboxify all web-downloaded documents
I often download PDF presentations from the web, but when I reopen them on my laptop later, I wonder whether they have been updated in the meantime. It would be nice to have a service that automatically updates my local copy (a la Dropbox) whenever the authoritative source copy changes. (The difference from the Dropbox model is that here the source does not need to cooperate by deploying sync software such as Dropbox, or even be aware that its document is being synced.)
So here is my project idea: auto-sync all documents downloaded from the web by URL, so that the local copy of a file is kept up to date with the authoritative copy on the web. I think many people will find this useful. Other examples of web-downloaded documents that would benefit from auto-updating include city documents (e.g., street-parking rules, garbage collection rules), event calendars, tax documents, and CVs. (Do you have any other examples?)
This shouldn't be too hard to build. A completely client-side solution is possible: client software keeps track of web-downloaded documents and periodically checks over HTTP whether any of them have changed. If a change is detected, the software downloads the new copy and, the next time the user opens the local document, asks her whether to use the updated copy or the old one.
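One way the periodic HTTP check could be sketched is with conditional GET requests, so that an unchanged document costs the server almost nothing. This is only a rough sketch under my own assumptions: the names TrackedDoc, conditional_headers, and fetch_if_changed are made up for illustration, and the SHA-256 comparison is a fallback I added for servers that ignore the ETag and Last-Modified validators.

```python
import hashlib
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackedDoc:
    """Hypothetical per-document record the client would persist."""
    url: str
    etag: Optional[str] = None           # validator from the last response
    last_modified: Optional[str] = None  # validator from the last response
    sha256: Optional[str] = None         # hash of the last body we saw


def conditional_headers(doc):
    """Build the headers that let the server answer 304 Not Modified."""
    headers = {}
    if doc.etag:
        headers["If-None-Match"] = doc.etag
    if doc.last_modified:
        headers["If-Modified-Since"] = doc.last_modified
    return headers


def fetch_if_changed(doc, opener=urllib.request.urlopen):
    """Return the new body if the document changed, otherwise None.

    `opener` is injectable so the logic can be exercised without a network.
    """
    req = urllib.request.Request(doc.url, headers=conditional_headers(doc))
    try:
        with opener(req, timeout=30) as resp:
            body = resp.read()
            digest = hashlib.sha256(body).hexdigest()
            # Fallback: some servers always send 200, so compare content.
            changed = digest != doc.sha256
            doc.etag = resp.headers.get("ETag")
            doc.last_modified = resp.headers.get("Last-Modified")
            doc.sha256 = digest
            return body if changed else None
    except urllib.error.HTTPError as err:
        if err.code == 304:  # server confirms our copy is current
            return None
        raise
```

A caller would run fetch_if_changed for each tracked document on a timer and, when it returns bytes, store them next to the old file so the user can be asked which copy to keep at the next open.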
Of course, a cloud-hosted, push-based synchronization solution would be more scalable and efficient. It would also be gentler on the original content provider: instead of thousands of clients each periodically checking for updates, only the cloud-hosted synchronization service checks the document. Upon detecting an update, the cloud service pushes the updated document to all clients that possess it. I am sure there are a lot of details to work out to make this service efficient. And there may even be a way to monetize this service eventually, who knows? This cloud-hosted synchronization service may be seen as extending CDNs to reach out and embrace PCs and laptops as the last hop for content replication.
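The core of the cloud-hosted variant might look something like the sketch below: the service keeps a map from each URL to the clients holding that document, checks each URL once per round, and fans the update out. Everything here is hypothetical, including the SyncService class and the check and push callbacks, which stand in for the real HTTP check and whatever push channel the service would actually use.

```python
from collections import defaultdict
from typing import Callable, Dict, Optional, Set


class SyncService:
    """Hypothetical in-memory core of the cloud synchronization service."""

    def __init__(self,
                 check: Callable[[str], Optional[bytes]],
                 push: Callable[[str, str, bytes], None]):
        self._check = check  # url -> new body, or None if unchanged
        self._push = push    # (client_id, url, body) -> deliver update
        self._subscribers: Dict[str, Set[str]] = defaultdict(set)

    def subscribe(self, client_id: str, url: str) -> None:
        """Record that a client holds a local copy of the document at url."""
        self._subscribers[url].add(client_id)

    def poll_once(self) -> int:
        """Check every tracked URL once; return the number of origin checks."""
        checks = 0
        for url, clients in self._subscribers.items():
            body = self._check(url)  # one request per URL, not per client
            checks += 1
            if body is not None:
                for client_id in sorted(clients):
                    self._push(client_id, url, body)
        return checks
```

With three clients subscribed to the same URL, one round makes a single check against the origin and three pushes, which is the saving for the content provider described above.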
A further generalization of this "auto-sync documents downloaded from the web" idea is to allow any host to modify the document while still keeping the other copies in sync. This then turns into a web-scale, application-level virtual filesystem, and implementing that would be trickier.
Comments
My idea is selfish. I don't have the persistence or archiving of web documents (for the good of mankind) in mind. I am lazy, and I don't want to check the web to see whether the documents I downloaded earlier have been updated. If a document is updated, I want to have access [or at least the choice of access] to the updated copy.