Offline Web Applications, we're not there yet.

Being able to use applications offline is an important feature in the quest for native-like (mobile) web apps. The offline mechanisms currently available are indexedDB and appcache. Developers who’ve already used the latter know that it’s clumsy at best, and sometimes useless. If you haven’t, here’s why.

Use case, use case, use case

I want to create a mobile client for Twitter, no wait! I want to create a mobile client for Google Maps, no wait! I want to create a mobile news reader, no wait! …
Anyway, I’m a strong believer in the Open Web and a challenge taker: I don’t even want to use PhoneGap.
To display the home page and the different views of my app, I need a way to store the HTML, CSS and JS. To store the tweets, maps or news items of my app in a manageable way, I need a database. This content should be updated as soon as a connection is available; same goes for the HTML, CSS and JS that run the app. More importantly: my tweets, maps or news items aren’t only text, I need to store the associated images or profile pictures.

What appcache offers

The main way to use the appcache is the cache manifest: a text file associated with an HTML page using the manifest attribute.

<!DOCTYPE html>
<html manifest="index.appcache">
<head>
...

The manifest itself consists mainly of a list of resources that should be cached and made available for offline use.

CACHE MANIFEST

CACHE:
# resources listed in this section are cached
styles.css
script.js
jquery.js
logo.png

NETWORK:
# resources matching those prefixes will be available when online
http://search.twitter.com/
http://api.twitter.com/

Here’s how a browser deals with appcache (simplified):

For more info, have a look at appcachefacts or the spec.
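
To watch that process in practice, one option is to log the events the browser fires on applicationCache while it checks the manifest and downloads resources. The sketch below is only an illustration: the event names come from the appcache spec, while watchAppcache and its log parameter are hypothetical names. The cache object is passed in as a parameter so the same function can be tried against a stub.

```javascript
// Event names defined for applicationCache in the appcache spec.
var APPCACHE_EVENTS = [
    'checking',     // the browser is fetching the manifest
    'noupdate',     // the manifest has not changed
    'downloading',  // resources listed in the manifest are being fetched
    'progress',     // fired for each resource being downloaded
    'cached',       // the first download of the cache has completed
    'updateready',  // a new version of the cache has been downloaded
    'obsolete',     // the manifest returned 404 or 410, the cache is dropped
    'error'         // the update failed, e.g. a resource could not be fetched
];

// watchAppcache is a hypothetical helper: it reports every appcache event
// to the `log` callback and returns a function that stops the reporting.
function watchAppcache(cache, log) {
    var listeners = APPCACHE_EVENTS.map(function(name) {
        var listener = function() { log(name); };
        cache.addEventListener(name, listener);
        return { name: name, listener: listener };
    });
    return function stop() {
        listeners.forEach(function(entry) {
            cache.removeEventListener(entry.name, entry.listener);
        });
    };
}

// In a browser that supports appcache:
if (typeof window !== 'undefined' && window.applicationCache) {
    watchAppcache(window.applicationCache, function(name) {
        console.log('appcache: ' + name);
    });
}
```

On a first visit the log would typically read checking, downloading, a series of progress events, then cached; on later visits it is either noupdate or a new download ending with updateready.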

If the appcache has been updated, the browser does not refresh the page to display the new content. It is however possible to listen to updateready events and offer users the option to reload the page:

if (window.applicationCache) {
    applicationCache.addEventListener('updateready', function() {
        if (confirm('An update is available. Reload now?')) {
            window.location.reload();
        }
    });
}

Is that an acceptable user experience?
Having to do that every time the content of your app is updated is not acceptable (that’s why appcache should not be considered a way to improve the performance of normal websites). It would be acceptable if all the content of the app were updated using XHR, and only the structural HTML, CSS and JS were listed in the cache and updated occasionally (this would actually be a simpler update mechanism than the ones used by native apps). But current websites are usually not developed this way, as this can cause some accessibility and crawling problems.

Storing and deleting content

To store the content of my app in a manageable way, some kind of database is required. Luckily enough, IndexedDB will soon be available in the five major browsers. It will allow tweets, maps or news items to be first stored in plain text, as JSON or HTML; and later listed, sorted, updated and deleted.
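
As a rough illustration of that storage step, here is a minimal sketch that writes a batch of items to IndexedDB. The database name reader, the store name items and the id key are assumptions made for the example, and saveItems is a hypothetical helper. The IndexedDB factory is passed as a parameter so the function can run wherever one is available.

```javascript
// saveItems is a hypothetical helper. `idb` is an IndexedDB factory
// (window.indexedDB in a browser); database and store names are made up.
function saveItems(idb, items) {
    return new Promise(function(resolve, reject) {
        var request = idb.open('reader', 1);

        // Runs the first time the database is opened: create the store.
        request.onupgradeneeded = function() {
            request.result.createObjectStore('items', { keyPath: 'id' });
        };
        request.onerror = function() { reject(request.error); };

        request.onsuccess = function() {
            var db = request.result;
            var tx = db.transaction('items', 'readwrite');
            var store = tx.objectStore('items');
            // put() inserts an item or overwrites the one with the same id.
            items.forEach(function(item) { store.put(item); });
            tx.oncomplete = function() { db.close(); resolve(items.length); };
            tx.onerror = function() { reject(tx.error); };
        };
    });
}
```

In a browser this could be called as saveItems(window.indexedDB, [{ id: 1, text: 'Hello' }]); listing, sorting and deleting would go through the same object store.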

What about the associated images?
That’s the tricky part when building a third-party app for a popular API such as Twitter or Google Maps: images loaded from a foreign origin cannot be turned into data URLs for security reasons (unless they’re served with CORS headers, but they’re not; and using a server as a proxy to set CORS headers doesn’t scale, as those APIs have IP-based usage limits).
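
The restriction shows up when trying to convert an image through a canvas: drawing a foreign image taints the canvas, and toDataURL() then throws a SecurityError. The sketch below assumes this canvas-based approach; imageToDataUrl is a hypothetical helper, and the document is passed as a parameter only to keep the function self-contained.

```javascript
// imageToDataUrl is a hypothetical helper. It returns a data URL for
// `img`, or null when the browser refuses because the image comes from
// a foreign origin and was not served with CORS headers.
function imageToDataUrl(doc, img) {
    var canvas = doc.createElement('canvas');
    canvas.width = img.naturalWidth;
    canvas.height = img.naturalHeight;
    // Drawing a foreign image without CORS approval taints the canvas.
    canvas.getContext('2d').drawImage(img, 0, 0);
    try {
        return canvas.toDataURL('image/png');
    } catch (e) {
        // Reading pixels back from a tainted canvas is a security error.
        if (e.name === 'SecurityError') {
            return null;
        }
        throw e;
    }
}
```

With a same-origin image the function returns a string that can be stored like any other text; with a Twitter avatar it would be expected to return null, which is exactly the problem described above.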

Those images can still be listed in the cache manifest: a manifest server could be built to keep track of what should be cached for each user; it would dynamically generate a new manifest every time an image has to be added or removed; and once the new manifest is available, the client would call applicationCache.update().
This is however absolutely unrealistic. First, a server is required, which is very unfortunate. Then, when updating the cache, the browser throws away all of its current content and loads all resources again… This means that every time a single image has to be added or removed from the cache, all other resources are downloaded again!
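
For illustration, the manifest generation on such a server could look like the sketch below. buildManifest and its parameters are hypothetical, and so is the example image URL. The version comment is a common convention: the browser only starts an update when the bytes of the manifest have changed.

```javascript
// buildManifest is a hypothetical helper that returns the text of a
// cache manifest for one user.
//   version: any string that changes whenever the lists change
//   cached:  URLs to store for offline use (CACHE: section)
//   network: URL prefixes that require a connection (NETWORK: section)
function buildManifest(version, cached, network) {
    var lines = ['CACHE MANIFEST', '# version ' + version, '', 'CACHE:'];
    lines = lines.concat(cached);
    lines.push('', 'NETWORK:');
    lines = lines.concat(network);
    return lines.join('\n') + '\n';
}

// Example: the app shell plus one (made up) profile picture.
var manifest = buildManifest(
    '42',
    ['styles.css', 'script.js', 'http://example.com/avatars/1.png'],
    ['http://api.twitter.com/']
);
```

The server would send this text with the text/cache-manifest MIME type. Adding a single image produces a new manifest, which is what triggers the full re-download criticized above.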

Can I build offline web apps, now?

App logic available offline: appcache
Text content available offline: indexedDB
App up-to-date when online: appcache + prompt user to reload
Content up-to-date when online: indexedDB + update using XHR
Foreign images available offline: not yet

How should appcache be improved?

Content update

Yehuda Katz’s suggestion is to let developers specify that the manifest should be checked for updates before the page is loaded from the cache. If it has been updated, the page should be downloaded from the network.
This sounds like a good idea when XHR updates are not an option, and would probably yield good performance. I’m just afraid this could be abused by many websites just to improve web performance, and users would end up with 1MB of HTML, CSS, JS and images on their disk for every site they ever visited in their life :-/
Ian Hickson suggested renaming the appcache to “offline application store”. Although this is unlikely to happen, I would also advocate using it only for offline purposes, and leaving caching strategies to the browser.
A compromise could be to adopt Yehuda’s solution, but always prompt users for permission to use the appcache. This should effectively prevent cache bloat. It is not the perfect solution though: some time ago I actually opened a bug in Firefox to remove the user prompt, because the application cache should not bother users.

Dynamic cache

The best suggestion I’ve seen to address the need for a dynamic cache is the defunct DataCache proposal. The idea is to add an independent and dynamic cache to the static appcache, to let developers add and remove resources from the cache at will.
I confess I had trouble understanding all the details of the spec, but here’s a naive and rough API I’m proposing:

applicationCache.dynamicStore.add( uri );
applicationCache.dynamicStore.remove( uri );
applicationCache.dynamicStore.update( uri );

var cacheTransaction = new applicationCache.dynamicStore.transaction();
cacheTransaction.add( uri );
cacheTransaction.add( anotherUri );
cacheTransaction.run();

Of course, it should be possible to listen to events such as “updated” on applicationCache.dynamicStore.

This spec also introduces the interesting concept of offlineHandlers that allows XHRs to be rerouted to client-side functions when offline.

navigator.registerOfflineHandler( "search.twitter.com/*", function( request, response ) {
  // search for results in indexedDB instead
  ...
});
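
To give an idea of how such rerouting could behave, here is a toy model written in plain JavaScript. It is not the DataCache API: createOfflineRouter, register and request are hypothetical names, a trailing * is treated as a simple prefix wildcard, and the online check and the network function are passed in as parameters instead of relying on navigator.onLine and XHR.

```javascript
// createOfflineRouter is a hypothetical facade placed in front of the
// network: when offline, requests matching a registered pattern are
// answered by a client-side function instead.
function createOfflineRouter(isOnline, sendToNetwork) {
    var handlers = [];
    return {
        // pattern: a URL prefix, optionally ending with "*"
        register: function(pattern, handler) {
            handlers.push({
                prefix: pattern.replace(/\*$/, ''),
                handler: handler
            });
        },
        request: function(url) {
            if (!isOnline()) {
                for (var i = 0; i < handlers.length; i++) {
                    if (url.indexOf(handlers[i].prefix) === 0) {
                        return handlers[i].handler(url);
                    }
                }
            }
            // Online, or no handler registered: use the network as usual.
            return sendToNetwork(url);
        }
    };
}

// Usage with stand-ins for the connection state and the real XHR call.
var online = false;
var router = createOfflineRouter(
    function() { return online; },
    function(url) { return 'network response for ' + url; }
);
router.register('http://search.twitter.com/*', function(url) {
    // A real handler would search for results in indexedDB instead.
    return 'stored results for ' + url;
});
```

While online is false, router.request() on a search.twitter.com URL is answered by the registered function; any other URL still goes to the network function.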

I definitely advise you to take a look at the code examples.

Conclusion

We’re not there yet and unfortunately I’ve got no ideal solution to the content update problem that provides both a good developer and user experience. “What Yehuda says + permission prompt” is the least worst option I can think of.
To make the appcache dynamic, I would strongly suggest giving a second chance to DataCache, and maybe simplify the API a bit.

The W3C will soon gather to think about the future of offline web apps, so if you’ve got any valuable input, speak up!

8 thoughts on “Offline Web Applications, we're not there yet.”

  1. jpvincent

    “I would also advocate using it only for offline purposes, and leaving caching strategies to the browser”
    The problem on mobile is the ridiculous limits of browser caches: from memory, in iOS 4 you can cache 25KB per object, with a total per page of around 200KB. On desktop, I agree that a classic caching policy already works very well, even with older browsers.
    “The idea is to add an independent and dynamic cache to the static appcache, to let developers add and remove resources from the cache at will”
    I agree that the “all or nothing” side of the current appcache mechanism is a bit brutal, but it is also simpler to grasp and it matches application logic: the version of an app is incremented as soon as a single one of its components changes. Treating a page as a single bundle of files is very convenient for maintenance and avoids hard-to-find bugs.
    To invalidate and re-fetch JS / CSS and other resources and store them in a cache, I’ve seen a technique on mobile that uses localStorage: you get 5MB, you can fetch all the files you’re interested in via XHR and manage them independently, and you only execute your CSS or JS when you need it. That does mean a lot of code, of course :)

    1. louisremi Post author

      The fact that some websites could want to use the appcache to increase webperfs is understandable. All I’m saying is that it shouldn’t be too easy, otherwise the 16GB of your iPhone will soon be full of JS and CSS. And Apple should raise this 25KB limit for sure.

      Regarding the dynamic cache, I’m of course aware of the possibility to put JS and CSS in localStorage, but it’s impossible with foreign images. And the problem will be the same with videos and audio files.

  2. Joseph

    In some cases you don’t need to wait. Sections of the DataCache API can be implemented on top of client side storage like IndexedDB, Web SQL, or DOM Storage if you put a facade around XHR. I did this with an earlier draft of the DataCache API and put the code up on github. As you mentioned earlier, this won’t work with foreign content but it at least does let you experiment with the APIs and get an idea of how it can be simplified and improved.

    You’ve rightly pointed out some limitations and room for improvement with the existing offline web applications solutions. I’ve seen similar ideas and concerns raised by others as more developers are using application caches. This post does a good job articulating concrete problems for an understandable and relatable use case.

    1. louisremi Post author

      Wow, I should check comments on my blog more often.
      No, I wasn’t aware of this moz-specific API. This is great stuff, exactly what I would like to see added to the spec.

