Web, REST, Hypermedia, Java, Framework
November 19, 2013

Ideas for a hypermedia framework

(tl;dr) Watch this presentation video; this document describes a derivative of the architecture it presents.


The server offers an HTTP remote API, making the state of its resources available. No session-state is managed on the server.

The server delivers hypermedia representations of resource state. The only relevant hypermedia format is HTML. These representations contain semantic descriptors with data and allowed affordances (links and form controls). The current state of the application is therefore included in the messages between client and server; the messages are self-descriptive (HATEOAS).

Semantic descriptors identify parts of the HTML representation with application semantics (microformat). The same markup strategy must be used to declare the relationship of links and form controls with the current, and potentially the resulting, resource representation. Application semantics and controls must be documented with a human-readable profile, written in HTML and accessible to users of the API.
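
A minimal sketch of such a representation, rendered with plain string building. The descriptor names (order, order-id, order-status), the form relation (cancel), and all URLs are hypothetical; a real application would take them from its profile. Only the "profile" and "self" link relations are standard names.

```java
final class OrderRepresentation {

    // Escape text for inclusion in HTML element content and attribute values.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Render one resource as an HTML fragment. Data sits in elements marked
    // with semantic descriptors (class), links declare their relation (rel),
    // and the form declares its relation with the same class strategy.
    static String render(String id, String status,
                         String selfUrl, String cancelUrl, String profileUrl) {
        return "<div class=\"order\">\n"
             + "  <a rel=\"profile\" href=\"" + escape(profileUrl) + "\">About this representation</a>\n"
             + "  <a rel=\"self\" href=\"" + escape(selfUrl) + "\">Order <span class=\"order-id\">"
             + escape(id) + "</span></a>\n"
             + "  <span class=\"order-status\">" + escape(status) + "</span>\n"
             + "  <form class=\"cancel\" method=\"post\" action=\"" + escape(cancelUrl) + "\">\n"
             + "    <button type=\"submit\">Cancel order</button>\n"
             + "  </form>\n"
             + "</div>\n";
    }
}
```

Calling OrderRepresentation.render("42", "shipped", "/orders/42", "/orders/42/cancel", "/profiles/order") yields a fragment that a text-only browser can display and that a script can select by descriptor, independent of the surrounding markup.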

The server may send styling information to a client. These stylesheets should select semantic descriptors, as well as link/form relations; they should not be coupled to the structure of an HTML representation.

The server may send executable client code on demand, depending on classification of the client making a request.

Resource state may be manipulated by clients with HTML form-encoded representations.
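
On the server side, a form-encoded representation is just a list of name/value pairs. A rough sketch of decoding one with the JDK alone follows; the class name FormRepresentation is made up for illustration, and the sketch assumes UTF-8 and keeps only the last value when a name repeats.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormRepresentation {

    // Parse an application/x-www-form-urlencoded body into ordered fields.
    // Simplification: a repeated field name keeps only its last value.
    static Map<String, String> parse(String body) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : body.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate empty bodies and stray '&'
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(decode(name), decode(value));
        }
        return fields;
    }

    // URLDecoder handles both '+' (space) and %XX escapes.
    private static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

For example, FormRepresentation.parse("status=cancelled&note=out+of+stock%21") gives the fields status and note, the latter decoded to "out of stock!".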


The client renders HTML hypermedia resource representations received from a server.

The application transitions to a new state when the client follows links or submits forms. The only URL known by a client is the published index URL of the server's API. Every representation must be accessible from the starting representation, via a state transition. All URLs are opaque to a client; the server controls the URL namespace of the API (versioning, rolling upgrades). The client is only allowed to construct URLs with standardized routines of the hypermedia format (HTML GET forms).
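
The one permitted URL construction, a GET form submission, might look like this in an automated Java client. This is a sketch under assumptions: GetFormUrl is a made-up helper, the example URLs are placeholders, and the encoding is UTF-8. The client never invents a path; it only combines the action given in the form with the encoded form fields.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLEncoder;
import java.util.Map;

final class GetFormUrl {

    // Build the request URL for an HTML form with method="get".
    // documentUrl: URL of the representation that contained the form.
    // action:      the form's action attribute, exactly as given by the server.
    // fields:      successful controls of the form, in document order.
    static URI build(URI documentUrl, String action, Map<String, String> fields) {
        StringBuilder query = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (query.length() > 0) {
                query.append('&');
            }
            query.append(encode(field.getKey())).append('=').append(encode(field.getValue()));
        }
        // Resolve a relative action against the document URL, then replace
        // any existing query (and fragment) with the encoded form data.
        String base = documentUrl.resolve(action).toString().split("[?#]", 2)[0];
        return URI.create(base + "?" + query);
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }
}
```

Everything else the client dereferences is an href or action copied verbatim from a representation.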

A human-driven client may render an HTML representation faithfully. It may transclude representations obtained by following embedded links in the rendering (images, frames). A user with a web browser, without scripting, must be able to use the whole API of the application (accessibility). Humans may use less faithful clients, which interpose their own interpretation of resource representations (one-off clients).

Automated clients (crawler, monitor, script, agent) may render representations, follow links, and potentially submit forms, to reach their goal.

Clients may evolve independently from servers, depending on their degree of knowledge of application semantics. A client must not rely on the structure of an HTML representation except for containment of semantic descriptors. It may select descendant descriptors of known descriptors, or select and invoke known link relations and form controls. Generic clients for well-known application profiles are possible.
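
To make the containment rule concrete, here is a sketch of a client that selects a descriptor only by its name and by the known descriptor that contains it. It assumes the representation is well-formed XHTML without a DOCTYPE, so the JDK's XML parser and XPath can be used; the class name and the descriptor names are invented for the example, and descriptor names are assumed to be simple tokens without quotes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class DescriptorSelector {

    // Return the text of every element carrying `descriptor` that is a
    // descendant of an element carrying `container`, whatever tags are used.
    static List<String> select(String xhtml, String container, String descriptor) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            String expression = "//*[" + hasClass(container) + "]//*[" + hasClass(descriptor) + "]";
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent().trim());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalArgumentException("representation is not well-formed XHTML", e);
        }
    }

    // XPath 1.0 idiom for "class attribute contains this whole token".
    private static String hasClass(String name) {
        return "contains(concat(' ', normalize-space(@class), ' '), ' " + name + " ')";
    }
}
```

The same call returns the same value whether the server renders the order as a div with a span or as a table with a bold cell, which is what lets server markup and client evolve independently.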

Clients may understand semantic descriptor proxies for transclusion ('self' links), allowing lazy loading of resource state.

If a client does not understand some part of a form control, it must send that part as given to the server.
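
A sketch of that pass-through rule for an automated client, with hypothetical names throughout: the client starts from every field the form provided (hidden inputs, defaults), overrides only the fields it understands, and never adds fields the form did not offer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FormSubmission {

    // given:      all name/value pairs as they appeared in the form control.
    // understood: values the client wants to set for fields it knows.
    // Fields the client does not understand are returned unchanged.
    static Map<String, String> fill(Map<String, String> given, Map<String, String> understood) {
        Map<String, String> submission = new LinkedHashMap<>(given);
        for (Map.Entry<String, String> entry : understood.entrySet()) {
            // Only touch fields the server actually offered in this form.
            if (submission.containsKey(entry.getKey())) {
                submission.put(entry.getKey(), entry.getValue());
            }
        }
        return submission;
    }
}
```

This lets the server add new form fields later without breaking older clients.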

Clients may compose controls by obtaining affordances applicable to analyzed resource representations from distributed sources.


HTTP/OAuth authentication identifies the principal of a request; server-side session cookies must not be used. Cross-site request forgery is prevented by requiring, on the server, a double-submit of form representations (form digest in out-of-band cookie) for unsafe or non-idempotent actions.
Fine-grained hypermedia messages and lazy-loading clients may increase latency. Servers may create coarse-grained messages and eagerly compose larger representations. Utilize HTTP 2.0 multiplexing to reduce the cost of connections. Utilize gzip compression of HTML representations to reduce bandwidth usage.
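
The server-side half of a double-submit check could be sketched as follows. Note the substitution: instead of a digest computed over the form, this sketch uses a random token that the server places both in a hidden form field and in a cookie. The class and method names are hypothetical, and no session state is kept on the server.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

final class DoubleSubmit {

    private static final SecureRandom RANDOM = new SecureRandom();

    // Create an unguessable value to embed in the form and set as a cookie.
    static String newToken() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    // Accept an unsafe request only if the submitted form value and the
    // cookie value are both present and identical. MessageDigest.isEqual
    // compares in constant time, so timing does not leak a partial match.
    static boolean matches(String formValue, String cookieValue) {
        if (formValue == null || cookieValue == null || formValue.isEmpty()) {
            return false;
        }
        return MessageDigest.isEqual(
                formValue.getBytes(StandardCharsets.UTF_8),
                cookieValue.getBytes(StandardCharsets.UTF_8));
    }
}
```

A forged cross-site form can make the browser send the cookie, but it cannot read the cookie to copy its value into the form body, so the two values will not match.
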
HTTP caching is highly effective in this architecture: any resource representation, style information, and even profile representations can be cached with flexible server-side policies. HTTP caches can be added on the client and between client and server (reverse proxy cache).
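
One such server-side policy is validation with entity tags. The sketch below derives an ETag from the rendered HTML and decides between a full response and 304 Not Modified. It is an illustration only: the names are invented, the tag is simply a SHA-256 hash of the body, and the If-None-Match header is split naively on commas, which is safe here because the generated tags are hexadecimal.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ConditionalGet {

    // Derive a quoted entity tag from the representation's bytes.
    static String etag(String representation) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(representation.getBytes(StandardCharsets.UTF_8));
            StringBuilder tag = new StringBuilder("\"");
            for (byte b : hash) {
                tag.append(String.format("%02x", b & 0xff));
            }
            return tag.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required by the Java platform", e);
        }
    }

    // Status for a GET: 304 if any tag in If-None-Match matches the current
    // tag (ignoring a weak "W/" prefix) or the header is "*", otherwise 200.
    static int status(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null) {
            return 200;
        }
        for (String candidate : ifNoneMatch.split(",")) {
            String tag = candidate.trim();
            if (tag.startsWith("W/")) {
                tag = tag.substring(2);
            }
            if (tag.equals("*") || tag.equals(currentEtag)) {
                return 304;
            }
        }
        return 200;
    }
}
```

A browser cache or reverse proxy that revalidates with If-None-Match then receives an empty 304 response as long as the representation has not changed.
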
Public and standardized profiles as well as link and form relations should be used whenever possible to describe application semantics. However, the format of a profile must be HTML and declared with a URL in a resource representation. Profiles should be interlinked. RDFa/RDF schema may be a future option.

Proposed Java stack

The development experience and programming model must be seamless: any part of the system can be previewed, examined, and tested with even a text-only web browser (lynx). Applications are written in an incremental fashion, adding more powerful client functionality without compromising accessibility (a broken escalator is stairs): first a text-only browser, then a graphical desktop or mobile browser, then a graphical browser with the script engine enabled.

Implement the HTTP remote interface with JAX-RS in a Java EE runtime. RESTEasy has rudimentary support for mapping form representations to Java classes/properties. The Java EE container can run in debug mode, allowing preview in a web browser without redeployment, unless a class signature changes. Redeploy can then be performed with a shortcut.

Write HTML templates on the server with Thymeleaf in XHTML mode, enabling preview with or without deployment, as well as instant live-editing of content and style (IntelliJ & Chrome). JSP (yes) may be a viable alternative.

Deliver code-on-demand with HTML (not XHTML!) representations as cross-compiled JavaScript, based on GWT. Utilize gwtquery to select semantic descriptors and controls in the host document, then manipulate the DOM to interpose a custom interpretation of the resource representation.



(Update: More comments here)