Rediscovering Web 1.0 Principles in Virtual Worlds
The dream of three-dimensional cyberspace was imagined decades ago in a world that did not yet have the technology to realize it. Meanwhile, a different kind of interactive world was forming on the World Wide Web, and web developers have been unlocking its potential ever since. History is catching up with us, and it carries an important lesson, one fundamental to all interactive worlds (not just the web), that could redefine the trajectory of the metaverse’s technological future. In this article I talk about a new open source library I’ve created, HyperShape, which implements 3D worlds in a fundamentally different way, and about what I discovered while making it.
On places like Hacker News, there is an ongoing conversation about web applications and alternatives to React. The root emotion of this conversation seems to be burnout, and its reality is wasted money and time. Web developers are starting to suspect that their craft has become unnecessarily complex. Anyone in this industry knows the feeling of endless web framework churn. Like a frog in a boiling pot, you wake up one day and suddenly hooks, virtual DOM, webpack, functional components powered by immutability all seem indispensable. Some are taking a step back and asking: do things really need to be like this?
Certainly, some applications require high performance, componentization, and cascades of state. Google Docs is a good example of a web app that pushes the limits of performance and properly requires advanced usage of JavaScript. But more often we deal with more basic problems: settings screens, basic navigation, etc. We don’t need to bring out the whole armada of JavaScript libraries to make these, and this is where I think the seed of doubt about our current industry lies: in a sense that what we are building is inappropriate for the problems we work on.
When we use multiple layers of abstraction (virtual DOM diffed UI, server side rendering, state, API clients, API handlers, ORMs, etc.) to make a simple email submission form, for instance, it not only takes time to create these abstractions (hoping for the day we’ll encounter a complicated frontend that fully utilizes them), it also alienates folks like backend developers, who increasingly feel like they can’t understand the most basic frontend code. We’ve raised the bar of what it means to be a full-stack developer, to the point that some doubt such a role is even possible anymore. All this increases business costs or, if it’s your personal project, chips away at your precious time.
The New Old Way
Recently, technologies like HTMX have come onto the scene, saying with a tempting whisper that there’s another way. They pick up old ideas, ones they argue are simpler, and reapply them. For those who aren’t familiar with the library, HTMX tries to pick up where the HTML spec left off. To quote their website:
Why should only <a> and <form> be able to make HTTP requests?
Why should only click & submit events trigger them?
Why should only GET & POST methods be available?
Why should you only be able to replace the entire screen?
They believe these ideas could radically improve HTML as a language for making interactive web pages. HTMX makes a tradeoff by making the frontend radically thinner and tilting logic to the backend. It does this with a JavaScript library that extends the functionality of HTML elements in ways its authors believe are a logical extension of the original HTML spec (and that hopefully let you avoid writing scripts almost entirely). The programmers behind HTMX believe the browser already has a great deal of tech inside it that is not being fully utilized.
To give an example from their website:
<div hx-target="this" hx-swap="outerHTML">
  <div><label>First Name</label>: Francisco</div>
  <div><label>Last Name</label>: Danconia</div>
  <div><label>Email</label>: francisco@danconiacopper.com</div>
  <button hx-get="/contact/1/edit" class="btn btn-primary">
    Click To Edit
  </button>
</div>
Notice how the button indicates that an HTTP GET request will be sent when it’s clicked, and that a specific <div> element will be replaced with the response. What does the endpoint “/contact/1/edit” return? More HTML. This is a shift away from the common practice of JavaScript-driven UIs, whose APIs return JSON data rather than usable hypertext that carries its own state.
<form hx-put="/contact/1" hx-target="this" hx-swap="outerHTML">
  <div>
    <label>First Name</label>
    <input type="text" name="firstName" value="Francisco">
  </div>
  <div class="form-group">
    <label>Last Name</label>
    <input type="text" name="lastName" value="Danconia">
  </div>
  <div class="form-group">
    <label>Email Address</label>
    <input type="email" name="email" value="francisco@danconiacopper.com">
  </div>
  <button class="btn">Submit</button>
  <button class="btn" hx-get="/contact/1">Cancel</button>
</form>
We see how little JavaScript is necessary to replace a small island of HTML in our website with an editable variant. The state is the HTML itself. This is a general theme of HTMX.
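To make the server's side of this exchange concrete, here is a minimal Python sketch of what might sit behind “/contact/1/edit”. It is not taken from HTMX or its documentation: the CONTACTS dictionary and the render_edit_form function are hypothetical names, and a real application would sit behind a web framework and a database. The only point is that the response is a ready-to-use HTML fragment, not JSON.

```python
from html import escape

# Hypothetical in-memory "database"; the record mirrors the example above.
CONTACTS = {
    1: {
        "firstName": "Francisco",
        "lastName": "Danconia",
        "email": "francisco@danconiacopper.com",
    },
}


def render_edit_form(contact_id: int) -> str:
    """Return the HTML fragment a GET to /contact/<id>/edit could respond with."""
    contact = CONTACTS[contact_id]
    # (label, input type, field name) for each editable field.
    fields = [
        ("First Name", "text", "firstName"),
        ("Last Name", "text", "lastName"),
        ("Email Address", "email", "email"),
    ]
    rows = "\n".join(
        f"  <div><label>{label}</label>"
        f'<input type="{kind}" name="{name}" '
        f'value="{escape(contact[name], quote=True)}"></div>'
        for label, kind, name in fields
    )
    return (
        f'<form hx-put="/contact/{contact_id}" hx-target="this" hx-swap="outerHTML">\n'
        f"{rows}\n"
        '  <button class="btn">Submit</button>\n'
        f'  <button class="btn" hx-get="/contact/{contact_id}">Cancel</button>\n'
        "</form>"
    )


if __name__ == "__main__":
    print(render_edit_form(1))
```

The current values are escaped and written straight into the markup, so the fragment the browser receives is both the view and the state.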
The authors of HTMX are huge proponents of the esoteric world of hypermedia systems. They wrote a book on the topic that addresses common questions like “Why didn’t this spec advance further?” and “How could it have been made better?” and, more importantly, where all these ideas came from in the 1980s and 1990s.
Primitive elements of an interactive World Wide Web
You might be wondering why looking to the past is of any value. Looking into the origins of Web 1.0 teaches the history of how it evolved, and the impact at each stage of evolution offers insight for today — in particular, the point where the World Wide Web became a place of interactive applications rather than just a read only network of research data. History can show us the absolutely essential requirements for the web — that which cannot be removed without destroying our concept of what’s possible to do.
The first essential element came from the original web standard, as a solution to a desperate documentation problem at CERN. Scientists and university academics had information piling up in siloed servers across CERN, and there was a need for this information to be shared. This was in 1989, so computers had been around for some time; no one had yet put all the elements together into the idea of the web (and a wiki was far from people’s imaginations at the time).
First, the information had to be shareable just to be read. This led to a language for describing the structure of text and, most importantly, to the first HTML element that unlocked the power of the web.
“The WorldWideWeb (W3) is a wide-area hypermedia information retrieval initiative aiming to give universal access to a large universe of documents.”
The World Wide Web project (the internet’s first web page)
The link
The humble <a> tag or anchor link is a powerful idea. It is what allows us to move around from document to document. The World Wide Web was built to be unstructured. There was no intrinsic hierarchy to the information on the web (though Yahoo and others tried to impose one!). Links allow exploration without a fixed path — as many who get lost in Wikipedia are familiar with.
This innovation alone allowed a plethora of information to be explored and shared, and unlocked the read-only World Wide Web.
The form
People’s imaginations were not satisfied with read-only browsers, and it didn’t take them long to think of ways that browsers could send information. This goal was realized in 1993 with the standardization of the <form> element, which became the first HTML element that could explicitly send data to servers. Standardized input controls in browsers unlocked the revolution of web applications. Imagine the World Wide Web transforming from a place where you could navigate pre-structured information about chemicals or celebrities, to a place where you could make search queries, order pizza, and send emails via web interfaces.
If the history of web forms interests you I recommend checking out The Evolution of Web Forms.
The foundations of an interactive world of information
With just these two elements, an entire textual world of modifiable data experienced by “Web surfers” could be created. What this suggests, surprisingly, is that JavaScript, CSS and so on are non-essential to the basic functionality of the Web. Yes, there were further advancements that allowed more sophisticated website experiences, but the link and form were the foundations to create the most minimal web as we know it. These foundations were heavily used by the software architecture of the server-side web applications in the early 2000s, powering the internet for over a decade.
Virtual worlds and the Web
Interactive experiences are not only of the flat 2D variety, but also exist in 3D. Typically these experiences occur in video games that attempt to maximize the experience of the user, and not creators (especially users who want to create entirely unique personal content). After reading about HTMX and the history of Web 1.0, I became curious how these basic principles could be applied to 3D environments.
Certainly, all the technology is now here to create 3D worlds with JavaScript. Browsers in the 2020s are not the same as in the 1990s. We now have a bevy of technologies at our fingertips: Canvas, WebGL, WebAudio, WebGPU, WebSocket, WebRTC, WebXR. These technologies come with the expectation of a lot of JavaScript to use them. I wanted to explore the question of whether virtual worlds could be expressed without JavaScript, but with as much power as Web 1.0.
Before we continue, I want to clearly define some terms:
3D world - Any 3D environment, including the most basic static artwork-like experience that can be roamed around.
Virtual world - A 3D world in which multiple players can see and interact with each other and with the world.
Metaverse - An interlinked collection of virtual worlds, including worlds created by different people.
Web 1.0-Era Virtual Worlds
Looking back in time, we can find artifacts of similar methods of describing 3D worlds that tried to anticipate the future during the Web 1.0 era. VRML was one such technology introduced in 1994, aimed at 3D visualization using a hypertext language similar to HTML. It was envisioned by David Raggett as a potential language for the World Wide Web for 3D.
I will be the first to say I am not a VRML expert, but imagine a markup language that defines explorable worlds. These worlds were meant to be interactive and scriptable (not with JavaScript originally). At this time there were not highly capable browsers, so VRML relied on specialized viewer applications.
A simple scene might be built with code like:
#VRML V2.0 utf8
DirectionalLight {
  direction 0 -1 0
}
# A yellow cone rotated about 57 degrees from vertical.
Transform {
  rotation 0 0 -1 1
  children [
    Shape {
      appearance Appearance {
        material Material { diffuseColor 1 1 0 }
      }
      geometry Cone {
        bottomRadius 2
        height 8.1
      }
    }
  ]
}
When building my own virtual world technology inspired by the history of Web 1.0, I knew it had to have at minimum the concepts of a link and a form. Unlike VRML, it can take advantage of many advancements in today’s web graphics world:
Libraries like Three.js for 3D visualizations (which have solved many important problems like 3D model import, common shaders, and abstraction of graphics hardware).
The standardization of GLTF as a file format for 3D on the web, optimized for transmission speeds in browsers.
The standardization and implementation of Physically Based Rendering (PBR) as a shading approach whose lighting models make things beautiful.
The imminent cross-browser standardization of WebGPU, using cutting edge multi-threaded graphics APIs like Vulkan and Metal.
The extension of HTML elements with Web Components, allowing me to more easily make up an HTML-like markup language.
Various implementations of WebXR for augmented reality and the complex types of 3D worlds I think we can expect in the future.
A new hypertext for virtual worlds
Let’s imagine a simple Web 1.0-era markup language. Let’s start first with creating the world.
<mv-space>
</mv-space>
Think of this as the creation of an axis you can attach 3D objects to. Which objects? GLTF 3D models of course.
<mv-space>
  <mv-model src="fox.gltf" position="0,1,0"></mv-model>
</mv-space>
Now remember, Web 1.0 didn’t have JavaScript. We need to create a world where interactivity is built into our hypertext language itself. Let’s implement the first powerful idea: the link! Imagine that my friend and I want to create two sites that link to each other. I have a website at https://www.mr-fox.com/fox.html:
<mv-space>
  <mv-model src="fox.gltf" position="0,1,0"></mv-model>
  <mv-link href="https://www.mrs-rabbit.com/rabbit.html">
    <mv-model src="next_arrow.gltf" position="2,1,0"></mv-model>
  </mv-link>
</mv-space>
My friend might have her own website at the URL https://www.mrs-rabbit.com/rabbit.html:
<mv-space>
  <mv-model src="rabbit.gltf" position="0,1,0"></mv-model>
  <mv-link href="https://www.mr-fox.com/fox.html">
    <mv-model src="back_arrow.gltf" position="-2,1,0"></mv-model>
  </mv-link>
</mv-space>
We can now bounce back and forth between each other’s sites by clicking on the appropriate 3D element link. (Internet history-conscious or older readers might recognize this as an old school “web ring,” such as on Geocities and Angelfire).
The simple and powerful idea of the hyperlink alone could create a virtual world that’s read-only and explorable. Servers could dynamically generate whole worlds for the user to explore based on just this idea and the URLs they link to. They need not only be static models: for example, they could include dynamic changes driven by time of day or weather. It’s in the power of the server to generate the world people see.
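As a rough illustration of a server generating the world, here is a Python sketch that picks a different sky model depending on the server's clock. Everything beyond the markup already shown is an assumption: render_fox_world is a hypothetical function, and day_sky.gltf and night_sky.gltf are made-up asset names standing in for whatever a real world would use.

```python
from datetime import datetime
from typing import Optional


def render_fox_world(now: Optional[datetime] = None) -> str:
    """Build the fox page, choosing a sky model from the time of day."""
    now = now or datetime.now()
    # Hypothetical asset names; only fox.gltf and next_arrow.gltf
    # appear in the examples above.
    sky = "day_sky.gltf" if 6 <= now.hour < 18 else "night_sky.gltf"
    return "\n".join([
        "<mv-space>",
        f'  <mv-model src="{sky}" position="0,0,0"></mv-model>',
        '  <mv-model src="fox.gltf" position="0,1,0"></mv-model>',
        '  <mv-link href="https://www.mrs-rabbit.com/rabbit.html">',
        '    <mv-model src="next_arrow.gltf" position="2,1,0"></mv-model>',
        "  </mv-link>",
        "</mv-space>",
    ])


if __name__ == "__main__":
    print(render_fox_world())
```

The visitor's browser never runs any logic here; it simply receives whichever world the server decided to describe at that moment.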
We shouldn’t stop here though! To extend our virtual world technology to the full potential of an application platform, we need to add the idea of a form. Forms are what allow our browsers to send information to server endpoints and see rendered hypertext results. In an HTML form, we have a submit button — let’s create a parallel idea in our hypertext language.
<mv-space>
  <mv-model src="rabbit.gltf" position="0,1,0"></mv-model>
  <mv-link href="https://www.mr-fox.com/fox.html">
    <mv-model src="back_arrow.gltf" position="-2,1,0"></mv-model>
  </mv-link>
  <mv-form action="http://favoriteanimal.com/like/rabbit" method="POST">
    <mv-input type="submit">
      <mv-model src="thumbs_up.gltf" position="0,3,0"></mv-model>
    </mv-input>
  </mv-form>
</mv-space>
And here we have a “Like” button! (Alas, we are on the web and we do like to “Like” things.) But wait, couldn’t we have just linked to a special page and had the server record the Like whenever it received the GET request? Yes, and that brings us to what really sets forms apart: input controls. HTML comes standard with a number of controls (text fields, checkboxes, dropdowns, password fields, etc.). Their purpose is to feed named values into the HTTP request sent to the form’s action URL on submission.
<mv-space>
  <mv-model src="rabbit.gltf" position="0,1,0"></mv-model>
  <mv-link href="https://www.mr-fox.com/fox.html">
    <mv-model src="back_arrow.gltf" position="-2,1,0"></mv-model>
  </mv-link>
  <mv-form action="http://favoriteanimal.com/like/rabbit" method="POST">
    <mv-input type="text" position="0,4,0" name="email"></mv-input>
    <mv-input type="submit">
      <mv-model src="thumbs_up.gltf" position="0,3,0"></mv-model>
    </mv-input>
  </mv-form>
</mv-space>
With this text input, the user’s email is sent along with their “Like”, which is what forms were envisioned for: carrying user-supplied data to the server.
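On the receiving end, handling that submission can be very small. The Python sketch below assumes the form arrives as an ordinary URL-encoded body, as an HTML form would send it; I am not claiming that is exactly how HyperShape encodes it. The handle_like function, the LIKES list, and the KNOWN_ANIMALS set are hypothetical names for illustration.

```python
from urllib.parse import parse_qs

# Hypothetical in-memory store of (animal, email) pairs.
LIKES = []

# Animals this hypothetical server knows how to render.
KNOWN_ANIMALS = {"rabbit", "fox"}


def handle_like(animal: str, body: str) -> str:
    """Record a Like from a URL-encoded form body and return the next world."""
    if animal not in KNOWN_ANIMALS:
        raise ValueError(f"unknown animal: {animal}")
    fields = parse_qs(body)
    # parse_qs maps each name to a list of values; take the first email, if any.
    email = fields.get("email", [""])[0]
    LIKES.append((animal, email))
    # Respond with hypertext, just as a Web 1.0 form handler would.
    return (
        "<mv-space>\n"
        f'  <mv-model src="{animal}.gltf" position="0,1,0"></mv-model>\n'
        '  <mv-model src="thumbs_up.gltf" position="0,3,0"></mv-model>\n'
        "</mv-space>"
    )


if __name__ == "__main__":
    print(handle_like("rabbit", "email=fox%40mr-fox.com"))
    print(LIKES)
```

The response is another world description, so the server decides what the visitor sees after liking the rabbit.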
So there we have it, the two foundational primitives that powered Web 1.0 — now driving a virtual world. I call this hypertext language technology HyperShape, and it’s an implemented library you can play with right now.
This library was created primarily to show the most essential primitives of a hypertext language applied to a 3D world. I’m not expecting anyone to go and create a high-quality game using this technology in its current form, but I think this library could be used for thinking about how a virtual world might be created from radically different first principles. Libraries like HTMX are founded on similar principles and address a similar problem: JavaScript burnout. I wanted an interactive 3D technology usable by backend people that was as familiar as web forms are today.
Did I just invent a new industry role of backend 3D graphics programmer? I’m not sure I want to put that on my resume just yet, but it’s an interesting consideration: why shouldn’t server side developers be participating more in massively interactive worlds, with simpler tools to implement visualization and interactions? After all, they often have many adjacent skill sets that might be applied to other aspects of virtual worlds: identity systems, data storage, and collaborative networking. Why not integrate those with a good hypertext language, rather than requiring them to become overly specialized frontend JavaScript 3D developers?
Thoughts on a Web 1.0 metaverse and beyond
I imagine a potential future where many backend 3D developers are creating interconnected worlds with each other. I want to emphasize that even with the principles of a link and form I believe a huge variety of metaverses could be made, but … I think to make great-feeling metaverses, some additional technological considerations are needed that take into account people’s aesthetic needs.
Augmented reality
Augmented reality presents a new frontier, and we can see the implementation playing out right now. I think it’s important to be forward thinking from the beginning with HyperShape; I call the axis of my 3D objects <mv-space> in anticipation of worlds with multiple spaces. There are already implementations in the Oculus Quest browser for:
anchor points (fixed points in 3D space you want the device to remember)
specific types of planes representing real-world surfaces (walls, tables, etc.)
HyperShape will be built with the expectation that there are multiple spaces content could be attached to. In your house there might be many spaces managed by various people. You might have a chess board attached to your coffee table run by your friend’s server and your own calendar hanging on your wall run by your own infrastructure.
Form limitations
HTML forms have various inputs which provide values to form data, but one feature they lack is the ability to pass HTTP headers along with their input data. Given the nature of virtual worlds, I thought it was crucial to pass along contextual information about the avatar by default, so the server can make informed decisions about what world to render next. In particular, the camera position and look-at position are sent as hidden fields with every form submission.
Every form POST comes with form data items like these:
metaverse-camera-position 0,1.75,3
metaverse-camera-lookat 0,0,0
... other form data
I believe this is essential to enabling metaverse exploration in a way that is easy for servers.
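Here is a small Python sketch of how a server might read those two fields. It assumes a URL-encoded form body and the comma-separated "x,y,z" format shown above; Vec3, parse_vec3, and camera_from_form are hypothetical names of my own, not part of HyperShape.

```python
from typing import NamedTuple, Tuple
from urllib.parse import parse_qs


class Vec3(NamedTuple):
    x: float
    y: float
    z: float


def parse_vec3(text: str) -> Vec3:
    """Parse an 'x,y,z' string such as '0,1.75,3' into a Vec3."""
    parts = [float(part) for part in text.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected three components, got {text!r}")
    return Vec3(*parts)


def camera_from_form(body: str) -> Tuple[Vec3, Vec3]:
    """Extract the camera position and look-at point from a form body."""
    fields = parse_qs(body)
    position = parse_vec3(fields["metaverse-camera-position"][0])
    lookat = parse_vec3(fields["metaverse-camera-lookat"][0])
    return position, lookat


if __name__ == "__main__":
    body = (
        "metaverse-camera-position=0%2C1.75%2C3"
        "&metaverse-camera-lookat=0%2C0%2C0"
        "&email=fox%40mr-fox.com"
    )
    print(camera_from_form(body))
```

With the camera known, a handler could decide, for example, to render only what lies in front of the visitor.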
There might be additional technologies required for form submission. I am exploring the potential of:
Using WebSockets or other networking technologies for real time submission of data.
The ability to declaratively auto-submit after a certain amount of time, or submit on an interval or by some other event (like movement of a camera).
How to handle authentication information passed along to forms in a scalable way (perhaps more complex than hidden input fields). Particularly, I'd like a way that doesn’t involve cookies, to account for various legalities.
Forms are the sole point of data submission in this hypertext language. HTMX exposed the limitations of forms in our web applications, and I believe there are things I can learn from HTMX that are specific to the concerns of metaverses.
View transition technologies
HTML recently gained the capabilities of the View Transitions API in Chrome. In short, this is a technology that lets you swap out pieces of HTML more gracefully without a lot of effort. HTMX has also been an innovator in figuring out ways to manipulate a DOM without the full page reloads that forms and links produce, and recently announced support for View Transitions.
I am keeping my eye on innovations in both these technologies to help think of ways to make the interaction with metaverse environments more graceful and visually appealing. HyperShape currently can swap out hypertext, but it could do more, such as graceful fade-outs and fade-ins, smooth movement with animation curves, etc.
This is valuable because design considerations are often one of the main reasons why people turn to JavaScript. I want to limit the use of JavaScript in metaverses to only the most complex experiences.
Server-side patterns
When people face common server-side challenges, they tend to develop common architectures and best practices. As happened with Web 2.0, I expect backend patterns to develop around rendering hypertext for virtual worlds. Already I’m foreseeing potential new architectures for several particular concerns:
Making server-side content be more informed by physics and scripted behavior
Allowing servers to collaborate in a unified presentation of digital land
A protocol for inventory collections and spawning
Solutions to these might include web servers hybridized with headless game engines, standardized protocols for one server discovering and sharing hypertext representations of terrain with another, and standards around inventory associated with web identities. This is beyond the scope of this article, but I'm eager to experiment.
Sound
Sound adds an entirely new dimension to virtual worlds, and I haven’t even begun to scratch the surface of how to represent it. Some considerations for sound:
3D positioned sound
Interaction sound effects
Realtime sound streams (such as voice)
Conclusion
Reflecting on the difficulties of interactive application development with JavaScript, exploring the history of Web 1.0 principles and creating HyperShape has been a joy. I was happy to see that the retro virtual world tech of my recent imagination is not only possible with existing web technologies — but also powerful. This is an exciting frontier to me, and I hope it will bring us closer to metaverse technology where:
Technology to create 3D worlds is more accessible to technologists than ever, so that backend developers and young web developers in particular aren’t excluded from implementing experience design.
People are excited about using web technologies and don’t feel locked into JavaScript.
Individuals and businesses are given a sense of simplicity and lower investment during creation (ultimately resulting in money and time saved).
I don’t entirely know where this will go on two fronts:
I don’t know if HTMX will be validated in practice as solving the crucial problem of time invested in frontend development, but there’s a ton of activity in the dev community that will teach us more over the next year.
I don’t know what fabric technology the metaverse will be built on, but I’m optimistic about the web because it’s not a walled garden.
What I am sure of is that the cognitive load of HyperShape is low. So low that I put a tutorial on how to learn it in 4 brief steps below this article. Whether or not this library becomes a useful foundation, I hope this article helps in emphasizing a principle related to 3D interactive environments that should be considered in development of future metaverse technologies.
Learn HyperShape in 4 examples
3D model positioned in space that links to another page
<mv-space>
  <mv-link href="https://en.wikipedia.org/wiki/Fox">
    <mv-model
      src="https://richardanaya.github.io/hypershape/dist/Fox.gltf"
      position="0,.1,0"
      scale=".005"
      rotation="0,45,0"
    ></mv-model>
  </mv-link>
</mv-space>
Play with the demo.
An ocean and HDRI environment light with a camera looking at the horizon
<mv-space>
  <mv-camera position="0,1.75,3" lookat="0,1.75,0"></mv-camera>
  <mv-light
    type="hdri"
    src="https://richardanaya.github.io/hypershape/dist/assets/kloofendal_43d_clear_puresky_4k.hdr"
    background="true"
  ></mv-light>
  <mv-water></mv-water>
</mv-space>
Play with the demo.
A login screen in a HUD
<mv-hud>
  <mv-form action="/login">
    <mv-input type="text" position="0,.2,0" name="email"></mv-input>
    <mv-label position=".1,.2,0" text="Email"></mv-label>
    <mv-input type="password" position="0,0,0" name="password"></mv-input>
    <mv-label position=".1,0,0" text="Password"></mv-label>
    <mv-input type="submit" position="0,-.2,0"></mv-input>
    <mv-label position=".1,-.2,0" text="Login"></mv-label>
  </mv-form>
</mv-hud>
Play with the demo.
Replace content with interactive buttons
<mv-space id="my_object">
  <mv-model
    src="https://richardanaya.github.io/hypershape/dist/Fox.gltf"
    position="0,.1,0"
    scale=".005"
    rotation="0,45,0"
  ></mv-model>
</mv-space>
<mv-hud>
  <mv-form action="/avocado" target="my_object">
    <mv-input type="submit" position="0,0,0"></mv-input>
    <mv-label position=".1,0,0" text="Avocado"></mv-label>
  </mv-form>
  <mv-form action="/fox" target="my_object">
    <mv-input type="submit" position="0,-.2,0"></mv-input>
    <mv-label position=".1,-.2,0" text="Fox"></mv-label>
  </mv-form>
</mv-hud>
Play with the demo.
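For completeness, here is a guess at what the “/avocado” and “/fox” endpoints from the last example might look like, sketched in Python. This is an assumption on my part about the server side: I am supposing each endpoint answers with a fragment of hypertext that ends up in the targeted space. The render_replacement function is hypothetical, and the avocado model URL is a placeholder, since only the fox model URL appears above.

```python
# Model URLs keyed by endpoint path. The avocado URL is a placeholder;
# only the fox URL appears in the examples above.
MODELS = {
    "/fox": "https://richardanaya.github.io/hypershape/dist/Fox.gltf",
    "/avocado": "https://example.com/models/Avocado.gltf",
}


def render_replacement(path: str) -> str:
    """Return the hypertext fragment an endpoint could answer a form with."""
    src = MODELS.get(path)
    if src is None:
        raise KeyError(f"no model registered for {path}")
    return f'<mv-model src="{src}" position="0,.1,0"></mv-model>'


if __name__ == "__main__":
    print(render_replacement("/avocado"))
    print(render_replacement("/fox"))
```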