By Adam Rogers and Chris Steins
Planning (February, 1999)
In pre-web days, large corporations needed a standard way to set up electronic information for display and storage. They turned to Standard Generalized Markup Language, or SGML. A markup language uses tags, encoded terms that indicate to the system what each element of the document is supposed to be. SGML quickly begat spinoffs--the telecommunications industry went for Telecommunications Interchange Markup, or TIM, for example. But SGML was wickedly complex and unwieldy.
In the late 1980s a new need arose. A physicist named Tim Berners-Lee was trying to figure out how to display documents on the embryonic Internet. He and his colleagues took the display tags from SGML and called the result HyperText Markup Language, or HTML. Unlike its antecedent, HTML was easy to use and soon got popular. By the early 1990s, says David Wascha, an XML evangelist at Microsoft, "all the big corporations knew they had to have something called a web page, even though they didn't know what it was."
But with the popularity of the web came an explosion of web sites, with a proportional loss in the ability to search for information. The solution is metadata. XML does exactly what HTML does, except instead of talking about font size or position on a page, XML describes the nature of an element, telling whether a string of digits is, for example, a parcel number or street address or set of coordinates.
Companies like Sun, Digital, and Netscape have all bought into the standard because they know businesses will use XML to sell goods--which makes widespread XML use virtually inevitable.
SGML: Standard Generalized Markup Language; a language used to describe text and images in large databases
HTML: Hypertext Markup Language; offspring of SGML, used to describe displays--the size and color of typefaces, for example--on the World Wide Web
XML: Extensible Markup Language; another offspring of SGML, this one designed to give context for (and information about) data
PML: Planning Markup Language; proposed term for a profession-specific version of XML designed for use by planners
Metadata: Data about data; information that lets a user integrate input from disparate databases or search through them more efficiently.
Seen from a cab on the Brooklyn-Queens Expressway, the skyline of Manhattan unspools in the distance like the opening shot in a movie. The sight is majestic--but it has very little to do with how people actually get around Manhattan. To navigate the city without a car, you have to understand at least three overlapping systems--buses, cabs, and subways. In other words, you must integrate unrelated sets of information that have nothing in common except that they roughly connect the same dots.
This story is not about New York. It is about making sense of information.
We've built a world of the stuff, an infoscape to rival the New York skyline: the Internet. But in Manhattan, understanding the bus system is no help on the subway. Similarly, in our information universe, different databases can't really talk to each other--and the World Wide Web, for all its splendor, is a lot better at looking good than conducting a focused search.
This problem has been acute for urban planners, who deal with thousands of bits of overlapping information, from census data to land-use and transit patterns. But thanks to the ubiquity of the World Wide Web and a hunger for the profits it may bring, all that is about to change. A new way of organizing information is on its way. It's called Extensible Markup Language, or XML, and just about everyone who buys, sells, or manipulates data says it's going to revolutionize the way we exchange and use information, especially in urban planning.
XML is related to Hypertext Markup Language, or HTML, the language of the web. But HTML only describes the attributes of text--whether it should be boldface or italic, for example. Right now, "the only way to manage the information [on the web] is through text searches, and there's no context," says Steven Zamierowski of Interleaf, a publishing technology company. "If you search the web for Prince, you may get royalty, or the Artist Formerly Known As."
The reason is that HTML doesn't know anything about the documents it displays. Web search engines read every document, looking for the search terms people ask for. That's why a site like Yahoo, which catalogues by subject, is such a valuable index. But XML goes a step further. It tells the web browser software about the structure and type of information it's displaying, distinguishing content from format by adding "metadata"--data about the data. Essentially, XML adds context to text.
A planning example: Let's say we want to find all the brownfield sites on commercially zoned land in 10 different cities. Searching individual databases or the web would be tedious at best. However, XML can be used to define our published data using metadata "tags" such as: <Landuse>Commercial</Landuse> <LandCondition>Brownfield</LandCondition>
Then software tools could easily extract the sites we are looking for. If we also include a geographic component about the land parcels, we could visualize the information in a variety of ways--as geographic information system maps, three-dimensional displays, or even in virtual reality.
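To make the extraction step concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's tool. It uses well-formed versions of the land-use and land-condition tags from the example above (the second is written as LandCondition, since XML element names cannot contain spaces). The Parcels, Parcel, City, and ParcelNumber elements, the sample records, and the find_sites function are all hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data. Only Landuse and LandCondition echo the
# article's example; every other element name and value is invented.
SAMPLE = """
<Parcels>
  <Parcel>
    <City>Springfield</City>
    <ParcelNumber>1001</ParcelNumber>
    <Landuse>Commercial</Landuse>
    <LandCondition>Brownfield</LandCondition>
  </Parcel>
  <Parcel>
    <City>Springfield</City>
    <ParcelNumber>1002</ParcelNumber>
    <Landuse>Residential</Landuse>
    <LandCondition>Brownfield</LandCondition>
  </Parcel>
  <Parcel>
    <City>Rivertown</City>
    <ParcelNumber>2002</ParcelNumber>
    <Landuse>Commercial</Landuse>
    <LandCondition>Brownfield</LandCondition>
  </Parcel>
  <Parcel>
    <City>Rivertown</City>
    <ParcelNumber>2003</ParcelNumber>
    <Landuse>Commercial</Landuse>
    <LandCondition>Greenfield</LandCondition>
  </Parcel>
</Parcels>
"""


def find_sites(xml_text, landuse, condition):
    """Return (city, parcel number) pairs whose tags match both values."""
    root = ET.fromstring(xml_text)
    matches = []
    for parcel in root.iter("Parcel"):
        # findtext returns the element's text, or the default if it is absent.
        use = parcel.findtext("Landuse", "").strip()
        state = parcel.findtext("LandCondition", "").strip()
        if use == landuse and state == condition:
            matches.append(
                (parcel.findtext("City"), parcel.findtext("ParcelNumber"))
            )
    return matches


if __name__ == "__main__":
    print(find_sites(SAMPLE, "Commercial", "Brownfield"))
```

With the sample records above, the search returns the two commercial brownfield parcels, one in each invented city. The point is that the program matches on what each value means, not on where the word happens to appear in a page of text.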
Planners have been hip to metadata since the Federal Geographic Data Committee issued its geographic metadata standards in 1994. But many users feel the committee's standards are too detailed. "The concept is good, but it's very difficult to implement," says Solomon Katz, a computer specialist at the Bureau of Land Management in Denver. XML offers a powerful and relatively simple alternative.
XML is just beginning to gain momentum among planners. One of the earliest practitioners, the Pennsylvania Spatial Data Access (PASDA) web site, started as an internal geographic information system for the Pennsylvania Department of Environmental Protection, but was opened to public access in 1997. The project quickly grew into a massive web library for GIS data and imagery related to the state's environment.
PASDA has created an innovative system for using XML to catalog its library of data. The federal data committee's metadata on PASDA was "rewritten" into XML, says Jason Cupp, PASDA's web site manager.
Using XML doesn't actually change the look or feel of PASDA. Nor does it change the kind of information available. But it does make the management of that information much easier. PASDA metadata comes tagged in various ways--in languages like SGML, HTML, and XML. But the site translates XML metadata into HTML, the "native tongue" just about everyone's web browser uses to produce a picture.
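The general idea of translating XML metadata into HTML can be pictured with a short Python sketch. This is not PASDA's actual code or record format: the Dataset, Title, and Year elements, the sample values, and the metadata_to_html function are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical metadata record; element names and values are invented.
RECORD = """
<Dataset>
  <Title>Statewide stream network</Title>
  <Year>1998</Year>
</Dataset>
"""


def metadata_to_html(xml_text):
    """Render each child element of the record as an HTML definition-list row."""
    root = ET.fromstring(xml_text)
    rows = []
    for field in root:
        label = escape(field.tag)                   # what the value is
        value = escape((field.text or "").strip())  # the value itself
        rows.append(f"  <dt>{label}</dt><dd>{value}</dd>")
    return "<dl>\n" + "\n".join(rows) + "\n</dl>"


if __name__ == "__main__":
    print(metadata_to_html(RECORD))
```

The XML record stays the single source of information, and the HTML is generated from it only for display, which is why the look of a site need not change when its catalog moves to XML.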
XML isn't just an obscure programmer's language. Recognizing that business users such as banks and retailers (like bookseller Amazon.com) will need to create these XML tags for their own industries, information technology heavyweights like Microsoft and Netscape have begun incorporating XML capabilities into their products. In fact, Microsoft's next release of Word, its popular word processing program, will be able to generate and index XML documents. Once Microsoft climbs on board, XML will become a de facto standard.
Naturally, a few lone XML outposts on the web don't mean much in the aggregate. But as more planning information becomes available on the web in XML, our ability to extend the ways we find and use planning information will increase dramatically.
Private industry knows the change is coming. Environmental Systems Research Institute, Inc., the leading GIS software company, expects to incorporate XML into several new products. "We're looking at XML as a way to transfer geographic information," says David Scheirer, ESRI's product marketing manager. The company is also working on an XML-based interface between servers and web browsers, and as a way to specify metadata for its GIS datasets.
In conjunction with the North Carolina Department of Commerce, the web publishing company Interleaf is testing a system to allow web users to submit requests for proposals via templates in Word. The Interleaf software invisibly converts the material into XML documents for the department's database; the software should be available within the next few months.
Okina Consulting, a publishing technology company, is integrating GIS data with information gleaned from global positioning system satellites, using a language derived from XML as the go-between. A GPS-XML union "could be a system for describing absolute locations on the planet's surface," says Orest Sha, principal at Okina. "Rather than deriving the legal address for an X-Y-Z coordinate, we can actually use a geographic address."
An even more ambitious project is just getting under way in San Francisco. CommerceNet, an electronic commerce industry consortium, is using an Advanced Technology Program grant from the U.S. Department of Commerce to help create a "Next Generation City." The idea is to use XML to enable one promising section of San Francisco, the 130-square-block South of Market area, to become a model informational "city" of the future. By combining public sector GIS and planning information with private sector electronic information through XML, CommerceNet plans to create a Digital Community Network, or DCN.
"There is a tremendous opportunity in local markets to combine the data typing of XML, the location ability of GIS, and municipal information," says James Dills, chief strategist for CommerceNet and developer of the DCN concept. The DCN will "promote local business collaboration and sustain economic development in the area. Small businesses would be able to query local suppliers for components, for securing services, or to form buying cooperatives," he adds.
An initial goal of the Next Generation Cities community network is to create an XML-based registry for managing Internet access to public information. The South of Market project is expected to be launched early this year. Ultimately, CommerceNet plans to develop hundreds of such networks.
As for the future: Imagine a community planning process that includes a computer-generated, three-dimensional simulation of an area under development. HTML already allows for some pretty powerful manipulation of information in that kind of virtual environment. At UCLA's Virtual L.A. project, computer scientists have simulated 30 square miles of downtown Los Angeles. (See "It's a Bird, It's a Plane, It's a SuperSystem," July 1998.) The cyber-downtown allows planners to manipulate the built environment, and it links them to others via the web.
With XML, the power of the database skyrockets. Information about every element on the street could be displayed with a mouse click. That's possible with HTML, too. But an XML-enabled search engine could ask questions like, "Which streetlights haven't been changed this year?" That kind of search flummoxes today's web.
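The streetlight question can be sketched in Python to show why tagged data makes it answerable. Everything here is hypothetical: the Street and Streetlight elements, the id and lastChanged attributes, the sample dates, and the lights_not_changed_in function are assumptions for illustration, not part of Virtual L.A. or any real system.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical street inventory; names, ids, and dates are invented.
STREET = """
<Street name="Main Street">
  <Streetlight id="SL-1" lastChanged="1998-03-14"/>
  <Streetlight id="SL-2" lastChanged="1999-01-20"/>
  <Streetlight id="SL-3" lastChanged="1997-10-02"/>
</Street>
"""


def lights_not_changed_in(xml_text, year):
    """Return ids of streetlights whose last change predates the given year."""
    root = ET.fromstring(xml_text)
    stale = []
    for light in root.iter("Streetlight"):
        # The attribute is tagged as a date, so it can be compared as one.
        changed = date.fromisoformat(light.get("lastChanged"))
        if changed.year < year:
            stale.append(light.get("id"))
    return stale


if __name__ == "__main__":
    print(lights_not_changed_in(STREET, 1999))
```

Because each date is labeled as the light's last change, the program can compare it with the current year. A plain text search has no way of knowing which string of digits is a date, let alone what the date refers to.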
"Right now, the database is purely visual," says Scott Friedman, a computer science Ph.D. student at UCLA's Urban Simulation Lab, which is creating Virtual L.A. "There hasn't been a real demand or need for semantic information about what you're looking at. That's exactly what attaching metadata, through XML or whatever, would give you."
Other possibilities: A hospital could combine a visual database with a computer-generated database of computer network jacks or oxygen outlets. A civilian user could find all the police stations in his city that had bomb squads. "The dream of a huge visual database of Los Angeles, or any big city, is that a lot of people could use certain parts of the database for whatever application you can dream up," says Friedman.
None of this is to say that a transition to XML-based metadata will be easy. Not everybody wants to share databases (even though it might be more convenient if they did). And changes in technology are rarely smooth. "Consider that planners have been slow to adopt new technology; planning departments are still filled with ancient 286- and 386-based computers churning away," says Dan Tasman, a planner for the city of Aurora, Colorado, and developer of Cyburbia, a well-known planning web site.
On the other hand, planners now have the opportunity to develop a profession-specific version of XML--let's call it Planning Markup Language, or PML--that would be the standard way to exchange all planning knowledge.
In a way, using XML does exactly what the combining of New York City's three separate subway systems did in the 1940s. Allowing transfers at remodeled, linked stations made the subway much more useful. The success of the Internet has proven the benefits of connecting. No one really doubts XML is coming; the question is how many ways planners will use it when it arrives.
Adam Rogers reports on science and technology for Newsweek magazine in New York. Chris Steins is principal of Urban Insight, a Los Angeles-based web development and Internet consulting firm providing services to the urban planning and real estate industries.