A small team from Monitor Group's LeveragePoint solutions group attended the third annual SemTech conference in San Jose from May 20 to 24. We were fortunate to be selected to present a case study on how we are applying semantic technology and to attend a great many of the sessions.
About the conference
SemTech, or the Semantic Technologies conference, has emerged as the most important venue for investigating actual business applications of the semantic web. There are of course many other conferences on this theme, most importantly those organized by the World Wide Web Consortium (W3C), but most of these focus on the underlying concepts and research or on standardization efforts. SemTech concentrates on the intersection of technology and business. Attendance has grown rapidly each year, from about 300 people in 2006 to well over 700 in 2007. Major technology players are showing up (especially Oracle, who were present in force), and about half of the attendees and a quarter of the sessions came from large organizations that have begun using, or are investigating using, semantic technologies in real applications. All good signs. SemTech 2008 will be in San Jose again next May.
The general tenor: Was this the "watershed" event?
There was a great deal of discussion on the floor and at the bar as to whether SemTech 2007 was a "watershed" event for the semantics industry. By this, people were asking whether semantic technology is finally ready for the big tent, where it will be widely applied, discussed, and maybe even understood.
Evidence for
There are many signs that semantic technology is maturing into an application platform that will dominate computing for the next few decades.
- There is a compelling need to better organize and share information and it is hard to see how to do this without at least a thin layer of semantics.
- There are major projects underway in core areas, especially the life sciences (see for example the many activities at the US National Institutes of Health).
- Large organizations from the government to Fortune 1000 companies have begun exploratory projects.
- Startups in many different areas from search and discovery to management of personal identity and privacy are getting funded and making business progress. Especially worth following are Zoom Info, Siderean and Radar Networks in search and information management and the UK firm Garlik in personal identity management (Garlik CEO Tom Ilube gave some of the most compelling presentations at the conference).
- Tools are maturing rapidly with a good mix of proprietary and open source solutions for doing real work.
- Core vendors such as Oracle are showing up in force with real options. Oracle understands how this works at the database, integration and business level and is going to be a dominant player.
- The base standards RDF and OWL are reasonably mature and easy to understand and implement. There is good tool support.
- Other standards for search (SPARQL) and rules (SWRL) are making good progress and organizations such as the Object Management Group (OMG) are deeply engaged in developing supporting standards.
All of these are good signs. One of the Monitor Group people noted that "it is like being at the beginning of a revolution." True enough, a revolution some thirty years in the making (Minsky's classic work on frames was published in the early 1970s), but a revolution nonetheless. But this is still very early days.
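For readers who have not worked with the base standards mentioned above, here is a minimal sketch of the idea behind RDF triples and SPARQL-style pattern queries. It uses plain Python rather than a real triple store, and the `ex:` names, the `TRIPLES` list and the `match` helper are all made up for illustration; none of them comes from a product or library discussed at the conference.

```python
from typing import Iterable, Iterator, Optional

# An RDF statement is a (subject, predicate, object) triple.
Triple = tuple[str, str, str]

# Hypothetical data: the "ex:" identifiers are illustrative assumptions.
TRIPLES: list[Triple] = [
    ("ex:SemTech2007", "rdf:type", "ex:Conference"),
    ("ex:SemTech2007", "ex:heldIn", "ex:SanJose"),
    ("ex:Oracle", "rdf:type", "ex:Vendor"),
    ("ex:Oracle", "ex:attended", "ex:SemTech2007"),
    ("ex:Garlik", "ex:attended", "ex:SemTech2007"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for subj, pred, obj in triples:
        if s is not None and subj != s:
            continue
        if p is not None and pred != p:
            continue
        if o is not None and obj != o:
            continue
        yield (subj, pred, obj)


# Roughly: SELECT ?who WHERE { ?who ex:attended ex:SemTech2007 }
attendees = [subj for subj, _, _ in match(TRIPLES, p="ex:attended", o="ex:SemTech2007")]
print(attendees)  # ['ex:Oracle', 'ex:Garlik']
```

A real triple store adds indexing, namespaces, and inference on top of this, but the query model is the same: fix some positions of the triple and leave the others open.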
Evidence against
Despite the energy, enthusiasm and very real progress, semantics and the semantic web are a long way from being mainstream.
- There is no coherent, generally shared view of how semantic technologies will be used and to what degree RDF (the base technology) will be found on the open web. This is actually a good thing at this point: semantics is a powerful way of thinking that can be applied in many ways, and premature canalization would diminish this potential. But it is hard to imagine rapid, mainstream adoption until there is a set of coherent, mutually reinforcing approaches. The diversity of case studies that Mills Davis identifies in his latest Semantic Wave approach is a sign of the industry's immaturity rather than maturity.
- The standards stack for semantic technologies is maturing rapidly, but the implementation stack is lagging behind. The foundational technologies, such as triple stores and modeling tools, are in place, and there are a growing number of people, including those at Monitor Group, who know how to use them. But there is as yet no equivalent to LAMP (or WIMP if you prefer) or Ruby on Rails, and certainly nothing like J2EE or .NET.
- Although Oracle was present in force, the other major vendors had only scouting parties. This was especially surprising for Hewlett Packard and IBM, as they have bench strength and have made real contributions. The major search and business intelligence firms also sent only scouts (though in the case of Yahoo! a very influential and capable scout).
- I did not see any industrial-scale implementations from Fortune 1000 companies or major government agencies. There is a lot of exploratory work and there are some production-grade implementations, but these have yet to scale out or, if they have, it was not apparent.
- Best practices are hard to come by. What is the standard QA process? How rich does an ontology have to be in order to be effective? What is the RDF version of a sitemap? Do ontologies need to be connected to an upper-level ontology? How and when should RDFa be coded into public sites? How does one infer an ontology from existing structures? (There has been quite a bit of Natural Language Processing (NLP) work here, but the ontologies generated do not, to me, seem robust.) And so on. There is so much to be done.
- Most people, myself included, have trouble coming up with a concise explanation of semantic technologies that remains true to the vision while capturing the business significance.
So was this the "watershed"?
It seems premature to call SemTech 2007 a "watershed". There has been huge progress in the past year to be sure, and all signs point towards the emergence of a new generation of applications based on semantic technologies. As someone quoted in one of the sessions, "the future is here now, it is just not equally distributed" (William Gibson, author of Neuromancer and other cyberpunk novels).
The watershed may well come in 2008 or 2009. We will recognize it when we have a set of clear value propositions that non-technical people believe, there are at least a few industrial-scale implementations, and there are one or more well-established implementation stacks wrapped in best practices.
There are several competing visions for the future of the semantic web. One, associated with Tim Berners-Lee, aspires to a World Wide Web liberally encoded with RDF and even OWL, with semantic spiders and reasoners to make sense of it all. This seems a long way off. Just before SemTech 2007, Yahoo! blogger Mor Naaman published an insightful post, "The Emerging Semantics Web (The Semantic Web is Dead)". He argued that we will not soon see a lot of RDF on the open web, but that what we are seeing is a growing number of features, such as social tagging systems, that can be interpreted and then leveraged through semantics. These emerging semantics can then be applied to search and other applications.
Another emerging application of semantics is the open data web, in which databases and other information sources are exposed. Again, the thought leader seems to be Tim Berners-Lee (see his great introduction), but there are many competing visions of how this will actually come about, from Semantic MediaWikis to Freebase.
Semantic wikis may well emerge as the killer application for the semantic web. In addition to the Semantic MediaWiki, which is an extension of the MediaWiki software that runs Wikipedia, Visual Knowledge has an offering that is being used to run the Metaland community. Many other semantic wikis are also being developed.
But there are still many people pushing forward with more conventional applications of semantics: search, integration and modeling. It was these areas that were most visible at SemTech 2007.
Major Applications
The three major application areas that seem to be emerging fastest are search, integration and modeling. It is anyone's guess which of these will gain traction first, and I suspect that it is synergistic combinations that will have the most power: for example, a search system based on multiple evolutionary models that draws information from a wide and changing set of semantically described web services (easier to say than to actually do, given the current state of the art).
Search
Given the huge amount of information delivered by the major search engines, and their success at spinning money, it is hardly surprising that a new generation of search tools, based on semantics, is attracting attention. And of course a number of these already exist. This is one area I track closely using CrowdTrust, so I won't recap everything here. Key players such as Zoom Info, Siderean and Radar Networks were all at SemTech 2007.
Integration
Many companies are working to bring Service-Oriented Architecture (SOA) and semantics together. There is even an emerging standard, OWL-S, with which to do this. Some of the most interesting companies and compelling presentations at SemTech 2007 were in this area. One standout is Modus Operandi and its Wave suite of design and run-time semantic tools. This is a powerful approach that is likely to transform how web services integration is carried out, and even the basic design of web services solutions and software as a service.
Another interesting development is the use of semantic modeling tools combined with a wiki to collaborate on the development of data ontologies. A number of US government agencies are doing this on Knoodl, a site that combines a wiki with semantic modeling tools. This is another approach that is worth exploring.
Modeling
Most semantic technologies emerged from the need to model the world so that artificial intelligence systems could reason about it. Modeling remains one of the core value propositions of semantic systems. The most advanced applications all make some use of formal models, whether to gather and organize information about very complex systems such as the genome and biological pathways, to model data for better integration, or to reason about and predict events. Semantic technologies are one of the most important tools available to model systems, better than UML for example, and they provide an important alternative to mathematical models. The OMG is doing some of the best work here, with the help of companies such as Sandpiper Software; see for example the OMG Ontology Definition Metamodel.
Having an explicit model of an organization or business method is the key to rapid evolution, in which a semantic model provides the medium for "representing" in the represent-replicate-vary-select algorithm of evolution.
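The represent-replicate-vary-select loop mentioned above can be sketched in a few lines. This is only an illustration under loud assumptions: the "model" is a tuple of integers standing in for a semantic model, the variation step is a deterministic plus-or-minus-one change rather than anything domain-aware, and `TARGET`, `fitness` and the function names are hypothetical.

```python
from typing import Callable

# "Represent": an explicit, inspectable model. Here just a tuple of integers,
# a stand-in for a real semantic model (assumption for illustration only).
Model = tuple[int, ...]


def replicate(model: Model, copies: int) -> list[Model]:
    """Make identical copies of the current model."""
    return [model for _ in range(copies)]


def vary(models: list[Model]) -> list[Model]:
    """Change one position of each copy by +1 or -1, deterministically."""
    varied = []
    for i, m in enumerate(models):
        pos = (i // 2) % len(m)            # which position to change
        delta = 1 if i % 2 == 0 else -1    # alternate between +1 and -1
        varied.append(m[:pos] + (m[pos] + delta,) + m[pos + 1:])
    return varied


def select(candidates: list[Model], fitness: Callable[[Model], int]) -> Model:
    """Keep the candidate with the highest fitness."""
    return max(candidates, key=fitness)


def evolve(model: Model, fitness: Callable[[Model], int], generations: int) -> Model:
    for _ in range(generations):
        copies = replicate(model, 2 * len(model))
        # Keep the parent among the candidates so fitness never decreases.
        candidates = vary(copies) + [model]
        model = select(candidates, fitness)
    return model


# Hypothetical goal: the environment "rewards" models close to TARGET.
TARGET: Model = (3, 1, 4)


def fitness(m: Model) -> int:
    return -sum(abs(a - b) for a, b in zip(m, TARGET))


result = evolve((0, 0, 0), fitness, generations=10)
print(result)  # (3, 1, 4)
```

The point of the sketch is the role of the explicit representation: because the model is data that can be copied and changed, variation and selection have something concrete to operate on.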
Key insights
The most important points I learned from SemTech 2007 are:
- Fortune 1000 companies and government organizations are already exploring semantic solutions; act now or get left behind.
- The basic standards and technologies are mature enough for real, scalable applications, though it requires some work to sort things out.
- The first generation of Web 3.0 companies is in place and making rapid progress.
- Where are the major search and business intelligence companies? Where are the major tech players other than Oracle?
- Semantic wikis are about to pop and have the potential to realize at least part of Tim Berners-Lee's web of open linked data.
- Reasoning engines can make useful predictions, and even when the predictions are not accurate they suggest things to look for.
The LeveragePoint Approach
At LeveragePoint, we are already making heavy use of formal models of action and content in our development stack (the Choice-Action-Measurement model or CAM) and writing a light version of this out into our delivery platform. We are also using the Semantic MediaWiki as a development and collaboration tool. We plan to continue this work over the coming year, and will likely add some semantic search and semantic web service integration. We look forward to SemTech 2008, to updating people on how we are doing with our learning plan wiki, and to presenting other advances that we may have made.

