June 26, 2007

Search Integration for Learning

At the AICC meeting in Sestri Levante (thank you Giunti) Tom King and I gave a presentation on Search as a Mode of Learning. Our goal is to call out search as a key area of interest for the integration of learning systems. We began by providing a high-level description of how search works and looking at several possible patterns for search integration. Of course, integration is only part of the picture. The ability to preview content before committing is a related enabler of effective search, and we discussed several patterns for previewing as well.

I follow developments in the search space closely, and you can keep up with these by tracking my Crowdtrust memos tagged with 'search'.

Integration is emerging as a key issue for learning for a number of reasons.

  1. A rapidly growing number of resources is available to support learning in all of its many guises. No one source has or will have all possible resources, except perhaps some combination of search engines.
  2. Acceleration in the pace of business, globalization, and speed in the dissemination of research results all mean that the resource that was relevant yesterday may not be the best reference today.
  3. Personalization of learning also means that there will seldom be a one-to-one mapping of the best learning resource to a specific learning objective for all people.
  4. Integration of learning into other processes, from research and development, to analysis, to execution, makes it important to have very flexible mechanisms for content integration.

Search and search integration are, to my mind, the best way to address all four of these challenges.

To set the stage for a discussion of search integration, Tom and I described search as having five basic steps.

  1. A spider (or crawler) goes out and searches the Internet, following links, and sends information back to its home base. The spider collects a variety of information: the normal natural language processing information about word stems and syntax, site and page metadata, in some cases information about presentation structure, and of course link structure. The result is a rich mix of information about each page and its relation to other pages (and in some cases the level of information is even more specific, down to the paragraph on the page or lower). Modern spiders can even work through sites built using Adobe Flash, and work is progressing on audio and video files (images are still tricky).
  2. Back home, an index is built, organizing and compressing the information from the spiders and sometimes weaving in other information, such as human judgements and categorizations, or how users have reacted to previous searches. The index is optimized for search and for certain types of queries. There are many ways to build and to store this index: relational databases, tree structures, semantic databases (there may even be some people out there using object databases, though I am not sure how this would help in the case of search, and I have a fondness for object databases). One of the goals of the search world should be to promote many different approaches to indexing.
  3. Queries are used to search the index for specific information. The query could be a simple text string, perhaps using some regular expressions, or, if a relational database is being used in the index, a SQL query would be built. Semantic indexes use either XQuery to search the XML representation or, increasingly, SPARQL to go directly after the RDF or OWL. In most cases, the mechanics of the query are hidden from the user, who enters a simple text string or fills out fields from which a query is built.
  4. The search results are then packaged up and presented to the user. Most often this is as a weighted list, with the most probable responses to the query at the beginning, and lower ranking results strung out below. More sophisticated systems try to categorize search results, and there are many experiments going on with visual representations of search results. Tag clouds are becoming an increasingly common way to display search results, as they make it easy to display several dimensions of relevance and people are getting used to this way of presenting information.
  5. Search results can then be organized by the user, reweighted, recategorized, commented on or tagged, and shared with other users. In advanced systems, these user actions are fed back to the search system that incorporates them into the index for use in judging future queries.
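
The index, query and ranking steps above can be sketched in a few lines of Python. This is only a toy illustration, not how any production search engine works: the `pages` dictionary stands in for what a spider would send back, and the function names `build_index` and `search` are hypothetical. Ranking here is simply a count of matching query words.

```python
from collections import defaultdict

def build_index(pages):
    """Step 2: map each lowercased word to the set of page ids containing it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Steps 3 and 4: score pages by matching query words, best first."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for page_id in index.get(word, ()):
            scores[page_id] += 1
    # Highest score first; ties are broken alphabetically by page id.
    return sorted(scores, key=lambda p: (-scores[p], p))

# Step 1 stand-in: text that a spider might have collected (made-up data).
pages = {
    "a": "search as a mode of learning",
    "b": "learning object metadata",
    "c": "federated search for learning systems",
}
index = build_index(pages)
results = search(index, "search learning")
```

Pages "a" and "c" contain both query words and so rank ahead of "b". A real index would also carry the stem, metadata and link information described in step 1.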

With these five steps in mind - spider, index, query, present results, organize results - we developed a set of patterns for discussing search integration for learning. This approach is based on Gregor Hohpe and Bobby Woolf's useful book Enterprise Integration Patterns and the supporting website. These patterns can be combined in various ways to provide compelling search integration solutions.

Pattern 1: Allow Spidering

This is the simplest and the most effective pattern. Allow spiders to go through your content and send information back to various indexing systems. There are a number of best practices for designing texts for ease of spidering: use clear metadata, make all relevant text available to the spider, label all media not accessible to the spider, and design your links to support search. These are basically similar to search engine optimization (SEO) approaches, although the goal is subtly different. In search for learning the goal is to maximize the return of highly relevant search results for the specific learning context.

A couple of points to note here. The importance of links in modern search makes it important to have links and to think carefully about their design. The SCORM approach in which a SCO (Shareable Content Object) cannot launch another SCO, which in some cases limits one's ability to weave patterns of links between SCOs, should probably be scrapped. Future versions of learning standards need to support rich linking of content and multiple navigation systems, including systems based on search (whether this search is visible to the learner or not). And in some situations, people may not want to expose their content to spiders, even friendly spiders that have been vetted and allowed in. In this case, other search integration patterns will need to be used.
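
One common way to tell spiders what they may and may not crawl is a robots.txt file. The post does not name a mechanism, so treat this as one assumed option. The sketch uses Python's standard `urllib.robotparser` to check a made-up policy; the bot name and the example.com URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A made-up policy: everything is open to spiders except the drafts area.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleLearningBot" is a hypothetical spider name.
can_course = parser.can_fetch(
    "ExampleLearningBot", "https://example.com/courses/intro.html")
can_draft = parser.can_fetch(
    "ExampleLearningBot", "https://example.com/drafts/unit2.html")
```

Here the course page is open to the spider and the draft is not. A content owner who wants to admit only vetted spiders could list those user agents explicitly instead of using the wildcard.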

Pattern 2: Publish Metadata

This is the pattern that has been adopted by most learning standards and learning systems to date - publish explicit metadata, either in a general format such as Dublin Core or a learning specific format such as the IEEE LTSC LOM (Learning Technology Standards Committee Learning Object Metadata). And to date it has worked poorly. Most of the systems that are supposed to read this metadata choke on it (the learning content management systems tend to do the best job) and the quality of the metadata itself varies from good, to good but irrelevant, to just awful. For most of the learning content I have reviewed the quality of the metadata falls somewhere between awful and appalling.

There is a role for explicit metadata of course: it provides a content creator with a way to communicate specific information to users. And user generated metadata has an important role to play as well, especially when it is used in social bookmarking systems that can aggregate user keywords and comments. But alone it will never be enough to support full search integration and address the four problems noted above: resource proliferation, rapid change, personalization, integration into multiple processes.

Pattern 3: Publish an Index

So if formal metadata is not enough, and for some reason you do not want to let spiders in (or you don't have direct access to the content and cannot put your spiders in), then what do you do? The simplest approach may be to generate your own spider data and publish it. This is similar to publishing metadata. The difference is that the file produced includes a great deal of additional information (natural language processing data, presentation structure, link structure) and is meant to be included into an index rather than read directly by another human.

It may be helpful to develop some standards around this, or at the very least to recommend some spiders (preferably open source) that people can run against their own content and use to publish indexing information. This is an area that I would encourage the AICC, ADL and even the IMS (to pick the three most commonly referenced organizations involved in learning specifications) to look into, and perhaps provide sample spiders and outputs for reference purposes. But an excess of specification could limit innovation in an area that is developing very quickly, so the best approach at this time is simply to identify index publishing as a pattern and share code and best practices (of course an even better approach is to let many different spiders go through your content, so that it can be included in many different indexes with many different structures and searched by many types of queries).

Pattern 4: Publish a Syndication Feed (RSS or Atom)

What do you do if your content changes frequently, so that no formal metadata or occasional spidering will do it justice? Fortunately, this is a common situation and there is a common approach to solving it: content syndication using RSS or Atom. There are even proposals available for extending RSS and Atom to cover common metadata formats and they could easily include indexing files (which I am using to mean the files that contain the natural language processing, presentation and link structure data used in search).

In some of the more dynamic and adaptive approaches to learning content that are emerging the learning resource is not a piece of static content but an RSS/Atom feed, or even a combination of feeds, or a sophisticated packaging of search results.
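
To make the syndication pattern concrete, here is a minimal sketch of how a consuming system might read item titles and links from an RSS 2.0 feed using only Python's standard library. The feed content and the helper name `feed_items` are made up for illustration, and a real consumer would also need to handle Atom, dates and malformed feeds.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed announcing changes to learning content.
FEED = """<rss version="2.0"><channel>
  <title>Example learning feed</title>
  <item><title>Unit 1 updated</title><link>https://example.com/unit1</link></item>
  <item><title>Unit 2 added</title><link>https://example.com/unit2</link></item>
</channel></rss>"""

def feed_items(xml_text):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    channel = ET.fromstring(xml_text).find("channel")
    return [(item.findtext("title"), item.findtext("link"))
            for item in channel.findall("item")]

items = feed_items(FEED)
```

An indexing system could poll a feed like this and re-spider only the links that have changed, rather than waiting for the next full crawl.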

Pattern 5: Federated Search

Finally, we get to the most popular proposal, some form of federated search. In federated search there are ways to pass search queries and results from one search system to another. There are several ways to do this: by exposing a conventional API, by using some form of web service, whether RESTful or WSDL/SOAP based, or even through one of the existing forms of database integration. The logic is often "I understand my own content better than anyone else and I know how to search it most effectively, so pass me a query and I will give you the most relevant results." In some limited number of cases this is even true, and federated search should be standardized and supported within the learning industry. Skillsoft in fact has begun to do this and has a search service as part of OLSA (Open Learning Services Architecture). Shota Aki had a very good presentation on this at the AICC meeting in Sestri Levante, see the presentations for Tuesday June 5.
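
As a rough sketch of the federated pattern, the code below passes one query to several providers and merges what comes back. It is not based on OLSA or any real service: `catalog_a` and `catalog_b` are hypothetical stand-ins for remote search systems, and their scores are invented.

```python
def federated_search(query, providers):
    """Send the query to every provider and merge results by score."""
    merged = []
    for name, search_fn in providers.items():
        for title, score in search_fn(query):
            merged.append((score, name, title))
    # Highest score first; ties broken by provider name, then title.
    merged.sort(key=lambda r: (-r[0], r[1], r[2]))
    return [(name, title, score) for score, name, title in merged]

# Hypothetical providers; each returns (title, relevance score) pairs.
def catalog_a(query):
    return [("Intro to " + query, 0.9), ("Advanced " + query, 0.4)]

def catalog_b(query):
    return [(query + " field guide", 0.7)]

results = federated_search(
    "search", {"catalog_a": catalog_a, "catalog_b": catalog_b})
```

Note that the merge simply trusts each provider's own score. Scores from different systems are rarely comparable, which is one practical weakness of this pattern.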

I do not think this is the best integration pattern though. It assumes that (i) the content-side search is as powerful as other search engines, and given the rapid pace of search engine development this is unlikely to be true in most cases. It also assumes that (ii) enough information can (or will) be passed in the search query for effective personalization of the search. Again, I find this unlikely. Finally, (iii) this pattern will not allow as much diversity and experimentation and its broad adoption will slow down innovation around search in the learning industry. The long-term result of this is likely to be that people will avoid search systems that are constrained to learning systems and rely on the open web, where evolution has been much more rapid. Indeed, given that many people default to Google when they need to learn something for work, one could say this has already happened, and that the learning industry is playing catch up.

Patterns for Previewing

Integration of search queries and results passing is only one part of search integration. Another important theme is previewing. When I am returned a set of results from a search query I often want to preview any specific result before committing to selecting it. Preview supports much better selection for the learner, who may hesitate to commit to a resource just on the basis of the information produced in the typical search result. This is especially so when there is some cost involved in accessing the resource, whether this be a time cost, a bandwidth cost, or an actual financial charge.

Pattern 1: Include a Sample

A short indicative sample is included in the search result. There are several sub patterns for obtaining this sample: the content provider can indicate this using description metadata of some kind, or the spider can send back enough information for the search system to build a sample on its own. Sometimes both approaches are used. In some cases the index is full of rich information and different samples can be constructed for the same piece of content depending on the search.
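
The last point, that different samples can be built for the same content depending on the search, can be illustrated with a small sketch. The function name `make_snippet` and the sample text are hypothetical. The sketch assumes the search system holds the full text and simply cuts a window around the first query term it finds.

```python
def make_snippet(text, query, width=40):
    """Return a short sample of text centred on the first matching query term."""
    lowered = text.lower()
    for term in query.lower().split():
        pos = lowered.find(term)
        if pos != -1:
            start = max(0, pos - width // 2)
            end = min(len(text), pos + len(term) + width // 2)
            prefix = "..." if start > 0 else ""
            suffix = "..." if end < len(text) else ""
            return prefix + text[start:end].strip() + suffix
    # No query term found: fall back to the opening of the text.
    return text[:width].strip() + ("..." if len(text) > width else "")

doc = ("A spider follows links and sends information back. The index is "
       "then built from what the spider collected, and queries run "
       "against the index.")
spider_sample = make_snippet(doc, "spider")
query_sample = make_snippet(doc, "queries")
```

The two samples come from different parts of the same document, so the learner sees the passage most relevant to what they asked for.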

Pattern 2: Provide Access to a Sample

In some cases it is difficult to include a sample in the search result. There may be IP protection issues, or the media may simply not lend itself to this (video, a location in Second Life, etc.). In this case, access to some form of sample can be provided as part of the search result. Sub patterns are that the preview could be content placed on a specific sample site for this purpose, or direct access could be provided to the content with some restrictions, such as time or the extent of access.

Conclusions and Suggestions

Search is likely to play a growing role in learning, as it is in so many other areas. Everything from navigation in physical and virtual spaces, to business intelligence, to data integration is being rethought in terms of search and the organization of search results. As more and more aspects of the world, or our physical and virtual worlds, are described or even describable (advanced search systems are able to generate their own descriptions), search and related technologies become powerful vehicles for accessing the most relevant content in any context. It is not too much to say that search will replace most content centric learning solutions in the not too distant future.

Learning places specific demands on search systems that we are only beginning to understand. The search needs to be constrained by learning objectives, the learner's evolving mental models and the current context - where a person is in a work process or what is happening in a collaborative group. A great deal more work needs to take place here.

Given this, the learning industry has a long way to go before it should begin to mandate specifications or standards for search integration. What is more important is that we take an open approach to search, open in terms of how we design systems and open in terms of how we allow other systems to access them.

The best approach to search integration is the simplest pattern. Allow other systems to spider your content and design your content so that spiders can collect the most relevant information and support the best search results. This pattern allows for the most diversity, and diversity of approaches is what we need now. No system is ideal in all circumstances and the world of search engineering is developing extremely rapidly, with advances in semantics and social technologies changing the rules of the game. If for some reason you cannot allow spidering then use your own spiders (ideally use more than one so that you can experiment internally) and publish files for indexing systems.

There is a need to develop a common way to pass queries and search results. Here Skillsoft has taken the lead with its Open Learning Services Architecture. The best resource I know of to learn about this approach is the presentation that Shota Aki made to the AICC in June, 2007 at Sestri Levante. It can be found among the June 5 presentations here.

Additional investigations into previewing for different types of media will also be valuable. If you have ideas here, please contact Tom King (the AICC blog is probably the best way to do this) or leave a message here.

By opening our systems and content to support better search we may find that we solve other integration problems as well, as search is rapidly evolving solutions to many types of problems.

June 16, 2007

Search as a Mode of Learning

On June 6, 2007 Tom King and I gave a presentation on Search as a Mode of Learning at the AICC event in Sestri Levante, Italy. We covered two main areas, 'search as a mode of learning' and the need for 'common approaches to search integration.' This post focuses on the former.

Key points:

  • Learning can be understood as the development of effective search patterns
  • New approaches to search are proliferating
  • Search is being used for many new applications and has a growing role to play in learning
  • Learning must become part of the larger ecology of information organization and exchange

Learning as search

There are a number of metaphors one can use to understand learning and how it enables effective action: play, navigation and process integration all offer compelling perspectives, models of action such as OODA loops (Observe Orient Decide Act) can put the focus more solidly on action, and approaches such as Cognitive Task Analysis help us understand how people build mental models and use them to guide decisions. But in this post I want to explore search as a metaphor and see how effective current search systems are in supporting learning by individuals and teams.

A couple of years ago I was at a presentation by Elliott Masie in which he recounted a conversation with Bill Gates. Gates noted, correctly, that the vast majority of eLearning is authored using PowerPoint and suggested that this made PowerPoint the most important eLearning application. Elliott countered that the real killer app for learning is Google.

There is a lot to be said for Elliott's response. These days most people turn to Google or another search engine when they want to get quick access to information. And beyond this, search has become one of the first things we turn to when we need deeper knowledge as well. But there is a lot of frustration too. There is the obvious issue that most searches generate too many hits and sifting through these to find the most relevant is frustrating and time consuming. More importantly, few search applications, if any, provide explicit support for the key activities involved in learning.

What are these activities? One way to think about learning is as a process of

  1. Defining a problem space
  2. Exploring this space to uncover its patterns and relationships
  3. Developing paths through the space
  4. Recognizing patterns in the space

Effective action takes place when one can (i) rapidly recognize patterns and (ii) navigate to solutions. And of course, one wants to get so good at recognizing patterns that one is sensitive to when the environment is changing and the old responses and paths are no longer taking you where you expect them to.

One way to approach this is to think about how people find their way around a landscape or a city. Urban planner Kevin Lynch thought deeply about this in his 1960 book The Image of the City, in which he identified five patterns that people use to navigate around cities. These patterns are more generally applicable, and my suggestion is that a good search system, one that supports learning, will support all five of them. The five patterns as applied to cities are as follows.

  1. Landmarks - prominent buildings or other features, visible from a distance, that allow you to orient yourself.
  2. Neighborhoods - areas that are self similar in some way, such as the type of building, the activities carried out such as shopping, entertainment, business or residential, and the kinds of people that occupy them.
  3. Barriers - Many cities are divided by barriers, some natural such as rivers or cliffs, others artificial like highways, and some social.
  4. Paths - Neighborhoods are linked and barriers sometimes crossed by paths, the major thoroughfares that help one to travel around.
  5. Nodes - Places where many paths come together are nodes.

Understanding these patterns helps you to find your way around a city, to know where you are, and to explore your way to new places. How do these apply to search and learning?

When we search we want to be able to quickly orient ourselves, using results that are easily remembered and that help us find our way. These are the landmarks. There are some sites that we go to because we know that it will be easier to find what we are looking for if we start there. Amazon is a landmark in the search for books. We also want to know the general category of the search results, the neighborhood so to speak, and then follow the links, or paths, from one resource to the next, dipping back into the search results when we get stuck or need elaboration. Some results will be very rich, leading to many other resources (hopefully relevant). These are the nodes. And of course there are some areas that are simply not linked and whose search results are disjoint. There are barriers between such areas. But there are times when forging one's own links between two areas, search and urban planning perhaps, may lead to a recognition of deeper patterns.

Search systems already support some of these patterns under the covers. Google's PageRank method of ranking search results makes use of linking patterns that are defined in terms of paths and nodes. It even has a very coarse notion of neighborhoods in that it divides the search universe up into scholarly articles, blogs, books and so on, not that these are the most meaningful categories for most search users. Some other search services do somewhat better, providing categorizations of search results that help users define neighborhoods. Mondosearch is one example of this. But I am not aware of any search engine that provides good explicit exposure of all of Lynch's patterns, or that supports the most common individual and team search behaviors.

Common Search Behaviors

Before returning to the theme of search as a mode of learning I would like to look at some common search behaviors as these are also the basis for learning.

I look first at the most common individual behaviors, then go on to the less well understood area of collaborative search.

Find

The most common use of search. Find comes in a number of flavors, sometimes one is looking for a specific piece of data, the population of Tashkent or the temperature forecast for Kitimat next Tuesday, or a combination of data, the closest theatre at which a specific film is playing this evening. Other times you may be looking for a specific resource, an organization's website or a book. You can also be searching for information about a more general theme and trying to find the resource(s) that best fit that theme. This is hardly a complete ontology of finding, but it gives the general sense.

Refind or Navigational Search

Almost as common as 'find' is to 'refind'. This is search as a form of navigation. Rather than bother to remember a URL, or in many cases even when we remember the URL, it is simplest just to go to search and enter an associated term, usually one that has successfully located the resource we are going to in the past. I recall that research suggests that this type of navigational search accounts for about 30% of searches on major search engines, I am looking for the reference, and I suspect that the same is true for organizations that have decent search engines.

Explore

With exploration we enter search modes that are directly related to learning. Exploratory search happens when one is not sure what one is looking for or how concepts relate. For some learning styles it is the best way to begin learning (for people with other styles it will just seem random though). One can see each search as a kind of probe into the design space, send in a probe, follow links, let the terms suggest new searches, and eventually a sort of concept map emerges.

Going back to the metaphor of search and navigating through a city, this is like getting to know a new place by going out and wandering around. If several different probes turn up the same basic website (on the WWW in 2007 Wikipedia comes to mind) one begins to find landmarks. Exploring the results for different search terms also gives a feel for neighborhoods: 'Jaguar' will take you to the 'Apple' neighborhood, the 'luxury car' neighborhood and the 'big cats' neighborhood. 'Search' will take you to an interesting group of neighborhoods with many paths between them: there is a 'learning' neighborhood, a 'query' neighborhood, a 'search engine' neighborhood, a 'search engine optimization or SEO' neighborhood and so on. This is a richly connected part of the world with many paths and landmarks.

Relate

Further exploration of the paths combined with querying around them helps the searcher build relations between different resources. Some search sites do this explicitly, especially those that support scientific research or patent claims, where explicit citations are an important part of the landscape and are used to organize search results. Building up sets of relationships, based on paths and often oriented around landmarks, is the first step towards pattern matching.

Match Patterns

To my mind pattern recognition, followed by pattern matching, is central to learning, and the role of search in finding, refinding and recognizing is key. This is why I believe search to be a primary mode of learning, and not just a way to locate learning. It is the experience of searching that strengthens one's ability to find and then recognize patterns, and it is the patterns that guide our actions.

Patterns come in many forms, and an exploration of this would turn an already long blog post into a book. But some key patterns include search patterns!, process patterns, recognition patterns, filter patterns, decision patterns, observation (situational awareness) patterns, and so on. The pattern of represent-replicate-vary-select is one very general pattern that can also be applied back on to learning and search.

A higher level skill than pattern recognition is pattern blending, taking two or more separate patterns and bringing them together to create a new mode of understanding and filtering the world. This is something to explore more in future posts, but a good introduction can be found in the book Designing With Blends by Manuel Imaz and David Benyon.

Do current learning, knowledge and performance support systems provide good support for search as a mode of learning? Do they support search for patterns as a mode of performance? My sense is that the answer in both cases is no.

Social search

Individual search is only a small part of the possible world of search, just as individual learning is only a small part of learning. The next phase of both search and learning is likely to focus more on its social dimensions. Social search gets little explicit support from most search systems, Amazon with its recommendation engine is much better, and the same is true of most corporate on-line learning (the University systems, with their threaded discussion groups, are much better). Google is beginning to get this with Google Co-op, which combines personal search with social search.

A few approaches to social search are noted below.

Infer relevance from social structures

This is the key insight of Brin and Page in designing Google's PageRank algorithm - the sites that link to a site, and the sites that link to these, are an indicator of relevance. It would be nice if Google made this more explicit, though one can tease out some of the relationships by using various Google Hacks. As noted above, sites dedicated to searching patents and scientific journals are somewhat better at this.
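
The link-based idea can be sketched as a simple power iteration over a link graph. This is a textbook simplification of the published PageRank idea, not Google's production system. The four-page graph, the damping value of 0.85 and the function name `pagerank` are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page importance from link structure by repeated redistribution."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Pass this page's rank on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank

# Made-up graph: a, b and d all link to c, and c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
```

Page "c" ends up with the highest rank because three pages link to it, and "a" outranks "b" and "d" because it is linked from the highly ranked "c".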

Weight (filter)

Another thing that Google does implicitly and that Amazon or Digg do explicitly is to weight search results based on which resources a person selects after a search. The more often a certain resource is selected the higher its placement in search results.
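
A minimal sketch of this kind of weighting: each result's base relevance score is nudged upward by how often people have selected it. The scores, click counts, `weight` value and function name `rerank` are all made up, and no real system's formula is implied.

```python
from collections import Counter

def rerank(results, clicks, weight=0.1):
    """Reorder (resource, base_score) pairs, boosting often-selected ones."""
    return sorted(results, key=lambda r: -(r[1] + weight * clicks[r[0]]))

# Made-up base scores from the search engine and made-up selection counts.
results = [("a", 0.9), ("b", 0.8), ("c", 0.5)]
clicks = Counter({"b": 3, "c": 1})
reranked = rerank(results, clicks)
order = [resource for resource, _ in reranked]
```

With three selections, "b" moves ahead of "a" even though its base score is lower. The `weight` parameter controls how strongly user behavior overrides the original ranking.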

Share and edit search results

A stronger social search mechanism is to allow people to save searches, edit the results, make comments and provide rankings, and then to pass these around among one another. As noted above, Google Co-op provides a very partial implementation of this and other systems are emerging that provide better support. Indeed, this is an active area of interest at Monitor Group's LeveragePoint solutions team.

Share and edit search paths

Social search will not really begin to support social learning until it goes beyond sharing of results to the sharing of paths. It is by navigating (and building) search paths, and then sharing, commenting and editing these, that people come to recognize and share patterns, which is an essential part of learning. This was actually part of the inspiration for ThoughtShare, a company that I helped found back in the late 1990s with John Dill and Brian Fisher at Simon Fraser University. ThoughtShare later morphed into a blogging tool and then a contextual advertising service called Qumana, but the initial goal was to provide people with an easy way to record, share and edit paths through the World Wide Web.

Search as a General Paradigm

Over the past few years search has emerged as a general paradigm for how we interact with the world and for how we build information and business systems. The best treatment of this is Ambient Findability by Peter Morville. The ideas in this post about how search is an important mode of learning were influenced by Morville's more general treatment. Search is also absorbing important business functions such as business intelligence and information integration.

One can think of business intelligence as a primitive form of search, one that uses SQL to pull data from structured databases and then organize it into various reports and dashboards. Valuable yes, but not something that easily lets users bring in the many different kinds of information needed to recognize patterns and make decisions. As a result, the combination of search and business intelligence is beginning to attract attention. There is good coverage, including the results of a survey uncovering the business drivers, in a recent article in Intelligent Enterprise.

Search is also emerging as a new? paradigm for data and information integration. A search system that includes good pattern recognition is an alternative to more programmatic approaches to data integration such as extract transform load (ETL) or web services. It requires that the data be open to search, and in some cases the acceptance of fuzzy, dare I say 'organic' results. Some will object that data integration requires precision and reliability. In some cases yes, but it also requires adaptability and the ability to recognize emerging patterns. Existing approaches to integration, based on explicit connections, will not achieve this, as they assume that the connections are known, or can be known, in advance.

-----

For the mathematically inclined, Pentti Kanerva's book Sparse Distributed Memory provides a thought provoking treatment of how memory, search and learning fit together. See especially the final chapter The Organization of an Autonomous Learning System with its emphasis on pattern sequences.

May 26, 2007

Thoughts on SemTech 2007

A small team from Monitor Group's LeveragePoint solutions group attended the third annual SemTech conference in San Jose from May 20 to 24. We were fortunate to be selected to present a case study on how we are applying semantic technology and to attend a great many of the sessions.

About the conference

SemTech, or the Semantic Technologies conference, has emerged as the most important venue for investigating actual business applications of the semantic web. There are many other conferences on this theme of course, particularly those organized by the World Wide Web Consortium (W3C), but most of these are focused on the underlying concepts and research or on standardization efforts. SemTech concentrates on the intersection of technology and business. Attendance has grown rapidly each year, from about 300 people in 2006 to well over 700 in 2007. Major technology players are showing up (especially Oracle, who were present in force) and about half of the attendees and a quarter of the sessions came from large organizations that have begun using, or are investigating using, semantic technologies in real applications. All good signs. SemTech 2008 will be in San Jose again next May.

The general tenor: Was this the "watershed" event?

There was a great deal of discussion on the floor and at the bar as to whether SemTech 2007 was a "watershed" event for the semantics industry. By this, people were asking whether semantic technology is finally ready for the big tent, where it will be widely applied, discussed, and maybe even understood.

Evidence for

There are many signs that semantic technology is maturing into an application platform that will dominate computing for the next few decades.

  • There is a compelling need to better organize and share information and it is hard to see how to do this without at least a thin layer of semantics.
  • There are major projects underway in core areas, especially the life sciences (see for example the many activities at the US National Institutes of Health).
  • Large organizations from the government to Fortune 1000 companies have begun exploratory projects.
  • Startups in many different areas from search and discovery to management of personal identity and privacy are getting funded and making business progress. Especially worth following are Zoom Info, Siderean and Radar Networks in search and information management and the UK firm Garlik in personal identity management (Garlik CEO Tom Ilube gave some of the most compelling presentations at the conference).
  • Tools are maturing rapidly with a good mix of proprietary and open source solutions for doing real work.
  • Core vendors such as Oracle are showing up in force with real options. Oracle understands how this works at the database, integration and business level and is going to be a dominant player.
  • The base standards RDF and OWL are reasonably mature and easy to understand and implement. There is good tool support.
  • Other standards for search (SPARQL) and rules (SWRL) are making good progress and organizations such as the Object Management Group (OMG) are deeply engaged in developing supporting standards.

All of these are good signs. One of the Monitor Group people noted that "it is like being at the beginning of a revolution." True enough, a revolution some thirty years in the making (Minsky's classic work on frames was published in the early 1970s), but a revolution nonetheless. But this is still very early days.

Evidence against

Despite the energy, enthusiasm and very real progress, semantics and the semantic web are a long way from being mainstream.

  • There is no coherent, generally shared view of how semantic technologies will be used and to what degree RDF (the base technology) will be found on the open web. This is actually a good thing at this point: semantics is a powerful way of thinking that can be applied in many ways, and premature canalization would diminish this potential. But it is hard to imagine rapid, mainstream adoption until there is a set of coherent, mutually reinforcing approaches. The diversity of case studies that Mills Davis identifies in his latest Semantic Wave approach is a sign of the industry's immaturity rather than maturity.
  • The standards stack for semantic technologies is maturing rapidly, but the implementation stack is lagging behind. The foundational technologies, such as triple stores and modeling tools, are in place, and there are a growing number of people, including those at Monitor Group, who know how to use them. But there is as yet no equivalent to LAMP (or WIMP if you prefer) or Ruby on Rails, and certainly nothing like J2EE or .NET.
  • Although Oracle was present in force the other major vendors had only scouting parties. This was especially surprising for Hewlett Packard and IBM, as they have bench strength and have made real contributions. The major search and business intelligence firms sent only scouts (though in the case of Yahoo! a very influential and capable scout).
  • I did not see any industrial-scale implementations from Fortune 1000 companies or major government agencies. There is a lot of exploratory work and some production-grade implementations, but these have yet to scale out or, if they have, it was not apparent.
  • Best practices are hard to come by. What is the standard QA process? How rich does an ontology have to be in order to be effective? What is the RDF version of a sitemap? Do ontologies need to be connected to an upper level ontology? How and when should RDFa be coded into public sites? How does one infer an ontology from existing structures (there has been quite a bit of NLP (Natural Language Processing) work here, but the ontologies generated do not, to me, seem robust)? And so on. There is so much to be done.
  • Most people, myself included, have trouble coming up with a concise explanation of semantic technologies that remains true to the vision while capturing the business significance.

So was this the "watershed"?

It seems premature to call SemTech 2007 a "watershed". There has been huge progress in the past year to be sure, and all signs point towards the emergence of a new generation of applications based on semantic technologies. As someone quoted in one of the sessions, "the future is here now, it is just not equally distributed" (William Gibson, author of Neuromancer and other cyberpunk novels).

The watershed may well come in 2008 or 2009. We will recognize it when we have a set of clear value propositions that non-technical people believe, there are at least a few industrial-scale implementations, and there are one or more well-established implementation stacks wrapped in best practices.

There are several competing visions for the future of the semantic web. One, associated with Tim Berners-Lee, aspires to a World Wide Web liberally encoded with RDF and even OWL, with semantic spiders and reasoners to make sense of it all. This seems a long way off. Just before SemTech 2007 Yahoo! blogger Mor Naaman published an insightful post, "The Emerging Semantics Web (The Semantic Web is Dead)". He argued that we will not soon see a lot of RDF on the open web, but that what we are seeing is a growing number of features, such as social tagging systems, that can be interpreted and then leveraged through semantics. These emerging semantics can then be applied to search and other applications.

Another emerging application of semantics is the open data web, in which databases and other information sources are exposed. Again, the thought leader seems to be Tim Berners-Lee (see his great introduction), but there are many competing visions on how this will actually come about, from Semantic MediaWikis to Freebase.

Semantic wikis may well emerge as the killer application for the semantic web. In addition to the Semantic MediaWiki, which is an extension of the MediaWiki software that runs Wikipedia, Visual Knowledge has an offering that is being used to run the Metaland community. Many other semantic wikis are also being developed.

But there are still many people pushing forward with more conventional applications of semantics - search, integration and modeling, and it was these areas that were most visible at SemTech 2007.

Major Applications

The three major application areas that seem to be emerging fastest are search, integration and modeling. It is anyone's guess which of these will gain traction first and I suspect that it is synergistic combinations that will have the most power - a search system based on multiple evolutionary models that draws information from a wide and changing set of semantically described web services, for example (easier to say than to actually do given the current state of the art).

Search

Given the huge amount of information delivered by the major search engines, and their success at spinning money, it is hardly surprising that a new generation of search tools, based on semantics, is attracting attention. And of course a number of these already exist. This is one area I track closely using CrowdTrust, so I won't recap everything here. Key players such as Zoom, Siderean and Radar were all at SemTech 2007.

Integration

Many companies are working to bring Service-Oriented Architecture (SOA) and semantics together. There is even an emerging standard, OWL-S, with which to do this. Some of the most interesting companies and compelling presentations at SemTech 2007 were in this area. One of the most interesting of these is Modus Operandi and its Wave suite of design and run-time semantic tools. This is a powerful approach that is likely to transform how web services integration is carried out and even the basic design of web services solutions and software as a service.

Another interesting development is the use of semantic modeling tools combined with a wiki to collaborate on the development of data ontologies. A number of US government agencies are doing this on Knoodl, a site that combines a wiki with semantic modeling tools. This is another approach that is worth exploring.

Modeling

Most semantic technologies emerged from the need to model the world so that artificial intelligence systems could reason about it. Modeling remains one of the core value propositions of semantic systems. The most advanced applications all make some use of formal models, whether to gather and organize information about very complex systems such as the genome and biological pathways, to model data for better integration, or to reason about and predict events. Semantic technologies are one of the most important tools available to model systems, better than UML for example, and they provide an important alternative to mathematical models. The OMG is doing some of the best work here, see for example the OMG Ontology Definition Metamodel, with the help of companies such as Sandpiper software.

Having an explicit model of an organization or business method is the key to rapid evolution, in which a semantic model provides the medium for "representing" in the represent-replicate-vary-select algorithm of evolution.
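To make the represent-replicate-vary-select loop a little more concrete, here is a toy sketch in Python. It is only an illustration of the general idea, not how we evolve our models: the list of weights standing in for a "model", the `score` and `vary` functions, and the `target` values are all invented for the example.

```python
import random

def evolve(model, score, vary, generations=50, copies=10, seed=0):
    """Represent-replicate-vary-select, keeping the best candidate each round."""
    rng = random.Random(seed)
    best = model                                      # represent: an explicit model
    for _ in range(generations):
        replicas = [best for _ in range(copies)]      # replicate
        variants = [vary(r, rng) for r in replicas]   # vary
        # select: the parent stays in the pool, so the score never gets worse
        best = max(variants + [best], key=score)
    return best

# Hypothetical toy model: three weights, scored by closeness to a target.
target = [0.2, 0.5, 0.3]

def score(model):
    return -sum((a - b) ** 2 for a, b in zip(model, target))

def vary(model, rng):
    return [w + rng.uniform(-0.05, 0.05) for w in model]

start = [0.0, 0.0, 0.0]
result = evolve(start, score, vary)
```

The point of the sketch is the first step: without an explicit representation there is nothing to copy, vary or compare.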

Key insights

The most important points I learned from SemTech 2007 are

  • Fortune 1000 companies and government organizations are already exploring semantic solutions; act now or get left behind.
  • The basic standards and technologies are mature enough for real, scalable applications, though it requires some work to sort things out.
  • The first generation of Web 3.0 companies is in place and making rapid progress.
  • Where are the major search and business intelligence companies? Where are the major tech players other than Oracle?
  • Semantic wikis are about to pop and have the potential to realize at least part of Tim Berners-Lee's web of open linked data.
  • Reasoning engines can make useful predictions, and even when the predictions are not accurate they suggest things to look for.

The LeveragePoint Approach

At LeveragePoint, we are already making heavy use of formal models of action and content in our development stack (the Choice-Action-Measurement model or CAM) and writing a light version of this out into our delivery platform. We are also using the Semantic MediaWiki as a development and collaboration tool. We plan to continue this work over the coming year, and will likely add some semantic search and semantic web service integration. We look forward to SemTech 2008, to updating people on how we are doing with our learning plan wiki, and to presenting other advances that we may have made.

May 19, 2007

Leveraging semantics and Web 3.0

Monitor's LeveragePoint team is investing in semantic technologies. This may seem like an odd direction for a strategic consulting company focused on supporting its clients to grow in the ways that are meaningful to themselves, but there is method in our madness.

The LeveragePoint solutions are designed to deliver business outcomes. They are focused on driving results that will impact the top and bottom lines, market share and presence, and ultimately share value, and they have a track record of delivering this. Our current focus is on strategic marketing, see LeveragePoint for Strategic Marketing, and additional offerings around Strategic Pricing, Innovation and Strategic Analysis are in planning. So why apply semantics to this?

The short answer is that the LeveragePoint solutions are designed to help teams act on complex business problems, and that this requires an approach that supports dynamic change and configuration, clear communication and the connection of many concepts, processes and data sources. A longer answer must begin with a sketch of what semantic technologies are and why they bring significant leverage.

What are Semantic Technologies

Semantics involves the study of meaning. Here are a few definitions culled using Google's define: function.

Definitions of semantics on the Web:

"is the study of meaning"

"The use of language in meaningful referents, both in word and sentence structures"

"meaning. If a computer understands the semantics of a document, it understands the meaning, rather than just interpreting a series of characters."

As LeveragePoint is concerned with communicating methods and results between team members and with the users of the results, the ability to clearly define the meaning of different pieces of content, of tasks, of data and how these fit together is at the heart of our solutions.

We do this using the new generation of semantic languages that have been developed for use on the Internet. The World Wide Web Consortium (W3C) takes a leading role in this work and has excellent resources available at its website.

The main language that we are using at this point is the Resource Description Framework, or RDF, one of the foundations of the W3C's semantic web program. RDF is deceptively simple. At its root is the notion of a triple, a set of three things linked in the form subject-predicate-object. As long as each of these things has a URI, it can become a subject, object, or even predicate in turn, making it possible to build webs of connected meanings. Even a triple can become the subject of another triple. It looks something like this.

Triple_sketch_2 
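The same idea can be sketched in a few lines of Python. This is a rough illustration using plain tuples rather than a real RDF library, and the URIs under `http://example.org/` and the helper `objects_of` are invented for the example.

```python
# A minimal sketch of RDF-style triples using plain Python tuples.
# All URIs below are hypothetical examples, not a real vocabulary.
from typing import NamedTuple, Union

class Triple(NamedTuple):
    subject: Union[str, "Triple"]   # a URI, or another triple
    predicate: str                  # a URI naming the relationship
    obj: Union[str, "Triple"]       # a URI, or another triple

EX = "http://example.org/"

triples = [
    Triple(EX + "LearningPlan", EX + "hasPart", EX + "LearningGoal"),
    Triple(EX + "LearningGoal", EX + "supportedBy", EX + "Mentor"),
]

# A triple can itself become the subject of another triple.
claim = triples[0]
triples.append(Triple(claim, EX + "statedBy", EX + "StevenForth"))

def objects_of(subject, predicate, store):
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in store
            if t.subject == subject and t.predicate == predicate]
```

Because every element is just an identifier, following links from one triple to the next is what builds the web of connected meanings.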

Of course there is a great deal more to RDF than this. The best way to learn about this is to read the W3C's excellent introduction, the RDF Primer. For applications that require a more precise model, and the automated reasoning and inference this makes possible, the W3C has developed the Web Ontology Language OWL (this is not a mistake, it is a sly reference to a scene in Winnie the Pooh where Owl is spelt 'Wol'). This is a rich area of thought that we will return to many times.

Application at LeveragePoint

At LeveragePoint we use RDF to define all of the different components that go into our solutions. The different business objectives, methods, data, and the content describing them are broken down into the 'smallest coherent unit'. These units and, more importantly, the relationships between them are described in RDF. This does several things for us.

  1. It helps us think very clearly about how business outcomes are related to our different types of user and what outputs will support those business outcomes.
  2. It makes it possible to have highly configurable solutions, for different industries, in response to different events, even for each user.
  3. The RDF models are very useful as filters in search applications.
  4. By giving our models explicit representations we are able to evolve and improve them much more effectively.
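As a hedged illustration of the third point, the sketch below shows how triple-style descriptions might be used to filter search results. It is not our production code: the identifiers (`doc:pricing-guide`, `ex:industry` and so on) and the function `filter_hits` are made up for the example.

```python
# Sketch: using triple-style descriptions to filter search results.
# Every identifier here is hypothetical.
descriptions = {
    ("doc:pricing-guide", "ex:industry", "ex:Pharma"),
    ("doc:pricing-guide", "ex:supportsOutcome", "ex:MarginGrowth"),
    ("doc:launch-checklist", "ex:industry", "ex:Software"),
    ("doc:launch-checklist", "ex:supportsOutcome", "ex:MarketShare"),
    ("doc:segmentation-method", "ex:industry", "ex:Pharma"),
    ("doc:segmentation-method", "ex:supportsOutcome", "ex:MarketShare"),
}

def filter_hits(hits, required, store):
    """Keep the hits whose description has every required (predicate, object) pair."""
    return [h for h in hits
            if all((h, p, o) in store for p, o in required)]

# Pretend a keyword search returned these documents, in ranked order.
hits = ["doc:pricing-guide", "doc:launch-checklist", "doc:segmentation-method"]

pharma_share = filter_hits(
    hits,
    [("ex:industry", "ex:Pharma"), ("ex:supportsOutcome", "ex:MarketShare")],
    descriptions,
)
print(pharma_share)  # ['doc:segmentation-method']
```

The same descriptions could drive configuration as well: changing the required pairs changes which components a given user or industry sees.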

We also use RDF to organize and describe our internal knowledge base and ongoing learning. And in the future, we expect to add it to the web services we provide, to simplify integration.

Wider Business Potential

Analyst Mills Davis has described the future potential of these approaches as a Semantic Wave. As semantic technologies are adopted they will change the way in which we organize and share information, and they will help us to uncover and to link areas of knowledge in new ways. This will transform search, making it much more intuitive, but more importantly will help us to think about much larger webs of relationships in a more coherent manner. Issues such as climate change, the impact of demographic change, how innovation is cultivated and transferred, will all become much more tractable to human thought, and more and more people will be able to think together, uncover their differences, and explore resolutions. The first big impact is likely to be in the life sciences, where extensive use is already being made of semantic technologies in projects such as Ontogene. Monitor is one of the first consulting companies to recognize the potential of the semantic web, what some have called 'the mathematics of meaning', and to begin applying it to business and organizational issues in a way that impacts business results.

May 15, 2007

Learning Plans

LeveragePoint is about enabling effective action, but the people who deliver LeveragePoint have to learn constantly to do this effectively. This means that we need to actively plan and share how we learn, what we are learning, and how we are putting it to work. We will share some of this in the LeveragePoint Applications blog.

At LeveragePoint, a learning plan has a number of pieces: Strengths, Learning Styles, a link to Organizational Objectives, Learning Goals and Learning Themes. The Learning Goals are supported by Mentors, Buddies, Resources and demonstrated by Evidence. All of these are woven together in a dynamic model that supports search and sharing.

We begin by introspecting on our strengths and our learning styles. We then think about how these work together. We use the Gallup StrengthsFinder tool to discover our strengths, though no doubt other approaches could be used.

For example, my strengths (Steven Forth) are Strategic, Achievement, Ideation, Input and Learning. My learning style is Abstract, Historical, Written, Modeling and Social. These two sets alone tell you quite a bit about me, and help my colleagues work with me. Putting them together, though, adds a layer of insight.

Mapping Strengths to Learning Style

Strategic thinking requires high-level models and an understanding of how and why things are as they are. This is necessary to effect change. My abstract, historical, modeling learning style both underlies and supports my strength in strategic thinking.

Achievement – the drive to excel. I spend a lot of time thinking and learning and working to apply what I am learning to test it and see if it will drive success.

Ideation – abstract thought and modeling drive new ideas, as does the urge to go back to the source and explore other paths and potentials. Generating and sharing ideas through conversations is key to both my learning and my achievement.

Input – you only have to look at one of my tag clouds to know that I search out information and apply it to ideas and models.

Learning – well that is a bit recursive, and recursive structures are something that fascinates me. Google’s application of recursive structures to its technical architecture is something to learn from, and one can think of Chris Argyris’s double loop learning as a form of recursive structure.

Organizational Goals

Like most organizations, Monitor's LeveragePoint team has goals and these goals are an important input into our learning plans. Organizational goals are built on the SMART model (goals that are Specific, Measurable, Achievable, Relevant and Timely) and each business function has its own cascading goals. Standard stuff. Asking each person to explicitly link Learning Goals to Organizational Goals helps all of us understand the Organizational Goals and what it will take to achieve them. We have also found that it helps to build commitment.

Learning Goals

Once we have shared something on how we learn, each person develops a set of three-to-seven Learning Goals. These are also meant to be SMART goals. A good Learning Goal has several parts.

  • The learning goal and sub-goals.
  • Its link to the organizational goals.
  • A mentor.
  • Learning buddies (optional).
  • Resources (these can be people, communities of practice, books, conferences, conversations, web resources, processes, etc.).
  • Evidence (internal and external).
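The parts listed above map naturally onto a small data structure. The following Python sketch is one possible shape, not our actual wiki schema; the class name, field names and sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LearningGoal:
    goal: str
    organizational_goal: str          # link to the organizational goal
    mentor: str
    sub_goals: List[str] = field(default_factory=list)
    buddies: List[str] = field(default_factory=list)    # optional
    resources: List[str] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Name the required parts that are still empty (buddies are optional)."""
        required = {
            "mentor": self.mentor,
            "resources": self.resources,
            "evidence": self.evidence,
        }
        return [name for name, value in required.items() if not value]

# Hypothetical example goal.
goal = LearningGoal(
    goal="Apply RDF modeling to a solution",
    organizational_goal="Deliver configurable solutions",
    mentor="(mentor name)",
    resources=["W3C RDF Primer"],
)
print(goal.missing_parts())  # ['evidence']
```

A check like `missing_parts` is a simple way to see which goals still need evidence as the plan is updated.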

Learning Themes

One interesting thing we have discovered in managing our own learning plans is the emergence of common learning themes. Themes are inferred from individual (or team) learning goals, and have common buddy groups, resources or evidence. The emergence of a learning theme is sometimes a leading indicator of new business opportunities.
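One simple way to surface candidate themes is to look for resources shared across several people's goals. The sketch below assumes that approach; the people, goals and resources are invented, and `candidate_themes` is a hypothetical helper rather than something built into our wiki.

```python
from collections import defaultdict

# Hypothetical data: (person, learning goal, resources used)
goals = [
    ("Ana", "Model solutions in RDF", {"RDF Primer", "SemTech 2007"}),
    ("Ben", "Build a semantic wiki", {"Semantic MediaWiki", "SemTech 2007"}),
    ("Cho", "Improve pricing analysis", {"Pricing workshop"}),
]

def candidate_themes(goals, min_people=2):
    """Map each resource shared by at least `min_people` people to their goals."""
    by_resource = defaultdict(list)
    for person, goal, resources in goals:
        for resource in resources:
            by_resource[resource].append((person, goal))
    return {
        resource: entries
        for resource, entries in by_resource.items()
        if len({person for person, _ in entries}) >= min_people
    }

themes = candidate_themes(goals)
```

The same grouping could be run over buddy groups or evidence instead of resources.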

What does this look like?

There are a number of pieces in these learning plans and a sketch may help show how they fit together.

Sketch_of_learning_plan


This is no more than a sketch, and the model evolves with use as new concepts and relationships are uncovered. But it is a starting point.

Implementation

At Monitor's LeveragePoint group we manage our learning themes using a Semantic MediaWiki. MediaWiki is an open source software package that was originally developed to run Wikipedia. The Semantic MediaWiki adds in a semantic layer that supports RDF and allows users to jointly create a much more robust knowledge model. As we use semantic technologies extensively in our solutions, this gives everyone on the team a chance to get their hands dirty, so to speak, and learn by doing. Most people update this wiki several times a week, as new resources are used and as evidence accumulates. In fact, this wiki is slowly becoming our knowledge management system!

Presentation at SemTech 2007

As you can imagine, several of us share 'semantics' as a learning theme. This is a deep area with a lot happening and much to learn. One piece of evidence that we are progressing towards this goal is making presentations at conferences on our work, and the first such presentation will be at SemTech 2007 in San Jose later this May.

Wikipedia

If you are interested in learning a bit more about learning plans, there is a useful article on Wikipedia.

May 12, 2007

Enabling Action

Monitor’s LeveragePoint solutions are built around enabling effective action. What does this mean?

Enabling – we assume that the individual and team are central. Our solutions help people to work more effectively, make better choices and then to scale the implementation of the choices. They do not pretend to replace people or the need for hard thought and real choices (a ‘real choice’ is a choice that prevents you from making some other choice, or as Pankaj Ghemawat says, strategy means commitment).

Effective – the choices and actions that follow from them have an impact on the organization. The impact is determined by measuring the outcomes of the choices and the outputs of the action (a double loop). Hypotheses are stated clearly and then tested. Effective action is action that changes in response to the outcomes and the environment. It is not static.

Action – as choices are made they have to be acted on. Even the situation where the ‘choice is not to act’ is a kind of action, based on hypotheses, and the outcome needs to be measured and the choice reflected on. In the CAM, action is divided into tasks and their outputs, which are explicitly linked to outcomes. The outcomes are linked back to the choices.

Our approach to enabling effective action has been captured in what we call our CAM, a formal model of Choice-Action-Measurement. The CAM is being developed together with our clients and users and is grounded in Jonathon Levy’s vision of real-time change management and user-centric learning. Its role in the LeveragePoint solutions is to provide an overall framework to support rapid configuration and evolution of action-focused solutions.
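The links between choices, tasks, outputs and outcomes can be sketched as a small data model. This is a simplified illustration based only on the description in this post, not the CAM itself; the class names, fields, `target` threshold and sample values are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Choice:
    description: str
    hypothesis: str            # what we expect to happen if we act on the choice

@dataclass
class Task:
    name: str
    output: str                # the tangible result of the task

@dataclass
class Outcome:
    name: str
    choice: Choice             # outcomes link back to the choice they test
    tasks: List[Task] = field(default_factory=list)
    target: float = 0.0
    measured: Optional[float] = None

    def hypothesis_supported(self) -> Optional[bool]:
        """None until a measurement exists; otherwise whether the target was met."""
        if self.measured is None:
            return None
        return self.measured >= self.target

# Hypothetical example.
choice = Choice("Enter a new segment", "Buyers there respond to value-based pricing")
outcome = Outcome(
    name="Win rate in new segment",
    choice=choice,
    tasks=[Task("Build value model", "Value model document")],
    target=0.30,
)
outcome.measured = 0.35
```

Keeping the hypothesis on the choice and the measurement on the outcome is what closes the loop: a measured outcome can always be traced back to the choice it was meant to test.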

I will write more about the LeveragePoint CAM in future posts, but here I want to look back at some of the work that it is grounded in.

Action Science and Chris Argyris

One of the pleasures of working at Monitor’s Cambridge office is to occasionally meet Chris Argyris in the hall and to have the chance to engage with him, and at times to have one’s motives, assumptions and behaviors questioned at profound and uncomfortable levels. The man is rigorously honest. When we began to look for the foundations of a model of effective action the first place we looked was to the work of Chris Argyris. His notions of ‘Double Loop Learning’, ‘Ladder of Inference,’ ‘Theory Espoused vs. Theory in Use’ are all coded into our work. But more important than any specific tool is his overall approach to learning, to how people work (and fail to work together), and to listening to what is actually said.

Observe Orient Decide Act (OODA Loops) and Col. John Boyd

Another important part of the Monitor family is Chris Meyer from Monitor Networks. We went to Chris early on in thinking about our CAM model and he suggested we look at the work of Col. John Boyd of the United States Air Force. Col. Boyd is one of the first thinkers on strategy to realize that in a networked world strategy has to be implemented in real time. Being a fighter pilot no doubt drove this. He came up with a model of Observe-Orient-Decide-Act that inspired our own work.

Cognitive Task Analysis (CTA) and Gary Klein

About seven years ago business consultant and Japan expert Carl Kay introduced me to Gary Klein’s book Sources of Power. In this book Klein investigates how people and teams make decisions in high stakes situations – fire fighters, pilots, doctors in intensive care units. This work has informed a growing practice of cognitive task analysis (CTA) that researches how people act effectively to get things done.

Embodied Cognition and Ed Hutchins

One of my passions is sailing. "There is nothing - absolutely nothing - half so much worth doing as simply messing about in boats," as Ratty says to Mole in Kenneth Grahame's beloved 1908 classic, The Wind in the Willows. So I was delighted when cognition researcher Brian Fisher introduced me to Ed Hutchins's classic work Cognition in the Wild. Hutchins's work is focused on understanding how teams carry out complex tasks in the real world. The ‘real world’ in Hutchins's book is that of coastal navigation by the US Navy and open ocean navigation by the Micronesians. In both cultures cognitive tasks are shared between people, complex methods are reduced to practice, and lives are at stake. There is now a compelling body of work on embodied cognition, and related areas such as the role of metaphor and blends in understanding the world, but it is to Ed Hutchins that I turn when I need to go deeper.

All of these are deep themes and there is much more to explore. I hope to return to each of these ‘points of origin’ over time.

Links

Commitment by Pankaj Ghemawat
When I was deciding whether or not to move from Vancouver and join Monitor, our Chief Content Officer Alan Kantrow gave me a copy of this book. Reading it helped me to think deeper about my own decision to move to Cambridge and what I was committing to at Monitor. Knowing what you are committing to includes thought about what you will not be able to do going forward. 

Action Science by Chris Argyris, Robert Putnam and Diana McLain Smith
This is one of many excellent books by Chris Argyris that set out his thoughts on how people and organizations learn.

Blur by Stanley M. Davis and Christopher Meyer
Many of our world systems are at inflection points, which is why real-time change management has become such a compelling value proposition. This book captures the drivers of this change better than most. “The Speed of Change in the Connected Economy.”

Observe Orient Decide Act or OODA Loops
Col. John Boyd published only two papers on OODA loops, but they are compelling. The Wikipedia article is an excellent introduction.

Sources of Power by Gary Klein
The best introduction to cognitive task analysis because it shows what motivated the work and describes real people in real situations.

Carl Kay’s blog is here.

Cognition in the Wild by Ed Hutchins
Read this book.

Brian Fisher is one of the most innovative thinkers on the psychology of human computer interaction. Learn more here.

And of course The Wind in the Willows.

May 07, 2007

Welcome

Welcome to the LeveragePoint Applications blog. In this blog I will cover the various methods and technologies that lie behind the LeveragePoint solutions. Emerging applications of new methods and technologies to performance will also be explored. My role at The Monitor Group is to provide technical leadership for the LeveragePoint solutions.

I joined the group in the fall of 2006, drawn by Jonathon Levy's vision of the potential to greatly improve human performance in all areas of life by providing support for action and learning in the context of action. This vision is deeply rooted in an approach that values people as people who act together in communities, and are well able to take responsibility for their own actions.

I was also attracted by the community of thought leaders that The Monitor Group has brought together. These include people responsible for the development of leading-edge methods in competitive strategy, such as Michael Porter, organizational development, such as Chris Argyris, and scenario building - the entire team at GBN, led by Eamonn Kelly. Monitor also has the most powerful approaches to marketing (in its GrowthPath offering) and to Strategic Pricing that I have encountered. The opportunity to work in such an environment was too good to pass up.

My own background is varied. In the 1980s I lived in Japan, working at first as a translator (Japanese to English) and then in various media and software development projects, with a focus on software localization and multilingual computing. In the 1990s I was based in Vancouver, BC, where I led companies involved in software localization, multimedia and website development and knowledge management. From 2001 I became involved in the learning industry, where I focused on middleware to integrate the many different types of software application and content used to support learning. I am glad to be at Monitor, where our focus is directly on how to help organizations grow in the ways that are meaningful to themselves, and to do this by delivering cutting-edge software that helps people and teams directly.

To get a better feeling for my interests, you can explore my tags at CrowdTrust.net http://crowdtrust.net/user/Steven+Forth/memos