Making Sense of Microposts #2012, 16 April, Lyon, France

This year, for the second time, we are organizing the “Making Sense of Microposts” workshop, which aims to bring together people from research and industry who work with tweets, check-ins and other low-effort user contributions (hereafter, microposts). Today, with the expansion and maturity of Web 2.0, where user interactions are becoming ever more frequent and ever more abbreviated, processing microposts and extracting meaning from them matters more than ever. We would particularly like to invite start-ups that offer services in this area to submit demonstrations of their applications. French start-ups have one more reason to submit this year: #MSM2012 will take place in France, in Lyon, as part of the WWW conference (the largest conference in the Web field).

Why submit your demos to #MSM2012:
- to expose your ideas and prototypes to the critical eye of the research world. Each submitted demo will receive at least 2 detailed reviews from researchers experienced in the field.
- to meet other people from research and industry, become better connected within your field of activity, and set up partnerships with research institutions around the world that work on problems similar to yours.
- to learn about existing work and initiatives potentially relevant to your activity.

We look forward to seeing many of you there.

For any information about #MSM2012, visit http://socsem.open.ac.uk/msm2012/
For information about registration fees: http://www2012.wwwconference.org/?page_id=2317

Why Is the Era of Web Search Coming to an End?

The growing quantity of information that reaches users through many different channels (Twitter, Facebook, e-mail, etc.) increasingly saturates the user’s attention. In such a situation it is reasonable to think that the user will feel less and less in need of information, and so will be less likely to turn to Web search. It is thus more likely that the user will prefer to let information compete for his attention than to explicitly perform a search.

If we look at users and data from a supply/demand market perspective, it becomes obvious that the supply of information is getting much bigger than the demand. I have found data about the number of internet users worldwide, as well as data about the quantity of Web information. This allowed me to calculate an Information per User index (graph shown below). Although the number of Web users is growing exponentially, the growth of information quantity is even steeper. It is therefore normal to expect the information to fight for the attention of the user, and not the user to look for the information.
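As a minimal sketch of how such an index could be computed, assume two yearly series: one for the quantity of Web information and one for the number of users. The figures below are arbitrary placeholders for illustration, not the data behind the graph, and the function name `information_per_user` is hypothetical.

```python
def information_per_user(info_by_year, users_by_year):
    """Return {year: information quantity / number of users}.

    Only years present in both series are kept.
    """
    index = {}
    for year in sorted(info_by_year.keys() & users_by_year.keys()):
        users = users_by_year[year]
        if users <= 0:
            raise ValueError(f"user count for {year} must be positive")
        index[year] = info_by_year[year] / users
    return index


# Placeholder figures in arbitrary units (assumption, for illustration only).
info = {2000: 100.0, 2005: 1000.0, 2010: 10000.0}
users = {2000: 10.0, 2005: 40.0, 2010: 100.0}

index = information_per_user(info, users)
```

With these placeholder values both series grow, but the index still rises from year to year, which is the pattern the argument relies on.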

The implications of this difference in growth are numerous. Firstly, the way we interact with information is likely to change. It is reasonable to expect users to favor acquiring information through recommendations and other means by which information reaches them, rather than by explicitly looking for it. Secondly, it would be interesting to observe whether the information surplus and the inability of users to consume all the information would lead to a fall in information production, and thus to an “information crisis” comparable to the phenomenon of economic crisis. Our ability to manage the growth and the use of information will have a major impact on the future of Web technologies.

Relevance and Discovery: Are They Opposed in the Fight Against Digital Saturation?

“Open your eyes,” says a voice in the Alejandro Amenábar film whose title repeats that phrase. The voice then confronts us with the revelation of a reality other than the one we have just been watching: a reality we did not expect, but which must have existed without our knowing, and which explains the ambiguities of the reality we perceived.

When we use the Web today, we often find ourselves in a similar situation: we have the impression of being well informed, while relevant information can easily escape us. At the very beginning of the Web, in the 1990s, the Web was a source of discoveries. We used it to discover new people and unknown, completely unexpected things; to develop new interests; to enrich ourselves. With Web 2.0 tools, the quantity of content on the Web has grown exponentially, but the capacity of users to consume this new wealth has its limits. Faced with this enormous influx of information on the Web, the major players (social networking sites in particular) have developed filtering and personalization strategies that let users focus on the most relevant subsets of the Web. Enabling the user to make relevant use of this enormous information space is the raison d’être of these sites.

On Facebook, for example, we see only information published by our friends. Even if this information is indeed likely to be the most relevant, it gives far from a complete view of the information we might be interested in. By focusing on filtered information, a user is deprived of discoveries. The natural need of users to follow their curiosity, to discover new things, to be enriched, is increasingly neglected in the Web services we use today.

Even the features put in place to improve the user experience by suggesting relevant things remain rather naive. On Amazon, for example, a customer who wants to buy a particular book will be offered the books most frequently bought by people who bought that same book. This feature is indeed useful, and it makes it easier to navigate the thousands of items offered by Amazon, but it opens our eyes only to what people very similar to us have looked for. We cannot really make enriching discoveries that let us explore a world beyond our known limits, as we could in the early days of the Web.

In a way, these limits can appear inherent to filtering and personalization. They can seem inevitable in the fight against digital saturation. It is therefore hard to imagine a system capable of going through the growing mass of information on the Web and offering the user something completely new, unknown and surprising that is at the same time relevant. Nevertheless, at hypios, we refuse to believe that the end of the playful era of discoveries on the Web has come. For this reason we have looked for ways to open the eyes of users to possible discoveries, while still keeping a high degree of relevance.

Possible solutions, using the Semantic Web, will be presented in an upcoming blog post.

Different Flavors of Relatedness

In earlier blog posts I talked about Semantic Proximity of concepts and using Linked Data to derive a notion of semantic relatedness. Driven by a more theoretical part of my thesis, I was led to consider other ways to compute the relatedness of concepts, such as those based on co-occurrence in texts, or those relying on the social graph. While these approaches may perform differently in different situations, nothing stops us from combining them. Obviously, if two different notions identify a pair of concepts as mutually related, then we can be more certain about their relatedness. But there is an additional richness in combinations, as different combinations of notions might result in different types of relatedness. The following image represents different types of relevance notions and the classes of relatedness emerging from their combinations.

Different notions of relevance

Social Relevance comes out of social connections or similarity between people. The systems that use this notion rely on the assumption that a person is likely to be interested in what the person’s friends are interested in. Facebook suggests friends of our friends as people we might be interested in befriending. It also shows content liked by our friends as relevant to us. Other systems construct user profiles and, in the absence of any information about friendship, deduce which people are similar and use the profiles of those similar people to recommend things (in a way similar to what Amazon does).

Advantages

  • The basic assumption of this approach is strongly confirmed by actual human practices [find studies that show that]. People often like to know what their friends are interested in. Friendships and connections contribute to the development of interests, and therefore recommendations based on this assumption are likely to be judged desirable.
  • Familiar to users, who can easily understand why something is recommended to them.

Disadvantages

  • Often difficult to construct due to the lack of transparency of the social graph. It is difficult to obtain social graph information, and this approach is mostly applicable only to social networks that have access to such data.

Content Relevance comes out of co-occurrence of concepts/terms in texts. The basic assumption behind this approach is that if two terms or concepts appear frequently together in texts, or in similar sets of concepts, they are likely to be related and relevant for one another. Such an approach is used by Google AdWords, which looks into terms that co-occur in search queries and suggests relevant terms for advertising campaigns, and by Google Suggest, which proposes useful additional keywords in Web search.

Advantages

  • Relatively easy to obtain a corpus on the Web, which makes this method highly accessible.
  • Tools for performing it are available as open source.
  • Widely used and known by developers.

Disadvantages

  • The quality of recommendations depends heavily on the corpus used, and its fitness for the recommendation domain and scenario.
  • Relatively easy to influence the results by producing content with the intention of enforcing false relevance. Content farms represent a threat to the approach if Web content is used without restriction.
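The co-occurrence assumption behind Content Relevance can be illustrated with a minimal sketch. It counts how often two terms appear in the same document and ranks the terms that co-occur most with a given one. The function names, the whitespace tokenization and the three-document corpus are assumptions made for illustration; this is not how AdWords or Google Suggest work internally.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered pair of terms, the documents containing both."""
    counts = Counter()
    for doc in documents:
        # Naive tokenization: lowercase and split on whitespace.
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


def related_terms(term, counts, top_n=3):
    """Return the terms that co-occur most often with `term`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == term:
            scores[b] += n
        elif b == term:
            scores[a] += n
    return [t for t, _ in scores.most_common(top_n)]


# Toy corpus (assumption, for illustration only).
documents = [
    "semantic web search",
    "semantic web data",
    "web search engine",
]
counts = cooccurrence_counts(documents)
suggestions = related_terms("web", counts, top_n=2)
```

The sketch also makes the first disadvantage visible: the suggestions are entirely determined by the corpus that is fed in.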

Semantic Relevance comes out of relations of concepts explicated in some semantic knowledge base/graph. Approaches using WordNet, DBpedia and similar knowledge bases have been proposed, mostly in research, to establish a notion of semantic relatedness and use those knowledge bases for concept suggestion.

Advantages

  • The approach is based on meaning, and is therefore likely to provide more complete and less expected recommendations than statistics-based approaches.

Disadvantages

  • The quality of recommendations depends heavily on the chosen knowledge base, and its fitness for the recommendation domain and scenario.
  • The availability of knowledge bases usable in this approach is not high, and for some cases the application of this method would have to involve a construction of a specific knowledge base.

Combined Approaches

Once we have outlined the 3 basic notions of relevance it is interesting to look at their possible combinations. Being grounded in different basic assumptions, the 3 basic approaches produce qualitatively different suggestions of related concepts. We look at those differences and provide an overview of their possible combinations, by trying to predict the qualitative nature of recommendations that the combined approaches would be able to provide.

Social, Content and Semantic relevance

Concepts that are considered relevant by all 3 notions of relevance are likely to be the most highly relevant concepts, almost the same as the initial input concepts.

Social and Semantic relevance, non-Content

Concepts that are both related by meaning and used by connected and similar people indicate things that are used by the same circle of people and that are related by meaning. Recommendations based on this combined notion can help define communities of practice, and especially point to concepts that are not often used in the same context, but rather used by the same and similar people in different contexts.

Social and Content relevance, non-Semantic

Concepts that often co-occur in content and are used by people who are connected, are likely to define common situations and contexts that a particular community usually faces. The co-occurrence in texts indicates that the concepts are used in the same context (the one that the text is about), and the additional relevance achieved by connected people indicates that this context is actually used by people who know each other (or who may otherwise be considered as similar). However, because of the lack of semantic relations between the concepts, it is not likely that the people are connected by their domain of knowledge and activity, but rather by other interests and affinities.

Semantic and Content relevance, non-Social

Concepts that are both related by their meaning and co-occur in content are likely to represent similar or interdependent things that are often mentioned together because of their functional interdependence.

Social, non-Content, non-Semantic

Concepts that are relevant only in the social sense, with no semantic relevance and no co-occurrence in content, are likely to be interest associations: things that similar and like-minded people are interested in, but that are so different that they may rarely be referred to in the same context. Relevance in this sense might, for instance, result from the fact that people interested in Football often befriend people interested in Biology.

Content, non-Social, non-Semantic

Concepts related only by co-occurrence in content, without any semantic similarity and without a community using them together, are likely to define a vocabulary of situations and contexts that people who are neither like-minded nor connected can face.

Semantic, non-Social, non-Content

Concepts related only by meaning, and neither used by similar/connected people nor co-occurring in content, are likely to be related concepts that a common user would not think of as related but would recognize as such. Their lack of joint use means such semantic connections are often overlooked, possibly even by experts, as those relevance relations do not take part in defining the communities of practice.
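The combinations above amount to a lookup on three yes/no signals. A minimal sketch, assuming each notion of relevance has already been reduced to a boolean for a given pair of concepts; the function name and the short labels are my own shorthand for the descriptions above, not an established terminology.

```python
def relatedness_class(social, content, semantic):
    """Map the three relevance signals for a concept pair to a class label."""
    labels = {
        (True, True, True): "near-equivalent concepts",
        (True, False, True): "community of practice",
        (True, True, False): "shared situations of a community",
        (False, True, True): "functionally interdependent things",
        (True, False, False): "interest associations",
        (False, True, False): "vocabulary of shared situations",
        (False, False, True): "overlooked semantic relations",
        (False, False, False): "unrelated",
    }
    return labels[(bool(social), bool(content), bool(semantic))]


# Example: a pair flagged only by the social notion.
example = relatedness_class(social=True, content=False, semantic=False)
```

In practice each signal would come from a score and a threshold, and the choice of thresholds would shift pairs between classes.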

 

Amy Winehouse

I generally do not care much for music. Weirdly enough, I can imagine life without it and I am not touched by the majority of music I hear. There was however one (and so far only one) artist who managed to deeply touch me with her music – Amy Winehouse. Her voice, her songs, her attitude, her style – everything was so beyond everything else we can see that one can only stop, listen and admire.

She was a truly exceptional artist, and one of the rare people who could push the boundaries of what we know and make a step forward. Since she died, I wonder every day what kind of society we have built if exceptional people can just die of an overdose as if their illness was their own responsibility. It is a society that assumes no responsibility whatsoever, and yet we are paying a lot for all of its institutions to function. What is the true purpose of the health system, I wonder? Whom does society serve?

Discovering the Unknown Relevant Keywords

Research approaches to keyword suggestion have been around for quite some time. The need to help users choose their keywords for tagging, web search and similar tasks led to the development of a number of ways to suggest relevant keywords. Today, with the advent of web advertising, finding relevant keywords has taken on a completely new dimension, as suggesting keywords no longer means just helping the user navigate the Web, but also means driving relevant visitors to your Web page. More and more services offer to suggest relevant keywords that cost less in advertising campaigns and that can bring you more traffic. However, there is an important dimension that those approaches have been missing, and that can significantly improve the way we discover new relevant keywords: their meaning. In this blog post I talk about how we use this important dimension for our keyword discovery needs at hypios, and report on the interesting results we have had.

hyProximity

The existing keyword suggestion approaches rely on (a) co-occurrence of terms in text corpora; (b) co-occurrence in search results; (c) controlled taxonomies such as the Open Directory Project (ODP), and controlled vocabularies such as WordNet. Approaches (a) and (b) both provide quite limited potential for discovering unknown keywords, as they are based on co-occurrence. In other words, they look at terms that someone else has already used in combination with your initial terms, and suggest them. This does not allow the discovery of terms that are rarely used in combination with your initial terms, but that are very close in meaning. This is important, as the language we use on the Web is highly dependent on our own community of practice/thought. Going beyond the terms used by people similar to us is very difficult if we rely solely on co-occurrence. Approaches of type (c) have more potential, as they do not use co-occurrence-based statistics but rely on taxonomies and vocabularies. However, ODP is a Web directory, and thus the relations between terms are defined by Web browsing practice. There might be semantic relations between terms that are not commonly browsed together, and that thus would not appear in ODP. WordNet, on the other hand, is more oriented towards finding synonyms, and remotely related terms fall outside its scope.

For these reasons, we have turned to a Semantic Web-based approach, using DBpedia, a Semantic Web version of Wikipedia, to discover relevant terms. In DBpedia, terms (concepts) are grouped in categories by their meaning. As such, this source of encyclopedic knowledge should enable the discovery of keywords that are semantically related, but that an average user might not even know about.

Our system uses the distance between two terms in the graph of DBpedia semantic concepts to calculate their semantic relatedness, called hyProximity. The shorter the distance in the graph, the higher the hyProximity. The more links the two concepts share, the higher the hyProximity will be.
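To make the two rules above concrete, here is a minimal sketch on a toy graph. It is not the actual hyProximity formula: the scoring function `toy_proximity`, the way it combines distance and shared links, and the four-node graph are all assumptions chosen only to show a score that rises as distance shrinks and as shared links grow.

```python
from collections import deque


def shortest_distance(graph, start, goal):
    """Breadth-first search; returns the number of hops, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def toy_proximity(graph, a, b):
    """Illustrative score: higher for shorter paths and for more shared neighbours."""
    dist = shortest_distance(graph, a, b)
    if dist is None:
        return 0.0
    shared = len(set(graph.get(a, ())) & set(graph.get(b, ())))
    return (1 + shared) / (1 + dist)


# Toy undirected graph (assumption): concepts linked to each other and to a category.
graph = {
    "Semantic Web": {"Category:Web", "Linked Data"},
    "Linked Data": {"Semantic Web", "Category:Web"},
    "Category:Web": {"Semantic Web", "Linked Data", "Web search"},
    "Web search": {"Category:Web"},
}

close = toy_proximity(graph, "Semantic Web", "Linked Data")
far = toy_proximity(graph, "Semantic Web", "Web search")
```

In this toy graph the directly linked pair scores higher than the pair connected only through the shared category.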

Case Study

We have used hyProximity in our own use case at hypios, and have obtained very interesting results. Our standard procedure when we have a new innovation problem on hypios is to take the keywords related to the problem and look for experts in our giant, cross-domain base of 900,000 experts. Finding keywords that are relevant to the problem but do not appear in the problem text is important in order to reach relevant experts in the most diverse domains, who might be able to bring an innovative solution. We have used hyProximity to obtain additional keywords for expert search, and compared those keywords with what we get from the AdWords Keyword Tool for the same inputs. We identified 1802 experts using the keywords directly present in the problem text, 2849 experts with hyProximity keywords, and 2061 experts using the keywords from the AdWords Keyword Tool. The most interesting phenomenon is that the overlap between the experts identified by hyProximity and AdWords keywords is very low. Finally, we measured the interest expressed by the identified experts (through their response to our e-mails). The response rate obtained in the hyProximity group was 10% greater than with the AdWords keywords, and 19% greater than with the keywords present directly in the text.
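One simple way to quantify the overlap between two groups of identified experts is the Jaccard index, the size of the intersection divided by the size of the union. The post does not say which measure was used, so this is an assumption, and the expert identifiers below are made up for illustration.

```python
def jaccard_overlap(set_a, set_b):
    """Return |A ∩ B| / |A ∪ B|, or 0.0 when both sets are empty."""
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical expert identifiers (assumption, for illustration only).
experts_semantic = {"e1", "e2", "e3", "e4"}
experts_cooccurrence = {"e4", "e5", "e6"}

overlap = jaccard_overlap(experts_semantic, experts_cooccurrence)
```

A value close to 0 means the two keyword sources reach almost entirely different experts, which is the situation described above.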

This result leads to the conclusion that there is a significant number of semantically related keywords that fall completely outside the scope of co-occurrence-based keyword suggestion approaches. If you trust that the non-semantic keyword suggestion approaches are giving you all the relevant keywords, then you are missing out on a lot of relevant traffic.

We are preparing a research publication and a public beta version of our tool, and will be disclosing more experiences with using semantic technologies for keyword discovery soon.

Making Sense of Microposts 2011

===============================================================

Workshop: Making Sense of Microposts (#MSM2011)
at ESWC 2011

http://research.hypios.com/msm2011

29/30 May 2011. Heraklion, Crete

===============================================================

THEME
——-

Making Sense of Microposts: Big things come in small packages

Twitter, Facebook Like, Foursquare, and similar low-effort publishing services reduce significantly the effort required to participate on the Web. Enormous quantities of small user input are being piped into the data streams of the Web, leading to a rate of growth never before witnessed. We refer to such user input as “Microposts”; these can range from ‘check-in’ at a location on a geo-social networking site, through to a status update on a social networking platform.
The very large amount of disparate, heterogeneous data that results requires new techniques to glean knowledge from it and provide useful services and applications sitting atop the amalgamation of the semantically rich data.

This workshop will examine, broadly:
* information extraction and leveraging of semantics from Microposts;
* making use of Microposts’ semantics in innovative ways;
* social studies that guide the design of appealing and usable new systems based on this type of data, by leveraging Semantic Web technologies.
The workshop is unique in that it aims to stimulate discussions between the Semantic Web community and researchers in other fields, particularly the Social Sciences, to build more effective and usable tools and techniques for their myriad end users. The interdisciplinary approach aims to help also to break down the barriers to the use of Semantic Web technologies and the very rich semantic data derived from user-generated content.

TOPICS OF INTEREST
——————

Topics of interest include, but are not limited to, the areas below. We especially encourage submissions from an interdisciplinary perspective, examining the use of semantic information extracted from microposts from Semantic Web, Social Sciences and other perspectives.

* Microposts and Semantic Web technologies
  o Knowledge Discovery and Information Extraction
  o Factual Inference
  o Ontology/vocabulary modelling and learning from Microposts
  o Integrating Microposts into the Web of Linked Data
* Social/Web Science studies
  o Analysis of Micropost data patterns
  o Motivations for creating and consuming Microposts
  o Relevance of Microposts and factors that influence them
  o Community/network analysis of Micropost dynamics
  o Ethics/privacy implications of publishing and consuming Microposts
* Context
  o Utilising context (time, location, feeling)
  o Contextual inference mechanisms
  o Social awareness streams and Online Presence
  o Event Detection
* Applying Microposts
  o User profiling/recommendation/personalization approaches using Microposts
  o Public opinion mining
  o Trend prediction
  o Expertise finding
  o Business analysis/market scanning
  o Emergency systems
  o Urban sensing and location-based applications

WORKSHOP STRUCTURE
——————

A keynote address will be used to open the day and guide initial discussions. This will be followed by paper presentations and open forum discussions based on the topics presented. A poster and demo session will be used to trigger further, more in-depth interaction between participants. The workshop will be concluded with a brief panel discussion and summary, with an aim to form a more permanent discussion group.

SUBMISSIONS
———–

* Full papers: 12 pages
* Short and position papers: 6 pages
* Demos: 2 pages
* Mock-up interfaces: 2 page description AND one of:
– storyboard (max A3)
– video (90 second limit)

Written submissions should be prepared according to the Springer LNCS Publications format (see: http://www.springer.com/computer/lncs?SGWID=0-164-6-793341-0), and should include author names and affiliations.
Submission is via the EasyChair Conference System, at: https://www.easychair.org/conferences/?conf=msm2011. Where a submission includes additional material, it should be submitted as a single, unencrypted zip file that includes a plain text file listing its contents.

Each submission will receive, in addition to a meta-review, at least 2 peer reviews, with full papers receiving up to 3 peer reviews.

IMPORTANT DATES
—————-

Submission deadline: 4 March 2011
Author Notification: 1 April 2011
Camera-ready deadline: 15 April 2011

Proceedings published (CEUR): 15 May 2011
Workshop – TBA: 29/30 May 2011

CONTACT
——-

E-mail: msm.orgcom@gmail.com
Facebook Group: http://www.facebook.com/#!/home.php?sk=group_180472611974910
Twitter hashtag: #msm2011

ORGANIZERS
———-

Matthew Rowe, KMi, The Open University, UK
Milan Stankovic, Hypios/University Paris-Sorbonne, France
Aba-Sah Dadzie, University of Sheffield, UK
Mariann Hardey, University of Durham, UK

PROGRAM COMMITTEE
———-
* A. Elizabeth Cano, University of Sheffield, UK
* Alexandre Passant, DERI, Galway, Ireland
* Andres Garcia-Silva, UPM, Spain
* Bernhard Schandl, University of Vienna, Austria
* Brian Loader, University of York, UK
* Claudia Wagner, Joanneum Research, Austria
* Dan Mercea, University of York, UK
* Danica Radovanovic, University of Belgrade, Serbia
* David Beer, University of York
* Elena Simperl, University of Innsbruck, Austria
* Eric T. Meyer, Oxford Internet Institute
* Fabien Gandon, INRIA, France
* Guillaume Ereteo, INRIA, France
* Harald Sack, University of Potsdam, Germany
* Harith Alani, KMi, Open University, UK
* Jelena Jovanovic, University of Belgrade, Serbia
* Jennifer Jones, University of the West of Scotland
* John Breslin, NUIG, Ireland
* Jon Hickman, Birmingham City University, UK
* Mischa Tuffield, Garlik, UK
* Oscar Corcho, UPM, Spain
* Pablo Mendes, Freie Universität Berlin
* Philipe Laublet, Universite Paris-Sorbonne, France
* Sofia Angeletou, KMi, The Open University, UK
* Raphael Troncy, Eurecom, France
* Robert Jaeschke, University of Kassel, Germany
* Sergei Sizov, University of Koblenz, Germany
* Shenghui Wang, Vrije University, Holland
* Uldis Bojars, University of Latvia, Latvia
* Victoria Uren, University of Sheffield, UK
* Yves Raimond, BBC, UK
* Ziqi Zhang, University of Sheffield, UK
* (additional PC members are still being confirmed)

Why Do Users Leave Their Data on the Web?

In one of my previous posts I talked about how value is created on the Web, and how users, by interacting with the services provided on the Web, generate value (mostly through generating data and giving attention, i.e. marketing space). Even longer ago, I wrote about why some social networks take off and others don’t (or at least about some of the possible reasons for this).

However, a key to understanding those phenomena related to user interaction with the Web, and the success or failure of some Web services, is understanding what motivates users to use the Web and interact. As a step towards such an understanding I have created a Web pyramid of needs, strongly influenced by Maslow’s general pyramid of needs. The idea is to try to explain the hierarchy of user needs (and find examples of Web services and their features that satisfy those needs). The basic needs are represented in the lower parts of the pyramid, and should normally indicate more frequent needs. The higher needs usually come later, once the basic ones are met.

The pyramid is still unstable and may be subject to change as new understandings emerge :) but comments are welcome at all stages of its development.

The name is Web. Semantic Web.

A couple of weeks ago Tim Berners-Lee published a new article for Scientific American, called Long Live the Web. This provoked a number of reactions and interpretations that have been circulating around the Web. Most of them include the accusation that TimBL supposedly made about Facebook and how it is putting the Web in danger. Then, there are people saying that companies who rely on users’ online data to target them and make a profit are also a danger to the Web. No matter how hard I tried, I simply couldn’t find such claims in the text, and this blog post is about my own impressions of this new milestone article.

Facebook: A danger for the Web?

Facebook is indeed a walled garden, largely limiting the possibility for users to export their data and migrate elsewhere. It is bad. True. But imagine being on a plane where they serve you rotten food and show targeted commercials all over the plane. You regret ever taking this plane, and you want to switch to a different plane. Can you really do that? While you are using the plane’s service, the plane acts as a walled garden. There is no interchangeability between planes. In order to make money, Facebook needs a certain commitment from the user that he will continue to use the service. The commitment is achieved through the user data. (With planes it is even worse, as the commitment is made with the user’s life directly.) If you regret choosing Facebook, then you should be able to leave and join another network, but the data that you created there during the period of your life when you used it will be lost. It is similar with the plane. You cannot get back the moments you lost on Ryanair. You can only choose more wisely next time.

Although worthy of every respect, the ideal of horizontal interoperability is, in my opinion, a utopia. By horizontal interoperability, I mean key data exchange between systems that serve the same purpose. On the other hand, vertical data interoperability is a more realistic dream, as there may be a clear economic benefit for one Web system in exchanging data with complementary systems. For instance, this happens when I allow a movie recommendation system to see what other movies I have liked on Facebook. Both Facebook and the movie site can benefit from this.

Again, in TimBL’s actual article it is difficult to find the place where he accuses Facebook of anything. He does say walled gardens are bad. Some of them are even on the verge of being classified as “non Web”, but they are not putting the Web in danger. They are just showing how young the Web still is, and how we still need to find ways to use it that are acceptable for all actors, both socially and economically. Facebook, even if problematic in many ways, is not putting the Web in danger. It is pushing it to its limits, expanding it, experimenting, searching for a model to provide a useful service and earn money. This pushes some problems to the surface, yes. As any research does.

Accessible user data: A danger for the Web?

There is a part of the article dealing with snooping. Although it is clearly written, I have seen interpretations in newspaper articles saying that companies that use publicly available user data are putting the Web in danger. This is simply not true. The Web is a large information space. It is normal that there is a lot of data about all of us. The fact that the data is more easily accessible than with legacy media is a plus, not a danger. Many companies use it to provide users with useful services (like importing contacts, etc.). This is no real danger. TimBL gives a good example of danger: “Life insurance companies could discriminate against people who have looked up cardiac symptoms on the Web. Predators could use the profiles to stalk individuals.” But in this example, the real danger is not in the fact that there is information about your searches online. The danger is in concluding that since you look for cardiac symptoms, you must be having problems. This inference is simply not sound and making it is pure stupidity. The danger is thus not in data accessibility, but in the stupidity of potential data consumers, and the power that is given to them despite their stupidity.

Stupidity is not a new problem, and certainly not a problem that has anything to do with the Web. We encounter stupidity every day in public administration. You cannot stop people from making incorrect inferences, but you can protect citizens from being victims of stupid inferences made by governments and administrations. Currently, people in public administration have the power to make the lives of ordinary citizens hell, based on their (often unsound) inferences. The solution is not to stop people from sharing data about themselves, from being social and adapting the Web to their use. The solution lies only in a true change in society and the way it distributes power. By increasing the accessibility of data online, the Web is bringing this fundamental problem of our society to our attention, and this is one of the reasons why the Web is one of the greatest discoveries ever: because of the impact it has yet to make on society and its democratization.

It is clear from both examples (as well as the rest of TimBL's text) that the Web has pushed the boundaries of society, and that the organization of today’s society is struggling to keep up with such a powerful tool that citizens now have.

Apart from those two topics that caused so much confusion, there are, in my opinion, some more interesting parts that offer an outlook for the future and underline the truly magnificent nature of the Web:

Separation of levels

The Web is not the Internet. It runs on top of the Internet. This has always fascinated me. It basically means that we could one day plug the Web on top of something else (as long as it is compatible). Like some new Internet that would run on bacteria instead of electricity. Or something totally unimaginable. Just the thought of it makes the Web so powerful.

Semantic?

Ten years have passed since the original Semantic Web article, and in today’s “Web” article, the Web is just the Web. The “Semantic” has dropped off. Two possible reasons exist: (1) the Semantic Web dream is not realistic, and (2) there is no other Web than the Semantic Web. I tend to believe that reason (2) is more realistic. It is hard to imagine the Web advancing without typed links, and without the possibility to link everything. Thus a bit of semantics is needed, and is an integral part of the Web vision, just as the Social Web is a natural part of the Web.
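
To make the idea of typed links a little more concrete, here is a minimal sketch in Python. It is only an illustration, not any standard's actual data model: the `Link` type, the example.org addresses, the link type names and the `targets` helper are all hypothetical names invented for this example. The point is simply that once a link carries a type, a program can follow only the links that mean what it is looking for.

```python
from typing import List, NamedTuple


class Link(NamedTuple):
    """A typed link: a source resource, the type of the link, and a target."""
    source: str
    link_type: str
    target: str


# Hypothetical example data; none of these addresses refer to real resources.
LINKS = [
    Link("http://example.org/alice", "knows", "http://example.org/bob"),
    Link("http://example.org/alice", "likes", "http://example.org/movies/1"),
    Link("http://example.org/bob", "likes", "http://example.org/movies/1"),
]


def targets(links: List[Link], source: str, link_type: str) -> List[str]:
    """Return the targets reachable from `source` through links of `link_type`."""
    return [
        link.target
        for link in links
        if link.source == source and link.link_type == link_type
    ]
```

With untyped links, all we could say is that Alice's page points to two other pages. With typed links, `targets(LINKS, "http://example.org/alice", "likes")` picks out only the movie she likes, which is the kind of distinction a movie recommendation site would need.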

The fact that the word “Semantic” is not so prominent in the text makes me wonder: are the technical aspects of the Web and its design less of a concern today, and is the relation between the Web and society the true challenge for the next decade of the Web era?

Semantic Web at WebDeux.Connect, Paris

Last Friday I had the pleasure of presenting the Semantic Web to a lively community of Parisian entrepreneurs, bloggers, tech geeks and start-up people at Webdeux.Connect 2010. Presenting there was a revealing experience in two ways:

1. It made clear to me how far the start-up world actually is from Semantic Web research, and how many people are totally unaware of its existence (not to mention its potential).

2. It showed how quickly, in talks with start-up people, we arrive at concrete cases where the Semantic Web can make a true difference.

I guess it means that we, the Semantic Web research community, have to make more effort to show not just how the Semantic Web can be used and applied to create better technology, but how this better technology can make a change in industry and on the market. This is the real message that speaks to companies, to the people who are on the front line of creating new technology.

In other words, there is great potential in bringing the research and start-up communities closer together here, and I hope more of it will happen in the future.

You may also watch the video interview I gave for a leading French technology website, FrenchWeb.fr.

Interview with Milan Stankovic, researcher @Hypios, from frenchweb on Vimeo.

 