Cyberspace. A consensual hallucination experienced daily
by billions of legitimate operators.

William Gibson, Neuromancer.

A life spent making mistakes is not only more honorable, but more useful than a life spent doing nothing.

George Bernard Shaw.

François/phnk

‘Semanticizing’ and ‘Patternizing’ the Web: Methodological Remarks

1 January 2006 · Web/design

John Allsopp has a very interesting survey report on WebPatterns and WebSemantics. My own survey is quoted as an attempt to identify recurrent structures in web design architectures.

Yet John’s own survey is much more interesting in many ways. Not only are his observations much more finely tuned than my own analytic criteria; his sample is also much, much larger. His stats therefore actually mean something and may be considered representative.

When I started writing my own article, I quickly ran into a methodological obstacle: I did not (and still do not) know enough about search engines to write my own crawler. I therefore switched to a qualitative approach to web design structures, focusing on a small group of highly influential professionals.

My work drew only a few criticisms, and those rightly pointed out that ‘highly influential’ is a very subjective label. While I was writing the draft for my second survey (unpublished and more or less abandoned as of now), I tried to back up my first intuitions with quantitative indicators, such as XFN relationship networks or conference appearances.

John got around this problem by writing a crawler. His sample is therefore a large one (n=1315), although we do not and cannot know the total number N of web sites from which this sample is drawn. Another question is where to crawl, and answering it means writing precise heuristics. Last but not least, gathering results usually raises additional questions: what do we need to know now? For example, why do maximum occurrence rates stay so low, even on extremely logical standard values (e.g. header and footer respectively concern less than 7 and 5 percent of the sample)?

If I were to launch a new survey, I would also care less about the number of variables than about their co-occurrence. Indeed, as observed previously, about 6.7% of the sampled web sites feature a footer and about 4.7% feature a header. Are these the same websites?
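To make the co-occurrence question concrete, here is a minimal sketch in Python. It assumes the crawler's output can be reduced to a mapping from each site to the set of class/id names found on it. The `co_occurrence` function, the site names and the toy data are all hypothetical and are not taken from John's survey.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sites):
    """Count single names and pairs of names that appear on the same site.

    `sites` maps a site identifier to an iterable of class/id names.
    Returns two Counters: one for single names, one for sorted name pairs.
    """
    singles = Counter()
    pairs = Counter()
    for names in sites.values():
        unique = set(names)  # count each name at most once per site
        singles.update(unique)
        # sorted() makes ("footer", "header") the only key for that pair
        pairs.update(combinations(sorted(unique), 2))
    return singles, pairs


# Hypothetical toy data, for illustration only.
sites = {
    "a.example": {"header", "footer", "nav"},
    "b.example": {"footer"},
    "c.example": {"header", "footer"},
    "d.example": {"content"},
}

singles, pairs = co_occurrence(sites)
both = pairs[("footer", "header")]
# Share of sites with a header that also have a footer.
share_of_header_sites = both / singles["header"]
print(both, share_of_header_sites)
```

On real data, comparing `both` with the two single counts would show whether the header sites are essentially a subset of the footer sites or a largely separate group.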

This last problem temporarily closes my list of issues about methods of data collection and subsequent analysis on the Web. Whatever our understanding of web design structures, we will face these methodological issues because they are inherent to the way the Internet is built, that is: messily. Just as for society, there is no map of all individuals. Social sciences usually rely on telephone numbers, physical addresses, job categories and household taxonomies, which we currently lack on the Web, despite the efforts of Google and Technorati.

We also need a way to make our research cumulative. Separate surveys have less value on their own than when they are connected to other results, whether similar or different in scope, nature and outcomes. ‘Capitalizing’ on our findings would mean joining the data and building occurrence schemes between surveys, which cannot be done without precise standards.
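As a rough illustration of why joining surveys depends on shared standards, the sketch below pools two hypothetical surveys. It assumes each survey maps a site to the names it recorded, and that a common vocabulary has been agreed on beforehand. The alias table, the `normalize` and `pool` helpers, and the data are invented for the example.

```python
# Hypothetical shared vocabulary: raw names mapped to an agreed standard name.
ALIASES = {"foot": "footer", "head": "header", "navigation": "nav"}


def normalize(name):
    """Map a raw class/id name onto the shared vocabulary."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


def pool(*surveys):
    """Merge surveys keyed by site, so a site seen twice is counted once."""
    merged = {}
    for survey in surveys:
        for site, names in survey.items():
            bucket = merged.setdefault(site.lower(), set())
            bucket.update(normalize(n) for n in names)
    return merged


# Two hypothetical surveys with different naming habits.
survey_a = {"a.example": ["Header", "foot"], "b.example": ["nav"]}
survey_b = {"A.example": ["footer"], "c.example": ["navigation", "header"]}

pooled = pool(survey_a, survey_b)
print(len(pooled), sorted(pooled["a.example"]))
```

Without the alias table, `foot` and `footer` would be counted as two different structures and the pooled rates would be misleading, which is the practical meaning of "precise standards" here.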

Those interested in the question of quantitative versus qualitative data, and in the larger problem of representativeness, will benefit from reading Designing Social Inquiry: Scientific Inference in Qualitative Research, an excellent book co-authored by Gary King from the Social Science Statistics Blog. The first chapter and the table of contents are available online.


Reference: François, ‘Semanticizing’ and ‘Patternizing’ the Web: Methodological Remarks, Boîte Noire, 1 January 2006.
Available online: http://phnk.com/blog/design/semanticizing-patternizing/.

Discussion

2 comments:

Thanks for the thoughtful analysis of my survey, Francois. Perhaps we can talk more and I could tweak the crawler to return more information that you are interested in.

john allsopp, 7 January 2006

Thank you for your offer, I’ll keep it in mind. I’m way too busy for the coming months but I will update my work one day, I’ll email you then.

François, 7 January 2006
