Statistical analysis of web documents: a proposal and a case study