This thesis evaluates the potential of computational techniques to leverage community knowledge in order to stop hoax emails. The techniques assessed include weighted term-based document matching using Lp similarity measures and a form of unweighted phrase-based matching. These are tested for their ability to identify hoaxes, to cope with the evolution of hoaxes and the effects of user-added content, and to keep false positive rates low. The overall system of which these techniques would form a part supports not only hoax identification, but also user education and feedback. It is observed that the simpler techniques, namely phrase-based matching and document similarity with an L1 metric, fare best without generating a large number of false positives. The L2 measure is also shown to perform reasonably well in this application, but suffers from oversensitivity when highly similar hoaxes are compared. A detailed look is taken at hoax emails and the reasons behind their propagation, and a concept system that brings together everything described is outlined. The effectiveness of some of the techniques trialled, coupled with a good understanding of both hoax emails and the benefits and limitations of using this community content, indicates that this fusing of community and technology has great potential.
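As a rough illustration of the two families of techniques named in the abstract, the sketch below compares an incoming message with a known hoax using an Lp distance over term weights and an unweighted phrase overlap. It is a minimal sketch under stated assumptions, not the thesis's implementation: the relative-frequency weighting, the three-word phrase length, the tokenisation rule and all function names are hypothetical choices made here for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens (an assumed, deliberately simple tokenisation)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def term_weights(text):
    """Relative term frequencies, a stand-in for whatever weighting the thesis uses."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()} if total else {}


def lp_distance(a, b, p=1):
    """Minkowski (Lp) distance between two sparse term-weight vectors.

    p=1 gives the L1 (Manhattan) distance, p=2 the L2 (Euclidean) distance.
    Smaller values mean the two documents are more alike.
    """
    vocab = set(a) | set(b)
    total = sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) ** p for t in vocab)
    return total ** (1.0 / p)


def phrases(text, n=3):
    """Set of n-word phrases in the text (n=3 is an assumption)."""
    tokens = tokenize(text)
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def phrase_overlap(known_hoax, candidate, n=3):
    """Fraction of the known hoax's phrases that also occur in the candidate."""
    known = phrases(known_hoax, n)
    if not known:
        return 0.0
    return len(known & phrases(candidate, n)) / len(known)
```

In a sketch like this, a message would be flagged when its L1 distance to a stored hoax falls below some threshold, or when its phrase overlap rises above one. Suitable thresholds would have to be chosen empirically and are not taken from the thesis.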
Publication status
Unpublished
Rights statement
Copyright the Author. The University is continuing to endeavour to trace the copyright owner(s) and in the meantime this item has been reproduced here in good faith. We would be pleased to hear from the copyright owner(s).