Author : Shipra Mittal
Affiliation : Department of Computer Science & Engineering, National Institute of Technology
Country : India
Category : Computer Science & Information Technology
Volume, Issue, Month, Year : 6, 4, November, 2016
ABSTRACT
With the increasing growth of the Internet and the World Wide Web, information
retrieval (IR) has attracted much attention in recent years. Quick, accurate and
high-quality information mining is the core concern of successful search companies.
At the same time, spammers try to manipulate IR systems to serve their own covert ends.
Spamdexing (also known as web spamming) is one of the spamming techniques of
adversarial IR; it allows its users to manipulate the ranking of specific documents in
the search engine result page (SERP). Spammers exploit different features
of the web indexing system for malicious motives. Suitable machine learning
approaches can be useful in the analysis of spam patterns and the automated detection
of spam. This paper examines content-based features of web documents and discusses
the potential of feature selection (FS) in upcoming studies to combat web spam.
The objective of feature selection is to select the salient features to improve
prediction performance and to understand the underlying data generation
techniques. A publicly available web data set, WEBSPAM-UK2007, is used
for all evaluations.
Keywords : Web Spamming, Spamdexing, Content Spam, Feature Selection & Adversarial IR
For More Details : https://airccj.org/CSCP/vol6/csit65103.pdf