Adaptive sampling for thresholding in document filtering and classification [An article from: Information Processing and Management]

https://www.ebooknetworking.net/books_detail-B000RR84EU.html

Publisher: Elsevier

Book Details

ISBN / ASIN: B000RR84EU
ISBN-13: 978B000RR84E4
Marketplace: France 🇫🇷

Description

This digital document is a journal article from Information Processing and Management, published by Elsevier. The article is delivered in HTML format and is available in your Amazon.com Media Library immediately after purchase. You can view it with any web browser.

Description:
Document filtering (DF) and document classification (DC) are often integrated together to classify suitable documents into suitable categories. A popular way to achieve integrated DF and DC is to associate each category with a threshold. A document d may be classified into a category c only if its degree of acceptance (DOA) with respect to c is higher than the threshold of c. Therefore, tuning a proper threshold for each category is essential. A threshold that is too high (low) may mislead the classifier to reject (accept) too many documents. Unfortunately, thresholding is often based on the classifier's DOA estimations, which cannot always be reliable, due to two common phenomena: (1) the DOA estimations made by the classifier cannot always be correct, and (2) not all documents may be classified without any controversy. Unreliable estimations are actually noises that may mislead the thresholding process. In this paper, we present an adaptive and parameter-free technique AS4T to sample reliable DOA estimations for thresholding. AS4T operates by adapting to the classifier's status, without needing to define any parameters. Experimental results show that, by helping to derive more proper thresholds, AS4T may guide various classifiers to achieve significantly better and more stable performances under different circumstances. The contributions are of practical significance for real-world integrated DF and DC.
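The per-category threshold rule described in the abstract can be illustrated with a minimal sketch. This is not AS4T itself, whose sampling procedure the listing does not spell out; it only shows the acceptance test "classify d into c if DOA(d, c) is higher than the threshold of c". The function name `classify`, the category names, and all scores and thresholds are hypothetical values chosen for illustration.

```python
from typing import Dict, List


def classify(doa: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the categories whose DOA strictly exceeds the category threshold.

    An empty result means the document is filtered out (accepted by no category).
    """
    return sorted(c for c, score in doa.items() if score > thresholds[c])


# Hypothetical thresholds, one per category.
thresholds = {"sports": 0.6, "finance": 0.4}

# Hypothetical DOA estimations produced by some underlying classifier.
doc_a = {"sports": 0.75, "finance": 0.10}
doc_b = {"sports": 0.30, "finance": 0.35}

accepted_a = classify(doc_a, thresholds)  # accepted by "sports" only
accepted_b = classify(doc_b, thresholds)  # accepted by no category: filtered out

# A threshold set too high rejects a document that was accepted before.
too_strict = classify(doc_a, {"sports": 0.9, "finance": 0.4})
```

Because filtering and classification share the same test, a single set of thresholds decides both whether a document is kept at all and which categories it enters, which is why unreliable DOA estimations can distort both decisions at once.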