Towards Ontology-Based Web Text Document Classification

Document Type: Original Article


Egyptian Armed Forces, Egypt.


Data on the web is generally stored in structured, semi-structured, and unstructured formats, and according to surveys most of an organization's information is held in unstructured textual form. Categorizing this huge number of unstructured web text documents has therefore become one of the most important tasks when dealing with the web. Categorization (classification) of web text documents aims to assign one or more class labels (categories) to unlabeled documents; the assignment depends mainly on the content of the document itself, with the help of one or more machine learning techniques. Different learning algorithms have been applied to the content of text documents for classification. In this paper, experiments on a subset of the Reuters-21578 dataset highlight the weaknesses and limitations of traditional techniques for feature generation and dimensionality reduction, reporting the classification accuracy and F-measure obtained with different classification algorithms.
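The content-based labeling and the two evaluation measures named above can be illustrated with a minimal sketch. It is not the paper's experimental setup: it assumes bag-of-words features and a multinomial Naive Bayes classifier with Laplace smoothing as a stand-in for the unspecified learning algorithms. The toy documents, the labels "grain" and "crude", and all function names are invented for illustration and are not taken from Reuters-21578.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_naive_bayes(docs, labels):
    """Collect per-class document counts and bag-of-words counts."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior (Laplace smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    # Words never seen in training carry no evidence and are skipped.
    tokens = [t for t in tokenize(text) if t in vocab]
    best_label, best_score = None, -math.inf
    for label in sorted(class_docs):
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for t in tokens:
            score += math.log(
                (word_counts[label][t] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for one class (F1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only (not from Reuters-21578).
train_docs = [
    "wheat corn harvest exports rise",
    "corn crop yields and wheat prices",
    "crude oil prices fall as opec meets",
    "opec oil output and barrel prices",
]
train_labels = ["grain", "grain", "crude", "crude"]
test_docs = ["wheat harvest expected to rise", "oil barrel output falls"]
test_labels = ["grain", "crude"]

model = train_naive_bayes(train_docs, train_labels)
predictions = [predict(model, doc) for doc in test_docs]
print("accuracy:", accuracy(test_labels, predictions))
print("F-measure (grain):", f_measure(test_labels, predictions, "grain"))
```

In this sketch every distinct word becomes a feature, so the feature space grows with the vocabulary; that growth is what motivates the dimensionality reduction step the abstract refers to.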