Project Objectives and Scope
The objective of the term project is for you to develop a good understanding of the given research topic, provide insight into the problem, and propose a well-defined strategy for solving it. Treat the term project as the initial background study for further in-depth research. In other words, the report should demonstrate enough understanding of and insight into the problem that, given enough time, you could carry it to its logical conclusion and complete the research.
Project Description
The project has two parts: an in-depth literature review and an implementation of a classification problem. For groups of 1 (i.e., an individual project) or 2 students, only the literature review is required; see details in the Deliverable section.
• Literature review. This part describes the problem domain, with a proper problem definition, and surveys existing work. The research topic of this term project is Web Mining and Content Analysis.
The sub-topics include: a. Crawling and indexing Web content; b. Web recommender systems and algorithms; c. Summarization of Web data; d. Data, entity, event, and relationship extraction; e. Knowledge acquisition and automatic construction of knowledge bases; f. Large-scale graph analysis. Please pick one of them.
You should be looking at the proceedings of conferences such as WSDM, WWW, SIGIR, ICDM, KDD, SIGMOD, VLDB, ICDE, … and at journals such as IEEE TKDE, Data Mining and Knowledge Discovery Journal, Journal of Intelligent Information Systems, Intelligent Data Analysis, World Wide Web, VLDB Journal, Knowledge and Information Systems and many others. Most of these publications can be obtained through DBLP: https://dblp.uni-trier.de/db/index.html. This list is not meant to be complete, and it may not even contain the most important venues from your perspective. Please do your own research and find the papers relevant to your chosen topic. Our hope is that by the time you complete the project, you’ll have a good idea of what the area is about and what the most important publications are.
• Classification Implementation: You need to implement the C4.5 decision tree algorithm in C++ and perform classification on the following dataset: https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009. You are required to split the dataset into 80% training data and 20% test data.
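As a starting point for the implementation part, the following is a minimal sketch of two building blocks: a reproducible 80/20 split and the gain ratio that C4.5 uses to score a candidate split. It is not a full C4.5 implementation (no threshold search, tree construction, pruning, or missing-value handling). The names Sample, trainTestSplit, entropy, and gainRatio are hypothetical, as is the fixed seed of 42; the sketch assumes all features are numeric and that a split is a binary test of the form feature <= threshold.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <random>
#include <utility>
#include <vector>

// One labelled example: numeric features plus an integer class label
// (for the wine data, the label would be the quality score).
struct Sample {
    std::vector<double> features;
    int label;
};

// Shuffle with a fixed seed, then cut at trainFraction (80% by default)
// so that the same split is produced on every run.
std::pair<std::vector<Sample>, std::vector<Sample>>
trainTestSplit(std::vector<Sample> rows, double trainFraction = 0.8,
               unsigned seed = 42) {
    assert(trainFraction > 0.0 && trainFraction < 1.0);
    std::mt19937 rng(seed);
    std::shuffle(rows.begin(), rows.end(), rng);
    const std::size_t cut =
        static_cast<std::size_t>(std::llround(rows.size() * trainFraction));
    std::vector<Sample> train(rows.begin(), rows.begin() + cut);
    std::vector<Sample> test(rows.begin() + cut, rows.end());
    return std::make_pair(std::move(train), std::move(test));
}

// Shannon entropy (in bits) of the class labels in rows.
double entropy(const std::vector<Sample>& rows) {
    if (rows.empty()) return 0.0;
    std::map<int, std::size_t> counts;
    for (const Sample& s : rows) ++counts[s.label];
    double h = 0.0;
    for (const auto& kv : counts) {
        const double p = static_cast<double>(kv.second) / rows.size();
        h -= p * std::log2(p);
    }
    return h;
}

// Gain ratio of the binary split "features[feature] <= threshold":
// information gain divided by the split information.
double gainRatio(const std::vector<Sample>& rows, std::size_t feature,
                 double threshold) {
    std::vector<Sample> left, right;
    for (const Sample& s : rows)
        (s.features[feature] <= threshold ? left : right).push_back(s);
    if (left.empty() || right.empty()) return 0.0;  // split separates nothing
    const double n = static_cast<double>(rows.size());
    const double pl = left.size() / n;
    const double pr = right.size() / n;
    const double gain =
        entropy(rows) - pl * entropy(left) - pr * entropy(right);
    const double splitInfo = -pl * std::log2(pl) - pr * std::log2(pr);
    return gain / splitInfo;
}
```

A tree builder would call gainRatio for each feature and each candidate threshold at a node, pick the best-scoring pair, and recurse on the two resulting subsets. Only the training portion returned by trainTestSplit should be used for this; the test portion is held back for measuring accuracy.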
