Web mining is the application of data mining techniques to extract knowledge from Web data, which includes Web documents, hyperlinks between documents, usage logs of web sites, and so on. The current trend is to extract data from different sources and organize it for better use.
Web mining has been defined in two ways. The first was a ‘process-centric view’, which defined Web mining as a sequence of tasks. The second was a ‘data-centric view’, which defined Web mining in terms of the types of Web data used in the mining process. Web mining has drawn attention from researchers, the software industry, and Web-based organizations. This is a chance to capture their experience in a systematic manner and identify directions for future research.
Web data mining consists of the following four tasks:
- Resource finding: Retrieving the intended web documents. The data is drawn from online or offline text resources available on the web.
- Information selection and pre-processing: Automatically selecting and preprocessing specific information from the retrieved web resources. This step transforms the original retrieved data into information. The transformation may be the removal of stop words or stemming, or it may aim at a desired representation, such as finding phrases in the training corpus.
- Generalization: Automatically discovering general patterns at individual web sites as well as across multiple sites. Data mining and machine learning techniques are used for generalization.
- Analysis: Validating and interpreting the mined patterns. This step plays an important role in pattern mining, and humans play an important role in the knowledge discovery process on the web.
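The pre-processing step described above can be sketched in a few lines of Python. This is only a minimal illustration, not a production pipeline: the stop-word list is a tiny made-up sample, `naive_stem` is a crude suffix stripper standing in for a real stemmer, and the function and class names are hypothetical.

```python
import re
from html.parser import HTMLParser

# Illustrative stop-word list (assumption); real systems use a much fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on", "for", "it"}


class TextExtractor(HTMLParser):
    """Collects visible text from an HTML document, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(html):
    """Turn a retrieved web document into a list of stemmed content words."""
    extractor = TextExtractor()
    extractor.feed(html)
    text = " ".join(extractor.parts).lower()
    tokens = re.findall(r"[a-z]+", text)
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


if __name__ == "__main__":
    page = "<p>The crawler is indexing linked documents</p><script>var x = 1;</script>"
    print(preprocess(page))
```

The resulting token list is the kind of representation that the generalization step would then consume.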
Many businesses have adopted data mining to keep up with their competitors. Information obtained through data mining is widely used for business decision making. Some recent trends in data mining are:
- Multimedia Data Mining: This is one of the latest methods to catch on, because of the growing ability to capture useful data from different sources. These sources include audio, text, hypertext, video, and images; the data is transformed into a numerical representation in different formats.
- Ubiquitous Data Mining: This involves mining data from mobile devices to obtain information about individuals. It faces several challenges, such as complexity, cost, and privacy, yet the method has a great deal of opportunity to grow in these industries.
- Distributed Data Mining: This is gaining popularity because it involves mining huge amounts of information stored at different company locations. Highly sophisticated algorithms are used to extract this data and to provide proper insights and reports based on it.
- Spatial and Geographic Data Mining: This newer type of data mining extracts information from environmental, astronomical, and geographical data, including images taken from space. It can reveal aspects such as distance and topology, which are used in geographical information systems and in navigation.
- Time Series and Sequence Data Mining: This type of data mining studies cyclical and seasonal trends. It is also helpful for analyzing random events that occur outside the normal pattern. It is mainly used by retail companies to assess customers' buying patterns and behavior.
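As a rough illustration of the time series idea, the sketch below estimates a seasonal profile from a series and flags values that fall far from it. It is one simple approach among many, not a standard algorithm from any particular library; the function names, the `threshold` of 2.0, and the sample sales figures are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def seasonal_profile(values, period):
    """Average value at each position in the cycle (e.g. each weekday).

    Assumes the series covers at least one full cycle.
    """
    buckets = defaultdict(list)
    for i, v in enumerate(values):
        buckets[i % period].append(v)
    return [mean(buckets[p]) for p in range(period)]


def find_anomalies(values, period, threshold=2.0):
    """Indices whose deviation from the seasonal profile is unusually large.

    A point is flagged when its residual exceeds `threshold` times the
    standard deviation of all residuals.
    """
    profile = seasonal_profile(values, period)
    residuals = [v - profile[i % period] for i, v in enumerate(values)]
    spread = pstdev(residuals)
    if spread == 0:
        return []  # perfectly regular series, nothing to flag
    return [i for i, r in enumerate(residuals) if abs(r) > threshold * spread]


if __name__ == "__main__":
    # Hypothetical sales figures repeating every 3 steps, with one spike.
    sales = [10, 20, 30, 10, 20, 30, 10, 20, 30, 10, 80, 30]
    print(seasonal_profile(sales, period=3))
    print(find_anomalies(sales, period=3))
```

A retailer could apply the same idea with `period=7` on daily sales to separate the normal weekly cycle from one-off events.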