Parvatibai Chowgule College

Madgaon, India

Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

Many multi-branch organizations transact from different branches, and the transactions are stored locally. The number of multi-branch companies, as well as the number of branches of a multi-branch company, is increasing over time. Thus, it is important to study data mining on related data sources. A global exceptional pattern describes the interesting individuality of a few branches. Therefore, it is interesting to identify such patterns. The gist of the chapter is given as follows: (i) Type I and type II global exceptional frequent itemsets in multiple data sources are presented. (ii) The notion of exceptional sources for a type II global exceptional frequent itemset is discussed. (iii) Type I and type II global exceptional association rules in multiple data sources are also discussed. (iv) An algorithm for synthesizing type II global exceptional frequent itemsets is designed. Experimental results are presented on both artificial and real datasets. We also compare this algorithm with the existing algorithm theoretically and experimentally. The experimental results show that the proposed algorithm is effective. © Springer International Publishing Switzerland 2015.


Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

With the advancement of technologies, mass storage devices are now capable of storing more data. Also, they have become cheaper. Moreover, a variety of data collection channels are now available in the market. Data mining is an emerging field of study, and has been applied to various domains. Some new patterns, such as conditional pattern, arbitrary Boolean expression induced by itemset, type I global exceptional itemset and type II global exceptional itemset, are discussed in this book. Also, some association measures, viz., A1, A2, association rules induced by item and quantity, overall association between items, heavy association rule, exceptional association rule, simi1 and simi2, and influence of an item on another item, are reported in different chapters. © Springer International Publishing Switzerland 2015.


Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

Frequent items could be considered as a generic type of patterns in a database. In the context of multiple data sources, most of the global patterns are based on local frequency items. A multi-branch company transacting from different branches often needs to extract global patterns from data distributed over the branches. Global decisions could be made effectively using such patterns. Thus, it becomes important to cluster local frequency items in multiple databases. In this chapter, an overview of the existing measures of association is presented. For the purpose of selecting the suitable technique of mining multiple databases, a survey of the existing multidatabase mining techniques is presented. A study on the related clustering techniques is also covered here. We present the notion of high frequency itemsets (HFISs), and an algorithm for synthesizing the supports of such itemsets is designed. It has been shown that the existing clustering technique clusters a set of items at a low level, since it estimates association among items in an itemset with low accuracy, and a new algorithm for clustering local frequency items is designed. Due to the suitability of measure of association A2, on its basis association among items in a high frequency itemset is synthesized. The soundness of the clustering technique has been shown. Numerous experiments are conducted using five datasets, and the results concerning different aspects of the proposed problem are presented in the experimental section. The effectiveness of the proposed clustering technique is more visible in dense databases. © Springer International Publishing Switzerland 2015.


Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

Multi-database mining using local pattern analysis could be considered as an approximate method of mining multiple large databases. Assuming this point of view, it might be required to enhance the quality of knowledge synthesized from multiple databases. Also, many decision-making applications are directly based on the available local patterns present in different databases. The quality of synthesized knowledge/decision based on local patterns present in different databases could be enhanced by incorporating more local patterns in the knowledge synthesizing/processing activities. Thus, the available local patterns play a crucial role in building efficient multi-database mining applications. We represent patterns in a condensed form by employing a so-called ACP (antecedent-consequent pair) coding. It allows one to consider more local patterns by further lowering the user-defined characteristics of discovered patterns, like minimum support and minimum confidence. The ACP coding enables more local patterns to participate in the knowledge synthesizing/processing activities, and thus the quality of synthesized knowledge based on local patterns is enhanced significantly with regard to the synthesizing algorithm and required computing resources. To provide convenient access to association rules, we introduce an index structure. We demonstrate that ACP coding represents rulebases by making use of the least amount of storage space in comparison to any other rulebase representation technique. Furthermore, a technique for storing rulebases in the secondary storage is presented. © Springer International Publishing Switzerland 2015.


Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

A large class of problems is concerned with temporal data. Identifying temporal patterns in these datasets is a fully justifiable as well as an important task. Recently, researchers have reported an algorithm for finding calendar-based periodic patterns in time-stamped data and introduced the concept of certainty factor in association with an overlapped interval. In this chapter, we have extended the concept of certainty factor by incorporating support information for effective analysis of overlapping intervals. We have proposed a number of improvements to the algorithm for identifying calendar-based periodic patterns. In this direction, we have proposed a hash-based data structure for storing and managing patterns. Based on this modified algorithm, we identify full as well as partial periodic calendar-based patterns. We provide a detailed data analysis incorporating various parameters of the algorithm, make a comparative analysis with the existing algorithm, and show the effectiveness of our algorithm. Experimental results are provided on both real and synthetic databases. © Springer International Publishing Switzerland 2015.


Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

Frequent itemsets determine major characteristics of a transactional database. An arbitrary Boolean expression can be thought of as a generalized form of a query. It offers important knowledge to an organization. It is important to mine arbitrary Boolean expressions induced by frequent itemsets. In this chapter, we have introduced the concept of generator of an itemset, and showed that every Boolean function can be synthesized by its generator. The concept of conditional pattern has been introduced in Chap. 2. We discussed a simple and elegant framework for synthesizing the generator of an itemset and designed an algorithm for this purpose. Experimental results are provided on four different databases. © Springer International Publishing Switzerland 2015.


Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

In view of answering queries provided in multiple large databases, it might be required to mine relevant databases en bloc. In this chapter, we present an effective solution to clustering multiple large databases. Two measures of similarity between a pair of databases are presented, and their main properties are studied. In the sequel, we design an algorithm for clustering multiple databases based on an introduced similarity measure. Also, we present a coding, referred to as IS coding, to represent itemsets space efficiently. A coding of this nature enables more frequent itemsets to participate in the determination of the similarity between two databases. Thus the invoked clustering process becomes more accurate. We also show that the IS coding attains maximum efficiency in most of the cases of the mining processes. The clustering algorithm is improved (in terms of its time complexity) when contrasted with the existing clustering algorithms. The efficiency of the clustering process has been improved using several strategies, that is, by reducing the execution time of the clustering algorithm, using a more suitable similarity measure, and storing frequent itemsets space efficiently. © Springer International Publishing Switzerland 2015.
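
The abstract above does not define its two similarity measures. As a rough illustration only, the sketch below uses a Jaccard-style ratio over the frequent itemsets of two databases as a stand-in; the function name `fis_similarity` and the toy itemsets are hypothetical and are not taken from the chapter.

```python
def fis_similarity(fis_a, fis_b):
    """Jaccard-style similarity between two databases (an assumed stand-in).

    Each argument is the set of frequent itemsets mined from one database,
    with every itemset stored as a frozenset of items.
    """
    union = fis_a | fis_b
    if not union:
        # Neither database reported any frequent itemset.
        return 0.0
    return len(fis_a & fis_b) / len(union)


# Hypothetical frequent itemsets reported by two branch databases.
db1 = {frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})}
db2 = {frozenset({"a"}), frozenset({"b"}), frozenset({"c"})}

sim = fis_similarity(db1, db2)  # 2 shared itemsets out of 4 distinct ones
```

Under this assumed measure, lowering the minimum support lets more itemsets enter both sets, which is one way to read the abstract's remark that a compact coding makes the similarity estimate more accurate.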


Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

Many multi-branch companies transact from different branches. Each branch of the company maintains a separate database over time. The variation of sales of an item over time is an important issue, and therefore, we present the notion of stability of an item. Stable items are useful in making numerous strategic decisions of the company. We have discussed two measures of stability of an item. Based on the degree of stability of an item, an algorithm is designed for finding a partition among items in different data sources. Then the notion of the best cluster is introduced by considering the average degree of variation of a class, and an algorithm is designed to find clusters among items in different data sources. The best cluster is determined by the average degree of variation in a cluster. Experimental results are provided for three transactional databases. © Springer International Publishing Switzerland 2015.


Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

Most of the real market basket data are non-binary in the sense that an item could be purchased multiple times in the same transaction. In this case, there are two types of occurrences of an itemset in a database: the number of transactions in the database containing the itemset, and the number of occurrences of the itemset in the database. The traditional support-confidence framework might not be adequate for extracting association rules in such a database. In this chapter, we introduce three categories of association rules. We introduce a framework based on the traditional support-confidence framework for mining each category of association rules. We present experimental results based on two databases. © Springer International Publishing Switzerland 2015.
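
The two kinds of occurrence mentioned above can be illustrated with a small sketch. The toy transactions are hypothetical, and the rule that an itemset occurs in a transaction as many times as its least-purchased item is an assumption made here for illustration; the chapter's own definition may differ.

```python
# Hypothetical non-binary database: each transaction maps item -> quantity bought.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "milk": 3, "eggs": 1},
    {"milk": 2},
    {"bread": 4, "milk": 2},
]


def transaction_frequency(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(t.get(item, 0) > 0 for item in itemset))


def occurrence_count(itemset, db):
    """Total occurrences of the itemset in the database.

    Assumption: within one transaction the itemset occurs as many times
    as its least-purchased item.
    """
    return sum(min(t.get(item, 0) for item in itemset) for t in db)


pair = {"bread", "milk"}
tf = transaction_frequency(pair, transactions)  # 3 transactions contain both
oc = occurrence_count(pair, transactions)       # 1 + 1 + 0 + 2 = 4 occurrences
```

The two counts diverge as soon as quantities exceed one, which is why a support measure built only on transaction frequency can miss information in such data.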


Adhikari A.,Parvatibai Chowgule College | Adhikari J.,Narayan Zantye College
Intelligent Systems Reference Library | Year: 2015

The model of local pattern analysis provides sound solutions to many multi-database mining problems. In this chapter, we discuss different types of extreme association rules in multiple databases, viz., heavy association rule, high-frequency association rule, low-frequency association rule, and exceptional association rule. Also, we show how one can apply the model of local pattern analysis systematically and effectively. For this purpose, an extended model of local pattern analysis is presented. The extended model has been applied to mine heavy association rules in multiple databases. Also, we justify why the extended model works more effectively. An algorithm for synthesizing heavy association rules in multiple databases is given. Furthermore, we show that the algorithm identifies whether a heavy association rule is a high-frequency rule or an exceptional rule. Experimental results are provided for both synthetic and real-world datasets, and a detailed error analysis is carried out. Furthermore, we present a comparative analysis by contrasting the proposed algorithm with some of those reported in the literature. This analysis is completed by taking into consideration the criteria of execution time and average error. © Springer International Publishing Switzerland 2015.
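
As a minimal sketch of what synthesizing a global pattern from local patterns can look like, the code below combines local supports into a size-weighted global estimate. This is a generic illustration, not the chapter's algorithm: the function name `synthesize_support`, the treatment of unreported supports as zero, and the sample figures are all assumptions.

```python
def synthesize_support(local_supports, db_sizes):
    """Estimate the global support of a pattern from local reports.

    local_supports[i] is the support reported by database i, or None when
    the pattern fell below that database's minimum support and was not
    reported. Assumption: an unreported support is treated as 0, so the
    result is a lower-bound style estimate rather than an exact value.
    """
    total = sum(db_sizes)
    if total == 0:
        raise ValueError("databases contain no transactions")
    weighted = sum(
        (supp if supp is not None else 0.0) * size
        for supp, size in zip(local_supports, db_sizes)
    )
    return weighted / total


# Hypothetical branch databases: sizes and locally reported supports.
sizes = [1000, 2000, 1000]
supports = [0.4, None, 0.2]

estimate = synthesize_support(supports, sizes)  # (400 + 0 + 200) / 4000
```

The gap between such an estimate and the true global support is the kind of quantity an error analysis like the one described above would measure.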
