
[238] by Brin et al. Because of the limitation of confidence, Brin et al. [238] proposed the idea of using interest factor as a measure of interestingness. The all-confidence measure was proposed by Omiecinski [289]. Xiong et al. [330] introduced the cross-support property and showed that the all-confidence measure can be used to eliminate cross-support patterns. A key difficulty in using alternative objective measures besides support is their lack of a monotonicity property, which makes it difficult to incorporate the measures directly into the mining algorithms. Xiong et al. [328] have proposed an efficient method for mining correlations by introducing an upper bound function to the φ-coefficient. Although the measure is non-monotone, it has an upper bound expression that can be exploited for the efficient mining of strongly correlated item pairs.
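The measures named above can be illustrated with a minimal sketch that works directly on support values. The function names are hypothetical and chosen only for illustration. The upper bound shown for the φ-coefficient follows from the observation that s(A,B) can never exceed min(s(A), s(B)); it is given here as an assumption-based illustration and not as the exact procedure of [328].

```python
from math import sqrt


def interest_factor(s_ab, s_a, s_b):
    """Interest factor: s(A,B) / (s(A) * s(B)). Equals 1 under independence."""
    return s_ab / (s_a * s_b)


def all_confidence(s_itemset, item_supports):
    """All-confidence: itemset support divided by the largest single-item support.

    This equals the minimum confidence over rules with a single item
    of the itemset as antecedent.
    """
    return s_itemset / max(item_supports)


def phi(s_ab, s_a, s_b):
    """phi-coefficient of two binary items, expressed with supports."""
    return (s_ab - s_a * s_b) / sqrt(s_a * s_b * (1 - s_a) * (1 - s_b))


def phi_upper_bound(s_a, s_b):
    """Largest phi attainable for the given item supports.

    Obtained by substituting s(A,B) = min(s(A), s(B)) into phi, so it
    depends only on single-item supports (illustrative derivation).
    """
    hi, lo = max(s_a, s_b), min(s_a, s_b)
    return sqrt(lo / hi) * sqrt((1 - hi) / (1 - lo))


def may_be_strongly_correlated(s_a, s_b, threshold):
    """Pruning test: a pair whose bound is below the threshold can be skipped
    without ever counting s(A,B)."""
    return phi_upper_bound(s_a, s_b) >= threshold
```

For example, with s(A) = 0.5, s(B) = 0.2, and s(A,B) = 0.15, the interest factor is about 1.5, the all-confidence is about 0.3, and φ is about 0.25, while the bound computed from the item supports alone is about 0.5. A pair whose bound falls below the correlation threshold can therefore be discarded before its joint support is counted.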

Fabris and Freitas [249] have proposed a method for discovering interesting associations by detecting the occurrences of Simpson's paradox [309]. Megiddo and Srikant [282] described an approach for validating the extracted


 

 

396 Chapter 6 Association Analysis

patterns using hypothesis testing methods. A resampling-based technique was also developed to avoid generating spurious patterns because of the multiple comparison problem. Bolton et al. [237] have applied the Benjamini-Hochberg [236] and Bonferroni correction methods to adjust the p-values of discovered patterns in market basket data. Alternative methods for handling the multiple comparison problem were suggested by Webb [326] and Zhang et al. [338].
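The two correction methods mentioned above can be sketched as follows. This is a minimal illustration that assumes each discovered pattern already has a p-value from some statistical test; the function names and the sample p-values are hypothetical and are not taken from [236] or [237].

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests,
    capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjustment (controls the false discovery rate).

    The p-value with rank k (1 = smallest) is scaled by m / k, and the
    adjusted values are made monotone by taking a running minimum from
    the largest p-value down to the smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(adjusted_p_values, alpha=0.05):
    """Indices of patterns whose adjusted p-value does not exceed alpha."""
    return [i for i, p in enumerate(adjusted_p_values) if p <= alpha]


# Hypothetical p-values for four discovered patterns.
pattern_p_values = [0.01, 0.04, 0.03, 0.005]
kept_bonferroni = significant(bonferroni(pattern_p_values))
kept_bh = significant(benjamini_hochberg(pattern_p_values))
```

Bonferroni controls the probability of reporting even one spurious pattern and is therefore the more conservative of the two; Benjamini-Hochberg controls the expected fraction of spurious patterns among those reported and typically retains more patterns at the same level.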

Application of subjective measures to association analysis has been investigated by many authors. Silberschatz and Tuzhilin [307] presented two principles in which a rule can be considered interesting from a subjective point of view. The concept of unexpected condition rules was introduced by Liu et al. in [277]. Cooley et al. [243] analyzed the idea of combining soft belief sets using the Dempster-Shafer theory and applied this approach to identify contradictory and novel association patterns in Web data. Alternative approaches include using Bayesian networks [269] and neighborhood-based information [245] to identify subjectively interesting patterns.

Visualization also helps the user to quickly grasp the underlying structure of the discovered patterns. Many commercial data mining tools display the complete set of rules (which satisfy both support and confidence threshold criteria) as a two-dimensional plot, with each axis corresponding to the antecedent or consequent itemsets of the rule. Hofmann et al. [263] proposed using Mosaic plots and Double Decker plots to visualize association rules. This approach can visualize not only a particular rule, but also the overall contingency table between itemsets in the antecedent and consequent parts of the rule. Nevertheless, this technique assumes that the rule consequent consists of only a single attribute.

Application Issues

Association analysis has been applied to a variety of application domains such as Web mining [296, 317], document analysis [264], telecommunication alarm diagnosis [271], network intrusion detection [232, 244, 275], and bioinformatics [302, 327]. Applications of association and correlation pattern analysis to Earth Science studies have been investigated in [298, 299, 319].

Association patterns have also been applied to other learning problems such as classification [276, 278], regression [291], and clustering [257, 329, 332]. A comparison between classification and association rule mining was made by Freitas in his position paper [251]. The use of association patterns for clustering has been studied by many authors including Han et al. [257], Kosters et al. [272], Yang et al. [332], and Xiong et al. [329].

 

 


Bibliography

[223] R. C. Agarwal, C. C. Aggarwal, and V. V. V. Prasad. A Tree Projection Algorithm for Generation of Frequent Itemsets. Journal of Parallel and Distributed Computing (Special Issue on High Performance Data Mining), 61(3):350-371, 2001.

[224] R. C. Agarwal and J. C. Shafer. Parallel Mining of Association Rules. IEEE Transactions on Knowledge and Data Engineering, 8(6):962-969, March 1998.

[225] C. C. Aggarwal, Z. Sun, and P. S. Yu. Online Generation of Profile Association Rules. In Proc. of the 4th Intl. Conf. on Knowledge Discovery and Data Mining, pages 129-133, New York, NY, August 1996.

[226] C. C. Aggarwal and P. S. Yu. Mining Large Itemsets for Association Rules. Data Engineering Bulletin, 21(1):23-31, March 1998.

[227] C. C. Aggarwal and P. S. Yu. Mining Associations with the Collective Strength Approach. IEEE Trans. on Knowledge and Data Engineering, 13(6):863-873, January/February 2001.

[228] R. Agrawal, T. Imielinski, and A. Swami. Database mining: A performance perspective. IEEE Transactions on Knowledge and Data Engineering, 5:914-925, 1993.

[229] R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. In Proc. ACM SIGMOD Intl. Conf. Management of Data, pages 207-216, Washington, DC, 1993.

[230] R. Agrawal and R. Srikant. Mining Sequential Patterns. In Proc. of Intl. Conf. on Data Engineering, pages 3-14, Taipei, Taiwan, 1995.

[231] K. Ali, S. Manganaris, and R. Srikant. Partial Classification using Association Rules. In Proc. of the 3rd Intl. Conf. on Knowledge Discovery and Data Mining, pages 115-118, Newport Beach, CA, August 1997.

[232] D. Barbará, J. Couto, S. Jajodia, and N. Wu. ADAM: A Testbed for Exploring the Use of Data Mining in Intrusion Detection. SIGMOD Record, 30(4):15-24, 2001.

[233] S. D. Bay and M. Pazzani. Detecting Group Differences: Mining Contrast Sets. Data Mining and Knowledge Discovery, 5(3):213-246, 2001.

[234] R. Bayardo. Efficiently Mining Long Patterns from Databases. In Proc. of 1998 ACM-SIGMOD Intl. Conf. on Management of Data, pages 85-93, Seattle, WA, June 1998.

[235] R. Bayardo and R. Agrawal. Mining the Most Interesting Rules. In Proc. of the 5th Intl. Conf. on Knowledge Discovery and Data Mining, pages 145-153, San Diego, CA, August 1999.

[236] Y. Benjamini and Y. Hochberg. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal Royal Statistical Society B, 57(1):289-300, 1995.

[237] R. J. Bolton, D. J. Hand, and N. M. Adams. Determining Hit Rate in Pattern Search. In Proc. of the ESF Exploratory Workshop on Pattern Detection and Discovery in Data Mining, pages 36-48, London, UK, September 2002.

[238] S. Brin, R. Motwani, and C. Silverstein. Beyond market baskets: Generalizing association rules to correlations. In Proc. ACM SIGMOD Intl. Conf. Management of Data, pages 265-276, Tucson, AZ, 1997.

[239] S. Brin, R. Motwani, J. Ullman, and S. Tsur. Dynamic Itemset Counting and Implication Rules for market basket data. In Proc. of 1997 ACM-SIGMOD Intl. Conf. on Management of Data, pages 255-264, Tucson, AZ, June 1997.

[240] C. H. Cai, A. Fu, C. H. Cheng, and W. W. Kwong. Mining Association Rules with Weighted Items. In Proc. of IEEE Intl. Database Engineering and Applications Symp., pages 68-77, Cardiff, Wales, 1998.

 

 


[241] Q. Chen, U. Dayal, and M. Hsu. A Distributed OLAP infrastructure for E-Commerce. In Proc. of the 4th IFCIS Intl. Conf. on Cooperative Information Systems, pages 209-220, Edinburgh, Scotland, 1999.

[242] D. C. Cheung, S. D. Lee, and B. Kao. A General Incremental Technique for Maintaining Discovered Association Rules. In Proc. of the 5th Intl. Conf. on Database Systems for Advanced Applications, pages 185-194, Melbourne, Australia, 1997.

[243] R. Cooley, P. N. Tan, and J. Srivastava. Discovery of Interesting Usage Patterns from Web Data. In M. Spiliopoulou and B. Masand, editors, Advances in Web Usage Analysis and User Profiling, volume 1836, pages 163-182. Lecture Notes in Computer Science, 2000.

[244] P. Dokas, L. Ertoz, V. Kumar, A. Lazarevic, J. Srivastava, and P. N. Tan. Data Mining for Network Intrusion Detection. In Proc. NSF Workshop on Next Generation Data Mining, Baltimore, MD, 2002.

[245] G. Dong and J. Li. Interestingness of discovered association rules in terms of neighborhood-based unexpectedness. In Proc. of the 2nd Pacific-Asia Conf. on Knowledge Discovery and Data Mining, pages 72-86, Melbourne, Australia, April 1998.

[246] G. Dong and J. Li. Efficient Mining of Emerging Patterns: Discovering Trends and Differences. In Proc. of the 5th Intl. Conf. on Knowledge Discovery and Data Mining, pages 43-52, San Diego, CA, August 1999.

[247] W. DuMouchel and D. Pregibon. Empirical Bayes Screening for Multi-Item Associations. In Proc. of the 7th Intl. Conf. on Knowledge Discovery and Data Mining, pages 67-76, San Francisco, CA, August 2001.

[248] B. Dunkel and N. Soparkar. Data Organization and Access for Efficient Data Mining. In Proc. of the 15th Intl. Conf. on Data Engineering, pages 522-529, Sydney, Australia, March 1999.

[249] C. C. Fabris and A. A. Freitas. Discovering surprising patterns by detecting occurrences of Simpson's paradox. In Proc. of the 19th SGES Intl. Conf. on Knowledge-Based Systems and Applied Artificial Intelligence, pages 148-160, Cambridge, UK, December 1999.

[250] L. Feng, H. J. Lu, J. X. Yu, and J. Han. Mining inter-transaction associations with templates. In Proc. of the 8th Intl. Conf. on Information and Knowledge Management, pages 225-233, Kansas City, Missouri, Nov 1999.

[251] A. A. Freitas. Understanding the crucial differences between classification and discovery of association rules - a position paper. SIGKDD Explorations, 2(1):65-69, 2000.

[252] J. H. Friedman and N. I. Fisher. Bump hunting in high-dimensional data. Statistics and Computing, 9(2):123-143, April 1999.
