LIANG Jian-fang, LIANG Jian-ming, WANG Jian-ping
1 Clothing and Art Design College, Xi’an Polytechnic University, Xi’an 710048, China
2 School of Computing and Informatics, Arizona State University, Scottsdale AZ 85259, USA
With Internet shopping gaining attention and momentum, electronic commerce in China has developed at an unprecedented pace. According to statistics released by IResearch Consulting Group, the transaction scale of Chinese online shopping reached 773.56 billion Yuan RMB in 2011, accounting for 4.3% of total retail sales of social consumer goods, and the number of online shoppers rose to 187 million. Clothing, shoes and hats, and bags and suitcases together held 26.5% of the market, ranking first among all product categories. By transaction type, B/C transactions reached 179.11 billion Yuan RMB, accounting for 23.2% of China’s overall online shopping market. Unfortunately, despite many technological advances in online product presentation, the ability to expose consumers’ senses to the various aspects of a product is still limited. As previous academic research has demonstrated, consumers perceive higher risk in shopping for apparel online than in stores[1], which has resulted in a large number of returns in Chinese online shopping over the past two years. A better understanding of the online apparel consumer is therefore now at the forefront of most retail strategies.
With respect to online consumption psychology and behavior, although experts have carried out intensive research and discussion from different perspectives[2-6], only a small body of literature relates to apparel[7-18]. Research on online apparel shopping mainly focuses on perceived risk[9-10], try-on technology[11-13], satisfaction evaluation[14-15], and consumer behavior and stimulation measures[16-18], with the aim of reducing product risk, increasing enjoyment in online shopping, and promoting sales. Compared with other countries, China has produced few such studies, and the research gap is larger. Overall, research in China has lacked breadth and depth, and its methods have been narrow and conservative.
Faced with large volumes of customer back-end data stored in real time, clothing e-commerce companies find traditional research methods harder to apply than data mining tools. However, current work on data mining concentrates mainly on computer technology applications[19-20], and research that applies data mining to online clothing consumer behavior is particularly lacking.
In view of this, a typical B/C clothing enterprise in China is chosen as the research object, and a target experimental database is built from the Web server logs of the sample enterprise. The purpose is to uncover the implicit rules and consumption tendencies in online apparel shopping by using the clustering algorithm and the Apriori algorithm of Clementine Data Mining Software, and to provide references for enterprises.
Data mining, which provides a basis for studying online apparel consumption behavior, refers to the process of discovering hidden, valuable knowledge and rules by analyzing large volumes of unordered data stored in a database. It helps enterprises find useful information in large, incomplete, vague, and cluttered data, and improves both the underlying value and the practical value of the data. In line with the aim of this study, the research scheme is designed in the following stages.
The first stage selects a typical domestic B/C clothing e-commerce enterprise and obtains the target experimental database by collecting and preprocessing data from its Web server logs. The second stage uses a clustering algorithm to analyze online apparel consumer segmentation, feature extraction, and buying preferences. The third stage uses an association rule algorithm[21-23] to uncover the hidden relationships between product categories that consumers often buy together. The fourth stage, guided by the actual application, filters out uninteresting and unassociated data and models, identifies the interesting models, and provides suggestions to the sample enterprise to guide actual apparel e-commerce activities. The resulting data mining process is shown in Fig.1.
Before online clothing consumer behavior can be analyzed, a large amount of integrated historical and dynamic data must be obtained. This study was supported by an influential B/C clothing e-commerce company in Guangdong Province, China. The historical customer data for 2011 were collected from the Web server logs. Then, through data preprocessing and variable selection, the experimental database was established, containing about 2000 data records.
Previous research[24-26] indicated that online apparel consumer behavior is influenced by economic, social, psychological, and personal factors. However, when the objective factors are held constant, consumer behavior is mainly influenced, from the consumer’s perspective, by psychological and personal factors. These subjective factors can be described along three dimensions: demographic, psychological, and purchase behavior. The demographic dimension describes the external characteristics of the individual and family, such as gender and age. The psychological dimension reflects consumers’ attitudes, way of life, and personality, such as travel and living habits. The purchase behavior dimension describes consumers’ buying habits and the clothing items they purchase, such as purchase frequency and clothing category.
In this study, the multidimensional cube data model is used to build the model of online clothing consumer behavior. Generally, a multi-cube can be defined as a four-tuple $C=(D,M,A,f)$. Here $D$ represents the dimension set, $D=(d_1,d_2,\ldots,d_n)$, where $d_i$ is the name of a dimension, from $\mathrm{dom}_{\mathrm{dim}}(i)$; $M$ represents the measure set, $M=(m_1,m_2,\ldots,m_k)$, where $m_i$ is the name of a measure, from $\mathrm{dom}_{\mathrm{measure}}(i)$; $A$ represents the attribute set, $A=(a_1,a_2,\ldots,a_t)$, where $a_i$ is the name of an attribute, from $\mathrm{dom}_{\mathrm{attr}}(i)$; and $f:D\to A$ is a one-to-many mapping from the dimension set to the attribute set. The constraints are as follows: (1) $D\cap M=\varnothing$, i.e., the dimension set and the measure set do not intersect; (2) for any $i,j$ ($i\neq j$), $f(d_i)\cap f(d_j)=\varnothing$, i.e., the attribute sets of any two different dimensions have nothing in common.
According to the discussion above, the multi-cube $C_b$ of the index system for online apparel consumer behavior can be represented as follows:
$C_b=(D_1,M_1,A_1,f_1)$
Here, $D_1$ = (Demographic, Psychological, Behavior), which means that online apparel consumer behavior can be explained along the demographic, psychological, and purchase behavior dimensions. $M_1$ = (Actual, Tendency), which means that online apparel consumer behavior is reflected in the actual consumption situation and the consumption tendency. It can be seen that $D_1\cap M_1=\varnothing$. $A_1$ = (Sex, Age, Edlevel, Job, Race, Income, Homeval, Marital, Numkids, Aprtmnt, Mobile, Travtime, Purchase, Amount, Frequency, Recency, Accessory, Bag, Jeans, Knitwear, Hat, Scarf, Coatee, Coat, T-shirt, Waistband, Dress, Trenchcoat, Vest, Skirt, Shorts, Sweater, Shoes, Casualshirt, Formalshirt, Trousers, Outdoor). The meanings and values of the 37 variables in $A_1$ are shown in Table 1. $f_1$(Demographic) = {Sex, Age, Edlevel, Job, Race, Income, Homeval}. $f_1$(Psychological) = {Marital, Numkids, Aprtmnt, Mobile, Travtime}. $f_1$(Behavior) = {Purchase, Amount, Frequency, Recency, Accessory, Bag, Jeans, Knitwear, Hat, Scarf, Coatee, Coat, T-shirt, Waistband, Dress, Trenchcoat, Vest, Skirt, Shorts, Sweater, Shoes, Casualshirt, Formalshirt, Trousers, Outdoor}. It can be seen that $f_1$(Demographic), $f_1$(Psychological), and $f_1$(Behavior) have no variables in common, i.e., for any $i,j$ ($i\neq j$), $f_1(d_i)\cap f_1(d_j)=\varnothing$.
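The two constraints on the behavior multi-cube can be checked mechanically. The following is a minimal Python sketch, assuming the attribute names listed above; the helper name `check_multicube` and the dictionary layout are illustrative choices and are not part of the original study.

```python
# Illustrative check of the two multi-cube constraints for the behavior cube.
# Attribute names are taken from the text; check_multicube is a hypothetical helper.
from itertools import combinations

D1 = ("Demographic", "Psychological", "Behavior")
M1 = ("Actual", "Tendency")
f1 = {
    "Demographic": {"Sex", "Age", "Edlevel", "Job", "Race", "Income", "Homeval"},
    "Psychological": {"Marital", "Numkids", "Aprtmnt", "Mobile", "Travtime"},
    "Behavior": {
        "Purchase", "Amount", "Frequency", "Recency", "Accessory", "Bag",
        "Jeans", "Knitwear", "Hat", "Scarf", "Coatee", "Coat", "T-shirt",
        "Waistband", "Dress", "Trenchcoat", "Vest", "Skirt", "Shorts",
        "Sweater", "Shoes", "Casualshirt", "Formalshirt", "Trousers", "Outdoor",
    },
}


def check_multicube(dims, measures, mapping):
    """Return the attribute set A if both constraints hold, else raise ValueError."""
    # Constraint (1): the dimension set and the measure set must not intersect.
    if set(dims) & set(measures):
        raise ValueError("dimension and measure sets overlap")
    # The mapping f must assign an attribute set to every dimension.
    if set(dims) != set(mapping):
        raise ValueError("mapping must cover exactly the dimensions")
    # Constraint (2): attribute sets of different dimensions are pairwise disjoint.
    for a, b in combinations(dims, 2):
        shared = mapping[a] & mapping[b]
        if shared:
            raise ValueError(f"{a} and {b} share attributes: {sorted(shared)}")
    return set().union(*mapping.values())


A1 = check_multicube(D1, M1, f1)
print(len(A1))  # 37
```

Representing each attribute set as a Python `set` makes both constraints a direct intersection test, and the union of the three sets recovers the 37 variables.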
Table 1 37 variables in attribute sets A1
The results of the descriptive statistical analysis of the 37 variables in the attribute set are shown in Table 2. The means of the consumers’ personal characteristic variables are basically consistent with a normal distribution. Generally speaking, most consumers of the sample B/C apparel enterprise are female, around thirty years old, and hold a college degree or above. On average, a consumer spends about 522.47 Yuan RMB per year, with an annual purchase frequency of 2.27. More consumers choose casual wear than formal wear. Purchases of T-shirts and casual shirts are relatively large, and tie-in products such as hats also sell very well. This indicates that consumers of the sample enterprise tend toward a leisure lifestyle and pay more attention to the overall effect of clothes and accessories. In other words, joint purchasing is common in online apparel shopping.
Table 2 Result of descriptive statistical analysis of variables in attribute sets
Clustering analysis is one of the important algorithms of data mining. It divides data into classes or clusters based on the attribute information of the data objects or the relationships between them, so that objects in the same cluster are highly similar while objects in different clusters differ clearly. For the experimental database in this study, in view of the research aims, the K-means clustering algorithm in Clementine Data Mining Software is selected.
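To make the clustering step concrete, the following is a minimal pure-Python sketch of K-means (Lloyd's algorithm). It is not the Clementine implementation, whose settings are not given in the text. The six customer records (age, annual amount, purchase frequency), the use of z-score standardization, and k = 2 are all assumptions made for illustration; the study itself reports eight clusters.

```python
# Minimal K-means (Lloyd's algorithm) sketch in pure Python.
# The six customer records below are invented for illustration only.
import random


def standardize(rows):
    """Z-score each column so that no single variable dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
           for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(r, means, sds)) for r in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Return (labels, centroids) after alternating assignment and update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k distinct records
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids


# Hypothetical records: (age, annual purchase amount in Yuan, purchase frequency)
customers = [
    (24, 180.0, 1), (26, 220.0, 1), (25, 200.0, 2),
    (41, 900.0, 5), (43, 950.0, 4), (39, 1000.0, 5),
]
labels, centroids = kmeans(standardize(customers), k=2)
print(labels)
```

On this toy data the first three and the last three customers end up in separate clusters; the cluster numbering itself depends on the random initialization.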
Table 3 Mean values of consumer characteristics in each cluster
The data statistics show that the total purchase amount of T-shirts, jeans, dresses, skirts, casual shirts, formal shirts, sweaters, coatees, and tie-in products is very large, accounting for 84.32% of the total. These 9 kinds of clothing are therefore taken as examples to analyze and summarize the different purchase preferences of consumers in each cluster (Fig.2).
In summary, the characteristics and clothing buying preferences of the eight clusters are shown in Table 4.
Fig.2 Purchases of each apparel category in each cluster as a percentage of the total amount
Table 4 Characteristics and clothing buying preferences of the eight clusters
Association rules describe the correlation between different data records, and association rule mining is one of the relatively mature techniques in data mining. Correlation analysis refers to the process of mining association rules; its purpose is to find the relationship between two or more variables in the same transaction. When a clothing e-commerce enterprise handles a large number of transactions, the clothing products that consumers often buy together are recorded in the back-end database. In this case, correlation analysis helps to find associations between the different products that consumers purchase, as well as the relationships across the levels of the product hierarchy.
The 21 kinds of apparel shown above are divided into three categories: casual wear (knitted (wool) shirt, jeans, outdoor clothing, sweater, vest, shorts, casual shirt, T-shirt, casual dress, and casual short coat); formal wear (formal shirt, trousers, short skirt, trenchcoat, and coatee); and tie-in products (accessory, bag, waistband, shoes, hat, and scarf). Web analysis in the Clementine software is then used to map the correlation network among the layered products. For convenience, the strongest correlations are marked with thick lines in Fig.3. Meanwhile, using the Apriori algorithm with a minimum confidence of 60%, association rules are mined for three groupings: casual wear and tie-in products; formal wear and tie-in products; and casual wear, formal wear, and tie-in products together.
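The Apriori procedure described above can be sketched in a few lines of Python. This is a generic textbook version, not the Clementine implementation: the five baskets are invented, the function names `apriori` and `association_rules` are hypothetical, and the minimum support of 40% is an assumption (the text states only the 60% minimum confidence).

```python
# Textbook Apriori sketch: frequent itemsets first, then rules above a
# confidence cutoff. Baskets and the 0.4 support cutoff are illustrative.
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    current = [frozenset([i]) for i in sorted({i for b in baskets for i in b})]
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(1 for b in baskets if c <= b) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                ante = frozenset(ante)
                cons = itemset - ante
                confidence = support / frequent[ante]
                lift = confidence / frequent[cons]
                if confidence >= min_confidence:
                    found.append((ante, cons, support, confidence, lift))
    return found


# Hypothetical purchase records, one set of categories per order.
baskets = [
    {"T-shirt", "Jeans", "Bag"},
    {"T-shirt", "Jeans"},
    {"T-shirt", "Scarf"},
    {"Sweater", "Scarf", "Bag"},
    {"T-shirt", "Jeans", "Scarf"},
]
frequent = apriori(baskets, min_support=0.4)
mined = association_rules(frequent, min_confidence=0.6)
for ante, cons, s, c, l in mined:
    print(sorted(ante), "=>", sorted(cons), f"S={s:.2f} C={c:.2f} L={l:.2f}")
```

On these baskets three rules pass the 60% confidence cutoff (T-shirt => Jeans, Jeans => T-shirt, and Scarf => T-shirt), while T-shirt => Scarf is dropped because its confidence is only 50%.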
(1) Association rules analysis between casual wear and tie-in products
The correlation network diagram between casual wear and tie-in products containing 16 kinds of clothing is shown in Fig.3.Accordingly,the following 22 association rules are developed by means of Apriori analysis.
Fig.3 Correlation network diagram between casual wear and tie-in products
Rule 1: Consumers who buy T-shirt are likely to buy scarf at the same time,i.e.T-shirts => Scarf [S=38.46 %,C=62.59 %,L=1.21].
Here, “S” represents support, “C” stands for confidence, and “L” refers to lift. Rule 1 means that consumers who buy T-shirts and scarves together account for 38.46% of the total. If scarves are recommended to consumers who buy T-shirts, the success rate is 62.59%, which is 1.21 times the baseline rate at which consumers buy scarves.
Rule 2: T-shirts => Shorts [S= 35.90%,C=60.61%,L=1.15].
Rule 3: T-shirts => Bags [S= 44.05%,C=71.03%,L=1.18].
Rule 4: Sweater => T-shirt [S=13.32%,C=62.59%,L=0.43].
Rule 5: Sweater=> Scarves [S=15.49%,C=89.85%,L=1.62].
Rule 6: Sweater =>Bags [S=20.51%,C=89.57%,L=1.29].
Rule 7: Sweater =>Shorts [S=43.32%,C=81.92%,L=1.10].
Rule 8: Outdoor clothing => Sweater [S=47.33%,C=88.66%,L= 1.89].
Rule 9: Outdoor clothing => Jeans [S=18.46%,C=87.31%,L=1.08].
Rule 10: Outdoor clothing => Scarves [S=43.32%,C=92.77%,L=1.27].
Rule 11: Jeans => T-shirt [S=47.42%,C=88.91%,L=1.02].
Rule 12: Jeans => Bags [S=15.90%,C=61.59%,L=1.14].
Rule 13: Jeans => Scarves [S=11.54%,C=85.81%,L=1.06].
Rule 14: Outdoor clothing => Vest [S=21.54%,C=85.81%,L=1.06].
Rule 15: Casual shirt => Vest [S=41.54%,C=85.51%,L=1.09].
Rule 16: Recreational skirt => Bags [S=57.64%,C=85.21%,L=1.36].
Rule 17: Outdoor clothing => Bags [S=31.44%,C=83.31%,L=1.34].
Rule 18: Shoes => Bags [S=14.54%,C=84.34%,L=1.46].
Rule 19: Shorts => Bags [S=13.54%,C=83.21%,L=1.07].
Rule 20: Bags => Scarves [S=17.54%,C=86.61%,L=1.75].
Rule 21: Shorts => Scarves [S=24.55%,C=75.64%,L=1.76].
Rule 22: Outdoor clothing =>T-shirt [S=33.54%,C=84.81%,L=1.36].
Among the rules above, Rules 4, 5, 12, 13, 18, 19, and 20 have relatively low support but high confidence. This suggests that these association rules hold, but the purchases they describe do not occur very often. It is worth noting that jeans and T-shirts are the top sellers: they are easily matched with shoes, bags, scarves, and other tie-in products, or with sweatshirts and outdoor clothing, so consumers often buy them together.
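For reference, the three measures reported with each rule can be written in terms of purchase probabilities. The block below uses the standard textbook definitions and reads S as the joint support, as the explanation of Rule 1 does; whether Clementine reports S in exactly this way is an assumption here. The marginal shares derived for Rule 1 are implied by the reported figures and are not stated in the source.

```latex
\begin{align*}
S(A \Rightarrow B) &= P(A \cap B), &
C(A \Rightarrow B) &= \frac{P(A \cap B)}{P(A)}, &
L(A \Rightarrow B) &= \frac{C(A \Rightarrow B)}{P(B)}.
\end{align*}
% Rule 1 (T-shirt => Scarf): S = 38.46%, C = 62.59%, L = 1.21
\begin{align*}
P(\text{T-shirt}) &= \frac{S}{C} = \frac{0.3846}{0.6259} \approx 0.614, \\
P(\text{Scarf})   &= \frac{C}{L} = \frac{0.6259}{1.21} \approx 0.517.
\end{align*}
```

Under these definitions, a lift above 1 means that buying the antecedent raises the chance of buying the consequent relative to its baseline share, and a lift below 1 means the opposite.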
(2) Association rules analysis of formal wear and tie-in products
In the same way, 14 association rules are discovered among the 11 kinds of products in formal wear and tie-in products, as follows.
Rule 1: Trenchcoat => Formal shirt [S=40.37%,C=84.19%,L=1.09].
Rule 2: Trenchcoat => Coatee [S=13.32%,C=92.23%,L=1.12].
Rule 3: Trenchcoat => Bags [S=36.02%,C=89.11%,L=1.02].
Rule 4: Trenchcoat => Scarf [S=45.20%,C=90.74%,L=1.76].
Rule 5: Formal shirt => Scarves [S=39.87%,C=87.11%,L=1.38].
Rule 6: Formal shirt => Bags [S=33.33%,C=90.33%,L=1.75].
Rule 7: Formal shirt => Shoes [S= 2.26%,C=81.13%,L=1.81].
Rule 8: Bags => Shoes [S=14.12%,C=93.27%,L=1.92].
Rule 9: Bags => Scarves [S=41.34%,C=78.21%,L=1.22].
Rule 10: Coatee => Bags [S=40.74%,C=68.34%,L=1.78].
Rule 11: Coatee => Shoes [S=39.44%,C=88.34%,L=1.98].
Rule 12: Coatee => Scarves [S=39.44%,C=88.34%,L=1.93].
Rule 13: Coatee => Formal shirt [S=43.45%,C=89.17%,L=1.23].
Rule 14: Skirt => Hats [S=11.74%,C=88.34%,L=1.98].
Of the rules above, Rules 2, 7, 8, and 14 hold but do not occur very often. Consumers who buy a trenchcoat or coatee usually buy tie-in products with it to achieve a well-matched look, which results in good sales.
(3) Association rules analysis of formal wear,casual wear,and tie-in products
Similarly, excluding the association rules already found above, 7 additional important association rules are discovered among the 21 kinds of products in formal wear, casual wear, and tie-in products, as follows.
Rule 1: Trenchcoat =>T-shirt [S=36.32%,C=82.98%,L=1.08].
Rule 2: Trenchcoat => Jeans [S=38.32%,C=85.21%,L=1.23].
Rule 3: Formal shirt => Dress [S=41.32%,C=89.62%,L=1.39].
Rule 4: Formal shirt => Casual dress [S=12.12%,C=87.11%,L=1.41].
Rule 5: Coatee => T-shirt [S=45.92%,C=84.23%,L=1.03].
Rule 6: Coatee => Jeans [S=39.32%,C=81.41%,L=1.13].
Rule 7: Coatee => Casual skirt [S=34.32%,C=88.14%,L=1.09].
Rule 4 holds but does not occur very often. Clearly, T-shirts and jeans are essential clothing for consumers and are often bought together with a trenchcoat or coatee.
Summing up the analysis above, 43 association rules in total are found among formal wear, casual wear, and tie-in products. Of these, 12 do not occur very often, leaving 31 association rules.
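The text does not state the support cutoff used to decide which rules "do not occur very often". As a rough sketch, the Python snippet below filters the reported support values with a hypothetical cutoff of 18%; this value is an assumption, chosen only because it happens to reproduce the selection reported in the three rule lists.

```python
# Support values (%) copied from the three rule lists, in rule order.
# MIN_SUPPORT is a hypothetical cutoff; the source does not state the one it used.
MIN_SUPPORT = 18.0

supports = {
    "casual + tie-in": [38.46, 35.90, 44.05, 13.32, 15.49, 20.51, 43.32, 47.33,
                        18.46, 43.32, 47.42, 15.90, 11.54, 21.54, 41.54, 57.64,
                        31.44, 14.54, 13.54, 17.54, 24.55, 33.54],
    "formal + tie-in": [40.37, 13.32, 36.02, 45.20, 39.87, 33.33, 2.26, 14.12,
                        41.34, 40.74, 39.44, 39.44, 43.45, 11.74],
    "formal + casual + tie-in": [36.32, 38.32, 41.32, 12.12, 45.92, 39.32, 34.32],
}


def infrequent_rules(values, min_support):
    """Return the 1-based numbers of rules whose support is below the cutoff."""
    return [i for i, s in enumerate(values, start=1) if s < min_support]


flagged = {group: infrequent_rules(v, MIN_SUPPORT) for group, v in supports.items()}
total = sum(len(v) for v in supports.values())
dropped = sum(len(v) for v in flagged.values())

for group, rule_numbers in flagged.items():
    print(group, rule_numbers)
print(total, dropped, total - dropped)  # 43 12 31
```

With this cutoff the flagged rules are 4, 5, 12, 13, 18, 19, and 20 in the first list, 2, 7, 8, and 14 in the second, and 4 in the third, matching the 12 rules set aside in the text.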
In summary, data mining reveals valuable information hidden in the large volume of transaction data of the sample enterprise, and the following conclusions are drawn.
First, the consumers of the selected enterprise are divided into eight clusters, each with distinct behavioral characteristics and clothing purchase preferences.
Second, 43 association rules among formal wear, casual wear, and tie-in products are found by using the association rule algorithm. Excluding the 12 rules that do not occur very often, the remaining 31 are of great value to B/C clothing enterprises.
Consequently, for the 8 clusters of consumers, the B/C apparel e-commerce enterprise should further refine its target market according to the characteristics of each cluster and allocate apparel goods according to each cluster’s purchase preferences. At the same time, in view of the association rules between apparel categories, combination sales, bundling sales, and cross-selling strategies should be developed in future apparel promotion. The combinations should, of course, follow the rules above. For example, routine, flexible categories such as T-shirts and jeans match well with both casual wear and formal wear, so fair prices and a sustained sales strategy should be adopted over the long run. For goods that sell poorly, combination sales should be used: matching them with best-selling or steadily selling apparel and selling them as a set rather than individually should improve their sales. As for bags and scarves, which are seldom bought on their own, bundling or cross recommendation is likely to win new market share, improve sales efficiency, and enlarge the profit margin of B/C e-commerce enterprises. These findings help to better understand the nature of online apparel consumption behavior and to advance personalization and intelligent recommendation strategies.
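As an illustration of how retained rules could drive cross recommendation, the following Python sketch ranks suggestions for a basket by rule confidence. The six rules and their confidence values are copied from the rule lists above; the `recommend` helper and the ranking policy (highest confidence first) are hypothetical and are not described in the study.

```python
# Cross-recommendation sketch. Rules and confidence values (%) are copied from
# the retained rules in the text; the recommend() helper is hypothetical.
RULES = [
    ("T-shirt", "Bag", 71.03),
    ("T-shirt", "Scarf", 62.59),
    ("Jeans", "T-shirt", 88.91),
    ("Trenchcoat", "Scarf", 90.74),
    ("Coatee", "Shoes", 88.34),
    ("Coatee", "Bag", 68.34),
]


def recommend(basket, rules=RULES, top_n=3):
    """Suggest items not yet in the basket, ranked by the best matching confidence."""
    best = {}
    for antecedent, consequent, confidence in rules:
        if antecedent in basket and consequent not in basket:
            best[consequent] = max(confidence, best.get(consequent, 0.0))
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


print(recommend({"T-shirt", "Coatee"}))
# [('Shoes', 88.34), ('Bag', 71.03), ('Scarf', 62.59)]
```

Ranking by confidence is only one possible policy; ranking by lift would instead favor items whose purchase is most strongly raised above their baseline share.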
[1] Biswas D, Biswas A. The Diagnostic Role of Signals in the Context of Perceived Risks in Online Shopping: Do Signals Matter More on the Web?[J]. Journal of Interactive Marketing, 2004, 18(3): 30-45.
[2] Chang Y, Chen A D. Consumer Online Shopping Intention Forecasting Based on Intuitionistic Fuzzy Reasoning[J]. International Journal of Digital Content Technology and Its Applications, 2012, 6(16): 540-547.
[3] Zhao X J, Shi C X, Gan S Q, et al. Self-concept Evaluation of Online Shoppers: Proof from Experience-Based Serial Reproduction Study[J]. International Journal of Digital Content Technology and Its Applications, 2012, 6(17): 9-17.
[4] Li X L, Zhao R, Xiao Y. B2C E-Commerce Websites Evaluation System on Users’ Experience Basis[J]. International Journal of Advancements in Computing Technology, 2013, 5(2): 563-570.
[5] Kim J, Fiore A M, Lee H H. Influences of Online Store Perception, Shopping Enjoyment, and Shopping Involvement on Consumer Patronage Behavior towards an Online Retailer[J]. Journal of Retailing and Consumer Services, 2007, 14(2): 95-107.
[6] Zhou L N, Dai L W, Zhang D S. Online Shopping Acceptance Model: a Critical Survey of Consumer Factors in Online Shopping[J]. Journal of Electronic Commerce Research, 2007, 8(1): 41-62.
[7] Jones C, Kim S. Influences of Retail Brand Trust, Off-line Patronage, Clothing Involvement and Website Quality on Online Apparel Shopping Intention[J]. International Journal of Consumer Studies, 2010, 34(6): 627-637.
[8] Yoh E, Damhorst M L, Sapp S, et al. Consumer Adoption of the Internet: the Case of Apparel Shopping[J]. Psychology & Marketing, 2003, 20(12): 1095-1118.
[9] Lee Z C, Paul D. Customer Perceptions of E-Service Quality in Online Apparel Shopping[C]. Global Conference on Business and Finance, Hawaii, 2012: 629-634.
[10] Almousa M. Perceived Risk in Apparel Online Shopping: a Multi Dimensional Perspective[J]. Canadian Social Science, 2011, 7(2): 23-31.
[11] Kim J, Forsythe S. Adoption of Virtual Try-on Technology for Online Apparel Shopping[J]. Journal of Interactive Marketing, 2008, 22(2): 45-59.
[12] Merle A, Senecal S, St-Onge A. Whether and How Virtual Try-on Influences Consumer Responses to an Apparel Web Site[J]. International Journal of Electronic Commerce, 2012, 16(3): 41-64.
[13] Kim J, Forsythe S. Hedonic Usage of Product Virtualization Technologies in Online Apparel Shopping[J]. International Journal of Retail & Distribution Management, 2007, 35(6): 502-514.
[14] Lee H H, Damhorst M L, Campbell J R, et al. Consumer Satisfaction with a Mass Customized Internet Apparel Shopping Site[J]. International Journal of Consumer Studies, 2011, 35(3): 316-329.
[15] Myers C A, Mintu-Wimsatt A. Exploring Antecedents Influencing Internet Shopping Satisfaction: the Case of the Apparel Industry[J]. International Journal of Business and Social Science, 2012, 3(8): 1-9.
[16] Jacobs B, de Klerk H M. Online Apparel Shopping Behaviour of South African Professional Women: the Role of Consumers’ Apparel Shopping Scripts[J]. International Journal of Consumer Studies, 2010, 34(3): 255-264.
[17] Song Z J, Kong X M, Wang Y F. Understanding the Link between Consumer Decision-Making Style and Online Apparel Purchasing[J]. Journal of Software, 2011, 6(10): 2068-2075.
[18] Ha Y, Lennon S J. Online Visual Merchandising (VMD) Cues and Consumer Pleasure and Arousal: Purchasing versus Browsing Situation[J]. Psychology & Marketing, 2010, 27(2): 141-165.
[19] Lei J H, Yang X F. Construction the E-Commerce Trading Platform Based on Rough Set Data Mining Technology[J]. Journal of Convergence Information Technology, 2013, 8(3): 460-469.
[20] Xi J. User Behavior Analysis and Mining Based on Web Log[D]. Shanghai: Donghua University, 2011: 23-29. (in Chinese)
[21] Xie B C. Application Practice of the Clementine Data Mining[M]. Beijing: China Machine Press, 2008: 213-289. (in Chinese)
[22] Sheng Y Y, Yan R W, Wang J R, et al. Research Multi-dimensional Association Rule Mining Based on Apriori Algorithm[J]. Science Technology and Engineering, 2009, 9(7): 1734-1737. (in Chinese)
[23] Liu R, Chen X H. Consumer-Action Analysis in Mobile Enterprises on Data Mining[J]. Computer Applications and Software, 2006, 23(2): 60-62. (in Chinese)
[24] Lennon S J, Kim M, Johnson K K P, et al. A Longitudinal Look at Rural Consumer Adoption of Online Shopping[J]. Psychology & Marketing, 2007, 24(4): 375-401.
[25] Krishna C V. Determinants of Consumer Buying Behaviour: an Empirical Study of Private Label Brands in Apparel Retail[J]. The XIMB Journal of Management, 2011, 8(2): 43-56.
[26] Jafri H. Psychological Capital and Innovative Behaviour: an Empirical Study on Apparel Fashion Industry[J]. The Journal Contemporary Management Research, 2012, 6(1): 42-52.
Journal of Donghua University (English Edition), 2013, Issue 6