
Chinese Calligraphy Word Spotting Using Elastic HOG Feature and Derivative Dynamic Time Warping

2014-03-14 02:15:34

Journal of Harbin Institute of Technology (New Series), 2014, Issue 2

Yong Xia, Zhi-Bo Yang, Kuan-Quan Wang

(School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)

1 Introduction

Chinese calligraphy is an art of writing with a history of about 3000 years and a valuable legacy of civilization; calligraphic works are widely collected in museums and libraries. Most of them are highly valuable and rarely consulted, so many institutions are interested in providing access to scanned versions of these collections. Efficient indexing and retrieval techniques are therefore indispensable. Chinese calligraphy is a very special style of handwriting, and some of its distinctive characteristics are as follows:

1) Variation: Calligraphy is written with brush and ink, which leads to larger variations in stroke thickness than writing with pen or pencil. Calligraphy from different dynasties also has different writing styles, most of which are no longer in use today.

2) Degradation: Calligraphy is often significantly degraded by faded ink, stained paper and other adverse natural factors.

3) Deformation: Calligraphy reflects the calligrapher's personality and is often purposely written in an unusual way, for example with dry strokes.

Traditional handwritten character recognition (HCR) techniques, which usually recognize words character by character and require a sufficiently large corpus in advance, fail when applied to Chinese calligraphy. Word spotting is a text retrieval technique without recognition, proposed by Manmatha et al. [1]. Although various word spotting methods have recently been proposed in Refs. [2-5], they have achieved only limited success and cannot directly solve the problem of spotting Chinese calligraphic images.

Generally, a typical word spotting system consists of three main modules: pre-processing, feature extraction and feature matching. Among them, feature extraction is one of the most important factors for achieving high retrieval performance, because features carrying strong discriminative information can be classified well even by the simplest classifier. In this paper, we propose a novel character feature, the Elastic Histogram of Oriented Gradient (EHOG), which was inspired by the local gradient histogram feature descriptor [11]. In our experiments, it proves superior to the prevailing features used in Chinese calligraphy spotting, such as the Gabor feature [8] and the Shape feature [10]. Furthermore, Derivative Dynamic Time Warping (DDTW), which has been very successful in gesture recognition, is adopted for calligraphy word matching. To the best of our knowledge, no similar work on Chinese calligraphy word spotting has been reported, so this investigation is significant and valuable.

The remainder of this paper is arranged as follows. Related work is introduced in Section 2. In Section 3, our proposed EHOG feature and the DDTW matching algorithm are described in detail. In Section 4, experimental results are given and discussed. Finally, the main conclusions are drawn in Section 5.

2 Related Work

Considerable success has been achieved in handwritten Chinese character recognition in recent decades. However, Chinese calligraphy is very special, and there are few works that we can refer to. Generally, calligraphy retrieval can be divided into two groups according to the solution adopted: content-based key word spotting and recognition-based retrieval. This paper focuses on the content-based key word spotting approach. Content-based retrieval has two main procedures: feature extraction and feature matching.

For feature extraction, Rath and Manmatha proposed a set of features for word image matching [6-7]. Their target was English manuscripts, which are composed of letters arranged only horizontally, whereas Chinese characters are composed of strokes in both horizontal and vertical directions. The Gabor feature, proposed by Daugman [8], is widely applied in character recognition and is suitable for extracting joint information in the two-dimensional spatial and frequency domains. Japanese characters, which originated from China, are far fewer in number than Chinese characters but remain similar to them in many respects. Terasawa et al. introduced an eigenspace method for Japanese handwriting [9], but the material in their experiments was not as complex as Chinese calligraphy. Zhuang and Zhang [10] retrieved Chinese calligraphic character images using the approximate point correspondence (APC) algorithm, in which a character is represented by the shape of its contour points.

As for feature matching, Dynamic Time Warping (DTW) is a widely used matching technique in time series analysis, originally proposed for speech recognition by Sakoe and Chiba [17]. To handle character retrieval better, Zhang and Zhuang proposed a two-dimensional DTW for Chinese calligraphic character matching [18]. Keogh et al. extended DTW to Derivative Dynamic Time Warping (DDTW) by eliminating conditions in which DTW produces pathological results [15]. Zhuang proposed an APC matching algorithm specifically for the shape feature [10].

3 Our Proposed Method

In this section, we first propose a novel EHOG feature for Chinese calligraphic word spotting. Then, we introduce the corresponding matching algorithm, DDTW, adopted in our experiments.

3.1 Preprocessing

The materials for our study were digital images of calligraphy collections, which were obtained by scanning. Some preprocessing steps needed to be performed at the beginning.

1) Man-made seals were removed. The background was removed instead of applying full binarization, because grayscale information is important for gradient-based features.

2) Images were smoothed by a Gaussian filter in order to be robust to noise. We set the deviation parameter to σ=2.

3) Word-based segmentation as in Ref. [12], rather than line-based segmentation, was used, because the characters were written with a brush and spaces between characters always exist. Moreover, even where line-based segmentation is possible and convenient, it increases the dimensionality of the feature vectors and makes computation and matching expensive.

4) Generally, linear normalization, nonlinear normalization or elastic meshing techniques are applied before feature extraction. Although nonlinear normalization is more commonly used, some experiments show that elastic meshing performs better in Chinese handwriting recognition [13]. Therefore, the Gabor feature and the EHOG feature in this paper were based on elastic meshing.
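The smoothing in step 2 can be illustrated with a short sketch. It assumes a grayscale image stored as a list of rows, a kernel truncated at about 3σ, and replicated borders; none of these details are stated in the paper, and the function names are hypothetical.

```python
import math

def gaussian_kernel(sigma=2.0):
    """1-D Gaussian kernel, truncated at about 3*sigma (assumption), summing to 1."""
    radius = int(3 * sigma + 0.5)
    k = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def smooth_1d(values, kernel):
    """Filter one row or column, replicating the border values."""
    r = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - r, 0), n - 1)  # clamp the index at the borders
            acc += w * values[idx]
        out.append(acc)
    return out

def gaussian_smooth(image, sigma=2.0):
    """Separable Gaussian smoothing: filter the rows first, then the columns."""
    k = gaussian_kernel(sigma)
    rows = [smooth_1d(row, k) for row in image]
    cols = [smooth_1d(list(col), k) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

Because the kernel is normalized, a uniform region keeps its gray level, so only local noise is attenuated before the gradients are computed.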

3.2 Feature Extraction

HOG (Histogram of Oriented Gradients) was first proposed by Dalal and Triggs at CVPR 2005 [14] for the human detection task. Similar to SIFT, this technique counts occurrences of gradient orientations in localized portions of an image. It differs in that it is computed on a dense grid of uniformly spaced cells and uses overlapping local blocks, which gives a redundant representation and improves accuracy. Another difference is that SIFT needs to detect key points, which are usually not stable in our task because Chinese calligraphy characters vary greatly across writing styles.

To suit the characteristics of Chinese calligraphy characters, a novel feature descriptor named EHOG, a modification of HOG, is proposed. Fig. 1 gives a flow chart of the EHOG feature extraction procedure, where Gx and Gy represent the horizontal and vertical gradients, respectively.

Unlike the original HOG, as shown in Fig. 1, EHOG divides the image into non-uniform cells after preprocessing, based on the elastic meshing technique presented in Ref. [13]. Elastic meshing partitions the input character image with imaginary grids according to stroke intensity. Because calligraphy characters vary in many ways, such as position, size and inclination, elastic meshing is more reasonable for feature extraction than uniform cells. The same strokes from two instances of the same character are therefore more likely to fall into cells at the same grid position and eventually to yield similar feature descriptors.
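One common way to realize such a mesh is to place the grid lines so that every band holds roughly the same amount of stroke mass. The sketch below follows that equal-mass idea; it is an assumption for illustration and not necessarily the exact procedure of Ref. [13]. The input `ink` is assumed to be a list of rows in which larger values mean more ink, and both function names are hypothetical.

```python
def elastic_boundaries(profile, parts):
    """Split indices 0..len(profile) into `parts` bands of roughly equal mass.

    `profile` is the projection of stroke intensity onto one axis.
    Returns parts + 1 boundary indices, from 0 to len(profile).
    """
    n = len(profile)
    total = float(sum(profile))
    if total == 0:  # blank image: fall back to a uniform grid
        return [round(k * n / parts) for k in range(parts + 1)]
    bounds, acc, k = [0], 0.0, 1
    for i, v in enumerate(profile):
        acc += v
        # Close band k once it holds k/parts of the total stroke mass.
        while k < parts and acc >= k * total / parts:
            bounds.append(i + 1)
            k += 1
    bounds.append(n)
    return bounds

def elastic_mesh(ink, rows, cols):
    """Return (row_bounds, col_bounds) of a rows x cols elastic mesh."""
    row_profile = [sum(r) for r in ink]
    col_profile = [sum(c) for c in zip(*ink)]
    return elastic_boundaries(row_profile, rows), elastic_boundaries(col_profile, cols)
```

Dense stroke regions receive narrower cells and sparse regions wider ones. A single very dense row or column can produce an empty band in this simple version, which a fuller implementation would need to handle.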

Fig.1 EHOG feature extraction procedure

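In HOG, each pixel casts a weighted vote for the orientation histogram according to its gradient, and the votes are accumulated into orientation bins over cells. For the image $I(x, y)$, the horizontal and vertical gradient components $G_x$ and $G_y$ are determined as:

$$G_x(x, y) = I(x+1, y) - I(x-1, y) \tag{1}$$

and

$$G_y(x, y) = I(x, y+1) - I(x, y-1) \tag{2}$$

Then, the gradient magnitude $m$ and direction $\theta$ are obtained for the pixel with coordinates $(x, y)$ as:

$$m(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2} \tag{3}$$

and

$$\theta(x, y) = \angle\big(G_y(x, y),\ G_x(x, y)\big) \tag{4}$$

where $\angle$ is a function that returns the direction of the vector $(G_y, G_x)$.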

Finally, to reduce aliasing, votes are interpolated bilinearly between the neighboring bins. In other words, each pixel contributes its magnitude m(x, y) to the two closest bins, as shown in Fig. 2.

Fig. 2 Orientation bins for T=12 and angle difference of θ(x, y) to the two closest bins

The orientation bins are evenly spaced over 0°-180° ("unsigned" gradient) or 0°-360° ("signed" gradient). In our task, we use T orientation bins over the signed gradient. In human detection, the wide range of clothing and background colors presumably makes the sign of contrast uninformative. In contrast, characters darker than the background and characters brighter than the background are rarely mixed in one calligraphy image, so the sign carries information here. Suppose the image is divided into M×N cells with the elastic meshing technique; a histogram with M×N×T bins is then obtained.
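The voting scheme for one cell can be sketched as follows. The sketch assumes central-difference gradients with replicated borders, bins centered at multiples of 360°/T over the signed range, and a linear split of the magnitude between the two nearest bins; these choices and all function names are illustrative assumptions rather than details confirmed by the paper.

```python
import math

def gradients(img, x, y):
    """Central differences at pixel (x, y) of img[y][x]; borders are replicated."""
    h, w = len(img), len(img[0])
    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
    return gx, gy

def vote(hist, magnitude, theta, T):
    """Split `magnitude` linearly between the two bins nearest to angle theta."""
    width = 2.0 * math.pi / T                 # signed gradient: full circle
    pos = (theta % (2.0 * math.pi)) / width   # fractional bin position
    lo = int(math.floor(pos)) % T
    hi = (lo + 1) % T                         # wraps around from bin T-1 to bin 0
    frac = pos - math.floor(pos)
    hist[lo] += magnitude * (1.0 - frac)
    hist[hi] += magnitude * frac

def cell_histogram(img, x0, x1, y0, y1, T=12):
    """Orientation histogram of the cell covering [x0, x1) x [y0, y1)."""
    hist = [0.0] * T
    for y in range(y0, y1):
        for x in range(x0, x1):
            gx, gy = gradients(img, x, y)
            m = math.hypot(gx, gy)
            if m > 0.0:
                vote(hist, m, math.atan2(gy, gx), T)
    return hist
```

With an elastic mesh, the cell bounds `x0, x1, y0, y1` would come from the mesh boundaries rather than from a uniform grid, and the M×N cell histograms together give the M×N×T bins mentioned above.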

In Ref. [14], the normalized block descriptors are used as the HOG descriptor. A block is a larger spatial region that groups m×n spatially connected cells. A block descriptor is an m×n×T-dimensional vector formed by concatenating the histograms of its m×n cells. Blocks typically overlap (as shown in Fig. 3, when the block size is set to 2×2 on a 7×7 grid of cells, (7-2+1)×(7-2+1) unique blocks are obtained), meaning that each cell contributes more than once to the HOG descriptor. Thus, as mentioned above, the redundant representation produced by this overlap is the salient characteristic of the HOG feature. Since there are (M-m+1)×(N-n+1) unique blocks, each with m×n×T components, (M-m+1)×(N-n+1)×mnT dimensions are finally concatenated into a single vector as the EHOG descriptor.

Fig.3 An example of EHOG

In our EHOG feature, because the cells are not of equal size, L2-normalization is applied to each unique block. After that, a performance gain is obtained experimentally when all blocks are scaled so that their components sum to 1. This improvement is due to the fact that the characters in our experiments are not of uniform size.
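The block grouping and normalization can be sketched as below. The text does not say whether the final rescaling is applied per block or to the whole descriptor; the sketch assumes the whole descriptor, because rescaling each block of non-negative values to sum to 1 would cancel the preceding L2 step. The function name and the default block size are hypothetical.

```python
import math

def ehog_descriptor(cells, m=2, n=2):
    """Concatenate L2-normalized overlapping blocks of cell histograms.

    `cells[r][c]` is the T-bin histogram of cell (r, c) in an M x N grid.
    Each block groups m x n adjacent cells and slides one cell at a time.
    """
    M, N = len(cells), len(cells[0])
    descriptor = []
    for r in range(M - m + 1):
        for c in range(N - n + 1):
            block = [v for i in range(m) for j in range(n) for v in cells[r + i][c + j]]
            norm = math.sqrt(sum(v * v for v in block))
            if norm > 0.0:
                block = [v / norm for v in block]  # per-block L2 normalization
            descriptor.extend(block)
    total = sum(descriptor)
    if total > 0.0:
        # Assumed reading of the final step: scale the full vector to sum to 1.
        descriptor = [v / total for v in descriptor]
    return descriptor
```

For a 7×7 grid of cells, 2×2 blocks and T=12, the result has (7-2+1)×(7-2+1)×2×2×12 = 1728 components, in line with the dimension count given for overlapping blocks.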

3.3 DDTW Matching

Matching begins once the sequence of feature vectors has been extracted by the feature descriptors discussed above. An example of matching based on DTW and DDTW is given in Fig. 4. Note that DTW fails to align the two central peaks because they are slightly separated in the Y-axis.

Fig.4 An example of matching based on DTW and DDTW

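DTW algorithm: To align two feature sequences $Q$ and $S$, of length $n$ and $m$ respectively, where $Q = q_1, q_2, \ldots, q_i, \ldots, q_n$ and $S = s_1, s_2, \ldots, s_j, \ldots, s_m$, DTW constructs an $n$-by-$m$ matrix in which each element $(i, j)$ corresponds to the alignment between the points $q_i$ and $s_j$. The warping path $W$ is a contiguous set of matrix elements that defines a warping between $Q$ and $S$. The $k$th element of $W$ is defined as $w_k = (i, j)_k$:

$$W = w_1, w_2, \ldots, w_k, \ldots, w_K, \qquad \max(m, n) \le K < m + n - 1 \tag{5}$$

where $W$ starts in $w_1 = (1, 1)$ and finishes in $w_K = (n, m)$. The warping cost is defined as:

$$DTW(Q, S) = \min\left\{ \frac{1}{K} \sqrt{\sum_{k=1}^{K} w_k} \right\} \tag{6}$$

where $w_k$ inside the sum denotes the distance $d(q_i, s_j)$ at the $k$th path element. In other words, DTW calculates this path in a dynamic programming approach using:

$$\gamma(i, j) = d(q_i, s_j) + \min\{\gamma(i-1, j-1),\ \gamma(i-1, j),\ \gamma(i, j-1)\} \tag{7}$$

where $\gamma(i, j)$ is the cumulative distance up to element $(i, j)$.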
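The dynamic programming step can be sketched in a few lines. The sketch assumes that each sequence element is a feature vector, that d is the squared Euclidean distance, and that the returned value is the cumulative cost at the end of the path, without the square root and path-length normalization of the warping cost; the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors (assumed d)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def dtw(Q, S, dist=sq_dist):
    """Cumulative DTW cost between sequences Q (length n) and S (length m)."""
    n, m = len(Q), len(S)
    INF = float("inf")
    # gamma[i][j] is the cheapest cost of aligning Q[:i] with S[:j].
    gamma = [[INF] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            gamma[i][j] = dist(Q[i - 1], S[j - 1]) + min(
                gamma[i - 1][j - 1],  # diagonal step
                gamma[i - 1][j],      # advance in Q only
                gamma[i][j - 1],      # advance in S only
            )
    return gamma[n][m]
```

The extra row and column filled with infinity force every path to start at the first elements of both sequences and to end at their last elements. The cost is O(nm) distance evaluations per pair.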

Although DTW has been used successfully to find similar sequences, it may produce pathological results. DTW tries to explain the variability of a two-dimensional character image in the Y-axis by warping the X-axis (the time axis). For two identical Chinese characters written at different times, it often happens that a valley in one feature sequence is much deeper than the corresponding valley in the other, or that a rising trend is slower than its counterpart. For example, consider two data points $q_i$ and $s_j$ that have identical values, where $q_i$ is part of a rising trend and $s_j$ is part of a falling trend. DTW considers a mapping between these two points ideal, although intuitively we would prefer not to map a rising trend to a falling trend. Keogh et al. proposed Derivative Dynamic Time Warping in Ref. [15], which does not consider the Y-values of the data points but rather the higher-level feature of shape. Shape information is obtained from the first derivative of the sequence. Thus, $d(q_i, s_j)$ in Eq. (7) is replaced with $d(q'_i, s'_j)$, where

$$q'_i = \frac{(q_i - q_{i-1}) + (q_{i+1} - q_{i-1})/2}{2}, \qquad 1 < i < n \tag{8}$$

and $s'_j$ is defined analogously.

In this way, alignment is based on shape characteristics (slope, peaks) rather than simple values.
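A minimal sketch of DDTW follows, assuming the derivative estimate of Keogh and Pazzani [15] applied component-wise to the feature vectors, with the first and last estimates copied from their neighbours. The compact `dtw` helper uses the squared Euclidean distance; all names are hypothetical.

```python
def derivative(seq):
    """Estimate the derivative of a sequence of feature vectors.

    Interior points use ((x_i - x_{i-1}) + (x_{i+1} - x_{i-1}) / 2) / 2.
    """
    n = len(seq)
    if n < 3:
        raise ValueError("need at least three points")
    d = [None] * n
    for i in range(1, n - 1):
        d[i] = [((cur - prev) + (nxt - prev) / 2.0) / 2.0
                for prev, cur, nxt in zip(seq[i - 1], seq[i], seq[i + 1])]
    d[0], d[-1] = d[1], d[-2]  # endpoints reuse the neighbouring estimates
    return d

def dtw(Q, S):
    """Cumulative DTW cost with squared Euclidean local distance."""
    INF = float("inf")
    g = [[INF] * (len(S) + 1) for _ in range(len(Q) + 1)]
    g[0][0] = 0.0
    for i in range(1, len(Q) + 1):
        for j in range(1, len(S) + 1):
            cost = sum((a - b) ** 2 for a, b in zip(Q[i - 1], S[j - 1]))
            g[i][j] = cost + min(g[i - 1][j - 1], g[i - 1][j], g[i][j - 1])
    return g[len(Q)][len(S)]

def ddtw(Q, S):
    """DTW computed on the derivative sequences instead of the raw values."""
    return dtw(derivative(Q), derivative(S))
```

Two sequences with the same shape but a constant offset in value, such as 0, 1, 2, 3 and 5, 6, 7, 8, have identical derivative sequences, so their DDTW cost is zero while their DTW cost is not.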

4 Experiment

This section first describes experiments that compare the retrieval performance of our proposed feature with three competitive features, the original HOG feature [14], the Gabor feature [8] and the Shape feature [10], using the feature matching algorithms DTW [17], DDTW [15] and APC matching [10]. Then, experiments are conducted to evaluate the performance of our proposed feature using different matching methods and their combination.

4.1 Dataset & Experimental Setup

The lack of a public Chinese calligraphic character dataset makes experiments hard to carry out. To evaluate the proposed approach properly, we collected a dataset containing 17 volumes of calligraphy collections from the Internet and libraries, scanned at 300 dpi. The documents contain 14302 characters in total. After segmentation, the sub-images corresponding to certain keywords were carefully labeled by hand. A sample page used in the experiment is displayed in Fig. 5.

Fig. 5 A sample page image: "Xian ju fu" written by Zhao Mengfu

Considering the variation in the writing style of Chinese characters, and to make the test samples representative, we randomly select query images from the ten most frequent characters ('其', '也', …; from 131 to 632 positive examples are available for each character). The objective of the spotting task is to retrieve the characters identical to the one shown in the query image. In our approach, the features of calligraphic words are extracted using EHOG and matching is done by DDTW.

4.2 Parameter Selection for EHOG

For EHOG feature extraction, the character image is divided into 5×5 blocks, and for each cell in each block, widths and heights from 1 to 7 are tested in order to find the optimal parameters. With the number of orientation bins set to 4, the key word spotting results are given in Fig. 6. As observed in Fig. 6, the performance is best when the width and height of the cell are set to 2 and 3, respectively, and worst when the width or height of the cell reaches 7. Therefore, the redundant representation of the original HOG is shown to be effective in key word spotting, as it is in the human detection task. The optimal number of orientation bins is found to be 12 by parameter variation tests (as shown in Fig. 7).

Fig.6 Mean average precision for the different cells in height(H)and width(W)

Fig.7 Mean average precision for the different bins

4.3 Evaluation of Word Spotting

Based on near-optimal parameters for the width and height of the cells in a block, the retrieval performance attained with EHOG and HOG is given in Table 1. In this test, the image is divided into 7×7 cells for each block, and the number of orientation bins is set to 8. From this table, we can see that EHOG significantly outperforms the original HOG in most cases. This shows that our modification of HOG is well suited to calligraphy character retrieval.

Table 1 Mean average precision for the comparison between EHOG and HOG

DTW is the most popular feature matching algorithm in key word spotting, but DDTW was more effective than DTW in gesture recognition [16]. Therefore, both DTW and DDTW are used for performance evaluation in our experiment. With the optimal parameters obtained in the previous step, experiments are carried out to compare the performance of the EHOG feature against the Gabor feature and the Shape feature. The results are given in Table 2.

Table 2 Mean average precision(mAP)for the different features and corresponding matching methods

The comparison results shown in Table 2 indicate that our proposed EHOG feature demonstrates a significant improvement in Chinese calligraphic word spotting compared with the Gabor feature and the Shape feature. Furthermore, compared to DTW, DDTW produces a better alignment between two feature series by taking shape characteristics into account.
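The paper reports mean average precision without defining it. The sketch below uses the standard definition, which is an assumption about how the reported figures were computed: average precision is the mean of the precision values at the ranks of the relevant items, and mAP is its mean over all queries. It also assumes every positive example appears somewhere in the ranked list.

```python
def average_precision(relevant):
    """AP of one ranked list; relevant[k] is True if rank k+1 is a correct hit."""
    hits, total = 0, 0.0
    for rank, is_hit in enumerate(relevant, start=1):
        if is_hit:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0

def mean_average_precision(ranked_lists):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)
```

For example, a ranking with hits at ranks 1 and 3 has AP = (1/1 + 2/3)/2, and a ranking whose only hit is at rank 2 has AP = 1/2.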

In addition, we try to combine DTW and DDTW to improve feature matching performance. For every query character, EHOG features are extracted and similar characters are retrieved by DTW. DDTW is then used to re-rank the retrieved characters. As the partial experimental results in Table 3 indicate, the performance improves greatly for all query characters. This improvement supports the view that shape characteristics are important in classifying similar images. In this case, however, the computational cost is higher.
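The two-stage scheme can be sketched generically as follows. The function name, the `top_k` cut-off and the two distance arguments are hypothetical; the paper does not state how many retrieved characters are re-ranked. In the setting described here, `coarse` would be a DTW distance and `fine` a DDTW distance on EHOG feature sequences.

```python
def retrieve_and_rerank(query, candidates, coarse, fine, top_k):
    """Rank candidates by `coarse`, then re-order the best `top_k` by `fine`.

    Returns candidate indices; entries beyond top_k keep their coarse order.
    """
    order = sorted(range(len(candidates)),
                   key=lambda i: coarse(query, candidates[i]))
    head = sorted(order[:top_k], key=lambda i: fine(query, candidates[i]))
    return head + order[top_k:]
```

Each query then evaluates both distances on the retrieved candidates, which is consistent with the higher computational cost noted above.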

Table 3 Mean average-precision for some keywords before and after re-ranking

An example of a key word spotting result is given in Fig. 8. A sample of the Chinese word '之' from the collection is used as the query word, and the seven words with the highest similarity are returned as the retrieval result. A computer with an Intel Core 2 Duo 3.0 GHz CPU is used for the test, with EHOG and DDTW used for word spotting. A word query on this dataset takes about 10 seconds. In future work, pruning algorithms or other optimization methods will be considered to improve speed.

Fig. 8 Retrieved result for "Xian Ju Fu" using EHOG; the leftmost column is the query image, followed by the retrieved characters ranked first to seventh

5 Conclusions

In this paper, a novel EHOG feature descriptor is proposed, which obtains superior performance in the word spotting task compared with the Gabor feature and the Shape feature. EHOG modifies the HOG descriptor according to the characteristics of Chinese calligraphy characters. The experimental results also confirm the effectiveness of EHOG's overlapping normalization and redundant representation. Moreover, EHOG extracts features according to stroke intensity, which effectively reflects the characteristics of calligraphy. The DDTW algorithm, an extension of the regular DTW algorithm that takes feature shape into account when finding the optimal synchronization between two series, shows an advantage over DTW, as expected.

The drawback of the EHOG feature is its high dimensionality, which has a negative impact on the computational cost, especially when matching first by DTW and then re-ranking by DDTW. We tried to use PCA for dimensionality reduction, but this resulted in a significant decrease in performance. Our future work will therefore focus on exploring other techniques for dimensionality reduction or pruning algorithms.

[1] Manmatha R, Han Chengfeng. Word Spotting: A new approach to indexing handwriting. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 1996. 631-637.

[2] Saabni R, El-Sana J. Keywords image retrieval in historical handwritten Arabic documents. Journal of Electronic Imaging, 2013, 22(1): 013016-013016.

[3] Liang Y, Fairhurst M C, Guest R M. A synthesised word approach to word retrieval in handwritten documents. Pattern Recognition, 2012, 45(12): 4225-4236.

[4] Rodriguez-Serrano J, Perronnin F. A model-based sequence similarity with application to handwritten word-spotting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(11): 2108-2120.

[5] Rodriguez-Serrano J A, Perronnin F. Synthesizing queries for handwritten word image retrieval. Pattern Recognition, 2012, 45(9): 3270-3276.

[6] Rath T M, Manmatha R. Features for word spotting in historical manuscripts. Proceedings of International Conference on Document Analysis and Recognition. Piscataway: IEEE, 2003. 218-222.

[7] Frinken V, Fischer A, Manmatha R. A novel word spotting method based on recurrent neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(2): 211-224.

[8] Daugman J G. Two dimensional spectral analysis of cortical receptive field profiles. Vision Research, 1980, 20(10): 847-856.

[9] Terasawa K, Nagasaki T, Kawashima T. Eigenspace method for text retrieval in historical document images. Proceedings of International Conference on Document Analysis and Recognition. Piscataway: IEEE, 2005. 437-441.

[10] Zhuang Y, Zhang X. Retrieval of Chinese calligraphic character image. Advances in Multimedia Information Processing-PCM. Berlin: Springer, 2005. 17-24.

[11] Rodriguez J A, Perronnin F. Local gradient histogram features for word spotting in unconstrained handwritten documents. Proceedings of the International Conference on Frontiers in Handwriting Recognition. Montreal: IAPR, 2008. 7-12.

[12] Rath M, Manmatha R. Word spotting for historical documents. Int. J. Document Analysis and Recognition, 2007, 9(2): 139-152.

[13] Wu T, Ma P. Feature extraction by hierarchical overlapped elastic meshing for handwritten Chinese character recognition. Proceedings of International Conference on Document Analysis and Recognition. Piscataway: IEEE, 2003. 529-533.

[14] Dalal N, Triggs B. Histogram of oriented gradients for human detection. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE, 2005. 886-893.

[15] Keogh E, Pazzani J. Derivative dynamic time warping. Proceedings of the 1st SIAM International Conference on Data Mining. Chicago: SIAM, 2001. 1-11.

[16] Holt A, Reinders T. Multi-dimensional dynamic time warping for gesture recognition. Proceedings of the 13th Annual Conference of the Advanced School for Computing and Imaging. 2007. 300.

[17] Sakoe H, Chiba S. Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1978, 26(1): 43-49.

[18] Zhang X, Zhuang Y. Dynamic time warping for Chinese calligraphic character matching and recognition. Pattern Recognition Letters, 2012, 33(16): 2262-2269.
