A Virtual Walk along Street Based on View Interpolation
Shuang-Mei Wang
Abstract—Image morphing is a powerful tool for visual effects. In this paper, a view interpolation algorithm is proposed to simulate a virtual walk along a street from a start position to an end position. Simulating a virtual walking view requires creating new scenery that appears at the vision-vanishing point and removing scenery that disappears beyond the scope of view. To attain these two aims, we use two enhanced position parameters to match the pixels of the source image and the target image. One enhanced position parameter is the angular coordinate of a pixel; the other is the distance from the pixel to the vision-vanishing point. According to the parameter values, pixels beyond the scope of view can be "moved" out during linear interpolation. Results demonstrate the validity of the algorithm. A further advantage of this algorithm is that the enhanced position parameters are based on real locations and walking distances, so it is also an approach to online virtual tours based on the satellite maps of virtual globe applications such as Google Earth.
Index Terms—Enhanced position parameter, feature region, linear interpolation, scope of view, virtual walk.
Online virtual globe applications such as Google Earth and Google Maps, Microsoft Virtual Earth, and Yahoo! Maps, together with scene pictures, allow users to explore realistic models of the Earth [1]–[4]. To improve online virtual tours, it is necessary to serve sequential scene images to users while they are using digital online maps, so a large number of images must be acquired, stored, and transmitted, at high cost and difficulty. In this work, we propose a view interpolation algorithm that simulates a virtual walk using only two images. Users can walk along a virtual street over a real walking distance.
Simulating a virtual walking view requires creating new scenery that appears at the vision-vanishing point and removing scenery that disappears beyond the scope of view. To attain these two aims, we use linear interpolation to generate pixel coordinates based on the street boundaries. First, two pictures of a street are input as the source image and the target image. The street boundaries are extracted as feature lines, and according to these feature lines the whole image is divided into two regions. The vision-vanishing point of each image is obtained by calculating the intersection point of the boundaries. Based on the feature regions and the vision-vanishing points, the position of a pixel can be represented by two enhanced position parameters, α and ρ. We use linear interpolation to calculate the pixel coordinates of the in-between images. In each interpolation, the enhanced position parameter ρ is used to evaluate the distance between a pixel and the vision-vanishing point, so pixels beyond the scope of view can be "moved" out. We use the nearest neighbor algorithm and color linear interpolation to render unmatched pixels. Results demonstrate the validity of this algorithm. Since the enhanced position parameters are based on real locations and distances from the satellite maps of Google Earth, this method can be an approach to online virtual tours.
A virtual walk along a street includes two main actions: 1) walking forward (the visual angle keeps pointing forward and the view depth changes), and 2) stopping and looking around (the visual angle changes and the view depth keeps constant). The second action can be implemented by existing techniques such as scene matching [2] and image mosaicking. In this work, we consider the first action.
We take two pictures of each street: one is the source image and the other is the target image, and we record the real distance between the two shooting positions. Preprocessing includes feature line detection and vision-vanishing point (VVP) calculation in both the source image and the target image. The street boundaries are marked as feature lines or extracted by contour detection. The VVP is obtained by calculating the intersection point of the feature lines, as shown in Fig. 1.
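Since the VVP is defined as the intersection point of the two boundary feature lines, it can be computed in closed form. The following is a minimal Python sketch of this step (an illustration of the idea, not the author's code); the boundary end points in the example are made up.

```python
def line_intersection(p1, p2, p3, p4):
    """Intersection of the line through p1, p2 with the line through p3, p4.

    Each point is an (x, y) tuple. Used here to estimate the
    vision-vanishing point (VVP) as the intersection of the two
    street-boundary feature lines.
    """
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-9:
        raise ValueError("feature lines are (nearly) parallel; no VVP")
    px = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / denom
    py = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / denom
    return (px, py)

# Example with hypothetical street-boundary end points of a 640x480 image:
vvp = line_intersection((0, 480), (300, 200), (640, 480), (340, 200))
```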
Fig. 1. Preprocessing: (a) input image and (b) feature lines and VVP.
Fig. 2. Source image: (a) feature regions and (b) inner pixel of feature region.
According to the VVPs of the source image and the target image, the VVP of an in-between image is generated by linear interpolation as follows:
R_mid = (1 - t)·R_src + t·R_dst,
where R_src is the VVP of the source image, R_dst is the VVP of the target image, and t is the interpolation weighting factor ranging from 0 (source image) to 1 (target image).
According to the detected feature lines of the source image and the target image, the feature lines of the in-between images can be generated by linear interpolation:
Q_mid = (1 - t)·Q_src + t·Q_dst,
where Q_src is the end point of a feature line of the source image and Q_dst is the corresponding end point of the target image. The feature line of the in-between image can then be generated from R_mid and Q_mid.
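The same weighted blend, with the weighting factor t running from 0 (source image) to 1 (target image), is reused below for pixel positions and colors. A minimal sketch of this building block (illustrative only; the coordinates in the example are made up):

```python
def lerp(a, b, t):
    """Blend two scalars or 2-D points with weighting factor t in [0, 1]."""
    if isinstance(a, (tuple, list)):
        return tuple((1.0 - t) * ai + t * bi for ai, bi in zip(a, b))
    return (1.0 - t) * a + t * b

# Example: VVP and one feature-line end point of an in-between frame at t = 0.35.
r_mid = lerp((320.0, 180.0), (300.0, 175.0), 0.35)  # R_mid from R_src and R_dst
q_mid = lerp((0.0, 480.0), (40.0, 470.0), 0.35)     # Q_mid from Q_src and Q_dst
```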
3.1 Position Parameter
According to the VVP and the feature lines, the source image can be divided into two regions: the region enclosed by the feature lines and the polygon formed by the rest of the image, as shown in Fig. 2 (a).
In the source image, the position of any pixel P_src in either of these two regions can be represented by two parameters: the angular coordinate α_src of the pixel and the distance ρ_src from the pixel to the VVP.
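The defining formulas are not reproduced in this text; following the abstract, α is read here as the angular coordinate of a pixel about the VVP and ρ as its distance to the VVP. A minimal sketch under that reading, with the angle measured from an assumed reference direction (e.g. along one feature line):

```python
import math

def position_parameters(pixel, vvp, reference_angle=0.0):
    """Return (alpha, rho): the angular coordinate of a pixel about the VVP,
    measured from an assumed reference direction, and its distance to the VVP."""
    dx = pixel[0] - vvp[0]
    dy = pixel[1] - vvp[1]
    rho = math.hypot(dx, dy)                      # distance to the VVP
    alpha = math.atan2(dy, dx) - reference_angle  # angular coordinate
    return alpha, rho

# Example with hypothetical coordinates:
alpha_src, rho_src = position_parameters((100.0, 420.0), (320.0, 180.0))
```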
3.2 Enhanced Position Parameter
In this part, we use enhanced position parameters to match source image pixels and target image pixels.
The parameter α_dst of the target image pixels can be calculated according to the VVP and the feature lines of the target image, in the same way as α_src.
To improve pixel matching, we consider the following transformation process.
Fig. 3. Overflowed pixel of target image.
Fig. 4. Camera image-forming principle.
When the user walks forward along a street, the visual angle keeps pointing forward and the view depth changes. This gives the user the impression that the original input image is continuously zoomed in, with new details appearing at the VVP. During this "zooming-in" process, pixels move away from the VVP in a "diverging" way until they pass beyond the scope of view. These pixels need to be moved out of the image border. As shown in Fig. 3, Q_src is an original border pixel of the source image, but it overflows in the target image (Q_dst).
We evaluate the enlarging scale based on the camera image-forming principle, as shown in Fig. 4.
Lens position 1 is the position of the source image, and lens position 2 is the position of the target image. L is the real distance from the start position to the end position.
Here, three camera parameters are used: l is the focal length, α is the optical angle, and k is the projection reduction scale.
The enhanced position parameter ρ_dst is obtained by applying this enlarging scale to ρ_src.
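The exact camera-based expression of the enlarging scale (in terms of l, α, k, and L) is not reproduced in this text. As an illustration only, under a simple pinhole model with an assumed scene depth Z along the optical axis, moving the camera forward by L enlarges distances from the VVP by roughly
s = Z / (Z - L),  ρ_dst ≈ s·ρ_src,
so a source-image border pixel whose scaled distance exceeds the image border overflows in the target image, as in Fig. 3. This is a stand-in sketch, not a reproduction of the paper's formula.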
According to the enhanced parameters, the pixels of the source image and the target image have a one-to-one correspondence, that is, two pixels with the same enhanced position parameters are matched.
3.3 Linear Interpolation
According to the matched pixels P_src of the source image and P_dst of the target image, the position of the in-between pixel P_mid can be generated by linear interpolation:
P_mid = (1 - t)·P_src + t·P_dst.
3.4 Color Interpolation
We use the nearest neighbor algorithm to render unmatched pixels and use color linear interpolation [5], [6] to improve the image:
C_mid = (1 - t)·C_src + t·C_dst,
where C_src and C_dst are the colors of the matched source and target pixels.
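The rendering step can be sketched as follows: each matched pixel pair is forward-mapped to its interpolated position, its color is blended, pixels that land outside the frame drop out, and remaining holes are filled from the nearest rendered neighbor. This is a minimal NumPy illustration of that reading; the array layout and hole-filling strategy are assumptions, not the author's implementation.

```python
import numpy as np

def render_frame(src, dst, pos_src, pos_dst, t):
    """Render one in-between frame from matched pixel pairs.

    src, dst : H x W x 3 uint8 images.
    pos_src  : list of (x, y) positions of matched pixels in the source image.
    pos_dst  : list of (x, y) positions of the same pixels in the target image.
    t        : weighting factor in [0, 1].
    """
    h, w = src.shape[:2]
    frame = np.zeros_like(src)
    filled = np.zeros((h, w), dtype=bool)

    for (xs, ys), (xd, yd) in zip(pos_src, pos_dst):
        # Linear interpolation of the matched pixel position and color.
        x = int(round((1.0 - t) * xs + t * xd))
        y = int(round((1.0 - t) * ys + t * yd))
        if 0 <= x < w and 0 <= y < h:   # pixels moved beyond the border drop out
            c_src = src[int(ys), int(xs)].astype(np.float32)
            c_dst = dst[int(yd), int(xd)].astype(np.float32)
            frame[y, x] = ((1.0 - t) * c_src + t * c_dst).astype(np.uint8)
            filled[y, x] = True

    # Nearest-neighbor fill for unmatched (hole) pixels.
    fy, fx = np.nonzero(filled)
    hy, hx = np.nonzero(~filled)
    if len(fy) and len(hy):
        for yh, xh in zip(hy, hx):
            i = np.argmin((fy - yh) ** 2 + (fx - xh) ** 2)
            frame[yh, xh] = frame[fy[i], fx[i]]
    return frame
```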
We take photos of campus streets and record the real walking distance of each street. There are three inputs: 1) the source image, 2) the target image, and 3) the distance between the start position and the end position.
We test the proposed view interpolation algorithm in comparison with the traditional color interpolation algorithm.
Fig. 5 shows the result sequence of the simple color interpolation algorithm. As we can see, the interpolated images contain double images (one object, "the tree", disappears and appears at the same time in two different positions). The result is not "real" enough.
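For reference, the baseline in Fig. 5 corresponds to a plain per-pixel cross-dissolve of the two input images, which is why a moving object such as the tree fades out in one place while fading in at another. A minimal sketch, assuming two equally sized uint8 images:

```python
import numpy as np

def cross_dissolve(src, dst, t):
    """Color-only interpolation at fixed pixel positions; objects that move
    between src and dst appear doubled in the in-between frames."""
    blend = (1.0 - t) * src.astype(np.float32) + t * dst.astype(np.float32)
    return blend.astype(np.uint8)
```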
Fig. 5. Color interpolation algorithm.
Fig. 6. Result images.
Fig. 6 shows the result sequences of the view interpolation algorithm. Sequences 1 to 3 correspond to three different campus streets. The real location and length of each street correspond to the marked lines of the satellite images in Google Earth. t is the interpolation weighting factor: at the beginning of an image sequence t is 0, corresponding to the source image, and at the end t is 1, corresponding to the target image. The increment of t is 0.01. We choose interpolated images at different values of t for illustration.
In the first sequence, the selected values of t are 0, 0.35, 0.43, 0.64, 0.79, and 1. In the second sequence, they are 0, 0.35, 0.43, 0.83, 0.94, and 1. As we can see, the details in the center region come out little by little.
In the third sequence, the selected values of t are 0, 0.39, 0.45, 0.66, 0.89, and 1. As we can see, the regions on the two sides that pass beyond the scope of view are "moved" out.
By combining these two morphing effects, the result sequences present a virtual view in which the street is continuously zoomed in, with new scenery appearing at the vision-vanishing point while the two sides are passed by at the same time. That is a virtual view of walking along a street.
If the increment of the weighting factor t is small enough, the resulting image sequence becomes more coherent. This method can also be improved by combining it with other feature-based image morphing techniques to generate better image sequences.
Since the location and the real walking distance correspond to the marked lines of the satellite maps of Google Earth, in future work this method may be implemented on the client side to provide online virtual tours by maps.
In this work, the goal is to simulate a virtual walk along a street from a source image to a target image. Traditional image morphing based on simple color interpolation has the disadvantage of generating double images. Most recent image morphing techniques are based on corresponding feature primitives such as mesh nodes, line segments, curves, and points, but feature training, detection, and matching between source and target images have a high operational cost.
A view interpolation algorithm is proposed to simulate a virtual walk using only two images. We use linear interpolation to generate pixel coordinates. Enhanced position parameters are used to match the pixels of the source and target images and also to evaluate the distances between pixels and the vision-vanishing point, so that pixels beyond the scope of view are "moved" out of the image. As the result sequences show, the details in the center region come out little by little and the regions beyond the scope of view are "moved" out.
Since the enhanced position parameters are based on the real distance between the start position and the end position measured on the satellite maps of Google Earth, users can walk along a virtual street over a real distance. This algorithm is therefore an approach to online virtual tours by maps.
[1] P. Cho and N. Snavely, "3D exploitation of 2D ground-level & aerial imagery," in Proc. of IEEE Applied Imagery Pattern Recognition Workshop, Washington DC, 2011, pp. 1–8.
[2] J. Sivic, B. Kaneva, A. Torralba, S. Avidan, and W. T. Freeman, "Creating and exploring a large photorealistic virtual space," in Proc. of IEEE Computer Vision and Pattern Recognition Workshops, 2008, pp. 1391–1407.
[3] D. Rother, L. Williams, and G. Sapiro, "Super-resolution texturing for online virtual globes," in Proc. of IEEE Computer Vision and Pattern Recognition Workshops, 2008, pp. 1–8.
[4] E. Y. Kang and I. Yoon, "Smooth scene transition for virtual tour on the world wide web," in Proc. of the 6th Int. Conf. on Computational Intelligence and Multimedia Applications, Las Vegas, 2005, pp. 219–224.
[5] G. Wolberg, "Image morphing: a survey," The Visual Computer, vol. 14, no. 8–9, pp. 360–372, 1998.
[6] T. Beier and S. Neely, "Feature-based image metamorphosis," Computer Graphics, vol. 26, no. 2, pp. 35–42, 1992.
Shuang-Mei Wang was born in Sichuan, China, in 1980. She received the B.S. degree in computer science and technology from Sichuan Normal University, Chengdu, in 2002 and the M.S. degree in computer application technology from the University of Electronic Science and Technology of China, Chengdu, in 2005. She is currently a lecturer with Sichuan Normal University, Chengdu. Her research interests include computer graphics, image processing, and computer vision.
Manuscript received May 17, 2013; revised July 8, 2013.
S.-M. Wang is with the College of Fundamental Education, Sichuan Normal University, Chengdu 610101, China (e-mail: shinemewang@163.com).
Color versions of one or more of the figures in this paper are available online at http://www.intl-jest.com.
Digital Object Identifier: 10.3969/j.issn.1674-862X.2013.04.013