Research Summary

by Jun Yang
My areas of interest include multimedia information retrieval, Web searching, digital libraries, computer vision, and multimedia databases, which share a common theme: the management of multimedia data. This interest originates from the insight that, without effective access and management tools, the pervasive and expanding multimedia information will become more frustrating and less valuable to end users. Starting from the third year of my undergraduate study, I have worked in the broad area of multimedia for 4 years at 4 different research institutions: the Microsoft Visual Perception Lab of Zhejiang University; Siemens in Vienna, Austria; Microsoft Research Asia; and the Dept. of Computer Engineering and Information Technology at City University of Hong Kong. I have published over 15 refereed papers in international conferences and journals or as book chapters, and built several prototype systems. In this document, I summarize my research achievements by domain, with reference to my representative publications.
1. Multi-modal Information Retrieval
This work is motivated by the conservative “one system, one media” framework observed among existing multimedia information retrieval systems, i.e., each system can deal with only a single type of media based on a single type of knowledge (e.g., content-based image retrieval). To remedy this limitation, we advocate a “one system for all” framework by proposing multi-modal information retrieval, where the keyword “multi-modal” is defined at three levels: (1) multiple types of media data (text, images, videos, etc.) are retrieved in an integrated manner; (2) multiple sources of knowledge are explored; and (3) multiple retrieval approaches/techniques are employed. Two research projects along this direction are described below:
• Octopus – an aggressive search mechanism for multi-modal information [1]: Octopus is a mechanism for aggressive retrieval of multi-modal data (i.e., a mixture of text, images, videos, etc.) in an integrated manner. It is based on a multifaceted knowledge base constructed on a layered graph model (LGM), which describes the relevance relationships among media objects deduced from low-level features, contextual knowledge (e.g., hyperlinks), and user-system interactions. Link analysis, a technique used extensively in Web searching, is applied to explore the LGM and find media objects relevant to user queries. Furthermore, an incremental relevance feedback technique is proposed to update the knowledge base by learning from user-system interactions, thereby enhancing the retrieval performance in a “hill-climbing” manner. Octopus advocates a highly flexible retrieval scenario, where users are free to submit any media objects as query examples and receive any media objects as results. Our recent work has addressed the interface design of Octopus [15].
• CoSEEM – a cooperative search engine for multimedia in digital libraries [2,3]: As the predecessor of Octopus, CoSEEM is a retrieval framework for multimedia information in digital libraries. It focuses on the use of uniform semantic descriptions (keywords) to retrieve various types of media objects in an integrated manner. A learning-from-elements strategy is proposed to propagate and update the descriptive keywords associated with media objects, and a cross-media search mechanism is devised to search for media objects by combining their low-level features and semantic descriptions.
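As a rough illustration of how link analysis can rank media objects over a relevance graph, the sketch below runs a random walk with restart from a query object. It is not the Octopus algorithm from [1]: the layered structure of the LGM is collapsed into a single weighted graph, and the object names, edge weights, and the `restart` parameter are hypothetical values chosen only for this example.

```python
from collections import defaultdict


def random_walk_with_restart(edges, query, restart=0.3, iterations=50):
    """Score every node by its proximity to `query` in a weighted undirected graph.

    edges: iterable of (node_a, node_b, weight) triples (hypothetical relevance links).
    """
    # Build a symmetric adjacency map from the edge list.
    adjacency = defaultdict(dict)
    for a, b, w in edges:
        adjacency[a][b] = w
        adjacency[b][a] = w

    nodes = list(adjacency)
    scores = {n: (1.0 if n == query else 0.0) for n in nodes}

    for _ in range(iterations):
        updated = {n: 0.0 for n in nodes}
        for n in nodes:
            total = sum(adjacency[n].values())
            # Each node passes its score to its neighbours in proportion to edge weight.
            for neighbor, w in adjacency[n].items():
                updated[neighbor] += (1.0 - restart) * scores[n] * w / total
        # With probability `restart` the walk jumps back to the query object.
        updated[query] += restart
        scores = updated
    return scores


def retrieve(edges, query, top_k=3):
    """Return the top_k objects (of any media type) closest to the query."""
    scores = random_walk_with_restart(edges, query)
    ranked = sorted((n for n in scores if n != query),
                    key=lambda n: scores[n], reverse=True)
    return ranked[:top_k]


# Hypothetical links, each standing for one source of knowledge.
EDGES = [
    ("image:tiger", "text:tiger_article", 0.9),   # contextual knowledge (hyperlink)
    ("image:tiger", "image:leopard", 0.6),        # low-level feature similarity
    ("text:tiger_article", "video:safari", 0.7),  # user-system interaction
    ("image:car", "video:race", 0.8),
    ("video:safari", "image:car", 0.1),
]
```

Calling `retrieve(EDGES, "image:tiger")` returns a ranked list that mixes text, image, and video objects, which is the "any media in, any media out" scenario in miniature.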
2. Semantics- and Content-based Image Retrieval
Due to the “semantic gulf” between low-level features and high-level user queries, Content-based Image Retrieval (CBIR) is still of limited practicability in general settings, while its killer applications in specific domains are yet to be found. In view of this, my research on CBIR emphasizes exploring the role of human-computer interaction to achieve semantics-based and personalized image retrieval. The specific techniques/devices that I have investigated for this general goal include a lexical thesaurus [4,5], user profiling [6], a graphic-theoretic model [7], and “peer indexing” [8,9], as summarized below:
• Thesaurus-aided image retrieval and browsing [4,5]: This approach explores the power of a lexical thesaurus (specifically, WordNet) to support fuzzy matching in keyword-based image retrieval. By examining the similarity between different keywords, our approach is able to match a keyword query with images annotated by different but relevant keywords (e.g., matching a query of “animal” with images annotated with “tiger”), which is not supported by simple keyword matching. Although WordNet has been used extensively in IR, our approach is unique in combining semantic keywords and low-level image features for image retrieval. Moreover, we propose a dynamic semantic hierarchy, which can be automatically constructed from WordNet to support image navigation by semantic subjects.
• Personalized image retrieval based on user profiling [6]: The objective of this work is personalized image retrieval based on a synergy of relevance feedback techniques and information filtering/recommendation techniques. Specifically, a “common profile” and a set of “user profiles” are constructed from user feedback to model the common knowledge and the personal views of individual users, respectively. Our profile-based image retrieval approach enables “learning from others” by exploring the common profile, as well as “learning from history” by exploring user profiles. Therefore, the retrieval results generated by our approach strike a balance between matching the common sense of the entire user community and catering for the personal interests of each individual user.
• A graphic-theoretic model for image retrieval [7]: In an attempt to remedy the limitation of traditional “non-memory” relevance feedback techniques, we have proposed a graphic-theoretic model for incremental relevance feedback in image retrieval. A two-layered graph model is introduced to memorize the semantic correlations (among images) progressively derived from user feedback, and a link analysis technique is adopted to explore the graph model for image retrieval. This approach outperforms traditional approaches in both short-term (intra-session) and long-term (inter-session) performance.
• Data and user-adaptive image retrieval based on “peer indexing” [8]: Peer indexing is based on an intuitive idea: indexing an image by its semantically related peer images. The peer index of an image, as a list of weighted peer images, can be acquired from user feedback by a suggested learning strategy. Due to the analogy between a keyword and a peer image as a “visual keyword”, mature techniques in the IR area (e.g., the TF/IDF weighting scheme and the cosine similarity metric) are applied to image retrieval based on peer indexing in cooperation with low-level image features. Our recent work along this direction has focused on data and user-adaptive image retrieval [9] by applying two-level peer indexing.
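To make the "visual keyword" analogy concrete, the following sketch treats each image's peer index as a bag of weighted terms and compares images with TF/IDF weighting and cosine similarity. It is a minimal illustration under stated assumptions, not the method of [8,9]: the image names and weights are made up, the particular IDF formula (`log(N/df) + 1`) is one common variant chosen here, and the combination with low-level image features is left out.

```python
import math


def idf_weights(peer_indexes):
    """Inverse document frequency of each peer image, treated like a keyword."""
    n_images = len(peer_indexes)
    doc_freq = {}
    for peers in peer_indexes.values():
        for peer in peers:
            doc_freq[peer] = doc_freq.get(peer, 0) + 1
    # Assumed IDF variant; peers that index many images get lower weight.
    return {peer: math.log(n_images / df) + 1.0 for peer, df in doc_freq.items()}


def tfidf_vector(peers, idf):
    """Scale the stored peer weights (the 'term frequencies') by IDF."""
    return {peer: weight * idf[peer] for peer, weight in peers.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_peer_index(peer_indexes, query_image):
    """Rank all other images by the similarity of their peer indexes to the query's."""
    idf = idf_weights(peer_indexes)
    vectors = {img: tfidf_vector(peers, idf) for img, peers in peer_indexes.items()}
    query_vec = vectors[query_image]
    others = [img for img in vectors if img != query_image]
    return sorted(others, key=lambda img: cosine(query_vec, vectors[img]), reverse=True)


# Hypothetical peer indexes: image -> {peer image: weight learned from feedback}.
PEER_INDEXES = {
    "tiger_1": {"tiger_2": 3.0, "leopard_1": 1.0},
    "tiger_3": {"tiger_2": 2.0, "leopard_1": 2.0},
    "car_1": {"car_2": 4.0},
    "car_3": {"car_2": 1.0, "tiger_2": 0.5},
}
```

With this toy data, `rank_by_peer_index(PEER_INDEXES, "tiger_1")` places the image that shares both peers first and the image that shares none last.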
3. Vector-based Media (Flash™) Management
Recent years have witnessed the phenomenal growth of Flash, a new format of vector-based animation set forth by Macromedia Inc., which has over 440 million viewers worldwide. This remarkable popularity justifies the need to investigate the management issues of Flash, which are critical to the better utilization of the enormous Flash resource but unfortunately overlooked by the research community. We therefore propose FLAME [10,11], namely FLash Access and Management Environment, which covers a variety of management issues of Flash animations.
Currently, FLAME consists of three functional components: (1) a content-based retrieval component, which addresses the indexing, retrieval, and query specification of Flash animations by exploring their content characteristics in terms of embedded media ingredients, spatio-temporal features, and user interactions; (2) a classification component, which automatically classifies Flash animations into pre-defined categories, such as MTV, commercial advertisement, cartoon, and e-postcard, based on their content characteristics; and (3) a segmentation component, which partitions long Flash animations into shot/scene structures defined similarly to their counterparts in video segmentation. Further issues to be explored under FLAME include a Flash search engine, copyright protection, and sample-based Flash authoring.
4. Multimedia Database
My primary goal in this area is to apply database technology to address the efficiency and scalability problems that plague data-intensive multimedia information systems. One specific problem is the “semantic gap” between semantics-intensive multimedia applications and conventional databases, which are inadequate to model the context-dependent semantics of multimedia data. We have proposed MediaView [12,13] as an extended object-oriented view mechanism to bridge this semantic gap. Specifically, this mechanism captures the dynamic semantics of multimedia using a modeling construct named media view, which formulates a customized context where heterogeneous media objects with related semantics are characterized by semantic properties and relationships.
Another proposal is a self-adaptive semantic schema mechanism (SSM) for multimedia databases [14]. The SSM is implemented on an object-oriented data model, in which classes are organized into a semantic hierarchy. As its unique feature, SSM supports adaptive evolution of a schema in the form of expansion with new classes and/or compaction by removing inefficient classes, when the conditions of predefined ECA-rules are satisfied. This self-adaptive evolution strategy allows a data schema to be automatically optimized for each particular multimedia application (especially multimedia retrieval systems), thereby achieving a dynamic, application-specific balance between modeling capability and efficiency.
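The event-condition-action pattern behind such schema evolution can be sketched as follows. This is only a toy under loud assumptions: the `hits` counter, the `"session_end"` event name, the `min_hits` threshold, and the rule itself are hypothetical, and the actual ECA-rules and efficiency criteria of SSM are those defined in [14], not the ones shown here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class SemanticClass:
    name: str
    parent: Optional[str] = None
    hits: int = 0  # assumed usage counter, e.g. queries answered through this class


@dataclass
class Schema:
    classes: Dict[str, SemanticClass] = field(default_factory=dict)

    def add(self, name: str, parent: Optional[str] = None, hits: int = 0) -> None:
        self.classes[name] = SemanticClass(name, parent, hits)

    def remove(self, name: str) -> None:
        # Re-attach children to the removed class's parent so the hierarchy stays connected.
        parent = self.classes[name].parent
        for cls in self.classes.values():
            if cls.parent == name:
                cls.parent = parent
        del self.classes[name]


@dataclass
class EcaRule:
    event: str                                  # E: name of the triggering event
    condition: Callable[[Schema], List[str]]    # C: returns the classes the rule applies to
    action: Callable[[Schema, str], None]       # A: what to do with each such class


def fire(schema: Schema, rules: List[EcaRule], event: str) -> None:
    """Run every rule registered for `event` whose condition selects some classes."""
    for rule in rules:
        if rule.event == event:
            for name in rule.condition(schema):
                rule.action(schema, name)


def underused_classes(schema: Schema, min_hits: int = 2) -> List[str]:
    """Hypothetical condition: non-root classes used fewer than min_hits times."""
    return [c.name for c in schema.classes.values()
            if c.parent is not None and c.hits < min_hits]


# Compaction rule: at the end of a session, drop classes that were rarely used.
COMPACTION_RULE = EcaRule("session_end", underused_classes, Schema.remove)

schema = Schema()
schema.add("media", hits=10)
schema.add("animal", parent="media", hits=5)
schema.add("tiger", parent="animal", hits=0)
schema.add("cub", parent="tiger", hits=3)
fire(schema, [COMPACTION_RULE], "session_end")
```

After the rule fires, the rarely used `tiger` class is removed and its child `cub` is re-attached to `animal`. An expansion rule would follow the same shape, with an action that calls `Schema.add`.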
5. Video-based Human Animation
To overcome the shortcomings of conventional human animation techniques, we have proposed a video-based human animation approach [16]. Given a video clip containing human motion, we first recognize and track the human joints across a sequence of video frames with the aid of a Kalman filter and morph-block matching. From the recognized human joints, we construct the corresponding 3-D human motion skeleton sequence under perspective projection, using camera calibration techniques and knowledge of human anatomy. Finally, a motion library is established by annotating multiform motion attributes, which can be browsed and queried by animators. This approach has the advantages of rich source materials, low computational cost, efficient production, and realistic animation results.
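The role a Kalman filter plays in joint tracking can be illustrated with a small constant-velocity filter over a single image coordinate of a single joint. This is a simplified sketch, not the tracker of [16]: the motion model, the noise variances, and the assumption that each frame supplies one position measurement (for example from block matching) are choices made only for this example.

```python
def track_joint(measurements, dt=1.0, process_var=1.0, measurement_var=1.0):
    """Constant-velocity Kalman filter over one coordinate of one joint.

    measurements: per-frame measured positions (assumed to come from a matcher).
    Returns a list of (position, velocity) estimates, one per measurement.
    """
    pos, vel = float(measurements[0]), 0.0
    # Entries of the symmetric 2x2 state covariance P (p01 == p10).
    p00, p01, p11 = measurement_var, 0.0, 1.0
    estimates = [(pos, vel)]

    for z in measurements[1:]:
        # Predict: x = F x and P = F P F^T + Q,
        # with F = [[1, dt], [0, 1]] and Q = process_var * I (an assumed noise model).
        pos += dt * vel
        p00 = p00 + 2.0 * dt * p01 + dt * dt * p11 + process_var
        p01 = p01 + dt * p11
        p11 = p11 + process_var

        # Update with the measured position z, using H = [1, 0].
        innovation = z - pos
        s = p00 + measurement_var          # innovation variance
        k0, k1 = p00 / s, p01 / s          # Kalman gain for position and velocity
        pos += k0 * innovation
        vel += k1 * innovation
        # P = (I - K H) P, written out entry by entry (order matters here).
        p11 -= k1 * p01
        p01 *= 1.0 - k0
        p00 *= 1.0 - k0

        estimates.append((pos, vel))
    return estimates
```

In use, the predicted position for the next frame (`pos + dt * vel`) can narrow the search window for the matcher, and the filtered positions give a smoother joint trajectory than the raw measurements.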
6. Video Segmentation
Segmentation of video clips serves as the basis of video indexing and retrieval. We have developed a prototype system for parsing video clips, especially news videos, into a sequence of shots and scenes. The shot boundaries are detected by examining the difference between the color histograms of consecutive frames using the “twin-comparison” algorithm, which is robust in detecting gradual transitions (zoom, fade in/out, dissolve, etc.). In particular, for news videos with a prior model of the temporal video structure, we group the segmented shots into higher-level units such as news stories, weather forecasts, and commercials.
References:
1. Jun Yang, Qing Li, Yueting Zhuang, “Octopus: Aggressive Search of Multi-Modality Data Using Multifaceted Knowledge Base”, Proc. of 11th Int'l Conf. on World Wide Web, pp. 54-64, Hawaii, USA, May 2002.
2. Jun Yang, Yueting Zhuang, Qing Li, “Search for Multi-Modality Data in Digital Libraries”, Proc. of 2nd IEEE Pacific-Rim Conf. on Multimedia, pp. 482-489, Beijing, China, 2001.
3. Jun Yang, Yueting Zhuang, Qing Li, “Multi-Modal Retrieval for Multimedia Digital Libraries: Issues, Architecture, and Mechanisms”, Proc. of Int'l Workshop on Multimedia Information Systems, pp. 81-88, Capri, Italy, 2001.
4. Jun Yang, Liu Wenyin, Hongjiang Zhang, Yueting Zhuang, “Thesaurus-aided Approach for Image Retrieval and Browsing”, Proc. of 2nd IEEE Int'l Conf. on Multimedia and Expo, pp. 313-316, Tokyo, Japan, 2001.
5. Jun Yang, Liu Wenyin, Hongjiang Zhang, Yueting Zhuang, “An Approach to Semantics-based Image Retrieval and Browsing”, Proc. of 7th Int'l Conference on Distributed Multimedia Systems, Taiwan, 2001.
6. Qing Li, Jun Yang, Yueting Zhuang, “Web-based Multimedia Retrieval: Balancing out between Common Knowledge and Personalized Views”, Proc. of 2nd Int'l Conf. on Web Information System Engineering, pp. 92-101, Kyoto, Japan, 2001.
7. Yueting Zhuang, Jun Yang, Qing Li, “A Graphic-Theoretic Model for Incremental Relevance Feedback in Image Retrieval”, Proc. of 2002 Int'l Conf. on Image Processing, New York, Sep. 2002.
8. Jun Yang, Qing Li, Yueting Zhuang, “Image Retrieval and Relevance Feedback using Peer Indexing”, Proc. of 2002 IEEE Int'l Conf. on Multimedia and Expo, Lausanne, Switzerland, Aug. 2002.
9. Jun Yang, Qing Li, Yueting Zhuang, “Modeling Data and User Characteristics by Peer Indexing in Content-based Image Retrieval”, The 9th Int'l Conf. on Multimedia Modeling, Taiwan, 2003. (accepted)
10. Jun Yang, Qing Li, Liu Wenyin, Yueting Zhuang, “Search for Flash Movies on the Web”, Proc. of the 3rd Int'l Conf. on Web Information Systems Engineering, workshop on Mining for Enhanced Web Search, Singapore, 2002.
11. Jun Yang, Qing Li, Liu Wenyin, Yueting Zhuang, “FLAME: A Generic Framework for Content-based Flash Retrieval”, Proc. of the 4th Int'l Workshop on Multimedia Information Retrieval, in conjunction with ACM Multimedia 2002, Juan-les-Pins, France, 2002.
12. Qing Li, Jun Yang, Yueting Zhuang, “MediaView: A Semantic View Mechanism for Multimedia Modeling”, Proc. of the 3rd IEEE Pacific-Rim Conf. on Multimedia, Taiwan, Dec. 2002. (accepted)
13. Qing Li, Jun Yang, Yueting Zhuang, “Chapter 9: A Semantic Data Modeling Mechanism for Multimedia Databases”, in Multimedia Information Retrieval and Management, edited by Hong-jiang Zhang et al.
14. Jun Yang, Qing Li, Yueting Zhuang, “A Self-adaptive Semantic Schema Mechanism for Multimedia Databases”, SPIE Photonics Asia: Electronic Imaging and Multimedia Technology III, pp. 69-79, Proc. vol. 4926, Shanghai, China, Oct. 2000.
15. Jun Yang, Qing Li, Yueting Zhuang, “A Multimodal Information Retrieval System: Mechanism and Interface”, IEEE Trans. on Multimedia (submitted).
16. Zhuang Yueting, Liu Xiaoming, Pan Yunhe, Yang Jun, “Human Three Dimension Motion Skeleton Reconstruction of Motion Image Sequence”, Journal of Computer-aided Design & Computer Graphics, 12(4), 245-251, 2002. (in Chinese)
