Multi-Instance Learning_ A Survey

更新时间:2023-07-10 11:32:45 阅读：评论：0

Multi-Instance Learning:A Survey

Zhi-Hua Zhou

National Laboratory for Novel Software Technology,

Nanjing University,Nanjing210093,China

Abstract In multi-instance learning,the training t compris labeled bags that are compod of unlabeled instances,and the task is to predict the labels of unen bags.This paper provides a survey on this topic.Atﬁrst,it introduces the origin of multi-instance learning.Then,developments on the study of learnability,learning algorithms,applications and extensions of multi-instance learning are reviewed.In particular,this paper employs a uniﬁed view to look into multi-instance learning algorithms.Some important issues to be addresd are also discusd.

Keywords Learning from Examples;Multi-Instance Learning;Supervid Learn-ing企业管理制度汇编

幼儿园安全工作总结Corresponding author:

Zhi-Hua Zhou

Mail:National Laboratory for Novel Software Technology,

Nanjing University,Mailbox419,

Hankou Road22,Nanjing210093,China

Tel.:+86-25-8368-6268

Fax:+86-25-8368-6268

E-mail:zhouzh@nju.edu

Multi-Instance Learning:A Survey

Abstract

In multi-instance learning,the training t compris labeled bags that are compod of unlabeled instances,and the task is to predict the labels of unen bags.This paper provides a survey on this topic.Atﬁrst,it introduces the origin of multi-instance learning. Then,developments on the study of learnability,learning algorithms,applications and extensions of multi-instance learning are reviewed.In

particular,this paper employs a uniﬁed view to look into multi-instance learning algorithms.Some important issues to be addresd are also discusd.

Key words:Learning from Examples;Multi-Instance Learning;Supervid Learning

1Introduction

During the past years,learning from examples is one of the mostﬂourishing ar-eas in machine learning.According to the ambiguity of training data,rearch in this area can be roughly categorized into three learning super-vid learning,unsupervid learning,and reinforcement learning[16].Supervid learning attempts to learn a concept for correctly labeling unen instances,where the training instances are with known labels and therefore the ambiguity is the minimum.Unsupervid learning attempts to learn the structure of the underly-ing sources of instances,where the training instances are without known labels and therefore the ambiguity is the maximum.Reinforcement learning attempts to learn a mapping from states to actions,where the instances are with no labels but with delayed rewards that could be viewed as delayed labels,therefore the ambiguity is between that of supervid learning and unsupervid learning.

The term multi-instance learning was coined by Dietterich et al.[10]when they were investigating the

problem of drug activity prediction.In multi-instance learning,

金秋野the training t is compod of many bags each contains many instances.A bag is positively labeled if it contains at least one positive instance;otherwi it is labeled as a negative bag.The task is to learn some concept from the training t for correctly labeling unen bags.

贝多芬生平简介

Diﬀerent to supervid learning where all training instances are with known labels, in multi-instance learning the labels of the training instances are unknown;diﬀerent to unsupervid learning where all training instances are without known labels, in multi-instance learning the labels of the training bags are known;diﬀerent to reinforcement learning where the labels of the training instances are delayed,in multi-instance learning there is no any delay.It has been shown that learning algorithms ignoring the characteristics of multi-instance problems,such as popular decision trees and neural networks,could not work well in this scenario[10].陈学冬身高

Since multi-instance problems extensively exist but are unique to tho addresd by previous learning frameworks,multi-instance learning was regarded as a new learning framework[16],and has attracted much attention of the machine learning community.Since many developments on multi-instance learning have been made, it ems uful to have a survey on this topic,which is the purpo of this paper.

The rest of this paper is organized as follows.Section2introduces the origin of multi-instance learning.Section3prents the analys on the learnability of multi-instance learning.Section4looks into multi-instance learning algorithms with a supervid view.Section5and Section6prents applications and extensions of multi-instance learning,respectively.Finally,Section7discuss on veral issues to be addresd in thisﬁeld.

2Drug Activity Prediction

惹事生非In the middle of1990s,Dietterich et al.[10]investigated the problem of drug activity prediction.The goal was to endow learning systems with the ability of predicting that whether a new molecule was qualiﬁed to make some drug,through

analyzing a collection of known molecules.

Most drugs are small molecules working by binding to larger protein molecules such as enzymes and cell-surface receptors.For molecules qualiﬁed to make a drug,one of its low-energy shapes could tightly bind to the target area;while for molecules unqualiﬁed to make a drug,none of its low-energy shapes could tightly bind to the target area.The main diﬃculty of drug activity prediction lies in that each molecule could have many alternative low-energy shapes,as illustrated in Fig.1, but curr

ently biochemists only know that whether a molecule is qualiﬁed to make a drug or not,instead of knowing that which of its alternative low-energy shapes respons for the qualiﬁcation.

Fig.1.The shape of a molecule changes as it rotates an internal bond

An intuitive solution is to utilize supervid learning algorithms by regarding all the low-energy shapes of the‘good’molecules as positive training instances,while regarding all the low-energy shapes of the‘bad’molecules as negative training instances.However,as shown by Dietterich et al.[10],such a method can hardly work due to high fal positive noi,which is caud by that a‘good’molecule may have hundreds of low-energy shapes but maybe only one of them is really a ‘good’shape.

In order to solve this problem,Dietterich et al.[10]regarded each molecule as a bag, and regarded the

alternative low-energy shapes of the molecule as the instances in the bag,thereby formulated multi-instance learning.In order to reprent the shapes,the molecule was placed in a standard position and orientation and then a t of162rays emanating from the origin was constructed so that the molecular surface was sampled approximately uniformly,as illustrated in Fig.2.There were also four features that reprented the position of an oxygen atom on the molecular

北京大学医学部研究生招生网

surface.Therefore each instance in the bags was reprented by a166-dimensional numerical feature vector.

贬义词成语

Fig.2.The ray-bad reprentation of the molecular shape

Dietterich et al.[10]propod three axis-parallel rectangle(abbreviated as APR)al-gorithms to solve the drug activity prediction problem,which attempts to arch for appropriate axis-parallel rectangles constructed by the conjunction of the features. The GFS elim-count APR algorithm identiﬁes an APR covering all the instances from positive bags atﬁrst.Then,it gradually shrinks the APR through greedily eliminating instances from negative bags.Aﬁgure from Dietterich et al.’s paper [10]well illustrated this process,as shown in Fig.3,where the white and dark points reprented instances from positive and negative bags,respectively,diﬀer-ent shapes reprented diﬀerent bags,and the initial APR was indicated with solid line.For each instance from native bags,the algorithm counts the minimum num-ber of instances from positive bags that must be excluded from the APR in order to exclude the concerned instance from negative bag.In Fig.3,the counts were shown next to small lines that indicated which side of the APR should be shrunk in order to exclude the instance from negative bag.The algorithm iteratively choos to eliminate the instance from negative bag that is easiest to eliminate until all such instances are eliminated.The resulting APR was indicated with the dash line in Fig.3.Then,the algorithm determines the bounds of relevant features with greedy feature lection,and therefore obtains theﬁnal APR.

The diﬀerence between the GFS kde APR algorithm and the GFS elim-count APR algorithm mainly lies in the fact that the former does not merely counting the

Fig.3.An illustration of the GFS elim-count APR algorithm

number of instances from positive bags that must be excluded in order to exclude an instance from negative bag.Instead,GFS kde APR considers the number of instances from diﬀerent positive bags that are covered by the initial APR,and us a cost function to control the process of eliminating instances from negative bags so that each positive bag rerves at least one instance in the APR.The Iterated-discrim APR algorithm employs greedy backﬁtting algorithm to identify an APR that covers at least one instance from each positive bag.Then,it utilizes this APR to lect the most discriminating features.Finally,kernel density estimation is exploited to help improve the generalizatio

n through expanding the bounds of the APR so that with high probability,new instances from positive bags will fall inside the APR.

Dietterich et al.’s experiments[10]showed that the Iterated-discrim APR algorithm achieved the best result on the Musk data[6],which is a concrete test data of drug activity prediction and the most popularly ud benchmark in multi-instance learning.It is worth mentioning that Dietterich et al.[10]indicated that since the Iterated-discrim APR algorithm had been geared to the Musk data,its performance might be the upper bound on this data t.

Note that multi-instance problems do not emerge suddenly from drug activity prediction.In fact,they extensively exist in real-world applications[14][32].But unfortunately,the uniqueness of the problems have not been particularly distin-guished until Dietterich et al.[10].

本文发布于:2023-07-10 11:32:45，感谢您对本站的认可！

本文链接：https://www.wtabcd.cn/fanwen/fan/89/1075577.html

上一篇：大学英语四级分类模拟题350_真题-无答案

下一篇：专业英语四级完型填空语法与词汇-3_真题-无答案

标签：管理制度总结简介

留言与评论（共有 0 条评论）