贝叶斯⽹络结构学习之MCMC算法(基于FullBNT-1.0.4的
MATLAB实现)
题⽬:贝叶斯⽹络结构学习之MCMC算法(基于FullBNT-1.0.4的MATLAB实现)
有关贝叶斯⽹络结构学习的⼀基本概念可以参考:
有关函数输⼊输出参数的解释可以参考:
本篇所基于的马尔可夫链蒙特卡罗(Markov Chain Monte Carlo, MCMC)⽅法是⼀个通⽤的框架,其代表性的算法是Metropolis-Hastings(MH)算法,MH算法的特例Gibbs采样算法应⽤也很⼴泛。本⽂不具体探讨MCMC的细节,若想了解更多有关MCMC算法内容推荐参考【Bishop C M. Pattern Recognition and Machine Learning (Information Science and Statistics)[M]. Springer-Verlag New York, Inc. 2006.】(简称PRML)的11.2节(MarkovChain Monte Carlo),中⽂资料参考【周志华. 机器学习[M]. 清华⼤学出版社,2016.】(简称西⽠书)的14.5.1节(MCMC采样),Gibbs采样算法的细节可以参考PRML的11.3节(Gibbs Sampling),中⽂资料参考西⽠书的7.5.3节(推断),本⽂附录中会简要给出⼏个查阅⽂献时的关键点,但若想真正弄懂还须阅读推荐的⽂献资料。
将⼯具箱添加到matlab中的步骤参见上⼀篇《》的附录2,若仅想使⽤本⽂介绍的算法学得贝叶斯⽹络结
构,可以在添加⼯具箱到Matlab中后,调⽤本⽂后⾯部分个⼈进⾏⼆次封装的函数:learn_struct_mcmc_basic.m,此函数仅需输⼊数据集,输出即为学得的⽹络结构。但是,建议⼤致浏览⼀遍全⽂。
据⽂献【胡春玲. 贝叶斯⽹络结构学习及其应⽤研究[D]. 合肥⼯业⼤学, 2011.】所述:
将MH⽅法引⼊贝叶斯⽹络结构学习的⽂献有如下⼏篇:
[45] Friedman N, Koller D. Being Bayesianabout network structure. A Bayesian approach to structure discovery in Bayesiannetworks[J]. Machine learning, 2003, 50(1-2): 95-125.
[62] Laskey K B, Myers J W. Population markovchain monte carlo[J]. Machine Learning, 2003, 50(1): 175-196.
[101] Madigan D, York J, Allard D. Bayesiangraphical models for discrete data[J]. International Statistical
Review/RevueInternationale de Statistique, 1995: 215-232.
[102] Giudici P, Castelo R. ImprovingMarkov chain Monte Carlo model arch for data mining[J]. Machine learning,2003,
50(1-2): 127-158.
从页码可以看出,这⼏篇⽂献篇幅都⽐较长,就不去细究了。接下来重点说说基于⼯具箱FullBNT-1.0.4的实现。
函数⽂件路径:\FullBNT-1.0.4\BNT\learning\learn_struct_mcmc.m
我收获了勇气例⼦⽂件路径:\FullBNT-1.0.4\BNT\examples\static\StructLearn\mcmc1.m
learn_struct_mcmc.m的代码如下:
function [sampled_graphs, accept_ratio, timing] = learn_struct_mcmc(data, ns, varargin)
% MY_LEARN_STRUCT_MCMC Monte Carlo Markov Chain arch over DAGs assuming fully obrved data
% [sampled_graphs, accept_ratio, num_edges] = learn_struct_mcmc(data, ns, ...)
%
% data(i,m) is the value of node i in ca m.
% ns(i) is the number of discrete values node i can take on.
%
% sampled_graphs{m} is the m'th sampled graph.
% accept_ratio(t) = acceptance ratio at iteration t
% accept_ratio(t) = acceptance ratio at iteration t
% num_edges(t) = number of edges in model at iteration t
%
% The following optional arguments can be specified in the form of name/value pairs:
% [default value in brackets]
%
% scoring_fn - 'bayesian' or 'bic' [ 'bayesian' ]
% Currently, only networks with all tabular nodes support Bayesian scoring.
% type - type{i} is the type of CPD to u for node i, where the type is a string
% of the form 'tabular', 'noisy_or', 'gaussian', etc. [ all cells contain 'tabular' ]
% params - params{i} contains optional arguments pasd to the CPD constructor for node i, % or [] if none. [ all cells contain {'prior', 1}, meaning u uniform Dirichlet priors ]
% discrete - the list of discrete nodes [ 1:N ]
% clamped - clamped(i,m) = 1 if node i is clamped in ca m [ zeros(N, ncas) ]
% nsamples - number of samples to draw from the chain after burn-in [ 100*N ]
% burnin - number of steps to take before drawing samples [ 5*N ]
% init_dag - starting point for the arch [ zeros(N,N) ]
%
% e.g., samples = my_learn_struct_mcmc(data, ns, 'nsamples', 1000);
%
% Modified by Sonia Leach (SML) 2/4/02, 9/5/03
tic
[n ncas] = size(data);
% t default params
type = cell(1,n);
params = cell(1,n);
for i=1:n
type{i} = 'tabular';
%params{i} = { 'prior', 1};
params{i} = { 'prior_type', 'dirichlet', 'dirichlet_weight', 1 };
end
scoring_fn = 'bayesian';
discrete = 1:n;
clamped = zeros(n, ncas);
nsamples = 100*n;
burnin = 5*n;
dag = zeros(n);
lML = [];
args = varargin;docent
nargs = length(args);
for i=1:2:nargs
switch args{i},
ca 'nsamples', nsamples = args{i+1};
ca 'burnin', burnin = args{i+1};
ca 'init_dag', dag = args{i+1};
ca 'scoring_fn', scoring_fn = args{i+1};
ca 'type', type = args{i+1};
ca 'discrete', discrete = args{i+1};
ca 'clamped', clamped = args{i+1};
ca 'lML', lML = args{i+1};
ca 'params', if impty(args{i+1}), params = cell(1,n); el params = args{i+1}; end
end
end
% We implement the fast acyclicity check described by P. Giudici and R. Castelo,
% "Improving MCMC model arch for data mining", submitted to J. Machine Learning, 2001.
% SML: also keep descendant matrix C
u_giudici = 1;
if u_giudici
[nbrs, ops, nodes, A] = mk_nbrs_of_digraph(dag);
[nbrs, ops, nodes, A] = mk_nbrs_of_digraph(dag);
el
[nbrs, ops, nodes] = mk_nbrs_of_dag(dag);
A = [];
end
num_accepts = 1;
num_rejects = 1;
T = burnin + nsamples;
accept_ratio = zeros(1, T);
num_edges = zeros(1, T);
sampled_graphs = cell(1, nsamples);
timing = zeros(1,T-burnin);
%sampled_bitv = zeros(nsamples, n^2);
for t=1:T
天高地厚
[dag, nbrs, ops, nodes, A, accept] = take_step(dag, nbrs, ops, ...
nodes, ns, data, clamped, A, ...
scoring_fn, discrete, type, params, lML );
num_edges(t) = sum(dag(:));
num_accepts = num_accepts + accept;
num_rejects = num_rejects + (1-accept);
accept_ratio(t) = num_accepts/(num_rejects+num_accepts);
if t > burnin
sampled_graphs{t-burnin} = dag;
名人传主要内容%sampled_bitv(t-burnin, :) = dag(:)';
timing(t-burnin) = toc;
end
if mod(t, 1000)==0
fprintf('%i/%i %f\n', t, T, accept_ratio(t));
end
end
法国旅游攻略%%%%%%%%%
function [new_dag, new_nbrs, new_ops, new_nodes, A, accept] = ...
take_step(dag, nbrs, ops, nodes, ns, data, clamped, A, ...
scoring_fn, discrete, type, params, lML, prior_w )
u_giudici = ~impty(A);
if u_giudici
[new_dag, op, i, j, new_A] = pick_digraph_nbr(dag, nbrs, ops, nodes,A); % updates A
[new_nbrs, new_ops, new_nodes] = mk_nbrs_of_digraph(new_dag, new_A);
el
d = sample_discrete(normali(ones(1, length(nbrs))));
new_dag = nbrs{d};
op = ops{d};
i = nodes(d, 1); j = nodes(d, 2);
[new_nbrs, new_ops, new_nodes] = mk_nbrs_of_dag(new_dag);
end
bf = bayes_factor(dag, new_dag, op, i, j, ns, data, clamped, scoring_fn, discrete, type, params, lML);
%R = bf * (new_prior / prior) * (length(nbrs) / length(new_nbrs));
R = bf * (length(nbrs) / length(new_nbrs));
u = rand(1,1);
if u > min(1,R) % reject the move
accept = 0;
new_dag = dag;
new_nbrs = nbrs;
new_nbrs = nbrs;
new_ops = ops;
new_nodes = nodes;
el
accept = 1;
if u_giudici
A = new_A; % new_A already updated in pick_digraph_nbr
end
end
%%%%%%%%%
function bfactor = bayes_factor(old_dag, new_dag, op, i, j, ns, data, clamped, scoring_fn, discrete, type, params, lML)
if impty(lML)
u = find(clamped(j,:)==0);
LLnew = score_family(j, parents(new_dag, j), type{j}, scoring_fn, ns, discrete, data(:,u), params{j});
LLold = score_family(j, parents(old_dag, j), type{j}, scoring_fn, ns, discrete, data(:,u), params{j});
el
paNew = find(new_dag(:,j));
paOld = find(old_dag(:,j));
kNew = sum(2.^(paNew-1));
kOld = sum(2.^(paOld-1));
LLnew = lML(j, kNew+1);
LLold = lML(j, kOld+1);
end
bf1 = exp(LLnew - LLold);
if strcmp(op, 'rev') % must also multiply in the changes to i's family
if impty(lML)
u = find(clamped(i,:)==0);
LLnew = score_family(i, parents(new_dag, i), type{i}, scoring_fn, ns, discrete, data(:,u), params{i});
LLold = score_family(i, parents(old_dag, i), type{i}, scoring_fn, ns, discrete, data(:,u), params{i});
el
paNew = find(new_dag(:,i));
paOld = find(old_dag(:,i));
kNew = sum(2.^(paNew-1));
kOld = sum(2.^(paOld-1));
LLnew = lML(i, kNew+1);
LLold = lML(i, kOld+1);
end
bf2 = exp(LLnew - LLold);
el
bf2 = 1;
end
bfactor = bf1 * bf2;
%%%%%%%% Giudici stuff follows %%%%%%%%%%
% SML: This now updates A as it goes from digraph it chos
function [new_dag, op, i, j, new_A] = pick_digraph_nbr(dag, digraph_nbrs, ops, nodes, A)
d = sample_discrete(normali(ones(1, length(digraph_nbrs))));
%d = myunidrnd(length(digraph_nbrs),1,1);
i = nodes(d, 1); j = nodes(d, 2);
new_dag = digraph_nbrs(:,:,d);
op = ops{d};
op = ops{d};
new_A = update_ancestor_matrix(A, op, i, j, new_dag); %%%%%%%%%%%%%%
function A = update_ancestor_matrix(A, op, i, j, dag)
switch op
ca 'add',
A = do_addition(A, op, i, j, dag);公司会计
ca 'del',
A = do_removal(A, op, i, j, dag);
ca 'rev',
A = do_removal(A, op, i, j, dag);
A = do_addition(A, op, j, i, dag);
end
%%%%%%%%%%%%
function A = do_addition(A, op, i, j, dag)
A(j,i) = 1; % i is an ancestor of j
anci = find(A(i,:));
if ~impty(anci)
A(j,anci) = 1; % all of i's ancestors are added to Anc(j)
《红岩》读后感
end
ancj = find(A(j,:));
descj = find(A(:,j));
if ~impty(ancj)
for k=descj(:)'
A(k,ancj) = 1; % all of j's ancestors are added to each descendant of j
end
end
%%%%%%%%%%%
function A = do_removal(A, op, i, j, dag)
% find all the descendants of j, and put them in topological order
% SML: originally Kevin had the next line commented and the %* lines
% being ud but I think this is equivalent and much less expensive
% I assume he put it there for debugging and never changed ? descj = find(A(:,j));
%* R = reachability_graph(dag);
%* descj = find(R(j,:));
order = topological_sort(dag);
% SML: originally Kevin ud the %* line but this was extracting the
% wrong things to sort
%* descj_topnum = order(descj);
[junk, perm] = sort(order); %SML:node i is perm(i)-TH in order
descj_topnum = perm(descj); %SML:descj(i) is descj_topnum(i)-th in order
% SML: now re-sort descj by rank in descj_topnum
生姜的保存方法[junk, perm] = sort(descj_topnum);
descj = descj(perm);
% Update j and all its descendants
A = update_row(A, j, dag);
for k = descj(:)'
A = update_row(A, k, dag);
end