
Classifying Sentiment in Microblogs: Is Brevity an Advantage?
Adam Bermingham & Alan Smeaton
CLARITY: Centre for Sensor Web Technologies
School of Computing
Dublin City University
{abermingham,asmeaton}@computing.dcu.ie
ABSTRACT
Microblogs as a new textual domain offer a unique proposition for sentiment analysis. Their short document length suggests any sentiment they contain is compact and explicit. However, this short length coupled with their noisy nature can pose difficulties for standard machine learning document representations. In this work we examine the hypothesis that it is easier to classify the sentiment in these short-form documents than in longer-form documents. Surprisingly, we find classifying sentiment in microblogs easier than in blogs, and make a number of observations pertaining to the challenge of supervised learning for sentiment analysis in microblogs.
Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Text Mining
General Terms
Algorithms, Experimentation
1. INTRODUCTION
Microblogging has become a popular method for Internet users to publish thoughts and information in real time. Automated sentiment analysis of microblog posts is of interest to many, allowing monitoring of public sentiment towards people, products and events as they happen.
The short length of microblog documents means they can be easily published and read on a variety of platforms and modalities. This brevity constraint has led to the use of non-standard textual artefacts such as emoticons and informal language. The resulting text is often considered "noisy".
It is reasonable to assume that the short document length introduces a succinctness to the content. The focused nature of the text and the higher density of sentiment-bearing terms may benefit automated sentiment analysis techniques. On the other hand, it may also be that the shorter length and the language conventions used mean there is not enough context for sentiment to be accurately detected. It is unclear which of these is true.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CIKM'10, October 26-30, 2010, Toronto, Ontario, Canada.
Copyright 2010 ACM 978-1-4503-0099-5/$10.00.
These issues motivate our research questions: (i) How does sentiment classification accuracy in the microblogging domain compare to that for microreviews, another short-form textual domain? How do these accuracies compare to those for their long-form counterparts? And (ii) how do different feature vector representations and classifiers affect sentiment classification accuracy for microblogs? How does this compare to the corpora explored in (i)?
2. RELATED WORK
Sentiment analysis has been successfully used to analyse and extract opinion from text in recent years [12]. Some exploratory works have been completed in the microblog domain. Diakopoulos and Shamma used manual annotations to characterize the sentiment reactions to various issues in a political debate [5]. They find that sentiment is useful as a measure for identifying controversy. Jansen et al. studied the word-of-mouth effect on Twitter using an adjective-based sentiment classifier, finding it useful for brand analytics on Twitter. Bollen et al. analysed sentiment on Twitter according to a six-dimensional mood representation [2], finding that sentiment on Twitter correlates with real-world values such as stock prices and coincides with cultural events. The latter two studies report positive results from using automated sentiment analysis techniques on Twitter data.
Noise in computer-mediated content has been the subject of much research. Tagliamonte and Denis studied instant messaging [13], finding that the penetration of non-standard English language and punctuation is far less than is reported in the media. In a study of classification of customer feedback, Gamon found a high level of accuracy for supervised sentiment classification despite the noisy nature of the data [7]. One strategy to deal with noise in this domain, put forward by Choudhury et al., is to use Hidden Markov Models to decode text into standard English [4], reporting a high level of success for SMS data. Agarwal et al. showed, by simulating noise in text classification, that a good classifier should perform well up to about 40% noise [1], suggesting that although noise may be present in text, this may not prove to be important for supervised learning tasks. Carvalho et al. found that non-standard surface features such as heavy punctuation and emoticons are key to detecting irony in user-generated content [3].
Collectively, these studies all support our assumption that new textual domains exhibit domain-specific features. We also see that there is significant value in being able to model
Table 1: Microblog annotation labels and associated document counts.

  Label                 # Documents
  Relevant, Positive          1,410
  Relevant, Negative          1,040
  Relevant, Neutral           2,597
  Relevant, Mixed               146
  Not relevant                  498
  Unannotatable                 603
  Unclear                       530
  Total                       6,824
sentiment in these domains. To our knowledge, this is the first work to explore the challenges that the shortness of microblog documents presents to feature vector representations and supervised sentiment classification.
3. METHODOLOGY
The microblog posts used in these experiments are taken from a collection of over 60 million posts which we gathered from the Twitter public data API from February to May 2009. We examined the trending topics from this period and identified five recurring themes: Entertainment, Products and Services, Sport, Current Affairs and Companies. We selected 10 trends from each of these categories to be used as sentiment targets. By making the topic set diverse and challenging, we hope to better test the performance of our approach and build a classifier representative of a real-world generic sentiment classification scenario.
In the annotation process, Wilson's definition of sentiment was used: "Sentiment analysis is the task of identifying positive and negative opinions, emotions, and evaluations." [14] Our team of annotators consisted of 9 PhD students and postdoctoral researchers. To ensure sufficient agreement among the annotators, the annotation was preceded by a number of training iterations, consisting of group meetings, consensus annotations and one-on-one discussions. See Table 1 for a breakdown of annotations by label.
In total, 9 annotators annotated 17 documents for each of the 50 topics. 463 documents were doubly annotated for inter-annotator agreement (6.78%). For the 7 labels, the Kappa agreement was 0.65. For the 3 classes which we use for training (positive, negative and neutral), Kappa was 0.72. If we just consider the binary sentiment classes, positive and negative, this increases to 0.94. The relatively high values for Kappa are consistent with our previous annotation of blogs [10].
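Cohen's kappa corrects raw inter-annotator agreement for the agreement expected by chance. A minimal sketch of the computation using scikit-learn follows; the annotator labels here are hypothetical, not the paper's data:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical labels from two annotators over the same eight documents
# (an abbreviated stand-in for the paper's annotation scheme)
a1 = ["pos", "neg", "neu", "pos", "neu", "neg", "pos", "neu"]
a2 = ["pos", "neg", "neu", "neu", "neu", "neg", "pos", "pos"]

# Observed agreement here is 6/8; kappa discounts the chance-agreement rate
kappa = cohen_kappa_score(a1, a2)
print(round(kappa, 2))  # 0.62 for these illustrative labels
```

A kappa of 1.0 indicates perfect agreement and 0.0 agreement no better than chance, which is why the paper's 0.94 for the binary positive/negative split is considered high.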
To contrast with our microblog corpus, we derive a corpus of blog posts from the TREC Blogs06 corpus [8]. We use a templating approach to extract positive, negative and neutral blog post content and comments from the corpus, using the TREC relevance judgments as labels.
As much of the sentiment analysis literature concerns review classification, in parallel to our experiments on the microblog and blog corpora we also conduct our experiments on a corpus of microreviews and a corpus of reviews. In January 2010 we collected microreview documents from the microreview website, Blippr. Blippr reviews bear a similarity to microblog posts in that they share the same character limit of 140 characters. Reviews on Blippr are given one of four ratings by the author, in order from most negative to most positive: hate, dislike, like and love. In our corpus we use only reviews with strongly polarised sentiment: hate and love. We have made our microreview and microblog corpora available for other researchers.
The reviews corpus we use as a comparison is perhaps the most widely studied sentiment corpus, Pang and Lee's movie review corpus [11]. This corpus contains archival movie reviews from USENET. We refer to the microblog and microreview datasets as the short-form document corpora and the blog and movie review datasets as the long-form document corpora.
Our datasets are limited to exactly 1,000 documents per class, in line with the movie review corpus. This allows us to eliminate any underlying sentiment bias which may be present. While this is obviously a consideration for a real-world system, in our experiments we wish to examine the challenges of the classification without biasing our evaluation towards any particular class. As the sentiment distribution is different in each of the domains, this also makes accuracies comparable across datasets.
For our experiments we use Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB) classifiers, giving us an accurate representation of the state of the art in text classification. We use an SVM with a linear kernel and the parameter c set to 1. In preliminary experiments we found binary feature vectors more effective than frequency-based vectors, and found no benefit from stopwording or stemming. Where possible, we replaced topics with pseudo-terms to avoid learning topic-sentiment bias. We also replace URLs and usernames with pseudo-terms to avoid confusion during tokenization and POS tagging. Each feature vector is L2-normalized, and for the long-form corpora only features which occurred 4 or more times were used, as Pang and Lee did in their original movie review experiments. Accuracy was measured using 10-fold cross-validation, and the folds were fixed for all experiments.
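The experimental setup above (binary unigram features, L2 normalisation, a linear SVM with c=1, an MNB classifier, cross-validated accuracy) can be sketched with scikit-learn. The documents, labels and fold count below are toy stand-ins, not the paper's corpora:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC

# Toy balanced corpus standing in for the 1,000-documents-per-class datasets
docs = ["love this phone", "great product really", "best purchase ever",
        "awful battery life", "terrible service hate it", "worst app ever"]
labels = [1, 1, 1, 0, 0, 0]

# Binary (presence/absence) unigram features, then L2-normalise each vector
vec = CountVectorizer(binary=True)
X = normalize(vec.fit_transform(docs), norm="l2")

svm = LinearSVC(C=1.0)   # linear kernel, c set to 1
mnb = MultinomialNB()

# The paper uses fixed 10-fold CV; 3 folds here because the toy set is tiny
svm_acc = cross_val_score(svm, X, labels, cv=3).mean()
mnb_acc = cross_val_score(mnb, X, labels, cv=3).mean()
print(svm_acc, mnb_acc)
```

The pseudo-term replacement step (topics, URLs, usernames) would happen as a string rewrite before vectorisation and is omitted here.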
As a baseline for binary (positive/negative) classification we use a classifier based on a sentiment lexicon, SentiWordNet [6]. This unsupervised classifier classifies documents using the mean sentiment scores of the synsets its words belong to. Despite their naivety, these types of classifiers are often used as they do not require expensive training data.
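The mean-score idea behind this baseline can be sketched as follows. A tiny hand-made lexicon stands in for SentiWordNet's synset scores, so both the words and the numbers are purely illustrative:

```python
# Hypothetical word scores; the real baseline averages SentiWordNet
# synset scores instead of a flat word list
lexicon = {"love": 0.8, "great": 0.7, "good": 0.5,
           "bad": -0.6, "awful": -0.8, "terrible": -0.9}

def classify(doc: str) -> str:
    """Label a document by the mean score of its lexicon words."""
    scores = [lexicon[w] for w in doc.lower().split() if w in lexicon]
    if not scores:
        return "neutral"          # no sentiment-bearing words found
    return "pos" if sum(scores) / len(scores) > 0 else "neg"

print(classify("a great phone but terrible battery"))  # mean is -0.1 -> "neg"
```

No training data is needed, which is the appeal of this kind of classifier; the cost, as the results below show, is accuracy.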
4. RESULTS AND DISCUSSION
Unigram binary (positive/negative) classification accuracy for microblogs is 74.85% using an SVM. This is an encouraging accuracy given the diversity in the sentiment topics. As we have balanced datasets, a trivial classifier achieves 50% accuracy for binary classification. For microreviews, the accuracy is considerably higher at 82.25% using an SVM. As expected, the classifier finds it easier to distinguish between polarised reviews than to identify sentiment in arbitrary posts.
Sentiment classification of the long-form documents yields some surprising results. Blog classification accuracy is significantly lower than for microblogs. However, movie review classification accuracy is higher than for microreviews, confirming Pang and Lee's result of 87.15% for an SVM with unigram features.

Figure 1: Accuracies for unigram features.

At first this may seem contradictory; surely the classifier should perform consistently across textual domains? We speculate that this behaviour is due to within-document topic drift. In the two review corpora the text of the document has a high density of sentiment information about the topic, and a low noise density. In the blogs dataset, this is not necessarily the case; the sentiment in a blog post may be an isolated reference in a subsection of the document. Although topic drift also occurs in the microblog corpus, there is less opportunity for non-relevant information to enter the feature vector, and our classifier is not as adversely affected as in the blog domain.

3: puting.dcu.ie/~abermingham/data/
Our unsupervised lexicon-based classifier performs poorly across all datasets. For the blogs corpus, it is outperformed by a trivial classifier. The accuracy gap between supervised and unsupervised classification in the long-form corpora is much more pronounced. This makes intuitive sense, as the probability of the polarity of a word in a document expressing sentiment towards a topic is again much higher in the short-form domains.
Of the two supervised classifiers, SVM outperforms MNB in the long-form domains, whereas the opposite is true in the short-form domains. SVMs scale better with larger vector dimensionality, so this is most likely the reason for this observation; the number of unique terms in the longer documents is over three times that in their shorter counterparts, even when infrequent features have been excluded.
Having established a reasonable performance in sentiment classification of microblog posts, we wish to explore whether we can improve on the standard bag-of-words feature set by adding more sophisticated features. Using sequences of terms, or n-grams, we can capture some of the information lost in the bag-of-words model. We evaluated two feature sets: (unigrams + bigrams) and (unigrams + bigrams + trigrams). We found that although an increase in classification accuracy is observed for the movie reviews, this is not the case for any of the other datasets (see Table 2). We also examined POS-based n-grams in conjunction with a unigram model and observed a decrease in accuracy across all corpora. This indicates that the syntactic patterns represented by the POS n-gram features do not contain information which is more discriminative than unigrams.
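Extending a bag of words with n-grams can be sketched with scikit-learn's `CountVectorizer`, whose `ngram_range` parameter controls the sequence lengths extracted. The sentence below is illustrative:

```python
from sklearn.feature_extraction.text import CountVectorizer

doc = ["the movie was not good at all"]

# Unigrams only vs. unigrams + bigrams
uni = CountVectorizer(ngram_range=(1, 1)).fit(doc)
uni_bi = CountVectorizer(ngram_range=(1, 2)).fit(doc)

print(sorted(uni.vocabulary_))     # single terms only
print(sorted(uni_bi.vocabulary_))  # also includes pairs such as "not good"
```

The bigram "not good" illustrates what the bag-of-words model loses: the unigram features "not" and "good" alone cannot capture the negation.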
The most promising results came from a POS-based stopwording approach proposed by Matsumoto et al. [9]. This approach (which Matsumoto refers to as "word sub-sequences") consists of an n-gram model where terms have been stopworded based on their POS. We use the same POS list as Matsumoto. These features increase accuracy across all corpora for unigrams + POS-stopworded bigrams. This suggests that a better understanding of the linguistic context of terms is similarly advantageous in all domains.

Table 3: Most discriminative unigrams, bigrams and trigrams according to Information Gain Ratio for binary classification.

      Microblogs   Blogs              Microreviews   Reviews
   1  !            witherspoon        great          bad
   2  <Url>        joaquin            boring         worst
   3  <Topic>      reese witherspoon  best           stupid
   4  amazing      joaquin phoenix    terrible       boring
   5  ..           sharon             the best       the worst
   6  !!           ledger             worst          waste
   7  ?            heath ledger       n't            ridiculous
   8  !!!          heath              love           wasted
   9  love         johnny cash        loved          awful
  10  <Topic>!     palestinians       ?              ?

Table 4: Ternary Unigram Classification Accuracies: Positive, Negative, Neutral

              MNB     SVM     # features
  Microblogs  61.3    59.5    8,132
  Blogs       52.13   57.6    28,805
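The POS-based stopwording described above can be sketched as follows. The POS tags are hand-supplied here rather than produced by a tagger, and the keep-list is an illustrative subset, not Matsumoto's actual POS list:

```python
# Hand-tagged tokens stand in for the output of an automatic POS tagger
tagged = [("the", "DT"), ("acting", "NN"), ("was", "VBD"),
          ("truly", "RB"), ("awful", "JJ")]

# POS classes assumed to carry sentiment-relevant content; determiners
# such as "the" (DT) are stopworded out
KEEP = {"NN", "VBD", "RB", "JJ"}

kept = [word for word, pos in tagged if pos in KEEP]

# Bigrams over the stopworded sequence: Matsumoto's "word sub-sequences"
bigrams = list(zip(kept, kept[1:]))
print(bigrams)  # [('acting', 'was'), ('was', 'truly'), ('truly', 'awful')]
```

Because stopworded terms are removed before the n-grams are formed, the bigram ("truly", "awful") is captured even though the raw text never places the two words adjacent to a determiner-free context in a plain bigram model.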
To examine the performance of individual features, we use a standard measure of discriminability, Information Gain Ratio (see Table 3). Immediately obvious is the significant role that punctuation plays in expressing sentiment in microblog posts. This suggests that these are being used specifically in microblog posts to express sentiment, perhaps as indicators for intonation. The discriminative features for both the reviews and microreviews are largely similar in nature, typically polarised adjectives. The blog classifier appears to have learned a certain amount of entity bias, as many of the discriminative features are people or places. Note that none of these entities are topic terms (topic terms were removed in pre-processing), though they do appear to be entities associated with topics.
Results of our ternary classification on microblogs and blogs can be seen in Table 4. The accuracy is, as expected, significantly less than for binary classification, with SVMs again outperforming MNB on the longer blog documents.
5. CONCLUSION
The results of our experiments are on the whole encouraging for the task of analysing sentiment in microblogs. We achieve an accuracy of 74.85% for binary classification for a diverse set of topics, indicating we can classify microblog documents with a moderate degree of confidence. In both of our short-form corpora we find it difficult to improve performance by extending a unigram feature representation. This is contrary to the long-form corpora, which respond favourably to enriched feature representations. We do however see promise in sophisticated POS-based features across all datasets, and speculate that engineering features based on deeper linguistic representations such as dependencies and parse trees may work for microblogs as they have been shown to do for movie reviews. In analysing discriminative features, we found that a significant role is played by punctuation. As a future direction for this work we hope to explore this notion with a view to incorporating it into the feature engineering process. It is surprising that this pattern is not seen in our microreviews corpus, indicating that it is not simply an artefact of the brevity of the platforms. We conclude that although the shortness of the documents has a bearing on which feature sets and classifier will provide optimum performance, the sparsity of information in these documents does not hamper our ability to classify them. On the contrary, we find classifying the short documents a much easier task than their longer counterparts, blogs. Also, the "noisy" artefacts of the microblog domain, such as informal punctuation, turn out to be a benefit to the classifiers. These results provide a compelling argument to encourage the community to focus on microblogs in future sentiment analysis research.

Table 2: Binary accuracy summary (figures as %)

                                            Microblogs     Blogs          Microreviews   Movies
  Feature Set                               MNB    SVM     MNB    SVM     MNB    SVM     MNB    SVM
  Unigram                                   74.85  72.95   64.6   68.75   82.25  80.8    82.95  87.15
  Unigram + Bigram                          74.35  72.95   64.6   68.45   82.15  81.4    85.25  87.9
  Unigram + Bigram + Trigram                73.7   72.8    64.6   68.5    81.95  80.85   84.8   87.9
  Unigram + POS n-gram (n=1)                73.25  71.6    64.7   68.45   80.8   79.5    82.4   86.95
  Unigram + POS n-gram (n=1,2)              70.25  70.05   62.6   66.25   80.8   79.5    81.8   84.95
  Unigram + POS n-gram (n=1,2,3)            68.8   69.7    62.45  64.6    74.7   76.9    79.95  82
  Unigram + POS-stopworded Bigram           74.15  73.25   64.5   69      82.5   81.05   85.35  87.5
  Unigram + POS-stopworded Bigram+Trigram   74.4   73.45   64.85  68.7    82.15  80.6    85.5   87.8
Acknowledgments
This work is supported by Science Foundation Ireland under grant 07/CE/I1147.
6. REFERENCES
[1] S. Agarwal, S. Godbole, D. Punjani, and S. Roy. How much noise is too much: A study in automatic text classification. In ICDM, pages 3-12, 2007.
[2] J. Bollen, A. Pepe, and H. Mao. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. CoRR, abs/0911.1583, 2009.
[3] P. Carvalho, L. Sarmento, M. J. Silva, and E. de Oliveira. Clues for detecting irony in user-generated contents: oh...!! it's "so easy" ;-). In TSA '09: Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pages 53-56, New York, NY, USA, 2009. ACM.
[4] M. Choudhury, R. Saraf, V. Jain, A. Mukherjee, S. Sarkar, and A. Basu. Investigation and modeling of the structure of texting language. IJDAR, 10(3-4):157-174, 2007.
[5] N. A. Diakopoulos and D. A. Shamma. Characterizing debate performance via aggregated Twitter sentiment. In Conference on Human Factors in Computing Systems (CHI 2010), 2010.
[6] A. Esuli and F. Sebastiani. SentiWordNet: A publicly available lexical resource for opinion mining. In Proceedings of the 5th Conference on Language Resources and Evaluation (LREC-06), pages 417-422, 2006.
[7] M. Gamon. Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis. In COLING '04: Proceedings of the 20th International Conference on Computational Linguistics, page 841, Morristown, NJ, USA, 2004. Association for Computational Linguistics.
[8] C. Macdonald and I. Ounis. The TREC Blogs06 collection: Creating and analysing a blog test collection. Technical report, University of Glasgow, Department of Computing Science, 2006.
[9] S. Matsumoto, H. Takamura, and M. Okumura. Sentiment classification using word sub-sequences and dependency sub-trees. In Proceedings of PAKDD '05, the 9th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, 2005.
[10] N. O'Hare, M. Davy, A. Bermingham, P. Ferguson, P. Sheridan, C. Gurrin, and A. F. Smeaton. Topic-dependent sentiment analysis of financial blogs. In TSA 2009: 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion Measurement, Hong Kong, China, 6 Nov 2009.
[11] B. Pang and L. Lee. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In ACL '04: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, page 271, Morristown, NJ, USA, 2004. Association for Computational Linguistics.
[12] B. Pang and L. Lee. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1-135, 2008.
[13] S. A. Tagliamonte and D. Denis. Linguistic ruin? LOL! Instant messaging and teen language. American Speech, 83(1):3-34, 2008.
[14] T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the 2005 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 347-354, 2005.
