
Classifying Sentiment in Microblogs: Is Brevity an Advantage?
Adam Bermingham & Alan Smeaton
CLARITY: Centre for Sensor Web Technologies
School of Computing
Dublin City University
{abermingham,asmeaton}@computing.dcu.ie
ABSTRACT
Microblogs as a new textual domain offer a unique proposition for sentiment analysis. Their short document length suggests any sentiment they contain is compact and explicit. However, this short length coupled with their noisy nature can pose difficulties for standard machine learning document representations. In this work we examine the hypothesis that it is easier to classify the sentiment in these short-form documents than in longer-form documents. Surprisingly, we find classifying sentiment in microblogs easier than in blogs, and make a number of observations pertaining to the challenge of supervised learning for sentiment analysis in microblogs.
Categories and Subject Descriptors
H.3.3 [Information Search and Retrieval]: Text Mining
General Terms
Algorithms, Experimentation
1. INTRODUCTION
Microblogging has become a popular method for Internet users to publish thoughts and information in real time. Automated sentiment analysis of microblog posts is of interest to many, allowing monitoring of public sentiment towards people, products and events as they happen.
The short length of microblog documents means they can be easily published and read on a variety of platforms and modalities. This brevity constraint has led to the use of non-standard textual artefacts such as emoticons and informal language. The resulting text is often considered "noisy".
It is reasonable to assume that the short document length introduces a succinctness to the content. The focused nature of the text and the higher density of sentiment-bearing terms may benefit automated sentiment analysis techniques. On the other hand, it may also be that the shorter length and the language conventions used mean there is not enough context for sentiment to be accurately detected. It is unclear which of these is true.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CIKM'10, October 26-30, 2010, Toronto, Ontario, Canada.
Copyright 2010 ACM 978-1-4503-0099-5/$10.00.
These issues motivate our research questions: (i) How does sentiment classification accuracy in the microblogging domain compare to that for microreviews, another short-form textual domain? How do these accuracies compare to those for their long-form counterparts? And (ii) how do different feature vector representations and classifiers affect sentiment classification accuracy for microblogs? How does this compare to the corpora explored in (i)?
2. RELATED WORK
Sentiment analysis has been successfully used to analyse and extract opinion from text in recent years [12]. Some exploratory works have been completed in the microblog domain. Diakopoulos and Shamma used manual annotations to characterize the sentiment reactions to various issues in a political debate [5]. They find that sentiment is useful as a measure for identifying controversy. Jansen et al. studied the word-of-mouth effect on Twitter using an adjective-based sentiment classifier, finding it useful for brand analytics on Twitter. Bollen et al. analysed sentiment on Twitter according to a six-dimensional mood representation [2], finding that sentiment on Twitter correlates with real-world values such as stock prices and coincides with cultural events. The latter two studies report positive results from using automated sentiment analysis techniques on Twitter data.
Noise in computer-mediated content has been the subject of much research. Tagliamonte and Denis studied instant messaging [13], finding that the penetration of non-standard English language and punctuation is far less than is reported in the media. In a study of classification of customer feedback, Gamon found a high level of accuracy for supervised sentiment classification despite the noisy nature of the data [7]. One strategy to deal with noise in this domain, put forward by Choudhury et al., is to use Hidden Markov Models to decode text into standard English [4], reporting a high level of success for SMS data. Agarwal et al. showed, by simulating noise in text classification, that a good classifier should perform well up to about 40% noise [1], suggesting that although noise may be present in text, this may not prove to be important for supervised learning tasks. Carvalho et al. found that non-standard surface features such as heavy punctuation and emoticons are key to detecting irony in user-generated content [3].
Collectively, these studies all support our assumption that new textual domains exhibit domain-specific features. We also see that there is significant value in being able to model
Table 1: Microblog annotation labels and associated document counts.

  Label                 # Documents
  Relevant, Positive          1,410
  Relevant, Negative          1,040
  Relevant, Neutral           2,597
  Relevant, Mixed               146
  Not relevant                  498
  Unannotatable                 603
  Unclear                       530
  Total                       6,824
sentiment in these domains. To our knowledge, this is the first work to explore the challenges that the shortness of microblog documents presents to feature vector representations and supervised sentiment classification.
3. METHODOLOGY
The microblog posts used in these experiments are taken from a collection of over 60 million posts which we gathered from the Twitter public data API from February to May 2009. We examined the trending topics from this period and identified five recurring themes: Entertainment, Products and Services, Sport, Current Affairs and Companies. We selected 10 trends from each of these categories to be used as sentiment targets. By making the topic set diverse and challenging, we hope to better test the performance of our approach and build a classifier representative of a real-world generic sentiment classification scenario.
In the annotation process, Wilson's definition of sentiment was used: "Sentiment analysis is the task of identifying positive and negative opinions, emotions, and evaluations." [14] Our team of annotators consisted of 9 PhD students and postdoctoral researchers. To ensure sufficient agreement among the annotators, the annotation was preceded by a number of training iterations, consisting of group meetings, consensus annotations and one-on-one discussions. See Table 1 for a breakdown of annotations by label.
In total, 9 annotators annotated 17 documents for each of the 50 topics. 463 documents were doubly annotated for inter-annotator agreement (6.78%). For the 7 labels, the Kappa agreement was 0.65. For the 3 classes which we use for training (positive, negative and neutral), Kappa was 0.72. If we just consider the binary sentiment classes, positive and negative, this increases to 0.94. The relatively high values for Kappa are consistent with our previous annotation of blogs [10].
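Cohen's kappa corrects raw inter-annotator agreement for the agreement expected by chance. A minimal sketch of the computation using scikit-learn follows; the annotator labels here are hypothetical, not the paper's data:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical labels from two annotators over the same eight documents
# (an abbreviated stand-in for the paper's annotation scheme)
a1 = ["pos", "neg", "neu", "pos", "neu", "neg", "pos", "neu"]
a2 = ["pos", "neg", "neu", "neu", "neu", "neg", "pos", "pos"]

# Observed agreement here is 6/8; kappa discounts the chance-agreement rate
kappa = cohen_kappa_score(a1, a2)
print(round(kappa, 2))  # 0.62 for these illustrative labels
```

A kappa of 1.0 indicates perfect agreement and 0.0 agreement no better than chance, which is why the paper's 0.94 for the binary positive/negative split is considered high.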
To contrast with our microblog corpus, we derive a corpus of blog posts from the TREC Blogs06 corpus [8]. We use a templating approach to extract positive, negative and neutral blog post content and comments from the corpus, using the TREC relevance judgments as labels.
As much of the sentiment analysis literature concerns review classification, in parallel to our experiments on the microblog and blog corpora we also conduct our experiments on a corpus of microreviews and a corpus of reviews. In January 2010 we collected microreview documents from the microreview website, Blippr. Blippr reviews bear a similarity to microblog posts in that they share the same character limit of 140 characters. Reviews on Blippr are given one of four ratings by the author, in order from most negative to most positive: hate, dislike, like and love. In our corpus we use only reviews with strongly polarised sentiment: hate and love. We have made our microreview and microblog corpora available for other researchers.
The reviews corpus we use as a comparison is perhaps the most widely studied sentiment corpus, Pang and Lee's movie review corpus [11]. This corpus contains archival movie reviews from USENET. We refer to the microblog and microreview datasets as the short-form document corpora and the blog and movie review datasets as the long-form document corpora.
Our datasets are limited to exactly 1,000 documents per class, in line with the movie review corpus. This allows us to eliminate any underlying sentiment bias which may be present. While this is obviously a consideration for a real-world system, in our experiments we wish to examine the challenges of the classification without biasing our evaluation towards any particular class. As the sentiment distribution is different in each of the domains, this also makes accuracies comparable across datasets.
For our experiments we use Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB) classifiers, giving us an accurate representation of the state of the art in text classification. We use an SVM with a linear kernel and the parameter c set to 1. In preliminary experiments we found binary feature vectors more effective than frequency-based vectors, and found no benefit from stopwording or stemming. Where possible, we replaced topics with pseudo-terms to avoid learning topic-sentiment bias. We also replace URLs and usernames with pseudo-terms to avoid confusion during tokenization and POS tagging. Each feature vector is L2-normalized, and for the long-form corpora only features which occurred 4 or more times were used, as Pang and Lee did in their original movie review experiments. Accuracy was measured using 10-fold cross-validation, and the folds were fixed for all experiments.
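The experimental setup above (binary unigram features, L2 normalisation, a linear SVM with c=1, an MNB classifier, cross-validated accuracy) can be sketched with scikit-learn. The documents, labels and fold count below are toy stand-ins, not the paper's corpora:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import normalize
from sklearn.svm import LinearSVC

# Toy balanced corpus standing in for the 1,000-documents-per-class datasets
docs = ["love this phone", "great product really", "best purchase ever",
        "awful battery life", "terrible service hate it", "worst app ever"]
labels = [1, 1, 1, 0, 0, 0]

# Binary (presence/absence) unigram features, then L2-normalise each vector
vec = CountVectorizer(binary=True)
X = normalize(vec.fit_transform(docs), norm="l2")

svm = LinearSVC(C=1.0)   # linear kernel, c set to 1
mnb = MultinomialNB()

# The paper uses fixed 10-fold CV; 3 folds here because the toy set is tiny
svm_acc = cross_val_score(svm, X, labels, cv=3).mean()
mnb_acc = cross_val_score(mnb, X, labels, cv=3).mean()
print(svm_acc, mnb_acc)
```

The pseudo-term replacement step (topics, URLs, usernames) would happen as a string rewrite before vectorisation and is omitted here.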
As a baseline for binary (positive/negative) classification we use a classifier based on a sentiment lexicon, SentiWordNet [6]. This unsupervised classifier classifies documents using the mean sentiment scores of the synsets its words belong to. Despite their naivety, these types of classifiers are often used as they do not require expensive training data.
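The mean-score idea behind this baseline can be sketched as follows. A tiny hand-made lexicon stands in for SentiWordNet's synset scores, so both the words and the numbers are purely illustrative:

```python
# Hypothetical word scores; the real baseline averages SentiWordNet
# synset scores instead of a flat word list
lexicon = {"love": 0.8, "great": 0.7, "good": 0.5,
           "bad": -0.6, "awful": -0.8, "terrible": -0.9}

def classify(doc: str) -> str:
    """Label a document by the mean score of its lexicon words."""
    scores = [lexicon[w] for w in doc.lower().split() if w in lexicon]
    if not scores:
        return "neutral"          # no sentiment-bearing words found
    return "pos" if sum(scores) / len(scores) > 0 else "neg"

print(classify("a great phone but terrible battery"))  # mean is -0.1 -> "neg"
```

No training data is needed, which is the appeal of this kind of classifier; the cost, as the results below show, is accuracy.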
4. RESULTS AND DISCUSSION
Unigram binary (positive/negative) classification accuracy for microblogs is 74.85% using an SVM. This is an encouraging accuracy given the diversity in the sentiment topics. As we have balanced datasets, a trivial classifier achieves 50% accuracy for binary classification. For microreviews, the accuracy is considerably higher at 82.25% using an SVM. As expected, the classifier finds it easier to distinguish between polarised reviews than to identify sentiment in arbitrary posts.
Sentiment classification of the long-form documents yields some surprising results. Blog classification accuracy is significantly lower than for microblogs. However, movie review classification accuracy is higher than for microreviews, confirming Pang and Lee's result of 87.15% for an SVM with unigram features.

Figure 1: Accuracies for unigram features.

At first this may seem contradictory; surely the classifier should perform consistently across textual domains? We speculate that this behaviour is due to within-document topic drift. In the two review corpora the text of the document has a high density of sentiment information about the topic, and a low noise density. In the blogs dataset, this is not necessarily the case; the sentiment in a blog post may be an isolated reference in a subsection of the document. Although topic drift also occurs in the microblog corpus, there is less opportunity for non-relevant information to enter the feature vector, and our classifier is not as adversely affected as in the blog domain.

3: puting.dcu.ie/~abermingham/data/
Our unsupervised lexicon-based classifier performs poorly across all datasets. For the blogs corpus, it is outperformed by a trivial classifier. The accuracy gap between supervised and unsupervised classification in the long-form corpora is much more pronounced. This makes intuitive sense, as the probability of the polarity of a word in a document expressing sentiment towards a topic is again much higher in the short-form domains.
Of the two supervised classifiers, SVM outperforms MNB in the long-form domains, whereas the opposite is true in the short-form domains. SVMs scale better with larger vector dimensionality, so this is most likely the reason for this observation; the number of unique terms in the longer documents is over three times that in their shorter counterparts, even when infrequent features have been excluded.
Having established a reasonable performance in sentiment classification of microblog posts, we wish to explore whether we can improve on the standard bag-of-words feature set by adding more sophisticated features. Using sequences of terms, or n-grams, we can capture some of the information lost in the bag-of-words model. We evaluated two feature sets: (unigrams + bigrams) and (unigrams + bigrams + trigrams). We found that although an increase in classification accuracy is observed for the movie reviews, this is not the case for any of the other datasets (see Table 2). We also examined POS-based n-grams in conjunction with a unigram model and observed a decrease in accuracy across all corpora. This indicates that the syntactic patterns represented by the POS n-gram features do not contain information which is more discriminative than unigrams.
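Extending a bag of words with n-grams can be sketched with scikit-learn's `CountVectorizer`, whose `ngram_range` parameter controls the sequence lengths extracted. The sentence below is illustrative:

```python
from sklearn.feature_extraction.text import CountVectorizer

doc = ["the movie was not good at all"]

# Unigrams only vs. unigrams + bigrams
uni = CountVectorizer(ngram_range=(1, 1)).fit(doc)
uni_bi = CountVectorizer(ngram_range=(1, 2)).fit(doc)

print(sorted(uni.vocabulary_))     # single terms only
print(sorted(uni_bi.vocabulary_))  # also includes pairs such as "not good"
```

The bigram "not good" illustrates what the bag-of-words model loses: the unigram features "not" and "good" alone cannot capture the negation.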
The most promising results came from a POS-based stopwording approach proposed by Matsumoto et al. [9]. This approach (which Matsumoto refers to as "word sub-sequences") consists of an n-gram model where terms have been stopworded based on their POS. We use the same POS list as Matsumoto. These features increase accuracy across all corpora for unigrams + POS-stopworded bigrams. This suggests that a better understanding of the linguistic context of terms is similarly advantageous in all domains.

Table 3: Most discriminative unigrams, bigrams and trigrams according to Information Gain Ratio for binary classification.

      Microblogs   Blogs              Microreviews   Reviews
   1  !            witherspoon        great          bad
   2  <Url>        joaquin            boring         worst
   3  <Topic>      reese witherspoon  best           stupid
   4  amazing      joaquin phoenix    terrible       boring
   5  ..           sharon             the best       the worst
   6  !!           ledger             worst          waste
   7  ?            heath ledger       n't            ridiculous
   8  !!!          heath              love           wasted
   9  love         johnny cash        loved          awful
  10  <Topic>!     palestinians       ?              ?

Table 4: Ternary Unigram Classification Accuracies: Positive, Negative, Neutral

              MNB     SVM     # features
  Microblogs  61.3    59.5    8,132
  Blogs       52.13   57.6    28,805
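The POS-based stopwording described above can be sketched as follows. The POS tags are hand-supplied here rather than produced by a tagger, and the keep-list is an illustrative subset, not Matsumoto's actual POS list:

```python
# Hand-tagged tokens stand in for the output of an automatic POS tagger
tagged = [("the", "DT"), ("acting", "NN"), ("was", "VBD"),
          ("truly", "RB"), ("awful", "JJ")]

# POS classes assumed to carry sentiment-relevant content; determiners
# such as "the" (DT) are stopworded out
KEEP = {"NN", "VBD", "RB", "JJ"}

kept = [word for word, pos in tagged if pos in KEEP]

# Bigrams over the stopworded sequence: Matsumoto's "word sub-sequences"
bigrams = list(zip(kept, kept[1:]))
print(bigrams)  # [('acting', 'was'), ('was', 'truly'), ('truly', 'awful')]
```

Because stopworded terms are removed before the n-grams are formed, the bigram ("truly", "awful") is captured even though the raw text never places the two words adjacent to a determiner-free context in a plain bigram model.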
To examine the performance of individual features, we use a standard measure of discriminability, Information Gain Ratio (see Table 3). Immediately obvious is the significant role that punctuation plays in expressing sentiment in microblog posts. This suggests that these are being used specifically in microblog posts to express sentiment, perhaps as indicators for intonation. The discriminative features for both the reviews and microreviews are largely similar in nature, typically polarised adjectives. The blog classifier appears to have learned a certain amount of entity bias, as many of the discriminative features are people or places. Note that none of these entities are topic terms (topic terms were removed in pre-processing), though they do appear to be entities associated with topics.
Results of our ternary classification on microblogs and blogs can be seen in Table 4. The accuracy is, as expected, significantly less than for binary classification, with SVMs again outperforming MNB on the longer blog documents.
5. CONCLUSION
The results of our experiments are on the whole encouraging for the task of analysing sentiment in microblogs. We achieve an accuracy of 74.85% for binary classification for a diverse set of topics, indicating we can classify microblog documents with a moderate degree of confidence. In both of our short-form corpora we find it difficult to improve performance by extending a unigram feature representation. This is contrary to the long-form corpora, which respond favourably to enriched feature representations. We do however see promise in sophisticated POS-based features across all datasets, and speculate that engineering features based on deeper linguistic representations such as dependencies and parse trees may work for microblogs as they have been shown to do for movie reviews. In analysing discriminative features, we found that a significant role is played by punctuation. As a future direction for this work we hope to explore this notion with a view to incorporating it into the feature engineering process. It is surprising that this pattern is not seen in our microreviews corpus, indicating that it is not simply an artefact of the brevity of the platforms. We conclude that although the shortness of the documents has a bearing on which feature sets and classifier will provide optimum performance, the sparsity of information in these documents does not hamper our ability to classify them. On the contrary, we find classifying the short documents a much easier task than their longer counterparts, blogs. Also, the "noisy" artefacts of the microblog domain, such as informal punctuation, turn out to be a benefit to the classifiers. These results provide a compelling argument to encourage the community to focus on microblogs in future sentiment analysis research.

Table 2: Binary accuracy summary (figures as %)

                                            Microblogs     Blogs          Microreviews   Movies
  Feature Set                               MNB    SVM     MNB    SVM     MNB    SVM     MNB    SVM
  Unigram                                   74.85  72.95   64.6   68.75   82.25  80.8    82.95  87.15
  Unigram + Bigram                          74.35  72.95   64.6   68.45   82.15  81.4    85.25  87.9
  Unigram + Bigram + Trigram                73.7   72.8    64.6   68.5    81.95  80.85   84.8   87.9
  Unigram + POS n-gram (n=1)                73.25  71.6    64.7   68.45   80.8   79.5    82.4   86.95
  Unigram + POS n-gram (n=1,2)              70.25  70.05   62.6   66.25   80.8   79.5    81.8   84.95
  Unigram + POS n-gram (n=1,2,3)            68.8   69.7    62.45  64.6    74.7   76.9    79.95  82
  Unigram + POS-stopworded Bigram           74.15  73.25   64.5   69      82.5   81.05   85.35  87.5
  Unigram + POS-stopworded Bigram+Trigram   74.4   73.45   64.85  68.7    82.15  80.6    85.5   87.8
Acknowledgments
This work is supported by Science Foundation Ireland under grant 07/CE/I1147.
6. REFERENCES
[1] S. Agarwal, S. Godbole, D. Punjani, and S. Roy. How much noise is too much: A study in automatic text classification. In ICDM, pages 3-12, 2007.
[2] J. Bollen, A. Pepe, and H. Mao. Modeling public mood and emotion: Twitter sentiment and socio-economic phenomena. CoRR, abs/0911.1583, 2009.
[3] P. Carvalho, L. Sarmento, M. J. Silva, and E. de Oliveira. Clues for detecting irony in user-generated contents: oh...!! it's "so easy" ;-). In TSA '09: Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, pages 53-56, New York, NY, USA, 2009. ACM.
[4] M. Choudhury, R. Saraf, V. Jain, A. Mukherjee, S. Sarkar, and A. Basu. Investigation and modeling of the structure of texting language. IJDAR, 10(3-4):157-174, 2007.
[5] N. A. Diakopoulos and D. A. Shamma. Characterizing debate performance via aggregated Twitter sentiment. In Conference on Human Factors in Computing Systems (CHI 2010), 2010.
[6] A. Esuli and F. Sebastiani. SentiWordNet: A publicly available lexical resource for opinion mining. In Proceedings of the 5th Conference on Language Resources and Evaluation (LREC-06), pages 417-422, 2006.
[7] M. Gamon. Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis. In COLING '04: Proceedings of the 20th International Conference on Computational Linguistics, page 841, Morristown, NJ, USA, 2004. Association for Computational Linguistics.
[8] C. Macdonald and I. Ounis. The TREC Blogs06 collection: Creating and analysing a blog test collection. Technical report, University of Glasgow, Department of Computing Science, 2006.
[9] S. Matsumoto, H. Takamura, and M. Okumura. Sentiment classification using word sub-sequences and dependency sub-trees. In Proceedings of PAKDD '05, the 9th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, 2005.
[10] N. O'Hare, M. Davy, A. Bermingham, P. Ferguson, P. Sheridan, C. Gurrin, and A. F. Smeaton. Topic-dependent sentiment analysis of financial blogs. In TSA 2009: 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion Measurement, Hong Kong, China, 6 Nov 2009.
[11] B. Pang and L. Lee. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In ACL '04: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, page 271, Morristown, NJ, USA, 2004. Association for Computational Linguistics.
[12] B. Pang and L. Lee. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1-135, 2008.
[13] S. A. Tagliamonte and D. Denis. Linguistic ruin? LOL! Instant messaging and teen language. American Speech, 83(1):3-34, 2008.
[14] T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the 2005 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 347-354, 2005.
