Task: identify interesting collocations in text
Corpus: speeches made by politicians in the U.S. House of Representatives during debates over legislation
Data: downloaded from http://www.cs.cornell.edu/home/llee/data/convote.html
Analysis: using the NLTK (Natural Language Toolkit) package, nltk.collocations module.
Association measures:
- chi square
- mutual information
- log likelihood
- raw frequency
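As a rough illustration of how these four measures can be applied with nltk.collocations, here is a minimal sketch. The token list is a made-up stand-in for one party's tokenized speeches, and the names tokens, scorers and top_by_measure are placeholders, not anything taken from the original analysis.

```python
from nltk.collocations import BigramAssocMeasures, BigramCollocationFinder

# Hypothetical toy input: in the real analysis this would be the tokenized
# speeches of one party from the convote corpus.
tokens = (
    "stem cell research stem cell funding health care reform "
    "health care costs stem cell lines"
).split()

# Count every pair of consecutive words in the token stream.
finder = BigramCollocationFinder.from_words(tokens)
measures = BigramAssocMeasures()

# The four association measures listed above, keyed by the names used here.
scorers = {
    "chi square": measures.chi_sq,
    "mutual information": measures.pmi,
    "log likelihood": measures.likelihood_ratio,
    "raw frequency": measures.raw_freq,
}

# Keep the three best-scoring bigrams under each measure.
top_by_measure = {name: finder.nbest(fn, 3) for name, fn in scorers.items()}

for name, top in top_by_measure.items():
    print(name, top)
```

On this toy input, raw frequency ranks ("stem", "cell") first because it occurs three times. The other measures can order the bigrams differently, since they reward pairs whose words rarely occur apart.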
Green: appears in both parties' lists, but at noticeably different ranks.
Red: appears only in the Republican list.
Blue: appears only in the Democratic list.
Black: given almost equal importance by both parties.
This table ranks bigrams (two consecutive words appearing together in the corpus) by raw frequency count for each party, most frequent first.
| Rank | Top 16 collocations by Democrats | Top 16 collocations by Republicans |
|------|----------------------------------|------------------------------------|
| 1 | united-states | united-states |
| 2 | stem-cell | stem-cell |
| 3 | health-care | embryonic-stem |
| 4 | american-people | small-businesses |
| 5 | cell-research | small-business |
| 6 | social-security | would-like |
| 7 | tax-cuts | cell-research |
| 8 | patriot-act | patriot-act |
| 9 | embryonic-stem | may-consume |
| 10 | conference-report | american-people |
| 11 | estate-tax | health-care |
| 12 | last-year | death-tax |
| 13 | bill-would | homeland-security |
| 14 | endangered-species | federal-government |
| 15 | national-security | law-enforcement |
| 16 | homeland-security | conference-report |
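The color legend can be applied mechanically to the two ranked lists in the table. The sketch below shows one way to do that and is not the procedure used in the post. The function name color_code and the max_rank_gap parameter are hypothetical, and the cutoff of 3 ranks is an assumption, since the post does not say how close "almost equal" has to be.

```python
def color_code(dem_ranked, rep_ranked, max_rank_gap=3):
    """Assign a legend color to each collocation in two ranked lists.

    max_rank_gap is an assumed cutoff: shared collocations whose ranks
    differ by at most this much are treated as "almost equal" (black).
    """
    dem_rank = {c: i for i, c in enumerate(dem_ranked, start=1)}
    rep_rank = {c: i for i, c in enumerate(rep_ranked, start=1)}
    colors = {}
    for c in dem_rank.keys() | rep_rank.keys():
        if c not in rep_rank:
            colors[c] = "blue"    # only in the Democratic list
        elif c not in dem_rank:
            colors[c] = "red"     # only in the Republican list
        elif abs(dem_rank[c] - rep_rank[c]) <= max_rank_gap:
            colors[c] = "black"   # ranked about equally by both
        else:
            colors[c] = "green"   # shared, but at different ranks
    return colors


# The two ranked lists exactly as they appear in the table.
democrats = [
    "united-states", "stem-cell", "health-care", "american-people",
    "cell-research", "social-security", "tax-cuts", "patriot-act",
    "embryonic-stem", "conference-report", "estate-tax", "last-year",
    "bill-would", "endangered-species", "national-security",
    "homeland-security",
]
republicans = [
    "united-states", "stem-cell", "embryonic-stem", "small-businesses",
    "small-business", "would-like", "cell-research", "patriot-act",
    "may-consume", "american-people", "health-care", "death-tax",
    "homeland-security", "federal-government", "law-enforcement",
    "conference-report",
]

colors = color_code(democrats, republicans)
for phrase in sorted(colors):
    print(phrase, colors[phrase])
```

With this assumed cutoff, health-care (rank 3 versus rank 11) comes out green, while cell-research (5 versus 7) and homeland-security (16 versus 13) come out black. A different cutoff would move the border between green and black.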
Health-care is green, but why are homeland-security and cell-research black?
The ones that both parties discussed with equal importance are black. Honestly, a more careful analysis was needed.
Thanks.