Forum: Announcements of Publications, Events, etc.: Free Text Analysis Software Under Development

8/28/2012 at 12:25:58 AM GMT
Posts: 8
 
Subject: Free Text Analysis Software Under Development

Several software packages are currently available to the community for analyzing and scoring text files. Unfortunately, the cost of some of this software is a substantial barrier that prevents many people from undertaking research that might prove both interesting and fruitful. I have therefore undertaken a project to provide a basic but useful solution to this problem.

I would like to introduce RIOT (Recursive Inspection of Text) Scan to the community. It is intended as a free software solution for those interested in performing research with bodies of text. I initially developed it to calculate some indices of language use that Dr. Colin Martindale thought to be of interest, such as hapax legomena and the coefficient of variation for phrase length. I have recently added a content coding feature as well.
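For readers unfamiliar with the two indices named above, here is a minimal Python sketch of how they might be computed. It is not RIOT Scan's code, and this thread does not document how the software tokenizes text or delimits phrases. The function names, the simple regular-expression tokenizer, the assumption that a "phrase" is any run of words between punctuation marks, and the use of the population standard deviation are all illustrative choices.

```python
import re
import statistics
from collections import Counter


def hapax_legomena(text):
    """Return, in sorted order, the words that occur exactly once in the text."""
    # Assumption: a word is a run of letters or apostrophes, case-insensitive.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return sorted(word for word, count in counts.items() if count == 1)


def phrase_length_cv(text):
    """Coefficient of variation (std dev / mean) of phrase lengths in words.

    Assumption: a phrase is any non-empty run of text between
    punctuation marks (. , ; : ! ?).
    """
    phrases = [p for p in re.split(r"[.,;:!?]+", text) if p.strip()]
    lengths = [len(p.split()) for p in phrases]
    if len(lengths) < 2:
        return 0.0  # variation is undefined for fewer than two phrases
    # Assumption: population standard deviation; a sample estimate
    # (statistics.stdev) would be an equally reasonable choice.
    return statistics.pstdev(lengths) / statistics.mean(lengths)


sample = "the cat sat. the dog ran away quickly."
print(hapax_legomena(sample))
print(phrase_length_cv(sample))
```

In the sample, "the" appears twice and is therefore excluded from the hapax list, and the two phrases have 3 and 5 words, giving a mean of 4 and a standard deviation of 1.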

In its current form, the software processes .txt files and outputs a variety of indices describing language use. There is also an option to do content coding, which scores language categories from multiple traditions. The content coding feature currently supports the following scoring systems:

  • LIWC2007 system developed by Pennebaker et al.
  • Harvard IV-4 Inquirer system developed by Stone, Osgood, and colleagues
  • Regressive Imagery system developed by Colin Martindale
  • Body Type system developed by Wilson
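
As a rough illustration of what dictionary-based content coding involves, the Python sketch below counts how many words in a text fall into each category of a word list and reports the result as a percentage of all words. It is only a sketch under stated assumptions: the toy dictionary is invented for this example and contains no entries from any of the systems listed above, the trailing `*` wildcard for word stems is an assumed convention, and RIOT Scan's own engine may work differently.

```python
import re
from collections import Counter

# Hypothetical toy dictionary for illustration only; these are NOT
# entries from LIWC2007 or any other published coding system.
TOY_DICTIONARY = {
    "positive": ["happy", "joy*", "good"],
    "negative": ["sad", "angr*", "bad"],
}


def matches(word, pattern):
    """True if the word matches the pattern; a trailing * matches any ending."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


def score_categories(text, dictionary):
    """Return each category's hits as a percentage of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = Counter()
    for word in words:
        for category, patterns in dictionary.items():
            if any(matches(word, p) for p in patterns):
                hits[category] += 1
    total = len(words) or 1  # avoid division by zero on empty input
    return {category: 100.0 * hits[category] / total for category in dictionary}


text = "I am happy and joyful but sad and angry today"
print(score_categories(text, TOY_DICTIONARY))
```

Reporting percentages rather than raw counts makes scores comparable across texts of different lengths, which is why the sketch divides by the total word count.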

This software is in its infancy, and feedback from users is crucial. Please report any bugs or errors upon discovery.

The software is currently available for free at http://riot.ryanb.cc


Last edited on 8/29/2012 3:22:01 PM GMT
8/29/2012 at 5:22:18 PM GMT
Posts: 8
 

Update -- 2012-08-29 -- I have added an "English Prime violations" dictionary to the content coding system. You can now also select which coding systems you would like to use.

See http://en.wikipedia.org/wiki/E-Prime for an introduction to English Prime.
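
English Prime is English written without any form of the verb "to be", so a violation is simply an occurrence of one of those forms. The Python sketch below shows one way such a check might look. It is an assumption-laden illustration, not the dictionary shipped with RIOT Scan: the word list is my own, and it deliberately ignores contractions such as "it's" or "I'm", since "'s" can also mark a possessive.

```python
import re

# Illustrative list of "to be" forms; an assumption, not RIOT Scan's dictionary.
TO_BE_FORMS = {
    "be", "being", "been", "am", "is", "are", "was", "were",
    "isn't", "aren't", "wasn't", "weren't",
}


def eprime_violations(text):
    """Return every form of 'to be' found in the text, in order of appearance."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word in TO_BE_FORMS]


sample = "The cat is on the mat. Dogs were barking, but it wasn't loud."
print(eprime_violations(sample))
```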

Last edited on 8/29/2012 6:34:46 PM GMT
9/1/2012 at 7:35:56 PM GMT
Posts: 8
 
There have been more updates to the software. The new version outputs proper citation information for all selected coding schemes, the coding engine has been made more efficient, and the Financial Sentiment Dictionary has been added. Please check the website (http://riot.ryanb.cc) regularly, as new coding schemes and other features will continue to be added.
11/2/2012 at 7:55:01 PM GMT
Posts: 8
 

As a quick update for interested parties, new systems have been added, including Bradley & Lang's (1999) ANEW norms and Whissell's (2009) Dictionary of Affect in Language system.

http://riot.ryanb.cc

11/15/2012 at 9:51:44 PM GMT
Posts: 8
 

Porter's stemming algorithm (http://tartarus.org/martin/PorterStemmer/) and LemmaGen's lemmatisation classes (http://lemmatise.ijs.si/) are now included in the software.

http://riot.ryanb.cc
