WeBoCa

Data mining and knowledge discovery tool, for both personal and academic use

WeBoCa Ranking & Summary

  • License: LGPL
  • Price: FREE
  • Publisher Name: Michael Drayson
  • Publisher web site: http://code.google.com/u/michael.drayson/
  • Operating Systems: Mac OS X
  • File Size: 1012 KB


WeBoCa Description

WeBoCa is an advanced, modified implementation of JBootCat. It lets users build a corpus from a range of search engines and then process that corpus to tidy it up or manipulate it in a number of ways.

The BootCat scripts are of great interest to translators, linguists, and anyone researching such techniques for academic purposes.

While the main goal of JBootCat was 'to encapsulate the BootCat functionality within a user-friendly desktop application', WeBoCa aims to improve on that open-source application and to extend its functionality in both corpus collection and knowledge discovery within the created corpus.

Here are some key features of "WeBoCa":

  • Vertical / Horizontal corpus creation
  • Google / Yahoo search engine implementation
  • Define additional search parameters
  • Define a word limit
  • Define a page size limit
  • Save URLs used in downloading
  • Advanced URL processing, including:
      • Remove stored URLs as terms
      • Remove non-alphanumeric terms
      • Sort corpus
      • Convert corpus terms to lower case
      • Remove non-unique corpus terms
      • Generate frequency count

Requirements:

  • Java 1.4 or later

What's New in This Release:

  • Fixed String bug on Get URLs button, and corrected GUI errors
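
The corpus tidy-up steps named in the feature list (remove stored URLs as terms, remove non-alphanumeric terms, convert to lower case, sort, remove non-unique terms, generate a frequency count) can be illustrated with a minimal sketch. This is not WeBoCa's code: the class name `CorpusTidy` and its methods are hypothetical, it is written in modern Java rather than Java 1.4, and it assumes a "non-alphanumeric term" means any term containing a character other than an ASCII letter or digit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of WeBoCa or JBootCat.
final class CorpusTidy {

    // Drops terms that equal a stored URL, lower-cases the rest, and drops
    // any term that contains a character other than a-z or 0-9 (assumption).
    static List<String> clean(List<String> terms, List<String> storedUrls) {
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            if (storedUrls.contains(term)) {
                continue; // stored URL used as a term
            }
            String lower = term.toLowerCase(Locale.ROOT);
            if (lower.matches("[a-z0-9]+")) {
                out.add(lower);
            }
        }
        return out;
    }

    // Counts occurrences of each term. A TreeMap keeps the keys sorted and
    // unique, so its key set doubles as the sorted, de-duplicated corpus.
    static Map<String, Integer> frequencies(List<String> terms) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String term : terms) {
            counts.merge(term, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList(
                "Corpus", "corpus", "http://example.com/a", "data-mining", "Mining");
        List<String> urls = Arrays.asList("http://example.com/a");
        // Prints each surviving term with its count, in sorted order.
        frequencies(clean(terms, urls))
                .forEach((term, count) -> System.out.println(term + "\t" + count));
    }
}
```

Using a sorted map for the counts covers three of the listed steps at once (sort, de-duplicate, count); a tool that needs each step as a separate user action would keep them as separate passes.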

