LJParser

A development platform for web search and text mining.

LJParser Ranking & Summary

  • License: Freeware
  • Publisher Name: LING-JOIN Software
  • Operating Systems: Windows All
  • File Size: 17.5 MB

LJParser Description

LJParser is a suite of text-processing tools that provides modules for precise multilingual search, new-word detection, text summarization, keyword extraction, and more. Main features:

Chinese word segmentation SDK Module: Splits Chinese text into words, which is the essential core step of Chinese information processing. It uses a Conditional Random Field (CRF) model, with a stated word segmentation accuracy close to 99%, and combines high accuracy, speed, and adaptability. Features include adjustable segmentation granularity, more than 20 industry-specific dictionaries, and support for user-defined dictionaries.

POS Tagging SDK Module: Automatically tags the part of speech of words in Chinese text. It takes the language context into account, so a word such as "building" is labeled "noun" or "verb" as appropriate. Ling-Join uses a Conditional Random Field model, with a stated POS tagging accuracy close to 99%, and combines high accuracy, speed, and adaptability.

Recognition of Chinese named entities, including persons, locations and organizations SDK Module: Automatically finds person names, place names, and organization names in Chinese text. Because recognition relies on understanding and prediction from the language context, these words do not need to be in the dictionary. Ling-Join uses a Conditional Random Field model, with a stated recognition accuracy of 97% and a speed of 10M/s. A variety of statistics and applications can be built on this basis.

Keyword extraction from Documents SDK Module: Captures the central idea of an article by extracting a number of words or phrases that represent its semantic content.
Relevant results can be used for condensed reading, semantic queries, and fast matching. The module is based on a semantic statistical language model. The documents it processes are not restricted to particular industry fields, and it can identify newly coined words. The output marks each word with its weight.

Automatic Extraction of Domain Terms SDK Module: A powerful tool for analyzing professional literature. It builds on the keyword extraction technology and combines it with a recognition model for professional literature, which effectively mines the terminology that appears in the literature.

English lexical analysis SDK Module: The essential core component for English information processing, covering POS tagging and recognition of named entities, including persons, locations, and organizations. The module combines a probabilistic model with a machine learning model to deliver high accuracy, speed, and adaptability.

Japanese lexical analysis SDK Module: The essential core component for Japanese information processing, covering POS tagging and recognition of named entities, including persons, locations, and organizations. Ling-Join uses a Conditional Random Field model that combines high accuracy, speed, and adaptability, with a stated word segmentation accuracy close to 99% and a POS tagging accuracy of nearly 98%.

Text Mining Middleware: Includes text analysis and mining modules, subsystems, and APIs, which can be seamlessly integrated into complex client applications. The middleware is compatible with Windows, Linux, FreeBSD, and other operating systems.
Ling-Join Text Mining Middleware includes the following SDK modules:

Text Summarization Middleware: Extracts the key sentences and key paragraphs from a long article and assembles them into a summary. It not only generates a coherent summary of the document but also removes redundancy to keep the summary concise. Users can freely set the summary length, percentage, and other parameters. The middleware supports English and Chinese, with a processing speed of up to 20 texts per second.

Text Classification Middleware: Classifies documents by content type and can be used for news classification, profile categorization, mail classification, office document classification, area classification, and many other applications. It performs multi-level classification at a rate of up to 100 texts per second with an average accuracy of 90% or more, and it can classify mixed English and Chinese text.

Text Clustering Middleware: Text clustering is an important unsupervised learning problem that deals with finding structure in a collection of unlabeled data. A cluster is a collection of objects that are similar to one another and dissimilar to the objects in other clusters. Clustering can be used for automatic generation of hot topics, event tracking, visual analysis of data, and many other applications. LING-JOIN uses its core semantic technology, which it describes as both fast and accurate. It can also automatically obtain the evolutionary trend between clusters.

Text Filtering Middleware: Quickly identifies the required information in large amounts of text and can be used for intelligent information filtering, content auditing, and other fields.
Ling-Join combines rule-based filtering with learning-based filtering. The average accuracy rate is more than 90%. Users have the flexibility to set the rules for different fields.
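
To illustrate the general idea of combining rule-based and learning-based filtering, here is a minimal Python sketch. It is not LJParser's API or Ling-Join's actual method: the class name `HybridFilter`, its methods, and the use of a small naive Bayes scorer are all assumptions made for illustration. User-defined rules (regular expressions) are checked first, and a classifier trained on labeled examples decides the remaining cases.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class HybridFilter:
    """Hypothetical filter: user rules first, then a tiny naive Bayes scorer."""

    def __init__(self, rule_patterns):
        # Rule-based part: user-supplied regular expressions.
        self.rules = [re.compile(p, re.IGNORECASE) for p in rule_patterns]
        # Learning-based part: per-label word counts and document counts.
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, wanted):
        """Add one labeled example (wanted=True means 'should be selected')."""
        self.word_counts[wanted].update(tokenize(text))
        self.doc_counts[wanted] += 1

    def _log_score(self, tokens, label):
        counts = self.word_counts[label]
        total = sum(counts.values())
        vocab = len(set(self.word_counts[True]) | set(self.word_counts[False]))
        # Smoothed class prior, so an untrained label never yields log(0).
        prior = (self.doc_counts[label] + 1) / (sum(self.doc_counts.values()) + 2)
        score = math.log(prior)
        for token in tokens:
            # Add-one smoothing with one extra slot for unseen words.
            score += math.log((counts[token] + 1) / (total + vocab + 1))
        return score

    def matches(self, text):
        """Return True if a rule fires or the learned model prefers 'wanted'."""
        if any(rule.search(text) for rule in self.rules):
            return True
        tokens = tokenize(text)
        return self._log_score(tokens, True) > self._log_score(tokens, False)
```

In this sketch, setting rules for a different field only means passing a different list of patterns to the constructor, while the learned part adapts through the examples given to `train`.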

