About
PhrasIt is an open-source clone of Netspeak, developed for experiments (other datasets, other operators, larger n-grams, ...) that need a locally installed 'writing lookup' service. None of these extensions are possible with Netspeak, because it is only a web service for the Google Web n-gram dataset and its source code is not available. In recent years Netspeak has shown no visible development or improvement. The idea behind PhrasIt is to create a similar system in which changes and new ideas can be implemented easily. For example, you can build your own service on a German n-gram dataset, or build a PhrasIt instance that uses scientific papers as its source.
This web service is not optimized for speed; it is only a demonstration of a live system.
As its data source, PhrasIt uses the Google Books n-gram dataset, because it is freely available, in contrast to the Google Web n-gram dataset that Netspeak uses. If you build your own PhrasIt, however, you can use any suitable data source, including a dataset you build yourself. For more information about other datasets, see Development.
PhrasIt can assist writers who are not native speakers (and not only of English). For example, if you are writing an English article or scientific paper and are unsure about the order of a few words, you can use PhrasIt to check the ordering, or to find out which word fits best in the context.
Query Operators
PhrasIt accepts several kinds of queries. The following table summarizes all operators.
Query operator | Description | Example |
---|---|---|
QMARK, ? | Look for n-grams that match the query, where ? can be any single word. | "hello ?" and "? world": both should return 'hello world' as a result |
ASTERISK, * | Look for n-grams that match the query, where * can be several words (e.g. from no word up to 4 words). | "hello *" and "* world": both will hopefully return 'hello world' and 'hello my world' |
OPTIONSET, [] | Get n-grams that match the query, where every word in [] is tried at that position. | "hello [world, what]" will return 'hello world' as the best matching candidate |
ORDERSET, {} | Check the ordering of the words in {}. | "{world, hello}" should generate 'hello world' as the best matching candidate |
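
The operator syntax in the table can be made concrete with a small tokenizer. The following is only a minimal sketch and not PhrasIt's actual parser: the function name `tokenize`, the tuple layout, and the rule that words inside `[]` and `{}` are separated by commas or whitespace are all assumptions made for illustration.

```python
import re
from typing import List, Tuple

# A token is (kind, words). The kinds reuse the operator names from the table;
# "WORD" is an extra, hypothetical kind for literal query words.
Token = Tuple[str, Tuple[str, ...]]

# Matches "[...]", "{...}", or any other run of non-space characters.
_TOKEN_RE = re.compile(r"\[([^\]]*)\]|\{([^}]*)\}|(\S+)")


def _split_words(body: str) -> Tuple[str, ...]:
    """Split the inside of [] or {} on commas and/or whitespace (assumed syntax)."""
    return tuple(w for w in re.split(r"[,\s]+", body) if w)


def tokenize(query: str) -> List[Token]:
    """Turn a query string into a list of operator and word tokens."""
    tokens: List[Token] = []
    for match in _TOKEN_RE.finditer(query):
        option_body, order_body, word = match.groups()
        if option_body is not None:
            tokens.append(("OPTIONSET", _split_words(option_body)))
        elif order_body is not None:
            tokens.append(("ORDERSET", _split_words(order_body)))
        elif word == "?":
            tokens.append(("QMARK", ()))
        elif word == "*":
            tokens.append(("ASTERISK", ()))
        else:
            tokens.append(("WORD", (word,)))
    return tokens


if __name__ == "__main__":
    for token in tokenize("hello [world, what]"):
        print(token)
```

Keeping the operator kind separate from its words makes it straightforward to add further operators later, which is the kind of experiment PhrasIt is meant to support.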
Complex Examples
All operators can be combined to build complex queries.
Query | Description |
---|---|
world outside [of, in] science fiction | you are not sure which word is correct |
humans ? intelligent | you are not sure which word fits best |
{a, time, ago, long} in | you don't know the ordering |
a long * | what will be the next most used words in this context |
a * [in,of] ? | a combined query |
{it,do} * not | another combined query |
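
To illustrate how combined queries like these could be evaluated, here is a minimal sketch that translates a query into a regular expression and ranks matching n-grams by frequency. It is not how PhrasIt works internally. The names `query_to_regex` and `search`, the `MAX_STAR_WORDS` limit (taken from the "up to 4 words" wording), and the counts in `TOY_COUNTS` are all assumptions invented for this example.

```python
import itertools
import re
from typing import Dict, List, Tuple

MAX_STAR_WORDS = 4  # assumption: "*" stands for zero up to four words

_PART_RE = re.compile(r"\[([^\]]*)\]|\{([^}]*)\}|(\S+)")


def _words(body: str) -> List[str]:
    """Split the inside of [] or {} on commas and/or whitespace."""
    return [w for w in re.split(r"[,\s]+", body) if w]


def query_to_regex(query: str) -> re.Pattern:
    """Build a regex over n-grams written as 'w1 w2 ... wn ' (trailing space)."""
    parts = []
    for match in _PART_RE.finditer(query):
        options, ordered, word = match.groups()
        if options is not None:  # OPTIONSET: any one of the listed words
            alts = "|".join(re.escape(w) for w in _words(options))
            parts.append(f"(?:{alts}) ")
        elif ordered is not None:  # ORDERSET: the listed words in any order
            perms = itertools.permutations(_words(ordered))
            alts = "|".join(" ".join(re.escape(w) for w in p) for p in perms)
            parts.append(f"(?:{alts}) ")
        elif word == "?":  # QMARK: exactly one arbitrary word
            parts.append(r"\S+ ")
        elif word == "*":  # ASTERISK: zero up to MAX_STAR_WORDS words
            parts.append(r"(?:\S+ ){0,%d}" % MAX_STAR_WORDS)
        else:  # literal word
            parts.append(re.escape(word) + " ")
    return re.compile("".join(parts))


def search(query: str, counts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return the matching n-grams, most frequent first."""
    pattern = query_to_regex(query)
    hits = [(ng, c) for ng, c in counts.items() if pattern.fullmatch(ng + " ")]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up counts for illustration only; these are not real dataset frequencies.
TOY_COUNTS = {
    "a long time ago in": 120,
    "long a time ago in": 1,
    "a long time": 900,
    "a long way": 400,
}

if __name__ == "__main__":
    print(search("{a, time, ago, long} in", TOY_COUNTS))
    print(search("a long *", TOY_COUNTS))
```

Scanning every n-gram with a regular expression is only practical for toy data; a real service over a large n-gram dataset would need an index.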
Using the API
To use the PhrasIt web service from another application, you can send API calls directly to a running PhrasIt server instance.
Configured server instances are:

    // query "hello ?" without callback
    {
      "query": "hello ?",
      "result": [ { "hello . ": 115091 }, ..., { "hello everybody ": 131 } ],
      "time": 32.769
    }

    // query "hello *" with callback handler
    (
    {
      "query": "hello *",
      "result": [ { "hello . ": 115091 }, ..., { "hello ! ": 133 } ],
      "time": 6.499
    }
    );
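
A client has to handle both response shapes shown above. The sketch below works on the response text only, because the endpoint URL and request parameter names are not given here. The function name `parse_response` is hypothetical, and the assumption that a callback response is the same JSON object wrapped in `name( ... );` is inferred from the example rather than from a specification.

```python
import json
import re
from typing import List, Tuple

# Optional callback name, then "(", the payload, ")", and an optional ";".
_JSONP_RE = re.compile(r"^\s*[\w$.]*\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def parse_response(text: str) -> List[Tuple[str, int]]:
    """Return (n-gram, count) pairs from a plain or callback-wrapped response."""
    wrapped = _JSONP_RE.match(text)
    payload = json.loads(wrapped.group(1) if wrapped else text)
    pairs: List[Tuple[str, int]] = []
    # "result" is a list of single-entry objects: {"<n-gram> ": <count>}.
    for entry in payload["result"]:
        for ngram, count in entry.items():
            pairs.append((ngram.strip(), count))  # drop the trailing space
    return pairs


if __name__ == "__main__":
    # Sample built from the two entries visible in the example response.
    sample = (
        '{"query": "hello ?", '
        '"result": [{"hello . ": 115091}, {"hello everybody ": 131}], '
        '"time": 32.769}'
    )
    print(parse_response(sample))
    print(parse_response("handler(" + sample + ");"))
```

Stripping the wrapper before parsing lets the same function serve both cases, so callers do not need to know whether a callback was requested.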
Development
For development and for setting up a local PhrasIt server, please see the GitHub project page.