About

PhrasIt is an open-source clone of Netspeak, developed for experiments (other datasets, other operators, larger n-grams, ...) that need a locally installed 'writing lookup' service. None of these extensions are possible with Netspeak itself, because it is only a web service for the google-web-ngram dataset and its source code is not available. In recent years there have been no visible changes or improvements to Netspeak. The idea behind PhrasIt is to create a similar system in which changes and new ideas can be implemented easily. For example, you can build your own service from a German n-gram dataset, or build a PhrasIt instance that uses scientific papers as its source.

This web service is not optimized for speed; it is only a demonstration of a live system.

As its data source, PhrasIt uses the google-book-ngram dataset, because this dataset is freely available, in contrast to the google-web-ngram dataset that Netspeak uses. If you build your own PhrasIt instance, however, you can use any suitable data source, including a dataset you build yourself. For more information about other datasets, see the Development section.

PhrasIt can be used as a tool to assist writers who are not native speakers (not only of English). For example, if you are writing an English article or scientific paper and are unsure about the order of a few words, you can use PhrasIt to check the ordering; you can also use it when you are unsure which word fits best in a given context.

Query Operators

PhrasIt accepts several kinds of queries. The following table summarizes all operators.

Query operator | Description | Example
QMARK, ? | Matches n-grams in which ? stands for exactly one arbitrary word. | hello ? or ? world; both should return 'hello world'.
ASTERISK, * | Matches n-grams in which * stands for any number of words (e.g. from no word up to 4 words). | hello * or * world; both should return results such as 'hello world' and 'hello my world'.
OPTIONSET, [] | Matches n-grams for the query, checking every word listed in []. | hello [world, what]; returns 'hello world' as the best-matching candidate.
ORDERSET, {} | Checks the ordering of the words listed in {}. | {world, hello}; should return 'hello world' as the best-matching candidate.
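
To make the operator semantics more concrete, here is a toy matcher in Python. It is only a sketch of one possible reading of the table and not PhrasIt's actual implementation: the function names parse_query and matches are hypothetical, and it assumes that [] is filled by exactly one of the listed words, that {} accepts the listed words in any order, and that * covers from zero up to 4 words. It also ignores frequency ranking and only answers whether a given n-gram matches a query.

```python
import re

# Splits a query into [..] groups, {..} groups, and plain tokens.
_TOKEN = re.compile(r"\[[^\]]*\]|\{[^}]*\}|\S+")


def parse_query(query):
    """Turn a query string into a list of (kind, value) elements."""
    elements = []
    for tok in _TOKEN.findall(query):
        if tok == "?":
            elements.append(("any", None))
        elif tok == "*":
            elements.append(("many", None))
        elif tok[0] in "[{":
            # Words inside a set may be separated by commas and/or spaces.
            words = [w for w in re.split(r"[,\s]+", tok[1:-1].strip()) if w]
            elements.append(("option" if tok[0] == "[" else "order", words))
        else:
            elements.append(("word", tok))
    return elements


def _match(elements, tokens, max_star):
    if not elements:
        return not tokens  # the whole n-gram must be consumed
    (kind, value), rest = elements[0], elements[1:]
    if kind == "word":
        return bool(tokens) and tokens[0] == value and _match(rest, tokens[1:], max_star)
    if kind == "any":
        return bool(tokens) and _match(rest, tokens[1:], max_star)
    if kind == "many":
        # Assumption: * consumes between 0 and max_star words.
        limit = min(max_star, len(tokens))
        return any(_match(rest, tokens[k:], max_star) for k in range(limit + 1))
    if kind == "option":
        # Assumption: exactly one of the listed words fills this position.
        return bool(tokens) and tokens[0] in value and _match(rest, tokens[1:], max_star)
    if kind == "order":
        # Assumption: all listed words appear here, in any order.
        n = len(value)
        return (len(tokens) >= n and sorted(tokens[:n]) == sorted(value)
                and _match(rest, tokens[n:], max_star))
    raise ValueError(f"unknown element kind: {kind}")


def matches(query, ngram, max_star=4):
    """Return True if the whitespace-tokenized n-gram satisfies the query."""
    return _match(parse_query(query), ngram.split(), max_star)
```

Under these assumptions, matches("hello ?", "hello world") and matches("{world, hello}", "hello world") both hold, while matches("hello ?", "hello my world") does not, because ? stands for exactly one word.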

Complex Examples

All operators can be combined to build more complex queries.

Query | Description
world outside [of, in] science fiction | You are not sure which word is correct.
humans ? intelligent | You are not sure which word fits best.
{a, time, ago, long} in | You do not know the ordering.
a long * | You want to know the most frequently used next words in this context.
a * [in,of] ? | A combined query.
{it,do} * not | Another combined query.

Using the API

If you want to use the PhrasIt web service in another application, you can make API calls directly to a running PhrasIt server instance.

Configured server instances are:

You can now run a few example queries against one of the configured servers. In general, every API call returns a JSON-formatted result; if you specify a callback function, the result is returned as JSONP, wrapped in a call to that callback.
// query "hello ?" without callback
{
    "query": "hello ?",
    "result": [
        {
            "hello . ": 115091
        }, ...,
        {
            "hello everybody ": 131
        }
    ],
    "time": 32.769
}
// query "hello *" with callback
handler (
    {
        "query": "hello *",
        "result": [
            {
                "hello . ": 115091
            }, ...,
            {
                "hello ! ": 133
            }
        ],
        "time": 6.499
    }
);
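
A client has to handle both response shapes shown above. The following Python sketch covers only the response side, because the request URL and parameter names are not given here. The function name parse_response is hypothetical. It assumes the body has the "query", "result", and "time" fields from the examples, and it strips the trailing space that the example n-gram keys carry.

```python
import json
import re

# Matches a JSONP wrapper such as: handler ( {...} );
_JSONP = re.compile(r"[A-Za-z_$][\w$.]*\s*\((.*)\)\s*;?", re.DOTALL)


def parse_response(body):
    """Parse a JSON or JSONP response body into a plain dictionary.

    Returns {"query": str, "time": number, "ngrams": [(ngram, count), ...]}.
    """
    text = body.strip()
    wrapped = _JSONP.fullmatch(text)
    if wrapped:
        # JSONP: keep only the JSON payload inside the callback call.
        text = wrapped.group(1)
    data = json.loads(text)
    # Each entry in "result" is a one-key object: {"<ngram>": <count>}.
    ngrams = [
        (ngram.strip(), count)
        for entry in data["result"]
        for ngram, count in entry.items()
    ]
    return {"query": data["query"], "time": data["time"], "ngrams": ngrams}
```

Flattening the result into (ngram, count) pairs keeps the order returned by the server and is usually easier to iterate over than a list of single-key objects.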

Development

For development and for setting up a local PhrasIt server, see the GitHub project page.