spine of a book titled Grammar and composition

LanguageTool is an AI-powered Grammar Checker you can self-host on your own hardware

David Rutland
David Rutland Software

 

If you write or edit for a living, it can be worse than embarrassing when you mess up your sentence structure, repeat words, or misplace your punctuation in a paragraph.

In this writer's day job as an editor at SlashGear, it's the kind of thing that could probably result in a career-terminating implosion. Here at Linux Impact? Not so much.

So it should be no surprise that we run to online grammar checking tools several times per day, to make sure that we're not imperilling our livelihood with an excess of em dashes.

Maybe you're not a huge fan of cloud-based grammar checkers, preferring instead to run software locally. Maybe you're just fed up with being nagged to hand over your email address, create an account, or sign in with Google.

LanguageTool is a full-featured Open Source tool you can run on your own machine or network to keep your sentences clean, and mind your Ps and Qs.

What's wrong with Grammarly?

Screenshot of the Grammarly homepage

Grammarly is probably the best-known name in the grammar-checking world, and is used by writers, editors, students, and probably teachers across the world. People use it before sending emails, submitting essays, and while drafting correctly formatted love poetry to their crush or life partner.

The biggest problem we have with Grammarly is that you need an account to use it. Sure, it's free to sign up, and you can even sign in with Google, but signing up to online services is anathema to us here at Linux Impact. We live in dread that the various services we do have accounts with will inevitably be hacked.

You'll also have to agree to the Grammarly Terms of Service and Privacy Policy. While the company swears that, "we don’t and will not sell your data," we're naturally mistrustful of such pledges.

According to Grammarly users of our acquaintance (we've never tried it ourselves), the service has severe limitations for free users, and is heavy-handed in its attempts to upsell users on the premium plan.

There's also the fact that the analysis takes place in the cloud, on someone else's machine.

Maybe you want the extra privacy afforded by running your erotic literary masterpiece through software on your own premises.

LanguageTool is a pretty decent Grammarly alternative

LanguageTool web interface showing issues only available to premium users

LanguageTool is another online service that its developers describe as, "an AI-based spelling, style, and grammar checker that helps correct or paraphrase texts across languages."

For the purposes of this article, we visited the site to try it out. It didn't require us to create an account, and we were able to plop in the 1200-odd unfinished words we've written so far.

It highlighted nine spelling mistakes and 14 grammar issues. It also flagged a further two grammar issues, one punctuation issue, and 11 style issues, but the details of those are only available to "premium" users. Irritating!

LanguageTool also comes with extensions for all major browsers.

Run LanguageTool on your own machine

As well as being an online service, LanguageTool is Open Source software, meaning that you can host it on your own machine.

While there are still some premium features you won't be able to use - including breaking down overly long sentences such as this one into shorter ones - it's far more convenient than using the online version, and you can rest assured that the AI scanning your top-secret movie treatment is residing on your own machine and won't betray your secrets to anyone else.

How to install LanguageTool on Linux

You can get full and detailed instructions on how to install LanguageTool on Linux from the official LanguageTool GitHub repository, but we found it easier to use a community-contributed dockerised version.

Before you get started, make sure you have installed the latest version of Docker and Docker Compose, then open up a terminal and clone the Git repository with:

git clone https://github.com/meyayl/docker-languagetool.git

Use the cd command to move into your newly created directory:

cd docker-languagetool/

You'll need to change permissions on a couple of subdirectories:

sudo chmod 777 ngrams
sudo chmod 777 fasttext

Back up the supplied Docker Compose file:

mv docker-compose.yml docker-compose.yml.old

And create a new one:

nano docker-compose.yml

The following example configuration is supplied by the developers, and we can confirm that it works as-is. Paste it into the new file:

---
version: "3.8"

services:
  languagetool:
    image: meyay/languagetool:latest
    container_name: languagetool
    restart: always
    cap_drop:
      - ALL
    cap_add:
      - CAP_SETUID
      - CAP_SETGID
      - CAP_CHOWN
    security_opt:
      - no-new-privileges
    ports:
      - 8010:8010
    environment:
      download_ngrams_for_langs: en
      langtool_languageModel: /ngrams
      langtool_fasttextModel: /fasttext/lid.176.bin
    volumes:
      - ./ngrams:/ngrams
      - ./fasttext:/fasttext

Save and exit nano with Ctrl + O then Ctrl + X.

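Bring up Docker Compose in detached mode with the command below. If you installed Compose as a Docker plugin rather than as a standalone binary, type docker compose (with a space) in place of docker-compose: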

docker-compose up -d

...and go and make a cup of tea while it downloads the necessary images and sets up containers on your system. On our system this step took around half an hour, mainly due to downloading and unpacking ngrams-en.zip, which comes in at a mighty 4(ish) GB.

Check everything is working with:

docker-compose ps
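
The ps output only tells you that the container is up. If you want to be sure the server is actually answering, here's a rough sketch that pokes its HTTP API with curl. It assumes curl is installed and that the server is on localhost port 8010, as in the Compose file above. The lt_url and lt_smoke_test function names, and the LT_HOST and LT_PORT variables, are our own inventions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build a URL for the local LanguageTool HTTP API.
# LT_HOST and LT_PORT are illustrative names; the defaults match the
# port mapping in the Compose file above.
lt_url() {
    printf 'http://%s:%s/v2/%s\n' "${LT_HOST:-localhost}" "${LT_PORT:-8010}" "$1"
}

# Hypothetical helper: post one deliberately clumsy sentence to /v2/check.
# A JSON reply containing a "matches" list means the server is answering.
lt_smoke_test() {
    curl --silent --show-error \
        --data-urlencode 'text=This are a test sentence.' \
        --data 'language=en-US' \
        "$(lt_url check)"
}
```

Paste the functions into your terminal and run lt_smoke_test. If curl complains that the connection was refused, the container may still be starting up.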

That's it!

One thing to note is the restart: always line of the Docker Compose file. This means that the container will start whenever the Docker service does (usually when your computer boots) and, as you might imagine, will restart if it ever stops on its own.

It's handy if you're going to be using LanguageTool on a regular basis, but if you'd rather it not start automagically, remove this line.

Add extensions for your self-hosted LanguageTool server

Self-hosted LanguageTool extension running in NextCloud notes

LanguageTool isn't designed to be a standalone service you access through your browser. Instead, it's meant to be integrated into software such as OpenOffice and LibreOffice. We don't use either of these, and do most of our writing in NextCloud Notes.

Fortunately, there are extensions available for both Chrome and our preferred browser, Firefox.

Install the extension, click through all the "OK, Got it" buttons, then click on Preferences in the extension menu.

Scroll down to the bottom, ignoring exhortations to "Log in with your LanguageTool account," and click on Advanced settings (only for professional users). Select the Other server radio button, and enter http://localhost:8010/v2.

Hit save, and you're done.

Use LanguageTool locally to check your grammar

LanguageTool dedicated web editor extension

By default, LanguageTool will run on any web page which has a text entry field. Looking at the bottom of our NextCloud Notes document, we can see that there are 27 errors, and they're all highlighted in the text as well, underlined in red, yellow, or purple. Some of these are bullshit that we don't care about, such as using hyphens instead of em dashes. You can accept the suggestions, ignore them, or disable a particular error class altogether.

It's also possible to copy and paste entire screeds into a dedicated LanguageTool editor in order to check them in a flat environment. To open the LanguageTool editor, click on the extension icon, then Open Editor.

One of the advantages of using the self-hosted version of LanguageTool rather than the version hosted by the company is that there's no limit on word count - we were able to paste in and check the entirety of Cory Doctorow's excellent Eastern Standard Tribe. The web version seems to top out at a few thousand words with an exhortation to "Upgrade to Premium and get advanced grammar and style suggestions for longer texts."
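
You can also skip the browser and send a long text straight to the server. The following is a minimal sketch, assuming curl is installed and the server is reachable at http://localhost:8010 as configured above. The lt_check_file function name, the LT_SERVER variable, and the novel.txt filename are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: send the contents of a text file to the /v2/check
# endpoint of a self-hosted LanguageTool server and print the JSON reply.
# Usage: lt_check_file FILE [LANGUAGE_CODE]
lt_check_file() {
    file=$1
    lang=${2:-en-US}
    if [ ! -r "$file" ]; then
        echo "lt_check_file: cannot read '$file'" >&2
        return 1
    fi
    # "text@FILE" tells curl to read the file, URL-encode its contents,
    # and send them as the "text" form field.
    curl --silent --show-error \
        --data-urlencode "text@${file}" \
        --data-urlencode "language=${lang}" \
        "${LT_SERVER:-http://localhost:8010}/v2/check"
}
```

Something like lt_check_file novel.txt en-GB > report.json would save the results for later. The reply is JSON, with one entry in its matches list for each issue found.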

Open your LanguageTool instance to the wider web

Of course, there's nothing stopping you from sticking LanguageTool behind a reverse proxy, so you can access it from anywhere. The only issue would be that it may attract other users, who see your resource as free, and for this reason, we'd advise against it.
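
It's worth knowing that, by default, a port mapping written as 8010:8010 publishes the port on every network interface of the host, not just localhost, so other machines on your network may already be able to reach your server. As a rough sketch, the following reads the output of docker port and reports whether a mapping is bound to all interfaces. The is_public_binding function name is hypothetical, and we're assuming docker port prints lines shaped like 8010/tcp -> 0.0.0.0:8010.

```shell
#!/bin/sh
# Hypothetical helper: read `docker port CONTAINER` output on stdin and
# succeed if any mapping is bound to all interfaces (0.0.0.0 or IPv6 ::).
# Assumes lines shaped like "8010/tcp -> 0.0.0.0:8010".
# [[] and []] are portable ways of matching literal square brackets.
is_public_binding() {
    grep -Eq -e '-> (0\.0\.0\.0|[[]::[]]|::):[0-9]+$'
}
```

Run it with docker port languagetool | is_public_binding && echo "Published on all interfaces". If you'd rather keep the server to yourself, changing the ports entry in the Compose file to 127.0.0.1:8010:8010 and recreating the container restricts it to the local machine.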