Installing MeCab on OS 10.9.5-10.13

The easiest way to install MeCab and ipadic, at least through macOS 10.13 High Sierra, is to use Homebrew. I have an older Mac and stopped upgrading at 10.13, so YMMV with 10.11+ or more recent releases. This process is in contrast to my previous steps for 10.9.2 and below, where lots of file editing was required. Below I also cover editing MeCab's configuration file to change dictionaries.

Install MeCab via Homebrew

$ brew install mecab
$ brew install mecab-ipadic

Yup, that's it. You also need to install the MeCab library for Python 3 if you want to use MeCab in your scripts (like the projects I've detailed elsewhere on this website):

$ pip install mecab-python3
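
A quick way to confirm that both pieces landed is a check along these lines. This is a minimal sketch rather than part of the original steps: the helper names find_mecab and python_binding_available are made up for illustration, and it only assumes that Homebrew put mecab on your PATH and that mecab-python3 provides the MeCab module.

```python
import shutil


def find_mecab():
    """Return the full path of the mecab executable on PATH, or None if absent."""
    return shutil.which("mecab")


def python_binding_available():
    """Return True if the MeCab module (installed by mecab-python3) can be imported."""
    try:
        import MeCab  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("mecab binary:", find_mecab() or "not found on PATH")
    print("MeCab module importable:", python_binding_available())
```

If the first line reports "not found on PATH", the Homebrew step is the one to revisit. If only the second check fails, the pip step probably ran against a different Python than the one you are running now.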

For reference and further information, see the MeCab homepage.

Change Your MeCab Dictionary

In this example, I'll change the dictionary to the kindai bungo unidic dictionary that I use most often (for modern, i.e. Meiji-Taisho period, written Japanese). You can find all of the NINJAL dictionaries for unidic here.

Open the file /usr/local/etc/mecabrc and change:

dicdir =  /usr/local/lib/mecab/dic/ipadic

to:

dicdir =  /usr/local/lib/mecab/dic/unidic

Make sure you move your preferred dictionary into the unidic directory: copy everything that you downloaded from the NINJAL site in there, with the directory structure intact. Your unidic will now be your default dictionary, and you can use it in MeCab Python (and also in RMeCab, if you're an R user). Remember to use -Owakati as your output option, not -Ochasen.
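
If you would rather script the mecabrc change than edit the file by hand, something like the following could work. It is only a sketch: set_dicdir is a hypothetical helper, the sample text stands in for a real mecabrc, and it assumes the file contains a single uncommented "dicdir = ..." line (MeCab's rc files use ";" for comments). It works on a string rather than the file itself, so you can inspect the result, and back up the original, before writing anything.

```python
import re

# Matches an active (uncommented) "dicdir = ..." line and captures the key part.
DICDIR_LINE = re.compile(r"^([ \t]*dicdir[ \t]*=[ \t]*).*$", flags=re.MULTILINE)


def set_dicdir(mecabrc_text, new_dicdir):
    """Return mecabrc_text with its dicdir setting pointing at new_dicdir."""
    updated, count = DICDIR_LINE.subn(
        lambda match: match.group(1) + new_dicdir, mecabrc_text
    )
    if count == 0:
        raise ValueError("no active dicdir line found")
    return updated


# Stand-in for the contents of /usr/local/etc/mecabrc (illustrative only).
sample = "; Configuration file of MeCab\ndicdir =  /usr/local/lib/mecab/dic/ipadic\n"
print(set_dicdir(sample, "/usr/local/lib/mecab/dic/unidic"))
```

To apply it for real, you would read /usr/local/etc/mecabrc into a string, pass it through set_dicdir, and write the result back.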

Using Custom Dictionaries (unidic)

To actually use kindai bungo, or another unidic dictionary, to process text, I needed to switch from the -Ochasen output option in MeCab (commonly shown in the tutorials and docs I've found) to -Owakati. I relied on Japanese-only writeups like this one (Hatena blog) to learn which options do or don't work with non-default setups.

The -Owakati option returns the tokenized text as a single string, not a data structure with more grammatical information per "word" like -Ochasen (so you'll have to adjust what your script expects as input from this step, compared to most tutorials). See the linked blog post for sample code that helped me a lot in various projects where I was tokenizing early 20th-century text as a first step. The author also covers the NLTKJP context.
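
In Python, the adjustment might look roughly like this. It is a sketch, not the code from the linked post: split_wakati and tokenize are names I made up here, and the sample string is hand-written to show the shape of -Owakati output, not real output from the kindai bungo dictionary. The MeCab.Tagger and parse calls come from mecab-python3.

```python
def split_wakati(wakati_output):
    """Turn the whitespace-separated string from -Owakati into a list of tokens."""
    return wakati_output.split()


def tokenize(text, tagger_args="-Owakati"):
    """Tokenize text with MeCab; requires mecab-python3 and a working dictionary."""
    import MeCab  # imported here so split_wakati stays usable without MeCab

    tagger = MeCab.Tagger(tagger_args)
    return split_wakati(tagger.parse(text))


# Hand-written example of what a -Owakati result looks like (illustrative only).
example_output = "吾輩 は 猫 で ある \n"
print(split_wakati(example_output))
```

Depending on your mecab-python3 version, the tagger may not read the system mecabrc on its own. If so, passing MeCab's -r (rcfile) or -d (dicdir) option in tagger_args, for example "-Owakati -r /usr/local/etc/mecabrc", may help.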

(Deprecated) Install MeCab from source

This is a legacy set of instructions from OS 10.9.2, which I'm leaving up for posterity - but this did NOT actually work for me in the end.

First, I had to do this to make the C compiler work. This may or may not be the same for you.

$ sudo ln -s /usr/bin/gcc /usr/bin/gcc-apple-4.2

Install MeCab:
  • Get the MeCab source
  • Switch to whatever directory you downloaded it to, then...
    $ tar zxfv mecab-0.996.tar.gz
    $ cd mecab-0.996
    $ ./configure
    $ make
    $ make check

Install MeCab Dictionaries:
  • Get the IPA dictionary
  • Switch to whatever directory you downloaded it to, then...
    $ tar zxfv mecab-ipadic-2.7.0-20070801.tar.gz
    $ cd mecab-ipadic-2.7.0-20070801
    $ ./configure --with-charset=utf8
  • You will probably get error "configure: error: mecab-config is not found in your system"
  • Try this solution from hateの日記...
    $ ./configure --with-mecab-config=~/usr/local/bin/mecab-config --prefix=~/usr/local/bin --with-charset=utf8
    (This actually didn't work for me. I ended up using Homebrew to install instead.)
  • Finally:
    $ make
    $ sudo make install

Test MeCab

$ mecab
test    名詞,固有名詞,組織,*,*,*,*

If MeCab prints a line like this after you type a word (here, "test") and press Enter, you have successfully installed MeCab on your Mac.
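
If you want to read that kind of line from a script, MeCab's default output puts the surface form first, then a tab, then a comma-separated feature list, and it ends each sentence with an "EOS" line. Here is a minimal sketch of a parser for it. The function name parse_mecab_line is hypothetical, and the sample string is the output line above with the tab written out explicitly.

```python
def parse_mecab_line(line):
    """Split one line of MeCab's default output into (surface, features).

    Returns None for the "EOS" marker that ends each sentence.
    """
    line = line.rstrip("\n")
    if line == "EOS":
        return None
    surface, _, feature_field = line.partition("\t")
    return surface, feature_field.split(",")


sample = "test\t名詞,固有名詞,組織,*,*,*,*"
surface, features = parse_mecab_line(sample)
print(surface, features[0])
```

With ipadic, the first feature is the part of speech (名詞, noun, in this case). Other dictionaries such as unidic use a different feature layout, so the indexes are not portable between them.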