A good Japanese corpus?

Do you use a Japanese corpus to find example sentences that help you understand the usage of a particular word you’re learning?

I’ve been using http://yourei.jp/, which I think is very good, but I sometimes feel that it takes a lot of effort to find sentences resembling spoken Japanese. Do you know of any good alternatives?

Weblio was always recommended to me by Japanese tutors on iTalki; however, the sentences are often very short and probably aren’t what you’re after.

Yeah, Weblio is my main dictionary and I love it, but I’m looking for example sentences to supplement Weblio’s definitions when they are not enough.

When you said you’re looking for examples of spoken language, it made me think of this fantastic tool called YouGlish, but unfortunately it doesn’t support Japanese. I wish it did. The idea is that you type a word into the box and it finds random YouTube videos that contain that word, then jumps to the right timestamp so you can listen to the sentence where the word was used. Maybe if we all go over there and request Japanese support, they’ll be tempted by the market demand.

In terms of actual corpora, there’s the Corpus of Spontaneous Japanese, although I assume you’re probably after a website or app that is based on such a corpus. For that I only know of Weblio and, to a lesser extent, Tatoeba (which gives example sentences translated into English for a given word, but the sentences are too simple). I didn’t know about yourei, so I believe you’ve reverse helped me :wink:

If I could try to think of a website that hasn’t been mentioned yet, I will just throw Google out there. You can type any word into the box, and… well you know the rest. Despite the potential silliness of that suggestion, I think it could seriously be a useful tool. Maybe someone could create a browser extension or Greasemonkey script that would make it easier to sift through the results and after clicking a link, jump straight to the sentence on the page that contains the word.
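The "jump straight to the sentence that contains the word" part of that idea could be sketched roughly as follows. This is only a minimal illustration in Python, not an actual browser extension: it assumes the page text has already been extracted as a plain string, and the function name `find_sentences` is made up for this example. It also does a plain substring match, so it will not catch conjugated forms.

```python
import re

# Rough sentence splitter: a run of characters that are not sentence-ending
# punctuation or newlines, optionally followed by one 。！？ mark.
_SENTENCE = re.compile(r"[^。！？\n]+[。！？]?")


def find_sentences(text: str, word: str) -> list[str]:
    """Return the sentences in `text` that contain `word` (plain substring match)."""
    sentences = (s.strip() for s in _SENTENCE.findall(text))
    return [s for s in sentences if s and word in s]


if __name__ == "__main__":
    page_text = "今日は雨が降る。明日は晴れるでしょう。雨が好きですか？"
    for sentence in find_sentences(page_text, "雨"):
        print(sentence)
```

A real userscript would have to do the same thing against the page's DOM and then scroll to the match, which this sketch does not attempt.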

Have you tried the Hinoki Project? Natane is their corpus tool but it’s over my head. Natsume is a collocation tool and looks pretty interesting.

Here are some corpus-related links that I found in my notes:

I’ve used the Sentence Search site before, but it doesn’t seem to be working now. If you are using 用例.jp, it’s probably too basic.

Here’s what the usage examples from Kenkyusha’s New Japanese-English Dictionary look like:

I’ve used at various times the examples from edict/wwwjdic (to be handled with caution as some of them are garbage), the Kenkyusha examples, and just googling.

Thank you for all the suggestions!

YouGlish seems great, I really wish there were something like this for Japanese. Although I’m not sure how it would handle homophones :smiley:

Kenkyusha’s dictionary looks really good too, but I would prefer something I can use on my PC and the online version is crazy expensive.

I’ll try using the other resources mentioned and see how I like them :smiley:

It looks like they have a free online version that you need to register for. I tried registering, and while I got the SMS with the code fairly quickly (the next morning), I haven’t received the e-mail yet, so I’m not sure whether it went through.

After some testing, it seems that Natane’s database is rather small: almost every time I searched for something it showed no results, even though yourei shows hundreds of example sentences for the same words. Natsume, on the other hand, really is interesting and usually finds words that Natane doesn’t. Since you can also see example sentences for the collocations it finds, it ends up being a better corpus than their actual corpus lol. Anyway, I like Natsume so far.

I think this is my favourite :smiley: It showed many examples for everything I searched for so far and I like how it groups them. And while it seems that most examples in yourei come from literature, here there are many extracts from blogs and such. It seems to be pretty much exactly what I was looking for.

Works for me, and I kind of like the examples it gives, but again the database is rather small (it often shows no results at all) and it ignores conjugation, which makes it useless for anything but nouns.

And again, no results for slightly less common words.

So, NINJAL LWP for TWC (NLT) definitely joins the list of tools I’m going to use, and I’ll try using Natsume some more too and see how I feel about it in the end. Once again, thank you for all the suggestions :smiley:

I use NLT all the time and find it very useful, though I haven’t used it for example sentences in a while. I like to use it when I’m not sure what the most prevalent form of a word is (which kanji, what okurigana, etc.), because it gives percentages for how often each one shows up in its corpus. I don’t take it as a definitive answer, but it helps me in a lot of situations. (I should really start using the example sentences again, too!)

I use ALC’s dictionary to get lots of examples of short phrases together with their English translations. These phrases may be too short for your purposes.

Actually that’s a simple yet great suggestion. Not exactly what I had in mind, but I can see it being a great addition. I used to use ALC a lot before completely switching to weblio and apparently I forgot how good it was. Thanks!

Also, in the end I got access to Chunagon’s corpora, and they seem to be great too, although I don’t like the interface. Their 日本語日常会話コーパス モニター公開版 is especially interesting because it contains great audio samples, but unfortunately searches often return no results.

If you find a source of good quality audio, but without transcripts, then it is possible to turn it into a sentence bank or a phrase bank.

I use a speech-to-text web service, with a monthly free allowance, which returns a surprisingly accurate transcript. I then run one of two scripts, which split the transcript into either full sentences (easy) or short phrases (a bit more difficult), generate an SRT file from the result, and finally run subs2srs.
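For anyone curious what the "generate an SRT file" step might look like, here is a minimal sketch in Python. It is not the scripts described above. It assumes the speech-to-text service returns timed segments as `(start_seconds, end_seconds, text)` tuples, which is a hypothetical format, and the names `format_timestamp` and `segments_to_srt` are invented for this example.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp format HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) segments.

    The segment format is an assumption for this sketch; a real
    speech-to-text service will return its own structure.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    # Each block ends with a newline, so joining with "\n" leaves the
    # blank line that separates SRT entries.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [(0.0, 1.5, "こんにちは。"), (1.5, 3.0, "元気ですか？")]
    print(segments_to_srt(demo))
```

The resulting text can be saved as a `.srt` file next to the audio and then fed to subs2srs; splitting into short phrases instead of full sentences would need extra logic that this sketch leaves out.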

Oh, sorry, that’s not what I meant. 日本語日常会話コーパス モニター公開版 has example sentences with both audio and text for its entries, but searching for a word often returns no results at all (the word isn’t in their database).

YouGlish (mentioned above) now supports Japanese:

Type in a word and it pops up random YouTube videos positioned to the sentence where that word is spoken, so you get example sentences in context and you get to hear the pronunciation.

That’s great to hear! Thanks for sharing :smiley: