12

What software or online tools can detect the parts of speech in a piece of written text (e.g. to predict if a word is a verb, adjective, etc.)?

Mou某
Village
  • 1
    Very, very useful question. I had never thought of doing that, but now I can't see how I lived without it all this time. – magnetar Dec 25 '11 at 21:14

4 Answers

6

There is a web demo system called ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System). It was developed by the Institute of Computing Technology, Chinese Academy of Sciences.

There is also a web demo system for THULAC (THU Lexical Analyzer for Chinese), which was developed by the Natural Language Processing Group at Tsinghua University.

NOTE: I don't know (and cannot test) whether these sites can be accessed freely from everywhere in the world.
NOTE2: As with any language software tool, take the results with a grain of salt; they aren't perfect.

brc
fefe
6

I have had good results with the Stanford POS tagger.

dusan
cburgmer
0

Just type the word into Google followed by "definition", for example: run definition. A large box with the definition will appear, and the first entry usually shows what part of speech the word is.

James
0

Update:

Tencent NLP

Boson NLP

There are many lexical analyzers for Chinese, but they are not built for everyday use; they are typically used in Chinese search engines.

I found that 语言云 (Language Cloud) has an online demo. I tested it by entering a sentence from today's news:

"北京时间2月27日凌晨1点,FIFA特别代表大会选举中,因凡蒂诺击败萨尔曼、阿里王子、尚帕涅,成功当选新任国际足联主席,瑞士人的任期至2019年。"

The default result seemed complicated, so I deselected "语义依存分析" (semantic dependency parsing) and "语义角色标注" (semantic role labeling).

These tools cannot be 100% accurate. For example, "选举" in the sentence is a noun, not a verb (so the SBV relation is also wrong; SBV means subject-verb). The biggest mistake is that it split 因凡蒂诺 (sorry, Mr. Infantino) into 因 and 凡蒂诺.

These errors are understandable, and the tool is still helpful.
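
If you want to spot this kind of mistake systematically, one option is to compare the tagger's output with a hand-corrected version. Below is a minimal sketch in plain Python. It assumes you have copied the result into a simple `word/tag` format separated by spaces, which is my own convention for illustration and not necessarily how any of these tools export their results. The tags and the helper names (`parse_tagged`, `to_spans`, `disagreements`) are likewise hypothetical. Tokens are compared by character position, so a wrong split and a wrong tag both show up.

```python
def parse_tagged(line, sep="/"):
    """Split 'word/tag word/tag ...' into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition keeps any earlier separators inside the word itself
        word, _, tag = token.rpartition(sep)
        pairs.append((word, tag))
    return pairs


def to_spans(pairs):
    """Attach character offsets so differently segmented text can be compared."""
    spans, pos = [], 0
    for word, tag in pairs:
        spans.append((pos, pos + len(word), word, tag))
        pos += len(word)
    return spans


def disagreements(predicted, gold):
    """Return predicted tokens whose boundaries or tag differ from the checked version."""
    gold_spans = set(to_spans(gold))
    return [(word, tag)
            for (start, end, word, tag) in to_spans(predicted)
            if (start, end, word, tag) not in gold_spans]


# Illustrative data only: a fragment of the sentence above with made-up tags.
predicted = parse_tagged("选举/v 中/nd ,/wp 因/p 凡蒂诺/nh 击败/v 萨尔曼/nh")
gold = parse_tagged("选举/n 中/nd ,/wp 因凡蒂诺/nh 击败/v 萨尔曼/nh")

for word, tag in disagreements(predicted, gold):
    print(word, tag)
```

With this sample data, the tokens reported are 选举 (wrong tag) plus 因 and 凡蒂诺 (wrong split), which matches the errors described above.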

sfy