
I have a project that does some indexing for full-text search; for this I use Hadoop. I'm getting the error "GC overhead limit exceeded":

    Task TASKID="tip_201610111152_0066_r_000033" TASK_TYPE="REDUCE" TASK_STATUS="FAILED" FINISH_TIME="1512484551448"
    ERROR="java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.hadoop.io.SequenceFile$CompressedBytes.writeUncompressedBytes(SequenceFile.java:505)
        at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.getNext(ReduceTask.java:206)
        at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.next(ReduceTask.java:168)
        at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:234)
        at org.apache.nutch.crawl.CrawlDbReducer.reduce(CrawlDbReducer.java:62)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:322)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1743)

It happens on the last reduce tasks left to run.

I have already set this configuration:

    export HADOOP_DATANODE_OPTS="-Xmx10g"

But it did not work; I have re-run the indexing several times and it always fails with the same error in the last reduce. Any idea what it might be?
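
From what I understand, HADOOP_DATANODE_OPTS only applies to the DataNode daemon, not to the reduce task JVMs launched by the TaskTracker, so I am wondering whether the heap has to be raised for the child tasks instead. A minimal sketch of what I have in mind, assuming classic MapReduce configured via mapred-site.xml (the 2 GB value is just a guess on my part):

    <!-- mapred-site.xml: heap given to each map/reduce child JVM (value is a guess) -->
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx2048m</value>
    </property>

Is that the right place to set it, or could something else be causing this?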

Thanks.
