52

I want to parse Java source files and extract the source code of each method.

I need a method like this:

/** Returns a map with key = method name ; value = method source code */
Map<String,String> getMethods(File javaFile);

Is there a simple way to achieve this, such as a library that would help me build this method?

glmxndr

7 Answers

62

Download JavaParser from https://javaparser.org/

You'll have to write some code that invokes the parser; it returns a CompilationUnit:

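            CompilationUnit cu = null;
            // try-with-resources closes the stream even when parsing fails,
            // and avoids the NullPointerException that in.close() would throw
            // in a finally block if the stream could not be opened.
            try (InputStream in = new SEDInputStream(filename))
            {
                    cu = JavaParser.parse(in);
            }
            catch(ParseException x)
            {
                 // handle parse exceptions here.
            }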
            return cu;

Note: SEDInputStream is a subclass of InputStream. You can use a FileInputStream instead if you want.


You'll have to create a visitor. Your visitor will be easy because you're only interested in methods:

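  public class MethodVisitor extends VoidVisitorAdapter<Object>
  {
        @Override
        public void visit(MethodDeclaration n, Object arg)
        {
             // extract method information here.
             // put in to hashmap

             // keep walking so methods of nested and anonymous classes are visited too
             super.visit(n, arg);
        }
  }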

To invoke the visitor, do this:

  MethodVisitor visitor = new MethodVisitor();
  visitor.visit(cu, null);
koppor
arcticfox
  • 3
    Great answer. Appreciate the effort. Thanks. – glmxndr Feb 05 '10 at 10:49
  • 4
    Great answer is great. Thanks, it helps people even today ;) – dantuch Jul 13 '12 at 20:03
  • 2
    The project is no longer maintained. Check http://code.google.com/p/javaparser/issues/detail?id=9#c32 which leads you to https://github.com/matozoid/javaparser – jedierikb Jan 06 '13 at 17:33
  • 4
    The project is maintained at https://github.com/javaparser/javaparser and we released version 2.1 (fully supporting Java 8) a few weeks ago. Enjoy! – Federico Tomassetti Jun 08 '15 at 20:31
  • @FedericoTomassetti Did you implement everything from scratch? I don't know much about Java but doesn't the java compiler have the parser classes? – Kenji Noguchi Dec 12 '16 at 04:45
  • I am a contributor; I did not write JavaParser from scratch. No, there is no such functionality in the JDK – Federico Tomassetti Dec 12 '16 at 07:53
  • @FedericoTomassetti This is not true, http://docs.oracle.com/javase/8/docs/api/javax/tools/package-summary.html contains a blazing fast parser as part of the compilation pipeline. – Lee Jan 11 '17 at 04:14
  • 1
    @Lee, yes I know it has a compiler, but this is different from a parser. A compiler needs to resolve symbols, so you need to have the dependencies to use that. There is code that can be parsed but cannot be compiled (because of semantic errors). So they are not the same thing. The parser is the first step of a compiler. JavaParser lets you build an AST, modify it and write the code back. I am not aware that you can do that with the compiler API. Can you? – Federico Tomassetti Jan 11 '17 at 08:37
  • 2
    The Javac compiler API has a fully accessible parsing API within the JDK. It's a bit convoluted, but you can get the system compiler (`ToolProvider.getSystemJavaCompiler()`), get its `JavacTask` via `compiler.getTask(...)`, and have it parse via `task.parse()`, which returns a collection of `CompilationUnitTree`s. The Sun/Oracle parser is actually much faster than even the ECJ parser, though it does not have the same level of error-inference that ECJ is capable of (for example, ECJ can give suggestions on what you mean, or partially parse code that is "mostly" correct). – Lee Jan 11 '17 at 23:42
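The JDK-only route described in the comment above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `JdkMethodExtractor`, the helper `StringSource`, and the `getMethods(String, String)` signature (source text instead of a `File`) are assumptions made for illustration. It needs a JDK (not a bare JRE), parses without compiling, and keeps only the last method when several overloads share a name.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.SourcePositions;
import com.sun.source.util.TreeScanner;
import com.sun.source.util.Trees;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JdkMethodExtractor {

    /** Hypothetical helper: an in-memory source file handed to the compiler API. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String name, String code) {
            super(URI.create("string:///" + name), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns a map with key = method name ; value = method source code. */
    public static Map<String, String> getMethods(String fileName, String source) throws IOException {
        // getSystemJavaCompiler() returns null when no compiler is available (e.g. on a JRE).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found; a JDK is required");
        }

        JavaFileObject file = new StringSource(fileName, source);
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
        SourcePositions positions = Trees.instance(task).getSourcePositions();

        Map<String, String> methods = new LinkedHashMap<>();
        // parse() only builds the syntax trees; symbols are not resolved.
        for (CompilationUnitTree unit : task.parse()) {
            new TreeScanner<Void, Void>() {
                @Override
                public Void visitMethod(MethodTree method, Void unused) {
                    int start = (int) positions.getStartPosition(unit, method);
                    int end = (int) positions.getEndPosition(unit, method);
                    if (start >= 0 && end >= start) {
                        // Overloads share a name, so a later one replaces an earlier one.
                        methods.put(method.getName().toString(), source.substring(start, end));
                    }
                    // Continue into the body to reach methods of local and anonymous classes.
                    return super.visitMethod(method, unused);
                }
            }.scan(unit, null);
        }
        return methods;
    }
}
```

To get the `File`-based signature from the question, you could read the file first, for example `getMethods(javaFile.getName(), Files.readString(javaFile.toPath()))`. Explicit constructors show up under the name `<init>`.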
9

QDox is a more lightweight parser that only parses down to the method level, i.e. the method body is not parsed into statements. It gives you more or less what you ask for, although you would have to navigate the model to find the right name, as it doesn't index classes and methods by name.

aventurin
Felix Leipold
3

You can build your own parser with one of these parser generators:

  1. ANTLR
  2. JavaCC
  3. SableCC

You can also use something ready-made, or study how it works: Java Tree Builder uses JavaCC, and RefactorIt uses ANTLR.

gouessej
Rorick
2

Could you use a custom doclet with JavaDoc?

Doclet.com may provide more information.

Rich
2

JavaCC is a "compiler-compiler" (parser generator) for Java. There are many grammar files for it, including one for Java 1.5.

unwind
2

I think you can use the ANTLR parser generator.

It is easy to use, and the ANTLR site has a lot of examples written in Java.

It also comes with a plug-in for Eclipse.

Upul Bandara
  • Can you please point me to a specific example of how to extract methods using ANTLR? I can parse Java code into single tokens (e.g. 'public', 'static', 'void' ...), but I'm missing the next step. – Cristiano Ghersi Dec 02 '12 at 18:04
0

You can use this repo: https://github.com/minqukanq/method-extractor/blob/main/README.md

method-extractor is a parser for Java and Python.

  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Oct 15 '21 at 01:54
  • More importantly, link only answers are discouraged. Always provide enough content so that your input ANSWERS the question at hand, even if your link breaks. – GhostCat Oct 15 '21 at 10:59