C# - ANTLR grammar for parsing C source code files and getting functions from them

- September 15, 2013

I wrote an ANTLR grammar for parsing functions in C source code files:

    grammar newcfunctions;

    options
    {
        language = CSharp;
    }
    @parser::namespace { generated }
    @lexer::namespace  { generated }

    func
        :function+ { Console.WriteLine("hello"); } // this is for debugging
        ;
    NAME
        :[a-zA-Z]+[a-zA-Z0-9]*
        ;
    TYPENAME
        :   'void'
        |   [a-zA-Z]+
        |   'char'
        |   'short'
        |   'int'
        |   'long'
        |   'float'
        |   'double'
        |   'signed'
        |   'unsigned'
        |   '_Bool'
        |   '_Complex'
        |   '__m128'
        |   '__m128d'
        |   '__m128i'
        |   NAME
        ;
    arguments
        :   (TYPENAME NAME)*
        ;
    NEWLINE
        :   '\r'? '\n' ;
    FUNCTIONBODY
        :   ([a-zA-Z0-9]|NEWLINE)*;
    function
        :   TYPENAME ' ' NAME '(' arguments ')' ' '? NEWLINE? '{' FUNCTIONBODY '}' NEWLINE?
        ;

I generated the C# files and included them in a test project. Its Main function:

    try
    {
        AntlrInputStream input = new AntlrInputStream(Console.In);
        newcfunctionsLexer lexer = new newcfunctionsLexer(input);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        newcfunctionsParser parser = new newcfunctionsParser(tokens);
        parser.func();
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
    }
    Console.ReadKey();

When I enter "void foo(int a){return a;}", it gives me an error: "line 1:0 mismatched input 'void' expecting TYPENAME". Please help me with the grammar! I saw a C grammar on the Internet, but it has 800+ lines and I don't know how to use it. If you know how to use it, please tell me. Thank you!

As has been said, the NAME rule should be placed after the TYPENAME rule. The TYPENAME lexeme should not include the NAME lexeme or [a-zA-Z]+ as alternatives.
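The reason the order matters can be shown with a minimal sketch. When two lexer rules match the same text with the same length, ANTLR 4 picks the rule that is defined first. In the question's grammar NAME came first, so "void" was tokenized as NAME and the parser, which expected TYPENAME, reported the mismatch. The grammar name OrderSketch and the shortened keyword list below are assumptions made for illustration only.

```antlr
lexer grammar OrderSketch;   // hypothetical name, not part of the original grammar

// Both rules can match the text "void" (4 characters each).
// On such a tie the rule listed first wins, so the keyword-like
// rule is placed above the general identifier rule.
TYPENAME : 'void' | 'int' | 'char' ;    // "void" becomes TYPENAME
NAME     : [a-zA-Z]+ [a-zA-Z0-9]* ;     // "foo" becomes NAME
```

A longer identifier such as "voidx" still becomes NAME, because the lexer prefers the longest match before it considers rule order.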

So, the final version:

    grammar newcfunctions;

    options
    {
        language = CSharp;
    }
    @parser::namespace { generated }
    @lexer::namespace  { generated }

    func
        : function+ { Console.WriteLine("hello"); } // this is for debugging
        ;
    function
        : typename ' ' NAME '(' arguments ')' ' '? NEWLINE? '{' functionbody '}' NEWLINE?
        ;
    arguments
        : (typename NAME)*
        ;
    typename
        : TYPENAME
        | NAME
        ;
    functionbody
        : (TYPENAME | NAME | NEWLINE)*
        ;
    TYPENAME
        :   'void'
        |   'char'
        |   'short'
        |   'int'
        |   'long'
        |   'float'
        |   'double'
        |   'signed'
        |   'unsigned'
        |   '_Bool'
        |   '_Complex'
        |   '__m128'
        |   '__m128d'
        |   '__m128i'
        ;
    NAME
        : [a-zA-Z]+ [a-zA-Z0-9]*
        ;
    NEWLINE
        :   '\r'? '\n' ;

I also advise using channels for newlines and spaces, so that they are ignored during parsing.
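A rough sketch of that advice, assuming ANTLR 4 syntax: whitespace tokens are routed to the hidden channel with `-> channel(HIDDEN)`, so the parser rules no longer need to mention ' ' or NEWLINE. The grammar name HiddenWsSketch, the WS rule, and the shortened TYPENAME list are assumptions for illustration. The sketch still does not cover punctuation such as ';' or ',', so it is not a complete grammar for C functions.

```antlr
grammar HiddenWsSketch;   // hypothetical name

// Parser rules: no explicit ' ' or NEWLINE references any more.
function
    : typename NAME '(' arguments ')' '{' functionbody '}'
    ;
arguments
    : (typename NAME)*
    ;
typename
    : TYPENAME
    | NAME
    ;
functionbody
    : (TYPENAME | NAME)*
    ;

// Lexer rules: keywords first, general identifier after them.
TYPENAME : 'void' | 'char' | 'int' ;      // shortened list for the sketch
NAME     : [a-zA-Z]+ [a-zA-Z0-9]* ;

// Whitespace is still tokenized, but on the hidden channel,
// so the parser never sees these tokens.
WS       : [ \t]+     -> channel(HIDDEN) ;
NEWLINE  : '\r'? '\n' -> channel(HIDDEN) ;
```

The ' ' literals have to be removed from the parser rules for this to work: in a combined grammar a literal used in a parser rule becomes its own implicit token, which would take precedence over WS for a single space.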
