= Design Notes

== Problems:

Translating C to Go is harder than it looks.

Jan says: It's impossible in the general case to turn a C char* into a
Go []byte.  It is probably possible in many concrete cases, depending
also on the author's C coding style. The first problem this runs into
is that Go does not guarantee that a backing array keeps a stable
address, because goroutine stacks are movable. C expects the opposite:
a pointer never magically modifies itself, so some code will fail.

// TODO: insert code examples illustrating the problem here

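A minimal sketch of the address-stability problem, assuming the current
gc toolchain behaviour of copying a goroutine stack when it grows. The
names grow and stackAddrs are made up for this note. The address of a
local array is recorded as a uintptr (an integer the runtime does not
update), the stack is forced to grow, and the address is taken again.
Whether the two values differ depends on the Go version and on escape
analysis, so the program only reports what it observed. C code that
stashes such an address and reuses it later has no equivalent of the
runtime's pointer fix-up.

```go
package main

import (
	"fmt"
	"unsafe"
)

// grow recurses with a 1 KiB frame per call so that a deep enough call
// chain forces the runtime to enlarge (and possibly move) the stack.
func grow(n int) int {
	var pad [1024]byte
	pad[0] = byte(n)
	if n == 0 {
		return int(pad[0])
	}
	return grow(n-1) + int(pad[0])
}

// stackAddrs returns the address of a local array before and after the
// stack has been forced to grow. The uintptr values are plain integers:
// the runtime adjusts real pointers when it moves a stack, not these.
func stackAddrs() (before, after uintptr) {
	var buf [16]byte
	buf[0] = 'x'
	before = uintptr(unsafe.Pointer(&buf[0]))
	grow(1000) // roughly 1 MiB of frames, far beyond the initial stack
	after = uintptr(unsafe.Pointer(&buf[0]))
	return before, after
}

func main() {
	before, after := stackAddrs()
	fmt.Printf("before=%#x after=%#x moved=%v\n", before, after, before != after)
}
```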
== How the parser works

There are no comment nodes in the C AST. Instead every cc.Token has a
Sep field: https://godoc.org/modernc.org/cc/v3#Token

When configured to do so, it captures all white space preceding the
token, including any comments, combined into a single value. So we have
all white space/comment information for every token in the AST. A final
white space/comment, preceding EOF, is available as field
TrailingSeperator in the AST: https://godoc.org/modernc.org/cc/v3#AST.

To get the lexically first white space/comment for any node, use
tokenSeparator():
https://gitlab.com/cznic/ccgo/-/blob/6551e2544a758fdc265c8fac71fb2587fb3e1042/v3/go.go#L1476

comment() does the same but falls back to a default value:
https://gitlab.com/cznic/ccgo/-/blob/6551e2544a758fdc265c8fac71fb2587fb3e1042/v3/go.go#L1467

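The separator idea can be sketched with a toy scanner. This is not the
cc/v3 API: the token type, scan, skipSep and isWord below are
hypothetical stand-ins that only mimic the shape described above, where
each token carries the white space and comments preceding it and
whatever precedes EOF is returned separately. Only /* */ comments are
handled. One consequence worth noting is that concatenating every Sep
and token text, plus the trailing separator, reproduces the input
exactly.

```go
package main

import (
	"fmt"
	"strings"
)

// token is a hypothetical stand-in for cc.Token: Sep holds the white
// space and comments that precede the token text in Src.
type token struct {
	Sep string
	Src string
}

// skipSep returns the index of the first byte at or after i that is
// neither white space nor part of a /* */ comment.
func skipSep(src string, i int) int {
	for i < len(src) {
		switch {
		case strings.ContainsRune(" \t\r\n", rune(src[i])):
			i++
		case strings.HasPrefix(src[i:], "/*"):
			end := strings.Index(src[i+2:], "*/")
			if end < 0 {
				return len(src) // unterminated comment runs to EOF
			}
			i += end + 4 // skip "/*", the body, and "*/"
		default:
			return i
		}
	}
	return i
}

// isWord reports whether c can be part of an identifier or number.
func isWord(c byte) bool {
	return c == '_' || c >= '0' && c <= '9' ||
		c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z'
}

// scan splits src into tokens, each with its preceding separator, and
// returns the separator that precedes EOF as trailing.
func scan(src string) (toks []token, trailing string) {
	i := 0
	for {
		j := skipSep(src, i)
		if j == len(src) {
			return toks, src[i:]
		}
		k := j + 1 // punctuation is a one-byte token
		if isWord(src[j]) {
			for k < len(src) && isWord(src[k]) {
				k++
			}
		}
		toks = append(toks, token{Sep: src[i:j], Src: src[j:k]})
		i = k
	}
}

func main() {
	toks, trailing := scan("/* doc */\nint x; /* tail */\n")
	for _, t := range toks {
		fmt.Printf("Sep=%q Src=%q\n", t.Sep, t.Src)
	}
	fmt.Printf("trailing=%q\n", trailing)
}
```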
== Looking forward

Eric says: In my visualization of how the translator would work, the
output of a ccgo translation of a module at any given time is a file
of pseudo-Go code in which some sections may be enclosed by Unicode
bracketing characters (presently the guillemet quotes U+00AB and
U+00BB) meaning "this is not Go yet". The brackets intentionally make
the Go compiler barf. This expresses a color on the AST nodes: each
node is either still C or already Go.

So, for example, if I'm translating hello.c with a ruleset that does not
include printf -> fmt.Printf, this:

---------------------------------------------------------
#include <stdio.h>

/* an example comment */

int main(int argc, char *argv[])
{
    printf("Hello, World!\n");
}
---------------------------------------------------------

becomes this without any explicit rules at all:

---------------------------------------------------------
«#include <stdio.h>»

/* an example comment */

func main() {
	«printf(»"Hello, World!\n"«)»
}
---------------------------------------------------------

Then, when the rule printf -> fmt.Printf is added, it becomes:

---------------------------------------------------------
import (
	"fmt"
)

/* an example comment */

func main() {
	fmt.Printf("Hello, World!\n")
}
---------------------------------------------------------

because with that rule the AST node corresponding to the printf
call can be translated and colored "Go".  This implies an import
of fmt.  Once we observe that no C-colored spans remain, we drop
the #includes.

// end