# HTML [![API reference](https://img.shields.io/badge/godoc-reference-5272B4)](https://pkg.go.dev/github.com/tdewolff/parse/v2/html?tab=doc)

This package is an HTML5 lexer written in [Go][1]. It follows the specification at [The HTML syntax](http://www.w3.org/TR/html5/syntax.html). The lexer reads from an io.Reader and converts the input into tokens until EOF.

## Installation
Run the following command

	go get -u github.com/tdewolff/parse/v2/html

or add the following import and run `go get` in your project

	import "github.com/tdewolff/parse/v2/html"

## Lexer
### Usage
The following initializes a new Lexer with io.Reader `r`:
``` go
l := html.NewLexer(parse.NewInput(r))
```

To tokenize until EOF or an error, use:
``` go
for {
	// data holds the raw bytes of the current token
	tt, data := l.Next()
	switch tt {
	case html.ErrorToken:
		// error or EOF set in l.Err()
		return
	case html.StartTagToken:
		// ...
		// attributes follow the start tag as separate tokens
		for {
			ttAttr, dataAttr := l.Next()
			if ttAttr != html.AttributeToken {
				// the token that ends the tag is consumed here
				break
			}
			// ...
		}
	// ...
	}
}
```

All tokens:
``` go
ErrorToken TokenType = iota // extra token when errors occur
CommentToken
DoctypeToken
StartTagToken
StartTagCloseToken
StartTagVoidToken
EndTagToken
AttributeToken
TextToken
```

### Examples
``` go
package main

import (
	"fmt"
	"io"
	"os"

	"github.com/tdewolff/parse/v2"
	"github.com/tdewolff/parse/v2/html"
)

// Tokenize HTML from stdin.
func main() {
	l := html.NewLexer(parse.NewInput(os.Stdin))
	for {
		tt, data := l.Next()
		switch tt {
		case html.ErrorToken:
			// EOF is the normal end of input; anything else is a real error
			if l.Err() != io.EOF {
				fmt.Println("Error on line", l.Line(), ":", l.Err())
			}
			return
		case html.StartTagToken:
			fmt.Println("Tag", string(data))
			// read the attributes that belong to this start tag
			for {
				ttAttr, dataAttr := l.Next()
				if ttAttr != html.AttributeToken {
					break
				}

				key := dataAttr
				val := l.AttrVal()
				fmt.Println("Attribute", string(key), "=", string(val))
			}
		// ...
		}
	}
}
```

## License
Released under the [MIT license](https://github.com/tdewolff/parse/blob/master/LICENSE.md).

[1]: http://golang.org/ "Go Language"