gtsocial-umbx

Unnamed repository; edit this file 'description' to name the repository.
Log | Files | Refs | README | LICENSE

doc.go (9875B)


      1 // Copyright 2019 The Go Authors. All rights reserved.
      2 // Use of this source code is governed by a BSD-style
      3 // license that can be found in the LICENSE file.
      4 
      5 /*
      6 Package ppc64 implements a PPC64 assembler that assembles Go asm into
      7 the corresponding PPC64 instructions as defined by the Power ISA 3.0B.
      8 
      9 This document provides information on how to write code in Go assembler
     10 for PPC64, focusing on the differences between Go and PPC64 assembly language.
     11 It assumes some knowledge of PPC64 assembler. The original implementation of
     12 PPC64 in Go defined many opcodes that are different from PPC64 opcodes, but
     13 updates to the Go assembly language used mnemonics that are mostly similar if not
     14 identical to the PPC64 mneumonics, such as VMX and VSX instructions. Not all detail
     15 is included here; refer to the Power ISA document if interested in more detail.
     16 
     17 Starting with Go 1.15 the Go objdump supports the -gnu option, which provides a
     18 side by side view of the Go assembler and the PPC64 assembler output. This is
     19 extremely helpful in determining what final PPC64 assembly is generated from the
     20 corresponding Go assembly.
     21 
     22 In the examples below, the Go assembly is on the left, PPC64 assembly on the right.
     23 
     24 1. Operand ordering
     25 
     26   In Go asm, the last operand (right) is the target operand, but with PPC64 asm,
     27   the first operand (left) is the target. The order of the remaining operands is
     28   not consistent: in general opcodes with 3 operands that perform math or logical
     29   operations have their operands in reverse order. Opcodes for vector instructions
     30   and those with more than 3 operands usually have operands in the same order except
     31   for the target operand, which is first in PPC64 asm and last in Go asm.
     32 
     33   Example:
     34     ADD R3, R4, R5		<=>	add r5, r4, r3
     35 
     36 2. Constant operands
     37 
     38   In Go asm, an operand that starts with '$' indicates a constant value. If the
     39   instruction using the constant has an immediate version of the opcode, then an
     40   immediate value is used with the opcode if possible.
     41 
     42   Example:
     43     ADD $1, R3, R4		<=> 	addi r4, r3, 1
     44 
     45 3. Opcodes setting condition codes
     46 
     47   In PPC64 asm, some instructions other than compares have variations that can set
     48   the condition code where meaningful. This is indicated by adding '.' to the end
     49   of the PPC64 instruction. In Go asm, these instructions have 'CC' at the end of
     50   the opcode. The possible settings of the condition code depend on the instruction.
     51   CR0 is the default for fixed-point instructions; CR1 for floating point; CR6 for
     52   vector instructions.
     53 
     54   Example:
     55     ANDCC R3, R4, R5		<=>	and. r5, r3, r4 (set CR0)
     56 
     57 4. Loads and stores from memory
     58 
     59   In Go asm, opcodes starting with 'MOV' indicate a load or store. When the target
     60   is a memory reference, then it is a store; when the target is a register and the
     61   source is a memory reference, then it is a load.
     62 
     63   MOV{B,H,W,D} variations identify the size as byte, halfword, word, doubleword.
     64 
     65   Adding 'Z' to the opcode for a load indicates zero extend; if omitted it is sign extend.
     66   Adding 'U' to a load or store indicates an update of the base register with the offset.
     67   Adding 'BR' to an opcode indicates byte-reversed load or store, or the order opposite
     68   of the expected endian order. If 'BR' is used then zero extend is assumed.
     69 
     70   Memory references n(Ra) indicate the address in Ra + n. When used with an update form
     71   of an opcode, the value in Ra is incremented by n.
     72 
     73   Memory references (Ra+Rb) or (Ra)(Rb) indicate the address Ra + Rb, used by indexed
     74   loads or stores. Both forms are accepted. When used with an update then the base register
     75   is updated by the value in the index register.
     76 
     77   Examples:
     78     MOVD (R3), R4		<=>	ld r4,0(r3)
     79     MOVW (R3), R4		<=>	lwa r4,0(r3)
     80     MOVWZU 4(R3), R4		<=>	lwzu r4,4(r3)
     81     MOVWZ (R3+R5), R4		<=>	lwzx r4,r3,r5
     82     MOVHZ  (R3), R4		<=>	lhz r4,0(r3)
     83     MOVHU 2(R3), R4		<=>	lhau r4,2(r3)
     84     MOVBZ (R3), R4		<=>	lbz r4,0(r3)
     85 
     86     MOVD R4,(R3)		<=>	std r4,0(r3)
     87     MOVW R4,(R3)		<=>	stw r4,0(r3)
     88     MOVW R4,(R3+R5)		<=>	stwx r4,r3,r5
     89     MOVWU R4,4(R3)		<=>	stwu r4,4(r3)
     90     MOVH R4,2(R3)		<=>	sth r4,2(r3)
     91     MOVBU R4,(R3)(R5)		<=>	stbux r4,r3,r5
     92 
     93 4. Compares
     94 
     95   When an instruction does a compare or other operation that might
     96   result in a condition code, then the resulting condition is set
     97   in a field of the condition register. The condition register consists
     98   of 8 4-bit fields named CR0 - CR7. When a compare instruction
     99   identifies a CR then the resulting condition is set in that field
    100   to be read by a later branch or isel instruction. Within these fields,
    101   bits are set to indicate less than, greater than, or equal conditions.
    102 
    103   Once an instruction sets a condition, then a subsequent branch, isel or
    104   other instruction can read the condition field and operate based on the
    105   bit settings.
    106 
    107   Examples:
    108     CMP R3, R4			<=>	cmp r3, r4	(CR0 assumed)
    109     CMP R3, R4, CR1		<=>	cmp cr1, r3, r4
    110 
    111   Note that the condition register is the target operand of compare opcodes, so
    112   the remaining operands are in the same order for Go asm and PPC64 asm.
    113   When CR0 is used then it is implicit and does not need to be specified.
    114 
    115 5. Branches
    116 
    117   Many branches are represented as a form of the BC instruction. There are
    118   other extended opcodes to make it easier to see what type of branch is being
    119   used.
    120 
    121   The following is a brief description of the BC instruction and its commonly
    122   used operands.
    123 
    124   BC op1, op2, op3
    125 
    126     op1: type of branch
    127         16 -> bctr (branch on ctr)
    128         12 -> bcr  (branch if cr bit is set)
    129         8  -> bcr+bctr (branch on ctr and cr values)
    130 	4  -> bcr != 0 (branch if specified cr bit is not set)
    131 
    132 	There are more combinations but these are the most common.
    133 
    134     op2: condition register field and condition bit
    135 
    136 	This contains an immediate value indicating which condition field
    137 	to read and what bits to test. Each field is 4 bits long with CR0
    138         at bit 0, CR1 at bit 4, etc. The value is computed as 4*CR+condition
    139         with these condition values:
    140 
    141         0 -> LT
    142         1 -> GT
    143         2 -> EQ
    144         3 -> OVG
    145 
    146 	Thus 0 means test CR0 for LT, 5 means CR1 for GT, 30 means CR7 for EQ.
    147 
    148     op3: branch target
    149 
    150   Examples:
    151 
    152     BC 12, 0, target		<=>	blt cr0, target
    153     BC 12, 2, target		<=>	beq cr0, target
    154     BC 12, 5, target		<=>	bgt cr1, target
    155     BC 12, 30, target		<=>	beq cr7, target
    156     BC 4, 6, target		<=>	bne cr1, target
    157     BC 4, 1, target		<=>	ble cr1, target
    158 
    159     The following extended opcodes are available for ease of use and readability:
    160 
    161     BNE CR2, target		<=>	bne cr2, target
    162     BEQ CR4, target		<=>	beq cr4, target
    163     BLT target			<=>	blt target (cr0 default)
    164     BGE CR7, target		<=>	bge cr7, target
    165 
    166   Refer to the ISA for more information on additional values for the BC instruction,
    167   how to handle OVG information, and much more.
    168 
    169 5. Align directive
    170 
    171   Starting with Go 1.12, Go asm supports the PCALIGN directive, which indicates
    172   that the next instruction should be aligned to the specified value. Currently
    173   8 and 16 are the only supported values, and a maximum of 2 NOPs will be added
    174   to align the code. That means in the case where the code is aligned to 4 but
    175   PCALIGN $16 is at that location, the code will only be aligned to 8 to avoid
    176   adding 3 NOPs.
    177 
    178   The purpose of this directive is to improve performance for cases like loops
    179   where better alignment (8 or 16 instead of 4) might be helpful. This directive
    180   exists in PPC64 assembler and is frequently used by PPC64 assembler writers.
    181 
    182   PCALIGN $16
    183   PCALIGN $8
    184 
    185   Functions in Go are aligned to 16 bytes, as is the case in all other compilers
    186   for PPC64.
    187 
    188 6. Shift instructions
    189 
    190   The simple scalar shifts on PPC64 expect a shift count that fits in 5 bits for
    191   32-bit values or 6 bit for 64-bit values. If the shift count is a constant value
    192   greater than the max then the assembler sets it to the max for that size (31 for
    193   32 bit values, 63 for 64 bit values). If the shift count is in a register, then
    194   only the low 5 or 6 bits of the register will be used as the shift count. The
    195   Go compiler will add appropriate code to compare the shift value to achieve the
    196   the correct result, and the assembler does not add extra checking.
    197 
    198   Examples:
    199 
    200     SRAD $8,R3,R4		=>	sradi r4,r3,8
    201     SRD $8,R3,R4		=>	rldicl r4,r3,56,8
    202     SLD $8,R3,R4		=>	rldicr r4,r3,8,55
    203     SRAW $16,R4,R5		=>	srawi r5,r4,16
    204     SRW $40,R4,R5		=>	rlwinm r5,r4,0,0,31
    205     SLW $12,R4,R5		=>	rlwinm r5,r4,12,0,19
    206 
    207   Some non-simple shifts have operands in the Go assembly which don't map directly
    208   onto operands in the PPC64 assembly. When an operand in a shift instruction in the
    209   Go assembly is a bit mask, that mask is represented as a start and end bit in the
    210   PPC64 assembly instead of a mask. See the ISA for more detail on these types of shifts.
    211   Here are a few examples:
    212 
    213     RLWMI $7,R3,$65535,R6 	=>	rlwimi r6,r3,7,16,31
    214     RLDMI $0,R4,$7,R6 		=>	rldimi r6,r4,0,61
    215 
    216   More recently, Go opcodes were added which map directly onto the PPC64 opcodes. It is
    217   recommended to use the newer opcodes to avoid confusion.
    218 
    219     RLDICL $0,R4,$15,R6		=>	rldicl r6,r4,0,15
    220     RLDICR $0,R4,$15,R6		=>	rldicr r6.r4,0,15
    221 
    222 Register naming
    223 
    224 1. Special register usage in Go asm
    225 
    226   The following registers should not be modified by user Go assembler code.
    227 
    228   R0: Go code expects this register to contain the value 0.
    229   R1: Stack pointer
    230   R2: TOC pointer when compiled with -shared or -dynlink (a.k.a position independent code)
    231   R13: TLS pointer
    232   R30: g (goroutine)
    233 
    234   Register names:
    235 
    236   Rn is used for general purpose registers. (0-31)
    237   Fn is used for floating point registers. (0-31)
    238   Vn is used for vector registers. Slot 0 of Vn overlaps with Fn. (0-31)
    239   VSn is used for vector-scalar registers. V0-V31 overlap with VS32-VS63. (0-63)
    240   CTR represents the count register.
    241   LR represents the link register.
    242 
    243 */
    244 package ppc64