+#!/usr/bin/env python
22+# Copyright (c) 2018 Linaro Limited
33+#
44+# This library is free software; you can redistribute it and/or
55+# modify it under the terms of the GNU Lesser General Public
66+# License as published by the Free Software Foundation; either
77+# version 2 of the License, or (at your option) any later version.
88+#
99+# This library is distributed in the hope that it will be useful,
1010+# but WITHOUT ANY WARRANTY; without even the implied warranty of
1111+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1212+# Lesser General Public License for more details.
1313+#
1414+# You should have received a copy of the GNU Lesser General Public
1515+# License along with this library; if not, see <http://www.gnu.org/licenses/>.
1616+#
1717+1818+#
1919+# Generate a decoding tree from a specification file.
2020+#
2121+# The tree is built from instruction "patterns". A pattern may represent
2222+# a single architectural instruction or a group of same, depending on what
2323+# is convenient for further processing.
2424+#
2525+# Each pattern has "fixedbits" & "fixedmask", the combination of which
2626+# describes the condition under which the pattern is matched:
2727+#
2828+# (insn & fixedmask) == fixedbits
2929+#
3030+# Each pattern may have "fields", which are extracted from the insn and
3131+# passed along to the translator. Examples of such are registers,
3232+# immediates, and sub-opcodes.
3333+#
3434+# In support of patterns, one may declare fields, argument sets, and
3535+# formats, each of which may be re-used to simplify further definitions.
3636+#
3737+# *** Field syntax:
3838+#
3939+# field_def := '%' identifier ( unnamed_field )+ ( !function=identifier )?
+# unnamed_field := number ':' ( 's' )? number
4141+#
4242+# For unnamed_field, the first number is the least-significant bit position of
4343+# the field and the second number is the length of the field. If the 's' is
4444+# present, the field is considered signed. If multiple unnamed_fields are
4545+# present, they are concatenated. In this way one can define disjoint fields.
4646+#
4747+# If !function is specified, the concatenated result is passed through the
4848+# named function, taking and returning an integral value.
4949+#
+# FIXME: the fields of the structure into which this result will be stored
+# are restricted to "int", which means that we cannot expand 64-bit items.
5252+#
5353+# Field examples:
5454+#
5555+# %disp 0:s16 -- sextract(i, 0, 16)
5656+# %imm9 16:6 10:3 -- extract(i, 16, 6) << 3 | extract(i, 10, 3)
5757+# %disp12 0:s1 1:1 2:10 -- sextract(i, 0, 1) << 11
5858+# | extract(i, 1, 1) << 10
5959+# | extract(i, 2, 10)
6060+# %shimm8 5:s8 13:1 !function=expand_shimm8
6161+# -- expand_shimm8(sextract(i, 5, 8) << 1
6262+# | extract(i, 13, 1))
6363+#
6464+# *** Argument set syntax:
6565+#
6666+# args_def := '&' identifier ( args_elt )+
6767+# args_elt := identifier
6868+#
6969+# Each args_elt defines an argument within the argument set.
7070+# Each argument set will be rendered as a C structure "arg_$name"
7171+# with each of the fields being one of the member arguments.
7272+#
7373+# Argument set examples:
7474+#
+# &reg3 ra rb rc
7676+# &loadstore reg base offset
7777+#
7878+# *** Format syntax:
7979+#
8080+# fmt_def := '@' identifier ( fmt_elt )+
8181+# fmt_elt := fixedbit_elt | field_elt | field_ref | args_ref
8282+# fixedbit_elt := [01.-]+
8383+# field_elt := identifier ':' 's'? number
8484+# field_ref := '%' identifier | identifier '=' '%' identifier
8585+# args_ref := '&' identifier
8686+#
8787+# Defining a format is a handy way to avoid replicating groups of fields
8888+# across many instruction patterns.
8989+#
+# A fixedbit_elt describes a contiguous sequence of bits that must
+# be 1, 0, or [.-] for don't care. The difference between '.' and '-'
+# is that '.' means that the bit will be covered with a field or a
+# final [01] from the pattern, and '-' means that the bit is really
+# ignored by the cpu and will not be specified.
9595+#
9696+# A field_elt describes a simple field only given a width; the position of
9797+# the field is implied by its position with respect to other fixedbit_elt
9898+# and field_elt.
9999+#
100100+# If any fixedbit_elt or field_elt appear then all bits must be defined.
101101+# Padding with a fixedbit_elt of all '.' is an easy way to accomplish that.
102102+#
+# A field_ref incorporates a field by reference. This is the only way to
+# add a complex field to a format. A field may be renamed in the process
+# via assignment to another identifier. This is intended to allow the
+# same argument set to be used with disjoint named fields.
107107+#
108108+# A single args_ref may specify an argument set to use for the format.
109109+# The set of fields in the format must be a subset of the arguments in
110110+# the argument set. If an argument set is not specified, one will be
111111+# inferred from the set of fields.
112112+#
+# It is recommended, but not required, that all field_ref and args_ref
+# appear at the end of the line, not interleaving with fixedbit_elt or
+# field_elt.
116116+#
117117+# Format examples:
118118+#
119119+# @opr ...... ra:5 rb:5 ... 0 ....... rc:5
120120+# @opi ...... ra:5 lit:8 1 ....... rc:5
121121+#
122122+# *** Pattern syntax:
123123+#
124124+# pat_def := identifier ( pat_elt )+
125125+# pat_elt := fixedbit_elt | field_elt | field_ref
126126+# | args_ref | fmt_ref | const_elt
127127+# fmt_ref := '@' identifier
128128+# const_elt := identifier '=' number
129129+#
130130+# The fixedbit_elt and field_elt specifiers are unchanged from formats.
131131+# A pattern that does not specify a named format will have one inferred
132132+# from a referenced argument set (if present) and the set of fields.
133133+#
+# A const_elt allows an argument to be set to a constant value. This may
135135+# come in handy when fields overlap between patterns and one has to
136136+# include the values in the fixedbit_elt instead.
137137+#
138138+# The decoder will call a translator function for each pattern matched.
139139+#
140140+# Pattern examples:
141141+#
142142+# addl_r 010000 ..... ..... .... 0000000 ..... @opr
143143+# addl_i 010000 ..... ..... .... 0000000 ..... @opi
144144+#
145145+# which will, in part, invoke
146146+#
147147+# trans_addl_r(ctx, &arg_opr, insn)
148148+# and
149149+# trans_addl_i(ctx, &arg_opi, insn)
150150+#
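The field examples in the header comment can be checked with a small Python model. This is only a sketch: `extract` and `sextract` below are hypothetical stand-ins written to mirror the notation used in the comment, not the helpers that the generated C code calls, and `imm9` and `disp12` are illustrative names for the `%imm9` and `%disp12` examples.

```python
def extract(insn, pos, length):
    """Unsigned bitfield: LENGTH bits of INSN starting at bit POS."""
    return (insn >> pos) & ((1 << length) - 1)


def sextract(insn, pos, length):
    """Signed bitfield: as extract(), then sign-extend from the top bit."""
    val = extract(insn, pos, length)
    sign = 1 << (length - 1)
    return (val ^ sign) - sign


def imm9(insn):
    # %imm9 16:6 10:3 -- the first-listed sub-field is the most significant.
    return extract(insn, 16, 6) << 3 | extract(insn, 10, 3)


def disp12(insn):
    # %disp12 0:s1 1:1 2:10 -- only the leading sub-field carries the sign.
    return (sextract(insn, 0, 1) << 11
            | extract(insn, 1, 1) << 10
            | extract(insn, 2, 10))
```

Because sub-fields are concatenated in the order written, reordering the `pos:len` pairs in a field definition changes the value that reaches the translator.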
151151+152152+import io
153153+import os
154154+import re
155155+import sys
156156+import getopt
157157+import pdb
158158+159159+insnwidth = 32
160160+insnmask = 0xffffffff
161161+fields = {}
162162+arguments = {}
163163+formats = {}
164164+patterns = []
165165+166166+translate_prefix = 'trans'
167167+translate_scope = 'static '
168168+input_file = ''
169169+output_file = None
170170+output_fd = None
171171+insntype = 'uint32_t'
172172+173173+re_ident = '[a-zA-Z][a-zA-Z0-9_]*'
174174+175175+176176+def error(lineno, *args):
177177+ """Print an error message from file:line and args and exit."""
178178+ global output_file
179179+ global output_fd
180180+181181+ if lineno:
182182+ r = '{0}:{1}: error:'.format(input_file, lineno)
183183+ elif input_file:
184184+ r = '{0}: error:'.format(input_file)
185185+ else:
186186+ r = 'error:'
187187+ for a in args:
188188+ r += ' ' + str(a)
189189+ r += '\n'
190190+ sys.stderr.write(r)
191191+ if output_file and output_fd:
192192+ output_fd.close()
193193+ os.remove(output_file)
+    sys.exit(1)
195195+196196+197197+def output(*args):
198198+ global output_fd
199199+ for a in args:
200200+ output_fd.write(a)
201201+202202+203203+if sys.version_info >= (3, 0):
204204+ re_fullmatch = re.fullmatch
205205+else:
206206+ def re_fullmatch(pat, str):
207207+ return re.match('^' + pat + '$', str)
208208+209209+210210+def output_autogen():
211211+ output('/* This file is autogenerated by scripts/decodetree.py. */\n\n')
212212+213213+214214+def str_indent(c):
215215+ """Return a string with C spaces"""
216216+ return ' ' * c
217217+218218+219219+def str_fields(fields):
+    """Return a string uniquely identifying FIELDS"""
221221+ r = ''
222222+ for n in sorted(fields.keys()):
223223+ r += '_' + n
224224+ return r[1:]
225225+226226+227227+def str_match_bits(bits, mask):
228228+ """Return a string pretty-printing BITS/MASK"""
229229+ global insnwidth
230230+231231+ i = 1 << (insnwidth - 1)
232232+ space = 0x01010100
233233+ r = ''
234234+ while i != 0:
235235+ if i & mask:
236236+ if i & bits:
237237+ r += '1'
238238+ else:
239239+ r += '0'
240240+ else:
241241+ r += '.'
242242+ if i & space:
243243+ r += ' '
244244+ i >>= 1
245245+ return r
246246+247247+248248+def is_pow2(x):
249249+ """Return true iff X is equal to a power of 2."""
250250+ return (x & (x - 1)) == 0
251251+252252+253253+def ctz(x):
254254+ """Return the number of times 2 factors into X."""
255255+ r = 0
256256+ while ((x >> r) & 1) == 0:
257257+ r += 1
258258+ return r
259259+260260+261261+def is_contiguous(bits):
262262+ shift = ctz(bits)
263263+ if is_pow2((bits >> shift) + 1):
264264+ return shift
265265+ else:
266266+ return -1
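The contiguity test above can be exercised in isolation. The following is a hedged variant rather than the script's own code: it adds a guard for a zero argument (an assumption on my part; the loop-based `ctz` in the patch would not terminate for zero) and computes the trailing-zero count with `int.bit_length`.

```python
def ctz(x):
    """Count trailing zero bits of X; X must be non-zero."""
    if x == 0:
        raise ValueError('ctz(0) is undefined')
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1


def is_contiguous(bits):
    """Return the shift of a single contiguous run of ones, else -1."""
    shift = ctz(bits)
    run = bits >> shift
    # A run of ones plus one is a power of two, so run & (run + 1) is zero.
    return shift if (run & (run + 1)) == 0 else -1
```

A mask that starts at bit 0 yields a shift of 0. The caller in `Tree.output_code` tests `sh > 0`, so such a mask takes the unshifted `insn & mask` path, which is still a valid switch expression.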
267267+268268+269269+def eq_fields_for_args(flds_a, flds_b):
270270+ if len(flds_a) != len(flds_b):
271271+ return False
272272+ for k, a in flds_a.items():
273273+ if k not in flds_b:
274274+ return False
275275+ return True
276276+277277+278278+def eq_fields_for_fmts(flds_a, flds_b):
279279+ if len(flds_a) != len(flds_b):
280280+ return False
281281+ for k, a in flds_a.items():
282282+ if k not in flds_b:
283283+ return False
284284+ b = flds_b[k]
285285+ if a.__class__ != b.__class__ or a != b:
286286+ return False
287287+ return True
288288+289289+290290+class Field:
291291+ """Class representing a simple instruction field"""
292292+ def __init__(self, sign, pos, len):
293293+ self.sign = sign
294294+ self.pos = pos
295295+ self.len = len
296296+ self.mask = ((1 << len) - 1) << pos
297297+298298+ def __str__(self):
299299+ if self.sign:
300300+ s = 's'
301301+ else:
302302+ s = ''
+        return str(self.pos) + ':' + s + str(self.len)
304304+305305+ def str_extract(self):
306306+ if self.sign:
307307+ extr = 'sextract32'
308308+ else:
309309+ extr = 'extract32'
310310+ return '{0}(insn, {1}, {2})'.format(extr, self.pos, self.len)
311311+312312+ def __eq__(self, other):
+        return self.sign == other.sign and self.mask == other.mask
314314+315315+ def __ne__(self, other):
316316+ return not self.__eq__(other)
317317+# end Field
318318+319319+320320+class MultiField:
321321+ """Class representing a compound instruction field"""
322322+ def __init__(self, subs, mask):
323323+ self.subs = subs
324324+ self.sign = subs[0].sign
325325+ self.mask = mask
326326+327327+ def __str__(self):
328328+ return str(self.subs)
329329+330330+ def str_extract(self):
331331+ ret = '0'
332332+ pos = 0
333333+ for f in reversed(self.subs):
334334+ if pos == 0:
335335+ ret = f.str_extract()
336336+ else:
337337+ ret = 'deposit32({0}, {1}, {2}, {3})' \
338338+ .format(ret, pos, 32 - pos, f.str_extract())
339339+ pos += f.len
340340+ return ret
341341+342342+ def __ne__(self, other):
343343+ if len(self.subs) != len(other.subs):
344344+ return True
345345+ for a, b in zip(self.subs, other.subs):
346346+ if a.__class__ != b.__class__ or a != b:
347347+ return True
348348+ return False
349349+350350+ def __eq__(self, other):
351351+ return not self.__ne__(other)
352352+# end MultiField
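`MultiField.str_extract` emits nested `deposit32(...)` calls instead of the shift-and-or form shown in the header comment. The sketch below models both in Python to show that they agree for `%imm9 16:6 10:3`. The Python `extract32` and `deposit32` are assumptions that follow the usual meaning of those names (read a bitfield, and overwrite a bitfield within a 32-bit value); they are not copied from any C header.

```python
MASK32 = 0xffffffff


def extract32(value, start, length):
    """Read LENGTH bits of VALUE starting at bit START."""
    return (value >> start) & ((1 << length) - 1)


def deposit32(value, start, length, fieldval):
    """Replace LENGTH bits of VALUE at START with the low bits of FIELDVAL."""
    mask = (((1 << length) - 1) << start) & MASK32
    return ((value & ~mask) | ((fieldval << start) & mask)) & MASK32


def imm9_nested(insn):
    # Mirrors the expression str_extract() builds for "%imm9 16:6 10:3":
    # the last sub-field is extracted first, then the earlier one is
    # deposited above it at bit 3 with the remaining 29 bits of width.
    return deposit32(extract32(insn, 10, 3), 3, 29, extract32(insn, 16, 6))


def imm9_shifted(insn):
    # The shift-and-or form from the header comment.
    return extract32(insn, 16, 6) << 3 | extract32(insn, 10, 3)
```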
353353+354354+355355+class ConstField:
356356+ """Class representing an argument field with constant value"""
357357+ def __init__(self, value):
358358+ self.value = value
359359+ self.mask = 0
360360+ self.sign = value < 0
361361+362362+ def __str__(self):
363363+ return str(self.value)
364364+365365+ def str_extract(self):
366366+ return str(self.value)
+
+    def __eq__(self, other):
+        return self.value == other.value
+
+    def __ne__(self, other):
+        return not self.__eq__(other)
370370+# end ConstField
371371+372372+373373+class FunctionField:
374374+ """Class representing a field passed through an expander"""
375375+ def __init__(self, func, base):
376376+ self.mask = base.mask
377377+ self.sign = base.sign
378378+ self.base = base
379379+ self.func = func
380380+381381+ def __str__(self):
382382+ return self.func + '(' + str(self.base) + ')'
383383+384384+ def str_extract(self):
385385+ return self.func + '(' + self.base.str_extract() + ')'
386386+387387+ def __eq__(self, other):
388388+ return self.func == other.func and self.base == other.base
389389+390390+ def __ne__(self, other):
391391+ return not self.__eq__(other)
392392+# end FunctionField
393393+394394+395395+class Arguments:
396396+ """Class representing the extracted fields of a format"""
397397+ def __init__(self, nm, flds):
398398+ self.name = nm
399399+ self.fields = sorted(flds)
400400+401401+ def __str__(self):
402402+ return self.name + ' ' + str(self.fields)
403403+404404+ def struct_name(self):
405405+ return 'arg_' + self.name
406406+407407+ def output_def(self):
408408+ output('typedef struct {\n')
409409+ for n in self.fields:
410410+ output(' int ', n, ';\n')
411411+ output('} ', self.struct_name(), ';\n\n')
412412+# end Arguments
413413+414414+415415+class General:
416416+ """Common code between instruction formats and instruction patterns"""
417417+ def __init__(self, name, lineno, base, fixb, fixm, udfm, fldm, flds):
418418+ self.name = name
419419+ self.lineno = lineno
420420+ self.base = base
421421+ self.fixedbits = fixb
422422+ self.fixedmask = fixm
423423+ self.undefmask = udfm
424424+ self.fieldmask = fldm
425425+ self.fields = flds
426426+427427+ def __str__(self):
428428+ r = self.name
429429+ if self.base:
430430+ r = r + ' ' + self.base.name
431431+ else:
432432+ r = r + ' ' + str(self.fields)
433433+ r = r + ' ' + str_match_bits(self.fixedbits, self.fixedmask)
434434+ return r
435435+436436+ def str1(self, i):
437437+ return str_indent(i) + self.__str__()
438438+# end General
439439+440440+441441+class Format(General):
442442+ """Class representing an instruction format"""
443443+444444+ def extract_name(self):
445445+ return 'extract_' + self.name
446446+447447+ def output_extract(self):
448448+ output('static void ', self.extract_name(), '(',
449449+ self.base.struct_name(), ' *a, ', insntype, ' insn)\n{\n')
450450+ for n, f in self.fields.items():
451451+ output(' a->', n, ' = ', f.str_extract(), ';\n')
452452+ output('}\n\n')
453453+# end Format
454454+455455+456456+class Pattern(General):
457457+ """Class representing an instruction pattern"""
458458+459459+ def output_decl(self):
460460+ global translate_scope
461461+ global translate_prefix
462462+ output('typedef ', self.base.base.struct_name(),
463463+ ' arg_', self.name, ';\n')
464464+ output(translate_scope, 'void ', translate_prefix, '_', self.name,
465465+ '(DisasContext *ctx, arg_', self.name,
466466+ ' *a, ', insntype, ' insn);\n')
467467+468468+ def output_code(self, i, extracted, outerbits, outermask):
469469+ global translate_prefix
470470+ ind = str_indent(i)
471471+ arg = self.base.base.name
472472+ output(ind, '/* line ', str(self.lineno), ' */\n')
473473+ if not extracted:
474474+ output(ind, self.base.extract_name(), '(&u.f_', arg, ', insn);\n')
475475+ for n, f in self.fields.items():
476476+ output(ind, 'u.f_', arg, '.', n, ' = ', f.str_extract(), ';\n')
477477+ output(ind, translate_prefix, '_', self.name,
478478+ '(ctx, &u.f_', arg, ', insn);\n')
479479+ output(ind, 'return true;\n')
480480+# end Pattern
481481+482482+483483+def parse_field(lineno, name, toks):
484484+ """Parse one instruction field from TOKS at LINENO"""
485485+ global fields
486486+ global re_ident
487487+ global insnwidth
488488+489489+ # A "simple" field will have only one entry;
490490+ # a "multifield" will have several.
491491+ subs = []
492492+ width = 0
493493+ func = None
494494+ for t in toks:
495495+ if re_fullmatch('!function=' + re_ident, t):
496496+ if func:
497497+ error(lineno, 'duplicate function')
498498+ func = t.split('=')
499499+ func = func[1]
500500+ continue
501501+502502+ if re_fullmatch('[0-9]+:s[0-9]+', t):
503503+ # Signed field extract
504504+ subtoks = t.split(':s')
505505+ sign = True
506506+ elif re_fullmatch('[0-9]+:[0-9]+', t):
507507+ # Unsigned field extract
508508+ subtoks = t.split(':')
509509+ sign = False
510510+ else:
511511+ error(lineno, 'invalid field token "{0}"'.format(t))
512512+ po = int(subtoks[0])
513513+ le = int(subtoks[1])
514514+ if po + le > insnwidth:
515515+ error(lineno, 'field {0} too large'.format(t))
516516+ f = Field(sign, po, le)
517517+ subs.append(f)
518518+ width += le
519519+520520+ if width > insnwidth:
521521+ error(lineno, 'field too large')
522522+ if len(subs) == 1:
523523+ f = subs[0]
524524+ else:
525525+ mask = 0
526526+ for s in subs:
527527+ if mask & s.mask:
528528+ error(lineno, 'field components overlap')
529529+ mask |= s.mask
530530+ f = MultiField(subs, mask)
531531+ if func:
532532+ f = FunctionField(func, f)
533533+534534+ if name in fields:
535535+ error(lineno, 'duplicate field', name)
536536+ fields[name] = f
537537+# end parse_field
538538+539539+540540+def parse_arguments(lineno, name, toks):
541541+ """Parse one argument set from TOKS at LINENO"""
542542+ global arguments
543543+ global re_ident
544544+545545+ flds = []
546546+ for t in toks:
547547+ if not re_fullmatch(re_ident, t):
548548+ error(lineno, 'invalid argument set token "{0}"'.format(t))
549549+ if t in flds:
550550+ error(lineno, 'duplicate argument "{0}"'.format(t))
551551+ flds.append(t)
552552+553553+ if name in arguments:
554554+ error(lineno, 'duplicate argument set', name)
555555+ arguments[name] = Arguments(name, flds)
556556+# end parse_arguments
557557+558558+559559+def lookup_field(lineno, name):
560560+ global fields
561561+ if name in fields:
562562+ return fields[name]
563563+ error(lineno, 'undefined field', name)
564564+565565+566566+def add_field(lineno, flds, new_name, f):
567567+ if new_name in flds:
568568+ error(lineno, 'duplicate field', new_name)
569569+ flds[new_name] = f
570570+ return flds
571571+572572+573573+def add_field_byname(lineno, flds, new_name, old_name):
574574+ return add_field(lineno, flds, new_name, lookup_field(lineno, old_name))
575575+576576+577577+def infer_argument_set(flds):
578578+ global arguments
579579+580580+ for arg in arguments.values():
581581+ if eq_fields_for_args(flds, arg.fields):
582582+ return arg
583583+584584+ name = str(len(arguments))
585585+ arg = Arguments(name, flds.keys())
586586+ arguments[name] = arg
587587+ return arg
588588+589589+590590+def infer_format(arg, fieldmask, flds):
591591+ global arguments
592592+ global formats
593593+594594+ const_flds = {}
595595+ var_flds = {}
596596+ for n, c in flds.items():
+        if isinstance(c, ConstField):
598598+ const_flds[n] = c
599599+ else:
600600+ var_flds[n] = c
601601+602602+ # Look for an existing format with the same argument set and fields
603603+ for fmt in formats.values():
604604+ if arg and fmt.base != arg:
605605+ continue
606606+ if fieldmask != fmt.fieldmask:
607607+ continue
+        if not eq_fields_for_fmts(var_flds, fmt.fields):
609609+ continue
610610+ return (fmt, const_flds)
611611+612612+ name = 'Fmt_' + str(len(formats))
613613+ if not arg:
614614+ arg = infer_argument_set(flds)
615615+616616+ fmt = Format(name, 0, arg, 0, 0, 0, fieldmask, var_flds)
617617+ formats[name] = fmt
618618+619619+ return (fmt, const_flds)
620620+# end infer_format
621621+622622+623623+def parse_generic(lineno, is_format, name, toks):
+    """Parse one instruction format or pattern from TOKS at LINENO"""
625625+ global fields
626626+ global arguments
627627+ global formats
628628+ global patterns
629629+ global re_ident
630630+ global insnwidth
631631+ global insnmask
632632+633633+ fixedmask = 0
634634+ fixedbits = 0
635635+ undefmask = 0
636636+ width = 0
637637+ flds = {}
638638+ arg = None
639639+ fmt = None
640640+ for t in toks:
+        # '&Foo' gives a format an explicit argument set.
642642+ if t[0] == '&':
643643+ tt = t[1:]
644644+ if arg:
645645+ error(lineno, 'multiple argument sets')
646646+ if tt in arguments:
647647+ arg = arguments[tt]
648648+ else:
649649+ error(lineno, 'undefined argument set', t)
650650+ continue
651651+652652+ # '@Foo' gives a pattern an explicit format.
653653+ if t[0] == '@':
654654+ tt = t[1:]
655655+ if fmt:
656656+ error(lineno, 'multiple formats')
657657+ if tt in formats:
658658+ fmt = formats[tt]
659659+ else:
660660+ error(lineno, 'undefined format', t)
661661+ continue
662662+663663+ # '%Foo' imports a field.
664664+ if t[0] == '%':
665665+ tt = t[1:]
666666+ flds = add_field_byname(lineno, flds, tt, tt)
667667+ continue
668668+669669+ # 'Foo=%Bar' imports a field with a different name.
670670+ if re_fullmatch(re_ident + '=%' + re_ident, t):
671671+ (fname, iname) = t.split('=%')
672672+ flds = add_field_byname(lineno, flds, fname, iname)
673673+ continue
674674+675675+ # 'Foo=number' sets an argument field to a constant value
676676+ if re_fullmatch(re_ident + '=[0-9]+', t):
677677+ (fname, value) = t.split('=')
678678+ value = int(value)
679679+ flds = add_field(lineno, flds, fname, ConstField(value))
680680+ continue
681681+682682+ # Pattern of 0s, 1s, dots and dashes indicate required zeros,
683683+ # required ones, or dont-cares.
684684+ if re_fullmatch('[01.-]+', t):
685685+ shift = len(t)
686686+ fms = t.replace('0', '1')
687687+ fms = fms.replace('.', '0')
688688+ fms = fms.replace('-', '0')
689689+ fbs = t.replace('.', '0')
690690+ fbs = fbs.replace('-', '0')
691691+ ubm = t.replace('1', '0')
692692+ ubm = ubm.replace('.', '0')
693693+ ubm = ubm.replace('-', '1')
694694+ fms = int(fms, 2)
695695+ fbs = int(fbs, 2)
696696+ ubm = int(ubm, 2)
697697+ fixedbits = (fixedbits << shift) | fbs
698698+ fixedmask = (fixedmask << shift) | fms
699699+ undefmask = (undefmask << shift) | ubm
700700+ # Otherwise, fieldname:fieldwidth
701701+ elif re_fullmatch(re_ident + ':s?[0-9]+', t):
702702+ (fname, flen) = t.split(':')
703703+ sign = False
704704+ if flen[0] == 's':
705705+ sign = True
706706+ flen = flen[1:]
707707+ shift = int(flen, 10)
708708+ f = Field(sign, insnwidth - width - shift, shift)
709709+ flds = add_field(lineno, flds, fname, f)
710710+ fixedbits <<= shift
711711+ fixedmask <<= shift
712712+ undefmask <<= shift
713713+ else:
714714+ error(lineno, 'invalid token "{0}"'.format(t))
715715+ width += shift
716716+717717+ # We should have filled in all of the bits of the instruction.
718718+ if not (is_format and width == 0) and width != insnwidth:
719719+ error(lineno, 'definition has {0} bits'.format(width))
+
+    # Do not check for fields overlapping fields; one valid usage
722722+ # is to be able to duplicate fields via import.
723723+ fieldmask = 0
724724+ for f in flds.values():
725725+ fieldmask |= f.mask
726726+727727+ # Fix up what we've parsed to match either a format or a pattern.
728728+ if is_format:
729729+ # Formats cannot reference formats.
730730+ if fmt:
731731+ error(lineno, 'format referencing format')
732732+ # If an argument set is given, then there should be no fields
733733+ # without a place to store it.
734734+ if arg:
735735+ for f in flds.keys():
736736+ if f not in arg.fields:
737737+ error(lineno, 'field {0} not in argument set {1}'
738738+ .format(f, arg.name))
739739+ else:
740740+ arg = infer_argument_set(flds)
741741+ if name in formats:
742742+ error(lineno, 'duplicate format name', name)
743743+ fmt = Format(name, lineno, arg, fixedbits, fixedmask,
744744+ undefmask, fieldmask, flds)
745745+ formats[name] = fmt
746746+ else:
747747+ # Patterns can reference a format ...
748748+ if fmt:
749749+ # ... but not an argument simultaneously
750750+ if arg:
751751+ error(lineno, 'pattern specifies both format and argument set')
752752+ if fixedmask & fmt.fixedmask:
753753+ error(lineno, 'pattern fixed bits overlap format fixed bits')
754754+ fieldmask |= fmt.fieldmask
755755+ fixedbits |= fmt.fixedbits
756756+ fixedmask |= fmt.fixedmask
757757+ undefmask |= fmt.undefmask
758758+ else:
759759+ (fmt, flds) = infer_format(arg, fieldmask, flds)
760760+ arg = fmt.base
761761+ for f in flds.keys():
762762+ if f not in arg.fields:
763763+ error(lineno, 'field {0} not in argument set {1}'
764764+ .format(f, arg.name))
765765+ if f in fmt.fields.keys():
766766+ error(lineno, 'field {0} set by format and pattern'.format(f))
767767+ for f in arg.fields:
768768+ if f not in flds.keys() and f not in fmt.fields.keys():
769769+ error(lineno, 'field {0} not initialized'.format(f))
770770+ pat = Pattern(name, lineno, fmt, fixedbits, fixedmask,
771771+ undefmask, fieldmask, flds)
772772+ patterns.append(pat)
773773+774774+ # Validate the masks that we have assembled.
775775+ if fieldmask & fixedmask:
776776+ error(lineno, 'fieldmask overlaps fixedmask (0x{0:08x} & 0x{1:08x})'
777777+ .format(fieldmask, fixedmask))
778778+ if fieldmask & undefmask:
779779+ error(lineno, 'fieldmask overlaps undefmask (0x{0:08x} & 0x{1:08x})'
780780+ .format(fieldmask, undefmask))
781781+ if fixedmask & undefmask:
782782+ error(lineno, 'fixedmask overlaps undefmask (0x{0:08x} & 0x{1:08x})'
783783+ .format(fixedmask, undefmask))
784784+ if not is_format:
785785+ allbits = fieldmask | fixedmask | undefmask
786786+ if allbits != insnmask:
787787+ error(lineno, 'bits left unspecified (0x{0:08x})'
788788+ .format(allbits ^ insnmask))
+# end parse_generic
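The way `fixedbit_elt` tokens accumulate into `fixedbits`, `fixedmask` and `undefmask` can be sketched on its own. This is a simplified stand-in, not the parser above: `parse_fixedbits` and `matches` are hypothetical names, and the sketch handles only `[01.-]` tokens, with no fields, formats or argument sets.

```python
def parse_fixedbits(tokens):
    """Fold [01.-] tokens, most significant first, into three masks."""
    fixedbits = fixedmask = undefmask = 0
    for tok in tokens:
        for ch in tok:
            if ch not in '01.-':
                raise ValueError('invalid fixedbit character: ' + ch)
            fixedbits = (fixedbits << 1) | (ch == '1')   # required ones
            fixedmask = (fixedmask << 1) | (ch in '01')  # bits that are tested
            undefmask = (undefmask << 1) | (ch == '-')   # bits the cpu ignores
    return fixedbits, fixedmask, undefmask


def matches(insn, fixedbits, fixedmask):
    """The match condition from the header: (insn & fixedmask) == fixedbits."""
    return (insn & fixedmask) == fixedbits
```

For the `addl_r` pattern text from the header (before the bits contributed by `@opr` are merged in), the tokens `010000 ..... ..... .... 0000000 .....` give a `fixedmask` of `0xfc000fe0` and `fixedbits` of `0x40000000`.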
790790+791791+792792+def parse_file(f):
793793+ """Parse all of the patterns within a file"""
794794+795795+ # Read all of the lines of the file. Concatenate lines
796796+ # ending in backslash; discard empty lines and comments.
797797+ toks = []
798798+ lineno = 0
799799+ for line in f:
800800+ lineno += 1
801801+802802+ # Discard comments
803803+ end = line.find('#')
804804+ if end >= 0:
805805+ line = line[:end]
806806+807807+ t = line.split()
808808+ if len(toks) != 0:
809809+ # Next line after continuation
810810+ toks.extend(t)
811811+ elif len(t) == 0:
812812+ # Empty line
813813+ continue
814814+ else:
815815+ toks = t
816816+817817+ # Continuation?
818818+ if toks[-1] == '\\':
819819+ toks.pop()
820820+ continue
821821+822822+ if len(toks) < 2:
823823+ error(lineno, 'short line')
824824+825825+ name = toks[0]
826826+ del toks[0]
827827+828828+ # Determine the type of object needing to be parsed.
829829+ if name[0] == '%':
830830+ parse_field(lineno, name[1:], toks)
831831+ elif name[0] == '&':
832832+ parse_arguments(lineno, name[1:], toks)
833833+ elif name[0] == '@':
834834+ parse_generic(lineno, True, name[1:], toks)
835835+ else:
836836+ parse_generic(lineno, False, name, toks)
837837+ toks = []
838838+# end parse_file
839839+840840+841841+class Tree:
842842+ """Class representing a node in a decode tree"""
843843+844844+ def __init__(self, fm, tm):
845845+ self.fixedmask = fm
846846+ self.thismask = tm
847847+ self.subs = []
848848+ self.base = None
849849+850850+ def str1(self, i):
851851+ ind = str_indent(i)
852852+ r = '{0}{1:08x}'.format(ind, self.fixedmask)
+        if self.base:
+            r += ' ' + self.base.name
855855+ r += ' [\n'
856856+ for (b, s) in self.subs:
857857+ r += '{0} {1:08x}:\n'.format(ind, b)
858858+ r += s.str1(i + 4) + '\n'
859859+ r += ind + ']'
860860+ return r
861861+862862+ def __str__(self):
863863+ return self.str1(0)
864864+865865+ def output_code(self, i, extracted, outerbits, outermask):
866866+ ind = str_indent(i)
867867+868868+ # If we identified all nodes below have the same format,
869869+ # extract the fields now.
870870+ if not extracted and self.base:
871871+ output(ind, self.base.extract_name(),
872872+ '(&u.f_', self.base.base.name, ', insn);\n')
873873+ extracted = True
874874+875875+ # Attempt to aid the compiler in producing compact switch statements.
876876+ # If the bits in the mask are contiguous, extract them.
877877+ sh = is_contiguous(self.thismask)
878878+ if sh > 0:
879879+ # Propagate SH down into the local functions.
880880+ def str_switch(b, sh=sh):
881881+ return '(insn >> {0}) & 0x{1:x}'.format(sh, b >> sh)
882882+883883+ def str_case(b, sh=sh):
884884+ return '0x{0:x}'.format(b >> sh)
885885+ else:
886886+ def str_switch(b):
887887+ return 'insn & 0x{0:08x}'.format(b)
888888+889889+ def str_case(b):
890890+ return '0x{0:08x}'.format(b)
891891+892892+ output(ind, 'switch (', str_switch(self.thismask), ') {\n')
893893+ for b, s in sorted(self.subs):
894894+ assert (self.thismask & ~s.fixedmask) == 0
895895+ innermask = outermask | self.thismask
896896+ innerbits = outerbits | b
897897+ output(ind, 'case ', str_case(b), ':\n')
898898+ output(ind, ' /* ',
899899+ str_match_bits(innerbits, innermask), ' */\n')
900900+ s.output_code(i + 4, extracted, innerbits, innermask)
901901+ output(ind, '}\n')
902902+ output(ind, 'return false;\n')
903903+# end Tree
904904+905905+906906+def build_tree(pats, outerbits, outermask):
907907+ # Find the intersection of all remaining fixedmask.
908908+ innermask = ~outermask
909909+ for i in pats:
910910+ innermask &= i.fixedmask
911911+912912+ if innermask == 0:
913913+ pnames = []
914914+ for p in pats:
915915+ pnames.append(p.name + ':' + str(p.lineno))
916916+ error(pats[0].lineno, 'overlapping patterns:', pnames)
917917+918918+ fullmask = outermask | innermask
919919+920920+ # Sort each element of pats into the bin selected by the mask.
921921+ bins = {}
922922+ for i in pats:
923923+ fb = i.fixedbits & innermask
924924+ if fb in bins:
925925+ bins[fb].append(i)
926926+ else:
927927+ bins[fb] = [i]
928928+929929+ # We must recurse if any bin has more than one element or if
930930+ # the single element in the bin has not been fully matched.
931931+ t = Tree(fullmask, innermask)
932932+933933+ for b, l in bins.items():
934934+ s = l[0]
935935+ if len(l) > 1 or s.fixedmask & ~fullmask != 0:
936936+ s = build_tree(l, b | outerbits, fullmask)
937937+ t.subs.append((b, s))
938938+939939+ return t
940940+# end build_tree
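One level of `build_tree` can be illustrated with plain tuples. The sketch below is an assumption-laden reduction: `bin_patterns` is a hypothetical helper, patterns are `(name, fixedbits, fixedmask)` tuples rather than `Pattern` objects, and the recursion into bins with more than one entry is left out. The mask values in the usage note were worked out by hand from the `@opr` and `@opi` examples in the header, where bit 12 distinguishes the two formats.

```python
def bin_patterns(pats, outermask=0):
    """Group patterns by the fixed bits that all of them still test."""
    # Bits fixed in every pattern and not already consumed by a parent node.
    innermask = ~outermask
    for _, _, fixedmask in pats:
        innermask &= fixedmask
    if innermask == 0:
        # Nothing left to switch on: the patterns cannot be told apart here.
        raise ValueError('overlapping patterns')

    bins = {}
    for name, fixedbits, _ in pats:
        bins.setdefault(fixedbits & innermask, []).append(name)
    return innermask, bins
```

With `addl_r` as `(0x40000000, 0xfc001fe0)` and `addl_i` as `(0x40001000, 0xfc001fe0)`, the common mask is `0xfc001fe0` and each pattern lands in its own bin, so no further recursion would be needed.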
941941+942942+943943+def prop_format(tree):
944944+ """Propagate Format objects into the decode tree"""
945945+946946+ # Depth first search.
947947+ for (b, s) in tree.subs:
948948+ if isinstance(s, Tree):
949949+ prop_format(s)
950950+951951+ # If all entries in SUBS have the same format, then
952952+ # propagate that into the tree.
953953+ f = None
954954+ for (b, s) in tree.subs:
955955+ if f is None:
956956+ f = s.base
957957+ if f is None:
958958+ return
959959+ if f is not s.base:
960960+ return
961961+ tree.base = f
962962+# end prop_format
963963+964964+965965+def main():
966966+ global arguments
967967+ global formats
968968+ global patterns
969969+ global translate_scope
970970+ global translate_prefix
971971+ global output_fd
972972+ global output_file
973973+ global input_file
974974+ global insnwidth
+    global insnmask
+    global insntype
976976+977977+ decode_function = 'decode'
978978+ decode_scope = 'static '
979979+980980+ long_opts = ['decode=', 'translate=', 'output=', 'insnwidth=']
981981+ try:
982982+ (opts, args) = getopt.getopt(sys.argv[1:], 'o:w:', long_opts)
983983+ except getopt.GetoptError as err:
984984+ error(0, err)
985985+ for o, a in opts:
986986+ if o in ('-o', '--output'):
987987+ output_file = a
988988+ elif o == '--decode':
989989+ decode_function = a
990990+ decode_scope = ''
991991+ elif o == '--translate':
992992+ translate_prefix = a
993993+ translate_scope = ''
994994+ elif o in ('-w', '--insnwidth'):
995995+ insnwidth = int(a)
996996+ if insnwidth == 16:
997997+ insntype = 'uint16_t'
998998+ insnmask = 0xffff
999999+ elif insnwidth != 32:
10001000+ error(0, 'cannot handle insns of width', insnwidth)
10011001+ else:
10021002+ assert False, 'unhandled option'

    if len(args) < 1:
        error(0, 'missing input file')
    input_file = args[0]
    f = open(input_file, 'r')
    parse_file(f)
    f.close()

    t = build_tree(patterns, 0, 0)
    prop_format(t)

    if output_file:
        output_fd = open(output_file, 'w')
    else:
        output_fd = sys.stdout

    output_autogen()
    for n in sorted(arguments.keys()):
        f = arguments[n]
        f.output_def()

    # A single translate function can be invoked for different patterns.
    # Make sure that the argument sets are the same, and declare the
    # function only once.
    out_pats = {}
    for i in patterns:
        if i.name in out_pats:
            p = out_pats[i.name]
            if i.base.base != p.base.base:
                error(0, i.name, ' has conflicting argument sets')
        else:
            i.output_decl()
            out_pats[i.name] = i
    output('\n')

    for n in sorted(formats.keys()):
        f = formats[n]
        f.output_extract()

    output(decode_scope, 'bool ', decode_function,
           '(DisasContext *ctx, ', insntype, ' insn)\n{\n')

    i4 = str_indent(4)
    output(i4, 'union {\n')
    for n in sorted(arguments.keys()):
        f = arguments[n]
        output(i4, i4, f.struct_name(), ' f_', f.name, ';\n')
    output(i4, '} u;\n\n')

    t.output_code(4, False, 0, 0)

    output('}\n')

    if output_file:
        output_fd.close()
# end main


if __name__ == '__main__':
    main()

#!/bin/sh
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

PYTHON=$1
DECODETREE=$2
E=0

# All of these tests should produce errors
for i in err_*.decode; do
    if $PYTHON $DECODETREE $i > /dev/null 2> /dev/null; then
        # Pass, aka failed to fail.
        echo FAIL: $i 1>&2
        E=1
    fi
done

exit $E

tests/decode/err_argset1.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose duplicate member names
&args a a

tests/decode/err_argset2.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose invalid member names
&args a b c d0 0e

tests/decode/err_field1.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose invalid field syntax
%field asdf

tests/decode/err_field2.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose invalid field width.
%field 0:33

tests/decode/err_field3.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose invalid field position.
%field 31:2

tests/decode/err_field4.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose duplicate field name.
%field 0:1
%field 0:1

tests/decode/err_field5.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose duplicate function specifier.
%field 0:1 !function=a !function=a

tests/decode/err_init1.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose uninitialized member in pattern.
&args a b
insn 00000000 00000000 00000000 b:8 &args

tests/decode/err_init2.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose member initialized twice in pattern.
&args a b
insn 00000000 00000000 a:8 b:8 &args a=1

tests/decode/err_init3.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose member initialized twice in pattern + format.
&args a
@format ........ ........ a:16 &args
insn 00000000 00000000 a:16 @format

tests/decode/err_init4.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose uninitialized member in pattern + format.
&args a b
@format ........ ........ a:16 &args
insn 00000000 00000000 ........ ........ @format

tests/decode/err_overlap1.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose field overlapping fixedbits.
%field 0:1
insn 00000000 00000000 00000000 00000000 %field

tests/decode/err_overlap2.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose field overlapping fixedbits w/format.
@format ........ ........ ........ ....... fld:1
insn 00000000 00000000 00000000 00000000 @format

tests/decode/err_overlap3.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose field overlapping unspecified bits.
%field 0:1
insn 00000000 00000000 00000000 -------- %field

tests/decode/err_overlap4.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose fixed bits overlapping unspecified bits.
@format ........ ........ ........ .......-
insn 00000000 00000000 00000000 00000000 @format

tests/decode/err_overlap5.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose overlapping sub-fields.
%field 3:5 0:5

tests/decode/err_overlap6.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose overlapping fixed bits w/format.
@format ........ ........ ........ .......1
insn 00000000 00000000 00000000 00000000 @format

tests/decode/err_overlap7.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose overlapping patterns.
insn1 00000000 00000000 00000000 00000000
insn2 00000000 00000000 00000000 00000000

tests/decode/err_overlap8.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose not specified bit (. vs -).
insn 00000000 00000000 00000000 0000000.

tests/decode/err_overlap9.decode
# This work is licensed under the terms of the GNU LGPL, version 2 or later.
# See the COPYING.LIB file in the top-level directory.

# Diagnose not specified bit (. vs -) w/format.
@format ........ a:8 ........ b:7 .
insn 00000000 ........ 00000000 ........ @format
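
The negative tests above are driven by a shell loop that runs the generator on each err_*.decode file and reports any file that is accepted. For reference, the same check might be sketched in Python as below. This is not part of the patch: the names unexpected_passes and main are hypothetical, and the sketch assumes the generator signals a rejected input with a non-zero exit status, as the shell loop also assumes.

```python
import glob
import subprocess
import sys


def unexpected_passes(python, decodetree, spec_files):
    """Return the spec files that the generator accepted (exit status 0).

    Every file is expected to be rejected, so a non-empty result means
    that at least one negative test failed to fail.
    """
    passed = []
    for spec in spec_files:
        # Discard all output, as the shell loop does with /dev/null.
        result = subprocess.run(
            [python, decodetree, spec],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode == 0:
            passed.append(spec)
    return passed


def main(argv):
    """Hypothetical entry point: argv is [python, decodetree]."""
    python, decodetree = argv
    bad = unexpected_passes(python, decodetree,
                            sorted(glob.glob('err_*.decode')))
    for spec in bad:
        print('FAIL:', spec, file=sys.stderr)
    return 1 if bad else 0
```

Calling sys.exit(main(sys.argv[1:])) from the test directory would mirror the shell script's exit status: 0 when every input is rejected, 1 otherwise.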